Operator Name "Gotchas" (Function Application Operators, Others) #75
Comments
A few comments:
(* WMA both CLI and Notebook interface *)
In[1]:= X=Subscript[a,b]
Out[1]= a
         b
Internally, it is stored as an expression, with
Its InputForm still is a Subscript:
And only after formatting it is converted into a
The conversion to
Other boxes work in the same way. |
Regarding In Mathics we still have a rudimentary implementation of these symbols, but I guess there is no need to store in MathicsScanner specific details about how to parse and render them. |
To clarify, in this discussion I am not concerned with how different (M-expression) functions are evaluated. Rather, I am talking about the M-expression representation of operator forms. So for example, in the expression |
Oops, I included it in the table by accident. The operator |
Ah, OK, what you are talking about is the "string representation of boxes". So, when you write some string between
In Mathics, it works exactly in the same way:
|
Maybe it's worth talking about what it means to "parse" a Mathematica expression. Here are a few different meanings, all of which are valid and useful for different purposes.
For my own purposes, I spend most of my time thinking about (3). But all three require that the parser have knowledge of operator properties. Observe that once (1) is obtained, the operators can just be represented by their own (quoted) textual representation, which is what the frontend does. So there is a sense in which the names of expressions are unimportant. But in a way I feel like this work is part of making (3) happen, which is a necessary step for evaluation. |
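To make the point about operator properties concrete, here is a minimal Python sketch of a precedence-climbing parser driven by an operator table. The table entries, the precedence numbers, and every function name below are assumptions made for illustration. They are not taken from the MathicsScanner YAML table or from Mathematica, and the sketch does not flatten nested Plus or Times the way WL does.

```python
import re

# Hypothetical operator table: symbol -> (function name, precedence, associativity).
# The numbers are placeholders chosen only so that ^ binds tighter than *,
# and * binds tighter than +.
OPERATORS = {
    "+": ("Plus", 10, "left"),
    "*": ("Times", 20, "left"),
    "^": ("Power", 30, "right"),
}

TOKEN_RE = re.compile(r"\s*([A-Za-z0-9]+|[+*^()])")


def tokenize(text):
    """Split the input into symbol, operator, and parenthesis tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_atom(tokens):
    """Parse a symbol or a parenthesized sub-expression."""
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return token


def parse(tokens, min_prec=0):
    """Precedence climbing: the operator table decides how operands group."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in OPERATORS:
        name, prec, assoc = OPERATORS[tokens[0]]
        if prec < min_prec:
            break
        tokens.pop(0)
        # A left-associative operator must not absorb another operator of
        # the same precedence on its right; a right-associative one may.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse(tokens, next_min)
        left = (name, left, right)
    return left


def full_form(expr):
    """Render a nested tuple (head, arg1, ...) as head[arg1, ...]."""
    if isinstance(expr, str):
        return expr
    head, *args = expr
    return f"{full_form(head)}[{', '.join(full_form(a) for a in args)}]"


print(full_form(parse(tokenize("a + b * c ^ d ^ e"))))
# prints: Plus[a, Times[b, Power[c, Power[d, e]]]]
```

The parser itself knows nothing about individual operators. Changing a name, precedence, or associativity in the table changes the resulting tree, which is why the table needs to be correct whichever meaning of "parse" is intended.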
OK, but in any case, in the table, where you write |
It took me a while to understand this, and without @mmatera's follow-up discussion I doubt I would have. I agree with @mmatera that the "Usage String" should be corrected:
While the idea of "boxing" is not new (it comes from Knuth's TeX, and CSS adopts the idea), using operators in boxing expressions, as found in WL, is a bit new and rare. It is not something I suspect most people think of when they think about WL or Mathics3 parsing. As far as I can tell, the "wrong name" for the association between
So in sum, while this may be a minor problem in what
As for thoughts on how we should address this: yes, I guess these should be added to |
Apparently, there are two uses we have in Mathics3. In WL these are built-in functions ( In
Things are not ideal, so let me explain. Initially, MathicsScanner was created because Mathics-core was too large and I wanted to break off pieces of it. So right from the start, I knew things were a little coarse. Before the recent addition of YAML for operators, the MathicsScanner GitHub repository had two things primarily: the Python Mathics3 scanner code, and "Named Character" information.
The projects mathicsscript, mathics-pygments, and mathics-core do use information from
Of course, some named characters also happen to be operators. This is also noted: we indicate when a named character is used as either an operator or part of an operator. Recently, I added a new YAML table from Robert Jacobson's CSV. This is used to gather information about operators that we can use in a machine-readable way. In fact, right now it is used in Mathics-core for operator precedence information.
In an ideal world, we would split the data portion from mathics-scanner. (It should have been done back in January 2021, but things were messier then and we had far fewer unit tests, so it would have been beyond my capabilities.) Right now, though, that hasn't been a big priority for me. If someone else wants to do it, go for it! But it will be a bit of work, since 3 other repositories will have to be adjusted to point to the newly split-off repository. |
Very productive discussion! I might have confused matters with a typo:
Oops.
Yes, also called the box sublanguage. The fact that operators like But this raises a software engineering question of how to design the parser for the box sublanguage. This is because |
Indeed, it seems that the WMA parser takes into account the precedence when it parses the "sublanguage". For example,
Also, if you continue the expression with a "normal" multiplication, what you get is the same as if you had written the Box expression explicitly:
As for whether it actually matters, I think it does, because our parser should be able to parse this kind of input. |
This issue is to discuss and track the following work I propose to do. For the sake of limiting scope, this issue is restricted to working on "incorrectly" named operators (explained below).
The purpose of this work is to make Mathics' operator database more complete, correct, and compatible.
Background
In a small handful of cases, Mathematica gives the wrong name to an operator. The primary reason, as far as I can tell, is that Mathematica gives preference to the typographical representation of operators rather than to their computational/semantic meaning. Thus, the naming convention used in Mathematica maps the name of the operator to the function that formats an expression typographically like the operator rather than the underlying function implementing the operator's computational function. There are 13 cases.
| Parsed form | Equivalent expression | Operator usage | Notes |
|---|---|---|---|
| {"OverscriptBox", "[", "expr1", ",", "expr2", "]"} | | expr1&expr2 | |
| {"UnderscriptBox", "[", "expr1", ",", "expr2", "]"} | | expr1+expr2 | |
| {"UnderoverscriptBox", "[", "expr1", ",", "expr3", ",", "expr2", "]"} | | expr1&expr2\%expr3 | |
| {"UnderoverscriptBox", "[", "expr1", ",", "expr2", ",", "expr3", "]"} | | expr1+expr2\%expr3 | |
| | | *expr | |
| {"SubscriptBox", "[", "expr1", ",", "expr2", "]"} | | expr1_expr2 | |
| {"SubsuperscriptBox", "[", "expr1", ",", "expr2", ",", "expr3", "]"} | | expr1_expr2\%expr3 | |
| {"expr1", "[", "expr2", "]"} | expr1[expr2] | expr1@expr2 | |
| {"expr2", "[", "expr1", ",", "expr3", "]"} | expr2[expr1, expr3] | expr1~expr2~expr3 | Infix[f[x,y]] will display as x~f~y. Precedence identifies Infix with this operator, and Precedence[Infix]==30, which is almost correct. |
| {"SubsuperscriptBox", "[", "expr1", ",", "expr3", ",", "expr2", "]"} | | expr1\^expr2\%expr3 | |
| {"SqrtBox", "[", "expr", "]"} | | \@expr | |
| {"Integrate", "[", "expr1", ",", "expr2", "]"} | Integrate[expr1, expr2] | ∫expr1expr2 | |
| {"expr2", "[", "expr1", "]"} | expr2[expr1] | expr1//expr2 | Postfix[f[x]] will display as x//f. Precedence identifies Postfix with this operator. |

For some of these, it is obvious that they are misnamed based on how they are parsed.
OverscriptBox, for example, is called Overscript by Mathematica but is parsed as OverscriptBox. The case of what I call FunctionApplyInfix is fundamentally the same, but is easy to misunderstand because the underlying semantic meaning is function application, which has no corresponding named function. The Mathematica function Infix, despite being the name Mathematica gives this operator, is not the corresponding functional meaning of this operator! The Infix function is concerned with how a function is displayed.
An interesting case is FunctionApplyPrefix, which Mathematica calls Prefix. Again, the Mathematica function Prefix is a directive for displaying a function. The @ operator is actually an alias (sort of) for the square brackets operator [ ], which does not have a name in Mathematica, or at least it didn't have a name until the Construct function was introduced!
Challenges
I am assuming in this discussion that operators should be given the name of their underlying (functional) semantic meaning, that is, they should have the same name as the function they are parsed into. There are a few realities that challenge this assumption:
UnaryPlus vs. Plus, both of which Mathematica just calls Plus.
I haven't made a thorough survey of what Mathics is doing with the 13 operators I identify in the table above, but I think just choosing reasonable alternative names will not present any problems. Mathics already does this for postfix &, for example. But again, part of the work is to figure out what might break.
@rocky should check that this all makes sense.
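As a rough illustration of the distinction drawn above between the function-application operators and the display wrappers Infix, Prefix, and Postfix, here is a small Python sketch. It models an M-expression as a nested tuple (head, arg1, ...). The helper names are hypothetical and are not Mathics3 or MathicsScanner APIs.

```python
def full_form(expr):
    """Render a nested tuple (head, arg1, ...) as head[arg1, ...]."""
    if isinstance(expr, str):
        return expr
    head, *args = expr
    return f"{full_form(head)}[{', '.join(full_form(a) for a in args)}]"


# Hypothetical helpers: each one builds the expression that the
# corresponding operator is parsed into, according to the table above.
def parse_prefix_apply(f, x):
    """f @ x parses into f[x]."""
    return (f, x)


def parse_postfix_apply(x, f):
    """x // f parses into f[x]."""
    return (f, x)


def parse_infix_apply(x, f, y):
    """x ~ f ~ y parses into f[x, y]."""
    return (f, x, y)


# All three operators yield plain function application; none of them
# produces an expression whose head is Prefix, Postfix, or Infix.
print(full_form(parse_prefix_apply("f", "x")))      # prints: f[x]
print(full_form(parse_postfix_apply("x", "f")))     # prints: f[x]
print(full_form(parse_infix_apply("x", "f", "y")))  # prints: f[x, y]

# The display directive is a different expression: it wraps f[x, y]
# and only affects how that expression is shown.
display_wrapper = ("Infix", ("f", "x", "y"))
print(full_form(display_wrapper))                   # prints: Infix[f[x, y]]
```

Under this model, naming the ~ operator after Infix conflates the last expression with the third one, which is the mismatch the issue proposes to fix.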