The Polish name for seagull. It is difficult to find short, memorable names, so I stuck to names of birds in the Polish language that contain no non-ASCII characters.
The intermediate representation of LLVM (https://llvm.org/). The LLVM IR has a textual representation that is used as the intermediate format in the examples of Mewa. The tests also use the programs lli (interpreter) and llc (compiler) that can take this textual representation of LLVM IR as input and run it (lli) or translate it into a binary object file (llc).
Any item addressable in a program is a type. In Mewa types are represented by an unsigned integer number that is created by the typedb library from a counter. Every type definition has a tuple consisting of a (context type) and a name as the key that identifies it plus some optional (parameter) definitions attached.
A rule to derive one type from another. Attached to the rule is a description called constructor that represents or implements the construction of an instance of the target type from the source type.
A state transition occurring after the last item of a production has been parsed, replacing the right side of the production with the left side on the parser stack.
Pair of integer numbers that address a subtree of the AST. The first number defines the start of the scope and the second number defines one number after the last step that belongs to the scope. The scope defines the validity of a definition in the language defined. A definition is valid if the scope includes the scope-step of the instruction that queries the definition.
- Item 'ABC' defined in scope [1,123]
- Item 'ABC' defined in scope [5,78]
- Item 'ABC' defined in scope [23,77]
- Item 'ABC' defined in scope [81,99]
A query for a type 'ABC' from an instruction with scope-step 56 returns the 3rd definition, because [23,77] is the innermost scope that covers step 56.
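The lookup in the example above can be sketched as follows. This is a minimal illustration, not the typedb API: the names definitions, covers and resolve are hypothetical, and "innermost" is approximated by the narrowest covering scope, which assumes that scopes are properly nested.

```python
# Hypothetical sketch of scope-based lookup; not the typedb library interface.
# Each entry is (name, (start, end)); end is one past the last step of the scope.
definitions = [
    ("ABC", (1, 123)),
    ("ABC", (5, 78)),
    ("ABC", (23, 77)),
    ("ABC", (81, 99)),
]

def covers(scope, step):
    """True if the scope-step lies inside the scope (end is exclusive)."""
    start, end = scope
    return start <= step < end

def resolve(name, step):
    """Return the 1-based index of the innermost definition visible at step, or None."""
    best = None
    for index, (def_name, scope) in enumerate(definitions, start=1):
        if def_name != name or not covers(scope, step):
            continue
        width = scope[1] - scope[0]
        # Assumption: scopes are nested, so the narrowest covering scope is the innermost.
        if best is None or width < best[1]:
            best = (index, width)
    return best[0] if best else None

print(resolve("ABC", 56))  # 3
```

A query at a step outside every scope, for example step 200, finds no definition and returns None in this sketch.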
Some other compiler models represent hierarchies of data structures by lexical scoping. In Mewa, the recommended practice is to represent visibility in hierarchies of data structures with context types rather than with scopes.
Counter that is incremented for every production in the grammar marked with the operators >> or {}. The scope-step defines the start and the end of the scope assigned to productions by the scope operator {}.
A scope starts with the scope-step counter value when first entering the traversal of an AST node with a production marked as {}. It ends with the scope-step increment after exiting the traversal of the AST node.
The type used as first parameter in a type declaration is called the context type of the definition. The context type is either a type defined before or 0 representing the absence of a context type or the global context. Context types are used to describe relations like membership. They are also used to express visibility rules.
A constructor implements the building of an instance representing a type. It is either a structure describing the initial construction of the instance or a function describing the derivation of an instance from the constructor of the derived type.
Type parameters, if not nil, are represented as a list of (type) / (constructor) pairs. Type parameters are treated as attributes and interpreted by the typedb library only to check for duplicate type definitions in the same scope. Besides that, they are also printed as part of the type in its string representation. Any other interpretation is up to the Lua part of the compiler.
To prevent a mess in the glossary we refer to a constructor of an object in the programming language our compiler translates as ctor. The corresponding destructor is called dtor.
A rule of the grammar describing the language. Further reading in Wikipedia: Context Free Grammar.
A basic item of the language. Further reading in Wikipedia: Context Free Grammar.
A named structure of the language. Further reading in Wikipedia: Context Free Grammar.
Languages to describe a context free grammar. There exist many dialects for a formal description of a context free language grammar based on BNF (Backus-Naur form). The most similar to the grammar of Mewa is the language used in Yacc/Bison. Further reading in Wikipedia: Backus-Naur form and Wikipedia: Extended Backus-Naur form.
The class of language covered by Mewa. Further reading in Wikipedia: Context Free Grammar.
A component that scans the input and produces a stream of items called tokens that are the atoms the grammar of the language is based on. In Mewa the lexer is defined as a set of named patterns. The patterns are regular expressions that are matched starting at the next non-whitespace character in the input. Conflicting lexeme matches are resolved in Mewa by selecting the longest match or, among matches of the same length, the one defined first. The selected match is emitted as the token value.
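The conflict resolution rule (longest match, then first definition) can be illustrated with a small sketch. The pattern names and the tokenize function are hypothetical and use Python's re module; they are not the lexer definition syntax of Mewa.

```python
import re

# Hypothetical named patterns, listed in definition order.
PATTERNS = [
    ("KEYWORD_IF", re.compile(r"if")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("UINT", re.compile(r"[0-9]+")),
    ("LE", re.compile(r"<=")),
    ("LT", re.compile(r"<")),
]

def tokenize(source):
    """Split source into (name, text) tokens using longest-match, first-definition-wins."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # patterns are only tried at non-whitespace characters
            continue
        best_name, best_text = None, ""
        for name, pattern in PATTERNS:
            match = pattern.match(source, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal length the earlier definition is kept.
            if match and len(match.group()) > len(best_text):
                best_name, best_text = name, match.group()
        if best_name is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append((best_name, best_text))
        pos += len(best_text)
    return tokens

print(tokenize("if iffy <= 10"))
```

In this sketch the input "if" matches both KEYWORD_IF and IDENT with the same length, so the earlier definition KEYWORD_IF is chosen, while "iffy" is longer as an IDENT and "<=" is longer as LE than as LT.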
The intermediate representation of the program. The output of the program parser. The AST in Mewa is described here.
The types are split into the following categories, each category having a different constructor function interface.
These types (const and non-const instance) refer to the address of a variable. Similar to an lvalue in C++.
These types (const and non-const instance) refer to the value of a variable but are also used for the pure type (without qualifiers). Similar to an rvalue in C++.
These types (const and non-const instance) are an internal representation of a reference type whose address has not yet been determined. They are mainly used as the return type of a function where the return slot is provided by the caller. The address of this reference type is injected by an assignment constructor.
These types are used as context type for accessing private members (in class methods).
This type is used as a self reference in constructors. It redirects assignment calls to constructor calls during the initialization phase of the constructor. During this phase, the elements of a class are initialized in the order of their definition as members of the class. The initialization phase is completed when the first method is called or the first member is accessed other than through the assignment operator redirected to a constructor call. The initialization of members not explicitly initialized is implicitly completed in the constructor.
Most language specifications require the evaluation of a boolean expression to terminate as soon as the result is guaranteed, even when some branches of the expression are undefined. Thus if (a && a->x) should evaluate to false if a is NULL without trying to evaluate the undefined branch a->x, which would lead to a segmentation fault on most platforms. To represent boolean expressions we define two types. The type controlTrueType contains the code that is executed as long as the expression remains true, and an unbound label in the out variable that the code jumps to when the expression evaluates to false. Its mirror type controlFalseType contains the code that is executed as long as the expression remains false, and an unbound label in the out variable that the code jumps to when the expression evaluates to true.
As a class of types, they are also referred to as Control Boolean Types.
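How a control-true value carries code plus an unbound exit label can be sketched as follows. This is only an illustration under assumptions: the instruction strings are an invented pseudo-IR (not LLVM IR), and the helpers new_label, control_true and logical_and are hypothetical names, not functions of Mewa.

```python
from itertools import count

_labels = count(1)

def new_label():
    """Allocate a fresh label name (hypothetical label generator)."""
    return f"L{next(_labels)}"

def control_true(cond_code, cond_reg):
    """Wrap a boolean value as a control-true expression: the code falls
    through while the condition is true and jumps to the unbound label
    'out' when it is false."""
    out = new_label()
    return {"code": cond_code + [f"if_false {cond_reg} goto {out}"], "out": out}

def logical_and(lhs, rhs_code, rhs_reg):
    """a && b: the code of b is placed on the fall-through (true) path of a,
    so it is never reached when a is false. Both operands share one exit label."""
    code = lhs["code"] + rhs_code + [f"if_false {rhs_reg} goto {lhs['out']}"]
    return {"code": code, "out": lhs["out"]}

# Pseudo-IR for: if (a && a->x) { <then-branch> }
a = control_true(["r1 = a != NULL"], "r1")
expr = logical_and(a, ["r2 = load a->x", "r3 = r2 != 0"], "r3")
# The consumer of the expression binds the label after the guarded code.
program = expr["code"] + ["<then-branch>", f"{expr['out']}:"]
print("\n".join(program))
```

In the emitted sequence the load of a->x appears only after the jump taken when a is NULL, which is the early termination described above. A control-false value would mirror this with a jump taken when the condition is true.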
Pointer types are implicitly created types. A pointer type is created when it is used for the first time.
Constants in the source and expressions built from constants are represented by the following types:
- constexprIntegerType: const expression integers, implemented as arbitrary precision BCD numbers
- constexprUIntegerType: const expression unsigned integers, implemented as arbitrary precision BCD numbers
- constexprFloatType: const expression floating-point numbers, implemented as 'double'
- constexprBooleanType: const expression boolean, implemented as boolean true/false
- constexprNullType: const expression null value
- constexprStructureType: const expression tree structure, implemented as a list of type/constructor pairs (envelope for structures that are resolved recursively)
The frame object defines the context for implicit cleanup of resources after the exit of the scope the allocation frame is associated with. Every form of exit has a chain of commands executed before the final exit code is executed. The allocation frame provides a label to jump to depending on the current scope-step and the exit code. With the jump to this label, the cleanup followed by the exit from the allocation frame is initiated.
The callable environment holds the data associated with a callable during the processing of its body. Such data are, for example, the generators of registers and labels, the list of allocation frames holding the code executed in case of exceptions, the return type in case of a function, the initialization state in case of a constructor, some flags that indicate events needed for printing the function declaration, etc.
For first-class scalar types we also need to look at the 2nd argument to determine the constructor function to call.
The multiplication of an int with a double is defined as the conversion of the first operand to a double followed by a multiplication of two doubles.
This is an example of a promote call. It is "promoting" the first argument to the type of the second argument before applying the operator.
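The int times double case can be sketched as a lookup in two tables. The tables CONVERSIONS and OPERATORS and the function promote_call are hypothetical illustrations of the idea and do not reflect how the Lua part of a Mewa compiler registers its constructors.

```python
# Hypothetical tables: type names and entries are illustrative only.
CONVERSIONS = {("int", "double"): lambda value: float(value)}
OPERATORS = {("*", "double", "double"): lambda left, right: left * right}

def promote_call(op, left_type, left_value, right_type, right_value):
    """Promote the first argument to the type of the second argument,
    then apply the operator defined for two operands of that type."""
    convert = CONVERSIONS[(left_type, right_type)]
    operator = OPERATORS[(op, right_type, right_type)]
    return right_type, operator(convert(left_value), right_value)

result_type, result = promote_call("*", "int", 3, "double", 1.5)
print(result_type, result)  # double 4.5
```

The operator is selected by the type of the second argument, which is why both arguments have to be inspected before the constructor function can be chosen.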