📝 Updating documentation removing file tree of project
bgcicca committed Jan 2, 2025
1 parent f57f837 commit 29594fa
Showing 1 changed file with 49 additions and 172 deletions.
221 changes: 49 additions & 172 deletions docs/contributors_guide/start.md
@@ -1,131 +1,20 @@
## Welcome
# Galaxy Language Documentation

Welcome! Before starting with the code, you should understand the structure of our compiler. This is its current tree:
Welcome to the Galaxy Language project! This documentation provides an overview of the core components of the Galaxy language, focusing on the frontend's Lexer and Parser modules. The goal is to help contributors and users understand the structure and functionality of these modules, enabling efficient development and debugging.

```
.
├── CMakeLists.txt
├── CODE_OF_CONDUCT.md
├── docs
│ ├── contributing.md
│ ├── contributors_guide
│ │ └── start.md
│ ├── first_steps.md
│ └── grammar.md
├── examples
│ ├── a.glx
│ ├── small.glx
│ └── test.glx
├── include
│ ├── frontend
│ │ ├── ast
│ │ │ ├── core.h
│ │ │ └── definitions.h
│ │ ├── lexer
│ │ │ ├── core.h
│ │ │ ├── error.h
│ │ │ └── freeTokens.h
│ │ └── parser
│ │ ├── core.h
│ │ ├── expressions
│ │ │ ├── binary_operations
│ │ │ │ ├── parse_additive_expr.h
│ │ │ │ └── parse_multiplicative_expr.h
│ │ │ ├── parse_assignment_expr.h
│ │ │ ├── parse_expr.h
│ │ │ ├── parse_object_expr.h
│ │ │ ├── parse_primary_expr.h
│ │ │ └── parse_unary_expr.h
│ │ ├── printer
│ │ │ ├── nodes
│ │ │ │ ├── print_assignment.h
│ │ │ │ ├── print_binary_expr.h
│ │ │ │ ├── print_identifier.h
│ │ │ │ ├── print_import.h
│ │ │ │ ├── print_logical_not.h
│ │ │ │ ├── print_numeric_literal.h
│ │ │ │ ├── print_object.h
│ │ │ │ ├── print_package.h
│ │ │ │ ├── print_pre_decrement.h
│ │ │ │ ├── print_pre_increment.h
│ │ │ │ ├── print_program.h
│ │ │ │ ├── print_property.h
│ │ │ │ ├── print_unary_bitwise_not.h
│ │ │ │ └── print_unary_minus.h
│ │ │ ├── print_ast.h
│ │ │ ├── print_indent.h
│ │ │ └── visited.h
│ │ └── statements
│ │ ├── parse_import_stmt.h
│ │ ├── parse_package_stmt.h
│ │ └── parse_stmt.h
│ └── utils.h
├── LICENSE
├── README.md
└── src
├── frontend
│ ├── CMakeLists.txt
│ ├── lexer
│ │ ├── CMakeLists.txt
│ │ ├── lexer.c
│ │ ├── lexer_error.c
│ │ └── lexer.test.c
│ ├── node_definitions
│ │ ├── ast.c
│ │ └── CMakeLists.txt
│ └── parser
│ ├── CMakeLists.txt
│ ├── expressions
│ │ ├── binary_operations
│ │ │ ├── parse_additive_expr.c
│ │ │ └── parse_multiplicative_expr.c
│ │ ├── parse_assignment_expr.c
│ │ ├── parse_expr.c
│ │ ├── parse_object_expr.c
│ │ ├── parse_primary_expr.c
│ │ └── parse_unary_expr.c
│ ├── parser.c
│ ├── parser.test.c
│ ├── printer
│ │ ├── nodes
│ │ │ ├── print_assignment.c
│ │ │ ├── print_binary_expr.c
│ │ │ ├── print_identifier.c
│ │ │ ├── print_import.c
│ │ │ ├── print_logical_not.c
│ │ │ ├── print_numeric_literal.c
│ │ │ ├── print_object.c
│ │ │ ├── print_package.c
│ │ │ ├── print_pre_decrement.c
│ │ │ ├── print_pre_increment.c
│ │ │ ├── print_program.c
│ │ │ ├── print_property.c
│ │ │ ├── print_unary_bitwise_not.c
│ │ │ └── print_unary_minus.c
│ │ ├── print_ast.c
│ │ ├── print_indent.c
│ │ └── visited.c
│ └── statements
│ ├── parse_import_stmt.c
│ ├── parse_package_stmt.c
│ └── parse_stmt.c
└── main.c
```
## Frontend

Don't be scared, it's easier than it looks, let's start with the frontend!
### Lexer

## frontend
The Lexer is located in the `frontend/lexer` folder, comprising three main files:

#### Lexer:
```
.
├── CMakeLists.txt
├── lexer.c
├── lexer_error.c
└── lexer.test.c
```
- **lexer.c**: Responsible for identifying and tokenizing the essential parts of your code.
- **lexer\_error.c**: Handles tokenization errors with concise and detailed error messages.
- **lexer.test.c**: Ensures the Lexer works correctly through unit tests.

#### Token List

In the lexer folder we have 3 **main** files: **lexer.c**, **lexer.test.c** and **lexer_error.c**. Let's start with [lexer.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/lexer/lexer.c): the lexer is responsible for identifying and tokenizing the important parts of your code, so that the parser can generate the AST and semantically check the syntax. So far, our lexer has these tokens:
The Lexer generates the following tokens:

```
TOKEN_UNKNOWN,
@@ -179,63 +68,51 @@ In the lexer folder we have 3 **main** files: **lexer.c**, **lexer.test.c** and
TOKEN_EOF
```

> Each of these tokens has its own responsibility, as mentioned above. lexer.c depends on the header files core.h, error.h and utils.h, which are in the include folder.
Each token has a specific role in facilitating syntax analysis and semantic checks by the parser.
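
To make the idea of tokenization concrete, here is a minimal sketch of a scanner that turns source text into tokens. It is not the code in `lexer.c`: only `TOKEN_UNKNOWN` and `TOKEN_EOF` come from the list above, while `TOKEN_NUMBER`, `TOKEN_IDENTIFIER`, `TOKEN_PLUS`, the `Token` struct, and `next_token` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Only TOKEN_UNKNOWN and TOKEN_EOF appear in the documented token list;
   the other kinds are hypothetical stand-ins for this sketch. */
typedef enum {
    TOKEN_UNKNOWN,
    TOKEN_NUMBER,     /* hypothetical */
    TOKEN_IDENTIFIER, /* hypothetical */
    TOKEN_PLUS,       /* hypothetical */
    TOKEN_EOF
} TokenType;

typedef struct {
    TokenType type;
    const char *start; /* first character of the lexeme */
    size_t length;     /* number of characters in the lexeme */
} Token;

/* Scans one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Skip leading whitespace. */
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') p++;

    Token tok = { TOKEN_UNKNOWN, p, 0 };
    if (*p == '\0') {
        tok.type = TOKEN_EOF;
    } else if (isdigit((unsigned char)*p)) {
        tok.type = TOKEN_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.type = TOKEN_IDENTIFIER;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (*p == '+') {
        tok.type = TOKEN_PLUS;
        p++;
    } else {
        p++; /* unrecognized character: consume it as TOKEN_UNKNOWN */
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling `next_token` repeatedly on `"x1 + 42"` would yield an identifier, a plus, a number, and then `TOKEN_EOF`. Returning `TOKEN_UNKNOWN` instead of aborting is one possible design: it lets a separate error handler decide how to report the problem.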

Currently [lexer_error.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/lexer/lexer_error.c) is our error handler. It is responsible for handling tokenization errors in a concise and detailed way, so that the error messages make it easier to understand what these errors are and where they came from. Tokenization errors are the only errors it takes care of.
#### File Dependencies

Currently [lexer.test.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/lexer/lexer.test.c) is our test file. It is very important to update it whenever you add or change a token, and to run the unit tests before every commit. This helps maintain the quality of our language.
- `lexer.c` depends on header files located in the `include` folder, including `core.h`, `error.h`, and `utils.h`.

#### Parser:
#### Error Handling

```
.
├── CMakeLists.txt
├── expressions
│ ├── binary_operations
│ │ ├── parse_additive_expr.c
│ │ └── parse_multiplicative_expr.c
│ ├── parse_assignment_expr.c
│ ├── parse_expr.c
│ ├── parse_object_expr.c
│ ├── parse_primary_expr.c
│ └── parse_unary_expr.c
├── parser.c
├── parser.test.c
├── printer
│ ├── nodes
│ │ ├── print_assignment.c
│ │ ├── print_binary_expr.c
│ │ ├── print_identifier.c
│ │ ├── print_import.c
│ │ ├── print_logical_not.c
│ │ ├── print_numeric_literal.c
│ │ ├── print_object.c
│ │ ├── print_package.c
│ │ ├── print_pre_decrement.c
│ │ ├── print_pre_increment.c
│ │ ├── print_program.c
│ │ ├── print_property.c
│ │ ├── print_unary_bitwise_not.c
│ │ └── print_unary_minus.c
│ ├── print_ast.c
│ ├── print_indent.c
│ └── visited.c
└── statements
├── parse_import_stmt.c
├── parse_package_stmt.c
└── parse_stmt.c
```
- **lexer\_error.c** manages tokenization errors, ensuring messages are informative and help identify the source of issues.
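
As an illustration of what an informative tokenization error can look like, the sketch below formats a message that carries the file name, line, and column. The function `format_lexer_error` and its message layout are assumptions made for this example; they are not the interface of `lexer_error.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not the signature used in lexer_error.c.
   Writes "file:line:column: lexer error: message" into `out` and
   returns the number of characters snprintf would have written. */
int format_lexer_error(char *out, size_t out_size, const char *file,
                       int line, int column, const char *message) {
    assert(out != NULL && out_size > 0);
    assert(file != NULL && message != NULL);
    return snprintf(out, out_size, "%s:%d:%d: lexer error: %s",
                    file, line, column, message);
}
```

With a 128-byte buffer, `format_lexer_error(buf, sizeof buf, "a.glx", 3, 7, "unexpected character '$'")` fills `buf` with `a.glx:3:7: lexer error: unexpected character '$'`. Putting the position first follows a convention many compilers use, which lets editors jump to the offending location.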

#### Testing

- **lexer.test.c** contains unit tests. Always update tests when adding or modifying tokens and run them before committing changes.


### Parser

The Parser is the next step after tokenization. Its main responsibilities are to construct the Abstract Syntax Tree (AST) and validate the program's syntax.

#### File Structure

- **parser.c**: The main entry point for parsing.
- **parser.test.c**: Unit tests for the Parser.
- **expressions/**: Handles various types of expressions (e.g., binary, unary, object).
- **statements/**: Processes program statements (e.g., package, import).
- **printer/**: Contains utilities for printing the AST.

#### Parsing Process

The Parser begins by creating the **Program Node**, which holds all statements. It then:

1. Calls `parse_stmt.c` to process statements such as:
   - **Package statement** ([parse\_package\_stmt.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/statements/parse_package_stmt.c))
   - **Import statement** ([parse\_import\_stmt.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/statements/parse_import_stmt.c))
2. If no statement is detected, it processes expressions using `parse_expr.c`.
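
The steps above can be sketched as a small dispatch function. This is a simplified model, assuming each statement is identified by a single leading token; the enums `TokKind` and `NodeKind` and the functions `parse_stmt` and `parse_program` are hypothetical and do not mirror the real signatures in the parser sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token and node kinds used only in this sketch. */
typedef enum { TOK_PACKAGE, TOK_IMPORT, TOK_OTHER, TOK_EOF } TokKind;
typedef enum { NODE_PACKAGE, NODE_IMPORT, NODE_EXPR } NodeKind;

/* Picks the statement parser based on the current token and falls
   back to expression parsing when no statement keyword matches. */
NodeKind parse_stmt(TokKind current) {
    switch (current) {
    case TOK_PACKAGE: return NODE_PACKAGE; /* package statement */
    case TOK_IMPORT:  return NODE_IMPORT;  /* import statement */
    default:          return NODE_EXPR;    /* anything else: expression */
    }
}

/* Models the Program Node: collects one node per statement until
   TOK_EOF is reached or `body` is full. Returns the node count. */
size_t parse_program(const TokKind *tokens, NodeKind *body, size_t capacity) {
    assert(tokens != NULL && body != NULL);
    size_t count = 0;
    for (size_t i = 0; tokens[i] != TOK_EOF && count < capacity; i++) {
        body[count++] = parse_stmt(tokens[i]);
    }
    return count;
}
```

For the token sequence `TOK_PACKAGE, TOK_IMPORT, TOK_OTHER, TOK_EOF`, `parse_program` stores `NODE_PACKAGE`, `NODE_IMPORT`, and `NODE_EXPR` and returns 3.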

In the parser, the main entry point is [parser.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/parser.c) and its test file is [parser.test.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/parser.test.c).
#### Expression Parsing

The parser runs in a specific order: it starts by creating the **Program Node**, which holds all the program statements. Then it calls [parse_stmt.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/statements/parse_stmt.c), which handles the current statements:
- **Unary Expressions** ([parse\_unary\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_unary_expr.c))
- **Assignment Expressions** ([parse\_assignment\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_assignment_expr.c))

- Package statement ([parse_package_stmt](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/statements/parse_package_stmt.c))
- Import statement ([parse_import_stmt](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/statements/parse_import_stmt.c))
Expressions are processed hierarchically:

If neither of the statement options above is detected, it runs [parse_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_expr.c), which handles expressions such as:
- Objects ([parse\_object\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_object_expr.c))
- Binary Operations ([parse\_additive\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/binary_operations/parse_additive_expr.c), [parse\_multiplicative\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/binary_operations/parse_multiplicative_expr.c))
- Primary Expressions ([parse\_primary\_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_primary_expr.c))

- Unary Expressions ([parse_unary_expr](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_unary_expr.c))
- Assignment Expressions ([parse_assignment_expr](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_assignment_expr.c))

The **assignment expression** first tries to parse **objects** ([parse_object_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_object_expr.c)), which checks whether the current token is the one expected for object parsing. If it is not, it calls the chain of **binary operations** ([parse_additive_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/binary_operations/parse_additive_expr.c), which calls [parse_multiplicative_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/binary_operations/parse_multiplicative_expr.c)) and finally reaches the last parseable type of node, **primary expressions** ([parse_primary_expr.c](https://github.com/galaxy-lang/galaxy/blob/main/src/frontend/parser/expressions/parse_primary_expr.c)).
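
The additive, multiplicative, primary chain is a standard recursive descent layout, and a small sketch may help show why the call order encodes operator precedence. The example below is an assumption-laden simplification: it evaluates integers directly instead of building AST nodes, supports only `+`, `-`, and `*`, and uses a hypothetical `Parser` struct and function names that merely echo the file names above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical parser state: a cursor into the source text. */
typedef struct { const char *pos; } Parser;

static void skip_spaces(Parser *p) {
    while (*p->pos == ' ') p->pos++;
}

/* Lowest level: a non-negative integer literal. */
static long parse_primary(Parser *p) {
    skip_spaces(p);
    long value = 0;
    while (isdigit((unsigned char)*p->pos)) {
        value = value * 10 + (*p->pos - '0');
        p->pos++;
    }
    return value;
}

/* Binds tighter than addition because it is called from parse_additive. */
static long parse_multiplicative(Parser *p) {
    long left = parse_primary(p);
    for (;;) {
        skip_spaces(p);
        if (*p->pos != '*') return left;
        p->pos++;
        left *= parse_primary(p);
    }
}

/* Entry of the binary chain: handles '+' and '-' left to right. */
static long parse_additive(Parser *p) {
    long left = parse_multiplicative(p);
    for (;;) {
        skip_spaces(p);
        char op = *p->pos;
        if (op != '+' && op != '-') return left;
        p->pos++;
        long right = parse_multiplicative(p);
        left = (op == '+') ? left + right : left - right;
    }
}

/* Convenience wrapper for the sketch. */
long evaluate(const char *source) {
    assert(source != NULL);
    Parser p = { source };
    return parse_additive(&p);
}
```

Here `evaluate("2 + 3 * 4")` returns 14 rather than 20, because `parse_additive` delegates each operand to `parse_multiplicative` before applying `+`. A real parser would return AST nodes from each level instead of numbers, but the call structure would stay the same.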
