Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve code generator design #199

Open
agronholm opened this issue Apr 26, 2022 · 13 comments
Open

Improve code generator design #199

agronholm opened this issue Apr 26, 2022 · 13 comments

Comments

@agronholm
Copy link
Owner

I'm pretty unhappy with the current design of the code generator system. There's duplication of logic between the model generation and rendering parts. The rendering system is not nearly flexible enough. There's a lot of awkwardness that I see everywhere.

Here's how I would like the code generation process to work:

  1. Generator receives a MetaData object
  2. Generator iterates through the tables, generating models and module variables
  3. Generator assigns names to tables, classes and class attributes, potentially aliasing existing imports
  4. Generator renders imports, module variables, tables and classes

Import collection must happen at or after phase 2, so that they could potentially be aliased to avoid conflicts with module variables and attribute names. This is preferable to renaming attributes to avoid clashes. In order for this to work, we need to factor out actual Columns and Tables out of the abstract model that is passed to the renderer. Any callables referred from the model must be abstracted so that they could be dynamically renamed if their relevant imports end up being aliased to avoid a name clash.

@leonarduschen do you see any obvious potential problems with this plan?

@agronholm agronholm added this to the 3.0 milestone Apr 26, 2022
@leonarduschen
Copy link
Contributor

Nope, I don't see any obvious problems with the plan, I think we can definitely benefit from a higher degree of abstraction between the components of the generated code.

A few things that came to mind:

In order for this to work, we need to factor out actual Columns and Tables out of the abstract model that is passed to the renderer. Any callables referred from the model must be abstracted so that they could be dynamically renamed if their relevant imports end up being aliased to avoid a name clash.

Beyond the problem of name clashes, abstraction of Constraints/Indexes will also help in solving:

  • Code duplication when determining if constraint should rendered as part of column (e.g., Column(..., primary_key=True)) or as table args (e.g., PrimaryKeyConstraint(...)), currently the logic is duplicated in render_column and collect_imports_for_constraints
  • Somewhat awkward implementation on deciding whether a Constraint/Index should have an explicit name - and what the name should be

Import collection must happen at or after phase 2, so that they could potentially be aliased to avoid conflicts with module variables and attribute names.

There could also be conflict between module variables and attribute names (e.g., if model has the column name "metadata"), so module variables will also need to be abstracted. So in the event of name clashes, I think we can prioritize like so: attribute names, module variables, imports.

@agronholm
Copy link
Owner Author

Yes, we seem to be on the same page as for what to do. I've already started writing the new code. If this works out, it will remove the last obstacle before v3.0.0 final.

@agronholm
Copy link
Owner Author

agronholm commented May 1, 2022

I have 6 2 of 39 tests failing in the tables generator tests. No major roadblocks in sight. The new generator code already looks so much better than the old one.

@agronholm
Copy link
Owner Author

All tests are passing locally for TableGenerator. Next I will update DeclarativeGenerator.

@leonarduschen
Copy link
Contributor

How's the refactoring going? If you're short on time and if it is at a state where you can offload some work, I'd be happy to help out

@agronholm
Copy link
Owner Author

It's progressing, but I haven't worked on it in a couple weeks. It's normal for me to juggle between my various projects. I would like to first finish my proposal before I let others work on it, so as to not get the wrong idea about my aims there. With the declarative generator, I was facing a choice: to either convert a generated table model into a class model, or to generate a class model from scratch. The latter option may require more code, but the former option required me to add the original table name to the model – something the table generator approach didn't require. So I'm still pondering my options here.

@agronholm
Copy link
Owner Author

Now that I looked at the code again, I think I figured out a way to do it with the table model conversion approach.

@agronholm
Copy link
Owner Author

I've been working on this all day today. 16 or 37 tests fail, most either due to yet missing m2m relationship support, others due to the missing name conflict fixer code.

@agronholm
Copy link
Owner Author

Back to 20 tests failing, but m2m relationship support works now. Once the declarative tests start passing, I expect the rest of the work will be substantially easier.

@agronholm
Copy link
Owner Author

11 tests failing. Most are about testing for attribute name/import conflicts.

@agronholm
Copy link
Owner Author

The declarative generator passes now. The relationship generation method is 273 lines long and I'm looking to reduce some of the duplication there, but I will focus first on getting the entire test suite to pass.

@agronholm
Copy link
Owner Author

I've pushed my changes so far to the generator-rewrite branch.

@leonarduschen
Copy link
Contributor

Thanks for the updates!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants