-
At Whatnot, we have tried three approaches (in order):
The single code location, with a single repo, is by far the most intuitive for new (and old, honestly) users. The conclusion I came to was that you should use multiple code locations only when Python requirements for jobs begin to diverge. In most other cases,
Within our one repo, we have our Python module organized like so:
What is nice about this structure is that all of the repositories exist as submodules within the project, so they can easily be added or removed without impacting any other repositories. System-level utilities such as resources, hooks, sensors, and default job configs -- we have a
A typical
After iterating a bunch, I'm very happy with this approach, and I can see it scaling to a large number of repos nicely. The one drawback with this approach is that you have "all your eggs in one basket", so to speak, so if someone pushes a change to prod that stops the code location from compiling, you have a big problem. But you can mitigate that risk pretty well with good testing and dev processes.
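To make the trade-off concrete, here is a minimal sketch in plain Python, deliberately not using the Dagster API. The `Project` class, the `build_code_location` function, and the project and job names are all hypothetical stand-ins. It assumes each submodule contributes a list of job names and that one code location merges them into a single namespace, so one bad submodule can stop everything from loading.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for one submodule inside the single repo."""
    name: str
    jobs: list = field(default_factory=list)


def build_code_location(projects):
    """Merge every project's jobs into one namespace, mapping job -> owning project.

    Raises ValueError on a duplicate job name, which models how one bad
    submodule can prevent the whole code location from loading.
    """
    owners = {}
    for project in projects:
        for job in project.jobs:
            if job in owners:
                raise ValueError(
                    f"job {job!r} is defined in both "
                    f"{owners[job]!r} and {project.name!r}"
                )
            owners[job] = project.name
    return owners


# Illustrative projects; adding or removing one does not touch the others.
PROJECTS = [
    Project("ingest", ["load_orders", "load_users"]),
    Project("reporting", ["daily_summary"]),
]

code_location = build_code_location(PROJECTS)
```

Running a check like `build_code_location(PROJECTS)` in CI before deploying is one way to catch a change that would break the shared code location, which is the kind of testing the drawback above calls for.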
-
Hello, in fact I am using Copier instead of dagster new-project, but the idea is the same.
-
I am trying to create a Dagster structure that could handle something like this. Keep in mind that each node could potentially grow and change.
-
Hi everyone, I’m looking to learn more about how you organize your complex Dagster projects and the problems you face when organizing the code. Also, when structuring your projects, are there any goals you’re trying to optimize for?
We’re working on a more comprehensive recommendation for Dagster project structure. In the past, we came up with the experimental Create a New Project guide, which comes with a CLI command that scaffolds a project skeleton, but we’ve heard useful feedback that it is not well suited to larger, more complex, production-scale projects. So I’d love to learn from you about any good practices you follow when you design and structure your data projects.