Analytics / data science projects with Dagster #7222
yuhan
started this conversation in
Show and tell
Replies: 1 comment
What I personally found helpful is
This discussion is for folks to share how they use Dagster for analytics or data science projects.
The question was originally asked in the Dagster Slack. Reposting it here for posterity and discoverability.
"""
Question on project best practices (with regard to analytics/data science pipelines, such as reading in a CSV, dropping columns, renaming columns, filtering, and modelling): what is the recommended project structure? I searched the Dagster documentation and found 'Create a new project', but that seems more geared toward generating a general project skeleton. We've got a bunch of ipynb files that each perform one step in the pipeline. Does it make more sense to wire them together using dagstermill, or to convert them into Python files and then wire them together with Dagster? It's helpful to see cell outputs when the process runs, but it seems like you lose a bit of flexibility. Additionally, when we are quickly validating data, conducting EDA, etc., it's nice to see plots/visualizations right in an ipynb. Before I embark on a full-blown conversion, I just thought I'd check with the community of experts! Thank you in advance!
"""
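To make the second option in the question concrete, here is a rough sketch of what "convert the notebooks into Python files" could look like: each notebook step becomes one small function. The sample data, column names, and function names are all hypothetical, and the sketch uses only the standard library so it runs as-is. In a real Dagster project each function would typically be wrapped with the `@op` or `@asset` decorator; that wiring is left out here.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file.
RAW_CSV = "\n".join([
    "id,Name,age,unused",
    "1,Ann,34,x",
    "2,Bob,17,y",
    "3,Cy,52,z",
])


def read_rows(text):
    """Step 1: read CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_columns(rows, columns):
    """Step 2: remove the named columns from every row."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


def rename_columns(rows, mapping):
    """Step 3: rename columns according to an old-name -> new-name mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def filter_adults(rows, min_age=18):
    """Step 4: keep rows whose 'age' value is at least min_age."""
    return [row for row in rows if int(row["age"]) >= min_age]


def run_pipeline(text):
    """Chain the steps in the same order the notebooks would run."""
    rows = read_rows(text)
    rows = drop_columns(rows, {"unused"})
    rows = rename_columns(rows, {"Name": "name"})
    return filter_adults(rows)


if __name__ == "__main__":
    for row in run_pipeline(RAW_CSV):
        print(row)
```

Keeping each step as a plain function means it can still be imported into a notebook for EDA and plotting, which addresses part of the flexibility concern raised in the question.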