Determine graduation process for contributions #581
Comments
My suggestions for what the graduation process for datasets should look like (this of course depends heavily on #583):
Graduation process for experimental datasets
We should consider graduating a dataset when:
Steps to Graduate an Experimental Dataset:
|
Whilst I think this is a perfectly well-defined process, I worry that its governance overhead is higher than that of the reactive pattern we adopt today. I would like some clarity on the actual problem we're solving:
My worry is that I've seen well-intentioned codified processes fail to sustain themselves the minute priorities change, people move teams, and so on. My gut feeling is that this falls into that category. |
From my understanding, the goal is to attract more contributions without lowering the standard of the first-party supported datasets. Who will be responsible for the graduation process: the dataset owner or the core Kedro team? I also think we need to do a better job of documenting how to achieve all of the above, regardless of the graduation process. Someone who wants to contribute a Polars dataset shouldn't have to worry about fixing a mypy issue they don't understand, or a random RTD failure or linting error. Ideally, they should be able to run the tests locally without relying on CI. They also shouldn't have to worry about fixing dependency-hell issues (or at least the Kedro core team would help solve them; dataset owners should only need to care about their own datasets). |
Additionally, usage should be an important factor in this. I guess this links to the vision of counting datasets used by Kedro-Telemetry. |
Largely happy with this!
Nit: Reword to all versions supported by Kedro/under NEP 29.
As mentioned in #583, on a case-by-case basis I can understand not having support for Windows. But this is the exception, rather than the rule, and should also be noted clearly.
Also, dependencies should be relaxed as much as possible, to avoid conflicts with other core datasets and a poor user experience. :)
We can help maintain. We should encourage the author to stay involved as much as possible! |
The suggested steps seem clear and perfectly defined! One thought that came to my mind is how we will motivate contributors of experimental datasets to update them for graduation. If we give them an instrument to simplify their contributions, it can be hard to push for further updates since their goal is reached with less effort. At the same time, we want to avoid situations where experimental datasets are growing while regular datasets are never updated. So we should probably stay aware of what other users utilise (via telemetry or pip) and take graduation on ourselves if we decide it's in demand. |
Closing this in favour of continuing the discussion in #583. The bottom line is that the following make up the graduation process:
|
Description
Datasets within the experimental contributions folder may evolve and improve over time. Successful and well-maintained contributions can graduate from the experimental folder and move to the regular kedro_datasets space. We need to establish clear criteria and guidelines for determining how an experimental contribution can graduate.
Task