Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove DerivedDataDescription class #1288

Open
saskiad opened this issue Feb 28, 2025 · 6 comments
Open

remove DerivedDataDescription class #1288

saskiad opened this issue Feb 28, 2025 · 6 comments

Comments

@saskiad
Copy link
Collaborator

saskiad commented Feb 28, 2025

I don't think we need separate DerivedDataDescription class. The process field doesn't belong in the data description ... or is this used to automate the naming?

@saskiad saskiad added this to the data-schema 2.0 milestone Feb 28, 2025
@dyf
Copy link
Member

dyf commented Mar 5, 2025

this class does two things:

  1. automate naming
  2. automate creating a derived data description based on a primary data description

if you want to get rid of the class I would vote to keep the utility methods

@saskiad
Copy link
Collaborator Author

saskiad commented Mar 6, 2025

If you poke around the database, you'll see the derived data description is one of the thing that is missing the most. People aren't using this.
I'm fine having it as a utility function, but I think it's confusing having it as a separate class

@saskiad
Copy link
Collaborator Author

saskiad commented Mar 6, 2025

I also just don't think the process_name belongs in the data description. So having this as a function where the process_name is a variable to the function used to create the name would be good, but adding a field that doesn't belong in the data description because we're using this class to create the name feels off to me.

@tmchartrand
Copy link
Member

I agree on all counts here, moving this to utility functions will be less confusing in my view.

@dbirman
Copy link
Member

dbirman commented Mar 10, 2025

Collapse into a single class

  • related_data goes away
  • input_data moves up to DataDescription becomes an Optional[], with a validator
  • label -> tags List[str]
  • UpgradeToDerivedDataDescription but less long becomes a utility function

@tmchartrand
Copy link
Member

@dbirman sounds good, and also resolves #1045. I would just add the suggestion that the description should clarify the intent of the input/related data field is not to be comprehensive (that should be in the processing side) but to list a single or few primary inputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants