Data transformation ("normalization"?) recipes #695
MattF-NSIDC
started this conversation in
QGreenland-Net development
Replies: 0 comments
An element of our proposal is shifting data processing to a cloud platform and saving the outputs as derived data products (but with all the niceties of a "real" data product, like a DOI?). QGreenland would be decoupled from data processing and would rely on ready-to-use data being available in a public data repository.
How do we manage the transformation steps for each data product?
That's what I'd like to discuss here :)
We've thrown around the word "recipe" for the configuration artifact that represents the transformation steps. Pangeo Forge is an example of a community-owned repository of data recipes that we could draw inspiration from. If we start a similar recipe-repository organization on GitHub, people could use it to create all kinds of derived data products for use cases that have nothing to do with QGreenland (after all, there is no one best geospatial data representation for all use cases).
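To make the idea a bit more concrete, here's a rough sketch of what a "recipe" could look like as a data structure: a named source plus an ordered list of transformation steps. Everything in it is hypothetical (the `Recipe` class, the step names, the field names, and the placeholder URL); it's not Pangeo Forge's API or anything QGreenland has today, and it works on plain in-memory records rather than real geospatial data.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical registry mapping step names to functions. None of these
# names come from QGreenland or Pangeo Forge.
STEPS: dict[str, Callable[..., list[dict[str, Any]]]] = {}


def step(name: str):
    """Register a transformation function under a name recipes can refer to."""
    def register(func):
        STEPS[name] = func
        return func
    return register


@step("select_fields")
def select_fields(records, fields):
    # Keep only the listed fields in each record.
    return [{k: r[k] for k in fields} for r in records]


@step("rename_field")
def rename_field(records, old, new):
    # Rename one field, leaving all other fields untouched.
    return [{(new if k == old else k): v for k, v in r.items()} for r in records]


@dataclass(frozen=True)
class Recipe:
    """A configuration artifact: where the data comes from and what to do to it."""
    name: str
    source: str  # recorded for provenance only; this sketch never fetches it
    steps: tuple[tuple[str, dict[str, Any]], ...]

    def run(self, records):
        # Apply each step in order, feeding the output of one into the next.
        for step_name, params in self.steps:
            records = STEPS[step_name](records, **params)
        return records


recipe = Recipe(
    name="example-derived-product",
    source="https://example.org/raw-data",  # placeholder, not a real dataset
    steps=(
        ("select_fields", {"fields": ["lat", "lon", "temp_k"]}),
        ("rename_field", {"old": "temp_k", "new": "temperature_kelvin"}),
    ),
)

raw = [{"lat": 69.2, "lon": -51.1, "temp_k": 263.5, "qc_flag": 0}]
derived = recipe.run(raw)
```

The point of keeping steps as (name, parameters) pairs is that the recipe itself stays declarative, so it could just as well live in a YAML or JSON file in a shared repository while the step implementations are maintained separately.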
When data isn't formatted ready-to-use, every researcher and data user has to repeat the same work to get it there. This recipe repository could (a) save that duplicated effort by giving users access to ready-to-use versions of the data they want; (b) give researchers concrete examples of how to make their data more ready-to-use out of the gate.
If we do something like what Pangeo Forge already does, we should get in contact with them and see how we can work together or share resources.