This repo holds the schema and associated scripts used by the Dataset Credit Engine.
The Dataset Credit Engine aims to ensure that appropriate citation information exists for data entering or produced by biological and environmental research platforms, so that credit can be attributed to those who produced the data.
The dataset credit metadata schema is maintained in LinkML format; other formats (including Python classes) can be generated from the LinkML schema file.
See the LinkML documentation for full details on using the LinkML format and the related tools.
Full schema documentation can be found at https://kbase.github.io/credit_engine/.
The ER diagram of the schema is generated from the Pydantic version of the Dataset Credit Metadata Schema using erdantic.
See below for how to regenerate the ER diagram after making changes to the schema.
This repo uses uv to manage the Python environment and dependencies.
See the uv docs for uv installation instructions.
Install the project dependencies and create a virtual environment:
uv sync
Run tests or other scripts:
uv run <command>
uv run pytest tests/
These commands assume that you have already run uv sync to install the credit engine virtual environment and dependencies.
generate derived files in all formats and save them to the project directory:
uv run gen-project -d project/ schema/dcm/linkml/credit_metadata.yaml
lint the LinkML schema file:
uv run linkml-lint -f terminal schema/dcm/linkml/credit_metadata.yaml
validate data (in file data.yaml) against the schema:
uv run linkml-validate -s schema/dcm/linkml/credit_metadata.yaml data.yaml
generate JSON Schema version:
uv run gen-json-schema schema/dcm/linkml/credit_metadata.yaml > schema/dcm/jsonschema/credit_metadata.schema.json
generate Python classes:
uv run gen-python schema/dcm/linkml/credit_metadata.yaml > schema/dcm/python/credit_metadata.py
generate Pydantic classes:
uv run gen-pydantic schema/dcm/linkml/credit_metadata.yaml > schema/dcm/python/credit_metadata_pydantic.py
generate an ER diagram from the Pydantic classes using erdantic (assumes that erdantic has been installed already):
uv run erdantic schema.dcm.python.credit_metadata_pydantic.CreditMetadata -o schema/dcm/dcm-schema.png
generate a YUML schema diagram (can be visualised at yuml.me):
uv run gen-yuml schema/dcm/linkml/credit_metadata.yaml
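After a schema change, the derived files have to be regenerated in a sensible order: the ER diagram is built from the Pydantic classes, so gen-pydantic must run before erdantic. The sketch below chains the commands listed above. The helper names (run, run_to, regenerate_dcm) and the DRY_RUN switch are hypothetical and exist only for this illustration; the commands and paths are the ones given above.

```shell
SCHEMA="schema/dcm/linkml/credit_metadata.yaml"

# Run a command, or just print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run a command and write its stdout to the file named by the first argument.
# In dry-run mode, print the command and its target instead of running it.
run_to() {
    target="$1"
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$* > $target"
    else
        "$@" > "$target"
    fi
}

# Regenerate the derived files; stop at the first step that fails.
regenerate_dcm() {
    run_to schema/dcm/jsonschema/credit_metadata.schema.json \
        uv run gen-json-schema "$SCHEMA" || return 1
    run_to schema/dcm/python/credit_metadata.py \
        uv run gen-python "$SCHEMA" || return 1
    run_to schema/dcm/python/credit_metadata_pydantic.py \
        uv run gen-pydantic "$SCHEMA" || return 1
    # The diagram reads the Pydantic classes written by the previous step.
    run uv run erdantic schema.dcm.python.credit_metadata_pydantic.CreditMetadata \
        -o schema/dcm/dcm-schema.png || return 1
}
```

Calling DRY_RUN=1 regenerate_dcm from the repository root prints the four commands without running them; calling regenerate_dcm on its own runs them. Because the output file is opened before the generator starts, a failing generator can leave an empty or partial file behind, so it is worth checking the result before committing.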
install check-jsonschema, a command-line JSON Schema validator:
# install with Homebrew
brew install check-jsonschema
or
# install with pip
pip install check-jsonschema
To test a file or files against the schema, use the command:
check-jsonschema --schemafile schema/dcm/jsonschema/credit_metadata.schema.json data_file_1.json data_file_2.json
or
check-jsonschema --schemafile schema/dcm/jsonschema/credit_metadata.schema.json sample_data/**/*_dcm.json
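Whether the sample_data/**/*_dcm.json pattern reaches files at every depth depends on the shell: zsh expands ** recursively by default, while bash does so only after shopt -s globstar and otherwise treats ** like a single *. A shell-independent alternative is to let find collect the files, as sketched below. The function name check_dcm_files and the DCM_CHECKER override are hypothetical, introduced here only so the file selection can be tried out without the validator installed.

```shell
# Validate every *_dcm.json file under a directory, at any depth.
#   $1: path to the JSON Schema file
#   $2: directory to search
# DCM_CHECKER (hypothetical) swaps in another command, e.g. echo for a dry run.
check_dcm_files() {
    schema="$1"
    root="$2"
    # "{} +" passes as many matching files as possible to each invocation,
    # and copes with spaces in file names.
    find "$root" -type f -name '*_dcm.json' \
        -exec "${DCM_CHECKER:-check-jsonschema}" --schemafile "$schema" {} +
}
```

From the repository root this would be called as check_dcm_files schema/dcm/jsonschema/credit_metadata.schema.json sample_data. If no file matches, find runs nothing and the function reports success, so an empty or misspelled directory passes silently.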