Collector for dcc Datasets

This script ...

Development

Environment

Development is currently done using Python 3.12. We recommend using a virtual environment such as venv:

python3.12 -m venv venv
source venv/bin/activate

In your virtual environment, please install all packages for development by running:

pip install -r requirements.txt

Installing and running

For the script to run, you will need to have a file called .hdx_configuration.yaml in your home directory containing your HDX key, e.g.:

hdx_key: "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
hdx_read_only: false
hdx_site: prod

You will also need to supply the universal .useragents.yaml file in your home directory, as specified in the parameter user_agent_config_yaml passed to facade in run.py. The collector reads the key hdx-scraper-dcc from that file, as specified in the parameter user_agent_lookup.

Alternatively, you can set up environment variables: USER_AGENT, HDX_KEY, HDX_SITE, EXTRA_PARAMS, TEMP_DIR, and LOG_FILE_ONLY.
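As a minimal sketch of the environment-variable route, the commands below export three of the variables listed above. The values are placeholders: the key is the dummy value from the YAML example, prod mirrors the hdx_site setting shown there, and the user agent string is a hypothetical name. EXTRA_PARAMS, TEMP_DIR, and LOG_FILE_ONLY are left unset because this README does not describe their expected formats.

```shell
# Placeholder values only; substitute your own HDX API key.
export HDX_KEY="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
# Mirrors hdx_site in the YAML example above.
export HDX_SITE="prod"
# Hypothetical user agent string, chosen for illustration.
export USER_AGENT="my-dcc-collector"
```

Variables exported this way apply only to the current shell session and to the processes started from it.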

To install and run, execute:

pip install .
python -m hdx.scraper.dcc

Pre-commit

Be sure to install pre-commit, which is run every time you make a git commit:

pip install pre-commit
pre-commit install

The configuration file for this project is in a non-standard location. Thus, you will need to edit your .git/hooks/pre-commit file to reflect this. Change the first line that begins with ARGS to:

ARGS=(hook-impl --config=.config/pre-commit-config.yaml --hook-type=pre-commit)
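If you prefer not to edit the hook by hand, the change can be scripted. The following is a sketch only: patch_hook is a hypothetical helper name, and it assumes the generated hook contains a single line starting with ARGS= (any later ARGS+= line is left untouched). Check the file afterwards to confirm the result.

```shell
# Hypothetical helper: rewrite the ARGS= line of a pre-commit hook file.
patch_hook() {
    hook="$1"
    new_args='ARGS=(hook-impl --config=.config/pre-commit-config.yaml --hook-type=pre-commit)'
    # Replace the line that starts with "ARGS="; write to a temp file first
    # so the edit does not rely on sed's non-portable in-place flag.
    sed "s|^ARGS=.*|${new_args}|" "$hook" > "$hook.tmp" && mv "$hook.tmp" "$hook"
    # The replacement file is newly created, so restore the executable bit.
    chmod +x "$hook"
}

# Apply to this repository's hook if it exists (run from the repository root).
if [ -f .git/hooks/pre-commit ]; then
    patch_hook .git/hooks/pre-commit
fi
```

Running pre-commit install again may regenerate the hook, in which case the edit would need to be reapplied.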

With pre-commit, all code is formatted according to black and ruff guidelines.

To check if your changes pass pre-commit without committing, run:

pre-commit run --all-files --config=.config/pre-commit-config.yaml

Testing

Ensure you have the required packages to run the tests:

pip install -r requirements-test.txt

To run the tests and view coverage, execute:

pytest -c .config/pytest.ini --cov hdx --cov-config .config/coveragerc

Packages

pip-tools is used for package management. If you’ve introduced a new package to the source code, please add it to the dependencies section of pyproject.toml with any known version constraints.

To add packages for testing, add them to the test section under [project.optional-dependencies].

Any changes to the dependencies will be automatically reflected in requirements.txt and requirements-test.txt with pre-commit, but you can regenerate both files without committing by executing:

pre-commit run pip-compile --all-files --config=.config/pre-commit-config.yaml