Adding loader for Hainsworth dataset #617
Closed
tanmayy24 wants to merge 248 commits into mir-dataset-loaders:master from tanmayy24:tanmay/hainsworth
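The change proposed here is a new dataset loader. For orientation, the sketch below shows how such a loader is typically used through mirdata's public API once merged; it is illustrative only, and the dataset key "hainsworth" and the `beats` attribute are assumptions rather than anything confirmed by this (ultimately closed) pull request.

```python
import mirdata

# Hypothetical usage sketch: the "hainsworth" key and the `beats` attribute
# are assumptions for illustration, not part of this closed pull request.
dataset = mirdata.initialize("hainsworth")
dataset.download()   # fetch audio and annotations where licensing allows
dataset.validate()   # check local files against the loader's index

track = dataset.choice_track()  # pick a random track from the dataset
print(track.track_id)
print(track.beats)              # beat annotation, typical for beat-tracking datasets
```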
Conversation
* fix attribute bug, move audio loaders to Track * remove check_validated * update tests * test ikala loading functionality * rollback librosa * code review * bump version Former-commit-id: 2bd45a7
* don't allow data_home to be none in any functions, data_home is now one level lower * update beatles * update some tests, most still failing * fix tests for beatles - todo, fix download mocking * update ikala data_home * rollback create validated/invalid * update medleydb_melody data_home * update medleydb_pitch data_home * update orchset data_home * update salami data_home * remove commented code * uncomment failing tests * update tests * remove old version module, update version Former-commit-id: 0c78960
…tructions Former-commit-id: bbdfc51
Former-commit-id: 799e506
* draft for mirdata rtd. including example, faq(placeholder) * add more comments per review Former-commit-id: e9c777e
Former-commit-id: 3b947fb
* update __repr__ * format very long string * test all __repr__ methods * bump version Former-commit-id: 64a3fa6
* adding rwc collection * add docs step * add rwc datasets to docs * add tests and bugfix for rwc classical * handle incomplete first measure * update contributing * rwc genre tests * rwc jazz tests * rwc popular tests * add __repr__ to contributing example * bump version Former-commit-id: 2bf915f
* fix when annotations are missing and test * increase test coverage * update version Former-commit-id: 5a59577
* initial downloader * proposed downloader and how it would work in two loaders * update download in beatles * data_home -> save_dir * three more examples * move all downloading functions to new module * move tests * start updating loaders * update some imports * web_downloader -> download * update tests * download.py -> download_utils.py to avoid name collision * bump version * test for downloader (#105) Former-commit-id: 6e07377
* beat_positions type from str to int * rwc collection: track.track_duration_sec to track.duration_sec * beatles beat_positions int * fix missing annotations beat and key * try to fix coverage * consistency with salami loader Former-commit-id: 7c32f59
Former-commit-id: ce23bf6
* replace print with logging.info * fix black error * bump version Former-commit-id: 63dc7cb
* Finished generating index.json Started Data Loader * finished writing tests * changed dependency in setup.py * one more try at fixing dependency * Gave up on soundfile. Didn't realize librosa already had support for multichannel wave files * post CR fix. CR by @rabitt. * bump version Former-commit-id: 10bb98a
* fix for guitarset.download() * increase coverage * refactored and increased coverage * update RemoteFileMetadata * update example loader * run black on guitarset index * bump version * rm old docstring Former-commit-id: 13c0bec
* Librosa now >= 0.7.0 * Add libsndfile as a dependency elsewhere * bump version Former-commit-id: 4022668
* make all subdirectories * remove unused import * bump version Former-commit-id: dccf93f
* fix checksum in beatles and salami, simplify tar.gz downloader * beat positions to int rwc * fix download folder * remove comments * update version Former-commit-id: caadd71
* multi channel support w/ GuitarSet * bumped version to 0.0.17 and added `requests` to setup Former-commit-id: b467bfb
* begin writing medley_solos_db.py * write track_ids in medley_solos_db * write load in medley_solos_db * write cite in medley_solos_db * add medley_solos_db module to __init__ * add song_id to track_metadata in medley-solos-db * write make_medley_solos_db script * skip header in medley-solos-DB csv * bugfix make_medley_solos_db_index * import hashlib in make_medley_solos_db_index * update index * import json in make_medley_solos_db_index * define msdb index keys by uuid4 * upload msdb JSON index * bugfix _track_metadata * write validate in msdb * write track_ids in msdb * write _reload_metadata in msdb * write _load_metadat in msdb * finish writing _load_metadata in msdb * bugfix msdb annotation checksum * update urls and checksums in msdb * pep8 * .wav.wav -> .wav * update msdb index (audio/ subdir) * update metadata_path in msdb * bugfix _track_paths * set sr to 22050 in msdb audio * reformat using black * upload metadata CSV in resources * upload one track of MSDB for tests * start MSDB test file * test_cite in MSDB * test_load in MSDB * start test_track in MSDB * add MSDB to docs * import DEFAULT_DATA_HOME in msdb test * typo in MSDB test_track * bugfix test_load MSDB * test_track_ids in MSDB * bugfix test_cite in MSDB * annotations -> annotation folder in MSDB test * finish test_track in msdb Former-commit-id: 5d17ed4
Former-commit-id: d30c342
* remove validate from inside load * bump version Former-commit-id: 5667f47
Former-commit-id: 95bb10b
Former-commit-id: 8a31ce0
* fix download error * reformat * add tests for all downloaders * run black * bump version Former-commit-id: 39002ce
* adding DALI dataset * updating dali to current master version * formatting * remove maps * fix metadata, format and loaders * add tests, waiting for small file for final testing * reduced metadata file dali * dali tests * dali test resources * fix * fix rep str * dic metadata to attributes * fix rounding issue p27 * update setup * add one more test Former-commit-id: e340d67
* first draft of to_jams * add jams_utils and move to_jams to Track() * add to_jams() and change to hierarchy-corrected labels * change sections and chords to mir_eval format, add more to_jams(), track duration to float * update tests * update files tests * metadata to dict, add data type checks, f0s_to_jams * change metadata in to_jams * add to_jams * add to jams guitarset * add lyrics to_jams * tests for chords and beats * tests sections and multi_sections * tests keys and lyrics * metadata and data type checking * tests metadata and data type checking * typo * add to_jams * simplify metadata in to_jams function * formatting * chords format, metadata keys, a bit of file formatting * update jazz test * update guitarset * start test dali draft * Contributing, API and docstrings * black * remove test_dali * import in contributing * add to_jams to medley_solos_db * black * remove maps from docs * comment audio checking until solving backend * some tests to_jams * remove unused code * update metadata test resources * utf-8 * fix contributing * tests for to_jams all datasets Former-commit-id: ed5756f
Former-commit-id: c9fd249
* Fix tox for formatting test * Pin black version to 23.1.0 * Upgrade librosa version and ensure python3.6 compatibility * Black formatting with new 23.1.0 version * Fixing egfxset expected return value * Mock pandas import at sphinx autodoc * Fix black version for python3.6 Former-commit-id: 0620b8c
* Add BAF loader * Extend docstring and improve test coverage * Better check if file does not exist and improve test coverage * formatting... * Remove unused imports Former-commit-id: 8c84e50
* Create script and index * Create loader and fix index * Create tests and undo index fix * Add test resources and fix index again * Fix test resources * Add loaders and tests for taala and tonic * Loader finished * Update loader with new dataset * fix loader with new updates * add testing files and structure function * core fixes to get the tests passing * fix carnatic varnam with new dataset updates * black formatting * remove unused function * new version 1.1 [wip] * remove prints, loader good * add load notation as exception * index updated with new version 1.1 * update setup * fix problem in _metadata * merging... * adding testing file and smart open * shorten test file name * add Exception in load_notation * fix problem with exception * add test coverage * update remotes, improve docs * fix remotes * fix data folder naming in dataset Former-commit-id: 496eb4a
* ADD formatting workflow * FIX variables in formatting workflow * ADD python linting workflow and environment * ADD CI workflow and environment * Remove CircleCI * UPDATE new_loader.md PR template * ADD readthedocs * UPDATE readme badges * FIX numpy asarray bug * CHANGE arg name due to librosa update * REMOVE tox.ini * UPDATE dependencies for test * ADD dependencies * dependencies.. * dependencies.. * dependencies.. * dependencies.. * fix dependencies versions * ADD all smart_open protocols install for CI * MOVE smart_open[all] to pip install * INSTALL types to pass mypy * install types-pyaml for python linting test * Change to work with music21 v9.* * TEST Environment CI with no restrictions. python3.10 test passing locally * FIX ikala test to pass linux tests * FIX normpath for windows tests * BLACK * Change assert tolerance for floats in test_ikala The motivation of this change is that linux and macos return different floats. MacOS returns 260.946404518887 while Linux 260.94640451888694. So we adjust the tolerance of the test * Remove windows CI test * Assert modification forgotten in the last commit * Specifying package versions on environment yml * Fix h5py version for python3.7 * CI test dependencies fixed at the versions of last passing test * CI test dependencies without lowerbound * CI test dependencies that should work * sort dependencies by alphabetical order * Update setup dependencies * Update test-lint dependencies * Update contributing docs to match new testing pipeline * Remove comment from test_ikala This comment was showing the assertion test done before PR#596 * Set dependencies packages versions for docs * Remove comment * jams get_duration handling * Trigger tests after CircleCI removal --------- Co-authored-by: Magdalena Fuentes <[email protected]> Former-commit-id: dfaadca
* Update badges url * Trigger doc build again Former-commit-id: e3c3253
Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: cbd3dc4
A previous link for "OMRAS2 Metadata Project 2009" gives a "404 Not Found" error, so I replaced it with the one from CiteSeerX. Co-authored-by: Harsh Palan <[email protected]> Co-authored-by: Genís Plaja-Roglans <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: e3c34a5
Former-commit-id: 5a70ff2
* minor fixes (fix pip install syntax to install optional dependencies and remove --black from pytest) * fix: typos --------- Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: 4bc5b18
* add new version of dataset * black * fix in tox * fixing librosa (@dagett) * black formatting * remove old index, add note in docs * fix formatting, add tests for makam * formatting four way tabla dataset * add normpath to tests and script * formatting --------- Co-authored-by: Harsh Palan <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: ef72b2c
* idmt_smt_audio_effects dataset script and index * idmt_smt_audio_effects loader * idmt_smt_audio_effects tests and resources * idmt_smt_audio_effects dataset added to docs * black formatter * fixing type error in mypy test * loader docstring * pytest fix: folder delete in dataloader * remove download func, adding unpacking dirs * fixed resources path * formatting * formatting * added tests * added tests * fixing test * fix dependencies in setup.py * modified dataset tests and added custom track to test_loaders.py * changed docstrings and exception handling * docstrings * GitHub Actions migration (#596) * ADD formatting workflow * FIX variables in formatting workflow * ADD python linting workflow and environment * ADD CI workflow and environment * Remove CircleCI * UPDATE new_loader.md PR template * ADD readthedocs * UPDATE readme badges * FIX numpy asarray bug * CHANGE arg name due to librosa update * REMOVE tox.ini * UPDATE dependencies for test * ADD dependencies * dependencies.. * dependencies.. * dependencies.. * dependencies.. * fix dependencies versions * ADD all smart_open protocols install for CI * MOVE smart_open[all] to pip install * INSTALL types to pass mypy * intall types-pyaml for python linting test * Change to work with music21 v9.* * TEST Environment CI with no reestrictions. python3.10 test passing in local * FIX ikala test to pass linux tests * FIX normpath for windows tests * BLACK * Change assert tolerance for floats in test_ikala The motivation of this change is that linux and macos return different floats. MacOS returns 260.946404518887 while Linux 260.94640451888694. So we adjust the tolerance of the test * Remove windows CI test * Assert modification forgot in the last commit * Specifying packages versinos on environment yml * Fix h5py version for python3.7 * CI test dependencies fixed at the versions of last passing test * CI test dependencies without lowerbound * CI test dependencies that should work * sort dependencies by alphabetical order * Update setup dependencies * Update test-lint dependencies * Update contributing docs to match new testing pipeline * Remove comment from test_ikala This comment was showing the assertion test done before the PR#596 * Set dependencies packages versions for docs * Remove comment * jams get_duration handling * Trigger tests after CircleCI removing --------- Co-authored-by: Magdalena Fuentes <[email protected]> * Update badges url (#598) * Update badges url * Trigger doc build again * metadata exception, whitespaces in table.rst * fixing table.rst * fixing mirdata.rst and adding references to quick_reference.rst * increasing test coverage * adding corrupted xml file for testing * modified metadata logic for xml files * removed general exception * removing FileNotFoundError, changing dirs for _ and moving Cached Properties to Attributes * revert to FileNotFoundError and test --------- Co-authored-by: Magdalena Fuentes <[email protected]> Co-authored-by: Genís Plaja-Roglans <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: afbc0c3
* scripts/make, index and track and dataset class. TODO tests * fix docstring * modify the docs * download disclaimer * black * first test * fix metadata * remove embeddings * add more tests * black * modify tests * modify fix.py for adding music21 (optional) * fix bug with load_scores * fix bugs * from smart_open import open * from smart_open import open * same error as francesco * test fulldataset * test fulldataset * test fulldataset * genis suggestion * replace os.path.exists by try catch * fix problems with try catch * add cipi to CUSTOM_TEST_TRACKS * modify all the tests * black * smart open test * black * check embeddings * check embeddings * check embeddings * improving codecov * rollback haydn_op20.py * rollback haydn_op20.py * comment on the embeddings * cante100 -> cipi * black * expressiveness * fix make * Done! * Update cipi.py * difficulty annotation * fix docs table * add dataset details and fix error message * now doing the fixes right :) * address problem in table.rst --------- Co-authored-by: PRamoneda <[email protected]> Co-authored-by: Genís Plaja-Roglans <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Co-authored-by: genisplaja <[email protected]> Former-commit-id: c1dccb0
* first commit * update on dependencies * missing track in test_loaders * fix testing hindustani track * check mypy problem * add more tests * create entry for annotation type activation in quick ref * update table, quick ref, and docstring, add test * remove deprecated functions of dataset * black --------- Co-authored-by: Magdalena Fuentes <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: a944318
* added required files for candombe_beat_downbeat dataset * fixed empty new line at the end of file * changed dataloader name from candombe_beat_downbeat to candombe * added documentation for load_beats function in candombe dataloader * reformatted with black * included candombe dataset information in table.rst and mirdata.rst * formatted candombe.py with black * changed candombe.py according to black formatting * updated candombe Track class docstring * made slight docstring changes * slight docstring changes * Fixed black issues --------- Co-authored-by: Jimena Arruti <[email protected]> Co-authored-by: Jimena Arruti <[email protected]> Co-authored-by: Harsh Palan <[email protected]> Co-authored-by: Guillem Cortès <[email protected]> Co-authored-by: Genís Plaja-Roglans <[email protected]> Co-authored-by: Tanmay Khandelwal <[email protected]> Former-commit-id: 7fceb2f
Former-commit-id: ff99c5d
* Badge fixes * Badge zenodo addition * Badge zenodo addition * Addition of zenodo in docs * Fixed doi issue * Fixed doi issue * Added main branch run and pointed badges to right main * Edited branch name --------- Co-authored-by: Tanmay Khandelwal <[email protected]> Former-commit-id: a72fc2c
…611) * custom download function, tests and test dataset * remove redundant data_home check * added to download exceptions * formatting * download info * Update tree in idmt_smt_audio_effects.py Co-authored-by: Guillem Cortès <[email protected]> * corrected moving folders, directory tree * added comments to download info * typo --------- Co-authored-by: Guillem Cortès <[email protected]> Former-commit-id: 459833a
Former-commit-id: b7f6006
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff            @@
##            master     #617    +/-  ##
=========================================
  Coverage    97.07%   97.07%
=========================================
  Files           63       64      +1
  Lines         7341     7392     +51
=========================================
+ Hits          7126     7176     +50
- Misses         215      216      +1
* Addition of ballroom * description text update * lint error fix * fixed path * Updated download info * Update in sphinx * Update in sphinx * Update in sphinx * added references * Sphinx version update * update in doc for ballroom * Fixed removal * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * addition of dataset link * Reverted back the version in requirements.txt * change in ballroom status * Update in ballroom remote * black formatting fixes * black formatting fixes --------- Co-authored-by: Tanmay Khandelwal <[email protected]> Former-commit-id: 2c4ee52
* Addition of ballroom * description text update * lint error fix * fixed path * Updated download info * Update in sphinx * Update in sphinx * Update in sphinx * added references * Sphinx version update * update in doc for ballroom * Fixed removal * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * Fixes in sphinx format * addition of dataset link * Reverted back the version in requirements.txt * change in ballroom status * Update in ballroom remote * black formatting fixes * black formatting fixes * fixes in black version * fixes in black version --------- Co-authored-by: Tanmay Khandelwal <[email protected]> Former-commit-id: c1e3cf9
Former-commit-id: d6e709a
Former-commit-id: 87dffcb
Please include the following information in the top-level docstring of the dataset's module, mydataset.py:

Dataset loaders checklist:

- Create a script in scripts/, e.g. make_my_dataset_index.py, which generates an index file (a minimal sketch of such a script appears after this checklist).
- Create an index file in mirdata/indexes/, e.g. my_dataset_index.json.
- Create a module in mirdata, e.g. mirdata/my_dataset.py.
- Create tests for your loader in tests/datasets/, e.g. test_my_dataset.py.
- Add your module to docs/source/mirdata.rst and docs/source/table.rst.
- Run black, flake8 and mypy (see Running your tests locally).
- Run tests/test_full_dataset.py on your dataset.

If your dataset is not fully downloadable, there are two extra steps you should follow:
- Run pytest -s tests/test_full_dataset.py --local --dataset my_dataset once on your dataset locally and confirm that it passes.
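To make the first two checklist items concrete, here is a minimal sketch of what an index-generating script such as scripts/make_my_dataset_index.py might look like. The directory layout (audio/*.wav with matching annotations/<track_id>.beats files), the md5 helper, and the index schema are illustrative assumptions; the contributing documentation describes the exact index format mirdata expects.

```python
import glob
import hashlib
import json
import os


def md5(file_path):
    """Return the md5 checksum of a file, used to validate downloaded data."""
    hash_md5 = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


def make_index(data_path, index_path="mirdata/indexes/my_dataset_index.json"):
    """Build a track_id -> {file_key: [relative_path, checksum]} index (assumed schema)."""
    index = {"version": "1.0", "tracks": {}}
    for audio_file in sorted(glob.glob(os.path.join(data_path, "audio", "*.wav"))):
        track_id = os.path.splitext(os.path.basename(audio_file))[0]
        beats_file = os.path.join(data_path, "annotations", track_id + ".beats")
        index["tracks"][track_id] = {
            "audio": [os.path.relpath(audio_file, data_path), md5(audio_file)],
            "beats": [os.path.relpath(beats_file, data_path), md5(beats_file)],
        }
    with open(index_path, "w") as fhandle:
        json.dump(index, fhandle, indent=2)


if __name__ == "__main__":
    make_index("/path/to/my_dataset")  # placeholder path to a local copy of the data
```

Recording a checksum for every file at index time is what later lets dataset.validate() detect missing or corrupted downloads.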
Please-do-not-edit flag

To reduce friction, we will make commits on top of contributors' pull requests by default unless they use the please-do-not-edit flag. If you don't want this to happen, don't forget to add the flag when you start your pull request.