delete old reference data code 😝 #990

Merged: 57 commits, Nov 25, 2024

Conversation

jklugherz
Contributor

No description provided.

@jklugherz jklugherz requested a review from a team as a code owner November 25, 2024 17:41
@@ -3,18 +3,15 @@
import hail as hl
Collaborator

I think this can just be deleted!

Contributor Author

what about the rest of the stuff in the v02 directory?

Collaborator

I think it can go now, everything should have an appropriate analog in v3!

@bpblanken
Collaborator

🔥

@@ -8,7 +8,7 @@
from v03_pipeline.lib.model.reference_dataset_collection import (
Collaborator

This file is also deletable now, right?

Contributor Author

TRUE

Base automatically changed from reference-dataset-update-vat to reference-data-refactor November 25, 2024 18:08
@bpblanken bpblanken merged commit a3baf5a into reference-data-refactor Nov 25, 2024
1 check passed
@bpblanken bpblanken deleted the reference-data-delete branch November 25, 2024 19:06
bpblanken added a commit that referenced this pull request Nov 25, 2024
* begin reference dataset refactor

* hgmd

* basewritetask

* PR commentes

* Reference data refactor feature branch

* remove utils for now

* cadd

* hgmd selects

* import

* minor things

* config enum attribute

* config out of enum, get_ht, for_reference_genome_dataset_type

* return table

* kwargs

* tiny changes

* frozenset

* cadd filtering

* changes to the cadd script that will be moot?

* add some gnomad datasets

* hacking on clinvar

* ruff

* add 38 dbnsfp config

* get cadd from dbnsfp

* get primate ai and mpc from dbnsfp

* Cleanup

* cleanup

* Update misc.py

* Update clinvar.py

* Update clinvar.py

* Update clinvar_test.py

* poach some files from bens pr

* Update definitions.py

* first pass enums

* use liftover for 37 data instead of old version

* remove cadd

* Add clinvar path (#961)

* Add clinvar path

* Fix missing requires bug

* remove dataset type from filter contigs

* Move filter_contigs to "get_ht" so its generalizable

* gnomad_exomes unit tests

* all enum selects helper

* gnomad_genomes tests

* clean up

* Generalize enum annotation

* fix tempdir usage

* add topmed

* Benb/clinvar refactor (#960)

* hacking on clinvar

* ruff

* Cleanup

* cleanup

* Update misc.py

* Update clinvar.py

* Update clinvar.py

* Update clinvar_test.py

* Update definitions.py

* Add clinvar path (#961)

* Add clinvar path

* Fix missing requires bug

* remove dataset type from filter contigs

* Move filter_contigs to "get_ht" so its generalizable

* Generalize enum annotation

* Add back enum select fields

* remove unnecessary line

* clean up

* ruff

* wip hgmd test

* ruff

* share enum transmute

* done

* notebook

* ruff

* linter for now

* first pass splice ai

* Mitimpact

* Add the enum 🤦

* bad typo

* gnomad_mito, gnomad_non_coding_constraint, local_constraint_mito, screen

* gnomad_qc typo

* module_file_name

* gnomad_genomes CONFIG deduplication

* zipfile helper

* MITIMPACT (#965)

* Mitimpact

* Add the enum 🤦

* bad typo

* use helper for zip download

* pr feedback

* ruff

* ruff

* ruff

* ruff

* unshare extracted filename

* clean up transmute

* ruff

* trailing comma

* maybe clearer gnomad

* fix property syntax

* gnomad_mito selects

* use hanas enum notation

* shared import vcf helper

* proper splice ai parsing

* valid paths

* ruff

* ruff

* mitomap

* add coment

* merge

* screenums

* explicit handling for already mapped enums

* add tests

* ruff

* ruff

* ruff

* min_partitions

* simplify mitomap

* jupyter

* hmtvar reference dataset (#971)

* hmtvar reference dataset

* ruff

* eigen reference dataset (#970)

* eigen reference dataset

* Fix typo

---------

Co-authored-by: Benjamin Blankenmeister <[email protected]>

* Exac reference dataset (#969)

* add exac reference dataset

* use vcf

* remove comment

---------

Co-authored-by: Benjamin Blankenmeister <[email protected]>

* helix mito (#972)

* split genomes and exomes again

* fix screen

* screen and gnomad non coding

* unzip local_constraint_mito

* Fix bugs related to nested fields/split_multi (#973)

* helix mito

* Fix split_multi and select bugs

* fixme

* ruff

* Add test for exac

* Add test for split multi check

* Add test for `UpdatedReferenceDataset` and `UpdatedReferenceDatasetQuery` (#974)

* helix mito

* Fix split_multi and select bugs

* fixme

* ruff

* get test working

* fix bugs

* bug fixes

* Bugfixes

* Refactor tests

* Add comment

* quixotic

* missed one

* Add test for exac

* Add test for split multi check

* fix zip write

* Benb/add missing queries (#977)

* Add missing datasets

* Fix reference

* Add test

* lint

* remove complete() (#979)

* remove complete()

* ruff

* Fix mock

* Benb/update gnomad qc crdq with updated format (#980)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests (#981)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests

* Updated `gnomad_coding_and_noncoding` test table. (#982)

* remove complete()

* ruff

* Fix mock

* Replace the gnomad_qc crdq

* Fix test

* format

* Remove ht and tests

* Change validation table reference

* Update README.txt

* remove crdq reference

* Update mock

* ruff

* Fix imports

* remove mock

* fixme

* Change rsync to new path (#983)

* Remove `version` from reference dataset query path (#984)

* Change rsync to new path

* Remove version from reference dataset query path

* Make rdq dataset type specific (#985)

* Make rdq dataset type specific

* Add test for mito

* Add pathogenicities to clinvar

* tweak

* update annotations with updated reference datasets refactor (#978)

* first pass update vat

* merge feature

* fix the diff for now

* include_queries

* interval ht

* tests

* exclude

* nicer

* fix inteval test

* split fn

* eigen test

* clinvar wip

* hgmd

* clinvar

* gnomad genomes and exomes

* delete

* 38 snv_indel done

* mito tests

* done with tests?

* custom_select

* fields test

* disable write new samples tests for now

* working on tests

* update update vat with new samples tests

* extra file

* other skipped test

* make select and filter similar

* tweak

* rename path and locus/interval filtering

* make select and filter similar (#988)

* make select and filter similar

* tweak

* Cleanest set diff

* Finish off

* Tests passing!

* ruff

* ruff

* Change the params

* Fix params

* params

* More clinvar mocking

* hardcode these

---------

Co-authored-by: Benjamin Blankenmeister <[email protected]>
Co-authored-by: Benjamin Blankenmeister <[email protected]>

* delete old reference data code 😝  (#990)

* first pass update vat

* merge feature

* fix the diff for now

* include_queries

* interval ht

* tests

* exclude

* nicer

* fix inteval test

* split fn

* eigen test

* clinvar wip

* hgmd

* clinvar

* gnomad genomes and exomes

* delete

* 38 snv_indel done

* mito tests

* done with tests?

* custom_select

* fields test

* disable write new samples tests for now

* working on tests

* update update vat with new samples tests

* extra file

* other skipped test

* make select and filter similar

* tweak

* rename path and locus/interval filtering

* make select and filter similar (#988)

* make select and filter similar

* tweak

* Cleanest set diff

* Finish off

* Tests passing!

* ruff

* ruff

* Change the params

* Fix params

* params

* More clinvar mocking

* hardcode these

* delete a bunch of stuff

* ruff

* remove rdc and crdq

* delete v02

* remove comment references to deleted file

* last test

---------

Co-authored-by: Benjamin Blankenmeister <[email protected]>
Co-authored-by: Benjamin Blankenmeister <[email protected]>

---------

Co-authored-by: Julia Klugherz <[email protected]>
Co-authored-by: Hana Snow <[email protected]>