From Text search to Patent data (targets) #352

Open
danidi opened this issue Feb 4, 2016 · 6 comments

danidi commented Feb 4, 2016

There currently seem to be two issues that complicate retrieving patent data for targets when starting from a ConceptWiki (CW) target.
The first is the missing conversion of HGNC names (data source H, http://www.genenames.org/data/hgnc_data.php?match=$id) to http://rdf.ebi.ac.uk/resource/surechembl/target/ (covered in openphacts/IdentityMappingService#14).
The second is the missing transitive mappings from CW to HGNC names (probably related to #319).

The current workaround to reach patent data for CW targets (i.e. a text search result) is the following:

Author

danidi commented Apr 7, 2016

Just to confirm, this will also affect queries made with UniProt URIs:

With the HGNC name: https://ops2.few.vu.nl/2.0/mapUri?app_id=f91c5b2b&app_key=18a5d823d0e4933ac5fe22a3d52974c1&Uri=http%3A%2F%2Fidentifiers.org%2Fhgnc.symbol%2FDAPK1 finds only the HGNC accession number.

With the HGNC accession number: https://ops2.few.vu.nl/2.0/mapUri?app_id=f91c5b2b&app_key=18a5d823d0e4933ac5fe22a3d52974c1&Uri=http%3A%2F%2Fidentifiers.org%2Fhgnc%2F2674 finds UniGene and Ensembl, but not UniProt.
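
For anyone who wants to repeat these checks with other identifiers, here is a minimal sketch of how the two request URLs above can be assembled. The helper name `build_map_uri_url` and the placeholder credentials are my own assumptions; only the base URL, the `mapUri` path and the `app_id`, `app_key` and `Uri` parameters are taken from the links above. It only builds the URL and does not call the service.

```python
from urllib.parse import urlencode


def build_map_uri_url(base_url, uri, app_id, app_key):
    """Build a mapUri request URL (hypothetical helper, not part of any client library)."""
    # urlencode percent-encodes ':' and '/' in the identifier URI,
    # which matches the encoding used in the links above.
    query = urlencode({"app_id": app_id, "app_key": app_key, "Uri": uri})
    return f"{base_url}/mapUri?{query}"


# Placeholder credentials; substitute your own app_id and app_key.
BASE_URL = "https://ops2.few.vu.nl/2.0"

symbol_url = build_map_uri_url(
    BASE_URL, "http://identifiers.org/hgnc.symbol/DAPK1", "YOUR_APP_ID", "YOUR_APP_KEY"
)
accession_url = build_map_uri_url(
    BASE_URL, "http://identifiers.org/hgnc/2674", "YOUR_APP_ID", "YOUR_APP_KEY"
)

print(symbol_url)
print(accession_url)
```

Swapping `BASE_URL` for another deployment should make it easy to compare what different API versions return for the same identifier.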

Author

danidi commented May 8, 2018

In 2.2, the HGNC accession number example given above (https://beta.openphacts.org/mapUri?app_id=f91c5b2b&app_key=18a5d823d0e4933ac5fe22a3d52974c1&Uri=http%3A%2F%2Fidentifiers.org%2Fhgnc%2F2674) no longer links to UniGene and Ensembl.

@ianwdunlop
Member

Hi @danidi, do you know any of the UniGene and Ensembl IDs that HGNC:2674 used to map to?
Also, are the issues from 2016 above still open and relevant? Maybe the CW one is not so important, although the UniProt one should be.

@Chris-Evelo

Hi @ianwdunlop, @egonw checked that: Ensembl lists the given HGNC number (2674) for ENSG00000196730...

So it should be in our linkset. @JonathanMELIUS is checking whether it really is.

It is a Death related protein BTW, so maybe that is why it is acting weird ;-)

Member

ianwdunlop commented May 16, 2018

I did a bit of digging last night. Here are my first findings after a bit of 'grep'. No conclusions, just that there are some mentions of the ensembl & hgnc gene.

grep '2674' linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_hgnc.dependent.LS.ttl
ensembl:ENSG00000196730
        ensemblterms:DEPENDENT  <http://identifiers.org/hgnc/2674> .
grep -r 'ENSG00000196730' linksets/
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_insdc.inferred_from_translation.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_ncbigene.dependent.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_refseq.inferred_from_transcript.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_uniprot.inferred_from_translation.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_pdb.inferred_from_translation.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_unigene.dependent.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_ena.embl.inferred_from_translation.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_wikigenes.dependent.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_hgnc.dependent.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_ccds.inferred_from_transcript.LS.ttl:ensembl:ENSG00000196730
/home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_refseq.inferred_from_translation.LS.ttl:ensembl:ENSG00000196730
grep 'ENSG00000196730' /home/ubuntu/d/linksets/ops-ensembl-homosapiens-linksets/Ensembl_Hs_uniprot.inferred_from_translation.LS.ttl
ensembl:ENSG00000196730
        ensemblterms:INFERRED_FROM_TRANSLATION
                <http://identifiers.org/uniprot/F8WCQ3> , <http://identifiers.org/uniprot/P53355> .

Author

danidi commented May 23, 2018

I would assume that the original issue is still present (the related GitHub issues openphacts/IdentityMappingService#14 and #319 are still open, at least). Of course CW is no longer used, but ideally UniProt, at least, should work.

I was wondering whether it is a problem that the Ensembl linksets use dependent and direct as predicates (rather than exact or close match). As far as I am aware, #319 was not solved. It might be that in earlier versions of the IMS the mappings were provided by CW linksets, which circumvented the Ensembl linksets.
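
To make that hypothesis concrete, here is a toy sketch of it. It is not how the IMS is implemented. It only uses the three links found in the grep output above, and the `skos:exactMatch` / `skos:closeMatch` labels are my assumption for what "exact or close match" would look like. The function `reachable` is a made-up helper that follows links transitively, but only across an allowed set of predicates.

```python
from collections import deque

DEPENDENT = "ensemblterms:DEPENDENT"
INFERRED = "ensemblterms:INFERRED_FROM_TRANSLATION"

HGNC_2674 = "http://identifiers.org/hgnc/2674"
ENSG = "ensembl:ENSG00000196730"

# The three links reported by grep in the comment above.
LINKS = [
    (ENSG, DEPENDENT, HGNC_2674),
    (ENSG, INFERRED, "http://identifiers.org/uniprot/F8WCQ3"),
    (ENSG, INFERRED, "http://identifiers.org/uniprot/P53355"),
]


def reachable(start, links, allowed_predicates):
    """Return every identifier reachable from start over the allowed predicates.

    Hypothetical helper: links are treated as symmetric, which is an
    assumption of this sketch and not a statement about the IMS.
    """
    neighbours = {}
    for subject, predicate, obj in links:
        if predicate in allowed_predicates:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Following the Ensembl predicates, HGNC:2674 reaches Ensembl and both UniProt entries.
with_ensembl_predicates = reachable(HGNC_2674, LINKS, {DEPENDENT, INFERRED})

# Following only exact/close match predicates (assumed names), nothing is reachable.
match_only = reachable(HGNC_2674, LINKS, {"skos:exactMatch", "skos:closeMatch"})

print(sorted(with_ensembl_predicates))
print(sorted(match_only))
```

If the mapping service really does restrict transitive mappings by predicate in some such way, that would be consistent with HGNC:2674 not reaching UniProt even though the links are present in the linkset files, but that still needs to be confirmed.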
