Test RSC SureChembl linksets #32

Open
stain opened this issue Jan 21, 2016 · 9 comments
@stain
Contributor

stain commented Jan 21, 2016

@nicklynch said

RSC have created a linkset for a 100K mini dataset for SureChEMBL
Linkset download for 100K http://ops.rsc.org/download/RDF-2016.01.12.zip

There are properties in this, but we should not load that ttl file for now, as we are not sure of the ACD/Labs licensing for it
Would it be possible to load this into Dev IMS so we can start some testing on this and the API calls?

I've made https://github.com/openphacts/ops-rsc-surechembl-dataset that captures this and produces the linksets only at http://repository.mygrid.org.uk/artifactory/ops/org/openphacts/data/ops-rsc-linksets/0.20151104.0-SNAPSHOT/

I've validated these with IMS, and found that they all load, except SURE_CHEMBL/LINKSET_EXACT_SURE_CHEMBL20160112.ttl, e.g.

<http://ops.rsc.org/OPS1931331> skos:exactMatch <http://rdf.ebi.ac.uk/resource/chembl/molecule/SCHEMBL170241> .
<http://ops.rsc.org/OPS1931518> skos:exactMatch <http://rdf.ebi.ac.uk/resource/chembl/molecule/SCHEMBL167446> .
<http://ops.rsc.org/OPS1931950> skos:exactMatch <http://rdf.ebi.ac.uk/resource/chembl/molecule/SCHEMBL175877> .

This is the most important one as it links the SCHEMBL identifiers to the OPS identifiers - thus linking SureChembl to the rest.

The DataSources.txt in IMS needs to be modified to support the new URI pattern for SureChembl. Nothing is currently resolved from those URIs - @agaulton - are they going to remain as those URIs?

@stain self-assigned this Jan 21, 2016
@agaulton

The URIs used in the SureChEMBL RDF should be like: http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL175877 rather than resource/chembl/molecule. So it looks like there is an issue in the linkset? We have no plans to change our URIs in the foreseeable future, though.
Anna
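
As an illustration of the mismatch described above, here is a minimal Python sketch that flags linkset lines whose target falls outside the SureChEMBL namespace. The function name `find_bad_targets` and the line-based regex are assumptions made for this example; a real check would use a proper Turtle parser, and this is not part of IMS.

```python
import re

# Namespace given in the comment above for SureChEMBL RDF molecules.
EXPECTED_PREFIX = "http://rdf.ebi.ac.uk/resource/surechembl/molecule/"

# Matches one simple line of the form: <subject> skos:exactMatch <object> .
TRIPLE = re.compile(r"^<([^>]+)>\s+skos:exactMatch\s+<([^>]+)>\s*\.\s*$")


def find_bad_targets(lines):
    """Return (subject, object) pairs whose object is outside EXPECTED_PREFIX."""
    bad = []
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip prefix declarations, comments and blank lines
        subject, obj = match.groups()
        if not obj.startswith(EXPECTED_PREFIX):
            bad.append((subject, obj))
    return bad


# First line is copied from the linkset excerpt in this issue;
# the second uses the namespace the SureChEMBL RDF is expected to use.
sample = [
    "<http://ops.rsc.org/OPS1931331> skos:exactMatch "
    "<http://rdf.ebi.ac.uk/resource/chembl/molecule/SCHEMBL170241> .",
    "<http://ops.rsc.org/OPS1931950> skos:exactMatch "
    "<http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL175877> .",
]
bad = find_bad_targets(sample)
for subject, obj in bad:
    print(subject, "->", obj)
```

Only the first sample line is reported, because its target still uses resource/chembl/molecule.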

@nicklynch

@batchelorc Any thoughts on linkset comments here?

@stain
Contributor Author

stain commented Jan 22, 2016

Thanks, @agaulton - I'll add the pattern http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL(\d+) (still 404) together with https://www.surechembl.org/chemical/SCHEMBL(\d+)

BTW, if I do

curl -v -H "Accept: application/rdf+xml" https://www.surechembl.org/chemical/SCHEMBL175877

.. I get a 500 Internal Server Error :-( (I guess it should redirect to the rdf.ebi.ac.uk resource, even if that is still 404 Not Found)
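
For illustration, the two URI patterns can be tried out in Python as below. This is only a sketch of the regular expressions themselves; it does not reproduce the DataSources.txt format used by IMS, and the helper name `schembl_number` is made up for this example.

```python
import re

# The two URI patterns discussed above, with the numeric part captured.
PATTERNS = [
    re.compile(r"^http://rdf\.ebi\.ac\.uk/resource/surechembl/molecule/SCHEMBL(\d+)$"),
    re.compile(r"^https://www\.surechembl\.org/chemical/SCHEMBL(\d+)$"),
]


def schembl_number(uri):
    """Return the digits after 'SCHEMBL' if the URI fits a pattern, else None."""
    for pattern in PATTERNS:
        match = pattern.match(uri)
        if match:
            return match.group(1)
    return None


print(schembl_number("http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL175877"))
print(schembl_number("https://www.surechembl.org/chemical/SCHEMBL175877"))
# The old resource/chembl/molecule form is deliberately not matched.
print(schembl_number("http://rdf.ebi.ac.uk/resource/chembl/molecule/SCHEMBL175877"))
```

The placement of the `+` matters: with the group written as `(\d+)` the whole number is captured, whereas in Python a group written as `(\d)+` keeps only the last digit it matched.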

@batchelorc

Hello chaps,

Well spotted! We need to fix this, so I’ll put a JIRA item in before we generate the RDF.

All the best,
Colin.

@agaulton

Yep, we don’t have the SureChembl RDF in the EBI RDF platform yet, so the URIs won’t resolve at present.

Are we using the SureChEMBL interface URLs somewhere in the RDF/Open PHACTS (e.g. https://www.surechembl.org/chemical/SCHEMBL175877)? I.e., is there a reason we should expect this to resolve to RDF?

Cheers
Anna

@danidi
Contributor

danidi commented Jan 22, 2016

From a user perspective, adding the interface URL to the IMS helps to retrieve data from the API. E.g. you could search on the SureChEMBL page and then use this URL directly in the API.

@agaulton

OK that makes sense, I just wasn’t sure if it came from our side...
A

@madgpap

madgpap commented Jan 22, 2016

👍

@stain
Contributor Author

stain commented Jan 22, 2016

Agree with @danidi - also, the www URI will come back from IMS as part of the response when looking up any of the URIs that lead to it through the RSC linkset, which could be useful for users to click on in a browser.

The redirect from www to rdf is just about being a good web citizen once you have the RDF live. I expect it fails with a 500 (rather than 406 Not Acceptable) because you have partially got the redirection functionality there?
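
To make the expected behaviour concrete, here is a rough Python sketch of the content negotiation decision. It is not how the SureChEMBL server is implemented; the function `negotiate`, the set of media types and the choice of 303 See Other for the redirect are all assumptions for illustration, and q-values in the Accept header are ignored.

```python
# Assumed namespace for the RDF resources (taken from this thread).
RDF_BASE = "http://rdf.ebi.ac.uk/resource/surechembl/molecule/"

# Hypothetical sets of media types the interface might recognise.
RDF_TYPES = {"application/rdf+xml", "text/turtle"}
HTML_TYPES = {"text/html", "*/*"}


def negotiate(accept_header, schembl_id):
    """Return (status, location) for a request to the www interface URL."""
    # Drop parameters such as ";q=0.9" and normalise case.
    requested = [
        part.split(";")[0].strip().lower() for part in accept_header.split(",")
    ]
    for media_type in requested:
        if media_type in RDF_TYPES:
            # RDF was asked for: point the client at the RDF resource.
            return 303, RDF_BASE + schembl_id
        if media_type in HTML_TYPES:
            # A browser-style request: serve the normal HTML page.
            return 200, None
    # Nothing we can serve: 406 Not Acceptable rather than a 500.
    return 406, None


print(negotiate("application/rdf+xml", "SCHEMBL175877"))
print(negotiate("text/html,application/xhtml+xml", "SCHEMBL175877"))
print(negotiate("image/png", "SCHEMBL175877"))
```

With this logic, the curl request above would get a redirect to the rdf.ebi.ac.uk URI, and the client would then see the 404 from that server until the RDF is live.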
