Assistance with Bulk Download of Arthropod Species and Traits #118
Apologies for the limited availability of the web interface! We're in the long, tedious process of purchasing our next set of servers. I think your easiest access for such a large clade will be one of our bulk data compilations. Have a look at the "all trait data" archive (https://zenodo.org/records/13305577). Let me know if that won't meet your needs. :)
|
Thank you very much for the information. I have successfully downloaded the
"all trait data" but am encountering some difficulty interpreting it. Could
you please confirm if my understanding of the information in these files is
correct?
I assume the traits.csv file contains scientific names along with their
corresponding traits, where available. I extracted the records for Xestia
sexstrigata from traits.csv (see attached). I believe that eol_pk
corresponds to the trait ID and page_id to the taxon ID. The value_uri
column seems to contain the trait name and its associated value ID, which
can be retrieved from terms.csv.
Could you help clarify what the other columns represent, such as
resource_pk, resource_id, and predicate? Additionally, while terms.csv
includes many traits, I found only 13 records for Xestia sexstrigata in
traits.csv. Does this mean that only these traits were collected for Xestia
sexstrigata, or might I have missed others? My goal is to gather as many
traits as possible, including predator-prey relationships, toxicity, body
size, coloration, wing morphology, life span, mouthparts, reproductive
rate, habitat preference, and developmental stages. Do you have any
suggestions on how to best approach this?
Thanks a lot!
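The reading of traits.csv described above can be checked with a short script. This is only a minimal sketch: it assumes terms.csv has `uri` and `name` columns (an assumption, so check the header of your copy), and the page ID, term URIs, and sample rows are hypothetical stand-ins for the real archive contents.

```python
import csv
import io

# Hypothetical excerpts; the real files come from the "all trait data" archive.
# The terms.csv column names (uri, name) are assumptions, not confirmed here.
TRAITS_CSV = """eol_pk,page_id,resource_pk,resource_id,predicate,value_uri
T1,1001,r-1,42,http://example.org/term/habitat,http://example.org/term/forest
T2,1001,r-2,42,http://example.org/term/diet,http://example.org/term/herbivore
T3,2002,r-3,43,http://example.org/term/habitat,http://example.org/term/grassland
"""
TERMS_CSV = """uri,name
http://example.org/term/habitat,habitat
http://example.org/term/diet,diet
http://example.org/term/forest,forest
http://example.org/term/herbivore,herbivore
"""


def load_terms(handle):
    """Map each term URI to its human-readable name."""
    return {row["uri"]: row["name"] for row in csv.DictReader(handle)}


def traits_for_page(handle, page_id, terms):
    """Return (trait id, predicate label, value label) for one taxon page."""
    records = []
    for row in csv.DictReader(handle):
        if row["page_id"] != str(page_id):
            continue
        # Fall back to the raw URI when a term is missing from terms.csv.
        predicate = terms.get(row["predicate"], row["predicate"])
        value = terms.get(row["value_uri"], row["value_uri"])
        records.append((row["eol_pk"], predicate, value))
    return records


terms = load_terms(io.StringIO(TERMS_CSV))
page_traits = traits_for_page(io.StringIO(TRAITS_CSV), 1001, terms)
for record in page_traits:
    print(record)
```

To run this against the downloaded files, replace each `io.StringIO(...)` with something like `open("traits.csv", newline="", encoding="utf-8")`.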
On Tue, 12 Nov 2024 at 06:51, Jen Hammock wrote:
Apologies for the availability of the web interface! We're in the long,
tedious process of purchasing our next set of servers.
I think your easiest access for such a large clade will be one of our bulk
data compilations. Have a look at the "all trait data
<https://zenodo.org/records/13305577>" archive. Let me know if that won't
meet your needs. :)
|
You're on the right track! A "resource" is a dataset, generally derived from a single open access third party and put into our import format. If you strike one that looks promising for your work, those are also available for download (example: https://zenodo.org/records/14121663).
The "value" is an important part of the trait and is often sufficient to deduce the whole fact, but if it were "blue" or "12 cm", you'd also need the predicate to interpret it: "flower color", or "leg length", etc. All our trait records have a predicate and a value.
Trophic relationships are all from the Relations Ontology and will have predicates like http://purl.obolibrary.org/obo/RO_0002470; if you find them awkward to extract from our data model, they nearly all originate from Global Biotic Interactions (https://www.globalbioticinteractions.org/), which has excellent direct data services of its own.
A lot of our trait coverage in Arthropoda comes from taxonomic inference; you'll find those records in the "inferred" file. For example, all insects have wings, except for a long list of small and medium size clades in which they were lost. Each such trait has only one record in our database, but it is attached to a large number of taxa. The "inferred" file maps those trait records to those additional taxa.
You will still find Arthropoda trait-space more gap than content, I'm afraid. It's one of the most thinly populated areas of our knowledge and among the slowest to be digitized and dug out of the literature.
|
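The role of the "inferred" file and the selection of Relations Ontology predicates can be sketched as follows. This is a rough illustration under stated assumptions: the inferred file is assumed to have `page_id` and `inferred_trait` columns, with `inferred_trait` holding the `eol_pk` of a record in traits.csv (check the header of your copy), and all sample rows are hypothetical. The prefix filter selects every Relations Ontology predicate, which may be broader than trophic relationships alone.

```python
import csv
import io
from collections import defaultdict

RO_PREFIX = "http://purl.obolibrary.org/obo/RO_"

# Hypothetical excerpts. The inferred-file column names (page_id,
# inferred_trait) are assumptions for illustration only.
TRAITS_CSV = """eol_pk,page_id,predicate,value_uri
T1,10,http://example.org/term/has_wings,http://example.org/term/yes
T2,20,http://purl.obolibrary.org/obo/RO_0002470,
"""
INFERRED_CSV = """page_id,inferred_trait
21,T1
22,T1
"""


def traits_by_page(traits_handle, inferred_handle):
    """Index trait ids by page, adding the pages listed in the inferred file."""
    traits = {row["eol_pk"]: row for row in csv.DictReader(traits_handle)}
    by_page = defaultdict(list)
    # Direct records: each trait is attached to the page it was recorded on.
    for pk, row in traits.items():
        by_page[row["page_id"]].append(pk)
    # Inferred records: the same trait id applies to additional pages.
    for row in csv.DictReader(inferred_handle):
        if row["inferred_trait"] in traits:
            by_page[row["page_id"]].append(row["inferred_trait"])
    return traits, dict(by_page)


def ro_trait_ids(traits):
    """Return ids of traits whose predicate is a Relations Ontology term."""
    return sorted(
        pk for pk, row in traits.items() if row["predicate"].startswith(RO_PREFIX)
    )


traits, by_page = traits_by_page(
    io.StringIO(TRAITS_CSV), io.StringIO(INFERRED_CSV)
)
ro_ids = ro_trait_ids(traits)
print(by_page)
print(ro_ids)
```

Keeping one trait record and a separate page-to-trait mapping mirrors the description above: the trait is stored once, and the mapping attaches it to many taxa.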
Thank you so much for the info!
On Wed, 13 Nov 2024 at 05:25, Jen Hammock wrote:
You're on the right track! A "resource" is a dataset, generally derived
from a single open access third party and put into our import format. If
you strike one that looks promising for your work, those are also available
for download (example <https://zenodo.org/records/14121663>).
The "value" is an important part of the trait and is often sufficient to
deduce the whole fact, but if it were "blue" or "12 cm", you'd also need
the predicate to interpret it- "flower color", or "leg length", etc. All
our trait records have a predicate and a value.
Trophic relationships are all from the Relations Ontology and will have
predicates like http://purl.obolibrary.org/obo/RO_0002470; if you find
them awkward to extract from our data model, they nearly all originate from Global
Biotic Interactions <https://www.globalbioticinteractions.org/>, which
has excellent direct data services of its own.
A lot of our trait coverage in Arthropoda comes from taxonomic inference;
you'll find those records in the "inferred" file. All insects have wings,
except for a long list of small and medium size clades in which they were
lost. Each such trait has only one record in our database, but it is
attached to a large number of taxa. The "inferred" file maps those trait
records to those additional taxa.
You will still find Arthropoda trait-space more gap than content, I'm
afraid. It's one of the most thinly populated areas of our knowledge and
among the slowest to be digitized and dug out of the literature.
|
Hi
Thank you for maintaining such a valuable and comprehensive resource, which is immensely helpful for my research. I am interested in downloading a comprehensive dataset of all arthropod species along with their available traits from EOL.
However, I am encountering some issues. When I try to browse or access large datasets on the website, I frequently receive an "Internal Server Error (500)." I registered and logged in to get an API token, but that also failed. Could you please advise on the best way to download this data? Is there an alternative method or a direct link for bulk downloads that might help avoid this issue?
Many thanks!