Subject metadata #11

Closed
CodyCBakerPhD opened this issue Apr 1, 2024 · 10 comments

@CodyCBakerPhD
Member

@emosb Related to the general question in #10, but there are many other fields NWB would like to have about each of the subjects used in the dataset

For each subject, we would need to have

  • subject_id - if only one subject was used per day I suppose this could also just be the date format string - and if more than one per day, I suppose we could append a counter to the end of that date string
  • sex of each subject (or let me know if all XX or XO with no variability)
  • age or date_of_birth (as ISO), or a range if the exact value is not known on the individual subject level

We would also appreciate

  • growth_stage
  • growth_stage_time
  • cultivation_temp
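
A minimal sketch of the `subject_id` scheme suggested in the list above (date string, plus a counter when more than one subject was used per day). The function name `make_subject_id`, the `%y%m%d` format, and the underscore separator are assumptions for illustration only, not anything fixed by NWB or by the lab.

```python
from datetime import date
from typing import Optional


def make_subject_id(session_date: date, counter: Optional[int] = None) -> str:
    """Build a subject_id from the session date.

    Assumption: the date is rendered as %y%m%d and, when several subjects
    share a day, a counter is appended after an underscore.
    """
    base = session_date.strftime("%y%m%d")
    if counter is None:
        # Only one subject that day: the date string alone is the ID.
        return base
    return f"{base}_{counter}"


if __name__ == "__main__":
    print(make_subject_id(date(2024, 4, 1)))     # single subject that day
    print(make_subject_id(date(2024, 4, 1), 2))  # second subject that day
```

Keeping the counter optional means single-subject days need no special handling.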
@CodyCBakerPhD CodyCBakerPhD self-assigned this Apr 1, 2024
@emosb

emosb commented Apr 1, 2024

Generally somewhere between 2 and 5 subjects are taken per day, so an explicit counter or subject_id could be added. Unless otherwise stated, the sex is Hermaphrodite. I put a "Male" tag in the logbook for all of my data on male worms - I can put the specific path to this information in a later comment. Age is included in the same logbook for all subjects. Cultivation temperature for all subjects is 20 degrees Celsius. Could you please clarify the growth_stage fields, as I am unsure how they differ from age?

@CodyCBakerPhD
Member Author

CodyCBakerPhD commented Apr 1, 2024

Cool, sounds like I just need this logbook then

The descriptions of those last fields are

  • growth_stage: Growth stage of C. elegans. One of two-fold, three-fold, L1-L4, YA, OA, dauer, post-dauer L4, post-dauer YA, post-dauer OA
  • growth_stage_time: amount of time in current growth stage
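
As a rough sketch, the allowed `growth_stage` values listed above could be checked before they are written into a file. The names `GROWTH_STAGES` and `validate_growth_stage` are hypothetical, and expanding "L1-L4" into L1, L2, L3, L4 is an assumption about how that range is meant to be read.

```python
# Allowed values taken from the field description above.
# Assumption: "L1-L4" is shorthand for the four larval stages L1, L2, L3, L4.
GROWTH_STAGES = frozenset(
    {
        "two-fold",
        "three-fold",
        "L1",
        "L2",
        "L3",
        "L4",
        "YA",
        "OA",
        "dauer",
        "post-dauer L4",
        "post-dauer YA",
        "post-dauer OA",
    }
)


def validate_growth_stage(value: str) -> str:
    """Return the stripped value if it is a known growth stage, else raise."""
    stage = value.strip()
    if stage not in GROWTH_STAGES:
        raise ValueError(
            f"Unknown growth_stage {value!r}; expected one of {sorted(GROWTH_STAGES)}"
        )
    return stage
```

Failing loudly on an unknown value makes typos in user-entered logbook fields visible early instead of carrying them into the converted files.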

@emosb

emosb commented Apr 9, 2024

The logbook.txt file should be included now in the Globus data. The logbook.txt reflects the entire day of recordings (i.e. multiple subjects). A new log is appended each time the multicolor program is run to get the multicolorworm_###### data, with user input data like the strain name, days on dex, and growth stage of worm. We do not record the amount of time in the current growth stage, but for us YA indicates at most one day post L4 (for hermaphrodites we look for non-eggy adults, indicating less than 16 hours since their L4 molt).

@CodyCBakerPhD
Member Author

@emosb Excellent, thank you - this looks good to me

One question - as you continue to upload data for all the subjects used in the paper, can the outer folders follow the convention I see on the D:\Data drive mentioned in the logs? (that is, %y%m%d/...)

This will make it easier to iterate over sessions during the conversion
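
To illustrate why the `%y%m%d` convention helps, here is a minimal sketch of iterating over session folders by date. The function name `iter_session_folders` and the idea of a single root directory holding all date folders are assumptions; the actual conversion code may be organized differently.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Iterator, Tuple


def iter_session_folders(root: Path) -> Iterator[Tuple[date, Path]]:
    """Yield (session_date, folder) for every %y%m%d-named folder under root."""
    for child in sorted(root.iterdir()):
        name = child.name
        # Require exactly six digits so loosely matching names are not accepted.
        if not (child.is_dir() and len(name) == 6 and name.isdigit()):
            continue
        try:
            session_date = datetime.strptime(name, "%y%m%d").date()
        except ValueError:
            # Six digits that do not form a valid date (e.g. month 13).
            continue
        yield session_date, child


if __name__ == "__main__":
    # Hypothetical usage: list date folders in the current directory.
    for session_date, folder in iter_session_folders(Path(".")):
        print(session_date.isoformat(), folder)
```

Anything that does not parse as a date is skipped, so stray files or differently named folders do not break the loop.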

@CodyCBakerPhD
Member Author

@emosb Pinging on this again - any progress on uploading the data for the other subjects used in the paper?

@emosb

emosb commented Apr 29, 2024

I will start transferring other subjects - for the moment I will restrict to the hermaphrodites that are included in the Nature data. I will name the outer folders with the %y%m%d convention as well, with the subject number indicated within the inner folders. As they are transferred, please let me know if you want me to modify the naming convention for the subject folders.

@emosb

emosb commented Apr 29, 2024

To clarify, do you still want these to be partitioned into raw and post-processed files? This is the most time-limiting step for me. The original format of the data (and the quickest to transfer) has the %y%m%d format for the outer folder already, but does not have inner folders labeled with subject_id.

@CodyCBakerPhD
Member Author

> To clarify, do you still want these to be partitioned into raw and post-processed files? This is the most time-limiting step for me. The original format of the data (and the quickest to transfer) has the %y%m%d format for the outer folder already, but does not have inner folders labeled with subject_id.

I suppose just transfer 'as-is' for now with minimal alterations. As long as both the raw and post-processed data are included under some indicator of which session they belong to, I can hopefully figure it out, or ask for more details if not.

@emosb

emosb commented Apr 30, 2024

Alright, I have transferred another 4 subjects - labeled with subject_id_#, where the # corresponds to the order of the published data. Each subject_id folder contains the multicolor and pumpprobe folders for the given subject, though not partitioned into raw and preprocessed. I will continue transferring further data - but let me know if there are alterations to the formatting or clarification needed.
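
For reference, a layout like the one described above could be indexed roughly as follows. This is only a sketch: the function name `index_subject_folders` is hypothetical, and it assumes the subject folders are literally named `subject_id_<number>` and that the inner folder names begin with `multicolor` or `pumpprobe`.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: subject folders are named exactly "subject_id_<number>".
SUBJECT_FOLDER = re.compile(r"subject_id_(\d+)")
# Assumption: inner folder names start with one of these prefixes.
KINDS = ("multicolor", "pumpprobe")


def index_subject_folders(parent: Path) -> Dict[int, Dict[str, List[Path]]]:
    """Map each subject number to its multicolor and pumpprobe folders."""
    index: Dict[int, Dict[str, List[Path]]] = {}
    for child in sorted(parent.iterdir()):
        match = SUBJECT_FOLDER.fullmatch(child.name)
        if not (child.is_dir() and match):
            continue
        found: Dict[str, List[Path]] = {kind: [] for kind in KINDS}
        for sub in sorted(child.iterdir()):
            for kind in KINDS:
                if sub.is_dir() and sub.name.startswith(kind):
                    found[kind].append(sub)
        index[int(match.group(1))] = found
    return index
```

A subject whose list for either kind comes back empty would be a natural candidate to ask about before conversion.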

@CodyCBakerPhD
Member Author

Will be solved via the upcoming YAML compilation in #16
