Project Completion QA Checklist: TissueStability #5

Open
2 tasks done
jychien opened this issue Nov 22, 2019 · 3 comments

Comments

jychien commented Nov 22, 2019

Project UUID: c4077b3c-5c98-4d26-a614-246d12c2e5d7
Project Title: Ischaemic sensitivity of human tissue by single cell RNA seq
Project Short Name: TissueStability
Submission UUID: fd52efcc-6924-4c8a-b68c-a299aea1d80f
Environment: Production

This project is also known as the "Meyer dataset" or "TissueSensitivity" and was re-ingested to incorporate additional data from the contributor.

  • Process QA of bundles
  • Metadata tsv validation: Run metadatatsv_validator.py

jychien commented Nov 22, 2019

Process QA found the following issue:

  1. After downloading the mtx for the project, barcodes.tsv was found to contain duplicate barcodes. The Matrix service is currently implementing a change to use cell IDs rather than barcodes in this file (HumanCellAtlas/matrix-service/issues/428).

  2. This project contains bulk RNA-seq and WGS data. For these assay types there are no cell suspensions, since the specimen does not undergo a dissociation protocol, so sequence files are linked directly to the specimen. This experimental graph may be an issue for downstream components and will need to be followed up on before a pipeline is made available in production. (Related Slack thread)
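
The duplicate-barcode finding in item 1 can be reproduced with a check along these lines. This is a minimal sketch, not the script used for Process QA. It assumes barcodes.tsv holds one barcode per line in the first tab-separated column; the function names and example barcodes are hypothetical.

```python
from collections import Counter
from pathlib import Path


def find_duplicate_barcodes(lines):
    """Return {barcode: count} for every barcode that appears more than once."""
    # Assumption: one barcode per line, in the first tab-separated column.
    barcodes = [line.rstrip("\n").split("\t")[0] for line in lines if line.strip()]
    counts = Counter(barcodes)
    return {barcode: n for barcode, n in counts.items() if n > 1}


def check_barcodes_file(path):
    """Read a barcodes.tsv file and report duplicated barcodes."""
    with Path(path).open() as handle:
        return find_duplicate_barcodes(handle)


# Hypothetical barcodes: the first one is repeated on the third line.
example = ["AAACCTGAGAAACCAT\n", "AAACCTGAGAAACCGC\n", "AAACCTGAGAAACCAT\n"]
duplicates = find_duplicate_barcodes(example)
print(duplicates)
```

An empty result means every barcode in the file is unique. Once the file is keyed by cell ID, the same check could be run on that column instead.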

@jahilton
Contributor

validator results...
cell_suspension.cell_morphology.cell_viability_method:Trypan blue; manual haemacytometer should be reviewed for consistency across projects
library_preparation_protocol.input_nucleic_acid_molecule.ontology:OBI:0000869 is polyA RNA extract but input as polyA RNA
library_preparation_protocol.library_construction_method.ontology_label:10X v2 sequencing needs the more specific 3' or 5' term
library_preparation_protocol.library_construction_method.ontology_label:DNA library construction & cDNA library construction don't feel parallel to the other values in this field
project.insdc_project_accessions:ERP114453 does not match pattern
project.insdc_study_accessions:PRJEB31843 does not match pattern
sequence_file.file_core.format:fastq.gz not in ['fastq']
sequencing_protocol.10x.pooled_channels:4.0 does not match pattern (expecting integer)

The polyA RNA issue applies across the board, so it should be decided on for all projects.
All others have been noted in https://github.com/HumanCellAtlas/hca-data-wrangling/issues/355
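
Three kinds of failure appear in the results above: a value that does not match a pattern, a value outside an allowed list, and a number that is not an integer. The sketch below illustrates each kind. The regular expressions and the allowed list are assumptions chosen for illustration (the accession patterns follow common INSDC prefixes); they are not copied from the validator or from the metadata schema, and `check` is a hypothetical helper.

```python
import re

# Assumed rules, chosen only to illustrate the kinds of failure above.
# They are NOT copied from the validator or from the metadata schema.
PATTERNS = {
    "project.insdc_project_accessions": r"PRJ[EDN][A-Z][0-9]+",
    "project.insdc_study_accessions": r"[EDS]RP[0-9]+",
    "sequencing_protocol.10x.pooled_channels": r"[0-9]+",
}
ALLOWED = {
    "sequence_file.file_core.format": ["fastq"],
}


def check(field, value):
    """Return an error message for one field/value pair, or None if it passes."""
    if field in PATTERNS and not re.fullmatch(PATTERNS[field], value):
        return f"{field}:{value} does not match pattern"
    if field in ALLOWED and value not in ALLOWED[field]:
        return f"{field}:{value} not in {ALLOWED[field]}"
    return None


# Field/value pairs as reported in the validator results.
reported = [
    ("project.insdc_project_accessions", "ERP114453"),
    ("project.insdc_study_accessions", "PRJEB31843"),
    ("sequence_file.file_core.format", "fastq.gz"),
    ("sequencing_protocol.10x.pooled_channels", "4.0"),
]

errors = []
for field, value in reported:
    message = check(field, value)
    if message is not None:
        errors.append(message)

for message in errors:
    print(message)
```

Under these assumed patterns, each of the two accession values would match the other field's pattern. The thread does not say whether that is the cause of the two accession errors.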

jychien commented Feb 25, 2020

The paper studies the effects of cold ischaemic time on scRNA-seq analysis. There should be a timecourse module attached to the biomaterial.
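
As a rough illustration of this suggestion, a specimen record with a timecourse attached might look like the sketch below. The field names (`value`, `unit`, `relevance`), the biomaterial ID, and the 12-hour value are all assumptions for illustration; none of them is taken from the submitted metadata or checked against the metadata schema.

```python
import json

# Hypothetical sketch: field names and values are assumptions, not taken
# from the submitted metadata or confirmed against the metadata schema.
specimen = {
    "biomaterial_core": {"biomaterial_id": "specimen_example_1"},
    "timecourse": {
        "value": "12",  # placeholder duration, not a real time point
        "unit": {"text": "hour"},
        "relevance": "cold ischaemic time",
    },
}

print(json.dumps(specimen, indent=2))
```

Recording the duration on the biomaterial would let downstream users group samples by ischaemic time without going back to the paper.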
