read haploid dosages with pgenlib #231

Open
23andme-jaredo opened this issue Jan 18, 2023 · 5 comments

Comments

@23andme-jaredo

Is it possible to read haploid dosages with pgenlib.PgenReader?

thanks,

Jared

@chrchang
Owner

chrchang commented Jan 18, 2023

As with the plink .bed format, haploid vs. diploid is not directly encoded in the .pgen. Instead, plink and plink2 divide the encoded values by two when the .bim/.pvar (and on chrX, .fam/.psam) file indicates that we're dealing with haploid data.
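To make the halving concrete, here is a minimal sketch. It is not plink2's code: `scale_dosage` and its `is_haploid` flag are hypothetical names, and the sketch assumes the stored value is a dosage on the diploid 0..2 scale.

```python
def scale_dosage(encoded_dosage, is_haploid):
    """Return the dosage to report for one sample at one variant.

    encoded_dosage: stored value, assumed to be on the diploid 0..2 scale.
    is_haploid: hypothetical flag; in practice it would be derived from the
        chromosome in the .bim/.pvar and, on chrX, the sex in the .fam/.psam.
    """
    if not 0.0 <= encoded_dosage <= 2.0:
        raise ValueError("encoded dosage must lie in [0, 2]")
    # Haploid calls are stored like homozygous diploid calls, so halve them.
    return encoded_dosage / 2.0 if is_haploid else encoded_dosage


print(scale_dosage(2.0, is_haploid=True))
print(scale_dosage(1.5, is_haploid=False))
```

The point is only that ploidy is applied at interpretation time from the sidecar files; nothing in the stored value itself says whether it is haploid.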

@23andme-jaredo
Author

hmmm so I am a bit confused. I have imputed data converted from bcf via:

plink2 --bcf $bcf dosage=HDS --make-pfile

and I can see that the two haploid dosages per individual are stored because I can recover them via:

 plink2 --pfile plink2 --export vcf bgz vcf-dosage=HDS

so I am trying to extract those HDS values via pgenlib

@23andme-jaredo
Author

Maybe I wasn't clear that I meant imputed haploid/phased probabilities, not hard genotypes.

@chrchang
Owner

Oh, sorry, I thought you were referring to e.g. chrX/chrY/chrM.

The PgrGetDp() function in pgenlib_read.h is the simplest one that can return biallelic phased dosages.
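For anyone post-processing the result in Python, the sketch below shows one way a pair of haploid dosages can be recovered from a total dosage plus a phase difference. This is an assumption about the representation, not something confirmed in this thread: it supposes the phased dosage comes back as a fixed-point sum (hap1 + hap2) and a signed difference (hap1 - hap2), with 16384 units per allele. `split_phased_dosage` and `DOSAGE_UNITS_PER_ALLELE` are hypothetical names; check pgenlib_read.h for the actual encoding before relying on this.

```python
# Assumed fixed-point scale (units per 1.0 allele); verify against pgenlib.
DOSAGE_UNITS_PER_ALLELE = 16384


def split_phased_dosage(dosage_sum, dphase_delta, scale=DOSAGE_UNITS_PER_ALLELE):
    """Recover (hap1, hap2) dosages in [0, 1] from a sum and a difference.

    dosage_sum: hap1 + hap2 in fixed-point units (assumed representation).
    dphase_delta: hap1 - hap2 in the same units, signed.
    """
    hap1_twice = dosage_sum + dphase_delta  # equals 2 * hap1 in units
    hap2_twice = dosage_sum - dphase_delta  # equals 2 * hap2 in units
    if min(hap1_twice, hap2_twice) < 0 or max(hap1_twice, hap2_twice) > 2 * scale:
        raise ValueError("sum and difference do not describe dosages in [0, 1]")
    return hap1_twice / (2 * scale), hap2_twice / (2 * scale)


# Total dosage 1.5 with a difference of 0.5 between the two haplotypes.
print(split_phased_dosage(24576, 8192))
```

Under these assumptions the two returned values would correspond to the pair written per sample by `vcf-dosage=HDS`.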

@23andme-jaredo
Author

Thanks! We'll try exposing that in python.
