Filtering of munge_sumstats.py #439
Comments
@yesyj-yuns It is more difficult to detect errors when computing GWAS associations or meta-analyses for strand-ambiguous variants. The simplest strategy to avoid introducing noise into ldsc from such errors is to remove ambiguous variants. For your second question, ldsc flips alleles and effect sizes as necessary to ensure that A1 is ref, A2 is alt, and effect allele is alt. As long as A1 in the input is consistently ref or consistently alt, you will get the correct answer.
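To make "strand ambiguous" concrete: an A/T or C/G SNP reads as the same allele pair on either DNA strand, so a strand mix-up between studies cannot be detected from the alleles alone. The snippet below is only a minimal sketch of such a check, not the code ldsc itself uses; the function name `is_strand_ambiguous` and the `COMPLEMENT` table are made up for illustration.

```python
# Complement of each nucleotide on the opposite DNA strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(a1, a2):
    """Return True if the allele pair looks identical on both strands (A/T or C/G).

    Hypothetical helper for illustration; not taken from munge_sumstats.py.
    """
    a1, a2 = a1.upper(), a2.upper()
    # Indels and unrecognized allele codes are not single nucleotides,
    # so this sketch simply does not classify them as ambiguous.
    if a1 not in COMPLEMENT or a2 not in COMPLEMENT:
        return False
    # The pair is ambiguous when one allele is the complement of the other.
    return COMPLEMENT[a1] == a2
```

For example, `is_strand_ambiguous("A", "T")` is true while `is_strand_ambiguous("A", "G")` is false, because A/G becomes T/C on the other strand and can therefore be told apart.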
Thank you so much for your response. I'm sorry to ask again, but I don't yet understand the answer to the second question. If you look at describe_cname inside munge_sumstats.py, the statistics are defined relative to A1, as shown below.
In my GWAS file, A1 is the effect allele but not the ref allele. In that case, I would like to check whether the A1 allele in the input GWAS summary file should be set to the reference allele, and whether the statistics (beta, OR) of my GWAS should also be multiplied by -1 to match the ref allele. Thank you once again.
I misspoke because I didn't read the source code correctly: you do need to make sure that A1 is ref and multiply beta by -1 accordingly.
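As a rough illustration of what swapping the alleles involves, the sketch below exchanges A1 and A2 in a single row and adjusts the signed statistic to match. It is not part of ldsc; the function name `swap_alleles` and the column names `A1`, `A2`, `BETA`, and `OR` are assumptions about how your file is laid out. Note that a beta is negated, whereas an odds ratio is inverted (1/OR), since that is what negating the log odds corresponds to.

```python
def swap_alleles(record):
    """Return a copy of one summary-statistics row with A1 and A2 exchanged.

    Hypothetical helper; column names are assumptions, not ldsc requirements.
    """
    out = dict(record)
    out["A1"], out["A2"] = record["A2"], record["A1"]
    if "BETA" in out:
        # A linear or log-odds effect changes sign when the effect allele changes.
        out["BETA"] = -record["BETA"]
    if "OR" in out:
        # An odds ratio is inverted rather than negated.
        out["OR"] = 1.0 / record["OR"]
    return out
```

Calling it on `{"SNP": "rs1", "A1": "A", "A2": "G", "BETA": 0.2}` gives a row with A1 = G, A2 = A, and BETA = -0.2; the input row is left unchanged.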
@aksarkar If you don't mind, may I ask why munge_sumstats.py requires A1 to be ref?
@yesyj-yuns LD scores are computed assuming that A1 is ref, and the summary statistics must be consistent with the LD scores.
Hi,
I tried to convert my GWAS data into a sumstats file using munge_sumstats.py.
When you look at https://github.com/bulik/ldsc/wiki/Summary-Statistics-File-Format#sumstats , it says that it filters ambiguous SNPs or non-SNP variants.
Could you please let me know why ambiguous SNPs are filtered when converting GWAS summary statistics to summary.sumstats?
Also, regarding #375: you answered the question there, but I still don't understand the answer, so I'm asking again.
My GWAS summary statistics (logistic regression data) were generated using plink2. In this data, the effect allele, A1, is an alternative allele, not a reference allele.
In this case, could you tell me whether the A1 allele in the input file for munge_sumstats.py should be the reference allele or the effect allele?
Thank you very much:)