Error: Non-concatenating --pmerge[-list] is under development.
#232
Comments
The error message means exactly what it says: this feature isn't implemented in plink2 yet. ("Concatenating" merge refers to the "bcftools concat" use case, though plink2's behavior differs a bit from bcftools's here.) Use e.g. bcftools or plink 1.9 to merge for now. |
Are you sure? As of March 13th, we were able to use plink2 to concatenate data sets. Here is a log of a working example:
PLINK v2.00a3.7LM AVX2 Intel (24 Oct 2022)
Options in effect:
--out ukb24068_c5_merged_sample_filtered
--pfile ukb24068_c5_b1_merged_sample_filtered
--pmerge-list chr5_list
Hostname: 80b217465abd
Working directory: /home/ubuntu/exome_pgen
Start time: Mon Mar 13 15:49:53 2023
Random number seed: 1678722593
63628 MiB RAM detected; reserving 31814 MiB for main workspace.
Using up to 16 threads (change this with --threads).
--pmerge-list: 19 filesets specified (including main fileset).
--pmerge-list: 422625 samples present.
--pmerge-list: Merged .psam written to ukb24068_c5_merged_sample_filtered.psam
.
--pmerge-list: 19 .pvar files scanned, headers merged.
Concatenation job detected.
Concatenating... 747813/747813 variants complete.
Results written to ukb24068_c5_merged_sample_filtered.pgen +
ukb24068_c5_merged_sample_filtered.pvar .
End time: Mon Mar 13 15:51:11 2023
However, we see this same error for 2 of our chromosomes, and we are not sure why yet. The same code is run in a loop. Here is the log of a failing run:
PLINK v2.00a3.7LM AVX2 Intel (24 Oct 2022)
Options in effect:
--out ukb24068_c8_merged_sample_filtered
--pfile ukb24068_c8_b1_merged_sample_filtered
--pmerge-list chr8_list
Hostname: 80b217465abd
Working directory: /home/ubuntu/exome_pgen
Start time: Mon Mar 13 15:54:05 2023
Random number seed: 1678722845
63628 MiB RAM detected; reserving 31814 MiB for main workspace.
Using up to 16 threads (change this with --threads).
--pmerge-list: 15 filesets specified (including main fileset).
--pmerge-list: 422625 samples present.
--pmerge-list: Merged .psam written to ukb24068_c8_merged_sample_filtered.psam
.
--pmerge-list: 15 .pvar files scanned, headers merged.
Error: Non-concatenating --pmerge-list is under development.
End time: Mon Mar 13 15:54:10 2023
@gulumk for visibility |
When two variants share a position, --pmerge-list uses the --sort-vars setting (https://www.cog-genomics.org/plink/2.0/data#sort_vars ) to determine their output order. In particular, if the end of one .pvar and the beginning of the next have variants at the same position, and their IDs are in the wrong order, --pmerge-list can no longer "concatenate". I will update the online documentation today to spell this out. |
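For anyone who wants to check their inputs before running the merge, here is a minimal sketch of the boundary condition described above. It assumes each .pvar has the usual #CHROM, POS, ID leading columns and that the files are passed in the same order as in the --pmerge-list file. The helper names (parse_pvar_lines, boundary_problems) are made up for illustration and are not part of plink2. The ID comparison is plain string comparison, which is only an assumption standing in for whatever --sort-vars mode is in effect, so treat a "same position" report as a prompt to look at the files rather than a verdict.

```python
from typing import Iterable, Iterator, List, Tuple

Variant = Tuple[str, int, str]  # (chromosome, position, variant ID)


def parse_pvar_lines(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield (chrom, pos, id) from .pvar-style lines, skipping '#' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        yield fields[0], int(fields[1]), fields[2]


def boundary_problems(
    filesets: List[Tuple[str, List[Variant]]]
) -> List[Tuple[str, str, str]]:
    """Compare the last variant of each file with the first variant of the next.

    Returns (file_a, file_b, reason) for every boundary that looks like it
    would prevent a plain concatenation.
    """
    problems = []
    for (name_a, vars_a), (name_b, vars_b) in zip(filesets, filesets[1:]):
        if not vars_a or not vars_b:
            continue
        last_chrom, last_pos, last_id = vars_a[-1]
        first_chrom, first_pos, first_id = vars_b[0]
        if last_chrom != first_chrom:
            continue  # chromosome changes are not checked in this sketch
        if first_pos < last_pos:
            problems.append((name_a, name_b, "position decreases"))
        elif first_pos == last_pos and first_id < last_id:
            # Assumption: string order approximates the --sort-vars ID order.
            problems.append((name_a, name_b, "same position, IDs out of order"))
    return problems
```

In practice you would read each .pvar with open(path) and pass list(parse_pvar_lines(fh)) for every file; only the first and last records of each file matter for this check.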
I see, thank you for the quick reply. Would you say that inspecting the heads and tails of the |
|
Thanks @chrchang , we were able to resolve our issue |
We have seen plink2 fail to concatenate multiple chunks produced by a liftover operation. We found the issue to be caused by lifting over in splits within chromosomes: some hg19->hg38 SNP positions fall very far from their neighbouring SNPs. This breaks the increasing position order within a chromosome, and some variants then belong in other splits. To work around this, we perform the merge with the plink 1.9 --merge-list function. To see more about the issue -> chrchang/plink-ng#232
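A rough way to spot the variants described above is to scan a lifted-over chunk and flag every record whose position is lower than one already seen on the same chromosome. This is only a sketch: the function name out_of_order_variants is hypothetical, and the input is assumed to be (chromosome, position, ID) tuples in file order, for example as read from a .pvar or .bim file.

```python
from typing import Dict, Iterable, List, Tuple

Variant = Tuple[str, int, str]  # (chromosome, position, variant ID)


def out_of_order_variants(variants: Iterable[Variant]) -> List[Variant]:
    """Return variants whose position is below the running maximum
    position already seen on the same chromosome."""
    max_pos: Dict[str, int] = {}
    flagged: List[Variant] = []
    for chrom, pos, vid in variants:
        if chrom in max_pos and pos < max_pos[chrom]:
            flagged.append((chrom, pos, vid))
        else:
            max_pos[chrom] = pos
    return flagged
```

One caveat with this approach: if the misplaced variant jumps far ahead rather than backwards, it raises the running maximum and every following variant gets flagged instead, so a long run of flagged records usually points at the one record just before the run.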
@myz540 Hi Mike, would you mind sharing the code you used to address this issue? I have run into the same issue and would really appreciate your help. |
Are there any updates on this? |
Hey @123huynguyen, I would love to help but this was at an old job so I no longer have access to the code base or the context required to provide you a solution. I believe the issue was in the sorting, when we inspected the |
Hi,
running Plink (v2.00a4LM AVX2 Intel) errors out when merging multiple datasets.
Contents of input_sources.txt:
test3 and test4 have been generated from VCF files:
I'm a newbie with Plink and suspect I'm doing something wrong, but after some digging I've found no clue.
System specs: CentOS 7.9, Intel(R) Xeon(R) Silver 4210R