A researcher wrote in with the following:
I am using the WholeGenomeReprocessing pipeline to process some large BAM files (100GB+ each). In particular, the SortSam task in the pipeline runs slowly. I noticed the pipeline uses Picard SortSam, which is not multithreaded, instead of something like samtools sort, which can use multiple cores/threads. Is there a reason for this choice, and would it be possible to change this step of the pipeline to use a multithreaded option?
Hi, thanks for posting! I'm raising this question because, for large BAM files, the sorting step currently takes by far the most time. With 4-8 cores, the walltime of the pipeline could probably be cut in half!
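For illustration, here is a rough sketch of what a multithreaded sort call might look like with samtools. This is only an assumption about a possible replacement, not the pipeline's actual SortSam task: the helper name `samtools_sort_command`, the file names, the thread count, and the per-thread memory value are all hypothetical. It relies on the `samtools sort` options `-@` (threads), `-m` (approximate memory per thread), and `-o` (output file); coordinate order is the samtools default.

```python
import shlex


def samtools_sort_command(input_bam, output_bam, threads=4, mem_per_thread="2G"):
    """Build a `samtools sort` command line as a list of arguments.

    Hypothetical helper for illustration; it only builds the command
    and does not run samtools.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "samtools", "sort",
        "-@", str(threads),    # number of sorting/compression threads
        "-m", mem_per_thread,  # approximate memory per thread
        "-o", output_bam,      # sorted output file
        input_bam,             # unsorted input file
    ]


# Example with placeholder file names and 8 threads.
cmd = samtools_sort_command("input.bam", "sorted.bam", threads=8)
print(shlex.join(cmd))
```

If samtools is installed, the list could be passed to `subprocess.run(cmd, check=True)`. Total memory use is roughly `threads` times `mem_per_thread`, so both values would need to fit the resources given to the task.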