addindel.py error: Could not pile up over region - no successful mutations #186

Open
gtollefson opened this issue Sep 1, 2021 · 3 comments

Comments

@gtollefson

gtollefson commented Sep 1, 2021

Hi @adamewing

I am attempting to insert a short indel sequence into my hg38-aligned BAM file, which uses the chr# convention for chromosome names. However, I get a warning and an error stating that the program could not pile up over the given region. I've pasted the error message, the command I'm running, and the BED file I'm using to add this mutation with addindel.py below. I've also added a screenshot of my BAM file formatting. Can you help me troubleshoot?

Running code:

python3 my_path/addindel.py -v my_path/BRCA2_COSV66457558.bed -f my_path/wgsim_reference_sorted.bam -r my_path/hg38.fa -o my_path/BRCA2_added_wgsim_reference_sorted.bam

Bedfile contents:

chr13 32332699 32332700 0.50 INS GCCAGG
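For reference, the fields of that line can be checked with a minimal Python sketch. The field order (chromosome, start, end, allele fraction, INS/DEL, inserted sequence) is an assumption read off the line above, and `IndelSpec` and `parse_indel_line` are hypothetical names for illustration, not part of addindel.py.

```python
from typing import NamedTuple, Optional


class IndelSpec(NamedTuple):
    """One line of the indel BED file (field layout assumed from the post)."""
    chrom: str
    start: int
    end: int
    vaf: float
    kind: str
    seq: Optional[str]


def parse_indel_line(line: str) -> IndelSpec:
    # Split on any whitespace so both tab- and space-separated lines work.
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    chrom, start, end, vaf, kind = fields[:5]
    kind = kind.upper()
    if kind not in ("INS", "DEL"):
        raise ValueError(f"unknown mutation type: {kind}")
    seq = fields[5] if len(fields) > 5 else None
    if kind == "INS" and seq is None:
        raise ValueError("INS line is missing the inserted sequence")
    return IndelSpec(chrom, int(start), int(end), float(vaf), kind, seq)


spec = parse_indel_line("chr13 32332699 32332700 0.50 INS GCCAGG")
print(spec)
```

This only confirms that the line is well formed; it says nothing about whether reads cover the target position.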

Error message:

WARNING 2021-09-01 16:41:13,722 could not pile up over region: haplo_chr13_32332699_32332700
ERROR 2021-09-01 16:41:13,728 no succesful mutations

BAM file formatting (to view it, you will need to click the image and zoom in since it is very wide):
[Screenshot: Screen Shot 2021-09-01 at 4 53 56 PM]

Thank you very much for your help,
George

@adamewing
Owner

Hi George,

Sorry for the slow response. What does the output look like if you run the following command:

samtools mpileup -r chr13:32332699-32332700 my_path/wgsim_reference_sorted.bam

@gtollefson
Author

gtollefson commented Sep 3, 2021

Hi @adamewing,

No worries at all! When I run the above command I get the output below:

command:
[gtollefs@node1344 simulated_sequence_data]$ samtools mpileup -r chr13:32332699-32332700 wgsim_reference_sorted.bam
output:

[mpileup] 1 samples in 1 input files
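To make that result concrete, here is a small Python sketch that reads mpileup text output and reports the depth per position. It assumes the standard tab-separated mpileup columns (chromosome, position, reference base, depth, bases, qualities) and treats lines starting with `[` as log messages; `pileup_depths` is a hypothetical helper name.

```python
from typing import Dict, Tuple


def pileup_depths(mpileup_text: str) -> Dict[Tuple[str, int], int]:
    """Map (chrom, pos) to read depth for each pileup line in the text."""
    depths: Dict[Tuple[str, int], int] = {}
    for line in mpileup_text.splitlines():
        # Skip blank lines and log lines such as "[mpileup] 1 samples ...".
        if not line.strip() or line.startswith("["):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue
        chrom, pos, _ref, depth = fields[:4]
        depths[(chrom, int(pos))] = int(depth)
    return depths


# The output pasted above contains only the log line and no pileup rows.
observed = "[mpileup] 1 samples in 1 input files\n"
depths = pileup_depths(observed)
print(f"positions with pileup rows: {len(depths)}")
```

With the output shown above, no pileup rows are found, meaning mpileup reported no reads at either position in the requested region.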

Edit:

I read that the default mpileup ignores non-properly paired reads. I checked the bam file with flagstat and it looks like the reads are properly paired:

flagstat command:
[gtollefs@login005 simulated_sequence_data]$ samtools flagstat wgsim_reference_sorted.bam

flagstat output:

1999993 + 0 in total (QC-passed reads + QC-failed reads)
7 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
1999936 + 0 mapped (100.00% : N/A)
1999986 + 0 paired in sequencing
999993 + 0 read1
999993 + 0 read2
1999330 + 0 properly paired (99.97% : N/A)
1999886 + 0 with itself and mate mapped
43 + 0 singletons (0.00% : N/A)
482 + 0 with mate mapped to a different chr
67 + 0 with mate mapped to a different chr (mapQ>=5)
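As a rough cross-check of these numbers, the sketch below pulls a few counts out of the flagstat text and estimates the genome-wide mean depth. The read length (70 bp) and genome size (about 3.1 Gb) are assumptions that are not stated anywhere in this thread, so substitute the real values; `flagstat_count` is a hypothetical helper name.

```python
import re

# Three lines copied from the flagstat output above.
FLAGSTAT = """\
1999993 + 0 in total (QC-passed reads + QC-failed reads)
1999986 + 0 paired in sequencing
1999330 + 0 properly paired (99.97% : N/A)
"""


def flagstat_count(text: str, label: str) -> int:
    """Return the QC-passed count of the first line whose description starts with label."""
    for line in text.splitlines():
        m = re.match(r"\s*(\d+) \+ \d+ (.*)", line)
        if m and m.group(2).startswith(label):
            return int(m.group(1))
    raise KeyError(label)


total = flagstat_count(FLAGSTAT, "in total")
paired = flagstat_count(FLAGSTAT, "paired in sequencing")
proper = flagstat_count(FLAGSTAT, "properly paired")

proper_pct = 100.0 * proper / paired

# ASSUMPTIONS: neither value comes from the thread.
READ_LENGTH = 70          # bp per read (assumed)
GENOME_SIZE = 3.1e9       # approximate human genome size in bp (assumed)
mean_depth = total * READ_LENGTH / GENOME_SIZE

print(f"properly paired: {proper_pct:.2f}%")
print(f"estimated mean depth: {mean_depth:.3f}x")
```

If the estimated mean depth comes out well below 1x under the real read length, many positions would be expected to have no covering reads, which is one possible explanation for an empty pileup; it is an estimate only and does not confirm the cause.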

@Xiahaohao

Your Python version may be newer than 3.9. Running bamsurgeon with Python 3.7 can solve this problem.
