:::warning
cd /work/username
mkdir SV
cd SV
rsync -avzP /work/u2499286/SV/format_change.py ./
rsync -avzP /work/u2499286/SV/HG002_SVs_Tier1_noVDJorXorY_v0.6.2_hs38DH.bed ./
rsync -avzP /work/u2499286/SV/manta_modified.sh ./
rsync -avzP /work/u2499286/SV/thalassemia_pipeline_course/demo/NGS1_20170103B.hs38DH.dedup.postalt.sorted.BQSR_chr16.bam ./
rsync -avzP /work/u2499286/SV/thalassemia_pipeline_course/demo/NGS1_20170103B.hs38DH.dedup.postalt.sorted.BQSR_chr16.bam.bai ./
:::
:::info
What is Manta?

Manta is a tool for detecting genomic structural variants (SVs), including large insertions, deletions, rearrangements, inversions, and other complex variant types. It was developed by Illumina for high-throughput sequencing (HTS) data.

What are structural variants (SVs)?

- Structural variants are changes in chromosomal DNA structure, relative to the reference genome, that are larger than 50 base pairs (bp). They include insertions, deletions, inversions, and duplications, and can have profound effects on gene function and on the health of an organism.
:::
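To make the size threshold concrete, here is a minimal Python sketch that reads the `SVLEN` value from a VCF INFO string and checks it against 50 bp. The function names `sv_length` and `is_sv_sized` and the example INFO strings are made up for illustration and are not part of the course scripts. Whether a variant of exactly 50 bp counts is a matter of convention; this sketch includes it.

```python
from typing import Optional


def sv_length(info: str) -> Optional[int]:
    """Return the absolute SVLEN from a VCF INFO string, or None if absent."""
    for item in info.split(";"):
        key, _, value = item.partition("=")
        if key == "SVLEN":
            # SVLEN is negative for deletions; use the first value if several are listed.
            return abs(int(value.split(",")[0]))
    return None


def is_sv_sized(info: str, min_len: int = 50) -> bool:
    """True when the record carries an SVLEN of at least min_len bp (assumed cutoff)."""
    length = sv_length(info)
    return length is not None and length >= min_len


# Hypothetical INFO strings, not taken from the course data.
print(is_sv_sized("END=20000;SVTYPE=DEL;SVLEN=-19000"))
print(is_sv_sized("SVTYPE=INS;SVLEN=12"))
```

Records without an `SVLEN` entry (for example breakends) return `None` from `sv_length`, so they are not classified by size here.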
- Open `manta_modified.sh` and update the file with the correct path names.
- Execute `manta_modified.sh` by submitting it as a batch job: `sbatch manta_modified.sh`
- The job produces two folders, `raw` and `filter`, containing the following files:
- candidateSV.vcf.gz
- candidateSV.vcf.gz.tbi
- candidateSmallIndels.vcf.gz
- candidateSmallIndels.vcf.gz.tbi
- diploidSV.vcf
- diploidSV.vcf.gz.tbi

File names containing "pass": only records with PASS in the FILTER field are kept.

File names containing "recode": only records inside the confident regions are kept.

- thalassemia_hg38_manta.vcf: all records, without filtering.
- thalassemia_hg38_manta_del_pass.vcf
- thalassemia_hg38_manta_del_pass_filtered.recode.vcf
- thalassemia_hg38_manta_ins_pass.recode.vcf
- thalassemia_hg38_manta_ins_pass.vcf
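The two naming rules can be illustrated with a small Python sketch. This is not the logic of `manta_modified.sh`, which presumably relies on dedicated VCF tools; the function names, the toy records, and the single BED interval below are all hypothetical. A record is treated as inside a region when its POS falls in the interval, which is one simple convention and an assumption here.

```python
def keep_pass(vcf_lines):
    """Keep header lines and records whose FILTER column (7th) is PASS."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#") or line.split("\t")[6] == "PASS":
            kept.append(line)
    return kept


def keep_in_regions(vcf_lines, regions):
    """Keep header lines and records whose POS lies inside a BED interval.

    regions: list of (chrom, start, end) in BED coordinates (0-based, half-open).
    VCF POS is 1-based, so POS p is inside when start < p <= end.
    """
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        chrom, pos = line.split("\t")[:2]
        if any(c == chrom and s < int(pos) <= e for c, s, e in regions):
            kept.append(line)
    return kept


# Toy VCF content, invented for this example.
demo = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr16\t1000\tdel1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;SVLEN=-500",
    "chr16\t5000\tdel2\tN\t<DEL>\t.\tMinQUAL\tSVTYPE=DEL;SVLEN=-80",
    "chr16\t9000\tins1\tN\t<INS>\t.\tPASS\tSVTYPE=INS;SVLEN=60",
]

passed = keep_pass(demo)                                    # the "pass" step
confident = keep_in_regions(passed, [("chr16", 0, 2000)])   # the "recode" step
print(len(passed), len(confident))
```

Keeping the header lines in both steps means each intermediate result is still a valid VCF that can be opened in IGV.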
The sample used in class comes from a thalassemia carrier. A large deletion can be observed at the positions of the HBA1 and HBA2 genes on chromosome 16.
:::warning
Command to open IGV:

sh /opt/ohpc/Taiwania3/pkg/biology/IGV/IGV_v2.10.3/igv.sh
:::
- Select the input file:
  - Genome: hg38
  - File: NGS1_20170103B.hs38DH.dedup.postalt.sorted.BQSR_chr16.bam
  - Chromosome: chr16
  - Position: `chr16:162,130-189,352`
- Right-click the alignment track and select:
  - Expanded view
  - View as pairs
  - Color alignments by -> insert size and pair orientation
  - Sort alignments by -> insert size
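IGV accepts the locus with thousands separators, but many command-line tools expect plain integers. As a small convenience sketch (the helper name `parse_locus` is hypothetical), the locus string used above can be converted and its window width computed like this:

```python
def parse_locus(locus: str):
    """Split an IGV-style locus such as 'chr16:162,130-189,352' into (chrom, start, end)."""
    chrom, _, span = locus.partition(":")
    start_text, _, end_text = span.partition("-")
    # Remove the thousands separators before converting to integers.
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    return chrom, start, end


chrom, start, end = parse_locus("chr16:162,130-189,352")
plain = f"{chrom}:{start}-{end}"   # region string without commas
width = end - start + 1            # both ends are inclusive
print(plain, width)                # prints: chr16:162130-189352 27223
```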