neoscan pipeline v1.3

Pipeline for detecting neoantigens from SNVs and indels

Install the third-party software

conda install -c bioconda optitype

In neoscan.pl, change the path to OptiTypePipeline.py so that it points to where you installed optitype.

Usage

perl neoscan.pl --rdir <rdir> --log <log> --bamfq <bamfq> --bed <bed> --rna <rna> --refdir <refdir> --step <step_number>

    <rdir> = full path of the folder holding the files for this sequencing run

    <log> = full path of the folder where log files are saved

    <bamfq> = 1 if the input is BAM, 0 if the input is FASTQ (default: 1)

    <rna> = 1 if the input data is RNA, otherwise the input is DNA; used for HLA genotyping

    <bed> = BED file for annotation. Ensembl: /gscmnt/gc2518/dinglab/scao/db/ensembl38.85/proteome-first.bed

     RefSeq: /gscmnt/gc2518/dinglab/scao/db/refseq_hg38_june29/proteome.bed

    <refdir> = reference directory: /gscmnt/gc2518/dinglab/scao/db/refseq_hg38_june29

    <step_number> = pipeline step to run, so that the pipeline can be run step by step (0 runs the whole pipeline)

Files required in the running directory

vcf file for SNVs with the columns: chromosome, start position, ref allele, alt allele, gene HUGO symbol, HGVS short, and mutation status (somatic or germline). Filename: .snp.vcf

 1       113854971       C       G       PTPN22  p.E207Q Somatic

 1       113900168       C       A       AP4B1   p.A284S Somatic

 1       117623561       C       G       FAM46C  p.I231M Somatic

vcf file for indels with the same columns. Filename: .indel.vcf

 3       161086280       T       -       B3GALNT1        p.T159Pfs*8     Somatic
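
The column layout above can be checked before a run with a small parser. This is a minimal sketch, not part of neoscan: the `Mutation` type and both function names are hypothetical, and it assumes the fields are separated by whitespace and contain no embedded spaces.

```python
from typing import List, NamedTuple


class Mutation(NamedTuple):
    """One row of a .snp.vcf or .indel.vcf input file (hypothetical helper type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str
    hgvs_short: str
    status: str


def parse_mutation_line(line: str) -> Mutation:
    # Assumption: columns are whitespace-separated and never contain spaces.
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    chrom, pos, ref, alt, gene, hgvs_short, status = fields
    return Mutation(chrom, int(pos), ref, alt, gene, hgvs_short, status)


def parse_mutations(text: str) -> List[Mutation]:
    # Blank lines are skipped; every other line must have the seven columns.
    return [parse_mutation_line(line) for line in text.splitlines() if line.strip()]


# Rows taken from the SNV and indel examples in this README.
example = """\
1       113854971       C       G       PTPN22  p.E207Q Somatic
3       161086280       T       -       B3GALNT1        p.T159Pfs*8     Somatic
"""
records = parse_mutations(example)
for record in records:
    print(record.gene, record.hgvs_short, record.status)
```

A row with a missing column raises `ValueError`, which makes malformed input visible before the pipeline is started.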

RNA-Seq or exome BAM or FASTQ file for HLA typing

All three input files should be in one folder. One set of files per sample
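
A quick pre-flight check of a sample folder might look like the sketch below. It is an illustration only: `missing_inputs` is a hypothetical helper, the files are matched by suffix, and the read-file suffixes in `READ_SUFFIXES` are an assumption, since this README does not state the exact names neoscan.pl expects.

```python
from pathlib import Path

# Assumed suffixes for the BAM/FASTQ input; adjust to the naming used in your data.
READ_SUFFIXES = (".bam", ".fastq", ".fq", ".fastq.gz", ".fq.gz")


def missing_inputs(sample_dir):
    """Return a list describing which of the three required inputs are absent."""
    names = [p.name for p in Path(sample_dir).iterdir() if p.is_file()]
    missing = []
    if not any(name.endswith(".snp.vcf") for name in names):
        missing.append("*.snp.vcf")
    if not any(name.endswith(".indel.vcf") for name in names):
        missing.append("*.indel.vcf")
    if not any(name.endswith(READ_SUFFIXES) for name in names):
        missing.append("BAM or FASTQ")
    return missing
```

For example, `missing_inputs("/path/to/sample_folder")` returns an empty list when all three kinds of file are present in that (placeholder) folder.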

Some hints for running the neoscan pipeline on the WU internal MGI cluster

Copy the tool to a folder. This command will create neoscan/ folder in your current folder:
git clone https://github.com/ding-lab/neoscan.git

This is required to be able to do git checkout. It is needed just once:

LSF_DOCKER_PRESERVE_ENVIRONMENT=true bsub -Is -R "select[mem>15000] rusage[mem=15000]" -M 32000000 -q docker-interactive -a "docker(scao/dailybox)" /bin/bash

Install optitype:

conda install -c bioconda optitype

In neoscan.pl, change the path to OptiTypePipeline.py so that it points to where you installed optitype.

To prepare vcf and bam input files, you can follow the example at /gscmnt/gc2524/dinglab/akarpova/cptac3/CCRCC_neoscan_test.

perl /gscmnt/gc2524/dinglab/akarpova/software/neoscan/neoscan.pl --rdir /gscmnt/gc2524/dinglab/akarpova/cptac3/CCRCC_neoscan_test --log /gscmnt/gc2524/dinglab/akarpova/cptac3 --bamfq 1 --bed /gscmnt/gc2518/dinglab/scao/db/refseq_hg38_june29/proteome.bed --rna 1 --refdir /gscmnt/gc2518/dinglab/scao/db/refseq_hg38_june29 --step 1

Then rerun the same command with --step set to 2, 3, 4, and 5 in turn.
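
The per-step commands can be generated rather than edited by hand. The following is a minimal sketch under stated assumptions: `neoscan_command` is a hypothetical helper, the `/path/to/...` values are placeholders, and the option names are the ones listed under Usage. It only prints the commands instead of executing them, so each step can be launched once the previous one has finished.

```python
import shlex


def neoscan_command(step, rdir, log, bed, refdir, bamfq=1, rna=1, script="neoscan.pl"):
    """Build the argument list for one neoscan.pl step (hypothetical helper)."""
    return [
        "perl", script,
        "--rdir", rdir,
        "--log", log,
        "--bamfq", str(bamfq),
        "--bed", bed,
        "--rna", str(rna),
        "--refdir", refdir,
        "--step", str(step),
    ]


# Placeholder paths; substitute the folders used for your own run.
commands = [
    neoscan_command(
        step,
        rdir="/path/to/run_dir",
        log="/path/to/log_dir",
        bed="/path/to/proteome.bed",
        refdir="/path/to/refdir",
    )
    for step in range(1, 6)
]

for cmd in commands:
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```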

After step 5 has finished, the final results are in the following two files:

SAMPLE.neo.snv.summary

SAMPLE.neo.indel.summary

You may then want to filter out peptides that occur in normal human proteins. To do so, grep each peptide against this database and discard any peptide that is found: /gscmnt/gc2518/dinglab/scao/db/ensembl38.85/Homo_sapiens.GRCh38.pep.all.fa.cleaned.fa
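
The filtering step can also be sketched in Python instead of running grep once per peptide. This is an illustration, not the authors' method: both function names are hypothetical and the toy sequences are made up for the example. Sequence lines are joined per FASTA record, so a peptide that spans a line wrap is still matched, which a plain line-based grep would miss if the database wraps its sequences.

```python
def read_fasta_sequences(lines):
    """Return one joined sequence string per FASTA record."""
    sequences, chunks = [], []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header closes the previous record, if any.
            if chunks:
                sequences.append("".join(chunks))
                chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        sequences.append("".join(chunks))
    return sequences


def novel_peptides(peptides, proteome_sequences):
    """Keep only peptides that are not a substring of any proteome sequence."""
    return [
        peptide
        for peptide in peptides
        if not any(peptide in sequence for sequence in proteome_sequences)
    ]


# Toy proteome (made-up sequences) standing in for the real database file.
toy_fasta = """\
>protA
MKTAYIAK
QRQISFVK
>protB
GGSHLVEAL
""".splitlines()

proteome = read_fasta_sequences(toy_fasta)
kept = novel_peptides(["AKQRQ", "WWWWW", "HLVEA"], proteome)
print(kept)
```

With the real database, pass an open file handle to `read_fasta_sequences` in place of `toy_fasta`. The linear scan is simple but slow for large peptide lists; indexing the proteome by k-mers would be one way to speed it up.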

Contact

Song Cao, [email protected]
