Creating necessary files for Fasta Reference

Ian edited this page Apr 24, 2018 · 2 revisions

When writing a genomics pipeline, it is useful to have downsampled test data so that changes to the pipeline can be made and tested quickly.

See "generating test data" for how to get the initial FASTQ files.

Next, follow these steps to create the supplementary files that GATK, Picard, and similar tools require for the reference FASTA.

  1. Create the sequence dictionary
     java -jar ~/Desktop/vendor_tools/picard-2.8.1.jar CreateSequenceDictionary R=Homo_sapiens_assembly19-14-105258897-105259017.fasta O=Homo_sapiens_assembly19-14-105258897-105259017.fasta.dict
  2. Create the BWA index (Burrows-Wheeler transform and other required files)
     bwa index Homo_sapiens_assembly19-14-105258897-105259017.fasta
  3. Create the FASTA index
     samtools faidx Homo_sapiens_assembly19-14-105258897-105259017.fasta
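
After running the three steps above, it can help to confirm that every supplementary file actually landed next to the reference. The sketch below is not part of the original workflow: the function names `expected_reference_files` and `missing_reference_files` are hypothetical helpers, and the dictionary name assumes the `O=` value used in the first step (`<fasta>.dict`). The BWA extensions listed are the ones `bwa index` writes (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`), and `.fai` is the index written by `samtools faidx`.

```shell
#!/bin/sh
# Hypothetical helper: print the supplementary files expected next to a reference FASTA.
expected_reference_files() {
    fasta="$1"
    # Sequence dictionary, named as in the O= argument of the Picard step (assumption)
    echo "${fasta}.dict"
    # Files written by `bwa index`
    for ext in amb ann bwt pac sa; do
        echo "${fasta}.${ext}"
    done
    # Index written by `samtools faidx`
    echo "${fasta}.fai"
}

# Hypothetical helper: print only the expected files that do not exist yet.
missing_reference_files() {
    expected_reference_files "$1" | while IFS= read -r f; do
        [ -e "$f" ] || echo "$f"
    done
}
```

For example, `missing_reference_files Homo_sapiens_assembly19-14-105258897-105259017.fasta` prints nothing when all three steps have completed, and otherwise lists the files that still need to be generated.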