NGS-Variant-Calling

NGS Workflow Documentation

Overview

Key Steps

  1. Data Acquisition:
    • Download raw FASTQ files from SRA, limited to the first 500,000 reads (spots) per sample.
    • Example command:
      fastq-dump --split-files --gzip -X 500000 ERR11468775
      
  2. Reference Genome Setup:
    • Chromosomes 6 and 7 from UCSC hg38.
    • Merged into a single FASTA file and indexed with BWA (bwa index).
  3. Quality Control:
    • FastQC reports for raw and processed data.
  4. Alignment:
    • BWA-MEM for paired-end alignment.
    • Samtools for sorting and indexing BAM files.
    • Example commands:
      bwa mem ref_genome/hg38_chr6_7.fa sample_1.fastq.gz sample_2.fastq.gz | samtools sort -o sample.bam
      samtools index sample.bam
      
  5. Variant Calling:
    • GATK HaplotypeCaller for germline variants, Mutect2 for somatic variants.
  6. Annotation:
    • Ensembl VEP.
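
Steps 2 and 3 have no example commands above, so here is a minimal sketch of how the reference setup and QC could look. It is not necessarily what this repository runs. The UCSC download URL, the use of wget, the qc_reports directory, and the function names are assumptions made for illustration; only the ref_genome/hg38_chr6_7.fa path comes from the alignment example. The samtools faidx and CreateSequenceDictionary calls are included because GATK expects a .fai index and a .dict file next to the reference.

```shell
#!/bin/sh
# Sketch of steps 2 and 3. Set DRY_RUN=1 to print commands instead of running them.
set -eu

REF_DIR="ref_genome"                 # directory used in the alignment example
REF="$REF_DIR/hg38_chr6_7.fa"        # merged reference used by bwa mem
UCSC_URL="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes"  # assumed source
QC_DIR="qc_reports"                  # hypothetical output directory

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download chr6 and chr7, merge them, and build the indexes BWA and GATK need.
build_reference() {
  run mkdir -p "$REF_DIR"
  for chr in chr6 chr7; do
    run wget -q -O "$REF_DIR/$chr.fa.gz" "$UCSC_URL/$chr.fa.gz"
  done
  # "Merging" here is plain concatenation of the two FASTA files.
  run sh -c "gunzip -c $REF_DIR/chr6.fa.gz $REF_DIR/chr7.fa.gz > $REF"
  run bwa index "$REF"                          # BWA index for alignment
  run samtools faidx "$REF"                     # .fai index
  run gatk CreateSequenceDictionary -R "$REF"   # .dict file
}

# Run FastQC on any number of FASTQ files passed as arguments.
run_fastqc() {
  run mkdir -p "$QC_DIR"
  run fastqc -o "$QC_DIR" "$@"
}
```

After sourcing the file, DRY_RUN=1 build_reference prints the planned commands, and run_fastqc sample_1.fastq.gz sample_2.fastq.gz writes the reports to qc_reports.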

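Steps 5 and 6 can be sketched in the same way. This is an illustration under stated assumptions, not the repository's actual script: the output file names and function names are hypothetical, Mutect2 is shown in tumor-only mode because no matched normal is described above, and the VEP call assumes a locally installed GRCh38 cache. GATK also requires read groups in the input BAM, so the sketch assumes they were added earlier (for example with the -R option of bwa mem).

```shell
#!/bin/sh
# Sketch of steps 5 and 6. Set DRY_RUN=1 to print commands instead of running them.
set -eu

REF="ref_genome/hg38_chr6_7.fa"   # merged reference from the alignment example

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Germline calling. $1 is the sample prefix, e.g. "sample" for sample.bam.
call_germline() {
  run gatk HaplotypeCaller -R "$REF" -I "$1.bam" -O "$1.germline.vcf.gz"
}

# Somatic calling in tumor-only mode (assumption), followed by filtering.
call_somatic() {
  run gatk Mutect2 -R "$REF" -I "$1.bam" -O "$1.somatic.unfiltered.vcf.gz"
  run gatk FilterMutectCalls -R "$REF" \
    -V "$1.somatic.unfiltered.vcf.gz" -O "$1.somatic.vcf.gz"
}

# Annotation with Ensembl VEP. $1 is the input VCF, $2 the annotated output VCF.
annotate() {
  run vep -i "$1" -o "$2" --vcf --cache --offline --assembly GRCh38
}
```

For example, call_germline sample followed by annotate sample.germline.vcf.gz sample.germline.vep.vcf would produce an annotated germline call set for sample.bam.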