bio-variant-calling

Call SNPs and indels from aligned reads using bcftools mpileup and call. Use when detecting variants from BAM files or generating VCF from alignments.

By mdbabumiamssm. Updated 2/4/2026

name: bio-variant-calling
description: Call SNPs and indels from aligned reads using bcftools mpileup and call. Use when detecting variants from BAM files or generating VCF from alignments.
tool_type: cli
primary_tool: bcftools
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
  - read_file
  - run_shell_command

Variant Calling

Call SNPs and indels from aligned reads using bcftools.

Basic Workflow

BAM file + Reference FASTA
         |
         v
   bcftools mpileup (generate pileup)
         |
         v
   bcftools call (call variants)
         |
         v
   VCF file

bcftools mpileup + call

Basic Variant Calling

bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf

Output Compressed VCF

bcftools mpileup -f reference.fa input.bam | bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz

Call Specific Region

bcftools mpileup -f reference.fa -r chr1:1000000-2000000 input.bam | \
    bcftools call -mv -o region.vcf

Call from Multiple BAMs

bcftools mpileup -f reference.fa sample1.bam sample2.bam sample3.bam | \
    bcftools call -mv -o variants.vcf

BAM List File

# bams.txt: one BAM path per line
bcftools mpileup -f reference.fa -b bams.txt | bcftools call -mv -o variants.vcf
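
The list file can be generated rather than written by hand. The following is a minimal sketch, assuming all BAMs sit in one directory; the function name make_bam_list and the directory name alignments are hypothetical and not part of bcftools.

```shell
# Hypothetical helper: write one BAM path per line to a list file.
# Usage: make_bam_list <bam_dir> <list_file>
make_bam_list() {
    bam_dir=$1
    list_file=$2
    : > "$list_file"                      # start with an empty list
    for bam in "$bam_dir"/*.bam; do
        [ -e "$bam" ] || continue         # the glob matched nothing
        printf '%s\n' "$bam" >> "$list_file"
    done
    # Return non-zero if no BAMs were found, so an empty list never reaches mpileup
    [ -s "$list_file" ]
}

# Example (needs bcftools and real data; "alignments" is a placeholder directory):
# make_bam_list alignments bams.txt &&
#     bcftools mpileup -f reference.fa -b bams.txt | bcftools call -mv -o variants.vcf
```

The shell expands the glob in sorted order, so the list order is stable between runs.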

mpileup Options

Quality Filtering

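# -q 20: minimum mapping quality; -Q 20: minimum base quality
# (comments cannot follow a trailing backslash, so they are kept on their own line)
bcftools mpileup -f reference.fa \
    -q 20 \
    -Q 20 \
    input.bam | bcftools call -mv -o variants.vcf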

Annotate with Read Depth

bcftools mpileup -f reference.fa -a DP,AD input.bam | bcftools call -mv -o variants.vcf

Full Annotation Set

bcftools mpileup -f reference.fa \
    -a FORMAT/DP,FORMAT/AD,FORMAT/ADF,FORMAT/ADR,INFO/AD \
    input.bam | bcftools call -mv -o variants.vcf

Target Regions (BED)

bcftools mpileup -f reference.fa -R targets.bed input.bam | \
    bcftools call -mv -o variants.vcf

Max Depth

# Raise the per-file read depth cap (default 250) for deeply sequenced data
bcftools mpileup -f reference.fa -d 1000 input.bam | bcftools call -mv -o variants.vcf

call Options

Calling Models

Flag  Model                Use Case
-m    Multiallelic caller  Recommended for most analyses
-c    Consensus caller     Legacy, single sample

Output Variants Only

bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
# -v outputs variant sites only (not reference calls)

Output All Sites

bcftools mpileup -f reference.fa input.bam | bcftools call -m -o all_sites.vcf
# Without -v, outputs all sites including reference

Ploidy

# Haploid calling
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy 1 -o variants.vcf

# Specify ploidy file
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy-file ploidy.txt -o variants.vcf
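
A ploidy file is a space- or tab-delimited list of CHROM, FROM, TO, SEX and PLOIDY, with starred rows giving the default. The sketch below writes an illustrative one together with a matching samples file. The contig names (chrX, chrY, chrM), the sample names, the end coordinate 999999999 (a stand-in for "to the end of the contig") and both function names are assumptions for illustration. Treating all of chrX as haploid in males also ignores the pseudoautosomal regions.

```shell
# Hypothetical helper: write an illustrative ploidy definition.
# Columns: CHROM FROM TO SEX PLOIDY; "*" rows set the default ploidy.
write_ploidy_file() {
    cat > "$1" <<'EOF'
chrX 1 999999999 M 1
chrY 1 999999999 M 1
chrY 1 999999999 F 0
chrM 1 999999999 M 1
chrM 1 999999999 F 1
* * * M 2
* * * F 2
EOF
}

# Hypothetical helper: sample name and sex code, one sample per line.
# The sex codes must match the ones used in the ploidy file.
write_samples_file() {
    cat > "$1" <<'EOF'
sample1 F
sample2 M
EOF
}

# Example (needs bcftools and real data):
# write_ploidy_file ploidy.txt
# write_samples_file samples.txt
# bcftools mpileup -f reference.fa sample1.bam sample2.bam | \
#     bcftools call -mv --ploidy-file ploidy.txt -S samples.txt -o variants.vcf
```

The sample names in samples.txt have to match the sample names that appear in the pileup header.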

Prior Probability

# Adjust the mutation-rate prior (default 1.1e-3; larger values increase sensitivity)
bcftools mpileup -f reference.fa input.bam | bcftools call -m -P 0.001 -o variants.vcf

Common Pipelines

Standard SNP/Indel Calling

bcftools mpileup -Ou -f reference.fa \
    -q 20 -Q 20 \
    -a FORMAT/DP,FORMAT/AD \
    input.bam | \
bcftools call -mv -Oz -o variants.vcf.gz

bcftools index variants.vcf.gz
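
The pipeline above could be wrapped in a small function for reuse. This is only a sketch: the name call_variants and the DRY_RUN switch are hypothetical conveniences, and the thresholds are the ones already used in this section.

```shell
# Hypothetical wrapper around the standard pipeline.
# Usage: call_variants <reference.fa> <input.bam> <out.vcf.gz>
# Set DRY_RUN=1 to print the commands instead of running them.
call_variants() {
    ref=$1
    bam=$2
    out=$3
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'bcftools mpileup -Ou -f %s -q 20 -Q 20 -a FORMAT/DP,FORMAT/AD %s | ' "$ref" "$bam"
        printf 'bcftools call -mv -Oz -o %s\n' "$out"
        printf 'bcftools index %s\n' "$out"
        return 0
    fi
    # Index only if the calling step reported success
    bcftools mpileup -Ou -f "$ref" -q 20 -Q 20 -a FORMAT/DP,FORMAT/AD "$bam" |
        bcftools call -mv -Oz -o "$out" &&
        bcftools index "$out"
}

# Example:
# DRY_RUN=1 call_variants reference.fa input.bam variants.vcf.gz
```

Without the shell's pipefail option, only the exit status of bcftools call is checked, so a failure inside mpileup may go unnoticed.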

Multi-sample Calling

bcftools mpileup -Ou -f reference.fa \
    -a FORMAT/DP,FORMAT/AD \
    sample1.bam sample2.bam sample3.bam | \
bcftools call -mv -Oz -o cohort.vcf.gz

bcftools index cohort.vcf.gz

Calling with Regions

bcftools mpileup -Ou -f reference.fa \
    -R targets.bed \
    -a FORMAT/DP,FORMAT/AD \
    input.bam | \
bcftools call -mv -Oz -o targets.vcf.gz

Parallel by Chromosome

for chr in chr1 chr2 chr3; do
    bcftools mpileup -Ou -f reference.fa -r "$chr" input.bam | \
        bcftools call -mv -Oz -o "${chr}.vcf.gz" &
done
wait

# Concatenate results in reference order (a chr*.vcf.gz glob would sort chr10 before chr2)
bcftools concat -Oz -o all.vcf.gz chr1.vcf.gz chr2.vcf.gz chr3.vcf.gz
bcftools index all.vcf.gz
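
Instead of hard-coding chromosome names, the contig list can be read from the reference index. This sketch assumes a faidx index (reference.fa.fai) exists next to the reference; both function names are hypothetical.

```shell
# List contig names in reference order (first column of the .fai index).
# Usage: contigs_from_fai <reference.fa.fai>
contigs_from_fai() {
    cut -f1 "$1"
}

# Write the per-contig VCF names, one per line, in reference order.
# Usage: vcf_list_from_fai <reference.fa.fai> <list_file>
vcf_list_from_fai() {
    contigs_from_fai "$1" | while IFS= read -r chr; do
        printf '%s.vcf.gz\n' "$chr"
    done > "$2"
}

# Example (needs samtools, bcftools and real data):
# samtools faidx reference.fa
# for chr in $(contigs_from_fai reference.fa.fai); do
#     bcftools mpileup -Ou -f reference.fa -r "$chr" input.bam | \
#         bcftools call -mv -Oz -o "${chr}.vcf.gz" &
# done
# wait
# vcf_list_from_fai reference.fa.fai vcfs.txt
# bcftools concat -f vcfs.txt -Oz -o all.vcf.gz
```

This starts one background job per contig, which may be too many for references with thousands of contigs. Limiting the loop to the main chromosomes is one option.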

Annotation Tags

INFO Tags

Tag  Description
DP   Total read depth
AD   Allelic depths
MQ   Mapping quality
FS   Fisher strand bias
SGB  Segregation based metric

FORMAT Tags

Tag  Description
GT   Genotype
DP   Read depth per sample
AD   Allelic depths per sample
ADF  Forward strand allelic depths
ADR  Reverse strand allelic depths
SP   Phred-scaled strand bias P-value
GQ   Genotype quality
PL   Phred-scaled likelihoods
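
Each FORMAT tag is stored by position: column 9 of a VCF record lists the tag names and each sample column holds the values in the same order. As a dependency-free sketch of that layout, the hypothetical function below pulls one tag for the first sample with awk. In practice bcftools query would usually be the more convenient tool.

```shell
# Print CHROM, POS and one FORMAT field of the first sample for each record.
# Usage: first_sample_field <TAG> < file.vcf
first_sample_field() {
    awk -v tag="$1" -F '\t' '
        /^#/ { next }                       # skip header lines
        {
            n = split($9, keys, ":")        # column 9 lists the FORMAT keys
            split($10, vals, ":")           # column 10 holds the first sample
            for (i = 1; i <= n; i++)
                if (keys[i] == tag) print $1 "\t" $2 "\t" vals[i]
        }'
}

# Example on a made-up record (values are illustrative only):
# printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=12\tGT:PL:DP:AD\t0/1:60,0,80:12:7,5\n' |
#     first_sample_field AD
```

Records that lack the requested tag produce no output line.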

Request Specific Annotations

bcftools mpileup -f reference.fa \
    -a FORMAT/DP,FORMAT/AD,FORMAT/SP,INFO/AD \
    input.bam | bcftools call -mv -o variants.vcf

Performance Options

Multi-threading

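# --threads only speeds up compression of BCF/VCF.gz output (-Ob/-Oz);
# pileup and calling themselves are not parallelized
bcftools mpileup -Ou -f reference.fa input.bam | \
    bcftools call -mv -Oz --threads 4 -o variants.vcf.gz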

Uncompressed BCF for Speed

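# -Ou passes uncompressed BCF between steps, avoiding compression and text conversion
# QUAL<20 is an example exclusion threshold; adjust it to your data
bcftools mpileup -Ou -f reference.fa input.bam | bcftools call -mv -Ou | \
    bcftools filter -e 'QUAL<20' -Oz -o filtered.vcf.gz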

Quick Reference

Task                 Command
Basic calling        bcftools mpileup -f ref.fa in.bam | bcftools call -mv -o out.vcf
With quality filter  bcftools mpileup -f ref.fa -q 20 -Q 20 in.bam | bcftools call -mv
Region               bcftools mpileup -f ref.fa -r chr1:1-1000 in.bam | bcftools call -mv
Multi-sample         bcftools mpileup -f ref.fa s1.bam s2.bam | bcftools call -mv
With annotations     bcftools mpileup -f ref.fa -a DP,AD in.bam | bcftools call -mv

Common Errors

Error               Cause              Solution
no FASTA reference  Missing -f         Add -f reference.fa
reference mismatch  Wrong reference    Use same reference as alignment
no variants called  Low quality/depth  Lower quality thresholds
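
Some of these problems can be caught before running the pipeline. The sketch below checks only that the input files exist and that a BAM index is present. The function name preflight is hypothetical, and the index file names it looks for (.bam.bai, .bam.csi, .bai) are the common conventions, assumed here.

```shell
# Hypothetical pre-flight check for the pipeline inputs.
# Usage: preflight <reference.fa> <input.bam>
# Prints one message per problem and returns non-zero if any is found.
preflight() {
    ref=$1
    bam=$2
    rc=0
    [ -s "$ref" ] || { echo "missing or empty reference: $ref" >&2; rc=1; }
    [ -s "$bam" ] || { echo "missing or empty BAM: $bam" >&2; rc=1; }
    # Region options (-r/-R) need an index for the BAM file
    if [ ! -e "$bam.bai" ] && [ ! -e "$bam.csi" ] && [ ! -e "${bam%.bam}.bai" ]; then
        echo "no index found for $bam (try: samtools index $bam)" >&2
        rc=1
    fi
    return "$rc"
}

# Example:
# preflight reference.fa input.bam &&
#     bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
```

A reference mismatch is not detected by this check. Comparing the contig names in reference.fa.fai with the @SQ lines from samtools view -H input.bam is one way to spot it.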

Related Skills

  • vcf-basics - View and query resulting VCF
  • filtering-best-practices - Filter variants by quality
  • variant-normalization - Normalize indels
  • alignment-files/pileup-generation - Alternative pileup generation

Install via CLI

npx skills add https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- --skill bio-variant-calling

Repository Details

Stars: 30
Forks: 7
Branch: main
Path: SKILL.md