name: bio-variant-calling
description: Call SNPs and indels from aligned reads using bcftools mpileup and call. Use when detecting variants from BAM files or generating VCF from alignments.
tool_type: cli
primary_tool: bcftools
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Variant Calling
Call SNPs and indels from aligned reads using bcftools.
Basic Workflow
BAM file + Reference FASTA
|
v
bcftools mpileup (generate pileup)
|
v
bcftools call (call variants)
|
v
VCF file
bcftools mpileup + call
Basic Variant Calling
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
Output Compressed VCF
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz
Call Specific Region
bcftools mpileup -f reference.fa -r chr1:1000000-2000000 input.bam | \
bcftools call -mv -o region.vcf
Call from Multiple BAMs
bcftools mpileup -f reference.fa sample1.bam sample2.bam sample3.bam | \
bcftools call -mv -o variants.vcf
BAM List File
# bams.txt: one BAM path per line
bcftools mpileup -f reference.fa -b bams.txt | bcftools call -mv -o variants.vcf
mpileup Options
Quality Filtering
bcftools mpileup -f reference.fa \
-q 20 \ # Min mapping quality
-Q 20 \ # Min base quality
input.bam | bcftools call -mv -o variants.vcf
Annotate with Read Depth
bcftools mpileup -f reference.fa -a DP,AD input.bam | bcftools call -mv -o variants.vcf
Full Annotation Set
bcftools mpileup -f reference.fa \
-a FORMAT/DP,FORMAT/AD,FORMAT/ADF,FORMAT/ADR,INFO/AD \
input.bam | bcftools call -mv -o variants.vcf
Target Regions (BED)
bcftools mpileup -f reference.fa -R targets.bed input.bam | \
bcftools call -mv -o variants.vcf
Max Depth
bcftools mpileup -f reference.fa -d 1000 input.bam | bcftools call -mv -o variants.vcf
call Options
Calling Models
| Flag |
Model |
Use Case |
-m |
Multiallelic caller |
Default, recommended |
-c |
Consensus caller |
Legacy, single sample |
Output Variants Only
bcftools mpileup -f reference.fa input.bam | bcftools call -mv -o variants.vcf
# -v outputs variant sites only (not reference calls)
Output All Sites
bcftools mpileup -f reference.fa input.bam | bcftools call -m -o all_sites.vcf
# Without -v, outputs all sites including reference
Ploidy
# Haploid calling
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy 1 -o variants.vcf
# Specify ploidy file
bcftools mpileup -f reference.fa input.bam | bcftools call -m --ploidy-file ploidy.txt -o variants.vcf
Prior Probability
# Adjust variant prior (default 1.1e-3)
bcftools mpileup -f reference.fa input.bam | bcftools call -m -P 0.001 -o variants.vcf
Common Pipelines
Standard SNP/Indel Calling
bcftools mpileup -Ou -f reference.fa \
-q 20 -Q 20 \
-a FORMAT/DP,FORMAT/AD \
input.bam | \
bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz
Multi-sample Calling
bcftools mpileup -Ou -f reference.fa \
-a FORMAT/DP,FORMAT/AD \
sample1.bam sample2.bam sample3.bam | \
bcftools call -mv -Oz -o cohort.vcf.gz
bcftools index cohort.vcf.gz
Calling with Regions
bcftools mpileup -Ou -f reference.fa \
-R targets.bed \
-a FORMAT/DP,FORMAT/AD \
input.bam | \
bcftools call -mv -Oz -o targets.vcf.gz
Parallel by Chromosome
for chr in chr1 chr2 chr3; do
bcftools mpileup -Ou -f reference.fa -r "$chr" input.bam | \
bcftools call -mv -Oz -o "${chr}.vcf.gz" &
done
wait
# Concatenate results
bcftools concat -Oz -o all.vcf.gz chr*.vcf.gz
bcftools index all.vcf.gz
Annotation Tags
INFO Tags
| Tag |
Description |
DP |
Total read depth |
AD |
Allelic depths |
MQ |
Mapping quality |
FS |
Fisher strand bias |
SGB |
Segregation based metric |
FORMAT Tags
| Tag |
Description |
GT |
Genotype |
DP |
Read depth per sample |
AD |
Allelic depths per sample |
ADF |
Forward strand allelic depths |
ADR |
Reverse strand allelic depths |
GQ |
Genotype quality |
PL |
Phred-scaled likelihoods |
Request Specific Annotations
bcftools mpileup -f reference.fa \
-a FORMAT/DP,FORMAT/AD,FORMAT/SP,INFO/AD \
input.bam | bcftools call -mv -o variants.vcf
Performance Options
Multi-threading
bcftools mpileup -f reference.fa --threads 4 input.bam | \
bcftools call -mv --threads 4 -o variants.vcf
Uncompressed BCF for Speed
bcftools mpileup -Ou -f reference.fa input.bam | bcftools call -mv -Ou | \
bcftools filter -Oz -o filtered.vcf.gz
Quick Reference
| Task |
Command |
| Basic calling |
bcftools mpileup -f ref.fa in.bam | bcftools call -mv -o out.vcf |
| With quality filter |
bcftools mpileup -f ref.fa -q 20 -Q 20 in.bam | bcftools call -mv |
| Region |
bcftools mpileup -f ref.fa -r chr1:1-1000 in.bam | bcftools call -mv |
| Multi-sample |
bcftools mpileup -f ref.fa s1.bam s2.bam | bcftools call -mv |
| With annotations |
bcftools mpileup -f ref.fa -a DP,AD in.bam | bcftools call -mv |
Common Errors
| Error |
Cause |
Solution |
no FASTA reference |
Missing -f |
Add -f reference.fa |
reference mismatch |
Wrong reference |
Use same reference as alignment |
no variants called |
Low quality/depth |
Lower quality thresholds |
Related Skills
- vcf-basics - View and query resulting VCF
- filtering-best-practices - Filter variants by quality
- variant-normalization - Normalize indels
- alignment-files/pileup-generation - Alternative pileup generation