bcftools — VCF/BCF Variant Manipulation Toolkit

Overview

bcftools is the standard command-line toolkit for processing VCF (Variant Call Format) and BCF (Binary Call Format) files in the HTSlib ecosystem. It covers the complete post-variant-calling workflow: format conversion, quality filtering, variant normalization, multi-sample merging, annotation with external databases, genotype extraction, and QC statistics. bcftools is built around streaming: most commands read from stdin and write to stdout, so steps can be chained with pipes instead of writing intermediate files, which keeps pipelines memory-efficient on large cohorts.
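
The streaming design described above can be sketched as a single piped pass. This is a minimal illustration, not a recommended configuration: the function name `filter_and_normalize`, the QUAL and DP thresholds, and all file names are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter, normalize, and compress in one streamed pass.
# Arguments (all placeholders): input VCF/BCF, reference FASTA, output path.
filter_and_normalize() {
    local in_vcf="$1"
    local ref_fa="$2"
    local out_vcf="$3"

    # view -i keeps records matching the expression (thresholds are assumed).
    # -Ou emits uncompressed BCF, the cheapest format to pass between commands.
    bcftools view -i 'QUAL>=30 && INFO/DP>=10' -Ou "$in_vcf" |
        # norm -f left-aligns indels against the reference; -m -any splits
        # multi-allelic records. The trailing "-" reads the stream from stdin.
        bcftools norm -f "$ref_fa" -m -any -Oz -o "$out_vcf" -

    # Build a tabix index so later commands can query the output by region.
    bcftools index -t "$out_vcf"
}
```

A call such as `filter_and_normalize calls.vcf.gz ref.fa filtered.norm.vcf.gz` would then produce one compressed, indexed output with no intermediate files on disk.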

When to Use

Filtering variants by quality (QUAL, DP, AF) after variant calling
Merging VCF files from multiple samples into a joint call set
Adding rsIDs or gene annotations to variant calls
Extracting specific fields (genotypes, allele depths) as tabular output
Normalizing indel representations and splitting multi-allelic records
Calling variants from pileup output (mpileup + call)
Computing per-sample and overall VCF QC statistics
Use GATK HaplotypeCaller instead when calling variants with local realignment in human samples
Use VCFtools instead for population genetics statistics (Fst, LD, Hardy-Weinberg)
Use Picard instead for duplicate marking and library metrics; bcftools fits pipelines built on HTSlib tools
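
Several of the tasks listed above map onto a single bcftools subcommand each. The sketch below wraps them in small shell functions so that file names become parameters; the function names, argument order, and file names are hypothetical, and the flags shown are one reasonable choice rather than the only one.

```shell
#!/usr/bin/env bash
# All function and file names below are placeholders for illustration.

# Merge single-sample VCFs (bgzip-compressed and indexed) into one call set.
# Usage: merge_samples joint.vcf.gz a.vcf.gz b.vcf.gz ...
merge_samples() {
    local out_vcf="$1"
    shift
    bcftools merge -Oz -o "$out_vcf" "$@"
}

# Copy the ID column (for example rsIDs) from an indexed annotation VCF.
# Usage: add_rsids calls.vcf.gz annotations.vcf.gz annotated.vcf.gz
add_rsids() {
    bcftools annotate -a "$2" -c ID -Oz -o "$3" "$1"
}

# Print position, alleles, and per-sample genotype and allele depth as TSV.
# The [...] block in the format string repeats once per sample.
genotype_table() {
    bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%GT\t%AD]\n' "$1"
}

# Call variants from a BAM: mpileup computes genotype likelihoods, then
# call -m runs the multiallelic caller and -v keeps variant sites only.
# Usage: call_variants ref.fa sample.bam calls.vcf.gz
call_variants() {
    bcftools mpileup -Ou -f "$1" "$2" | bcftools call -mv -Oz -o "$3"
}

# Write QC statistics; "-s -" adds per-sample counts for all samples.
# Usage: vcf_stats calls.vcf.gz calls.stats.txt
vcf_stats() {
    bcftools stats -s - "$1" > "$2"
}
```

Keeping each step in its own function makes it easy to swap thresholds or output formats for one task without touching the others.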
