plink2-gwas-analysis

PLINK2 — GWAS and Population Genetics

Overview

PLINK2 is the high-performance successor to PLINK 1.9, designed for genome-wide association studies (GWAS) and population genetics analysis on large cohorts. It reads genotype data in PLINK binary (.bed/.bim/.fam), VCF, and BGEN formats and performs sample and variant quality control (QC), kinship estimation, principal component analysis (PCA), and linear/logistic regression association testing. PLINK2 can run up to 10–100× faster than PLINK 1.9 on some tasks, owing to multithreading and optimized I/O; the actual speedup depends on the operation, input format, and thread count. Its output files can be passed to downstream visualization (Manhattan/QQ plots) and meta-analysis tools.
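The stages named above (QC filtering, LD pruning, PCA, association testing) can be sketched as a sequence of `plink2` invocations. The Python helper below only builds the command lines; it does not run them. The file prefix `cohort`, the phenotype file `pheno.txt`, and every threshold are hypothetical placeholders chosen for illustration, not recommendations from this skill.

```python
import shlex


def qc_filter(bfile, out):
    """Variant/sample QC filters. Thresholds are illustrative assumptions."""
    return ["plink2", "--bfile", bfile,
            "--maf", "0.01", "--geno", "0.02", "--mind", "0.02", "--hwe", "1e-6",
            "--make-bed", "--out", out]


def ld_prune(bfile, out):
    """LD pruning: 50-variant window, step 5, r^2 threshold 0.2 (assumed values)."""
    return ["plink2", "--bfile", bfile,
            "--indep-pairwise", "50", "5", "0.2", "--out", out]


def pca(bfile, prune_in, out, n_pcs=10):
    """PCA restricted to the LD-pruned variant list."""
    return ["plink2", "--bfile", bfile, "--extract", prune_in,
            "--pca", str(n_pcs), "--out", out]


def glm(bfile, pheno, covar, out):
    """Linear/logistic regression with covariates; covariate rows are hidden."""
    return ["plink2", "--bfile", bfile, "--pheno", pheno, "--covar", covar,
            "--glm", "hide-covar", "--out", out]


def workflow(prefix="cohort", pheno="pheno.txt"):
    """Return the four command lines in execution order (hypothetical file names)."""
    qc = f"{prefix}_qc"
    return [
        qc_filter(prefix, qc),
        ld_prune(qc, f"{prefix}_ld"),
        pca(qc, f"{prefix}_ld.prune.in", f"{prefix}_pca"),
        glm(qc, pheno, f"{prefix}_pca.eigenvec", f"{prefix}_gwas"),
    ]


if __name__ == "__main__":
    # Print the commands for review; pass each list to subprocess.run to execute.
    for cmd in workflow():
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps them easy to inspect before anything touches the data, and each list can be handed to `subprocess.run(cmd, check=True)` once the paths and thresholds have been adapted to the cohort.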

When to Use

  • Running GWAS on a case-control or quantitative trait cohort after genotyping array QC
  • Performing sample QC: missingness, heterozygosity outliers, sex check, cryptic relatedness
  • Performing genome-wide LD pruning before PCA or relatedness estimation
  • Running PCA on genotype data to identify population stratification
  • Converting between PLINK binary, VCF, and BGEN formats
  • Filtering variants by MAF, HWE, missingness, or INFO score in VCF/imputed data
  • Use regenie or SAIGE instead for biobank-scale GWAS (>100k samples) that require mixed-model association to account for relatedness and population structure
  • Use VCFtools as an alternative for VCF-specific population genetics statistics
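Several of the tasks in this list map onto single `plink2` commands. The sketch below builds command lines for format conversion, per-sample QC reports, and relatedness filtering, without running them. All file names are hypothetical, and the KING kinship cutoff of 0.0884 (roughly second-degree relatives) is an assumed example value.

```python
import shlex


def vcf_to_pgen(vcf, out):
    """Import a VCF and write PLINK2 binary (.pgen/.pvar/.psam)."""
    return ["plink2", "--vcf", vcf, "--make-pgen", "--out", out]


def bed_to_vcf(bfile, out):
    """Export PLINK binary (.bed/.bim/.fam) to a bgzipped VCF."""
    return ["plink2", "--bfile", bfile, "--export", "vcf", "bgz", "--out", out]


def sample_qc_reports(bfile, out):
    """Write missingness and heterozygosity reports for outlier inspection."""
    return ["plink2", "--bfile", bfile, "--missing", "--het", "--out", out]


def related_sample_lists(bfile, out, cutoff="0.0884"):
    """List samples to keep/exclude so no pair exceeds the kinship cutoff."""
    return ["plink2", "--bfile", bfile, "--king-cutoff", cutoff, "--out", out]


if __name__ == "__main__":
    # Hypothetical inputs; print the commands rather than executing them.
    for cmd in (
        vcf_to_pgen("cohort.vcf.gz", "cohort"),
        bed_to_vcf("cohort", "cohort_export"),
        sample_qc_reports("cohort", "cohort_sampleqc"),
        related_sample_lists("cohort", "cohort_unrelated"),
    ):
        print(shlex.join(cmd))
```

The reports are meant for inspection (for example, flagging heterozygosity outliers), so which samples to drop remains a decision for the analyst.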

Prerequisites

More from jaechang-hits/sciagent-skills

First Seen
Mar 16, 2026