RDKit Cheminformatics Toolkit

Overview

RDKit is a widely used open-source cheminformatics library with a Python API, providing functions for molecular parsing, descriptor calculation, fingerprinting, substructure searching, and chemical reactions. This skill walks through a complete compound library profiling and virtual screening workflow: loading molecules, drug-likeness filtering, similarity screening, and result visualization.
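As a rough illustration of that workflow, the sketch below parses a small set of SMILES, computes basic properties, applies a Rule of Five check, and ranks the survivors by Tanimoto similarity to a reference. The example library, the helper names `profile` and `passes_lipinski`, the "at most one violation" threshold, and the Morgan fingerprint settings (radius 2, 2048 bits) are all assumptions chosen for illustration, not values taken from this skill.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import Crippen, Descriptors, Lipinski, rdFingerprintGenerator, rdMolDescriptors

# Hypothetical example library; the last entry is deliberately unparsable.
library = {
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
    "caffeine": "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
    "ibuprofen": "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    "invalid": "not_a_smiles",
}

def profile(mol):
    """Compute the basic property set (MW, LogP, TPSA, HBD, HBA) for one molecule."""
    return {
        "MW": Descriptors.MolWt(mol),
        "LogP": Crippen.MolLogP(mol),
        "TPSA": rdMolDescriptors.CalcTPSA(mol),
        "HBD": Lipinski.NumHDonors(mol),
        "HBA": Lipinski.NumHAcceptors(mol),
    }

def passes_lipinski(props, max_violations=1):
    """Rule of Five check; tolerating one violation is an assumed convention."""
    violations = sum([
        props["MW"] > 500,
        props["LogP"] > 5,
        props["HBD"] > 5,
        props["HBA"] > 10,
    ])
    return violations <= max_violations

# Morgan fingerprint generator; radius and size are illustrative defaults.
fp_gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
reference = Chem.MolFromSmiles(library["aspirin"])
reference_fp = fp_gen.GetFingerprint(reference)

results = {}
for name, smiles in library.items():
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None on parse failure, so skip bad entries.
        continue
    props = profile(mol)
    if not passes_lipinski(props):
        continue
    props["similarity"] = DataStructs.TanimotoSimilarity(
        reference_fp, fp_gen.GetFingerprint(mol)
    )
    results[name] = props

# Rank surviving compounds by similarity to the reference, highest first.
ranked = sorted(results, key=lambda n: results[n]["similarity"], reverse=True)
for name in ranked:
    print(f"{name}: MW={results[name]['MW']:.1f}, sim={results[name]['similarity']:.2f}")
```

Checking the return value of `Chem.MolFromSmiles` for `None` before any other call is the main defensive step here, since a single unparsable entry would otherwise raise partway through a library.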

When to Use

Calculate molecular properties (MW, LogP, TPSA, HBD/HBA) for a compound set
Screen a library against a reference compound using fingerprint similarity
Filter compounds by substructure (SMARTS patterns) for functional group analysis
Assess drug-likeness using Lipinski's Rule of Five or custom filters
Generate 2D depictions or 3D conformers for downstream docking
Enumerate chemical libraries using reaction SMARTS (combinatorial chemistry)
Cluster compounds by structural similarity for diversity analysis
Standardize and deduplicate molecular datasets (canonical SMILES, InChI)
For a higher-level RDKit wrapper with batching and error handling, use datamol-cheminformatics instead; for multi-format conversion (MOL2, XYZ, PDB), use openbabel instead.
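Two of the use cases above, deduplication by canonical SMILES and substructure filtering with a SMARTS pattern, can be sketched in a few lines. The input list and the carboxylic acid pattern are assumptions picked for illustration; a real dataset would likely need fuller standardization (salt stripping, charge neutralization) before deduplication.

```python
from rdkit import Chem

# Hypothetical input: the first two entries are the same molecule written differently.
smiles_list = ["OC(=O)c1ccccc1", "c1ccccc1C(=O)O", "CCO", "c1ccccc1"]

# Deduplicate by canonical SMILES, keeping the first molecule seen for each key.
unique = {}
for smiles in smiles_list:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        continue  # skip entries RDKit cannot parse
    unique.setdefault(Chem.MolToSmiles(mol), mol)

# Assumed SMARTS for a carboxylic acid: carbonyl carbon bonded to a hydroxyl oxygen.
carboxylic_acid = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")

# Keep only the unique molecules that contain the functional group.
acids = [smi for smi, mol in unique.items() if mol.HasSubstructMatch(carboxylic_acid)]

print(f"{len(smiles_list)} input, {len(unique)} unique, {len(acids)} carboxylic acid(s)")
```

Keying the dictionary on the canonical SMILES means that differently written inputs for the same structure collapse to a single entry without any pairwise comparison.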
