Datamol Cheminformatics Skill

Overview

Datamol is a Python library that provides a lightweight, Pythonic abstraction layer over RDKit for molecular cheminformatics. It simplifies complex molecular operations with sensible defaults, efficient parallelization, and modern I/O capabilities. All molecular objects are native rdkit.Chem.Mol instances, so they remain fully compatible with the RDKit ecosystem.
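The compatibility claim above can be illustrated with a minimal sketch: a molecule is parsed with datamol and then queried through the plain RDKit API. The helper name heavy_atom_count is hypothetical and exists only for this example, and the import guard is an assumption added so the snippet degrades gracefully when datamol is not installed.

```python
try:
    import datamol as dm
    from rdkit import Chem
except ImportError:  # datamol/RDKit not installed; the sketch is skipped
    dm = None


def heavy_atom_count(smiles: str) -> int:
    """Parse with datamol, then query the result through the RDKit API.

    Hypothetical helper for illustration only.
    """
    mol = dm.to_mol(smiles)  # returns an rdkit.Chem.Mol, or None if parsing fails
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # The object is a native RDKit molecule, so RDKit methods apply directly.
    assert isinstance(mol, Chem.Mol)
    return mol.GetNumAtoms()  # heavy atoms only; hydrogens are implicit


if dm is not None:
    print(heavy_atom_count("CCO"))  # ethanol: two carbons and one oxygen
```

Because no wrapper type is involved, the same molecule can be handed to any RDKit function without conversion.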

Version note: Examples target datamol 0.12.x (PyPI stable: 0.12.5, June 2024). Since 0.10.0, modules are lazy-loaded by default (set DATAMOL_DISABLE_LAZY_LOADING=1 to disable). Since 0.12.2, RDKit is a direct PyPI dependency of datamol. Fingerprints use RDKit's rdFingerprintGenerator API (0.12.5+).
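As a rough sketch of the lazy-loading switch mentioned in the version note, the environment variable can be set from Python before datamol is first imported. Setting it before the import is an assumption about when the flag is read; the version lookup uses the standard library's importlib.metadata rather than any datamol-specific attribute.

```python
import os
from importlib import metadata

# Assumption: the flag is read at import time, so it is set before
# the first `import datamol` in the process.
os.environ["DATAMOL_DISABLE_LAZY_LOADING"] = "1"

try:
    import datamol as dm
    installed = metadata.version("datamol")  # e.g. a 0.12.x release string
except (ImportError, metadata.PackageNotFoundError):
    dm, installed = None, None  # datamol not installed; nothing to check

print("datamol version:", installed)
```

The same effect can be had by exporting the variable in the shell before launching Python, which avoids any ordering concerns inside the program.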

Key capabilities:

  • Molecular format conversion (SMILES, SELFIES, InChI)
  • Structure standardization and sanitization
  • Molecular descriptors and fingerprints
  • 3D conformer generation and analysis
  • Clustering and diversity selection
  • Scaffold and fragment analysis
  • Chemical reaction application
  • Visualization and alignment
  • Batch processing with parallelization
  • Cloud storage support via fsspec
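A few of the capabilities listed above (format conversion, standardization, fingerprints, descriptors, and batch processing) might be combined as in the sketch below. The input SMILES and the preprocess helper are illustrative assumptions, not part of datamol, and the thread scheduler is chosen here only to keep the small example simple.

```python
try:
    import datamol as dm
except ImportError:  # datamol not installed; the sketch is skipped
    dm = None

# Illustrative inputs: ethanol, phenol, aspirin
SMILES = ["OCC", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O"]


def preprocess(smiles: str):
    """Parse, repair, sanitize, and standardize one SMILES string.

    Hypothetical helper; returns a canonical SMILES or None on failure.
    """
    mol = dm.to_mol(smiles)
    if mol is None:
        return None
    mol = dm.fix_mol(mol)
    mol = dm.sanitize_mol(mol)
    if mol is None:
        return None
    mol = dm.standardize_mol(mol)
    return dm.to_smiles(mol)


if dm is not None:
    # Batch processing across workers
    cleaned = dm.parallelized(
        preprocess, SMILES, n_jobs=2, scheduler="threads", progress=False
    )
    mols = [dm.to_mol(s) for s in cleaned if s is not None]

    fingerprints = [dm.to_fp(m) for m in mols]  # NumPy arrays by default
    descriptors = [dm.descriptors.compute_many_descriptors(m) for m in mols]
    inchis = [dm.to_inchi(m) for m in mols]  # format conversion to InChI

    print(cleaned)
```

Returning SMILES strings from the worker function, rather than molecule objects, keeps the results cheap to pass between workers if a process-based scheduler is used instead.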
datamol — k-dense-ai/scientific-agent-skills