ENA Database — European Nucleotide Archive Programmatic Access

Overview

The European Nucleotide Archive (ENA) is EMBL-EBI's comprehensive nucleotide sequence database, covering raw sequencing reads, genome assemblies, annotated sequences, and associated metadata. It mirrors and extends data from its INSDC partners (GenBank, DDBJ). Public data is queried and retrieved through REST APIs, and no authentication is required.
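
As a minimal sketch of what unauthenticated access looks like, the snippet below builds a plain GET request for a FASTA record using only the Python standard library. The base URL (the ENA Browser API at `https://www.ebi.ac.uk/ena/browser/api`), the `/fasta/` path, and the helper names `fasta_url` and `fetch_text` are assumptions that do not come from this document; check them against the current ENA API documentation before relying on them.

```python
import urllib.request
from urllib.parse import quote

# Assumed base URL of the ENA Browser API (not stated in this document).
ENA_BROWSER_API = "https://www.ebi.ac.uk/ena/browser/api"


def fasta_url(accession: str) -> str:
    """Build the URL for a FASTA record (hypothetical helper)."""
    # Percent-encode the accession so it is safe to place in a URL path.
    return f"{ENA_BROWSER_API}/fasta/{quote(accession, safe='')}"


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Issue a plain GET request: no API key, token, or auth header."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (performs a network call; substitute an accession you know exists):
# print(fetch_text(fasta_url(my_accession))[:200])
```

The URL builder is kept separate from the network call so the request can be inspected or logged before anything is sent.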

When to Use

Searching for sequencing studies, samples, or experiments by organism, project, or keyword
Downloading raw FASTQ/BAM files for reanalysis of public sequencing datasets
Retrieving genome assemblies with quality statistics (N50, contig count, genome size)
Fetching nucleotide sequences in FASTA or EMBL flat-file format by accession
Exploring taxonomic lineage and finding organisms by partial name
Cross-referencing ENA records with external databases (ArrayExpress, UniProt, PDB)
Building bulk download lists for large-scale sequencing projects
For multi-database Python queries (ENA + UniProt + KEGG), prefer bioservices instead
For NCBI-specific queries (PubMed literature, GenBank records), use pubmed-database or Biopython Entrez
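
Two of the use cases above, searching for runs and building a bulk download list, can be sketched as follows. This is an illustration under stated assumptions, not a definitive client: the Portal API base URL, the `search` and `filereport` endpoints, the `read_run` result type, and the semicolon-separated `fastq_ftp` field are taken from general knowledge of ENA and are not specified in this document. All function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed base URL of the ENA Portal API (not stated in this document).
ENA_PORTAL_API = "https://www.ebi.ac.uk/ena/portal/api"


def search_url(result, query, fields, limit=10):
    """Build a search URL (hypothetical helper; parameter names assumed)."""
    params = {
        "result": result,            # record type, e.g. "read_run"
        "query": query,              # ENA query string
        "fields": ",".join(fields),  # columns to return
        "format": "json",
        "limit": limit,
    }
    return f"{ENA_PORTAL_API}/search?{urlencode(params)}"


def filereport_url(accession, fields=("run_accession", "fastq_ftp")):
    """Build a file-report URL listing the files under a study or run."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "json",
    }
    return f"{ENA_PORTAL_API}/filereport?{urlencode(params)}"


def fastq_links(records):
    """Flatten the assumed semicolon-separated fastq_ftp field into URLs."""
    links = []
    for rec in records:
        for path in rec.get("fastq_ftp", "").split(";"):
            if path:
                # Add a scheme when the field holds a bare host/path.
                links.append(path if "://" in path else f"ftp://{path}")
    return links


def get_json(url, timeout=30.0):
    """Plain unauthenticated GET that parses a JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network call; substitute a real project accession):
# for link in fastq_links(get_json(filereport_url(my_project))):
#     print(link)
```

The resulting list of links can be written to a file and passed to a download tool of your choice, which keeps the metadata query separate from the (much slower) file transfer.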
