bioc-pmc-api
BioC API for PMC Open Access
Overview
The BioC API provides full-text articles from the PubMed Central (PMC) Open Access subset in the BioC format, a simplified XML/JSON structure designed specifically for biomedical text mining. Unlike the standard PMC OAI service, which returns JATS XML, BioC pre-segments text into passages annotated with offsets, which makes it well suited to NLP pipelines, named entity recognition, relation extraction, and other text mining tasks. The service is free and requires no authentication.
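To make the passage-and-offset idea concrete, here is a minimal Python sketch that walks a BioC-style JSON collection. The `sample` dictionary is hand-written for illustration and is not real API output. The field names (`documents`, `passages`, `offset`, `infons`, `section_type`, `text`) are assumptions based on the general BioC JSON layout, and `iter_passages` is a hypothetical helper, so check them against an actual response before relying on them.

```python
from typing import Iterator, Tuple

# Hypothetical, hand-written sample shaped like a BioC JSON collection.
# It is NOT real API output; field names are assumptions to verify.
sample = {
    "documents": [
        {
            "id": "PMC0000000",
            "passages": [
                {
                    "offset": 0,
                    "infons": {"section_type": "TITLE"},
                    "text": "An example title",
                },
                {
                    "offset": 17,
                    "infons": {"section_type": "ABSTRACT"},
                    "text": "An example abstract.",
                },
            ],
        }
    ]
}


def iter_passages(collection: dict) -> Iterator[Tuple[int, str, str]]:
    """Yield (offset, section_type, text) for every passage in a collection."""
    for document in collection.get("documents", []):
        for passage in document.get("passages", []):
            offset = passage.get("offset", 0)
            # Missing infons or section_type falls back to an empty string.
            section = passage.get("infons", {}).get("section_type", "")
            yield offset, section, passage.get("text", "")


for offset, section, text in iter_passages(sample):
    print(offset, section, text)
```

Because each passage records where it starts in the document, spans found by a downstream tool inside a passage can be mapped back to document-level positions by adding the passage offset.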
API Endpoints
Base URL
https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi/BioC_json/{PMCID}/unicode
Retrieve by PMC ID
# JSON format (recommended for programmatic use)
curl "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi/BioC_json/PMC6267067/unicode"
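The same request can be issued from Python using only the standard library. This is a sketch rather than an official client: `build_url` and `fetch_bioc` are hypothetical helper names, the 30-second timeout is an arbitrary choice, and the code assumes the response body is UTF-8 encoded JSON.

```python
import json
import urllib.request

# Path prefix taken from the URL template above.
BASE = "https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi"


def build_url(pmcid: str) -> str:
    """Fill the {PMCID} slot of the BioC_json URL template."""
    return f"{BASE}/BioC_json/{pmcid}/unicode"


def fetch_bioc(pmcid: str, timeout: float = 30.0):
    """Download and decode the BioC JSON for one PMC ID.

    Assumes a UTF-8 JSON body; raises urllib.error.HTTPError or
    urllib.error.URLError if the request fails.
    """
    with urllib.request.urlopen(build_url(pmcid), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_bioc("PMC6267067")
    # Inspect the top-level type first: do not assume object vs. list.
    print(type(data).__name__)
```

It is worth inspecting the top-level shape of the decoded value (a single object or a list wrapping it) before indexing into it, since this sketch makes no assumption about which one the service returns.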