conf-abstract-enrichment
Conference Abstract Enrichment
Most of what a scheduler needs to know about a talk is not in its abstract, because the abstract is short, sometimes missing, and written to attract an audience rather than to be classified. This is the one stage in the pipeline where runtime web search earns its keep: a one-line blurb plus the speaker's recent work, the linked GitHub repo, and the sponsoring company is often enough to recover the topic, depth, and stakes that the program left out.
Enrichment is powerful and therefore dangerous. Unconstrained, it invents plausible detail, over-enriches talks that were already fine, and launders a guess into the record as if it were sourced. This skill keeps it honest with four rails: a gate (only enrich what is actually thin), a source-priority ladder (search the most reliable signal first), mandatory provenance (every claim names where it came from), and separation (enriched text never overwrites the original; it lives beside the original so the source of truth survives).
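As a rough illustration of the gate and provenance rails, here is a minimal Python sketch. Everything named in it is an assumption made for illustration: the word-count threshold `MIN_ABSTRACT_WORDS`, the `Claim` type, and both function names are hypothetical and are not defined by this skill.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: the skill does not say how short "thin" is.
MIN_ABSTRACT_WORDS = 40


@dataclass
class Claim:
    """One enriched statement plus the place it came from (hypothetical type)."""
    text: str
    source: str  # URL or description of the source; empty means unsourced


def needs_enrichment(abstract: Optional[str]) -> bool:
    """Gate: only a missing or thin abstract qualifies for enrichment."""
    if abstract is None:
        return True
    return len(abstract.split()) < MIN_ABSTRACT_WORDS


def drop_unsourced(claims: List[Claim]) -> List[Claim]:
    """Provenance: a claim with no named source never reaches the record."""
    return [claim for claim in claims if claim.source.strip()]
```

Under these assumptions, a one-line blurb passes the gate and a full-length abstract does not, and any claim that arrives without a source is discarded rather than recorded as a guess.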
The output is an enrichment record written to a separate file; the ingestor merges it back into the event. Enrichment never edits the program's own words and never reclassifies on its own. It supplies better text and clearer axis signals, each with sources, and leaves the classifying to extraction/clustering.
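The merge step can be pictured with the sketch below, which attaches the record under its own key and leaves every original field untouched. It is only an illustration of the separation idea, not the ingestor's actual code: the function name `merge_enrichment`, the `"enrichment"` key, and the sample field names are placeholders, not the canonical field names of the output contract.

```python
import copy


def merge_enrichment(event: dict, record: dict) -> dict:
    """Return a copy of the event with the enrichment record stored beside it.

    The "enrichment" key is a placeholder chosen for this sketch. The
    original fields are copied unchanged, so the program's own words survive.
    """
    merged = copy.deepcopy(event)
    merged["enrichment"] = copy.deepcopy(record)
    return merged
```

Returning a new dict instead of mutating the input keeps the original event available for comparison, which matches the rule that the source of truth is never overwritten.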
The enrichment record (output contract)
One JSON file per event, written to enrichment/{event_id}.json. These field names are canonical: reproduce them exactly.