
AI Hallucination Fact-Check Protocol

What This Skill Does

Generates a fact-checking protocol specifically adapted for AI-generated text, extending the SIFT framework (Caulfield, 2019) with AI-specific moves that address the distinctive challenge of LLM hallucination. Standard lateral reading assumes a source has an institutional author whose funding and credibility can be investigated. That assumption breaks down for AI-generated text: there is no author to investigate, no institutional funding to check, no About Us page to scrutinise. What remains is the "Trace claims" move, and that move needs AI-specific calibration.

AI hallucinations come in several forms, and each requires a different verification move:

  • Fabricated citations: a named study that does not exist, or that was never actually published
  • Invented statistics: a number with plausible precision but no verifiable origin
  • Misattributed citations: a real paper attributed to the wrong author or journal
  • False consensus claims: "most scientists agree" when no such consensus exists

The output includes a taxonomy of hallucination types for the subject area, an AI-adapted SIFT protocol, specific verification moves for each claim type (sketched below), a Hallucination Hunt classroom activity, and a teacher modelling script showing the difference between finding a real citation and a fabricated one.
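The pairing of hallucination type with verification move is the core of the protocol. A minimal Python sketch of one way to represent that pairing follows; the type names and move descriptions here are illustrative assumptions, not the skill's generated output, which is calibrated to the subject area.

```python
from dataclasses import dataclass
from enum import Enum


class HallucinationType(Enum):
    """The four hallucination types named above."""
    FABRICATED_CITATION = "fabricated citation"        # named study that does not exist
    INVENTED_STATISTIC = "invented statistic"          # plausible precision, no verifiable origin
    MISATTRIBUTED_CITATION = "misattributed citation"  # real paper, wrong author or journal
    FALSE_CONSENSUS = "false consensus"                # claimed agreement that does not exist


@dataclass
class VerificationMove:
    """One 'Trace claims' move, calibrated to a single hallucination type."""
    target: HallucinationType
    move: str


# Illustrative moves only (assumptions, not the skill's actual output).
VERIFICATION_MOVES = [
    VerificationMove(HallucinationType.FABRICATED_CITATION,
                     "Search the exact study title in a scholarly index; a real study surfaces verbatim."),
    VerificationMove(HallucinationType.INVENTED_STATISTIC,
                     "Search the figure with key terms and trace it to a primary source, not a re-quote."),
    VerificationMove(HallucinationType.MISATTRIBUTED_CITATION,
                     "Locate the real paper, then compare its actual authors and venue to the AI's claim."),
    VerificationMove(HallucinationType.FALSE_CONSENSUS,
                     "Look for a named review, meta-analysis, or position statement documenting the consensus."),
]
```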

Evidence Foundation

Wineburg & McGrew (2017, 2019) established through empirical research that professional fact-checkers outperform both students and professors at source evaluation because they use lateral reading, immediately opening new tabs to check what external sources say about a source, rather than vertical reading, which analyses the source itself for credibility cues. This research is the foundation of the SIFT framework. Lateral reading, however, was designed for sources with institutional identities that can be investigated. When the "source" is an LLM, the Investigate step of SIFT requires adaptation: there is no institutional identity, no funding chain, no editorial board. What survives from lateral reading is the "Trace claims" move: verifying that cited evidence exists and says what the AI claims. Caulfield's (2019) SIFT operationalisation provides the structural framework extended here.

Breakstone et al. (2021) found that students are poorly equipped to evaluate online sources, relying on surface credibility markers, a vulnerability dramatically amplified by AI outputs that sound fluent and authoritative. Ji et al. (2023) conducted a systematic survey of hallucination in natural language generation, documenting its prevalence and types in LLMs: intrinsic hallucinations (contradicting source material), extrinsic hallucinations (adding unverifiable or fabricated information), and factual inconsistencies. Their taxonomy directly informs the hallucination categories in this protocol.
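For readers who want the survey taxonomy in a machine-usable form, here is a small sketch encoding the three Ji et al. (2023) categories as summarised above; the alignment with this protocol's classroom categories is an illustrative assumption, not a mapping stated by either source.

```python
from enum import Enum


class SurveyHallucinationClass(Enum):
    """Ji et al. (2023) categories as summarised above."""
    INTRINSIC = "intrinsic"             # contradicts the source material
    EXTRINSIC = "extrinsic"             # adds unverifiable or fabricated information
    FACTUAL_INCONSISTENCY = "factual"   # inconsistent with established fact


# Assumed alignment between this protocol's classroom categories and the
# survey taxonomy, for illustration only; the skill's own mapping may differ.
PROTOCOL_ALIGNMENT = {
    "fabricated citation": SurveyHallucinationClass.EXTRINSIC,
    "invented statistic": SurveyHallucinationClass.EXTRINSIC,
    "misattributed citation": SurveyHallucinationClass.FACTUAL_INCONSISTENCY,
    "false consensus": SurveyHallucinationClass.EXTRINSIC,
}
```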

Input Schema

The teacher must provide:

  • AI output context: The type of AI content being fact-checked. e.g. "ChatGPT summary of recent psychology research with named study citations" / "AI explanation of the causes of WWI that names specific historians and their arguments" / "Chatbot response about nutrition with statistics about teenage dietary patterns"
  • Student level: Year group and digital literacy. e.g. "Year 11, familiar with Google searches but have not formally studied source evaluation" / "Year 9, basic internet literacy"

Optional, injected by the context engine if available (a schema sketch covering both field groups follows the list):

  • Subject area: Discipline context — hallucination patterns differ by subject
  • Hallucination risk: The most likely hallucination type for this context
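A minimal sketch of this schema as a Python TypedDict, assuming key names derived from the field labels above (the page does not specify machine-readable names); the example values are taken from the sample descriptions.

```python
# Requires Python 3.11+ for NotRequired in the standard library.
from typing import NotRequired, TypedDict


class FactCheckProtocolInput(TypedDict):
    """Input fields listed above; key names are assumptions."""
    ai_output_context: str                 # required: type of AI content being fact-checked
    student_level: str                     # required: year group and digital literacy
    subject_area: NotRequired[str]         # optional: discipline context
    hallucination_risk: NotRequired[str]   # optional: most likely hallucination type


# Example built from the sample values given above.
example: FactCheckProtocolInput = {
    "ai_output_context": ("ChatGPT summary of recent psychology research "
                          "with named study citations"),
    "student_level": ("Year 11, familiar with Google searches but have not "
                      "formally studied source evaluation"),
}
```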