ai-output-critical-audit-designer


AI Output Critical Audit Designer

What This Skill Does

Generates a structured protocol for critically auditing AI-generated text against Ennis's (2015) six critical thinking (CT) standards (clarity, accuracy, precision, relevance, depth, and breadth), with adaptations that address AI-characteristic failure modes not covered by general critical thinking frameworks.

The key pedagogical challenge is that AI-generated text is fluent, confident, and well-formed, which makes it harder to evaluate critically than text that looks suspicious. Standard source credibility heuristics (Who wrote this? Who funds it?) break down because the author is an LLM. What replaces them is a close-reading protocol focused on AI-specific patterns: assertions stated with unearned confidence, claims with plausible-sounding precision but no verifiable source, expert-sounding language without genuine epistemic depth, and the systematic absence of "I don't know."

This skill generates the domain anchor for the ai-literacy suite:

  • an annotation protocol (students mark up AI text in real time)
  • an audit rubric (scoring AI text Weak/Moderate/Strong on each CT standard)
  • push-back sentence stems calibrated for AI failure modes
  • a teacher modelling script

This is the equivalent of sourcing-skill-builder in the historical-thinking domain: the foundational move students must learn before they can do the more specialised work.
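As a rough illustration of the audit rubric described above, one way to hold a Weak/Moderate/Strong rating per standard is sketched below. The names `AuditRubric`, `Rating`, `rate`, `unrated`, and `weakest` are hypothetical and are not part of the skill itself; only the six standards and the three rating levels come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum

# The six standards named in the skill description.
STANDARDS = ("clarity", "accuracy", "precision", "relevance", "depth", "breadth")


class Rating(Enum):
    """The three rubric levels named in the skill description."""
    WEAK = "Weak"
    MODERATE = "Moderate"
    STRONG = "Strong"


@dataclass
class AuditRubric:
    """Hypothetical container for one student's audit of one AI text."""
    # Maps standard name -> (rating, free-text justification).
    ratings: dict = field(default_factory=dict)

    def rate(self, standard: str, rating: Rating, note: str = "") -> None:
        """Record a rating for one standard, rejecting unknown standards."""
        if standard not in STANDARDS:
            raise ValueError(f"unknown standard: {standard}")
        self.ratings[standard] = (rating, note)

    def unrated(self) -> list:
        """Standards the student has not scored yet, in canonical order."""
        return [s for s in STANDARDS if s not in self.ratings]

    def weakest(self) -> list:
        """Standards rated Weak, i.e. candidates for a push-back prompt."""
        return [
            s for s in STANDARDS
            if s in self.ratings and self.ratings[s][0] is Rating.WEAK
        ]


# Example: a partially completed audit.
rubric = AuditRubric()
rubric.rate("accuracy", Rating.WEAK, "statistic given with no source")
rubric.rate("clarity", Rating.STRONG)
```

Keeping a free-text note beside each rating reflects the annotation side of the protocol: the score alone says little unless the student records what in the text prompted it.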

Evidence Foundation

Ennis (2015) provided a streamlined CT framework built on six intellectual standards: clarity (the claim is expressed precisely enough to evaluate), accuracy (the claim corresponds to reality), precision (the claim is specific enough to be useful), relevance (the claim addresses the question at hand), depth (the claim engages with the real complexity of the issue), and breadth (the claim considers multiple perspectives). These standards are the explicit theoretical grounding for Kharbach's (2026) AI-age CT activities.

Paul & Elder (2008) operationalised similar standards into the intellectual standards framework used widely in CT education, providing the pedagogical tradition behind Ennis's schema.

Facione's (1990) Delphi consensus defined CT as comprising interpretation, analysis, evaluation, inference, explanation, and self-regulation; the evaluative dimension is precisely what AI audit activates.

Dai et al. (2023) conducted a large-scale empirical analysis of LLM-generated feedback and documented the characteristic failure pattern: AI outputs are fluent, well-structured, and tend toward overconfidence with insufficient epistemic hedging. Their findings, that LLMs produce vague suggestions while avoiding identification of specific errors, map directly onto Ennis's precision and accuracy standards.

Wineburg & McGrew (2019) established that effective text evaluation requires what they call "disciplined scrutiny": a trained, protocol-driven reading practice rather than intuitive judgment. This provides the methodological justification for a structured annotation protocol rather than open-ended evaluation.

Input Schema

The teacher must provide:

  • AI output sample: The specific AI text to audit, OR a description of the type to design the protocol for. e.g. "This is the text of a ChatGPT response about the causes of World War One: [paste text]" / "AI-generated essay introduction for a Year 10 Geography task on climate change" / "Chatbot explanation of quadratic equations for Year 9 Maths students"
  • Student level: Year group and CT experience. e.g. "Year 10, can identify basic argument claims but haven't formally studied CT standards" / "Year 12, familiar with Paul & Elder framework"

Optional (injected by context engine if available):

  • CT standard focus: Which standards to emphasise. e.g. "Accuracy and precision — students often accept AI statistics without checking" / "All six — for a full audit activity"
  • Subject area: The discipline context — what counts as adequate evidence differs by subject
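The required and optional inputs listed above could be captured in a small typed structure. This is a minimal sketch under stated assumptions: the class name `AuditDesignerInput` and the snake_case field names are hypothetical, chosen to mirror the four bullet labels, and the skill does not prescribe any particular serialisation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditDesignerInput:
    """Hypothetical input record mirroring the schema bullets above."""
    # Required: the AI text itself, or a description of the type of text.
    ai_output_sample: str
    # Required: year group and prior CT experience.
    student_level: str
    # Optional: which standards to emphasise (None = no stated focus).
    ct_standard_focus: Optional[str] = None
    # Optional: discipline context, since adequate evidence varies by subject.
    subject_area: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject blank required fields early, before any protocol is generated.
        for name in ("ai_output_sample", "student_level"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")


# Example built from the sample values given in the schema.
request = AuditDesignerInput(
    ai_output_sample="Chatbot explanation of quadratic equations for Year 9 Maths students",
    student_level="Year 10, can identify basic argument claims but haven't formally studied CT standards",
    ct_standard_focus="Accuracy and precision",
)
```

Leaving the optional fields as `None` rather than empty strings keeps "not provided" distinct from "provided but blank", which matters if a context engine fills them in later.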