ai-output-critical-audit-designer
AI Output Critical Audit Designer
What This Skill Does
Generates a structured protocol for critically auditing AI-generated text against Ennis's (2015) six critical thinking standards — clarity, accuracy, precision, relevance, depth, and breadth — with adaptations that address AI-characteristic failure modes not covered by general critical thinking frameworks.

The key pedagogical challenge is that AI-generated text is fluent, confident, and well-formed, which makes it harder to evaluate critically than text that looks suspicious. Standard source credibility heuristics (Who wrote this? Who funds it?) break down because the author is an LLM. What replaces them is a close-reading protocol trained on AI-specific patterns: assertions stated with unearned confidence, claims with plausible-sounding precision but no verifiable source, expert-sounding language without genuine epistemic depth, and the systematic absence of "I don't know."

This skill generates the domain anchor for the ai-literacy suite: an annotation protocol (students mark up AI text in real time), an audit rubric (scoring AI text Weak/Moderate/Strong on each CT standard), push-back sentence stems calibrated for AI failure modes, and a teacher modelling script. It is the equivalent of sourcing-skill-builder in the historical-thinking domain — the foundational move students must learn before they can do the more specialised work.
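The audit rubric described above can be sketched as a small data structure. This is an illustrative sketch only — the class and function names (`AuditRubric`, `score`, `unscored`) are hypothetical and not part of the generated skill; only the six Ennis standards and the Weak/Moderate/Strong levels come from the description itself.

```python
# Hypothetical sketch of the audit rubric: six Ennis (2015) standards,
# each scored Weak / Moderate / Strong against one AI output.
from dataclasses import dataclass, field

STANDARDS = ("clarity", "accuracy", "precision", "relevance", "depth", "breadth")
LEVELS = ("Weak", "Moderate", "Strong")

@dataclass
class AuditRubric:
    # Maps a standard to the level the student assigned it.
    scores: dict = field(default_factory=dict)

    def score(self, standard: str, level: str) -> None:
        if standard not in STANDARDS:
            raise ValueError(f"unknown standard: {standard}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.scores[standard] = level

    def unscored(self) -> list:
        # Standards the student has not yet audited.
        return [s for s in STANDARDS if s not in self.scores]

rubric = AuditRubric()
rubric.score("accuracy", "Weak")       # e.g. a statistic with no verifiable source
rubric.score("precision", "Moderate")
print(rubric.unscored())               # ['clarity', 'relevance', 'depth', 'breadth']
```

Keeping the standards as a fixed tuple makes the "full audit" case (all six standards scored) easy to check: the audit is complete exactly when `unscored()` is empty.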
Evidence Foundation
Ennis (2015) provided a streamlined CT framework built on six intellectual standards: clarity (the claim is expressed precisely enough to evaluate), accuracy (the claim corresponds to reality), precision (the claim is specific enough to be useful), relevance (the claim addresses the question at hand), depth (the claim engages with the real complexity of the issue), and breadth (the claim considers multiple perspectives). These standards are the explicit theoretical grounding for Kharbach's (2026) AI-age CT activities.

Paul & Elder (2008) operationalised similar standards into the intellectual standards framework used widely in CT education, providing the pedagogical tradition behind Ennis's schema. Facione's (1990) Delphi consensus defined CT as comprising interpretation, analysis, evaluation, inference, explanation, and self-regulation — the evaluative dimension is precisely what AI audit activates.

Dai et al. (2023) conducted a large-scale empirical analysis of LLM-generated feedback and documented the characteristic failure pattern: AI outputs are fluent, well-structured, and tend toward overconfidence with insufficient epistemic hedging. Their findings — that LLMs produce vague suggestions while avoiding identification of specific errors — map directly onto Ennis's precision and accuracy standards.

Wineburg & McGrew (2019) established that effective text evaluation requires what they call "disciplined scrutiny" — a trained, protocol-driven reading practice rather than intuitive judgment. This provides the methodological justification for a structured annotation protocol rather than open-ended evaluation.
Input Schema
The teacher must provide:
- AI output sample: The specific AI text to audit, OR a description of the type of text to design the protocol for. e.g. "This is the text of a ChatGPT response about the causes of World War One: [paste text]" / "AI-generated essay introduction for a Year 10 Geography task on climate change" / "Chatbot explanation of quadratic equations for Year 9 Maths students"
- Student level: Year group and CT experience. e.g. "Year 10, can identify basic argument claims but haven't formally studied CT standards" / "Year 12, familiar with Paul & Elder framework"
Optional (injected by context engine if available):
- CT standard focus: Which standards to emphasise. e.g. "Accuracy and precision — students often accept AI statistics without checking" / "All six — for a full audit activity"
- Subject area: The discipline context — what counts as adequate evidence differs by subject
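A filled-in instance of the schema above might look like the following. The dict shape and key names here are an assumption made for illustration — the skill description does not specify a concrete input format, only the required and optional fields.

```python
# Hypothetical example of the teacher-supplied inputs, shaped as a plain
# dict. Field names mirror the schema above; values are illustrative.
teacher_input = {
    # Required:
    "ai_output_sample": "Chatbot explanation of quadratic equations for Year 9 Maths students",
    "student_level": "Year 10, can identify basic argument claims; no formal CT standards study",
    # Optional (may be injected by the context engine if available):
    "ct_standard_focus": "Accuracy and precision",
    "subject_area": "Mathematics",
}

# Minimal validation: the two required fields must be present and non-empty.
REQUIRED = ("ai_output_sample", "student_level")
missing = [k for k in REQUIRED if not teacher_input.get(k)]
assert not missing, f"missing required inputs: {missing}"
```

Treating the optional fields as absent-by-default matches the schema: the protocol can be generated from the two required inputs alone, with standard focus and subject area refining it when supplied.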