ai-feedback-design-principles
AI Feedback Design Principles
What This Skill Does
Evaluates a proposed AI feedback design against research criteria for effective automated feedback and suggests specific improvements. This skill takes a feedback scenario (what the student did) and the current or proposed AI response (what the system says), then analyses the feedback against principles from Shute (2008), Narciss (2008), Hattie & Timperley (2007), and emerging LLM feedback research (Dai et al., 2023). The output includes a diagnosis of what's working and what isn't, a redesigned version of the feedback, and practical implementation guidance.

The core challenge is that most AI feedback falls into one of two failure modes: it is either too vague to be actionable ("Good effort! Try to improve your argument.") or so specific that it does the thinking for the student ("Your thesis should be: Climate change is the defining challenge of our generation because…"). Effective feedback lives in the narrow space between these extremes — specific enough that the student knows what to do, but not so specific that it bypasses the cognitive work that produces learning.

AI is particularly valuable here because it can generate feedback at scale, but scale makes the design principles even more critical: bad feedback at scale is worse than no feedback at all.
Evidence Foundation
Hattie & Timperley (2007) conducted a meta-analysis finding that feedback has an average effect size of 0.73 — making it one of the most powerful influences on learning. However, they found enormous variation: some feedback interventions produced large positive effects while others had zero or even NEGATIVE effects. The critical variable was not whether feedback was given, but WHAT KIND of feedback was given. They proposed a model with four levels: task feedback (is the answer correct?), process feedback (what strategies can improve the work?), self-regulation feedback (how can you monitor your own learning?), and self feedback (you're a great student!). Task and process feedback were most effective; self feedback ("Good job!") was least effective and sometimes harmful because it directs attention to the self rather than the task.

Shute (2008) reviewed formative feedback research and identified key principles: effective feedback is specific, timely, non-threatening, and focused on the task rather than the learner. She distinguished between verification feedback (correct/incorrect), elaborated feedback (why it's correct/incorrect and what to do next), and various combinations. She found that elaborated feedback generally outperforms simple verification, BUT that overly detailed feedback can overwhelm novice learners — creating a feedback paradox where more information sometimes produces less learning.

Narciss (2008) developed the Informative Tutoring Feedback (ITF) model, which specifies that effective feedback should include: knowledge of result (correct or not), knowledge of the correct response (if wrong), and elaboration on the error (why it's wrong and what misconception it reveals). Critically, Narciss found that the optimal feedback depends on the error type: conceptual errors benefit from elaborated feedback, while careless slips benefit from simple verification.
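Narciss's error-type distinction can be made concrete in code. The sketch below is illustrative only — the function and type names are hypothetical, not part of any published ITF implementation — but it shows how a feedback generator might branch on error type: simple verification for slips, the full elaborated package for conceptual errors.

```python
from enum import Enum
from typing import Optional

class ErrorType(Enum):
    SLIP = "careless slip"        # e.g. arithmetic typo, dropped sign
    CONCEPTUAL = "misconception"  # e.g. confusing inverse operations

def select_feedback(is_correct: bool, error_type: Optional[ErrorType],
                    correct_answer: str, elaboration: str) -> str:
    """Assemble ITF-style feedback: knowledge of result first, then
    knowledge of the correct response plus elaboration only when the
    error type warrants it (per Narciss, 2008)."""
    if is_correct:
        return "Correct."
    if error_type is ErrorType.SLIP:
        # Careless slips benefit from simple verification: prompt the
        # student to recheck rather than re-teaching the concept.
        return "Not quite — check your working and try again."
    # Conceptual errors get the elaborated package: correct response
    # plus an explanation of the misconception the error reveals.
    return f"Incorrect. The correct answer is {correct_answer}. {elaboration}"
```

Applied to the worked example later in this page (3x + 5 = 20 answered as x = 7), a slip would get the short verification prompt, while a diagnosed misconception about inverse operations would get the elaborated response.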
Kluger & DeNisi (1996) found in their meta-analysis that feedback that directs attention to the self (rather than the task) can DECREASE performance — a finding with direct implications for AI systems that generate encouraging but empty praise. Dai et al. (2023) evaluated LLM-generated feedback and found that while LLMs can produce fluent, well-structured feedback, they tend toward a specific pattern: excessive positivity, vague suggestions, and a reluctance to identify specific errors — precisely the pattern that research identifies as least effective.
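The failure pattern Dai et al. (2023) describe — excessive positivity and vague suggestions — is detectable with simple surface heuristics. The checker below is a hypothetical sketch, not an instrument from any of the cited studies: the regex patterns are illustrative markers of self-level praise and non-actionable advice, useful only as a first-pass audit of draft AI feedback.

```python
import re

# Hypothetical heuristic patterns (not from the cited research):
# surface markers that often accompany ineffective feedback.
SELF_PRAISE = re.compile(r"\b(great (job|essay|work)|good effort|well done)\b", re.I)
VAGUE_ADVICE = re.compile(r"\btry to (improve|do better)\b|\bneeds work\b", re.I)

def audit_feedback(text: str) -> list:
    """Return warnings where the draft feedback matches patterns the
    research associates with ineffective feedback."""
    warnings = []
    if SELF_PRAISE.search(text):
        warnings.append("self-level praise: directs attention to the learner, not the task")
    if VAGUE_ADVICE.search(text):
        warnings.append("vague suggestion: no specific, actionable next step")
    return warnings
```

Run against the sample feedback from the Input Schema below ("Great essay! … try adding some evidence"), both patterns fire; task-focused feedback that names a specific gap and next step passes clean.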
Input Schema
The teacher must provide:
- Feedback scenario: What the student did. e.g. "Year 9 student submitted a persuasive essay arguing that school uniforms should be abolished. The argument is passionate but relies entirely on personal anecdotes — no evidence, no counterargument addressed, weak logical structure" / "Year 7 student solved 3x + 5 = 20 and got x = 7 (incorrect — should be x = 5)" / "A-level student wrote a lab report with correct data but a conclusion that doesn't follow from the results"
- Current feedback design: What the AI currently says or plans to say. e.g. "Great essay! You clearly feel strongly about this topic. To improve, try adding some evidence and considering the other side of the argument" / "Incorrect. The answer is x = 5. Try again" / "Your conclusion needs work. Think about what your data actually shows"
Optional (injected by context engine if available):
- Student level: Year group and proficiency
- Subject area: The curriculum subject
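The input schema above can be sketched as a small data structure. The class and field names below are hypothetical (this skill does not publish a formal schema); the sketch simply mirrors the required and optional fields described above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackReviewInput:
    """Illustrative shape of the skill's input; field names are
    assumptions, not a published interface."""
    scenario: str                        # what the student did
    current_feedback: str                # what the AI currently says or plans to say
    student_level: Optional[str] = None  # e.g. "Year 9", injected by context engine
    subject_area: Optional[str] = None   # e.g. "English", injected by context engine
```

The two required fields match the teacher-supplied inputs; the two optional fields default to None because they are only present when the context engine can supply them.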