
Confidence Calibration Check

moderate evidence · Student Learning

Capture confidence ratings before and after a learning attempt to identify overconfidence and underconfidence patterns. Use when a student wants to understand how well they actually know something versus how well they think they know it.

What it does

Captures a confidence rating (0–100) before a knowledge attempt and again after the learner receives feedback, then compares the two ratings with each other and with actual performance. The AI identifies and names the pattern. Overconfidence (high confidence, poor performance) and underconfidence (low confidence, good performance) are both worth surfacing. Over time, tracking calibration accuracy becomes a metacognitive skill in its own right: learners who can accurately predict their own knowledge gaps allocate study time more effectively. This skill makes the "illusion of competence" visible and actionable.
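The comparison described above can be sketched in a few lines of Python. This is only an illustration, not how EvidenceLesson implements the check: the `CalibrationCheck` record, the 0–100 `score` scale, and the 20-point `tolerance` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CalibrationCheck:
    confidence_before: int  # learner's 0-100 rating before the attempt
    score: int              # graded performance, 0-100 (assumed scale)
    confidence_after: int   # learner's 0-100 rating after feedback


def classify(check: CalibrationCheck, tolerance: int = 20) -> str:
    """Name the pattern by comparing pre-attempt confidence with the score.

    `tolerance` is a hypothetical cut-off chosen for this sketch.
    """
    gap = check.confidence_before - check.score
    if gap > tolerance:
        return "overconfident"    # high confidence, poor performance
    if gap < -tolerance:
        return "underconfident"   # low confidence, good performance
    return "well calibrated"


def confidence_shift(check: CalibrationCheck) -> int:
    """How far the rating moved after feedback (negative = revised down)."""
    return check.confidence_after - check.confidence_before


check = CalibrationCheck(confidence_before=90, score=40, confidence_after=55)
print(classify(check), confidence_shift(check))
```

Here a learner who rated themselves 90 and scored 40 is labelled overconfident, and the shift of -35 shows they revised their rating down after feedback.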

The evidence behind it

Bjork & Bjork (2011) identified the illusion of competence as one of the primary obstacles to effective self-study: re-reading material produces a feeling of familiarity that learners mistake for genuine understanding. Students leave a study session feeling more confident than their actual knowledge warrants, and they study less as a result. Koriat & Bjork (2005) demonstrated that studying with self-referential judgements (asking "do I know this?") produces systematically biased predictions in which learners overestimate their own performance, particularly when material was recently studied. Thiede et al. (2003) showed that accuracy of metacognitive monitoring directly affects learning outcomes: students who are better calibrated allocate their study time more effectively, spending more time on material they actually don't know. Kruger & Dunning (1999) documented the broader pattern: novices in a domain not only perform poorly but also lack the knowledge to recognise their own performance gaps, which inflates their self-assessment. Hacker et al. (2008) found in a classroom study that students who made test predictions before sitting exams, then compared predictions to results, showed improved performance on subsequent assessments. This suggests that the act of comparison itself has metacognitive training value.

Sources

How to use it in your lesson

For the best results with EvidenceLesson, give it:

Known limitations

  1. The 0–100 confidence scale is intuitive but not psychometrically calibrated. Different learners interpret the scale differently. A 70 from a naturally anxious learner may represent stronger actual knowledge than a 70 from an overconfident one. The skill uses relative calibration (before vs. after, across sessions) rather than absolute numbers, which is more informative than any single rating.
  2. The calibration pattern depends on the quality of the performance evaluation. If the AI misjudges the accuracy of the learner's attempt (calling something "partial" when it was actually strong, or missing a genuine misconception), the calibration feedback will be misleading. The AI's subject-matter accuracy is a hard dependency.
  3. Frequent calibration checks can produce strategic responding if learners game the rating. A learner who understands the pattern may artificially lower pre-attempt confidence to appear "well-calibrated" after a good performance. This is unlikely with genuinely motivated learners but worth noting.
  4. Calibration data across sessions requires session history to be useful. A single calibration check gives a snapshot. The pattern (persistent overconfidence, improving calibration) only becomes visible over multiple sessions. This skill pairs with 20-10 (SRL Session Wrapper) and 20-12 (Weekly Agency Review) for longitudinal tracking.
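
The relative, cross-session view described in these limitations could be tracked along the following lines. This is a minimal sketch under stated assumptions: each session is a list of hypothetical `(confidence_before, score)` pairs on a 0–100 scale, and comparing the earlier half of the history with the later half is an illustrative choice, not part of the method.

```python
from statistics import mean


def session_bias(checks):
    """Mean signed gap (confidence - score); positive means overconfident."""
    return mean(conf - score for conf, score in checks)


def calibration_trend(sessions):
    """Compare mean absolute bias in the earlier and later halves of the history."""
    if len(sessions) < 2:
        # A single session is only a snapshot, as noted above.
        return "not enough sessions"
    errors = [abs(session_bias(s)) for s in sessions]
    half = len(errors) // 2
    early, late = mean(errors[:half]), mean(errors[half:])
    if late < early:
        return "improving"
    if late > early:
        return "worsening"
    return "stable"


# Hypothetical history: three sessions of (confidence_before, score) pairs.
history = [[(90, 50), (80, 40)], [(80, 60)], [(70, 65)]]
print([session_bias(s) for s in history], calibration_trend(history))
```

Because the bias is averaged within a session, a learner who is overconfident on some items and underconfident on others can look well calibrated in this summary, so item-level gaps are still worth reviewing.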

Pairs well with
