AI Hallucination Fact-Check Protocol
Design a fact-checking protocol for AI-generated text, extending SIFT with AI-specific adaptations for hallucination detection. Use when students need to verify AI claims and citations.
What it does
Generates a fact-checking protocol specifically adapted for AI-generated text. It extends the SIFT framework (Caulfield, 2019) with AI-specific moves that address the unique challenge of LLM hallucination. Standard lateral reading assumes a source has an institutional author whose funding and credibility can be investigated. This assumption breaks down for AI-generated text: there is no author to investigate, no institutional funding to check, no About Us page to scrutinise. What remains is the "Trace claims" move, and that move needs AI-specific calibration.
AI hallucinations come in several forms:
- Fabricated citations: a named study that does not exist, or exists but was never published
- Invented statistics: a number with plausible precision but no verifiable origin
- Real citations misattributed: a real paper attributed to the wrong author or journal
- False consensus claims: "most scientists agree" when no such consensus exists
Each requires a different verification move. The output includes a taxonomy of hallucination types for the subject area, an AI-adapted SIFT protocol, specific verification moves for each claim type, a Hallucination Hunt classroom activity, and a teacher modelling script showing the difference between finding a real and a fabricated citation.
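As a rough illustration of "each requires a different verification move", the four hallucination forms named above can be written as a small lookup table. This is a minimal sketch: the names `HallucinationType`, `FIRST_CHECK`, and `first_check` are hypothetical, and the first-check wording is an assumption made for illustration, not the text the skill itself generates.

```python
from enum import Enum


class HallucinationType(Enum):
    """The four hallucination forms named in the description above."""
    FABRICATED_CITATION = "fabricated citation"
    INVENTED_STATISTIC = "invented statistic"
    MISATTRIBUTED_CITATION = "real citation misattributed"
    FALSE_CONSENSUS = "false consensus claim"


# Illustrative first checks only (an assumption, not the protocol's own wording).
FIRST_CHECK = {
    HallucinationType.FABRICATED_CITATION:
        "Search a library database for the exact title and author to see if the study exists.",
    HallucinationType.INVENTED_STATISTIC:
        "Trace the number back to a named dataset or report that actually contains it.",
    HallucinationType.MISATTRIBUTED_CITATION:
        "Open the real paper and compare its author and journal with what the AI stated.",
    HallucinationType.FALSE_CONSENSUS:
        "Look for independent coverage that confirms or disputes the claimed consensus.",
}


def first_check(kind: HallucinationType) -> str:
    """Return the illustrative first verification step for a claim type."""
    return FIRST_CHECK[kind]


print(first_check(HallucinationType.INVENTED_STATISTIC))
```

Keeping the claim types in an enum means a classroom worksheet or checklist built on top of it cannot silently omit one of the four categories.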
The evidence behind it
Wineburg & McGrew (2017, 2019) established through empirical research that professional fact-checkers outperform both students and professors at source evaluation because they use lateral reading (immediately opening new tabs to check what external sources say about a source) rather than vertical reading (analysing the source itself for credibility cues). This research is the foundation of the SIFT framework. However, lateral reading was designed for sources with institutional identities that can be investigated. When the "source" is an LLM, the Investigate step of SIFT requires adaptation: there is no institutional identity, no funding chain, no editorial board. What survives from lateral reading is the "Trace claims" move: verifying that cited evidence exists and says what the AI claims. Caulfield's (2019) SIFT operationalisation provides the structural framework extended here.
Breakstone et al. (2021) found that students are poorly equipped to evaluate online sources, relying on surface credibility markers. That vulnerability is dramatically amplified by AI outputs that are fluent and authoritative-sounding.
Ji et al. (2023) conducted a systematic survey of hallucination in natural language generation, documenting the prevalence and types of hallucination in LLMs: intrinsic hallucinations (contradicting source material), extrinsic hallucinations (adding unverifiable or fabricated information), and factual inconsistencies. Their taxonomy directly informs the hallucination categories in this protocol.
Sources
- Wineburg & McGrew (2017) — Lateral reading: reading less and learning more when evaluating digital information
- Wineburg & McGrew (2019) — Lateral reading and the nature of expertise
- Caulfield (2019) — SIFT: the four moves (Stop, Investigate, Find better coverage, Trace claims)
- Breakstone et al. (2021) — Students' civic online reasoning: a national portrait
- Ji et al. (2023) — Survey of hallucination in natural language generation
How to use it in your lesson
For the best results with EvidenceLesson, give it:
- ai_output_context — The type of AI-generated content students are fact-checking — e.g. 'ChatGPT explanation of the French Revolution with cited historian names', 'AI research summary with statistics about teen mental health'
- student_level — Age/year group and digital literacy level
- subject_area (optional) — The discipline — affects what hallucination types are most common and how to verify claims
- hallucination_risk (optional) — The specific hallucination type most likely in this context — citation fabrication, statistical invention, event misattribution, false consensus claims
- verification_resources (optional) — What verification tools students have access to — library databases, Google Scholar, specific trusted websites
- ai_tool (optional) — Which AI tool students are fact-checking output from
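The inputs above can be pictured as a simple record with two required fields and four optional ones. The sketch below is only an illustration: the field names come from the list, but the `FactCheckRequest` class, its `missing_required` helper, and the example values are hypothetical and do not describe how EvidenceLesson actually accepts input.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FactCheckRequest:
    """Hypothetical container for the inputs listed above."""
    ai_output_context: str                        # required
    student_level: str                            # required
    subject_area: Optional[str] = None            # optional
    hallucination_risk: Optional[str] = None      # optional
    verification_resources: Optional[str] = None  # optional
    ai_tool: Optional[str] = None                 # optional

    def missing_required(self) -> List[str]:
        """Return the names of required fields that are empty or whitespace."""
        required = {
            "ai_output_context": self.ai_output_context,
            "student_level": self.student_level,
        }
        return [name for name, value in required.items() if not value.strip()]


# Example values are made up for illustration.
request = FactCheckRequest(
    ai_output_context="AI research summary with statistics about teen mental health",
    student_level="Year 10, moderate digital literacy",
    verification_resources="Google Scholar",
)
print(request.missing_required())
```

Writing the request out this way makes it easy to see that only the content type and the student level are needed to get started; the optional fields narrow the protocol to a discipline, a likely hallucination type, the tools available, and a specific AI tool.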
Known limitations
- Verification requires time and database access. The full verification protocol — finding a study, checking the abstract, verifying the claim — takes 3-5 minutes per citation. In an essay-writing context, students may verify one or two key claims but cannot verify every AI statement. This skill teaches the verification habit, not the expectation of exhaustive fact-checking.
- Some hallucinations are genuinely hard to detect. Consider a real paper by a real author, published in a real journal and correctly summarised, but slightly out of date or drawn from a different population. Catching this requires reading the methods section, not just confirming the paper exists. Students with limited academic reading skills may not reach this level of verification independently.
- LLM hallucination rates vary by model and topic. Ji et al. (2023) documented hallucination across multiple models; rates vary significantly by task type and subject domain. Hallucination is less common in well-represented domains (recent high-profile science, mainstream political history) and more common in niche topics, cutting-edge research, and specialised sub-fields. Teachers should calibrate expectations accordingly.
- AI-specific applications of lateral reading have limited direct empirical validation. The lateral reading / SIFT evidence base (Wineburg & McGrew, 2017, 2019; Caulfield, 2019) is strong for general source evaluation. The AI-specific adaptations in this protocol are principled extensions of that evidence base, not independently validated interventions. The "source reconstruction" move is logically sound but has not been formally tested in educational research.
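The first limitation above puts full verification at 3-5 minutes per citation. A quick calculation shows why students can realistically check only one or two key claims in a short slot. This is a minimal sketch: the function name `citations_checkable` and the 10-minute slot are assumptions chosen for illustration.

```python
from typing import Tuple


def citations_checkable(minutes_available: float,
                        minutes_per_citation: Tuple[float, float] = (3, 5)) -> Tuple[int, int]:
    """Return (pessimistic, optimistic) counts of citations that fit in the time.

    The default 3-5 minute range is the per-citation estimate from the
    limitations above; the time slot itself is supplied by the teacher.
    """
    fastest, slowest = minutes_per_citation
    pessimistic = int(minutes_available // slowest)
    optimistic = int(minutes_available // fastest)
    return pessimistic, optimistic


# Assumed example: a 10-minute verification slot inside a lesson.
print(citations_checkable(10))  # (2, 3)
```

Under these assumptions a 10-minute slot covers two or three citations, which is consistent with teaching the verification habit on a few key claims and not expecting exhaustive fact-checking.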