AI Socratic Dialogue Designer

Moderate evidence · ⏱ 4 minutes · AI Literacy

Design a multi-round questioning sequence for interrogating AI chatbot answers, tracking how responses shift and distinguishing genuine updates from sycophantic capitulation. Use when teaching students to probe AI critically.

What it does

Generates a multi-round questioning sequence for interrogating AI chatbots: students probe an answer through iterative questioning, track how the response shifts across rounds, and learn to distinguish genuine logical concession (the AI updates because a new argument is logically compelling) from sycophantic capitulation (the AI agrees because it is trained to defer to user pushback).

This addresses a fundamental asymmetry between AI and human Socratic dialogue. AI systems are trained to be helpful and agreeable, so they often revise their answers in response to user pushback regardless of whether the pushback is logically valid. A student who pushes back on an AI answer and receives a more agreeable response may conclude that persistence equals correctness, a false inference with significant implications for how they evaluate evidence.

The pedagogical goal is to teach students to interrogate AI critically, to distinguish "the AI changed its mind because I made a good argument" from "the AI changed its mind because I pushed back," and to develop the disposition to demand logical evidence rather than settle for agreement.

The output includes a multi-round questioning sequence using Paul & Elder's question types adapted for AI, an answer drift tracker protocol, a capitulation taxonomy, facilitation notes, and a debrief guide.
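One way to picture the answer drift tracker and the capitulation taxonomy is as a small record kept for each round. The sketch below is a minimal illustration in Python, not the tool's actual output format: the `RoundRecord` fields, the `ChangeType` labels, and the `classify` rule are all assumptions chosen to mirror the distinctions drawn in this description. The labels are candidates only; students still have to judge whether a new argument was actually compelling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ChangeType(Enum):
    """Hypothetical labels for how an AI answer moved between rounds."""
    NO_CHANGE = "no change"
    GENUINE_UPDATE = "candidate genuine logical update"
    CAPITULATION = "candidate sycophantic capitulation"
    ACKNOWLEDGED_UNCERTAINTY = "acknowledged uncertainty"


@dataclass
class RoundRecord:
    """One row of the drift tracker, filled in by the student (assumed fields)."""
    round_number: int
    reversed_claim: bool             # did the AI abandon or reverse its earlier claim?
    pushback_had_new_argument: bool  # did the student's prompt add evidence or reasoning?
    added_uncertainty: bool = False  # did the AI hedge without reversing?


def classify(record: RoundRecord) -> ChangeType:
    """Assumed rule of thumb: a reversal with no new argument behind it is suspect."""
    if record.reversed_claim:
        if record.pushback_had_new_argument:
            return ChangeType.GENUINE_UPDATE
        return ChangeType.CAPITULATION
    if record.added_uncertainty:
        return ChangeType.ACKNOWLEDGED_UNCERTAINTY
    return ChangeType.NO_CHANGE


def summarise(log: List[RoundRecord]) -> List[str]:
    """Turn the round-by-round log into one readable line per round."""
    return [f"Round {r.round_number}: {classify(r).value}" for r in log]


# Made-up observations, for illustration only.
log = [
    RoundRecord(1, reversed_claim=False, pushback_had_new_argument=False),
    RoundRecord(2, reversed_claim=False, pushback_had_new_argument=True,
                added_uncertainty=True),
    RoundRecord(3, reversed_claim=True, pushback_had_new_argument=False),
]
for line in summarise(log):
    print(line)
```

Recording whether the pushback contained a new argument separately from whether the AI reversed its claim is the point of the design: it keeps "the AI changed" and "I gave it a reason to change" as two distinct observations.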

The evidence behind it

Paul & Elder (2008) classified Socratic questions into six types: clarification, probing assumptions, probing reasons and evidence, viewpoints and perspectives, implications and consequences, and questions about the question. These types are adapted here for AI dialogue: they remain valid as analytical moves, but the AI-specific context changes what the responses mean.

Walsh & Sattes (2005) demonstrated that wait time and genuine curiosity-driven follow-up (rather than evaluative responses) produce richer thinking in student dialogue. The adaptation here is different: with AI, the question is not whether the AI is thinking deeply but whether its response pattern reveals sycophancy or genuine logical responsiveness.

Nystrand et al. (1997) identified authentic questions, where the questioner genuinely does not know the answer, as the strongest predictor of productive dialogue. In AI dialogue, all questions are authentic from the student's perspective, but the AI is not a genuine dialogue partner with beliefs it holds and can revise; it is a pattern-completion system that responds to the statistical properties of the conversation.

Perez et al. (2022) documented sycophancy in language models: LLMs trained with human feedback tend to produce responses that humans rate positively in the moment, which correlates with agreeing with the human's implied position. The result is a systematic bias: when users express disagreement with an AI response, the AI will often revise toward the user's position even when the pushback contains no logical argument.

Wei et al. (2022) showed that chain-of-thought prompting (asking the AI to show its reasoning step by step) improves performance on multi-step reasoning tasks. Because the intermediate steps are written out, inconsistencies in reasoning also become easier to see. The multi-round dialogue structure here uses chain-of-thought techniques to expose the reasoning patterns that make capitulation detectable.
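As a rough sketch of how the six question types might be sequenced across rounds, the Python below pairs each type with an example stem. The type names are the six listed above; the stem wording, the fixed ordering, and the `plan_rounds` helper are assumptions made for illustration, not the sequence the tool generates.

```python
from enum import Enum
from typing import List, Tuple


class QuestionType(Enum):
    """The six Socratic question types named in the text."""
    CLARIFICATION = "clarification"
    ASSUMPTIONS = "probing assumptions"
    REASONS_AND_EVIDENCE = "probing reasons and evidence"
    VIEWPOINTS = "viewpoints and perspectives"
    IMPLICATIONS = "implications and consequences"
    ABOUT_THE_QUESTION = "questions about the question"


# Illustrative stems only (assumed wording, not quoted from any source).
STEMS = {
    QuestionType.CLARIFICATION: "What exactly do you mean by that claim?",
    QuestionType.ASSUMPTIONS: "What are you assuming for that to be true?",
    QuestionType.REASONS_AND_EVIDENCE: (
        "Show your reasoning step by step. What evidence supports each step?"
    ),
    QuestionType.VIEWPOINTS: "How would someone who disagrees respond?",
    QuestionType.IMPLICATIONS: "If that is right, what follows from it?",
    QuestionType.ABOUT_THE_QUESTION: "Was my original question the right one to ask?",
}


def plan_rounds(n_rounds: int) -> List[Tuple[int, QuestionType, str]]:
    """Return (round number, question type, stem) for the first n_rounds types."""
    types = list(QuestionType)
    if not 1 <= n_rounds <= len(types):
        raise ValueError(f"n_rounds must be between 1 and {len(types)}")
    return [(i, t, STEMS[t]) for i, t in enumerate(types[:n_rounds], start=1)]


for number, qtype, stem in plan_rounds(4):
    print(f"Round {number} ({qtype.value}): {stem}")
```

The reasons-and-evidence stem is where the chain-of-thought request sits in this sketch, since asking for step-by-step reasoning gives students something concrete to compare against later rounds.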

Sources

Paul & Elder (2008). The Miniature Guide to Critical Thinking Concepts and Tools.
Walsh & Sattes (2005). Quality Questioning: Research-Based Practice to Engage Every Learner.
Nystrand et al. (1997). Opening Dialogue: Understanding the Dynamics of Language and Learning in the English Classroom.
Perez et al. (2022). Discovering Language Model Behaviors with Model-Written Evaluations.
Wei et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.

How to use it in your lesson

For the best results with EvidenceLesson, give it:

interrogation_topic — The AI claim or answer to probe through multi-round questioning — a statement, explanation, or position the AI has taken or would likely take
student_level — Age/year group and familiarity with Socratic questioning
subject_area (optional) — The discipline — affects what counts as a logical update vs. capitulation, and what evidence standards apply
rounds (optional) — Target number of questioning rounds — typically 3-5
capitulation_focus (optional) — Whether to emphasise detecting sycophancy, tracking logical consistency, or both
discussion_format (optional) — How findings are shared — individual, pair comparison, or class debrief
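To make the inputs above concrete, here is a minimal sketch of how they could be gathered and checked before use. The field names are the ones listed above; the `DialogueRequest` class, the `validate` helper, and the example values are hypothetical and do not describe EvidenceLesson's actual interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DialogueRequest:
    """Hypothetical container for the inputs listed above (not a real API)."""
    interrogation_topic: str
    student_level: str
    subject_area: Optional[str] = None
    rounds: Optional[int] = None
    capitulation_focus: Optional[str] = None
    discussion_format: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request looks usable."""
        problems = []
        if not self.interrogation_topic.strip():
            problems.append("interrogation_topic is required")
        if not self.student_level.strip():
            problems.append("student_level is required")
        # The description gives 3-5 as the typical range, so flag anything outside it.
        if self.rounds is not None and not 3 <= self.rounds <= 5:
            problems.append("rounds is typically 3-5")
        return problems


# Example values are invented for illustration.
request = DialogueRequest(
    interrogation_topic="The AI's explanation of why the seasons change",
    student_level="Year 9, new to Socratic questioning",
    subject_area="Science",
    rounds=4,
    capitulation_focus="both",
    discussion_format="pair comparison",
)
print(request.validate())
```

Only the first two fields are required in this sketch, matching the list above, where every other input is marked optional.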

Known limitations

AI sycophancy rates vary by model and by conversation context. Perez et al. (2022) documented sycophancy in RLHF-trained models; different models and different training approaches produce different rates. Some models are specifically fine-tuned to resist capitulation. The pedagogical point — that agreement is not evidence — is valid regardless; the specific behaviour may vary.

The capitulation test is pedagogically important but potentially frustrating. Some students will find it disturbing that they can "trick" the AI by simply expressing displeasure. Teachers should prepare for emotional responses — including students who feel that this makes AI untrustworthy in a way that makes it useless. The goal is calibrated skepticism, not blanket dismissal.

Not all AI position changes are capitulation. Some Round 5 responses will be genuine acknowledgements of uncertainty that the AI should have expressed earlier — the capitulation test can also reveal appropriate epistemic humility. Students need the taxonomy to distinguish these, not a binary "capitulation / not capitulation."

AI-specific applications of Socratic pedagogy have limited direct empirical validation. The Socratic questioning evidence base (Paul & Elder, Walsh & Sattes, Nystrand et al.) is established for human dialogue. Its adaptation for AI interrogation is principled but novel. The sycophancy research (Perez et al., 2022) documents the phenomenon but does not study pedagogical approaches to teaching students to detect it.