Criterion-Referenced Rubric Generator

strong evidence · ⏱ 3 minutes · Curriculum Assessment

Generate a criterion-referenced rubric with descriptive performance levels for a task or objective. Use for marking guides and general curriculum contexts. For Manning programmes where Competent = success, use coherent-rubric-logic-builder instead.

What it does

Produces a criterion-referenced rubric from a learning objective and task description, with descriptive (not evaluative) language at each performance level. Each criterion describes what the student's work LOOKS LIKE at each level, not how "good" it is. The output includes the full rubric, a design rationale, a student-friendly version for self/peer assessment, and calibration notes for consistency across markers.

AI is particularly useful here because effective rubric design requires precise, descriptive language that distinguishes between performance levels without evaluative labels ("excellent," "good," "poor") or vague quantity indicators ("some," "many," "thorough"). Each descriptor must also be qualitatively distinct from the adjacent levels, not just a scaled version of the same description.
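As a rough illustration of the descriptive-versus-evaluative distinction, the following Python sketch flags the evaluative labels and vague quantity words quoted above in a single rubric cell. The function name find_flagged_terms and the two word lists are assumptions made for this example; they are not part of the generator, and a usable checker would need a much longer, subject-specific vocabulary.

```python
import re

# Word lists copied from the examples in the text; deliberately tiny (assumption).
EVALUATIVE_LABELS = {"excellent", "good", "poor"}
VAGUE_QUANTIFIERS = {"some", "many", "thorough"}


def find_flagged_terms(descriptor: str) -> dict[str, list[str]]:
    """Return the evaluative and vague-quantity words found in one rubric cell."""
    # Lowercase and split into alphabetic words so matching ignores case and punctuation.
    words = re.findall(r"[a-z]+", descriptor.lower())
    return {
        "evaluative": sorted({w for w in words if w in EVALUATIVE_LABELS}),
        "vague": sorted({w for w in words if w in VAGUE_QUANTIFIERS}),
    }


if __name__ == "__main__":
    print(find_flagged_terms("Good use of evidence"))
    print(find_flagged_terms("Uses specific textual evidence to support each analytical point"))
```

A simple word match like this only catches the surface problem. It cannot tell whether two adjacent level descriptors are qualitatively distinct, which still needs a human (or model) reading of the whole row.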

The evidence behind it

Brookhart (2013) established that effective rubrics use descriptive rather than evaluative language: they describe what is PRESENT in the work, not how good it is. "Uses specific textual evidence to support each analytical point" is descriptive; "Good use of evidence" is evaluative. Descriptive rubrics produce more reliable scoring and more useful feedback because they tell students exactly what to do differently, not just that they need to "do better."

Andrade (2000, 2013) demonstrated that rubrics improve both instruction and learning when shared with students before the task, so that they function as learning tools and not just grading tools. The effect is strongest when rubrics are used for self-assessment.

Jonsson & Svingby (2007) found that analytic rubrics (separate criteria scored independently) are more reliable and produce better feedback than holistic rubrics (a single overall judgment), though they take longer to use.

Sadler (1989) established that assessment quality depends on the "gap" being visible: students must be able to see the difference between where they are and where they need to be. Descriptive rubric levels make this gap concrete.

Panadero & Jonsson (2013) confirmed that rubric use improves student performance, with moderate effect sizes, particularly when combined with self-assessment.

Sources

Brookhart (2013) — How to Create and Use Rubrics for Formative Assessment and Grading
Andrade (2000, 2013) — Using rubrics to promote thinking and learning
Jonsson & Svingby (2007) — The use of scoring rubrics: reliability, validity and educational consequences
Sadler (1989) — Formative assessment and the design of instructional systems
Panadero & Jonsson (2013) — The use of scoring rubrics for formative assessment purposes revisited

How to use it in your lesson

For the best results with EvidenceLesson, give it:

learning_objective — The learning objective the rubric assesses
task_description — The specific task students will complete
student_level — Age/year group
criteria_count (optional) — Number of criteria (default: 4)
performance_levels (optional) — Number of performance levels (default: 4)
subject_area (optional) — The curriculum subject
existing_criteria (optional) — Any criteria the teacher wants included — the rubric will build around these
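The inputs listed above could be gathered into a small structure before use. The Python sketch below is illustrative only: the class name RubricRequest, the validation rules, and the example values are assumptions, not part of EvidenceLesson. Only the field names and the two defaults of 4 come from the list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RubricRequest:
    """Hypothetical container for the rubric inputs (not an EvidenceLesson API)."""

    learning_objective: str
    task_description: str
    student_level: str
    criteria_count: int = 4        # default stated in the input list
    performance_levels: int = 4    # default stated in the input list
    subject_area: Optional[str] = None
    existing_criteria: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not state valid ranges.
        for name in ("learning_objective", "task_description", "student_level"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        if self.performance_levels < 2:
            raise ValueError("a rubric needs at least two performance levels")
        if len(self.existing_criteria) > self.criteria_count:
            raise ValueError("more existing criteria than criteria_count")


# Example values are invented for illustration.
example = RubricRequest(
    learning_objective="Analyse how rhetorical devices shape a persuasive speech",
    task_description="Write a 500-word analysis of a given speech",
    student_level="Year 9",
    existing_criteria=["Rhetorical Devices"],
)
```

Keeping the optional fields as defaults means a teacher only has to supply the three required inputs, while any criteria they already have can be passed in for the rubric to build around.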

Known limitations

Rubrics describe performance but do not explain how to improve. A student who reads the rubric and sees they are at Level 2 for a criterion such as Rhetorical Devices knows WHAT to do differently (for example, integrate devices with purpose) but may not know HOW. The rubric should be paired with teaching and feedback that shows students how to move from one level to the next. Chain with Feedback Quality Analyser for targeted improvement advice.

Four levels is a practical compromise. Some tasks would benefit from more levels (to distinguish fine gradations) or fewer (to simplify assessment). Four levels balance reliability (enough levels to be informative) with usability (few enough to be practical). If the rubric is being used for high-stakes grading, additional level descriptors may be needed.

Descriptive language is harder to write but more useful than evaluative language. The rubric avoids "good," "excellent," and "poor," which makes each cell longer and more specific. This is a deliberate trade-off — evaluative rubrics are shorter but less useful for feedback. Teachers may need time to become comfortable with descriptive rubric language.