Seven hours into a marking marathon and here, finally, is the last essay. But your eyes are glazing over, the post-caffeine slump has set in and the student in question is intensely irritating - can you really guarantee him a fair deal?
Educators everywhere regularly face scenarios like this, but a new assessment system has revolutionary promise: by applying principles from opticians' tests and chess tournaments, it could virtually eliminate inconsistency.
A trial of "adaptive comparative judgement" (ACJ) in five Scottish further education colleges has resulted in "exceptionally high" reliability, with marking "significantly" more consistent than under traditional methods.
The basic concept is simple: rather than teachers marking individually with a prescriptive checklist, they take two anonymised essays and decide intuitively which one is better. This process is repeated many times by a team of markers. The concept is similar to an eye test, where pairs of lenses are compared until the optimum one is found.
ACJ employs the "Swiss tournament" method from chess: rather than a knockout system, with the inferior essay dropping out instantly, all essays continue to be assessed after the first round. But they are paired with essays of similar quality in a fine-tuning process that ensures reliable rankings.
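The pairing idea can be sketched in a few lines of Python. This is only an illustration, not Tag Assessment's algorithm: the names `swiss_rounds` and `judge` are hypothetical, and ranking by a simple win count is an assumed simplification (a full system would more likely fit a statistical model to the judgements).

```python
from typing import Callable, Dict, List, Tuple


def swiss_rounds(
    essays: List[str],
    judge: Callable[[str, str], str],
    rounds: int,
) -> List[Tuple[str, int]]:
    """Rank essays by repeated pairwise judgements (hypothetical sketch).

    `judge(a, b)` stands in for a human marker and must return whichever
    of the two essays it considers better. No essay is ever eliminated.
    """
    wins: Dict[str, int] = {essay: 0 for essay in essays}
    for _ in range(rounds):
        # Order by wins so far, so neighbours in the list are of similar standing.
        order = sorted(essays, key=lambda e: -wins[e])
        # Pair neighbours; with an odd number of essays the last one sits out.
        for i in range(0, len(order) - 1, 2):
            a, b = order[i], order[i + 1]
            wins[judge(a, b)] += 1
    # Final ranking: most wins first.
    return sorted(wins.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    # Assumption for the demo only: each essay has a hidden "true" quality
    # and the judge always prefers the better one.
    hidden_quality = {"A": 3, "B": 9, "C": 5, "D": 1, "E": 7, "F": 2, "G": 8, "H": 4}
    ranking = swiss_rounds(
        list(hidden_quality),
        judge=lambda a, b: a if hidden_quality[a] > hidden_quality[b] else b,
        rounds=5,
    )
    print(ranking)
```

After the first round, each essay meets an opponent with a similar record, which is the "fine-tuning" step: the comparisons become progressively closer calls, and those are the ones that separate essays of similar quality.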
Research has consistently shown that traditional marking is hugely susceptible to human frailties. "The problem is that, typically, you mark in silos - you can't help but be subjective," said Matt Wingfield, managing director of London-based Tag Assessment, which developed the algorithm that allows the new technique to be applied on a large scale. "With ACJ, it doesn't take any longer to mark assessments, but you end up with significantly higher consistency and reliability."
Reliability is measured on a scale from 0 to 1, with 1 signifying a perfect system. Scores for traditional marking are typically between 0.5 and 0.6 but ACJ reaches 0.93 to 0.98, said Mr Wingfield, who chairs the e-Assessment Association.
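The article does not say which statistic lies behind these figures. One measure commonly used in comparative-judgement work is the scale separation reliability: the share of the spread in estimated essay quality that is not attributable to measurement error. The sketch below assumes that measure; the function name and the sample numbers are illustrative only.

```python
from statistics import pvariance
from typing import Sequence


def scale_separation_reliability(
    estimates: Sequence[float],
    standard_errors: Sequence[float],
) -> float:
    """Return (observed variance - mean squared error) / observed variance.

    `estimates` are the quality scores fitted to each essay and
    `standard_errors` the uncertainty attached to each score.
    """
    if len(estimates) != len(standard_errors):
        raise ValueError("need one standard error per estimate")
    observed_var = pvariance(estimates)
    if observed_var == 0:
        raise ValueError("all estimates are identical; reliability is undefined")
    mean_squared_error = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return (observed_var - mean_squared_error) / observed_var


if __name__ == "__main__":
    # Made-up numbers: five essays spread evenly, each with the same uncertainty.
    scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
    errors = [0.4] * 5
    print(round(scale_separation_reliability(scores, errors), 2))
```

On this definition a value near 1 means the essays are spread far apart relative to the uncertainty in each score, so a second team of markers would be expected to reproduce much the same ranking.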
The trial, evaluated by Alastair Pollitt of Cambridge Exam Research, looked at 72 handwritten essays and shorter answers from the former Angus, Aberdeen, Dundee, Ayr, and Banff and Buchan colleges. It produced a consistency rating of 0.96.
Australia, Sweden and Singapore have taken ACJ furthest, with the Swedish education agency running trials in science, design and technology, English and Swedish. The University of Edinburgh and the University of Limerick in Ireland have also adopted ACJ, with the latter using it for student teachers' work.
There was still widespread reluctance to employ radically different forms of marking, Mr Wingfield said, despite general acceptance that existing systems were flawed. And some organisations feared that ACJ would be misrepresented by the media as an educational version of "Hot or Not", the photo-rating website that inspired Facemash, the precursor to Facebook that asked users to decide which of two students was more attractive.
Last month Mr Wingfield spoke at an event in Stirling held by the Emporium of Dangerous Ideas, which proposes radical ways to improve education. "What ACJ does is employ the teacher to use their professional judgement to make decisions, and that shouldn't be a dangerous concept - but you get the impression from society that it is," he said.
The report on the trial, written for the College Development Network, finds that ACJ could be an ideal addition to the wider range of assessment approaches encouraged by Curriculum for Excellence: for example, it would allow reliable comparisons of essays, videos, performances and posters. It recommends a bigger trial involving colleges and schools, and the College Development Network is appealing for volunteers.
The Scottish Qualifications Authority ran a small ACJ pilot in 2011, but the results have not been published. Martyn Ware, head of assessment, development and delivery at the body, said: "The results of the trial were encouraging and we are now reviewing a number of possible applications of the technique to support the assessment of our qualifications."
Read the report at bit.ly/ACJreport.