Comparisons are tedious
Nicholas Pyke reports on a claim that the Government is wasting its time comparing exam results
DETAILED comparisons of examination results over long periods are a waste of time, according to the head of research at a leading A-level board.
Such exercises say nothing about school standards because individual examiners are incapable of producing scientifically objective judgements.
"Direct comparisons of standards are impossible, both between subjects and over time," said Dr Mike Cresswell, of the Associated Examining Board.
This is in sharp conflict with the policies of Education and Employment Secretary David Blunkett, who is continuing to press for complete comparability of exam and test results.
Two years ago, the Government published a major investigation of exam standards over 25 years, which found no evidence of a rise or a fall. But Dr Cresswell believes such exercises are futile because history shows that results are inherently unstable.
Addressing an audience of leading academics at the British Academy in London (in a personal capacity), he said that examiners' marks vary wildly between years. As a result, a process of modification takes place when the final grades are produced, bringing the results into line with expectations.
"This is nobody's fault," he said. "It's how the world is."
In other words, true "criterion referencing" of the type sought by ministers is unfeasible on a mass basis. (It would mean asking exactly the same questions every year.) "The policy of 'strong' criterion referencing has failed," said Dr Cresswell. "There's evidence everywhere that human judgment can't be reduced to a system of mechanics. Yet for the past 20 years people in this country have tried to do that."
Part of the problem is the inability of examiners to predict the difficulty of questions. Studies show that they under-estimate the difficulty of hard papers - producing low marks all round. And they over-estimate the difficulty of easy papers - leading to high scores.
Such inconsistencies, however, are not reflected in the examination grades themselves, which remain close to the norms of recent years.
"What's clear is that the process of setting the standard has to be done by comparison of results from one year to the next. So the very process of awarding grades reduces to some extent the effects of average changes over time," said Dr Cresswell.
But this is not to say that exam results are unreliable on a year-to-year basis, he added. On the contrary, they are an important tool, "so long as you acknowledge that the value judgments simply express the preferences of the judges involved".
Comment, page 16