A large proportion of the differences in national test results in English are due to inconsistency between examiners rather than pupil ability, research shows.
An analysis using this year's writing test for 11-year-olds reveals that on every element of the marking scheme, pupils' ability accounted for less than 70 per cent of the variations between their scores.
On one element, handwriting, marker inconsistency accounted for more than half of the differences in marks, with actual performance explaining only 47 per cent.
The National Foundation for Educational Research study said: "Even closed response questions that would be expected to show complete consistency can display higher than expected levels of variation between graders. In writing the analysis shows the difficulty in achieving objective measurement of ability."
The study found that some of the marking differences on the writing paper cancelled each other out. But overall, inconsistency still accounted for more than a fifth of the variation in scores.
Markers were more likely to give consistently generous or consistently harsh marks when faced with pupils with poor spelling ability. The study also showed that the inconsistency of marking increased slightly if the candidate was a girl, had used extra paper or had written in paragraphs.
The researchers write: "Different graders may prefer different styles of writing or attach greater weight to different elements of a pupil's answer."
But they believe that these differences will not affect school comparisons as they are likely to cancel each other out when looking at large groups of pupils.
The reading paper, with multiple choice and short response questions and requiring less marker judgment, stood up better to the analysis. Pupil ability explained nearly 97 per cent of the differences.
The study involved 49 test scripts, each independently evaluated by nine experienced examiners last autumn.
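To illustrate what it means for pupil ability to "explain" a share of the variation in a design like this, where every script is marked by every examiner, here is a minimal sketch in Python. It is not the study's method: the paper's title indicates cross-classified multilevel modelling, whereas this sketch swaps in a simpler two-way analysis-of-variance estimate. The function name `variance_shares` and the small table of marks are hypothetical, made up purely for illustration.

```python
import numpy as np


def variance_shares(scores):
    """Estimate variance shares from a scripts-by-markers table of marks.

    Assumption: each row is one script, each column one marker, and every
    marker has marked every script (no missing cells). This is a simple
    method-of-moments ANOVA estimate, not the multilevel model the study used.
    """
    x = np.asarray(scores, dtype=float)
    n_scripts, n_markers = x.shape
    if n_scripts < 2 or n_markers < 2:
        raise ValueError("need at least two scripts and two markers")

    grand = x.mean()
    script_means = x.mean(axis=1)  # average mark per script
    marker_means = x.mean(axis=0)  # average mark per marker

    # Mean squares for scripts, markers and the leftover disagreement.
    ms_scripts = n_markers * np.sum((script_means - grand) ** 2) / (n_scripts - 1)
    ms_markers = n_scripts * np.sum((marker_means - grand) ** 2) / (n_markers - 1)
    resid = x - script_means[:, None] - marker_means[None, :] + grand
    ms_resid = np.sum(resid ** 2) / ((n_scripts - 1) * (n_markers - 1))

    # Variance components; negative estimates are clamped to zero.
    var_pupil = max(0.0, (ms_scripts - ms_resid) / n_markers)
    var_marker = max(0.0, (ms_markers - ms_resid) / n_scripts)
    var_resid = ms_resid

    total = var_pupil + var_marker + var_resid
    if total == 0:
        raise ValueError("all marks are identical; shares are undefined")
    return {
        "pupil": float(var_pupil / total),
        "marker": float(var_marker / total),
        "residual": float(var_resid / total),
    }


# Hypothetical marks: three scripts, three markers in perfect agreement.
demo = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(variance_shares(demo))
```

In this made-up table every marker gives each script the same mark, so the whole of the variation is attributed to the pupils. In the article's terms, "marker inconsistency" would correspond roughly to the marker and residual shares taken together.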
Kathleen Tattersall, chair of the Institute of Educational Assessors, said examiners in the study had been trained two months before the research began, whereas in reality training would have come immediately before they marked tests. Their performance would also have been monitored closely, leading to greater consistency.
"Exploring the importance of graders in determining pupils' examination results using cross-classified multilevel modelling," Tom Benton. To be published soon at brs.leeds.ac.ukbeiwwwel.htm