Yet, according to the boards' own statisticians, examiners' judgments are so variable and, in crucial respects, so subjective that they cannot bear the weight of all these expectations.
Dr Mike Cresswell, a director of the Assessment and Qualifications Alliance exam board and head of its research division, is well placed to say that the marking system is too crude to measure standards over time or between subjects. He believes the whole process is as fallible as the people running it.
Year after year, he says, individual markers have produced results of such random variability that boards have been obliged to cook or "moderate" the data. This inevitably means adjusting the results in line with expectations.
Exam statistics became more important with the arrival of the supposedly objective criterion-referenced marking system in 1988, replacing its predecessor, norm referencing. The old system ranked all the candidates in order and automatically failed the bottom 30 per cent - even if they produced answers of genius.
The more recent criterion-referencing system relies on a statistically weighted mark scheme which judges students on the objective quality of their papers. In theory, a 100 per cent pass rate is now possible. More importantly, the approach appears to offer a clear comparison between candidates, subjects and even generations.
But Dr Cresswell said in a lecture to fellow statisticians at the British Academy: "The policy of 'strong' criterion referencing has failed. There is evidence everywhere that human judgment cannot be reduced to a system of mechanics. Yet for the past 20 years people in this country have tried to do that."
In particular, researchers have found that markers tend to underestimate the difficulty of hard papers and overestimate the difficulty of easy ones, leading to major fluctuations. Add this to the shortage of people willing to process the scripts and the political pressure on boards, and it is no wonder teachers and students are suspicious.