But way back, when the national curriculum had ceased to be a glint in the eye of the Tory government and had begun to take on an embryonic form, tests were meant to have "fitness for purpose". It was a cumbersome, ill-defined phrase. But, at the time, those involved in education saw it as a hopeful one because it appeared to reflect the practice of the day. This was because they took it to mean that all tests should fit the purpose of the subject being assessed.
Such an interpretation enabled different subjects to be assessed differently. Art exams, for example, took place over a number of days and allowed prior preparation and a portfolio of work. English had a small timed element but was essentially coursework-based. On the other hand, while maths could be assessed through coursework, most schools entered their pupils for terminal exams, and, in the main, science teachers opted for timed end-of-module tests.
The early pilots of the national tests echoed these differences. The exams for seven-year-olds were quite unlike those for 14-year-olds, and the trial maths tests at key stage 3 were very different from the English ones. Most maths teachers had always embraced, in part, the principle of pencil-and-paper tests, and the pilot reflected this. The English Sats, however, were more task-based, designed to take place over a number of days and to imitate, as much as possible, the natural course of a lesson. This difference arose from the English teachers' belief that their subject could not validly be assessed in timed exams.
But all that changed in the summer of 1992 when the pilot tests, and the first run of tasks at key stage 1, were deemed to be "over-elaborate nonsense". John Major aborted the trials as well as most of the variety of GCSE assessment. The understanding of "fitness for purpose" which emphasised the subject was replaced by a reading which emphasised the purpose of the test itself. And the purpose of the tests had become to act as an accountability measure by which school performance could be aggregated.
In such a system what matters is not how valid the exam is, nor how much it reflects the subject or child it is meant to be assessing, but how reliable the test is and how easy it is to standardise. The problem is, of course, that subject differences, and the needs of the pupils, do not go away. A recent survey undertaken by the National Union of Teachers showed that while secondary maths teachers were, in the main, content with the key stage 3 tests, English teachers were almost universally hostile to them. A similar antagonism was found amongst those who teach seven-year-olds.
There seems to be a general acceptance, for the first time in over a decade, that very young children should be assessed differently from older pupils, and in every UK country except England this has meant abandoning the tests. GCSEs and A-levels do still differ from each other.
But there is far more uniformity now than there used to be and this should change if we are genuinely interested in encouraging achievement in different subjects. As for the tests at 11 and 14, were we to reclaim the original notion of fitness for purpose, it might help politicians to ask what possible educational point they serve in their current form.