A study carried out for the Association of Teachers and Lecturers reveals widespread marking mistakes as well as poorly-worded questions in the English, maths and science papers taken by 580,000 children last summer.
Its findings suggest that as many as one in four of those who took the English tests and one in 10 sitting the science test could have been given the wrong, usually lower, national curriculum level of achievement.
Marking mistakes were identified in every English paper examined for the ATL study. Similar defects emerged in 84 per cent of science scripts, and 54 per cent of maths papers.
The Government sought to counter the potentially damaging claims by highlighting a Bath University study, published earlier this year, which found general satisfaction with the tests.
The Bath study, conducted for the School Curriculum and Assessment Authority (SCAA), was based, among other things, on a scrutiny of 10,000 scripts. The ATL research, carried out by researchers from King's College, London University, covered 338 test scripts, which were re-marked using the official SCAA marking scheme.
Although the research used a small sample, the ATL claimed it was based on a far more detailed examination of scripts than had been carried out by SCAA when it evaluated the same 1996 key stage tests.
Ministers said there was general satisfaction among parents and teachers, and the Department for Education and Employment claimed approval for the tests from its favourite Mr Fixit: "Sir Ron Dearing has confirmed that the tests are fully bedded in and appropriate for performance tables."
SCAA said that two-thirds of teachers thought the English mark schemes would promote accuracy and consistency while over three-quarters believed the same of the science and maths schemes.
That was not a view shared by the ATL whose study revealed widespread marking mistakes, as well as poorly-worded questions.
Ministers have spent more than £100 million developing the tests, which they now propose to use for benchmarking and setting national targets.
Cheryl Gillan, junior education and employment minister, said: "It is depressing to see teachers' unions, who have been fully involved at every stage in extensive consultations on the tables, expending energy in rehearsing their excuses."
The DFEE claimed that schools asked for fewer than 1 per cent of the papers in each of the three subjects to be re-marked. Of those requests, half were successful.
Again, it was a different story in the ATL study. Of the scripts originally scoring level 3, 89 per cent were altered by the re-mark but only 6 per cent fell below level 3, and 11 per cent would have moved up to level 4.
All of the scripts originally scoring level 4 were altered by re-marking. None was reduced to level 3, but 44 per cent were raised to level 5.
All of the level 5 awards were altered by re-marking, but only one fell below the recommended cut-off point.
In the maths tests, more than half of the pupils had at least one question mismarked, while in science 84 per cent of scripts needed re-marking for one or more questions.
In the science tests, only four pupils had their levels changed by the re-marking.
More than seven out of 10 teachers in the ATL study said the tests had led either to a deterioration or to no improvement in English or maths, and 65 per cent claimed they had not achieved better results in maths.
Nick Tate, chief executive of SCAA, said: "It is often easier to criticise than to be constructive and the ATL seems again to have fallen into this trap in its response to the 1996 key stage 2 tests.
"It is to be regretted that (the ATL) should be so apparently opposed to progress."
The Validity of the 1996 Key Stage 2 tests in English, Mathematics and Science, £5, from the ATL, 7 Northumberland Street, London WC2N 5DA.
The ATL report criticised the marking of the following question from last summer's science paper: "Why is it safer to stir the very hot soup with a wooden spoon?" One pupil was credited for the answer "it is good because then you don't burn yourself, with the metal spoon you burn yourself", whereas another who said "because the metal spoon would burn" was not. The mark scheme looked for attributes of wood and/or metal in terms of their conduction of heat. It also credited contrasting attributes: metal gets hotter than wood; metal conducts heat but wood is an insulator. There was no mention in the scheme of "burning", yet the question was set in the context of safety. Eighteen per cent included a reference to it, and 16 per cent gave it as their only answer. Some were given a mark, others were not.
The report also said that the following question from the maths paper was badly worded: "John has £2. He goes on one ride and has exactly 80p left. Which ride does he go on?" (The choice was the Galaxy, which cost £1.50, the Laser, 90p, the Big Wheel, £1.20, or the Spaceship, 75p.)
The use of the present rather than the past tense encouraged some pupils to think the question was asking for the next ride John would go on. They selected the Spaceship. If the past tense had been used, the question would not have been open to interpretation.