Most were assigned a higher level by their teacher than by the test, which gave only 44 per cent a level five or above, while teacher assessment figures gave results of around 70 per cent.
Consistency was also a problem with the key stage 3 test: only 63 per cent of children who took it twice achieved the same mark on both occasions.
The findings are contained in a report written for the Qualifications and Curriculum Authority after the third pilot of the test last September.
The test was five years in development and was due to become statutory in 2008.
The findings offer the most definitive account of official views on the on-screen exam, which ministers have now replaced with a voluntary system.
The evaluation's findings have been kept secret until now, with teachers complaining that they have never been given a proper explanation for the change of policy.
The report, by Andrew Boyle of the QCA's assessment research team, said that the exam had not been proved to be a valid assessment, and that teachers and pupils had yet to be convinced that the test yielded meaningful and accurate information.
The evaluators compared the 2006 test results of 18,004 pupils with their teachers' views of their levels, finding that the two agreed in only 7,101 cases. In 1,049 cases, the teacher's assessment and the test result differed by two levels.
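As a rough check on those figures, the short Python sketch below works out the proportions they imply: agreement in roughly 39 per cent of cases and a two-level gap in just under 6 per cent. The variable names are illustrative only, and the remaining pupils are grouped together because the figures quoted here do not break them down further.

```python
# Figures quoted from the report's comparison of 2006 test results
# with teacher assessments.
pupils_compared = 18_004
same_level = 7_101
two_levels_apart = 1_049

# Pupils covered by neither quoted figure; no further breakdown
# is given, so they are left as a single group here.
other_disagreement = pupils_compared - same_level - two_levels_apart

agreement_rate = same_level / pupils_compared
two_level_rate = two_levels_apart / pupils_compared

print(f"Same level: {agreement_rate:.1%}")
print(f"Two levels apart: {two_level_rate:.1%}")
print(f"Remaining pupils: {other_disagreement}")
```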
There were also technical difficulties around level-setting and concerns about "unfavourable public comment" if the test were to be made statutory and then failed to work well.
In January, ministers announced they were replacing the test with a voluntary series of tasks that teachers could use to inform their judgments on pupils' progress.