Crude facts don't make the grade
It is hardly surprising that many primary teachers were more nervous than their pupils about last week's national curriculum tests.
The proportion of 11-year-olds gaining at least level 4 has become the single most important measure of a primary school's performance.
But this all-powerful thumbnail indicator is radically flawed and can lead to false analysis, inappropriate comparisons, and damaging misjudgments.
Part of the problem is that the mark range of each attainment level covers a broad band. In mathematics, level 4 is achieved with a mark of 52 per cent, and level 5 with 80. Any pupil scoring 51 per cent or lower is excluded, which means they are rated as zero.
This turns the exercise into a pass/fail test. The test score of those who pass is, in some respects, disregarded.
In certain other areas, such as performance and assessment (PANDA) reports, the proportion of children achieving level 5 is also taken into consideration. But level 5 straddles test scores from around 80 up to 100 per cent. Grouping large numbers of results together, whether as "level 5" or "level 4 and above", conceals more than it reveals.
If the thumbnail indicator of "level 4 and above" is to work, it needs to mirror pupils' actual achievement in the tests. It fails to do this because pupils scoring between 52 and 100 per cent are rated the same. In an extreme case, a school whose entire year group scored 52 per cent each would rate equally with a school whose entire year group scored 100 per cent each. Both would achieve 100 per cent "level 4 and above".
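The extreme case can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official calculation: the two year groups of 30 pupils are hypothetical, the function names are our own, and the only figure taken from the tests is the 52 per cent mark for level 4 in mathematics.

```python
# Hypothetical illustration of the "level 4 and above" indicator.
# Only LEVEL_4_MARK comes from the article; the schools are invented.
LEVEL_4_MARK = 52  # per cent needed for level 4 in mathematics


def percent_level_4_and_above(scores):
    """Percentage of pupils whose mark reaches the level 4 threshold."""
    passed = sum(1 for s in scores if s >= LEVEL_4_MARK)
    return 100 * passed / len(scores)


def mean_score(scores):
    """Average test mark of the year group, in per cent."""
    return sum(scores) / len(scores)


school_a = [52] * 30    # every pupil scrapes level 4
school_b = [100] * 30   # every pupil gets full marks

for name, scores in (("School A", school_a), ("School B", school_b)):
    print(name, percent_level_4_and_above(scores), mean_score(scores))
```

Both schools report the same thumbnail figure, while their average marks differ by 48 percentage points.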
Even outside this extreme case, there remains enormous scope for inconsistency.
If the "level 4 and above" statistic reflected a rise or decline in pupil achievement as measured by actual test marks, then it could be reliably fed into further comparative analyses. But several years' examples from our own school, with 170 Year 6 pupils, convey a different picture.
Changes in the "level 4 and above" figure are not matched by the total or average scores achieved by the year group.
A decline in actual performance can still produce a rise in the proportion achieving "level 4 and above".
Alternatively, there can be an improvement in the tests but a fall in the thumbnail.
On other occasions, the thumbnail moves up or down, unpredictably, when the actual pupil performance from one year to the next remains constant. The obvious conclusion is that data are being simplified at the expense of accuracy.
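These three patterns are easy to reproduce. The sketch below uses invented marks for three tiny year groups of four pupils each. They are not our school's results, and the names are purely illustrative. The only real figure is the 52 per cent level 4 mark in mathematics.

```python
from statistics import mean

LEVEL_4_MARK = 52  # per cent needed for level 4 in mathematics


def level_4_and_above(scores):
    """Percentage of pupils at or above the level 4 mark."""
    return 100 * sum(s >= LEVEL_4_MARK for s in scores) / len(scores)


# Hypothetical year groups (marks in per cent).
year_x = [51, 51, 95, 95]   # two pupils just miss level 4, two score highly
year_y = [52, 52, 52, 60]   # everyone just clears level 4
year_z = [73, 73, 73, 73]   # same average as year_x, no one below level 4

for label, scores in (("X", year_x), ("Y", year_y), ("Z", year_z)):
    print(label, "mean:", mean(scores), "level 4+:", level_4_and_above(scores))
```

Going from X to Y, the average mark falls while the thumbnail rises. Going from Y to X, the reverse happens. Between X and Z, the average is unchanged but the thumbnail doubles.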
When schools are compared in this way, major differences in overall pupil attainment may be camouflaged or falsified by this defective statistic.
The consequences are serious for individual teachers as well as schools. This thumbnail percentage has a bearing not only on league-table rankings but also on inspection reports by the Office for Standards in Education, local education authority audits, PANDA reports and judgments about teachers' performance, soon to be related to pay.
The criticism that the "level 4 and above" statistic did not reflect level 5 attainment has been belatedly addressed by the introduction of a points system. This confers six points on each level, then awards the median of the six points to each pupil attaining that level.
The scale, at key stage 2, stops at 36, so a pupil at level 5 would be given 33 points, this being the median of 30-36. Level 4 would receive 27 points, and level 3 would receive 21.
Using this assessment, a pupil scoring 100 per cent on the maths test would be credited with 33 points, and a pupil scoring 52 per cent would earn 27 points. The same telescoping of data that we see in the league tables is at work here, corrupting the figures and provoking another round of unreliable judgments.
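The telescoping can be seen by converting a handful of maths marks to points. This is a minimal sketch under stated assumptions: the thresholds (52 and 80 per cent) and the points for levels 4 and 5 (27 and 33) are the figures quoted above, while the function name and the sample marks are our own. Marks below 52 per cent are left unconverted because the lower level boundaries are not given here.

```python
# Figures quoted in the article for the mathematics test.
LEVEL_MARKS = {5: 80, 4: 52}      # minimum mark (per cent) for each level
LEVEL_POINTS = {5: 33, 4: 27}     # points awarded to every pupil at that level


def points_for_mark(mark):
    """Return the points for a maths mark, or None below the level 4 mark.

    Illustrative helper only; lower levels are omitted because their
    mark boundaries are not stated in the article.
    """
    for level in sorted(LEVEL_MARKS, reverse=True):
        if mark >= LEVEL_MARKS[level]:
            return LEVEL_POINTS[level]
    return None


for mark in (52, 79, 80, 100):    # hypothetical pupils
    print(mark, "per cent ->", points_for_mark(mark), "points")
```

A pupil on 52 per cent and a pupil on 79 per cent receive identical points, as do pupils on 80 and 100 per cent. One extra mark at the boundary is worth six points, while 27 marks within level 4 are worth none.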
Given the prominence of the points system in benchmarking as the basis for assessing the value that schools add to their pupils' learning, a radical rethink is urgently needed.
Our experience with these statistics cannot be unique. Schools and teachers are being presented with cases to answer based on figures that conceal or distort their true achievements.
Given the increasingly sophisticated analytical climate in education, the reporting of results which ignores pupils' actual test scores can no longer be justified.
Summaries of actual scores, whether totals or averages, would give a genuine picture of schools' achievements and considerably illuminate the performance analysis process, from classroom right up to government level.
While league tables, PANDA reports and the like have their place as tools of analysis, we believe that the approach we are proposing, which we already employ in our own school, would return diagnostic power to the workplace, making usable knowledge available to those responsible for these all-important results.
It should be introduced without further delay.
Ivan Ruff is the maths co-ordinator at Broadstone Middle School, Poole, Dorset. Stephen Wathen is the school's deputy head.