Education faces a crisis of measurement.
It is perhaps not quite a rallying cry to bring the masses on to the streets, but it does have serious consequences.
The problem is that it will be increasingly hard to use public data to measure whether the education system is improving and to make judgements about the relative performance of groups of pupils, institutions and regions.
The consequence is that it will be harder to accurately judge the effectiveness of initiatives, to target interventions or to fairly hold organisations and leaders to account. There are two major causes:
The first cause is the significant and simultaneous changes in the content, structure, standards and marking of key assessments at school entry, end of primary, GCSE and A-level. For example: new subject content, a shift to linear terminal assessment, more demanding requirements, the reduction of coursework, increases in online marking, new grade setting procedures, etc. None of these are necessarily wrong in their own right, but it becomes difficult to compare results in one year to previous or subsequent years. The sheer scale of the simultaneous changes makes it hard even to translate the numbers from one year to the next. The numbers may look similar but they measure something completely different.
This is not just a legacy issue of past reforms: every year group at a secondary school today, from 11-year-olds to 16-year-olds, faces a different set of exams. Primary pupils face new Sats in 2016, which will be immediately revised in 2017. The 2016 Sats themselves are still changing mere months before they will be taken, with new administration procedures and new exemplification materials. We don't even know how 2016's primary progress threshold will be set, even though the year has already begun. This violates one of the most basic principles of effective accountability: that you know in advance what you will be held accountable for. I cannot imagine an effective private enterprise being run this way.
Have patience though: I have heard experts confidently predicting that the system will settle down eventually. In about 2022.
The second cause is a shift towards norm-referencing at GCSE (also described as "comparable outcomes"). This means that the overall spread of results and grades is held as steady as possible year on year unless there is independent evidence of improvement in achievement. Without such evidence, and we have none at the moment, improvements in delivery are erased from the end results.
Imagine the challenge faced by school leaders who are targeted for year-on-year improvement in a system which automatically erases that improvement. The only way for one school to succeed is for another school to fail. This increases competition, which reduces the spread of good practice, and increases panic, which usually narrows the curriculum.
The Department for Education would argue that national reference tests will mitigate the effect of norm referencing. The aim of these tests is to provide the independent evidence of achievement needed to calibrate public examinations. But the tests do not yet exist, and there are significant concerns about their design. I suspect officials would also argue that norm referencing addresses a previous problem of grade inflation, which likewise obscured the measurement of genuine improvement. This is true, but it seems to me they have merely swapped one problem for another.
We should watch Ofsted's role in this very carefully. Inspection is potentially a qualitative counterweight to quantitative data. This is why Ofsted's own heavy reliance on data is such a disappointment: it undermines its unique contribution. Yet already we see a growing movement to sideline the role of inspection. The coasting school initiative is one such move, and now former advisers are flying kites about a future without inspection. It sounds good, but be careful what you wish for. The alternative to inspection-led accountability is not no accountability. It is league table-led accountability.
Is the measurement crisis really a cause for concern? Is it not a cause for celebration? A growing lack of credibility could surely end the tyranny of the league table. I fear not. We will continue to make important decisions using data, but it will be bad data, and therefore we will make bad decisions. We will invest in the wrong initiatives; we will celebrate and punish the wrong leaders; we will make unsupportable claims about the relative performance of regions; and we will target our resources inappropriately.
Russell Hobby is general secretary of the NAHT headteachers’ union. He tweets as @russellhobby