Moderation in everything: how Ofsted could reassure the sector

Whatever shape Ofsted’s report cards end up taking, we need to be confident that the judgements are reliable, says CST’s deputy chief executive Steve Rollett
7th February 2025, 4:30pm


https://www.tes.com/magazine/analysis/general/moderation-in-everything-how-ofsted-could-reassure-the-sector

After weeks of speculation, it is welcome that the consultation about Ofsted’s next framework has now landed. The Big Listen revealed a clear impetus for change from a range of stakeholders. The challenge for Ofsted is landing the right set of changes.

At the Confederation of School Trusts (CST), we’re in listening mode, too. There’s a lot of change being proposed from both Ofsted and the Department for Education with their plans for a new accountability regime, and collectively we need to understand and reflect on the details, as well as the aggregate system it adds up to.

In the meantime, however, there is something more Ofsted could do to build confidence in its plans.

The peril of unfair outcomes

Having spent a long time around inspection policy, and talked to hundreds of leaders about inspection, I’m struck that the concerns people express are often not about a negative grade per se, but about receiving an unfair negative grade (unsurprisingly, leaders tend to complain less about unfair positive grades).

This is the space where concerns about consistency tend to play out. Some of this is about the expertise and training of inspectors, and the new Ofsted Academy may be helpful in addressing that.

But it’s also about the actual inferences that inspections consist of. Some areas are, to put it simply, more subjective than others. These areas require inspectors to exercise a high degree of inference, which increases the chances that two people looking at the same thing reach different views.

We have to be really careful about how we deal with this. If we drive these higher inference areas out of inspection, we may end up with a system that is more consistent but lacks validity because the inspection isn’t regarded as sufficiently representing the full breadth of what makes a school effective.


This is why data-only approaches are unlikely to be satisfactory - there’s much more about school quality than achievement data alone can tell us. An inspection that doesn’t capture broadly enough what our schools are like may feel invalid, and unfair as a result.

On the other hand, running a high inference model of inspection means we may think we’ve improved validity, but it can come at the cost of consistency and reliability: school A receives a different grade from school B despite the same underlying practice. This, too, can feel unfair and can undermine validity if stakeholders don’t trust the grades issued.

So, what can be done? Well, for one thing, we must recognise that inspection will always be located somewhere in this tension. It’s an assessment, and we know from other types of assessment, like SATs, GCSEs, and so on, that trade-offs and careful calibration are always required. But it is unlikely ever to be perfect.

Comparing outcomes

This is not to say we should shrug our shoulders about Ofsted’s new framework. Rather, Ofsted might seek to deploy some of the tools that we know from assessment can improve reliability and validity.

For example, during its pilots Ofsted could adopt an independent triangulation methodology: it could send two separate teams of inspectors to the same school, ensuring they remain isolated from one another. This could be on the same day, or at a different point in time. The two teams could then compare findings and identify the commonality, helping us to be more confident about the conclusions reached.

This isn’t unlike how we might ask two separate markers to assess a piece of work to help us arrive more confidently at a final mark. Doing so during the pilots would give Ofsted - and everyone else with a stake in the outcome - a better sense of how reliably inspection teams can generate findings across each of the evaluation areas proposed.

My guess is that doing this across a five-point grading scale would be easier in some evaluation areas than others.

Creating confidence

Doing this piece of analysis would be a way of either reassuring those who are sceptical about Ofsted’s plans or confirming their worries and encouraging Ofsted to iterate on them. In either case, it would move the debate forward.

Ofsted has done something similar previously, when it looked at the reliability of judgements in short inspections. But, to my knowledge, this check on reliability hasn’t been done for full inspections where judgements are more complex and the consequences more important. And it certainly won’t have been done for the proposed new framework.

Analysing the proposals with this sort of method could help to build confidence around whatever the final version of the framework looks like. It would be a different approach, for sure, but isn’t that the point of all this?

Steve Rollett is deputy CEO of the Confederation of School Trusts
