Are teacher observations worth the bother?

Lesson observations have come under heavy fire in recent years as an inaccurate way of measuring teacher performance, yet they persist in most schools in various guises. Grainne Hallahan asks whether the time has come to finally rule them out completely
5th February 2021, 12:00am
Grainne Hallahan


Here is a question that sounds easier to answer than it actually is: how do you know if someone is a good teacher?

Schools employ a number of tools they believe can measure teacher effectiveness. Leaders configure data and analyse patterns, compare lesson starts and finishes; some schools will also use student voice surveys to decide how good a teacher is. But by far the commonest way to assess teaching is through lesson observations.

The accuracy of most of these measures is patchy and subject to bias, but lesson observations are a particularly complex problem: firstly, because so many schools still put so much faith in them; and secondly, because the evidence for their usefulness is highly questionable. So, what do school leaders and teachers need to know about this staple of education management?

Dylan Wiliam, emeritus professor of educational assessment at UCL Institute of Education, says that a big problem with current lesson observation practices is that they are too high stakes to ever be a true reflection of teaching.

"Teachers will pretend everything is fine in a formal observation," says Wiliam. "They play safe, they put on a safe lesson."

Part of the problem is that schools have been able to link individual teachers' pay to performance since 2013. It is now common practice for senior leaders to conduct biannual formal lesson observations as part of the appraisals process.

This means that the pressure is really on for staff who are being observed, as their salary could depend on the outcome - which makes it unlikely that what they show to the observer will be a true indication of their teaching.

However, this is just the beginning of a long list of issues with observations. There are many variables that can affect a lesson. What if it starts to snow, for instance? Or the fire bell goes off? Or six of your class have just had an argument in the corridor? All these things can have an effect on student behaviour and on how successful the lesson is, suggests Heather Hill, of the Harvard Graduate School of Education.

"The issue is that observations are subject to lots of different factors that we call 'noise'," Hill explains. "This includes the fact that some things are easier to teach than others. For example, in mathematics, it's easier to teach basic addition than it is to teach fractions."

Her research into rater reliability and observation systems found that observers needed to effectively balance this "noise" against the rest of the information they were receiving from the observation. This is tricky to do, she says, when you only have one or two observations to inform your judgement - unless the activity you're judging is something very straightforward to observe, such as whether or not students are engaged.

"The issue is how much 'noise' versus how much 'signal' you're getting out of an observation instrument, and with one observation, you're just not getting as much signal as you need," says Hill.

So, what is the magic number of observations to aim for? "You need four or six observations to get a 'stable score' for a teacher who is delivering an activity that is more nuanced or doesn't happen every lesson, [such as teaching fractions]," says Hill.

If you multiply that out across a year, you would need to conduct dozens of observations per teacher to minimise the impact of all the distracting noise and get an accurate overview of what their teaching is actually like. Scale that up for every teacher in a school and you're easily looking at triple figures.

Walk the chalk

Based on Hill's research, using observations to monitor teaching quality across a school requires a serious investment of time. And with leaders already busy, many might question if it is realistic for them to conduct the number of observations needed to obtain accurate data.

The same problem applies when you ask teachers to observe their colleagues. An evaluation by the Education Endowment Foundation that aimed to explore the impact of structured teacher observation on learning found that teachers had difficulty making time to observe.

The intervention trained teachers in a structured observation technique in which they used software to rate their colleagues across a range of activities, such as managing behaviour and communicating with students. Teachers had to conduct a minimum of either three or four observations per year.

However, even fitting in this small number of observations proved too difficult, with teachers citing challenges around timetabling and arranging cover.

Conducting shorter, more informal observations - perhaps in the form of "learning walks" - might seem to be a solution here. But while learning walks can provide a snapshot of teaching, they are more useful for observing something like general levels of student engagement, rather than assessing the quality of - to reuse Hill's example - how well somebody can teach fractions.

In a blog post entitled "Most observations don't help teachers improve: why?", Harry Fletcher-Wood, associate dean at Ambition Institute, explains that "solutions to problems of frequency are often superficial or unhelpful. Learning walks tend to imply a check for compliance with school policy/basic order in the classroom, rather than identifying exactly who's learning what in a particular lesson".

So, while shorter observations do have their uses, if what you are trying to do is assess the true quality of somebody's teaching, simply popping your head around the door is not likely to help much. For that, you would need to spend longer observing, and conduct those observations more than once or twice a year.

But even if you did manage to conduct enough in-depth observations, there is another problem to contend with: do all teachers and leaders even know how to judge whether the teaching they are witnessing is effective?

In "Do We Know a Successful Teacher When We See One?: Experiments in the identification of effective teachers", a paper published in 2011, researchers Michael Strong, John Gargani and Özge Hacıfazlıoğlu called into question how far we can trust the judgements of people conducting teaching observations.

The team found that even experienced teachers were unable to tell an effective teacher from a less effective one based on observation alone.

In one experiment, 165 administrators were shown eight video clips of different teachers delivering a lesson; four of the teachers achieved above-average results and four below-average results. The observers were not able to tell the difference. In fact, tossing a coin would have been a more accurate way of identifying the effective teachers: on average, the observers classified just 3.85 of the eight correctly, fewer than the four that random guessing would be expected to get right.

This lack of accuracy is perhaps not surprising, as what "good" teaching looks like is ultimately quite subjective.

"I don't think there's even a consensus on what teaching is, let alone good teaching," argues Stephen Lane, head of year and teacher of English at Lichfield Cathedral School in Staffordshire.

However, Joshua Goodrich, national lead for CPD at the Oasis multi-academy trust and the assistant principal for teaching and learning at Southbank Oasis Academy in London, disagrees. He believes that there are elements of good practice that all effective teachers will display.

"Do I think that there is basically a best way to teach? Broadly speaking, yes, because the research points to more and less effective practices. So, should all teachers be using evidence-based practices in their classrooms? One hundred per cent yes," he says.

Spaced retrieval practice and explicit modelling would be just two examples of the evidence-based approaches that Goodrich might look for. However, he also acknowledges that teachers will not all teach in exactly the same way and should not be forced to. In his view, more experienced practitioners in particular will need to be given a degree of autonomy in how they teach, as this is a "powerful motivator" for them to improve.

While there may be some concrete strategies observers can look out for, then, "good" teaching can ultimately appear quite different from classroom to classroom.

This issue is compounded by the fact that observers will also need a certain level of subject knowledge to be able to appreciate whether a topic is being taught well. If the observer does not understand the principles of, say, literary theory, how far will they really be able to judge whether a teacher is communicating those principles effectively? The simple answer is: probably not very far.

The evidence against lesson observations seems to be stacking up, and this is just one reason why some schools are choosing to drop the practice altogether.

Palace Wood Primary School, in Kent, is one such school. According to headteacher Mark Chatley, he put a stop to observations not only because of their unreliability but also because of the impact they were having on staff wellbeing. In place of observations, the leadership team has now put together a mentoring programme.

"We had been thinking about the value we were getting from formal observations. They were feeling very judgemental and weren't having a positive effect on the teachers," explains Chatley. "Despite our best efforts, the teachers would focus on any development point and it became a negative experience."

Getting rid of the observations was Chatley's solution. However, other schools have found ways to adapt the practice to make it work better for everyone.

Joanne Trewin is acting associate headteacher at Ealing Fields High School in London. In the past few years, she has abolished formal lesson observations conducted by the senior leadership team. However, observation is still a central part of the school's CPD programme.

Inspired by Paul Bambrick-Santoyo's Leverage Leadership model from US non-profit organisation Uncommon Schools, Trewin has made it a priority that teachers conduct regular, short, low-stakes peer-to-peer observations, where feedback focuses on small changes that are easy to implement.

"We do weekly 15-minute observations, and 15-minute feedback at a fixed time, but the observation can happen any time in the week," Trewin explains. "It's about identifying one small thing for a manageable action."

Although the observations at Trewin's school are short, the key thing to note is that they are being used as a tool for improvement, rather than to make a summative judgement of somebody's teaching.

The process is also flexible. Observations can take place whenever teachers want them to; cover is arranged in advance for observers to make sure that everyone can take part.

Goodrich has taken a similar approach at his schools, which is based on the "instructional coaching" model. This involves a trained expert working with individual teachers to help them develop their practice and to provide them with feedback on their performance.

"Instructional coaching is about these granular micro-changes and the idea behind it is that although the change might seem very small - it has to be really a small enough change that a teacher can embed it into their practice between one lesson and another, or one day and another - and it doesn't seem like it would make a difference, 30 of those changes over the course of a year can have a transformative impact," he says.

Using observations to develop teachers in this way isn't exactly a revolutionary idea. But Trewin says that, in a system where staff have long been used to observations that are designed to hold them to account, it has still been hard for some people to shake off that fear of being observed.

"At first, there might be some resistance if the culture of lesson observations has been that of fear and criticism," says Trewin. "But you have to reassure staff that isn't going to be the case - and prove it."

Marginal gains

So, how do you do that? For starters, if your aim is to create a system where observations have the primary aim of teacher development, you need to be careful about placing too much focus on identifying teachers' weaknesses - even though this might seem counterintuitive, says Wiliam.

"For novice teachers, [using observations to address] weakness is likely to be effective. But for experienced teachers, having them become outstanding at what they're already good at may benefit students more than worrying about their idiosyncrasies," he explains.

Part of this is making sure the feedback that observers give to teachers is as useful as it can be. This is something that schools often get wrong, Fletcher-Wood suggests. Either too much or too little feedback is often given, and while the former can be overwhelming, the latter can leave a teacher confused.

"Teachers [may] walk away with a target, but may not be clear exactly what that looks like, or be unsure how to apply it in their next lesson," he writes.

So, what does good observation feedback look like? According to Hill, there are four key ingredients. First, the feedback needs to be "timely".

"It's no good waiting weeks because the moment has passed," Hill says. "You've also got to give the teacher ownership over what is being fed back on." This second ingredient means asking the person being observed, ahead of time, what they want you, as the observer, to focus on, and then making that element of their teaching the focus of your feedback.

The third element is that the feedback has to make sense, or it will simply be disregarded, she says.

"And finally, you have to give concrete strategies to improve. You can't be vague. You must let them know what practical steps they need to take."

Trewin agrees with Hill about all these ingredients, but she reiterates the point about feedback also focusing on marginal gains. Small, specific targets are more manageable, and so prevent teachers from feeling overwhelmed by the observation process, making them more willing to act on feedback.

"We want our teachers to think about being just 1 per cent better each week. This isn't tracked, it isn't about [accountability]. This is about genuine improvement," Trewin says.

Ultimately, the way in which the observation is conducted is perhaps less important than what we see as the purpose behind observing teachers.

"As I see it, there are different motivations [behind observations], which are potentially in tension with each other," says Lane. "I think using them for quality assurance results in conflict with the goal of development."

It sounds like there is a simple fix here: stop linking observations to performance management and pay, and they can become more useful to teachers and schools alike.

But what about the need for leaders to judge the quality of the teaching in their school? Is this something they simply don't need to worry about until Ofsted arrives?

According to Wiliam, it is possible to use observations to take the temperature of teaching quality across the school without making them so high stakes that it affects the reliability of the exercise - and that is where practices such as learning walks come in.

"Headteachers will say they do observations because they want to monitor the quality of teaching and learning in their school," he says. "Right, true. But why do you record the teacher's name on the record sheet? You could get that information by conducting 100 10-minute observations across the course of a week. If your primary aim is monitoring, you don't need to have the teacher's name."

It seems that many of the biggest issues with lesson observations stem from the judgements attached to them. When an observation is tied to a performance management review, of course teachers are going to pull out all the stops and deliver a "whizzy" lesson. No one is going to put themselves on the line and use that lesson observation to try to develop their teaching practice. Problems are hidden in this type of observation, not highlighted.

But once those judgements are removed, observations become more accurate and more warmly received. So, let's leave the judging to Ofsted and focus instead on just getting better as teachers.

Grainne Hallahan is senior content writer at Tes

This article originally appeared in the 5 February 2021 issue under the headline "Watch and learn (but don't judge)"
