Methods to set GCSE boundaries will devalue Ofsted judgements

9th January 2015, 12:02am

Official measures to prevent grade inflation will confuse parents and employers and make a mockery of Ofsted’s method of judging schools, critics have warned.

Senior exam board officials have joined the Association of School and College Leaders (ASCL) in calling for greater public debate about Ofqual’s proposal to introduce stricter limits on rising GCSE grades.

The ASCL has expressed concerns about the “comparable outcomes” approach, which Ofqual already uses to prevent grade inflation by pegging GCSE and A-level results to the performance of students in previous years.

The watchdog is now considering a shift towards even greater reliance on historical statistics, rather than judgements of pupil work, when setting grade boundaries for the reformed GCSEs, which are due to be taught from next year.

But Cherry Ridgway, ASCL curriculum and assessment specialist, has called for a system that “reflects students’ real achievement - not that of students in the past”. The current use of comparable outcomes had already unfairly penalised schools, she said, because Ofsted had not taken account of the clampdown on rising grades.

Last month, Ofsted chief inspector Sir Michael Wilshaw used GCSE results to argue that improvement in secondary schools had “stalled”. But Ms Ridgway told a Westminster Education Forum seminar: “If it is measured in terms of outcomes, it has to stall. Comparable outcomes fixes grades where they have been in the past.

“We believe that Ofsted is failing to recognise that overall attainment by 16-year-olds is effectively capped by the current and proposed GCSE awarding process. As student attainment is the critical factor in Ofsted judgements, it is no surprise that the proportion of schools graded good or better is relatively unchanged.”

The comparable outcomes approach was introduced more than a decade ago in England. Since then, grade boundaries have been set using a combination of statistics from previous years and examiners’ judgements of student work.

But Ofqual has already confirmed that grade descriptors - criteria that should be fulfilled for a student to reach a particular grade - will not be used at all in 2017, the first year that the new GCSEs are awarded. Their use beyond that date is also in doubt: an Ofqual consultation published last year says, “We will consider in due course whether in future such [grade] descriptions could have any role in awarding.”

Sylvia Green, director of research at Cambridge Assessment, the parent company of the OCR exam board, believes that a greater emphasis on statistics would be a mistake.

“The day that we take the judgement out, or give it a lesser role, and [allow] the statistics to determine the standards too much then we can very easily lose sight of what that standard really is, in terms of the performance of students,” she told the Westminster seminar. “I would ask [Ofqual] to think very carefully about what that balance should be.”

Ms Ridgway said that because GCSE grades would be set in line with a cohort’s primary test results five years previously, rather than their performance on the day, a certain grade could mean something different every year.

“Grading decisions will not, in any way, be based on criteria required to achieve a grade,” she said. “Therefore, how can that grade be used safely to differentiate between achievement?

“For a further or higher education institution or employer receiving those grades, what does it mean to them? It doesn’t mean that child has achieved a standard.”

She added: “It is also going to create massive difficulties in teaching and learning in the classroom - and for students, teachers and parents to understand what they need to do to improve.”

Ms Ridgway claimed that the reality of comparable outcomes was that GCSE results “cannot improve nationally”.

Ofqual’s answer is that the national reference test it is developing will act as an independent guide to pupil performance and will reveal when GCSE grades should be allowed to increase. However, one exam board official has told TES: “The notion that we can produce a reference test that is infallible is a bit of a fairy story, really.”

Tim Oates, Cambridge Assessment’s director of assessment research and development, said that comparable outcomes posed “really demanding problems” for Ofsted and “the chief inspector’s judgement about the overall trajectory of the education system”.

He said the approach made it harder to tell what a grade meant in terms of what a student could do, but it did mean that pupils in schools that were slow to adapt to the new GCSEs would not lose out.

“This is a very tricky debate and we desperately need to know what we are doing and which way we are jumping,” Mr Oates said. “It does need to be exposed to some public scrutiny.”

An Ofqual spokesperson said: “Statistical evidence will be a key tool in the first year of awarding for new GCSEs, to minimise the possible disadvantage to students at a time of change.

“The approach to maintaining standards after the first year of awarding is still being considered … statistics will continue to be used and the process will be supported by the introduction of a new national reference test.”

An Ofsted spokesperson said: “GCSE results are only one factor that Ofsted inspectors take into consideration when judging the overall effectiveness of a school.

“Inspectors make professional judgements based on a broad range of data and inspection evidence. They look at the quality of teaching, the progress pupils have made from their starting points, leadership and management of the school, and the behaviour and safety of pupils. Exam results alone do not determine inspection outcomes.”

Notes on ‘comparable outcomes’

The idea of “comparable outcomes” was first developed by exam boards in 2002, to ensure that the first cohort to take reformed A-levels was not unfairly disadvantaged.
The exam boards decided to prioritise “comparable outcomes” over “comparable performance”. This ensured that the national proportion of students gaining each grade in a particular subject remained roughly the same.
To an extent, this sacrificed the idea that a certain level of performance should lead to exactly the same grade as it had in previous years.
The approach has been used ever since and was given increased emphasis by Ofqual at the start of the decade as the watchdog sought to curb grade inflation.
As a result, grade boundaries have since been set through a trade-off between examiners’ judgements and calculations based on historical statistics.
Ofqual is considering another major shift for the reformed GCSEs, with the possibility that grade descriptors (the criteria that work should fulfil to be awarded a particular grade) will not be used at all.