Academics have been quizzed on alternative assessment systems by MPs today as part of an inquiry into the best ways to judge primary schools.
The inquiry was set up after some of the biggest reforms in primary assessment in the past 25 years were introduced last year. The "chaotic" introduction of the new Sats has been condemned by unions as leading to "unreliable and meaningless results".
The government has already promised to review primary assessment.
Key issues discussed today included:
1. Changing the writing assessments
“Assessment of writing remains extremely problematic,” said Tim Oates, group director of assessment research and development at Cambridge Assessment, adding that one alternative was comparative judgement.
Comparative judgement would involve teachers submitting a portfolio of work for each Year 6 pupil. Teachers at another school would then go through the portfolios, comparing two pieces of writing at a time and judging which is better, so that children’s writing can be ranked.
“It is very hard for us to write down as a checklist what constitutes good writing,” said Dr Becky Allen, director of Education Datalab. “But there is such a thing as good writing, there is a shared expert understanding of whether writing is good or not, and that is where comparative judgement works really, really well.”
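How a pile of "this one is better than that one" decisions becomes a ranking can be shown with a short sketch. The panel did not say which statistical model would be used; the example below assumes a Bradley-Terry style model, one common choice for paired comparisons, with a small pseudo-count added so that a script that never wins still gets a score. The function name, the script labels and the judgements are all hypothetical.

```python
from collections import defaultdict


def rank_by_comparative_judgement(judgements, iterations=100):
    """Rank scripts from pairwise judgements given as (winner, loser) tuples.

    Assumption: a Bradley-Terry style model fitted by repeated re-estimation,
    with a 0.1 pseudo-count per script (a regularised variant, not the exact
    maximum-likelihood fit).
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    scripts = set()
    for winner, loser in judgements:
        if winner == loser:
            raise ValueError("a script cannot be compared with itself")
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        scripts.update((winner, loser))

    # Start with every script equally strong, then re-estimate repeatedly.
    strength = {script: 1.0 for script in scripts}
    for _ in range(iterations):
        updated = {}
        for script in scripts:
            denominator = 0.0
            for other in scripts:
                if other == script:
                    continue
                times_compared = pair_counts.get(frozenset((script, other)), 0)
                if times_compared:
                    denominator += times_compared / (strength[script] + strength[other])
            updated[script] = (wins[script] + 0.1) / denominator
        # Rescale so the average strength stays at 1.
        total = sum(updated.values())
        strength = {s: value * len(scripts) / total for s, value in updated.items()}

    # Strongest script first.
    return sorted(scripts, key=lambda s: strength[s], reverse=True)


# Hypothetical judgements: each tuple is (better script, weaker script).
example_judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("B", "C")]
ranking = rank_by_comparative_judgement(example_judgements)
```

No judge ever writes down a checklist or a mark here; the ranking comes only from the repeated paired decisions.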
There were also concerns about a conflict of interest: teachers make the judgements on children, and those judgements are then used to judge the teachers.
2. The need for a baseline measure
Dr Allen was adamant that a new baseline measure in reception was needed. Without a baseline, she argued, it was hard to have a good view of what it’s reasonable for schools to achieve. She suggested that the government should look closely at the assessments that already exist and make a decision on which one to use. “It does not have to be perfect, it cannot be perfect,” she said.
Professor Robert Coe is director of Durham University’s CEM centre, which provided one of the baseline tests that were scrapped by the government as a means of assessing progress last year.
He said that having a high-stakes system around the baseline was what distorted the results. “It is perfectly possible to reliably assess four-year-olds,” he said. “But it may not be possible to do it when there is pressure attached to these assessments for schools to look good.”
In a separate move, the NAHT assessment review group has also today called for a return of a reception baseline.
3. Training for teachers
The need to train teachers in assessment, especially after the removal of levels, was agreed by many panellists.
But Professor Coe warned this was not something to be crammed into an already packed initial teacher training curriculum. “The problem is we don’t have a good model for teachers to continue to learn about complex aspects of how to be a better teacher after their initial training. We need to seriously consider how that works,” he said. “The scale of the problem is enormous,” he added.
Catherine Kirkup, research director at the National Foundation for Educational Research, added there was a need for much greater “data literacy” among teachers.
4. Adaptive testing
Mr Oates said that he could see maths tests, and in particular a future move towards online adaptive testing, leading to a system where children are presented with test questions within their range of ability.
Presenting children with a spread of questions, some of which were far below, or above, their ability was “a waste of time”, he said.
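The idea behind adaptive testing can be sketched in a few lines. This is a simplified "staircase" illustration, not a description of any system discussed at the hearing: the difficulty scale, the starting point, the step sizes and the function names are all assumptions made for the example.

```python
def run_adaptive_test(question_difficulties, answers_correctly,
                      start=5.0, step=2.0, n_questions=5):
    """Pick each question to match the current estimate of a pupil's ability.

    Assumptions: difficulties sit on an arbitrary numeric scale, and
    answers_correctly(difficulty) reports whether the pupil gets a question
    of that difficulty right.
    """
    remaining = list(question_difficulties)
    estimate = start
    asked = []
    for _ in range(min(n_questions, len(remaining))):
        # Choose the unused question closest to the current ability estimate.
        question = min(remaining, key=lambda d: abs(d - estimate))
        remaining.remove(question)
        asked.append(question)
        # Move the estimate up after a correct answer, down after a wrong one.
        if answers_correctly(question):
            estimate += step
        else:
            estimate -= step
        # Take smaller steps as the test homes in on the pupil's level.
        step = max(step / 2, 0.25)
    return estimate, asked


# Hypothetical pupil who can answer questions up to difficulty 7.
final_estimate, questions_asked = run_adaptive_test(
    question_difficulties=range(1, 11),
    answers_correctly=lambda difficulty: difficulty <= 7,
)
```

In this made-up run the pupil is never shown the easiest or hardest questions in the bank; every question asked sits close to their level.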
5. Getting rid of thresholds
Dr Allen said she was not a fan of tests which resulted in a pass/fail judgement: “I don’t like thresholds and I don’t think it is useful or necessary to talk about the ‘expected standard’, and the introduction of the scaled scores meant we didn’t have to do that.”
At a school level, Catherine Kirkup, research director at the National Foundation for Educational Research, recommended that rolling averages, rather than a single year’s results, should be used when looking at whether individual schools met standards.
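A rolling average smooths out one unusual year. The sketch below assumes a three-year window, which was not specified at the hearing, and uses invented percentages for a hypothetical school.

```python
def rolling_average(yearly_results, window=3):
    """Average each run of `window` consecutive years of results."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    for end in range(window, len(yearly_results) + 1):
        recent_years = yearly_results[end - window:end]
        averages.append(sum(recent_years) / window)
    return averages


# Hypothetical school: percentage of pupils meeting a standard in four years.
single_year_results = [60, 45, 75, 60]
three_year_averages = rolling_average(single_year_results)
```

The single-year figures swing between 45 and 75, while both three-year averages come out at 60.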
Harvey Goldstein, professor of social statistics at the University of Bristol, also said that the notion of a threshold for schools was problematic, especially with small cohorts. “The DfE recognises this, it won’t publish results if you have less than 10 pupils”, he said. “But that is arbitrary, why 10? If you have 11 pupils that’s just as uncertain as a school with nine pupils.”
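The point about small cohorts can be illustrated with some rough arithmetic. The sketch below assumes the simplest possible model, in which each pupil independently has the same chance of meeting a standard, and uses the textbook standard error of a proportion; the 50 per cent figure is invented for the example.

```python
import math


def standard_error_of_proportion(proportion, cohort_size):
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(proportion * (1 - proportion) / cohort_size)


# Hypothetical schools where half the pupils are expected to meet the standard.
uncertainty_by_cohort = {
    cohort_size: standard_error_of_proportion(0.5, cohort_size)
    for cohort_size in (9, 10, 11)
}
```

Under these assumptions the uncertainty is roughly 17, 16 and 15 percentage points for cohorts of nine, 10 and 11 pupils, so the figures either side of the cut-off are almost as unreliable as each other.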