The DFE now appears to concede that exam results tables give a misleading impression of schools' achievements. But will it ever be possible to compile "value-added" rankings - and should we even try? David Budge opens a two-page report.
The more value-added work is publicised, the more essential it is to get it right. There are, however, several ways in which it is possible to get it wrong. All, alas, are in use.
(a) Failing to take subject difficulty levels into account.
The "difficult" or "severely graded" subjects are the sciences, mathematics and foreign languages. This pattern holds at GCSE, at A-level, in England, Wales and Northern Ireland, for Highers in Scotland and across all exam boards. Every value-added system must deal with subject difficulties in every year. Yet software is being sold to encourage the calculation of value added within a single school, where there is no way to take account of the difficulties of subjects that year. Some teachers will be blamed unjustly.
An accountancy firm is calculating value added between GCSE and A-level using the explicitly-stated assumption that all A-level subjects are equally difficult. They assume, falsely, that if students take, say, English, maths and theatre studies, they should be expected to get the same grade in each subject. The result will be that teachers of more difficult subjects (or subjects more severely graded) will regularly be blamed for having produced poor value added.
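The danger in the equal-difficulty assumption can be sketched with a toy grade-point calculation. All figures here are hypothetical: the points scale is a simple A=10 to E=2 tariff, and the difficulty corrections (roughly one grade for a severely graded subject) are illustrative assumptions, not any board's actual adjustments.

```python
# Toy comparison (hypothetical figures): judge one pupil's A-level
# results against expectation, with and without a correction for
# subject difficulty.
GRADE_POINTS = {"A": 10, "B": 8, "C": 6, "D": 4, "E": 2}

# Hypothetical severity corrections: a harder subject typically costs
# pupils of equal ability about one grade (2 points).
difficulty = {"english": 0, "maths": -2, "theatre studies": +1}

results = {"english": "C", "maths": "D", "theatre studies": "B"}
expected = 6  # GCSE record suggests a C (6 points) in a subject of average difficulty

for subject, grade in results.items():
    naive = GRADE_POINTS[grade] - expected
    corrected = GRADE_POINTS[grade] - (expected + difficulty[subject])
    print(f"{subject}: naive value added {naive:+d}, corrected {corrected:+d}")
```

Under the equal-difficulty assumption the maths teacher shows negative value added; once severity is allowed for, the verdict is neutral - the "blame" was an artefact of the assumption.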
Whole baskets of examination results (such as total UCAS points scores or percentage of A to C passes at GCSE) are used as outcome measures without regard to which subjects are involved. The net effect could be to encourage schools to steer pupils away from the sciences, mathematics and foreign languages - subjects vitally important for a modern society.
(b) Being ready to use any available test as a baseline.
Tests may be "standardised", meaning that they are published and have "norms" associated with them, so that a score of, say, 100 is supposed to represent the score of "an average child", but different standardised tests have not been shown to be equivalent. Analyses based on the assumption that they are equivalent may be severely unfair. The norms in standardised tests may refer to two decades ago or two years ago, yet not only have curricula changed but there is no guarantee the norming populations were in any way equivalent.
The norms will be subject to sampling variation, as in any sample - and to an unknown degree. Indeed, attempts in the United States to equate standardised tests concluded that it could not be done.
If a school submits standardised scores as a baseline against which value added can be calculated, then it runs the risk of having accidentally selected an easy test, making its students look particularly able, and thus making its value-added scores look particularly poor. Only tests based on the same norming populations and given under the same conditions at the same time should be used if serious attention is to be paid to value added scores.
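The selection risk can be shown with a toy calculation. All numbers below are hypothetical: the same cohort is imagined sitting two non-equivalent "standardised" tests, one of which happens to be easier.

```python
# Hypothetical cohort: the same pupils sit two non-equivalent
# "standardised" tests. The easier test yields a higher baseline,
# so identical exam results read as poorer value added.
exam_score = 52.0  # the cohort's actual later exam performance

baselines = {
    "harder test": 50.0,  # normed on a comparable population
    "easier test": 58.0,  # same pupils score higher on an easy test
}

for test, baseline in baselines.items():
    value_added = exam_score - baseline
    print(f"{test}: baseline {baseline:.0f}, value added {value_added:+.0f}")
```

Same pupils, same exam results - yet the school that happened to pick the easier baseline test appears to subtract value.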
(c) Using socio-economic data such as free school meals instead of prior achievement or ability measures.
In efforts to counteract the "league tables", people have been tempted into relating schools' results to measures of home background such as free school meals (FSM) or postcode-based "guesstimates". These attempts may seem successful in altering the rank ordering of schools, but they are misleading and dangerous for several reasons. They look like an excuse for expecting little from "the poor". This is insulting and not consistent with the data.
Measures of home background of any kind (parental occupation or education, FSM, cultural capital etc.) correlate weakly with examination results. If you measure pupil-by-pupil - the only proper way to calculate value added - then home background measures generally account for no more than about 9 per cent of the variance in exam results, whereas a measure of prior achievement generally accounts for at least 36 per cent.
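Those percentages are simply the square of the correlation coefficient. A minimal sketch, with illustrative correlations chosen to match the figures above:

```python
# The proportion of variance "accounted for" by a predictor is the
# square of its correlation with exam results (r squared).
# Illustrative correlations, chosen to match the figures in the text:
r_home_background = 0.3    # typical pupil-level correlation for FSM etc.
r_prior_achievement = 0.6  # typical correlation for prior achievement

print(f"home background:   {r_home_background ** 2:.0%} of variance")   # 9%
print(f"prior achievement: {r_prior_achievement ** 2:.0%} of variance")  # 36%
```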
But haven't we seen very strong correlations with FSM? Yes, but the effect that the FSM variable has on the ordering in a league table will depend upon how much schools in the LEA are segregated by social class. Relatively unsegregated schools will not yield strong correlations between examination scores and FSM, so little adjustment will be made. On the other hand, where schools are strongly segregated by social class (the rich go to one school and the poor to another) the correlations may be so strong that the adjustment becomes a two-edged sword.
The apparently very strong association between FSM and examination results convinces parents they had better avoid certain schools. But the strong correlations only result from the use of single numbers to represent the work of entire schools. This use of aggregated data leads to the "ecological fallacy" - in this case, the belief that free school meal eligibility strongly predicts the achievement of individual children. It does not.
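The aggregation effect is easy to demonstrate with a simulation. The sketch below invents a strongly segregated LEA (all parameters hypothetical) in which FSM has only a small effect on each pupil's score, yet correlating school averages produces a dramatic figure:

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools, n_pupils = 40, 100

# Hypothetical LEA with strongly segregated intakes: schools differ
# widely in the proportion of pupils eligible for free school meals.
fsm_rate = rng.uniform(0.0, 0.8, n_schools)
# A school's average attainment drifts modestly with its intake.
school_mean = 60 - 20 * fsm_rate

pupil_fsm, pupil_score = [], []
for j in range(n_schools):
    is_fsm = (rng.random(n_pupils) < fsm_rate[j]).astype(float)
    # Pupil level: FSM shifts the expected score only slightly,
    # while individual variation is large.
    scores = school_mean[j] - 2 * is_fsm + rng.normal(0, 15, n_pupils)
    pupil_fsm.append(is_fsm)
    pupil_score.append(scores)

# Pupil-by-pupil correlation - the proper way - is weak.
r_pupil = np.corrcoef(np.concatenate(pupil_fsm),
                      np.concatenate(pupil_score))[0, 1]

# Correlating school averages - league-table style - looks dramatic.
r_school = np.corrcoef([f.mean() for f in pupil_fsm],
                       [s.mean() for s in pupil_score])[0, 1]

print(f"pupil-level correlation:  {r_pupil:.2f}")   # weak
print(f"school-level correlation: {r_school:.2f}")  # strong
```

Same data, two very different impressions: the school-level number says little about how much FSM matters for any individual child.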
FSM is, in any case, a poor measure because many pupils eligible for free meals do not claim them. In one school, 27 per cent of the girls were recorded as FSM, but only 8 per cent of the boys. As a result, the school showed positive value added for the girls, and negative for the boys. The error was in the FSM variable, not in the teaching. The boys had not claimed the FSM to which they were entitled.
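A toy version of that distortion, assuming an FSM-based adjustment to expected scores (the adjustment weight and all scores are hypothetical):

```python
# Suppose boys and girls actually attain the same, but recorded FSM
# rates differ because many eligible boys did not claim.
baseline_expected = 50.0
penalty_per_fsm_point = 0.2  # hypothetical adjustment per percentage point FSM

actual_score = 48.0                       # identical for both groups
recorded_fsm = {"girls": 27, "boys": 8}   # recorded, not true, eligibility

for group, fsm_pct in recorded_fsm.items():
    expected = baseline_expected - penalty_per_fsm_point * fsm_pct
    value_added = actual_score - expected
    print(f"{group}: expected {expected:.1f}, value added {value_added:+.1f}")
```

Identical attainment, opposite verdicts - the error lies entirely in the FSM variable, not in the teaching.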
These are just a few of the ways of getting value added wrong, but the 64,000-dollar question is how it can best be used to effect constant improvement in effectiveness, efficiency and morale in schools. The human systems must be as good as the technical systems.
Value added provides a tool of tremendous interest to teachers. If properly used, the UK could lead the world in having a carefully monitored system, with information on value added at the heart of it. But if wrongly used, it can be destructive. Since Gillian Shephard, the Education Secretary, has said that it will take five years to produce value-added measures (timing from the year when the SATs for 11-year-olds are in place) why don't we have a five-year moratorium in which schools are provided with a variety of value-added measures, but without the threat of publication? That could provide time for research on how to run systems which motivate rather than mislead or dismay. (Where could the money for research come from? From the Office for Standards in Education budget . . .)

Carol Fitz-Gibbon is professor of education at the University of Newcastle upon Tyne and director of the Curriculum, Evaluation and Management Centre.