Value added taxes the statisticians

I sometimes think there are two kinds of people: those who believe anything if it is supported by numbers, and those who believe nothing numerical. The truth, as is frequently the case, lies somewhere between these two extremes. But in education today there is a danger that the first position will become dominant, especially in the continuing rush towards "value-added" analyses.

Don't get me wrong. I'm a statistician, and I believe that there is nothing like good statistical data for illuminating a situation and making people think about what is really going on. Unfortunately, there is nothing like badly analysed or presented data for obscuring a situation. "Raw" league tables of examination results are an excellent example of this, when used to measure the quality of a school's teaching.

For individuals, unadjusted results are vital indicators of the level of attainment reached in different areas. They also give a measure of the absolute levels achieved in a school. However, for judging a school's performance they tell you an awful lot about its catchment area and the ability level of its intake, but virtually nothing about how good it is at educating children.

This leads to the argument for value-added analysis. It is a good idea to control for prior attainment of pupils and social-context effects if you are trying to gauge how schools are performing relative to each other or to some average standard, but you have to be aware of the pitfalls. At present the movement towards producing national value-added systems from primary to secondary, from infant to junior, and even from pre-school to infant, runs the grave risk of discrediting the whole idea.

I will concentrate on some of the main points. The first is a basic statistical one, and it concerns the accuracy with which we can estimate averages based on small groups of pupils.

For a large secondary school with about 200 pupils in a year group, we may be able to estimate the value added by the school with a moderate degree of accuracy. In a primary, with say 30 pupils in a year group, just one or two high or low-scoring pupils can make a big difference to the overall results, so that it becomes impossible to accurately estimate the value added by the school. If you're thinking of doing value added from Reception to Year 2 in a small infant school, with say 20 pupils, you might as well forget it. You can do the calculations to three decimal places, and publish them in tables, but they mean nothing because for the vast majority of schools the results could just as easily have occurred by chance.
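To put rough numbers on the point about group size, here is a minimal sketch of how the uncertainty in a school's average shrinks only with the square root of the number of pupils. It assumes pupils' value-added scores are independent and measured in units where the pupil-level standard deviation is 1, and it uses a normal approximation for a 95 per cent interval. The function names and the figure of 1 are illustrative assumptions, not any published system.

```python
import math

def standard_error(pupil_sd: float, n_pupils: int) -> float:
    """Standard error of the mean score for a group of n_pupils."""
    return pupil_sd / math.sqrt(n_pupils)

def interval_half_width(pupil_sd: float, n_pupils: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% interval around the group mean."""
    return z * standard_error(pupil_sd, n_pupils)

# Assumption: pupil-level spread of value-added scores, in standardised units.
PUPIL_SD = 1.0

# Year-group sizes taken from the text: large secondary, primary, small infant.
for n in (200, 30, 20):
    hw = interval_half_width(PUPIL_SD, n)
    print(f"{n:>3} pupils: school mean known to within +/- {hw:.2f} pupil SDs")
```

Under these assumptions the interval for 20 pupils is about three times as wide as for 200, so a small school has to be a long way from average before its result can be distinguished from chance.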

The second point is about what you use as a baseline measure of pupils' attainment when starting a particular phase of education. The obvious thing is to use the outcome measure from one stage of education as the prior attainment measure for the next - key stage 1 results as inputs to key stage 2, and so forth. But there are snags.

One is the question of how reliable and reproducible the intake measures are, between schools and across time. If new tests appear every year, and especially if the final levels awarded depend to some extent on teacher judgments, then the results may be useful for some purposes but it is not clear that they are suitable for measuring value added.

Another issue is how finely-differentiated the results are. National curriculum levels, each roughly equivalent to two years of schooling, are of little use as an intake measure for value-added purposes.

A third problem with using the outcomes of one stage as inputs to the next is the danger of "negative coupling". If a school, either in reality or through some technique which "massages" the results, achieves above-average results for its pupils at the end of one stage, then those pupils have to produce even better scores at the next stage for a good value-added result.

For example, I have heard of schools which have striven mightily to get good GCSE results for their pupils, given their initial attainments, only to have their A-level results, analysed on the basis of the GCSE data for those same pupils, show a poor value-added performance.

So the moral here might be: if you want to look good at A-level, make sure your pupils do quite badly at GCSE. This obviously cannot be right statistically, let alone educationally.
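The arithmetic of this "negative coupling" can be sketched under a deliberately simple assumption: that value added is the gap between a pupil's actual score and a straight-line prediction from the prior score. The slope, intercept and point scores below are hypothetical, chosen only to show the mechanism, and are not taken from any real value-added model.

```python
def predicted_outcome(prior: float, intercept: float, slope: float) -> float:
    """Expected later score given the prior score, under a linear model."""
    return intercept + slope * prior

def value_added(actual: float, prior: float, intercept: float, slope: float) -> float:
    """Value added = actual later score minus the score predicted from the prior."""
    return actual - predicted_outcome(prior, intercept, slope)

# Hypothetical model coefficients and scores, in arbitrary points.
INTERCEPT = 0.0
SLOPE = 0.8
a_level_actual = 5.0   # the same A-level result in both scenarios
typical_gcse = 5.0     # GCSE score in line with the pupil's intake
boosted_gcse = 6.0     # GCSE score after the school's extra effort

va_typical = value_added(a_level_actual, typical_gcse, INTERCEPT, SLOPE)
va_boosted = value_added(a_level_actual, boosted_gcse, INTERCEPT, SLOPE)

# The better GCSE result raises the A-level prediction, so value added falls.
print(f"value added with typical GCSE: {va_typical:.2f}")
print(f"value added with boosted GCSE: {va_boosted:.2f}")
```

In this sketch the identical A-level result earns less credit once the GCSE score is higher, because every extra GCSE point raises the bar by the slope of the prediction line.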

So what are the criteria for good measures of prior attainment for entrants to a new stage of education? I suggest that they should not be "high-stakes", that is to say they should just measure where pupils are without having any further consequences for anyone or any institution. They should be consistent countrywide and from year to year, and should be simple to administer and to mark, while giving measures of pupils' prior attainment which are as valid and reliable as possible.

Numerical value-added data can benefit schools, showing them how they're doing with the mix of pupils they have and pointing out areas of possible weakness. For this kind of job, however, you need more than a single score, value-added or otherwise. At the National Foundation for Educational Research, we have developed systems which allow schools to judge which departments are "adding significant value". They can then use their own knowledge or NFER's guidance to identify methods and processes in those departments which might be transferred to others. In that way numerical information helps schools to develop a fuller understanding and improve performance.

There is a case for publicly available data about schools, in the name of accountability. But what that data should be, and how it is used, must be carefully thought out and the consequences weighed. Let us try to measure what is important, rather than letting what we can measure be all that is important.

Ian Schagen is head of statistics at the National Foundation for Educational Research
