As the 'falling standards' bandwagon gathers pace, Ted Wragg attacks the latest analysis of test results.
The crisis in primary schools is now official, according to the newspapers. The basis for this alarm was a report written by the right-wing critic, Dr John Marks of the Social Market Foundation, analysing the results of the key stage 2 tests in 1995.
It was given huge publicity as an authoritative piece of research from an "independent think tank". Hardly any reporters or interviewers queried the conclusions: that the average pupil was 18 months behind the expected level of attainment in English and two years behind in maths.
I was concerned because the document seemed to me to have several flaws as a piece of "research". As a former president of the British Educational Research Association, and the editor of Research Papers in Education, an international journal, I tried to think how I would have reviewed it.
I concluded that there were 10 reasons why it should be rejected and two why it should be taken seriously. The reasons for rejection are as follows:
1 The report treats national curriculum levels as a ratio scale. This means that each point is exactly equidistant from the ones on either side. A true ratio scale offers great precision: on an accurate tape measure, the 2-metre mark is exactly the same distance from the 1-metre and 3-metre marks. Exam scores are not exact. A mark of 80 per cent is not precisely twice as clever as one of 40 per cent. The eight levels of the national curriculum are certainly not a true ratio scale.
2 National curriculum levels are measured by three different, but equally inaccurate, thermometers. Levels 1, 2 and 3 are mainly measured on an older thermometer at key stage 1. Levels 3, 4, and 5 are principally taken from a different, brand new thermometer (which hasn't yet "bedded down", according to Gillian Shephard) at KS2. The higher levels are measured on yet another at KS3. The middle thermometer is especially suspect.
3 The inaccuracy of the multiple thermometers is easily demonstrated. In Birmingham four 11-year-olds took maths GCSE in 1995 and gained a grade C pass, the level of an above-average 16-year-old. The same four pupils obtained level 5 in the KS2 maths test, roughly the performance of an average 13-year-old. On two different "thermometers", they were over three years worse (or better) than themselves!
4 The tests for 11-year-olds were raw. Teachers felt the time for the maths test was too short. It was lengthened the following year. The first proper year in any test's life is often its most problematic.
5 One big flaw in the Social Market Foundation report is that John Marks takes a fairly crude category scale, where most pupils score 3, 4 or 5, and treats it as if it is a ratio scale. Averages are taken to two decimal places and interpreted as if they are accurate, with each gap between levels the equivalent of two years. For example, "In English the average pupil is 18 months behind the expected level of attainment".
6 The tests are criterion-referenced, so pupils cannot achieve part scores. You cannot get level 3.5. If every pupil in a class of 11-year-olds completed all the criteria for level 3 and half the criteria for level 4, the average score would be 3, not 3.5. John Marks would conclude that they were "two years behind the expected level of attainment". In reality, given the problems of an inaccurate new test, they might be bang on the national average.
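The arithmetic behind this point can be sketched in a few lines of code (a hypothetical illustration, not anything from the report itself): under criterion-referenced scoring, a pupil's recorded level is the highest level whose criteria are all met, so partial progress toward the next level simply vanishes from the average.

```python
# Hypothetical sketch of criterion-referenced scoring: a pupil's level is
# the highest level for which ALL criteria are met; partial completion of
# the next level does not count at all.

def pupil_level(criteria_met):
    """criteria_met maps level -> fraction of that level's criteria completed."""
    level = 0
    for lvl in sorted(criteria_met):
        if criteria_met[lvl] >= 1.0:  # every criterion for this level met
            level = lvl
        else:
            break  # a level not fully met caps the recorded level
    return level

# A class of 30 eleven-year-olds: every pupil meets all the criteria up to
# level 3 and half of the level-4 criteria.
class_results = [{1: 1.0, 2: 1.0, 3: 1.0, 4: 0.5} for _ in range(30)]
levels = [pupil_level(p) for p in class_results]
average = sum(levels) / len(levels)
print(average)  # 3.0 — not 3.5, because the level-4 progress is invisible
```

Taking that average of 3.0 to two decimal places and reading each missing level as two years of schooling is exactly the step the report takes, and exactly the step the scale cannot support.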
7 The report disregards teachers' assessments. These are higher than the test scores. It is easy to write them off as being "generous". Yet one year, when teachers' assessments were lower than test scores at KS1, an education minister, addressing parents, said teachers did not recognise their children's talents!
8 The returns of the test scores in 1995 were not accurate. The method of data collection set up by John Patten probably overestimated the poorer performers. In the tables of the Social Market Foundation report, the minimum school score in Somerset is zero. If that were true, then not one single 11-year-old pupil in those schools would be able to do any of the national curriculum.
9 There is no proper consideration in the report of social background, the number of pupils who do not speak English as their first language, or the numerous other factors that affect performance in tests.
10 The "conclusions" offer the familiar right-wing litany, blaming "progressive" teaching methods. Test results in this form cannot assess teaching methods, unless they are individually related to observation data, which these are not.
All that said, there are two important reasons why better-quality work on test scores than that shown in this report should be taken seriously. There are differences in performance between boys and girls and between members of different social and ethnic groups which cause concern. These should be addressed.
Second, there are indeed variations between schools in similar circumstances. These may be the result of unacceptable differences in the quality of teaching and learning, rather than social or other factors.
Problems should be identified by effective inspection and addressed vigorously at local level.
John Marks is right to draw attention to school differences and to insist on a public debate. Unfortunately, despite his assiduousness, the analysis and interpretation of the 1995 results do not themselves pass the test.
Ted Wragg is professor of education at Exeter University