How Pisa came to rule the world

As the international study’s latest results are released, its grip on policymakers and school systems is tightening. But with great power comes great responsibility, William Stewart reports.

6th December 2013, 12:00am

William Stewart

This week was a momentous one for governments across the globe as the latest, long-awaited results from the world’s most influential international education study were published. The findings of the Programme for International Student Assessment (Pisa) may come out only once every three years but they are increasingly a very big deal indeed.

Even back at the beginning, however, Pisa had the power to make national leaders pay attention. The first edition of the survey plunged a leading nation into a state of collective trauma over standards in its schools (Germany’s 2001 “Pisa shock”). And since then its influence has only grown. Pisa results have been used to justify sweeping, controversial reforms in England since 2010, and today a growing number of countries see them as a guide to creating the perfect school system.

More than half a million 15-year-olds in 65 different jurisdictions took the most recent Pisa tests last year. But their performances will have an impact on many more students, some in countries that did not even participate. Today, Pisa is at the centre of what two US-based academics have dubbed a new system of “global education governance”.

“Pisa seems to be well on its way to being institutionalised as the main engine in the global accountability juggernaut, which measures, classifies and ranks students, educators and school systems from diverse cultures and countries using the same standardised benchmarks,” Heinz-Dieter Meyer and Aaron Benavot write in a new book, Pisa, Power, and Policy: the emergence of global educational governance.

The professors from the University at Albany, State University of New York argue that Pisa has allowed the Organisation for Economic Cooperation and Development (OECD), which runs the programme, to begin “assuming a new institutional role as arbiter of global education governance, simultaneously acting as diagnostician, judge and policy adviser to the world’s school systems”.

The OECD readily acknowledges the reach of Pisa, noting that “many countries now set goals and benchmarks” based on its results. But it emphasises the help the study can give to education ministers by revealing “what is possible” rather than the power it has over their fates.

Few people would oppose the notion that collecting comparative data on the performance of national school systems could prove useful. As Sir Michael Barber, who was education adviser to Tony Blair when he was British prime minister, told TESS’s sister magazine TES last year: “It would be really surprising and actually quite bizarre in an education field to argue that not researching something and not learning about it is a better way to go than learning about it.”

But once you go beyond that basic idea and look at what it might mean in detail, many potential problems emerge. Chief among them must be the question of whether it is even possible to make useful comparisons between such a variety of different countries. In the words of one critic, Professor Svend Kreiner, a Danish academic who has challenged the statistical methods used for Pisa, “it is meaningless to try to compare reading in Chinese with reading in Danish”.

Then there are the questions of exactly what should be measured, who decides and how they are held accountable. These will be asked more and more frequently as Pisa expands its activity. The latest round of the assessment included a computer-based problem-solving test for the first time, alongside the usual maths, reading and science tests. In 2015 that will be extended to include collaborative problem-solving.

Perhaps more significantly, a Pisa-based test for individual schools will be available from next year in the UK, the US and Spain.

Cold War origins

It is not just the growing scope of Pisa that will lead to greater scrutiny. The programme’s increasing power and influence could present its own issues. There is already evidence that some countries - such as Wales, which has gone through its own “Pisa shock” - are starting to tailor their education systems specifically to improve their Pisa rankings.

What began as an attempt at objective assessment is starting to become an active agent for change in the same school systems it is trying to measure.

So how did things get to this point? Professor Daniel Trohler, from the University of Luxembourg, traces Pisa’s origins all the way back to the Cold War and the launch of the Sputnik satellite by the Soviet Union in 1957. The unprecedented feat of sending a 58cm-diameter metal sphere into space punctured the West’s sense of superiority and triggered a crisis of confidence in school standards in the US. The federal government began to pour money into education. But the control that local authorities were guaranteed over state schools in the US presented the administration with a problem.

“To invest billions of dollars and not be sure about the effects was understandably unsatisfying,” Trohler writes in his article “The OECD and Cold War Culture: thinking historically about Pisa” in Pisa, Power, and Policy. “If the federal government could not govern directly, it at least wanted to see what effects its incentives had, and for this purpose a test instrument had to be developed.”

The result, he says, was the 1964 proposal for a National Assessment of Educational Progress, “which developed tools of comparative testing that were used 35 years later at a global level in Pisa”.

The first of these US national assessments took place in 1969, by which time the OECD was already eight years old. The organisation, which defines itself as a forum of countries with a “shared commitment to market economies backed by democratic institutions”, always saw education as a key part of its role.

But according to Trohler, it was only pressure from the US, and its threat to withdraw funding, that led to the OECD pursuing the idea of an international comparative education study in the early 1980s. This was despite opposition from most member states, “who seemed to doubt the feasibility and usefulness of international comparisons”, he explains.

Stephen Heyneman, an education expert who worked for the World Bank for two decades, says that the OECD’s own directors of education research reacted with “shock and deep suspicion” to the plan to collect education statistics. They believed “it was unprofessional to try to quantify such indicators, and that it would oversimplify and misrepresent OECD systems”.

But it was the US view that prevailed, paving the way for the eventual publication of the first Pisa report in 2001.

This long gestation period meant that Pisa was not actually the first set of international tests to compare school systems: the Trends in International Mathematics and Science Study (Timss) published its initial results in 1995. But it is Pisa that has proved to be the most influential programme. One reason could be that it comes out more often: every three years, compared with Timss’ four-year cycle.

Timss is run by the International Association for the Evaluation of Educational Achievement (IEA). In 2001, the IEA, an association of national research institutions and government research agencies, also launched a reading test, the Progress in International Reading Literacy Study (Pirls), which comes out every five years. But again this has a lower profile than Pisa.

Gabriel Sahlgren, research director at the Centre for Market Reform of Education in London, believes that this is because Pisa has had better marketing and the OECD is better known than the IEA. “But,” he says, “both (sets of studies) are probably valuable in their own right.”

Knowledge and real-life problems

Many agree with Sahlgren and view the assessments, which have important distinctions, as complementary. Pisa is age-specific, testing 15-year-olds. For the IEA, it is the number of years that students have attended school that is fixed. Timss tests fourth- and eighth-graders - roughly aged 9-10 and 13-14 - while Pirls is restricted to the fourth grade.

Even more significantly, whereas Timss and Pirls test topics that students are likely to have been taught as part of their school curriculum, Pisa looks at how well students can apply knowledge to real-life situations and problems.

This approach fits with the OECD’s economic focus. “The very meaning of public education is being recast from a project aimed at forming national citizens and nurturing social solidarity to a project driven by economic demands and labour-market orientation,” Meyer and Benavot write.

More prosaically, this means that Pisa is not really a test of school effectiveness. Andreas Schleicher, the senior OECD official who runs Pisa, acknowledged this point in a TES interview last year. “There are many different forms of students’ work - school is one, (but) it can be private tutoring, it can be learning reading outside of school with parents - and we should look at this holistically,” he said.

So, if Pisa is measuring what is going on in homes as well as schools, then could it be said that it is not a fair judge of the school system?

“I think that is true,” Schleicher admitted. “I agree with the criticism that you can’t say that the school system is entirely responsible for Pisa results.”

That is a potentially awkward fact for a programme that is increasingly being seen as a measure of the effectiveness of school systems, rather than the wider societal factors they contend with. And it could become more awkward as the OECD begins to market Pisa - designed to assess the ability of students across entire countries - as a way of rating schools.

But for Sahlgren the fact helps to explain some of the seeming inconsistencies with rival surveys. He uses the example of Finland, a perennial Pisa high-flyer. When the country first entered Timss, in 1999, it was placed 14th for secondary students in maths. That was with seventh-grade students rather than the eighth-graders entered by other countries. Nevertheless, Finland’s next Timss entry in 2011 - it decided not to participate in 2003 and 2007 - did not provide much encouragement. Again it entered seventh-graders, but the average score they achieved in the test fell from 520 to 482.

Sahlgren argues that this is because the maths curriculum taught in Finland’s schools “has ignored certain concepts that are not tested in Pisa”. He adds: “If you compare the Pisa mathematics with the Timss mathematics, the Timss mathematics is much more relevant for higher studies in mathematically intensive subjects like engineering, which are going to be very important for countries’ futures.”

None of this has stopped educational tourists from around the world flocking to Finland to find out the secret of its success, because among today’s ever more globally aware education policymakers, Pisa rankings are increasingly being seen as the main measure of success.

Overrated rankings?

The way these rankings are calculated is coming in for increasing criticism, with academics including Kreiner and Dr Hugh Morrison, from Queen’s University Belfast, arguing that flaws in the statistical methods employed mean that the results are “useless” and “meaningless”.

Schleicher has mounted a vigorous defence. But it is not just the views of sceptical academics that should be triggering concerns about the emphasis now placed on countries’ exact positions in the Pisa tables. The OECD has said that “large variation in single-ranking positions is likely” because of “the uncertainty that results from sample data”.

But it still publishes the league tables that become the focus of media and political discussion, giving Pisa its high profile.

John Bangs, chair of the OECD Trade Union Advisory Committee’s working group on education, wants the organisation to scrap the tables and start putting countries together in broad blocks or bands.

“The OECD has always argued that the rankings are not the main point of Pisa,” he says. “But the reality is that every country looks at the rankings even if the difference is a few points. It is turning into a football league.

“The OECD might say, ‘It is not us, it is the way the countries use the results.’ But I would say that it has to do much, much more to stop Pisa being used for crude rankings.”

Bangs also has major concerns about the Pisa-Based Test for Schools now being introduced by the OECD. “It should not fall into the trap of thinking it has got a marketable product that can be given to schools,” he says. “The tests are supposed to be a system-wide evaluation.”

He predicts the emergence of new elite league tables made up of schools - most likely private ones - that can afford to run the Pisa test. “If the OECD wants to get into the test market for individual schools, then they ought to do something entirely different and not use the tests developed for a system-level approach,” Bangs says.

The power of the Pisa brand will probably ensure that the school tests are a success despite those concerns. But as Pisa grows, so will scrutiny of its findings and methods.

Only last month, TES revealed serious academic concerns about another crucial aspect of Pisa: the quality of the contextual information submitted by schools participating in the programme. There is also the risk that the higher the Pisa stakes become, the more governments will try to do everything possible to improve their position in the rankings.

Schleicher has admitted that in the past some countries, including Germany, Switzerland and Spain, “have tried to game the system”. The OECD says that the way the study is set up means it is impossible for them to succeed. But that is unlikely to stop further attempts in the future.

Today, Pisa has an unprecedented level of influence over global education. But it could be about to face its biggest test yet.