Are timed exams really the best way to assess students?

Research shows that having students take exams under timed conditions may not be the fairest or most accurate way to assess what they have learned. So, in light of the current exam hiatus, Charlotte Noon and Martin Noon ponder if now might be a good time to change the system
1st April 2021, 7:25pm

When was the last time that you had to complete a complex task under strict timed conditions? It’s not a situation most adults find themselves in on a regular basis because strict time limits are rarely placed on the work we do.

Where there are time limits, deadlines will usually be discussed and will be days, weeks or even months into the future, providing us with time to carefully process information, plan, draft and redraft. Most agree that this is a good thing: it ensures better, more accurate results and it is fairer to those involved, too.

When it comes to students being assessed in schools, though, we seem to agree that the opposite is true. It is fairer, and a more accurate assessment, the system suggests, if we strictly constrain the time a student has to answer exam questions.

This is problematic. The system itself hints at why: schools give some learners with special educational needs and disability (SEND), or with English as an additional language, extra time to complete their exams as they are not expected to be able to do the same work in the same amount of time as their peers. There is an acceptance of an issue with time constraints. But no one seems to want to follow that thread to the end and see the tangle it stems from.

Are those students given enough “extra” time? How would we know? Would each student not differ in the amount of extra time they need? Should other students be given more time? If so, how much more?

All these questions come under the umbrella of a much bigger question, one that is useful to look at in this period when exams are not being held: is time really a useful or desirable constraint in the assessment of how far an individual has “learned” something? We wanted to find out.

There is little evidence from the exam providers about time limits, just that they set times they deem appropriate to the content (see box, below).

So, we started with a trawl of research to find any studies that focused on the timings of exams and the impact that variations might have on individual students. The net returned a meagre catch: it seems our interest was not widely shared, or of high priority, in academia.

However, we did find a very interesting 2007 paper entitled Validity Issues in Test Speededness by Ying Lu and Stephen G Sireci. In the paper, they argue that although time limits are a necessary part of test design for administrative reasons, designers must pay attention to how much of an exam is about “speed” and how much is about “power”.

The researchers define a “speed test” as one in which the questions are easy enough that candidates should never get them wrong. The exercise, then, is purely testing how many of those questions can be answered in the allotted time. It’s a test of recall, basically, not knowledge (the knowledge is assumed to be there).

What they call a “power test”, on the other hand, is one in which candidates are still expected to answer all the questions but their performance is judged on the accuracy of their responses (assessing whether they know it, not how quickly they can recall it).

Taking Lu and Sireci’s definitions, it is clear that most, if not all, statutory tests in the UK are not designed to be speed tests but are instead functioning as power tests (insert sarcastic quip about rote learning here). What has happened, however, is that these power tests have been forced into the format of speed tests for administrative and logistical purposes.

‘A severe threat to validity’

So, what’s the problem with that? In the paper, the researchers state that “when tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores”.

In other words, when we ask students to complete tests that are intended to measure the accuracy or quality of their responses, but make them do it under timed conditions, we could be undermining the effectiveness of those tests.

As teachers, we felt intuitively that this would be the case - and we’re not alone. According to two surveys we conducted through Tes, the majority of teachers are already well aware of the limitations of our current exam system.

Of 107 primary teachers who responded, a whopping 97.2 per cent said that they did not believe that current key stage 1 and 2 tests reflect the ability of all pupils. Similarly, of 210 secondary teachers surveyed, 77.5 per cent said that they felt key stage 4 and 5 tests did not reflect the ability of all students.

What’s more, there was a feeling that the amount of time allocated for exams is generally not appropriate, although primary teachers felt this more keenly than their secondary colleagues. A significant 87.9 per cent of primary teachers reported that they did not feel that the length of the current KS1 and KS2 tests is appropriate for the age of the children, and 83.2 per cent believed that the time given to complete the tests is not “generally appropriate for what pupils are being asked to do”. In secondary, 42.1 per cent of teachers surveyed said that they did not believe that the time given for students to complete exams at KS4 and KS5 was appropriate for what they were being asked to do.

Why the difference between the settings? It may, in part, be down to the fact that secondary teachers are subject specialists, and that some view the existing exams as more appropriate for their subject than others do.

Of the secondary teachers who did think the time limits were inappropriate, the more detailed responses were very telling. For example, one secondary teacher commented that “put simply, examinations are biased towards pupils who are able to recall information under pressure. Many pupils can skilfully apply information, demonstrating in-depth understanding in lessons, verbally or through coursework, but struggle to show the same abilities under pressure in an exam setting.”

Similarly, comments from primary participants argued that existing tests are “biased” or “unfair” and that they “do not truly reflect children’s knowledge and skills”, because they fundamentally function as tests of memory and speed, rather than of a pupil’s ability.

Do we not want to test how quickly learning can be recalled and recorded, as well as how fully or accurately, though? Perhaps, but we never explicitly state that tests are a measure of speed. And there are many good reasons that it would be problematic if we did take this route.

Ellen Braaten is an associate professor of psychology at Harvard Medical School and co-author of the book Bright Kids Who Can’t Keep Up. She explains that some young people have slower processing speeds than others, meaning that it takes them longer than their peers to perceive information, process it and formulate a response. That has no relation to their intelligence nor their capacity to learn, if they are given enough time to do so.

“If you are slower at reading, you are not going to finish the test in time. Not because you don’t understand the question but because you don’t have time to read the question accurately and do what’s needed,” she says.

That’s not something students can easily tackle: there are strong genetic links to processing speed. The concept of cognitive load comes into play here, according to David Putwain, professor of education and early childhood studies at Liverpool John Moores University, who researches exam anxiety.

He explains that cognitive load is “the balance between the amount of mental computation you need to do a task versus the demands made by the task capacity you’ve got”. This is linked to working-memory capacity, which allows you to hold information in your mind “while you need to think about something else”.

“It’s also the part of cognitive function that draws information out of long-term memory, so working-memory function and capacity is critical to any cognitive task,” Putwain says. So it seems that it is possible for a candidate to know everything they need to know to do well in an exam but for their performance to be affected by how quickly they can retrieve information from memory.

“We know that there are individual differences in working-memory capacity,” Putwain explains. “Some people naturally have a slower working memory or some have not got as large a working-memory capacity. In the latter instance, those people can perform a task as well as an individual with a greater working-memory capacity but they take longer to do it.”

Current SEND diagnosis processes are almost certainly not picking all of these children out for extended time in assessments, something teachers in our survey acknowledged. An overwhelming majority of respondents (89.6 per cent at primary and 71.7 per cent at secondary) said that they believe there are pupils who do not currently get awarded extra time in exams but who would benefit from being given this, despite not having recognised needs.

Extra time for all?

Is it not easier, then, just to give extra time to all? Lu and Sireci’s research found that, by extending or removing the time limit of a test, you can reduce or remove the speededness and its negative impact on those taking the test.

As we discussed earlier, however, removing the time limit on tests completely is not a logistical possibility. So then you get into the question of: OK, how much more time do students need?

Well, we could give everyone the extra time that designated pupils already get. However, the majority of the teachers who took part in our surveys don’t actually think that the extra time that students with SEND are currently given is enough to make a real difference to what they are able to achieve. Of the teachers who responded, 88.8 per cent of primary teachers and 52.2 per cent of secondary teachers said that they did not believe SEND students had enough time to complete all the tasks in the exams.

So, do we offer more than they are currently getting? It’s not that simple. Lu and Sireci point out that as a candidate reaches the end of a test, they may become more anxious which, in turn, may affect their performance. They write: “There has been little research on the psychological impact of time limits (eg, does it increase test anxiety?), and so more research in this area will help us better understand the degree to which time limits affect valid score interpretation.”

If nearing the end of a time limit does increase anxiety, this would surely come into play in a decision about whether you made the overall time available longer or not. And research has shown that anxiety is one factor that does have a negative effect on exam performance.

A 2020 literature review, published by Ofqual, found that “after controlling for ability, high levels of test anxiety are generally associated with small reductions in test performance”.

The report also found that test anxiety can have effects that stretch beyond a student’s score on one particular test: “Test anxiety could have a detrimental impact on performance in high-stakes assessments, with further implications for entry to subsequent education and employment opportunities. But, regardless of the impact on performance, test anxiety can be significantly detrimental to a child or young person’s mental health,” the report states.

Therefore, even if we did decide on an agreed extension of time, it seems that might not be the answer. So, what is?

‘A mark a minute’

We could look again at the amount of content. Take the GCSE English language exam, for instance. This contains a huge amount of previously unseen information that students need to process before writing their responses. It is very clearly a “power test” that aims to measure students’ abilities to analyse in depth.

Yet, within the existing time limits, students are forced to focus on gaining “a mark a minute” rather than developing an answer. It seems that the nature of the test does not align with the skills we are trying to measure.

And it isn’t just “essay subjects” that suffer from this problem. Teachers in our survey, across a range of subjects, commented that the enormous amount of information students are now asked to recall and process in exams means that there are fewer opportunities for them to show depth of understanding or knowledge.

Since the reforms to GCSE, there is now an expectation across the board for students to recall a lot more content from memory. In maths, for example, there are huge numbers of formulae to remember. This information used to be provided in the old specifications - and is still provided in the IGCSE. So, why not now?

Similarly, in English literature, students now need to remember banks of quotations (which may not always be the most relevant ones for the question they have been asked). In addition to time pressures, this weight of recall is bound to have a significant impact on processing speeds, adding to students’ cognitive load and anxiety.

So, do we teach less or provide more resources to reduce the load in exams? The answer is that it depends. There is a question we need to ask more often in schools and urgently at a policy level: what is it that we are really trying to measure?

From that, we can then have a more sensible discussion about whether existing specifications and the current format of exams are fit for purpose.

The answers may differ not just by subject but by topic within a subject, and perhaps even between pupils.

If we are assessing students’ skill in analysing text, for instance, does it really make sense to introduce an element of “speededness” that could affect the validity of the results? Does it make sense for some pupils and not others? We don’t know.

In the past 12 months, we have seen a great deal of binary debate over exams versus no exams. Ultimately, that framing is of limited help. If exams are to be part of our future, getting into the detail of what they cover, what they try to measure and how they are conducted seems to us to be more beneficial.

And timing needs to be central to that discussion: too many young people are currently being excluded from a future they are capable of taking part in, not on the basis of what they know (or don’t know) but simply on how quickly they are able to demonstrate it.

Charlotte Noon is an English teacher and GCSE examiner and Martin Noon is a maths teacher in Essex

This article originally appeared in the 2 April 2021 issue under the headline “Why the need for speed?”
