Essays to be marked by 'robots'

News | Published in TES Newspaper on 25 September, 2009 | By: William Stewart

As computers are used in English tests, experts predict GCSEs and A-levels will be next

The owner of one of England’s big three exam boards is to introduce artificial intelligence-based, automated marking of exam essays in the UK next month.

The decision by Pearson, parent company of Edexcel, to use computers to “read” and assess essays for English tests is fuelling speculation among assessment experts that GCSEs and A-levels will be next.

The head of research at another of the big three boards told The TES it was a question of “when not if” automated marking was introduced extensively.

But last night academics attacked the idea of machines judging the quality of extended writing as “ridiculous nonsense”.

They fear that pupils will start writing to impress computers, removing human creativity from their work.

The TES revealed in 2006 that Edexcel planned to start trials of the automated marking of essays in dummy GCSE-style questions.

The launch of the new Pearson Test of English Academic (PTE Academic) on October 26 will mean the controversial technology is being used for essays in live exams.

Pearson boasts that its “proven automated scoring” will provide a test that accurately measures candidates’ English writing abilities.

Bethan Marshall, senior lecturer in English and education at King’s College London, said: “A computer will never be unreliable. They will always assess in exactly the same way. But you don’t get a person reading it and it is people that we write for. If a computer is marking it then we will end up writing for the computer.

“People won’t be aiming for the kind of quirky, idiosyncratic work that produces the best writing. So what is the point?”

Tim Oates, research director at Cambridge Assessment, which owns the OCR exam board, said: “It’s extremely unlikely that automated systems will not be deployed extensively in educational assessment. The uncertainty is ‘when’ not ‘if’.”

The technology being used by Pearson is designed to allow computers to assess pupils’ use of grammar and vocabulary. But some experts say newer, more effective systems are available.

The Pearson approach is based on correlations between human judges and artificial intelligence systems. Machines are “trained” to learn from the scores given to specific texts by humans so that they will be able to achieve the same results on their own.

Mr Oates said: “In simply getting an automarking system to agree with human markers you are ignoring the vital question of exactly what parts of performance are being ranked.

“Other developers are working on more valid approaches, of greater merit and promise. Crucially, these aim to be sensitive to the concepts and language structures actually being used by candidates.”

A Pearson spokesman said that its system produced the accuracy of human markers while eliminating human elements such as tiredness and subjectivity.

“There are many technologies and we feel this is the best for its designed purpose,” he said.

The PTE Academic, being introduced in at least 20 countries, including the UK, is designed to help English-speaking universities assess the English language proficiency of potential students.

An Edexcel spokesperson said that the board was not planning to use automated marking in mainstream exams for anything other than multiple choice questions.

She said the trials that saw the technology used for GCSE-style essays were very small scale and not being pursued “at this time”. She could not say whether the results were positive or negative.

Comment (8)

  • Computers can't yet recognise handwriting very well. How is an essay then assessed? If they achieve comparable scores to a human marker, then how do humans really assess essays?

    28 September, 2009


  • And voice recognition too... Just speak, and the machines will type perfect documents for us, hahahahaha.
    There is a massive difference between automated testing of English language proficiency, and a program being able to mark a piece of expressive or academic writing.
    Bring it on, folks, but I'll hang on to my red pen for the time being.

    28 September, 2009


  • You are fighting against nature. We are always trying to find an easy way out, to make things faster while keeping quality. It was once said: THE WRITER IS HIS CHARACTER AND THE WALK OF LIFE IS WRITTEN FOREVER. A computer cannot feel what a person feels in how they write, especially when they are giving a message to the reader. The teacher cannot be replaced by a computer! Now I know why our human race suffers: we are always trying to find a way to replace what God has made. Just an English teacher who was raised in the hood.

    28 September, 2009


  • It won't work.

    In answer to fenwickC's question, it is possible to quite accurately score for language usage, spelling and grammar (there is a simple reliable way of telling the reading age of a piece of text for example). There is often a strong correlation between these and the quality of the essay itself, as you might expect.

    However, not always. Where computers will fall down is with the imaginative essay written in basic English, and the boring poorly constructed essay written using complex words.

    (The above-mentioned reading age calculation measures things like sentence length, word length and so on, and can thus tell the difference between Sun journalism and Telegraph journalism. But a human does this differently, by looking at *what* is written, not *how* it is written)
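
A calculation of this kind can be sketched in a few lines of Python. The comment does not name a formula, so this example assumes the Automated Readability Index, a published measure that uses only characters per word and words per sentence. The function name and sample sentences are hypothetical.

```python
import re


def automated_readability_index(text: str) -> float:
    """Approximate US school grade level from word and sentence length only."""
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    # Count letters and digits, ignoring apostrophes inside words.
    chars = sum(len(word.replace("'", "")) for word in words)
    return (4.71 * chars / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


# Hypothetical sample texts, invented for illustration.
simple = "The cat sat on the mat. It was a big cat."
dense = ("Notwithstanding considerable institutional scepticism, automated "
         "assessment technologies continue proliferating throughout "
         "international examination systems.")

print(automated_readability_index(simple))
print(automated_readability_index(dense))

# The score depends only on lengths, so reordering the words (and
# reversing who killed whom) leaves it unchanged.
print(automated_readability_index("Brutus killed Julius Caesar."))
print(automated_readability_index("Julius Caesar killed Brutus."))
```

Because the index looks only at how long the words and sentences are, it can separate plain prose from dense prose but cannot tell two sentences apart when they use the same words to say opposite things.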

    It also can't understand the quality of any argument, things like poetry in language (or indeed poetry at all).

    Pearson will point to the above-mentioned correlation as evidence that its methods work. They are wrong. There are far too many edge cases for it to be at all reliable.

    If you just used the simple reading age test you'd get a reasonable result.

    But it doesn't detect rubbish. A computer will struggle to differentiate between "Brutus killed Julius Caesar", "Brutus was killed by Julius Caesar" and "Julius Brutus Caesar was killed", for example - and more complex sentences will be much worse.

    Besides the above problems, as the article suggests, teachers will teach children to write to score marks, not to write well.

    - Use long words, even if you don't know what they mean.
    - Use long sentences, even if you just do it by sticking commas here and there.
    - Mention key words relevant to the text you are writing. Doesn't need to be contextually correct.

    29 September, 2009


  • I agree with autismuk. The creative element of writing will be totally ignored.

    I fail to understand the following quotation from the article:

    "An Edexcel spokesperson said that the board was not planning to use automated marking in mainstream exams for anything other than multiple choice questions."

    How can creative writing be multiple choice? And isn't using a computer to mark multiple choice exams standard practice already?

    30 September, 2009


  • I'd endorse the comments above about the quality of writing being so much more than things a machine can measure. Could a machine detect, let alone evaluate/appreciate humour and irony?

    A by-the-by: it seems to me to be terribly quaint to report on the actions of "examination boards". Exam Boards weren't abolished as soon as the National Curriculum came in. They did continue for a while but we haven't had any for about eight years now.

    1 October, 2009


  • I believe that humans are the most qualified to assess essays. Robots can never replace us. They may be able to read the UK essays, but they can never relate to them or feel them.

    28 May, 2012


  • Amazing Blog Post. Blog Post provide acceptable obsession. The offer impacts on numerous standard worries of your brain..

    Rating: 5 out of 5 stars
    8 December, 2015
