How justified is Sir David Normington, permanent secretary at the Department for Education and Skills, in calling for the Statistics Commission to "set the record straight" and correct its report?
Sir David argues that the report makes no attempt to look at English and maths separately before deciding whether improvements have been exaggerated.
He says that "the most important and comprehensive study available", by Alf Massey of the University of Cambridge Local Examinations Syndicate, found that standards in KS2 maths had been maintained.
Other academics contest this finding. But it seems at least plausible to say that the commission would be on stronger ground if it limited its detailed comments to KS2 English.
In his letter, Sir David fails to mention that the Massey research he quotes in relation to maths also found that the KS2 improvements in English came about largely because the standard of the reading test fell.
Massey's methodology was impressive, and Sir David makes only a brief attempt to quibble with his findings on English before moving on.
His second argument is that none of the evidence cited by the commission effectively calls into question the Qualifications and Curriculum Authority's test-setting procedure.
But Massey's report did question this. It said the QCA's practice in the 1990s of comparing test standards from one year to another rather than over a longer period was "inherently weak". Such year-on-year comparison made an incremental drift in standards, down or up, possible over time.
Sir David's third and final point is that the commission did not have the evidence to conclude that teaching to the test was going on - and the commission does not attempt to prove this.
Yet, in 2003, a Government-commissioned evaluation by University of Toronto academics, which is not mentioned by the commission, cited widespread teaching to the test in Year 6.
Nor is the DfES's case helped by the fact that the QCA has also pointed to evidence from Texas, where results shot up after high-stakes testing was introduced but pupils appeared to be doing no better on other measures.