Test-bashing is an easy way of raising a cheer from teachers. Many American commentators blame standardised testing for all kinds of ills in their education system. But, as E D Hirsch points out, there is no alternative to regular testing if we wish to hold schools accountable:
Tests of academic progress are the only practical way to hold schools accountable for educating all children and are therefore essential to the twin aims of quality and fairness (The Knowledge Deficit, p91).
It may be true that we won’t fatten a pig by weighing it, but a good farmer will know how fat his pigs are, will be looking at why some are fatter than others, and will be seeking ways to make sure as many as possible are as fat as possible. He won’t be able to do any of those things unless he weighs them. Likewise, a good hospital will check outcomes in different wards and compare them. When there are major differences they will investigate why. Carl Hendrick describes an example of the importance of this kind of analysis in medicine here. Without analysis of results, medical reform is impossible. Likewise, without frequent objective testing, education reform is impossible. Charter schools such as KIPP have no fear of standardised testing, because it proves that their methods work. It is more likely to be the opponents of reform who attack testing, just as the opponents of Semmelweis’ medical reform questioned the validity of his data.
Moreover, national testing is needed, not just every few years, nor at the end of five years, as we now have in secondary school, but annually. Annual testing tracks whether pupils are ready to move to the next level. It holds schools accountable for every year of progress, so that every year must be taken seriously. Too often the early years of primary and secondary school are not taken seriously enough, because public examinations are such a distant prospect. Hirsch again:
Yearly testing is essential both to keep track of each student’s progress and to encourage teachers to cooperate in providing students with a coherent education in which each grade can build on the previous one (The Knowledge Deficit, p92).
The howls of protest at the proposal to repeat KS2 tests at the end of year seven show how popular annual tests would be with teachers. There is already much complaint about how teachers end up teaching to the test in year six, and in the years prior to GCSE exams. Teaching to the test is educationally damaging. But it happens not because the tests are frequent or particularly demanding; it happens because they are infrequent and undemanding, yet very fiddly.
The low frequency of national tests makes them seem like a big, scary thing, and so teachers focus on them excessively, and the teacher’s own anxiety often influences the pupils. It also places unfair burdens on the shoulders of the teachers who specialise in the years when testing is done. If national tests were an annual event, they would become much more of a normal part of school life, and would involve all teachers, at every level. No-one would be able to defer dealing with challenging material, leaving it to the following year when the pupil would not be their problem. This is particularly important at primary level, where the curse of developmental theory is keeping children at the intellectual and behavioural level of toddlers, and those who are keeping them there are not being held to account. Year six teachers are then left to pick up the pieces.
Current tests are undemanding in terms of breadth of knowledge, but fiddly in terms of technique. This is because of our British addiction to performative testing, where the assessment is supposed to mimic reality. Performative testing through methods such as essay writing (or, even worse, coursework) can only ever sample a very small part of a pupil’s knowledge. In contrast, well-designed multiple choice tests can reliably and quickly sample a much larger part of the knowledge domain. When the breadth of tests is sufficiently wide and the techniques for testing are not complex and burdensome, it becomes neither practical nor effective to teach to the test.
The marking of multiple choice tests is also reliable and objective, whereas essays are notoriously difficult to mark consistently, while coursework leads to outright cheating and inflated grades becoming the norm. Because of the difficulty of achieving consistent marking of something so complex and variable as an essay, exam boards end up depending much more on specific techniques and terminology. The techniques and terminology must then be drilled for the exam (or coursework essay) – the dreaded ‘box ticking’. Daisy Christodoulou notes how prevalent this is in an analysis of current GCSE history textbooks, which lack breadth of knowledge, but are packed with exam tips. Based on the content of these textbooks, ‘Clearly, to understand Germany from 1919-1945 it’s more important to meet the examiner than to meet Bismarck’ (Changing Schools, p49). Christodoulou goes on to point out that exams are too ‘tricky and technical’: ‘Many teachers feel forced into excessive teaching of exam technique because otherwise, candidates with a good grasp of the domain are penalised for not ticking the precise boxes on the mark scheme.’ (p51).
In his analysis of the pitfalls of essay marking, Hirsch examines the work of Princeton’s Educational Testing Service, and concludes that there is no guarantee of consistency beyond what has been agreed in one particular session by one particular group of examiners. Even after a laborious process of calibration (what is called ‘standardisation’ here in the UK), it was found that ‘after the lapse of a weekend’ further calibration was needed to keep markers in line. He comments:
In fact, one of the College Board’s reasons for instituting multiple-choice testing was the finding that the high-stakes grades given to a student’s performance-based test “might well depend more on which year he appeared for the examination, or on which person read his paper, than it would on what he had written.” (The Schools We Need, p184, quoting Godshalk, Swineford and Coffman, The Measurement of Writing Ability, 1966)
Achieving a Balance
For testing writing ability, Hirsch recommends a combination of an essay component with a generous multiple choice section. He suggests that an essay component should be included not because it increases the validity of the score significantly – researchers have found the improvement to be marginal – but simply because having some kind of essay component sends a message about the value of extended writing (see The Schools We Need p185-188).
In areas where there is no need for this message about extended writing, there is no reason why tests should not rely exclusively upon multiple choice questions. Whatever the specific decisions about including an essay element, the fact remains that there are methods already in existence for cheaply and reliably measuring academic progress in a way which precludes the possibility of teaching to the test. What is lacking is not the way, but the will.
At university level, demanding multiple choice tests are used to examine the knowledge of medical students. Here is a topic where it really matters whether you know your stuff, so they cannot risk the subjectivity and narrow focus of essay based exams. Nor can they worry about damaging the self-esteem of medical students by testing them too much. We have to ask ourselves whether it really matters to us whether school pupils know things. If it does, then we need proper annual testing to check that they do.
If the political will existed, we could soon determine in a clear and objective way which pupils had been dressing up in togas and talking about how it felt to be a Roman slave, and which ones had actually learned something about Roman history. Then parents could make a real, informed choice. If they believed schools should be teaching academic knowledge, they could select one with good results on annual national tests. If, on the other hand, they believed school was intended to be expensive babysitting and third rate entertainment, they could select one which just moaned about the tests and failed to prepare children properly for them by actually teaching them something.
The principle of objective annual national testing of a broad spectrum of knowledge fits with the principle of a specific, coherent national curriculum. Put together, these two strands of education reform could empower every pupil in our state schools with the knowledge that would give them access to literate culture, and provide a foundation for further learning throughout life.
As they await the coming of this millennial moment, though, it behoves those schools that do care about transferring empowering knowledge to every pupil to introduce their own system of a coherent curriculum matched with frequent, objective testing. There is no conflict between this approach and success in GCSE exams, because those who learn in this way will know far more than the curriculum requires, and formative assessment will of course continue to include extended writing where appropriate. Schools such as Michaela are shining beacons of the way forward, and the more they prove their effectiveness, the closer the millennium approaches.