Students and teachers are often disappointed with some or all of their grades, and this will always be so. But let us not console ourselves with that thought and dismiss the anxiety over grades as a temporary, COVID-driven problem requiring only an immediate, pragmatic solution.
I was, for several years in the early 2000s, a senior A level examiner. I set papers, wrote mark schemes and participated in grade reviews before grades were published.
I participated in meetings that manipulated mark schemes after students had completed papers but before they were marked, and also in the meetings that manipulated grade boundaries after marking. These manipulations had four aims:
- To try and achieve a normal (bell-shaped) distribution of the marks.
- To achieve a distribution of grades as similar as possible to that in previous years.
- To avoid favouring or disadvantaging some classes of students.
- To try and maintain a consistent standard of intellectual challenge from year to year.
These aims are entirely appropriate but can clash horribly despite best endeavours.
I can still feel the pressure as I marked and remarked the coursework of my own students, knowing that the marks would very substantially contribute to their grades and that my own reputation was at stake each year. The pressures on teachers passing judgement during this COVID year must have been horrendous.
I hope that the COVID grades fiasco will lead to a better understanding of the inevitable pitfalls of both teacher assessment and of public exams and tests. A proper balance between these is essential. The Tories have been so wrong to minimize coursework.
Grades should be abolished. How can we determine the futures of our children by erecting a series of artificial cliff edges where, for example, there is no real difference between a top D and a bottom C?
There are possible alternatives. We would do a huge public service if we chose a good alternative and campaigned for it. This might even compensate for tuition fees!
* I have been a long-time Liberal and Lib Dem voter, and a member of the European Movement, which I joined after the referendum
17 Comments
Mr Stevens, correct me if I am wrong, but things seem never to have been the same since norm referencing, which dominated the grades when I took my A levels back in the early 1960s, was abandoned some thirty or so years ago in favour of the more utopian criterion referencing. Back then we had what I think was called the Gaussian Function/Curve, whereby the percentage of grades awarded resembled a mountain, whose peak was around the middle. So, for example, there were far more Cs and Ds than, say, As or Bs or Es or Fs. What it meant was that the number of As and Bs awarded was limited, as were the Es and Fs. Mind you, back in the early 1960s we got percentages and not grades, and very few of us went on to Higher Education compared with today.
Now if you complete the task you get the grade and, in my subject, the task has changed radically. No wonder that we’ve had grade inflation over the years. I can only speak for the subject I taught for over thirty years, namely Modern Foreign Languages; but I would say that attainment in language proficiency in terms of grammar and vocabulary has actually declined over the years. Bizarrely, the ability actually to communicate in a MFL, particularly orally, has generally improved, although the sad decline in the numbers studying MFLs past GCSE reflects the inherent difficulty of the subject compared with many others.
Sorry the above is a bit tangential, Mr Lishman, but I felt the need to get it off my chest!
It is not rational to impose a pre-ordained frequency distribution on a set of measurements that have not been made yet. So in schools it is unreasonable to do anything other than tell children what they have to learn and test accordingly.
I appreciate that there exists an argument about whether to have criterion referenced or norm-referenced results. It is of course a false dichotomy. There should be a reason for a curriculum, and this should drive both content and methods of assessment.
I am fascinated by the idea that in some way intellectual challenge can be measured. I would love to know how a committee can decide that by looking at a question paper. With any degree of validity that is.
John – things have moved backwards, I’m afraid. When I sat my exams in the 1970s and started teaching in 1979, Norm Referencing was, as you say, the rule, with artificially fixed percentages of how many could achieve each grade nationally according to historical precedent/quotas.
Then Criterion referencing was introduced, whereby if a student did or didn’t reach the appropriate level in their work they did or didn’t get the grade, and artificial quotas were not imposed. I thought this was a vast improvement in all respects when I was a teacher, Head of Department and Head of Sixth Form over 22 years up to 2001.
Since then, however, there has been a return to Norm Referencing with its artificial quotas. Was it that Coalition to blame again? I can’t remember without looking it up. We don’t (as far as I know) say that only X% can pass their Driving Test each month/year and rig the pass rates to avoid going above or below that figure. If you have shown the appropriate skill levels and competencies you pass; if you don’t, you don’t. Why should we artificially depress or increase pass rates for academic subjects in order to hit pre-set quotas regardless of actual performance?
My wife, who went to Music college in the 1960s, had an ‘O’ level in Music at school and this was treated by the College as equivalent to a Grade 5 Theory pass (one of the entrance requirements). Nowadays the GCSE qualification equivalent to that ‘O’ level is treated as a Grade 3 Theory pass. The Music colleges, not being subject to the whims of educational theorists and politicians, have maintained their standards, while we can see the evidence of grade inflation in GCSE results.
The sad truth is that we have all been living with grade inflation for decades because no politician (or teacher) wants to be associated with a decline in educational standards. Not surprisingly, teaching to the test produces a seeming improvement in results, but does not imply increased understanding amongst students. The decision to remove the Universities from the Examination Boards (my own O and A level certificates bear the crests of the Universities of Manchester, Liverpool, Leeds, Sheffield and Birmingham in that order) and replace them with competing commercial companies was almost calculated to give a further twist to grade inflation, as it removed the Universities’ role in identifying students suitable for Higher Education.
@Laurence Cox “The decision to remove the Universities from the Examination Boards … and replace them with competing commercial companies was almost calculated to give a further twist to grade inflation”
To be fair, I don’t think universities are well-placed to lecture anyone else about grade inflation!
The days of a 1st being exceptional and a 2:1 being a great achievement seem to have passed. I’ve never quite been able to reconcile this with the supposed dumbing down and grade inflation of students going to university.
Paul Holmes: The news today said the percentage of A-level students receiving the highest grades has increased over last year. So I’m not sure how that squares with the idea that Norm Referencing is making a comeback. Anyway, I do agree that Norm Referencing is unfair, because of the zero-sum nature of grading in such a system.
Alex, I think you would agree that this year’s exam results are somewhat unusual, not least because, due to COVID, there have been no exams!
The SNP Govt had to backtrack hastily over the outcry when it applied a Norm-referenced ‘corrective’ to teachers’ grade predictions. The Westminster Government has trodden on eggshells in its attempts to avoid a similar outcry in England.
@ Paul. It would have been very interesting to see what Williamson would have come up with if England had declared first instead of Scotland.
Since the A level results came out I have been interested in comments like “our students worked hard and deserved better”. The problem I have as an ex-teacher is that hard work alone often isn’t enough. You need some ability as well to get the top grade. When students used to ask me at the end of KS3 or 4 whether they should consider taking any subject further, I used to tell them to ask themselves three questions about the subject: Do I like it? Am I any good at it? Will it be any use to me later? If they could answer yes to all three, they should definitely take it. Yes to any two and they should seriously consider taking it. Yes to only one of the questions and I would advise them against taking it.
It’s probably not scientific but why waste your time if you are clearly and demonstrably not cut out to do well in a subject that might appeal to you? As far as overall grades are concerned, they cannot be expected to continue to rise exponentially. The National IQ is not likely to change that much surely. Some students will do well, some will not. That’s why choosing realistically what you study is so important.
I think that Laurence Cox put it really well. Certainly, from my experience a Grade C in German GCSE was no sound basis to consider taking the subject at A level. It is no longer, if it ever was, equivalent to a GCE O level. Even today, going from GCSE to A level is like going from the old Vauxhall Conference to the Premier League in terms of standards.
Well, O level music was surely more about theory and less about performance than is the case with GCSE, so equivalence to a theory exam and ‘standards’ in general are two different things. (I took grade 5 theory; it was really little more than a memory test.) And I see that, under the heading ‘Action and Study’, Samuel Butler wrote: ‘These things are antagonistic. The composer is seldom a great theorist; the theorist is never a great composer. Each is equally fatal to and essential in the other’. Which of theory and execution is in the end more important in a performing art? And anyway, why are our discussions about education so often about grades at all, when they are no more than a crude shorthand notation, similar to index numbers in statistics and no more expressive? (Which is not to say there is not a genuine scandal about this year’s A level results.)
As to the universities and the examining boards, in the middle 1950s, long before GCSE was thought of, and when even comprehensive schools were no more than isolated local experiments, a new board, the Associated Examining Board, was set up which by design was not run by a university. The reason was that, while O level was designed for grammar school pupils, many secondary moderns were entering pupils for O level, and with success. It was hoped that the AEB would provide syllabuses more suited to them.
But do not let the flummery of a coat of arms let you estimate universities’ motives too highly. Though the AEB was not run by universities it did have academics advising it. One of these, Eric Laithwaite, a professor of engineering, was surprised to discover, when he took up the role, that all the boards, including the university ones, were commercial companies, as fearful as any other commercial companies of losing customers, and as consequently eager to trim to the caprices of their customers, the schools.
Like Mr Stevens, I was a principal examiner just over a decade ago. The application of algorithms to exam results is nothing new.
The exam boards found that there were too many assistant examiners whose scripts needed to be completely re-marked because their marking was unacceptably erratic. This was made worse when examiner standardisation meetings were abandoned in favour of online training of examiners: 100 examiners holed up in a London hotel for two days was expensive. In addition, many examiners were marking late at night after a day in the classroom, and it showed in their accuracy.
All these re-marks were expensive for the board, so they came up with a technical fix. If we applied all we knew about the student and the school to a student’s paper, we could give them a fair grade without the need to re-mark. That was the theory, but it relied on dangerous assumptions about the similarity between cohorts.
In short, exam boards have been using algorithms for some students for several years. This year they have been applied to all students.
The challenge posed by exams varies from subject to subject as shown by ALIS. I don’t believe it’s very profitable to try and compare things in the more remote past with the situation now, in the post Gove world (about which I know little).
Exams and tests have always been criterion referenced. Each learning outcome listed in a GCE or A level specification is a criterion that can be used to frame a question and prepare a mark scheme. The driving test example has never been relevant: if you make a serious error or omission in your driving test you fail. This does not happen with A levels.
We are never going to live in a world where it will be unnecessary to try to rank people. The question is, are they ranked fairly and in a way that improves overall standards and encourages creative and effective teaching and learning?
I do believe that injustice and other undesirable consequences arise if the need to produce a bell-shaped (normal) distribution of marks drives the whole process or if only a fixed percentage of candidates are allowed to pass.
For each subject, I would like to see a pass/satisfactory mark (e.g. 43%). All those with a mark equal to or above this mark should be given their percentile showing e.g. that the candidate was in the top 3% or top 56% of candidates taking the qualification that year. This would openly and fairly rank the candidates without the misleading distortion and injustice of the grade boundary. Establishing a pass mark is problematic enough.
Candidates who do not achieve the pass mark should not be ranked. Being a 98th percentile candidate would be worse psychologically than being a candidate graded U.
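The proposal above is mechanical enough to sketch in code. The following is a minimal illustration, not anything an exam board actually runs: the pass mark of 43% comes from the example in the text, while the function name and the sample cohort are invented for demonstration. Each passing candidate is labelled with the "top N%" band they fall into among that year's passing candidates; those below the pass mark are simply not ranked.

```python
# Sketch of the "pass mark plus percentile" reporting idea.
# Assumptions: pass_mark=43 (from the example in the text);
# candidate names and marks are illustrative only.

def percentile_labels(marks, pass_mark=43):
    """Return {candidate: label}, e.g. 'top 25%' or 'not passed'."""
    passed = {name: m for name, m in marks.items() if m >= pass_mark}
    n = len(passed)
    labels = {}
    for name, mark in marks.items():
        if mark < pass_mark:
            # Below the pass mark: no percentile, to avoid ranking failures.
            labels[name] = "not passed"
        else:
            # Rank = how many passing candidates scored at least this mark.
            rank = sum(1 for m in passed.values() if m >= mark)
            labels[name] = f"top {round(100 * rank / n)}%"
    return labels

cohort = {"Ann": 91, "Ben": 74, "Cai": 60, "Dee": 43, "Eli": 30}
print(percentile_labels(cohort))
# → {'Ann': 'top 25%', 'Ben': 'top 50%', 'Cai': 'top 75%',
#    'Dee': 'top 100%', 'Eli': 'not passed'}
```

Note that, unlike grade boundaries, there is no cliff edge here: two candidates one mark apart receive adjacent percentile labels rather than different grades.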
I do strongly believe that there are aspects of many subjects, eg practical work in sciences and oral work in languages, that can only be assessed by properly moderated coursework. It is a scandal that practical investigation is no longer assessed in Biology after Mr Gove’s efforts.
@Ianto Evans “It is a scandal that practical investigation is no longer assessed in Biology after Mr Gove’s efforts.”
I believe Mr Gove had a few orange-rosetted companions at the time!
I always dislike finding myself defending Michael Gove on this site (will need a shower afterwards!), but there were serious problems with the assessment of practical work in science at GCSE and A-level: e.g. grade boundaries were squashed very close together at the top and some schools cheated. It certainly needed reform. I don’t know the details of the reforms, but some sort of pass-fail certificate or endorsement which does not affect exam grades sounds reasonable.
I just want to add that I really appreciate this excellent article and the considered comments beneath it.
It’s great that people have avoided piling into the controversy raging elsewhere about exam grades, perhaps an acknowledgement of the fact that it’s an incredibly complex situation, there’s no easy or definitive version of “fairness”, and everybody involved (in government, in schools, and everywhere in between) is surely trying to do their best for the children and young adults affected. Some of the media coverage has made me despair.
I believe internally assessed coursework should be certificated alongside, but separately from, the exam board’s assessment. It should not be aggregated with external exam marks. This should be done in a way that allows the candidate (and employers, universities etc.) to easily see and reflect on how they have performed in both.
Not enough time, resources and strategic planning were put into coursework assessment in the past. Now that vast amounts of rich data (for example, all the drafts of an essay, with all the editing that has taken place each time a document has been saved) can be stored and changes made trackable, we have better ways of spotting the cheats.
There will always be some cheating. You don’t abolish driving because some people break the speed limit. There is an unfortunate truth that the most important things to assess are often the most difficult to pin down in a fair and transparent way.
@Ianto Stevens “There is an unfortunate truth that the most important things to assess are often the most difficult to pin down in a fair and transparent way.”
Very true.
Coursework is a vital opportunity for students to learn important skills such as research, teamwork, etc., but if it is not assessed then it is easy to imagine it being neglected in an effort to optimise easily-measured exam results.
The only thing I have ever done under exam conditions is exams!
I’m a now-retired university teacher of French and have attended many examination boards within my university over the years. In the 1970s, degrees were awarded on exam results only. The scripts were always marked according to the criteria laid down (like the driving tests referred to above), and at Honours level were double marked, so that the mark represented the agreed assessment of two examiners (moderated by an external if they could not agree).

At the exam boards to determine the classification of the degree, the run of marks was for most candidates clearly within one of the classes, in those days mostly II/2. But there were usually cases in which a run of marks did not unambiguously point to one class, and the question was then whether a candidate in a previous year had had a similar distribution of marks, and what grade that candidate had been awarded. Once a match was identified, that outcome was the class awarded to the candidate we were considering, which makes the process about precedent rather than norm referencing.

In the 1990s, the University decided to include coursework in each candidate profile, with the result that II/1s became much more frequent than II/2s. This does not reflect a change in the marking criteria, only that students under less time pressure produced better work.

The conclusion I drew was that awarding classes was a mistake, since it forced different performances into a single mould, and that it would be more helpful to future employers to have the profile of marks awarded, and the set of criteria the markers followed in awarding them. In particular, that would show who performed well under pressure, and who was capable of well-thought-out work. The two are not mutually exclusive. Does any of this apply to marking school pupils? I think it does, if grades are abolished, which would make them unavailable for assessing the performance of previous years and for funding schools.
The more difficult question is how to train examiners to apply the same criteria in the same way as fellow examiners. It is time-consuming to meet and talk through differences of opinion on any individual piece of work, but the attempt to generalise from an algorithm has not covered itself in glory this year, and norm referencing is unfair to individual students.