Delaying exams; why?

So, a research study has concluded that, due to Covid19, students may be 3 months behind in their studies. Delaying exams to allow students more time to catch up has also been discussed. This all seems like rather simplistic thinking.

There are for me a number of issues with delaying the exams.

The first is that we already accept that exams differ each year, and therefore there is already tinkering in place to adjust the grade boundaries to keep some consistency across academic years when looking at the statistical outcomes of students in general. This is why the results show small but steady changes year on year rather than being more volatile. It seems to me to be fairly easy to adjust this process to normalise next year's exam results should they be, as would be expected, lower than in previous years, and should it be important to maintain parity in results across different calendar years. This statistical fiddle would also be more acceptable than the algorithm proposed for the 2020 results, as it doesn't differ from the statistical adjustments applied to GCSE and A-Level results in 2019, 2018, etc.
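To make the idea concrete, here is a minimal sketch of how boundaries could be set so that the same share of students reaches each grade regardless of how hard the paper was. It is an illustration only and not how any exam board actually does it; the function name, the marks and the target shares are all made up for the example.

```python
def boundaries_for_targets(marks, target_shares):
    """Return the lowest mark that earns each grade, chosen so that roughly
    the target share of the cohort achieves that grade or better.

    target_shares is a list of (grade, cumulative share at or above that
    grade), best grade first. All values here are hypothetical.
    """
    ranked = sorted(marks, reverse=True)
    boundaries = {}
    for grade, cumulative_share in target_shares:
        # Number of students who should reach this grade or better.
        count = max(1, round(cumulative_share * len(ranked)))
        boundaries[grade] = ranked[count - 1]
    return boundaries


# Hypothetical cohort in a typical year, and the same cohort in a disrupted
# year where every student scores 8 marks lower.
typical_year = [82, 75, 71, 68, 64, 60, 55, 51, 47, 40]
disrupted_year = [mark - 8 for mark in typical_year]

# Assumed targets: 20% get an A, 50% a B or better, 80% a C or better.
targets = [("A", 0.2), ("B", 0.5), ("C", 0.8)]

print(boundaries_for_targets(typical_year, targets))
print(boundaries_for_targets(disrupted_year, targets))
```

In this toy case each boundary simply drops by 8 marks in the disrupted year, so the same proportion of students ends up with each grade even though the raw marks are lower.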

Another issue, if we were to delay the exams, is that it simply knocks on to following years. Delaying the GCSE exams would mean teachers lose some of the teaching time they would likely use to start A-Level studies, or to start Year 13 teaching of A-Level subjects following Year 12 exams. As such it doesn't solve the issue, but rather displaces it. Is the focus not on learning rather than measuring learning? If so, how can any solution with a knock-on to teaching and learning be acceptable?

Also, the point students should be at by the end of each academic year has been arbitrarily determined. At some point the curriculum for each subject was developed and the content decided for each year or stage; however, it could easily have been decided that more or less content be included. Why, therefore, is the point students should be at perceived to be so immovable? Why not simply reduce content for the year based on the reduced time available to students? Surely this is an alternative option.

There is also the point that next year's results will be compared with this year's results, where it has already been reported that this year's results were significantly up. This obviously resulted from the use of centre assessed grades, provided by teachers, without any of the normal annual statistical manipulation of grade boundaries. This comparison is unavoidable. So, despite any delay, etc, there is still a high likelihood of negative reporting in the press with regard to the 2021 results, with knock-ons in terms of students and parents being disappointed.

This brings us nicely to the big question I have seen a number of people ask, which is: 3 months behind who or what? Is it 3 months behind where teachers think students would be had Covid19 not arisen? A prediction based on a prediction doesn't provide me with much confidence as to its statistical reliability. Is it 3 months behind in terms of curriculum content covered, at the predicted rate at which content is covered? Again this suffers given it relies on a predicted rate of coverage of materials. Plus, could the content possibly be covered at a faster rate but in less depth?

Maybe this issue is an opportunity to reassess our assumptions and to question our current approach to education and how it is assessed. Or are we simply going to accept that this is the way things are done around here, and that any changes should be limited and aimed only at maintaining the status quo? I believe we have reached a fork in the road; however, I worry that we may look to take the route which looks easier.


Research based education

There has been a lot of talk over recent months and years about the importance of “research” based practice in teaching and about the importance of research evidence to back up any new technique, approach or fad. The recent articles following the release of the TIMSS results, and the articles which are likely to follow the PISA results due in a week's time, go to show the value which is being attributed to research findings and to quantifiable measures.

The issue is that the idea of a given approach or finding being validated by research makes intuitive sense, and therefore it seems logical, if not common sense, that such an approach be taken. As such we fail to consider the full implications of research and in particular the importance of sample size within the research methodology.

We seek to identify approaches which will be transferable and applicable across the whole of education. We seek to find those magical teaching methods and learning activities that can successfully be used independent of whether we are in a UK state school in a deprived area or a private school in the UAE. We seek to make general statements about the state of Maths education, or other subjects, in whole countries or even continents. The sum total of all children currently in education therefore forms our overall target population. Based on this, any study of 10 schools or even 100 schools makes up a tiny, need I say insignificant, proportion of the overall target population. Taken at face value the sample size of 600,000 students for TIMSS 2015 sounds impressive; however, as a percentage of all students within the age ranges covered by TIMSS across all countries involved, I suspect it will be a small number.

Daniel Kahneman, in his book Thinking, Fast and Slow (2011), discusses the issue of “the law of small numbers”: where the sample size is small there is a greater tendency for variance to occur. He specifically mentions education and how research evidence has suggested, and I am careful to say suggested as opposed to proved, that small schools perform better than larger schools. He then mentions contradictory evidence which suggests small schools perform worse. The reasoning behind these contradictory findings, Kahneman suggests, is that the small sample size in a small school involved in these studies allows for local variance within the sample which is not mirrored across the target population. So a small number of high achieving students in one year can result in a significantly positive average, whereas the following year a small number of low achieving students can result in a significantly negative average. Where the sample size is bigger, such as in a bigger school, the impact of a small number of students is lessened by the total number of students. So there is a greater likelihood for small schools, those with a small sample size, to appear at either the top or the bottom as a result of random variation.
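A quick simulation helps show the effect. The sketch below is my own illustration rather than anything from Kahneman's book: it assumes every pupil's score is drawn from the same distribution (mean 50, standard deviation 10), so no school is genuinely better than any other, and the school sizes of 20 and 500 pupils are arbitrary.

```python
import random
import statistics


def school_averages(num_schools, pupils_per_school, rng):
    """Average score per school when every pupil, in every school,
    is drawn from the same assumed score distribution."""
    averages = []
    for _ in range(num_schools):
        scores = [rng.gauss(50, 10) for _ in range(pupils_per_school)]
        averages.append(statistics.mean(scores))
    return averages


rng = random.Random(2015)  # fixed seed so the run is repeatable
small_schools = school_averages(200, 20, rng)
large_schools = school_averages(200, 500, rng)

# Spread of the school averages: small schools vary far more.
print("spread of small-school averages:", statistics.pstdev(small_schools))
print("spread of large-school averages:", statistics.pstdev(large_schools))

# Rank all 400 schools together and see who fills the top 20 places.
ranked = sorted(
    [(avg, "small") for avg in small_schools]
    + [(avg, "large") for avg in large_schools],
    reverse=True,
)
top_20_small = sum(1 for _, size in ranked[:20] if size == "small")
print("small schools in the top 20:", top_20_small)
```

Because the only difference between the two groups is the number of pupils, any dominance of small schools at the top (or the bottom) of the ranking here is down to random variation alone.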

Taking the above into account, I wonder about TIMSS 2015 and the fact that Singapore and Hong Kong are both at the top. According to Google, these have total populations of 5.4 and 7.2 million people respectively. How can we compare these with the UK and USA, with populations of 64 and 319 million people? The smaller sample size allows for more random variation. Now it might be claimed that the fact they have remained at the top across different years shows this isn't random variation; however, as Nassim Taleb suggests in The Black Swan, it only takes a single set of data to refute findings which countless previous data might have appeared to confirm. TIMSS so far has only seen 6 data sets, 1 every 4 years since 1995, so maybe the next TIMSS data will be the one which provides the Black Swan.

Having given this some thought, I wonder if the issue is the viewpoint we are taking, which is one of education on a macro level. Maybe the intuitive pursuit of research based practices is as valid and worthwhile as it feels, and the problem lies in trying to look holistically. Looking at practices in our own school, or in a small number of local or very similar schools, and at the things, practices and approaches that work may be more productive. We could still use a research based approach; however, it would be at a micro rather than macro level. I can also see some linkages here to the TeachMeet movement, as surely it has been about grassroots teachers getting together to discuss their approaches and what works in their classrooms.

Maybe we need to stop looking for “the” answers and start focusing our energy on looking for “our” answers to the question of how we provide the students in our individual schools with the best learning experience and opportunities possible.


Class sizes

This morning, before walking out the front door, I saw someone on a BBC morning programme suggesting that their political party's contribution to the education sector was a reduction in class sizes.

I find it interesting that class sizes continue to be considered a measure of how good a school or an education system is. In the case of the comments on the BBC, the person making them was equating a reduction in class size with an improvement in the quality of education.

Hanushek (1998) suggested that the linkage between smaller class sizes and improved student results was “generally erroneous”. Kahneman (2011) went further in suggesting that the facts associated with such a claim were “wrong”.

Kahneman’s explanation (2011) was that the reason for the findings relates to statistics and what he refers to as the “law of small numbers”. A small class is made up of a smaller number of students, which therefore results in higher levels of variability in the average. He uses an example of drawing coloured marbles from a jar to demonstrate this. Consider randomly picking marbles from a jar containing red and blue marbles. There is a higher probability of drawing out 3 of a colour (3 high achievers in a small class) than of drawing out 6 of a colour (6 high achievers, the equivalent, but in a bigger class).
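The marble arithmetic can be worked through directly. This is a minimal sketch under my own assumptions: the jar is half red and half blue, and it is large enough (or marbles are replaced) that each draw is independent. I am also reading "3 of a colour" as every marble drawn being the same colour.

```python
def prob_all_same_colour(draws, share_red=0.5):
    """Probability that every marble drawn is the same colour, assuming
    independent draws from a jar with the given share of red marbles."""
    share_blue = 1 - share_red
    # Either every draw is red, or every draw is blue.
    return share_red ** draws + share_blue ** draws


print(prob_all_same_colour(3))  # 0.25
print(prob_all_same_colour(6))  # 0.03125
```

So under these assumptions an all-one-colour result turns up one time in four with 3 draws, but only about one time in thirty-two with 6 draws: the extreme outcome is eight times more likely in the smaller sample.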


Within a larger class size there is a greater tendency towards regression to the mean and therefore a more stable and less variable average across schools.

The association of improved results with more teacher time, more support, etc, resulting from smaller class sizes is therefore unfounded. The improved results in schools with smaller class sizes are simply a feature of the statistical analysis of small sample sizes. Kahneman suggests that if the researchers were to change their question and look at whether poor results could be linked to small classes, they would find this to be equally true.

My feeling on this is that generally class size doesn’t have a significant impact on student results within lower and upper limits.    Where the ratio is 1 teacher to 2 or 3 students I would expect to see a positive impact and equally at 50+ students I would expect to see a negative impact.   Within the larger range between 5 and 50 I would expect the impact to be minimal if evident at all.

Care needs to be taken with the use of statistics and care has to be taken in believing them.   As Kahneman explains, it is easy to create a causal explanation for why a given set of statistics such as those on class size make sense.   The ease with which a causal explanation comes to mind however doesn’t necessarily make the explanation and resulting judgement true.


Hanushek, E. A. (1998), The evidence on class size, W. Allen Wallis Institute of Political Economy

Kahneman, D. (2011), Thinking, fast and slow, Penguin Books



Inconsistency in the quality of teaching

I have come across the above statement or similar statements across schools both here and in the Middle East. At first reading I would suggest that everyone, myself included, will take this to be a negative comment. On reflection I am not so sure it necessarily is negative, or in fact that it tells us anything.

Consider the “average” school, and let's take student outcomes as the measure of quality of teaching. Now I know this is a very limited model; however, it will hopefully serve its purpose in proving a point which could equally be proven using a different measure for the quality of teaching.

Within this “average” school there will be some above average teachers whose outcomes come out as very positive. There will also be those who come out as below average. Would this be considered consistent, given that having different qualities of teaching would clearly suggest inconsistency?

Let's assume that what is meant by inconsistency is an inconsistency when compared with the national profile for the quality of teaching within any given school. In this case our average school now becomes consistent in terms of quality of teaching. Consistency is therefore referring to the distribution of individuals within the school with regard to the quality of teaching, and how this compares to other schools.

Modifying the scenario a little, let's say that the performance of some of the so-called “weaker” teachers only gets worse while the stronger teachers only get better. Our average still remains the same; however, is the school any more or less consistent, given the wider variance between teachers and given the difference between this profile and the profile of the “average” school?
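A tiny worked example shows how the average can stay put while the spread widens. The outcome scores below are invented purely for illustration, one per teacher.

```python
import statistics

# Hypothetical outcome scores for five teachers (made-up numbers).
before = [40, 45, 50, 55, 60]
# Weaker teachers get worse, stronger teachers get better.
after = [30, 40, 50, 60, 70]

print("mean before:", statistics.mean(before), "after:", statistics.mean(after))
print("spread before:", statistics.pstdev(before), "after:", statistics.pstdev(after))
```

Both sets average 50, yet the spread between teachers doubles, so a judgement based on the average alone would see no change at all.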

If some of the teachers formerly within the “average” band improve this would shift the average and change the distribution.   Is this inconsistency and if so could it not be viewed as a positive inconsistency?

Now I was considering using some further examples; however, I have decided not to. Instead I will point out my belief that teaching is a social activity involving a class full of students and a teacher, all interacting. Given it is a social activity involving 30 or more human beings, and is therefore influenced and affected by a multitude of different dynamic variables, consistency is highly unlikely. Teaching is very much like the systems described by chaos theory in that it is highly sensitive to its conditions, which are frequently changing. As such, how could any school be expected to demonstrate consistency? As in chaos theory, we can only possibly perceive a pattern by looking at the much wider picture, as under close inspection we see nothing except the variability and the differences. How might an inspection team or an internal mock-sted see this big picture? I doubt they would, so how can a judgment indicating an inconsistency be arrived at?

And maybe something different, unique or not fitting in with the usual run of play may be a positive thing.   So maybe consistency isn’t all it’s meant to be!
