Time to stop adjusting grades/grade boundaries?

If using an algorithm to adjust marks is unfair, as it has been deemed to be this year, then surely this practice must cease going forward.

The last few weeks have been filled with issues surrounding exam results. One of these was how A-Level results were adjusted from centre assessed grades using a statistical algorithm. This was deemed to be unfair as it penalised some students, or groups of students, more than others. The lack of equity was clearly evident because schools could compare their centre assessed grades with the grades finally awarded. It was therefore evident how the statistical adjustment, carried out in the interests of keeping results broadly in line with previous years' results, impacted on individual students. The faces and lives of individual students could be attached to the grade adjustments. This was deemed unacceptable.

My worry here is that this kind of statistical adjustment has always gone on. Normally students would sit exams, with the resulting scores adjusted through changes to the grade boundaries. Again, this was done in the interests of keeping results broadly in line with previous years' results, and again some groups of students would likely be penalised more than others. The grade boundaries changed because the exam was deemed generally easier or harder. The focus on the difficulty of the exam meant that we seldom associated the resulting grade changes with individual students; we don't generally attach faces to this change, yet some students would have received lower grades than they would have had the adjustment not been carried out, just as happened this year. This seemed acceptable, and has been the way things have been done for decades, but I don't see how it is any fairer than what happened this year.
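To make that mechanism concrete, here is a deliberately simplified sketch of how boundaries set from percentiles keep grade proportions in line with a previous year. The marks and grade shares are invented for illustration; this is not any exam board's actual process.

```python
# Simplified illustration: set this year's grade boundaries at whatever raw
# marks reproduce last year's grade proportions.  All numbers are invented.
import random

random.seed(0)
raw_marks = [random.gauss(60, 12) for _ in range(1000)]   # this year's raw marks

# Suppose last year 20% of entries got an A and a further 30% a B (invented shares).
share_a, share_b = 0.20, 0.30

ordered = sorted(raw_marks, reverse=True)
a_boundary = ordered[int(share_a * len(ordered)) - 1]               # mark of the last A
b_boundary = ordered[int((share_a + share_b) * len(ordered)) - 1]   # mark of the last B

print(f"A boundary: {a_boundary:.1f}, B boundary: {b_boundary:.1f}")
# If this year's paper was harder and raw marks dipped, the boundaries dip with
# them -- individual students' grades shift even though nothing about the
# students themselves has changed.
```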

Maybe, following this year's issues, we need to take another look at how we assess and measure students' learning and achievement, including the associated processes.

PISA: A balanced analysis?

So the PISA results were released today, and with them a flurry of online articles offering various analyses and conclusions drawn from the data. It is my intention to post a couple of times over the coming weeks in relation to PISA and standardized testing. For this first post, my inspiration comes from a Twitter discussion over the weekend, as part of the weekly #sltchat, where recruitment was being discussed. The tweet below captures the particular strand of the chat I would like to focus on:

This strand of the discussion revolved around the constant need to bang on about how UK education is failing, is poor, isn't working, and a variety of other less than positive descriptions.

So what does this have to do with PISA? Well, during my usual browse through social media and the news I came across an article in the Guardian looking at the PISA results. You can read the article in full here. The title:

“UK schools fail to climb international league table”

The use of “fail” in the article title is not exactly a positive start. It becomes more interesting when you dig around in some of the figures. Let's just take the science results. They show a fall from 514 in 2012 down to 509 this year, which seems to align with the less than positive reporting; however, this doesn't tell the full story. It should also be noted that a drop of 5 represents less than a 1% variation. Could this variation be explained by uncontrollable random variation within the sample group? Is this drop statistically significant? I doubt it.
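As a rough illustration of the significance question, here is a back-of-the-envelope check. The per-country standard errors used below are purely illustrative assumptions, not figures taken from the PISA reports (which publish the real ones).

```python
# Could a 5-point drop sit within ordinary sampling variation?
# The standard errors are illustrative assumptions only.
from math import sqrt
from statistics import NormalDist

score_2012, score_2015 = 514, 509
se_2012, se_2015 = 2.5, 2.5           # assumed standard errors of the country means

diff = score_2015 - score_2012         # -5 points, i.e. under 1% of the score
z = diff / sqrt(se_2012**2 + se_2015**2)
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided test

print(f"change = {diff}, z = {z:.2f}, p = {p_value:.2f}")
# With these assumed standard errors, p comes out well above 0.05, so a drop
# of this size would not be statistically significant.
```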

Ignoring the issue of statistical significance for a moment, the UK's science position in the rankings rose 6 places between 2012 and 2015, which paints a slightly more positive picture. Looking at the average across all countries, we find that it fell from 501 to 493, a drop of 8, yet the UK only dropped by 5. This difference could be read as an improvement against the average across the period. We should also note that the UK score of 509 is above the average, which again sounds reasonably positive. A US article on the TIMSS data from last week merrily proclaimed that US students were “above average”, yet this negative article makes no such claim for the UK despite the fact that it would be valid. The article was quick to point out falling UK results but didn't report the changes in the OECD averages across the period for comparison.
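Putting those same figures side by side makes the relative picture clearer:

```python
# The scores quoted above, read relative to the cross-country average
# rather than in isolation.
uk = {"2012": 514, "2015": 509}
avg = {"2012": 501, "2015": 493}

uk_change = uk["2015"] - uk["2012"]        # -5
avg_change = avg["2015"] - avg["2012"]     # -8
relative_change = uk_change - avg_change   # +3: the UK fell by less than the average
gap_above_avg = uk["2015"] - avg["2015"]   # +16 points above the 2015 average

print(uk_change, avg_change, relative_change, gap_above_avg)
```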

The article also didn't share any information regarding the number of students tested in each country and how that sample compares to the overall student population of each country. From a statistical point of view this information would help in establishing the reliability, or lack thereof, of the data.
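For a sense of why those numbers matter, here is a simple sketch of how the uncertainty on a mean score shrinks with sample size. The spread of 100 score points is an assumption used purely for illustration, and the calculation treats the sample as a simple random one, which large-scale assessments are not.

```python
# Why sample sizes matter for reliability: the uncertainty on a country's mean
# score shrinks with the square root of the sample size (illustrative figures).
from math import sqrt

assumed_sd = 100          # assumed spread of individual student scores
for n in (500, 5_000, 50_000):
    se = assumed_sd / sqrt(n)          # simple-random-sample approximation
    print(f"n = {n:>6}: standard error of the mean ~ {se:.1f} points")
# A small sample leaves the mean uncertain by several points -- comparable to
# the 5-point drop discussed above -- whereas a large one pins it down tightly.
```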

So, all in all, I feel the negativity of the article doesn't tell the whole story, and there is a lot of missing information which might cause us to question, or at least assign less weight to, its findings.

And all of the above is before I start discussing the issue of using standardized testing to direct how individual students are taught in individual schools within individual geographical areas, each with their own individual needs and context (did I use “individual” enough to get my point across?). Not to mention possible discussions around the statistical value of the findings and the impact of natural random variation on the results.

Do I like or value the PISA findings? No, not really, but that isn't the point here. My point, and I may have gone the long way about it, is: why are we allowing such a negative view to be projected onto our education system when even the data seems to contain some possible positive indicators? Let's have some celebration of successes for once, of first steps in a positive direction. Let's have anything except finger pointing!

Research-based education

There has been a lot of talk over recent months and years about the importance of “research-based” practice in teaching, and about the importance of research evidence to back up any new technique, approach or fad. The recent articles following the release of the TIMSS results, and the articles which are likely to follow the PISA results due in a week's time, go to show the value being attributed to research findings and to quantifiable measures.

The issue is that the idea of a given approach or finding being validated by research makes intuitive sense, and therefore it seems logical, if not common sense, that such an approach be taken. As a result we fail to consider the full implications of the research, and in particular the importance of sample size within the research methodology.

We seek to identify approaches which will be transferable and applicable across the whole of education. We seek to find those magical teaching methods and learning activities that can be used successfully whether we are in a UK state school in a deprived area or a private school in the UAE. We seek to make general statements about the state of maths education, or other subjects, in whole countries or even continents. The sum total of all children currently in education therefore forms our overall target population. On that basis, any study of 10 schools or even 100 schools makes up a tiny, need I say insignificant, proportion of the overall target population. Taken at face value, the sample size of 600,000 students for TIMSS 2015 sounds impressive; however, as a percentage of all students within the age ranges covered by TIMSS across all the countries involved, I suspect it is a small number.
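As a quick, purely illustrative sense of scale (the cohort total below is a hypothetical figure, not an official statistic):

```python
# The 600,000 figure is the TIMSS 2015 sample quoted above; the cohort total
# is a hypothetical, illustrative number only.
timss_sample = 600_000
hypothetical_cohort_total = 50_000_000   # assumed students in the covered age ranges

share = timss_sample / hypothetical_cohort_total
print(f"{share:.1%} of the hypothetical target population")   # ~1.2%
```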

Daniel Kahneman, in his book Thinking, Fast and Slow (2014), discusses “the law of small numbers”: where the sample size is small, there is a greater tendency for extreme variation to occur. He specifically mentions education and how research evidence has suggested, and I am careful to say suggested as opposed to proved, that small schools perform better than large schools. He then mentions contradictory evidence which suggests small schools perform worse. The reason for these contradictory findings, Kahneman suggests, is that the small sample size a small school represents allows for local variation within the sample which is not mirrored across the target population. So a small number of high-achieving students in one year can produce a markedly positive average, whereas the following year a small number of low-achieving students can produce a markedly negative average. Where the sample size is bigger, as in a bigger school, the impact of a small number of students is diluted by the total number of students. So there is a greater likelihood of small schools, those with a small sample size, appearing at either the top or the bottom purely as a result of random variation.
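Kahneman's point is easy to demonstrate with a quick simulation using invented numbers. Every school below draws its students from exactly the same ability distribution, so any differences between school averages are pure sampling noise.

```python
# Law of small numbers: small schools dominate both tails of a ranking even
# when all students come from the same distribution.  Parameters are invented.
import random

random.seed(1)

def school_average(n_students):
    # Every student drawn from the same distribution (mean 50, sd 15).
    return sum(random.gauss(50, 15) for _ in range(n_students)) / n_students

schools = ([("small", school_average(30)) for _ in range(500)]
           + [("large", school_average(1000)) for _ in range(500)])

schools.sort(key=lambda s: s[1], reverse=True)
top50, bottom50 = schools[:50], schools[-50:]

print("small schools in top 50:   ", sum(1 for kind, _ in top50 if kind == "small"))
print("small schools in bottom 50:", sum(1 for kind, _ in bottom50 if kind == "small"))
# Despite identical underlying populations, the small schools fill both the
# top and the bottom of the ranking, purely through random variation.
```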

Taking the above into account, I wonder about TIMSS 2015 and the fact that Singapore and Hong Kong are both at the top. These have total populations, according to Google, of 5.4 and 7.2 million people respectively. How can we compare these with the UK and the USA, with populations of 64 and 319 million people? The smaller sample size allows for more random variation. Now it might be claimed that the fact they have remained at the top across different years shows this isn't random variation; however, as Nassim Taleb suggests in The Black Swan, it only takes a single set of data to refute findings which countless previous data points might have appeared to confirm. TIMSS has so far produced only 6 data sets, 1 every 4 years since 1995, so maybe the next TIMSS data set will be the one which provides the Black Swan.

Having given this some thought, I wonder if the issue is the viewpoint we are taking, which is one of education at a macro level. Maybe the intuitive pursuit of research-based practices is as valid and worthwhile as it feels, and the problem lies in trying to look at things holistically. Looking at practices in our own school, or in a small number of local or very similar schools, and at the practices and approaches that work there, may be more productive. We could still use a research-based approach; it would simply operate at a micro rather than a macro level. I can also see some links here to the TeachMeet movement, as surely it has been about grassroots teachers getting together to discuss their approaches and what works in their classrooms.

Maybe we need to stop looking for “the” answers and start focusing our energy on looking for “our” answers to the question of how we provide the students in our individual schools with the best learning experience and opportunities possible.