School Data: The tip of an iceberg

Schools gather a wealth of data in their everyday operation: attendance information, academic achievement, library book loans, free school meals and a wide range of other data. We use this data regularly; however, I think we are missing out on many opportunities which this wealth of data might provide.

The key for me lies in statistical analysis of the data, looking for correlations. Is there a link, for example, between the amount of reading a student does, as measured by the number of library loans, and their academic performance? Are there any indicators which might help us in identifying students who are more likely to underperform?

The issue here is how the data is stored. A large amount of the data is stored in tables within our school management system; however, no easy way exists to pull different data together in order to search for correlations. I can pull out data showing which students have done well, which subjects students perform well in, etc.; however, I can’t easily cross-link this with other information such as the distance the student travels to school or their month of birth. Some of the data may exist in separate systems, such as a separate library management system, print management system and catering system. This makes it even more difficult to pull data together.
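As a rough illustration of what pulling data together could look like, here is a minimal Python sketch. It assumes each system can export rows carrying a shared student identifier; the field names (`student_id`, `average_grade`, `loans`, `free_school_meals`) and all of the values are hypothetical and not taken from any real system.

```python
from collections import defaultdict

# Hypothetical exports from three separate systems, each keyed by student ID.
mis_records = [
    {"student_id": "S001", "average_grade": 7.2},
    {"student_id": "S002", "average_grade": 5.1},
    {"student_id": "S003", "average_grade": 6.4},
]
library_records = [
    {"student_id": "S001", "loans": 34},
    {"student_id": "S003", "loans": 12},
]
catering_records = [
    {"student_id": "S002", "free_school_meals": True},
]


def combine_by_student(*sources):
    """Merge rows from several sources into one record per student ID."""
    combined = defaultdict(dict)
    for source in sources:
        for row in source:
            combined[row["student_id"]].update(row)
    return dict(combined)


students = combine_by_student(mis_records, library_records, catering_records)
for student_id, record in sorted(students.items()):
    print(student_id, record)
```

A student missing from one system simply lacks that field in the combined record, which is itself worth knowing before any analysis is attempted.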

A further issue is that the data in its raw format may not make it easy for correlations to be identified. A student’s postcode, for example, is not that useful in identifying correlations; however, if we convert this to a distance from the school we have a better chance of identifying a correlation.
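The postcode conversion could be sketched as below. This assumes a lookup from postcode to latitude and longitude is available from somewhere; the postcodes, coordinates and school location here are invented for illustration. It also gives the straight-line distance using the haversine formula, which is only an approximation of how far a student actually travels.

```python
import math

# Hypothetical lookup: postcode -> (latitude, longitude) in degrees.
# A real lookup would come from a postcode dataset; these values are made up.
POSTCODE_COORDS = {
    "AB1 2CD": (51.5000, -0.1000),
    "EF3 4GH": (51.5500, -0.1500),
}
SCHOOL_COORDS = (51.5200, -0.1200)  # assumed school location

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_to_school_km(postcode):
    """Straight-line distance from a known postcode to the school."""
    return haversine_km(POSTCODE_COORDS[postcode], SCHOOL_COORDS)


for postcode in POSTCODE_COORDS:
    print(postcode, round(distance_to_school_km(postcode), 2), "km")
```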

In schools we continue to be sat on an iceberg’s worth of data, although all we can perceive is that which lies above the water. We perceive a limited set of possibilities in terms of what we can do with the data: analysing it in terms of pupil performance against baselines, with filtering possible by gender, SEN status and a few other flags. Given the wealth of data we have, however, this is just the start of what is possible. We just need to be able to look below the water, as the potential to use the data better and more frequently is there, and in doing so we may be able to identify better approaches and more effective early interventions to ensure the students in our care achieve the best possible outcomes.


Data: Making better use?

One of the areas which I want to work on over the next year is Management Information. In my school, as in almost all schools, we have a Management Information System (MIS), sometimes referred to as a SIS (School or Student Information System). This system stores a large amount of student data, including information on their performance as measured by assessments or by teacher professional judgement. We also have data either coming from or stored in other data sources, such as GL or CEM in relation to baseline data. These represent the tip of the iceberg in terms of the data stored by, or at least available to, schools and their staff.

Using the data we then generate reports which do basic summaries or analysis based on identified factors such as the gender of students, whether they are second language learners of English, etc. Generally these reports are limited in that they consider only a single factor at a time, as opposed to allowing for analysis of compound factors. So gender might be considered in one report and then age in another, but not gender and age simultaneously. In addition, the reports are generally presented in a tabular format, with rows and columns of numeric values which require some effort to interpret. You can’t just look at a tabular report and make a quick judgement; instead you need to exercise some mental effort in examining the various figures, considering them and then drawing a conclusion.
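The difference between single-factor and compound-factor reporting can be shown in a short Python sketch. The records, field names and scores below are made up purely for illustration, and `mean_score_by` is a hypothetical helper, not a feature of any MIS.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration only.
students = [
    {"gender": "F", "year": 10, "score": 68},
    {"gender": "F", "year": 10, "score": 74},
    {"gender": "F", "year": 11, "score": 59},
    {"gender": "M", "year": 10, "score": 61},
    {"gender": "M", "year": 11, "score": 70},
    {"gender": "M", "year": 11, "score": 66},
]


def mean_score_by(records, *factors):
    """Average score for every combination of the given factors."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[factor] for factor in factors)
        groups[key].append(record["score"])
    return {key: mean(scores) for key, scores in groups.items()}


# One factor at a time, as in a typical fixed report.
by_gender = mean_score_by(students, "gender")

# Two factors together: the compound view.
by_gender_and_year = mean_score_by(students, "gender", "year")

print(by_gender)
print(by_gender_and_year)
```

In this invented data the single-factor view shows little difference between genders, while the compound view shows the picture differing by year group, which is the kind of detail a one-factor report hides.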

My focus is on how we can make all the data we have more useful and more usable. Can we allow staff to explore the data in an easier way, allowing for compound factors to be examined? Can we create reports which present data in a form from which a hypothesis can be quickly drawn? Can the data be made to be live and dynamic, as opposed to fixed into the form of predetermined “analysis” reports? Can we adopt a broader view of what data we have, and therefore gather and make greater use of a broader dataset?

I do at this point raise a note of caution.   We aren’t talking about doing more work in terms of gathering more data to do more analysis.  No, we are talking about allowing for the data we already have to be better used and therefore better inform decision making.

I look forward to discussing data on Saturday as part of #EdChatMeda. It may be that after this I will be better able to answer the above questions.

A-Level results and football: Another enlightening analysis

Now that the A-Level and GCSE results are out, the usual sets of analysis and observations based on the data have started making an appearance. As usual, causal explanations have been developed to explain the data, using what Nassim Taleb described as the backwards process. The resulting judgments have been established to fit the available data without any consideration for the data which is not available.

The perfect example is an article in the Guardian (Wales A-Level results raise concerns pupils falling behind rest of UK, Richard Adams, Aug 2016) discussing the A-Level results in Wales as compared with the results in England. The percentage of entries achieving A* and A dropped in England “only slightly to 25.8%” while in Wales it “fell more steeply to 22.7%”. The causal explanation apparently arrived at by one “expert” was that boys had been “possibly distracted by the national football team’s success at Euro 2016”. This fails to consider the total number of entries in England when compared with Wales; I suspect Wales would have fewer entries, resulting in greater variability in Welsh results than in English results. The data also fails to include any information in relation to the students’ GCSE results. Had the Welsh students achieved lower GCSE results than their English counterparts, it may be that their overall lower level of achievement at A-Level could amount to “better” results given their lower starting point as measured by GCSEs.
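The point about entry numbers can be made concrete with the standard error of a proportion, sqrt(p(1 - p) / n). The Python sketch below is only indicative: the entry counts are hypothetical figures chosen for illustration, not the real numbers for England or Wales, and it treats each entry as an independent random draw, which is a simplification.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p observed across n independent entries."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical entry counts, for illustration only (not real figures).
entries = {"larger cohort": 800_000, "smaller cohort": 35_000}
p = 0.25  # roughly a quarter of entries at the top grades

for label, n in entries.items():
    se = standard_error(p, n)
    print(f"{label}: n={n}, standard error of about {se * 100:.3f} percentage points")
```

Whatever the real counts are, the smaller cohort will show the larger year-to-year wobble from chance alone, so a steeper fall there is weaker evidence of any single cause.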

Another possible conclusion, which is easy for me to draw as a Scotsman and most likely more difficult for an Englishman, is that the data shows something which wasn’t related to the Welsh football performance at all.    The English A-Level results could be better due to English students throwing themselves into their work following England’s poor showing during Euro 2016.  It’s the same data but a different conclusion which has been generated and made to fit the data available without any consideration for the data which isn’t available.

Having considered this issue further, I think I am now more inclined than ever to agree with Taleb’s comments regarding the importance of the unread books in a library rather than the read ones. Taleb discusses how a home library filled with read books gives a person the illusion of knowledge; the person has read it all. A library filled largely with unread books, however, makes clear all that we do not yet know and have not read. Reading each of these commentaries and analyses in relation to the A-Level data isn’t making me more informed or more educated; in fact it may be blinding me to the “true” facts or to other possibilities. I think, therefore, that this will be my last post moaning about “expert” analysis of results, as from now on I need to stop reading the analysis in the first place!


Some thoughts on GCSE and A-Level results

Having read various articles following the recent A-Level and GCSE results, I can’t help but think that schools, and more importantly education in general, need to make a decision as to what we are seeking to achieve, and stop acting reactively to limited data which has been used to draw generalized conclusions.

Take for example the shortage of STEM graduates and students. This was and still is billed as a big issue, which has resulted in a focus on STEM subjects in schools. More recently there has been a specific focus on computer programming and coding within schools. In a recent article it was acknowledged that the number of students taking A-Level Computing had “increased by 56% since 2011” (The STEM skills gap on the road to closing, Nichola Ismail, Aug 2016). This appears to suggest some positive movement; however, in another article poor A-Level ICT results were cited as a cause for concern for the UK tech industry (A Level Results raise concern for UK tech industry, Eleanor Burns, Aug 2016). Now I acknowledge this data is limited, as ideally I need to know whether ICT uptake has been increasing and also whether A-Level Computing results declined; however, it starts to paint a picture.

Adding to this picture is an article from the Guardian discussing entries:

Arts subjects such as drama and music tumbled in terms of entries, and English was down 5%. But it was the steep decline in entries for French, down by 6.5% on the year, as well as German and Spanish, that set off alarm bells over the poor state of language teaching and take-up in Britain’s schools.

(Pupils shun English and physics A-Levels as numbers with highest grades falls, Richard Adams, Aug 2016)

So we want STEM subjects to increase, and they seem to be for computing; however, we don’t want modern languages entries to fall. Will this mean that next year there will be a focus on encouraging students to take modern foreign languages? And if so, and this results in the STEM numbers going down, will we then re-focus once more on STEM subjects until another subject shows signs of suffering?

It gets even more complex when a third article raises the issue of Music A-Level entries, which “dropped by 8.8% in a single year from 2015 and 2016” (We stand back and allow the decline of Music and the Arts at our peril, Alun Jones, Aug 2016). Drama entries are also shown to have seen a decrease this year (Don’t tell people with A-Levels and BTecs they have lots of options, Jonathan Simons, Aug 2016). So where should our focus lie? Should it be on STEM subjects, foreign languages, Drama or Music?

I suspect that further research would result in further articles raising concerns about still further subjects, either in the entries or the results. Can we divide our focus across all areas, or is there a particular area, such as STEM subjects, which is more worthy of focus? Do the areas for focus change from year to year?

As I write this my mind drifts to the book I am currently reading, Nassim Taleb’s The Black Swan, and to Taleb’s snooker analogy as to variability. We may be able to predict a single snooker shot with a reasonable level of accuracy; however, as we try to predict further ahead we need more data. As we predict five shots ahead, the quality of the surface of the table, the balls, the cue, the environmental conditions in the room, etc. all start to matter more and more, and therefore our ability to predict becomes less and less accurate. Taking this analogy and looking at schools, what chance do we have of predicting the future and what the UK or the world will need from our young adults? How can we predict the future requirements which will be needed from the hundreds of thousands of students across thousands of schools, studying a variety of subjects from a number of different examining bodies, in geographical locations across the UK and beyond?

These generalisations of data are subject to too much variability to be useful.    We should all focus on our own schools as by reducing the scope we reduce the variability and increase the accuracy.   We also allow for the context to be considered as individual school leaders may know the significant events which may impact on the result of their cohort, individual classes or even individual students.  These wide scale general statements as to the issues, as I have mentioned in a number of previous postings, are of little use to anyone.   Well, anyone other than editors wishing to fill a space in a newspaper or news website.


Schools and Big Data

As Director of IT I am often directly involved with our School Management Information System (MIS, sometimes referred to as a Student Information System, SIS). Throughout my career I have encountered and worked with a number of different MIS vendors. My general opinion is that they are all “much of a muchness”: although they have different features, strengths and weaknesses, when you average them out the benefits and drawbacks are equal in terms of their magnitude.

These systems contain and allow us to collect a variety of data, including both formative and summative student performance data. We then design reports which allow us to interrogate the data and display it in different ways. This addresses the functionality side of an MIS; however, it is rather weak in terms of usability. Users need to know which report displays which information so they can select and use the correct report at the correct time.

Within my school we are currently working on making our system more usable by developing a dashboard system to present important information directly to teachers without them having to seek it out. This would represent an improvement; however, I feel it still falls some way short.

One way to improve on the above is to put more power in the hands of the users, allowing them to easily create their own reports using the data which is available. The issue with this is that it relies both on staff having the skills in data analysis to be able to design effective reports, and on them having the motivation to undertake this task. Personally I believe this approach would be very beneficial for a small number of staff within a school, with the majority being unable to access it, even where the school’s culture is very much built around the use of data. It would also potentially add another job to teaching staff’s role, in the need for them to design reports to analyse their data, which would represent an issue given the current situation in relation to workloads.

I think the solution lies with Big Data. Within the IT world there is a lot of discussion about Big Data, where large data sets are analysed to reveal trends or patterns, with this information then presented to users. I see this as being of benefit in education. As opposed to having to check different reports showing different sub-sets of our data, such as the performance of male students vs female students, the system would identify the trends that exist for us. The system would identify where there are correlations without users needing to be aware of a potential correlation, therefore removing the potential for a correlation to be missed because we weren’t aware of it. The system would also be able to look at data at a micro and macro level, either down to an individual teacher’s groups’ assessment results this year, or out to patterns which may exist across a number of years.
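A very small-scale version of the idea, having the system look for correlations rather than waiting for a user to ask about one, might look like the Python sketch below. The column names and values are invented for illustration, the 0.7 threshold is an arbitrary assumption, and the Pearson coefficient used here only picks up linear relationships.

```python
import math
from itertools import combinations
from statistics import mean

# Invented per-student columns, one value per student, for illustration only.
data = {
    "library_loans": [2, 5, 9, 12, 15, 20],
    "distance_km": [1.0, 7.5, 3.2, 0.8, 5.1, 2.4],
    "attendance_pct": [88, 90, 93, 95, 96, 99],
    "average_grade": [4.5, 5.0, 6.1, 6.8, 7.0, 8.2],
}


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def strongest_correlations(columns, threshold=0.7):
    """Check every pair of columns and keep those at or above the threshold."""
    results = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            results.append((a, b, r))
    return sorted(results, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in strongest_correlations(data):
    print(f"{a} vs {b}: r = {r:.2f}")
```

Two cautions apply: a correlation found this way says nothing about cause, and checking many pairs at once will throw up some strong-looking results purely by chance, so anything surfaced should be treated as a prompt for a closer look rather than a finding.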

Almost all schools have an MIS these days; however, these systems are still very much based on their origins, that of very structured data being analysed by reports. It is about time we looked at the potential for data warehousing, data mining and Big Data to have an impact on how data is used in schools.