Using Data Mining in Educational Administration: A Case Study on Improving School Attendance

Pupil absenteeism remains a significant problem for schools across the globe with negative impacts on overall pupil performance being well-documented. Whilst all schools continue to emphasize good attendance, some schools still find it difficult to reach the required average attendance, which in the UK is 96%. A novel approach is proposed to help schools improve attendance that leverages the market target model, which is built on association rule mining and probability theory, to target sessions that are most impactful to overall poor attendance. Tests conducted at Willen Primary School, in Milton Keynes, UK, showed that significant improvements can be made to overall attendance, attendance in the target session, and persistent (chronic) absenteeism, through the use of this approach. The paper concludes by discussing school leadership and research implications, and highlights future work, which includes the development of a software program that can be rolled out to other schools.


Introduction
Pupil attendance remains a key focus for schools, local authorities, and national governments across the world as a result of its strong, positive correlation with pupil attainment, pupil well-being, and improved economic outcomes for pupils later in life [1][2][3]. In the UK, the Department for Education (DfE) has strict policies on school attendance with legal obligations for both parents, which also includes guardians in this study, and schools [4]. Parents are legally obliged to send their children to school and ensure regular attendance, while schools have a legal duty to take the necessary steps and have policies in place to effectively manage pupil attendance [4]. In this regard, there is a significant requirement from schools to be proactive on attendance management as they must accurately record attendance, proactively follow-up with parents on all absences, and put initiatives in place to manage and encourage good attendance [4].
As further clarified in Section 2.2.1, the underlying reasons as to why pupils are absent from school have been well-studied and generally fall into one, or more, of three categories: (1) Unable to attend school due to other obligations (e.g., illness, carer duties, family instability), (2) avoiding school due to fear, embarrassment, boredom (e.g., being bullied), and (3) pupil/family do not place value in schooling and/or have other activities that they would rather do, e.g., taking a vacation, or high levels of illiteracy within the family [1,5]. To this end, strategies for managing absenteeism

Educational Data Mining
The definition of EDM in [11] accurately summarises the approach of using what were once commercial data mining techniques to improve outcomes in education, including government-sponsored education. EDM, according to [11], "seeks to analyse educational data repositories to better understand learners and learning and to develop computational approaches that combine theory and data to transform practice to benefit learners". Similar definitions for EDM were provided in [12,13] where EDM is defined as a knowledge extraction process where valuable insights are obtained from data originating from an educational setting. In this regard, EDM may be compared to commercial techniques like Market Basket Analysis (MBA), which is, in essence, a technique that leverages data analytics on customer transaction data to enhance customer engagement and transaction intensity within the retail sector [6,14,15,16].
The popular MBA techniques of Clustering and Association Rule Mining (ARM) have been widely used in EDM in a variety of contexts. Daniel, in [14], Merceron et al., in [15], and Weng, in [13], noted that ARM has been very useful in educational applications such as in finding mistakes that are commonly made together by students, making recommendations to students on e-learning course choices, and finding associations in behavioural patterns of students. Similarly in [17], ARM was used to find factors that influenced student performance in courses, with the study concluding that student performance directly correlated to attention in class (including attendance), completing assignments, and good note-taking.
Clustering has also been widely used in education with success. In their review of clustering within EDM, Dutt et al., in [18], discussed the various educational contexts in which clustering was used including using K-Means clustering to improve learning by grouping students with similar learning styles and clustering brain scans of students who showed similar responses to learning into groups and targeting each group differently to improve learning. Similarly, clustering was also used to understand student behaviour in online learning environments by comparing sequential student data and leveraging a clustering algorithm to group like-minded students [19]. It should be noted that while clustering does have its place, it needs to be done carefully within the government schooling sector as it may be perceived by some parents as unfairly "targeting" groups of pupils, which is generally not the case [20].

School Absence
There exists a myriad of terms used to describe school absence, which helps focus diagnoses so that targeted plans can be put in place to address the underlying causes [1,3,4,21]. While some absences may be seen as acceptable, in the UK, schools have become tough on all absences irrespective of their reason, as they are equally destructive to learning [1,4]. Authorised absence, defined as an acceptable absence approved by the school (e.g., illness or bereavement), is typically granted, but schools have become wary of its abuse, particularly close to the end of term, when parents want to capitalise on cheaper holidays without incurring fines [4,22,23]. On the other hand, unauthorised absence (absent without permission) has received widespread condemnation from lawmakers and education non-profit organisations, with several cases being tried in court, or parents being fined in line with the local authority and national government policy [4,22,23].
The concepts of school refusal (SR) and truancy form part of unauthorised absence and have been well-outlined in [3] and [21], with SR defined as non-attendance due to the expectation of strong negative emotions while at school (e.g., fear as a result of bullying, embarrassment as a result of being teased, or separation anxiety), while truancy is related to anti-schooling sentiments (without parental consent) including finding school boring or finding activities outside of school more attractive (e.g., going to the cinema during school time). School withdrawal (e.g., taking time off to go on holiday) is similar to truancy but with parental consent, and is generally very difficult to address once it becomes excessive as it usually requires multi-agency involvement that focuses on the family as well as the pupil [5]. The notion of persistent absence or chronic absence has also been well-studied, with the definition in the UK being a pupil who is absent from school 10% or more of the time, irrespective of the reason [1,22]. Persistent absenteeism is being well-tracked by schools and local authorities in the UK, with initiatives and policies put in place to deal with the problem as it arises [4,22]. However, the situation is not the same in some other developed countries, where persistent absenteeism is often overlooked and wreaks havoc long before the problem is diagnosed [1]. In the US for example, Balfanz and Byrnes, in [1], noted that chronic absenteeism is largely unmeasured and hence not noticed. The authors further point out that only a few states and cities in the US measure chronic absenteeism, and even when it is measured, the metric of average daily attendance for the entire school "masks more than it reveals". Left unchecked, chronic absenteeism eventually leads to a disengagement with education and results in poor career prospects for the pupil, and most likely a future of poverty [1].
Separation anxiety, or in its more severe form, Separation Anxiety Disorder (SAD), is a type of school refusal and has been well-documented in [24]. SAD is common among young children (up to 1 in 20 children suffer from SAD) and is defined as the fear of leaving the safety of parents or caregivers [24]. Children experiencing SAD often present with tantrums, panic attacks, or bad behaviour, and the disorder can have a significant negative impact on the child's academic, social, and physiological development [24,25]. Indeed, separation anxiety is most common after children have spent long spells with their parents or caregivers, such as after weekends or holidays, and may also present every morning in some children after they have spent the previous afternoon and night with parents or caregivers [24,25].

Why are pupils absent?
There is broad consensus among researchers as to why pupils do not attend school, and the underlying causes for absence fall into three categories [1,5], each of which is very broad in itself:
1. Unable to attend school due to other obligations;
2. Avoiding school due to fear, embarrassment, or boredom;
3. Pupil/family do not place value in schooling (and/or have other activities that they would rather do).
This notion of not placing value in schooling has been further separated into truancy, i.e., pupils staying away from school without parental knowledge, and school withdrawal, also known as parental-condoned absence, i.e., parents condone the absence as it proves beneficial to them or the family at large [21]. Given the vast array of underlying causes, researchers have tended to become more specific in examining the problem of absence. In [3] the focus was on school refusal and truancy with peer relationships and classroom management by teachers as underlying causes. In this regard, Havik et al. [3] found that both good peer relationships and effective classroom management had strong positive correlations with good attendance. Similarly, tackling truancy and parental beliefs (as part of school withdrawal) were the key focus areas in [2] and [5] respectively, with both studies showing that there is a strong positive correlation between good attendance and effective, regular communication between school and home.

Impacts of Absence
Balfanz and Byrnes, in [1], were firm in their conclusions that "missing school matters", noting that in the US, missing school impacted academic achievement irrespective of age and that those that were from low-income backgrounds were more impacted by absence as they were less likely to have provisions at home to make up for the lost time. In the UK, similar sentiments were echoed in [4,22] with respect to absence, including more long-term impacts on the pupil, such as social anxiety and lack of self-confidence, both of which are known pre-cursors to interrupted employment and consequently lower economic attainment in adulthood [1,21,25]. Whilst these are all significant impacts in their own right, the key impact of absence, which was noted across several studies including in [1,2,4,5,22], was long-term disengagement with education which not only impacted the pupil in adulthood but also created the foundation for a vicious cycle when these pupils become parents and project their negative attitudes towards education onto their children.

Improving Attendance
The conceptual framework proposed in [26] for designing interventions to improve attendance is both relevant and very useful. The proposed three-tier framework targeted all pupils along the absenteeism spectrum with tier 1 strategies focussed on pupils with emerging attendance problems, whilst tier 2 focussed on pupils that are at risk of being persistently absent, and tier 3 on those that are already persistently absent. The overall approach of this framework emphasises early identification and treatment, rather than a sole focus on those that are already persistently absent.
This approach is well-recognised and several studies have operationalised this framework to varying depths [1,2,4,22]. In [4,22], which are relevant to the UK context, guidelines suggest that all absenteeism should be tackled with context-specific approaches that include using data analytics, working with parents, using incentives, and enforcing fines. Similarly, the Early Truancy Prevention Program (ETPP) introduced in [2] proposed a five-step approach, all of which required the teacher to be proactive, and work actively with parents to drive up attendance. Pilot tests using the ETPP did show a significant improvement in attendance [2]. However, most initiatives were time-intensive and required teachers and school administrators to spend a large amount of time working with parents on an ongoing basis. This is not practical in the UK, because teachers are already stretched, and school budgets are being squeezed [27]. Efforts to improve attendance in [1] were underpinned by offering both short-term and long-term rewards through local and national/state campaigns. At a local level, schools offered rewards that were more meaningful to pupils who attended regularly and included fun activities like dance and diplomas for completing short courses. At a national level, school attendance was stressed by senior political figures and "success mentors" who were largely celebrities that attributed their success to regular school attendance [2].

Problem Statement
It is well documented that providing pupils with the right incentives to attend school results in improved attendance and consequently improves pupil attainment and progress [1,4]. Given this, the problem being addressed by this study may be stated as follows: Let $S$ be a school with all its pupils, $U$. Let the school week, $J$, be divided into $m$ distinct sessions, $J_i$, such that $J_i \in J = \{J_1, J_2, \dots, J_m\}$. Furthermore, let $T$ be a database in $S$ that contains the attendance records of all pupils across all sessions for a period, $W$. Hence, there may exist a database, $T_t$, where $T_t \subseteq T$, that contains the attendance records of pupils $U_t$, where $U_t \subseteq U$, who have below the required attendance in at least one school session and/or in their overall average attendance, but whose attendance in all other sessions is above or equal to the requirement. In the UK, the required attendance target is 96% [4]. Given that the leadership and staff of the school $S$ are intent on maximising pupil attendance (with the focus on driving up the overall average pupil attendance through incentives and interventions) while minimising effort and associated costs (largely incentives and staff costs), it becomes necessary to optimise the targeting of $J_i$. Thus, this study aims to provide a framework and useful tool for schools, based on ARM and Frequent Itemset Mining (FIM), for targeting the right school session(s) with incentives and interventions that maximise the impact on overall school attendance.

Analytical Model
We commence by noting the definitions of the well-known ARM concepts of support, confidence, minimum support, minimum confidence, and the Apriori principle first introduced in [28], and as detailed in [29,30].

• The support of an item A, in a transaction database T, is given by:
$$\text{supp}(A) = P(A) = \frac{\text{number of transactions in } T \text{ that contain } A}{\text{number of transactions in } T};$$
• The probability of the presence of item A leading to the presence of item C (commonly referred to as confidence) is given by:
$$\text{conf}(A \rightarrow C) = P(C \mid A) = \frac{\text{supp}(A, C)}{\text{supp}(A)}.$$
When supp(A) exceeds some user-defined value for support (commonly referred to as minimum support or "minsup") we note that A is considered to be frequent. Similarly, when conf(A → C) exceeds some user-defined value for confidence (commonly referred to as minimum confidence or "minconf") we note that A and C are considered to be associated. Note that FIM is defined as the process of finding all itemsets that exceed minsup in a given database [13,16,29].
The Apriori principle, first detailed in [28], and more recently in [16], states that for a given set of transactions, supp(A) ≥ supp(A, C). This is consistent with probability theory where P(A) ≥ P(A ∩ C), as well as in practical terms, e.g., where the number of transactions that contain pupils who are absent on Monday AM is always greater than or equal to the number of transactions that contain absences on both Monday AM and PM.
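The definitions above can be illustrated with a minimal Python sketch. The toy transactions and the function names `support` and `confidence` are hypothetical and chosen only for illustration; each transaction is taken to be the set of sessions in which one pupil was flagged as absent.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(wanted <= t for t in transactions) / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """conf(A -> C) = supp(A, C) / supp(A)."""
    a, c = frozenset(antecedent), frozenset(consequent)
    return support(transactions, a | c) / support(transactions, a)


# Hypothetical toy database: one set of flagged sessions per pupil.
transactions = [
    frozenset({"Mon-AM", "Mon-PM"}),
    frozenset({"Mon-AM"}),
    frozenset({"Mon-AM", "Mon-PM", "Fri-PM"}),
    frozenset({"Fri-PM"}),
    frozenset({"Mon-AM", "Fri-PM"}),
]

supp_am = support(transactions, ["Mon-AM"])              # 4 of 5 pupils
supp_both = support(transactions, ["Mon-AM", "Mon-PM"])  # 2 of 5 pupils
conf_am_pm = confidence(transactions, ["Mon-AM"], ["Mon-PM"])
```

In this toy database the support of Monday AM is at least the support of Monday AM and PM together, as the Apriori principle requires.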

Identifying the Best Sessions to Target with Attendance Improving Initiatives
Pupils that have above or equal to the required attendance in every session are generally considered to have very good attendance, and in essence help the school boost its overall average attendance. Let $T_p$ be a database containing the attendance records of all pupils that are persistently absent, hence $T_p \subseteq T_t$. Persistent absenteeism in the UK is defined as having an overall average attendance of less than 90% [22]. Given that schools take severe action once attendance drops below 85%, including removing a pupil from the school roll, $T_t$ thus represents a significant portion of $T$ for a school that has overall below-the-required-average attendance [4]. Hence, improving pupil attendance in $T_t$ will enhance overall attendance, and as most schools have limited resources, the question of which $J_i$ in $T_t$ should be targeted often arises. Intuitively, the best session to target should be the session which has both the highest absence and the highest association with poor overall average attendance, $O$. This scenario may be represented in terms of ARM as targeting the session where supp($J_i$) and conf($J_i \rightarrow O$) are the largest. However, we also note that scenarios do exist where supp($J_k$) > supp($J_c$) but conf($J_k \rightarrow O$) < conf($J_c \rightarrow O$). In these cases, the choice between $J_k$ and $J_c$ is not obvious.
This choice-making problem is not unique to school attendance and often arises in several other sectors including retail, medicine, and security [16]. We note that a similar problem involving the selection of the best item to target for grocery retail promotions has recently been addressed in [16], and thus the methods employed in that study could be applied here. To facilitate easy processing, $T_t$ is converted into a database with binary attributes, with sessions and/or the overall average attendance being assigned a "1" when attendance drops below the required levels. Clearly, $T_t$ may now be considered to be an absenteeism database.

Applying the Market Target (mt) Model to School Attendance Data
The mt model proposed in [16] was shown to be effective in making choices between items in the form (A → C) and (B → D). Indeed, the problem laid out in Section 3.2.1 is of the form (A → C) and (B → C), and may be considered a subset of the more generalised choice-making problem that the mt model addresses.
Let $P(J_i)$ be the support of session $J_i$ in database $T_t$, and $P(J_i, O)$ be the support of session $J_i$ and $O$ co-occurring in database $T_t$. In practical terms, $P(J_i, O)$ may be viewed as the number of children, or instances, that have below the required attendance for both $J_i$ and $O$ in the database $T_t$. Thus by definition, conf($J_i \rightarrow O$) = $P(J_i, O)/P(J_i)$. As was the case in grocery retail, detailed in [16], there are two intuitive schools of thought on solving this problem to reduce absenteeism. One may suggest targeting the session, $J_i$, that has the highest conf($J_i \rightarrow O$), as every reduction in absenteeism in $J_i$ will most likely lead to a reduction in $(J_i, O)$. However, if $P(J_i, O)$ is low, then $(J_i, O)$ may be considered to be rare, and solving this scenario may not have the desired overall impact on $O$. Rare rules, as defined in [13], are rules that are highly associated but occur less frequently in a dataset, i.e., they have lower support. Conversely, targeting a high $P(J_i, O)$ may seem attractive, but if conf($J_i \rightarrow O$) is low, then lowering $P(J_i)$, through some initiatives, may not have the required impact on $P(J_i, O)$, and consequently $P(O)$. Thus, it is evident that a model that takes into consideration both support and confidence is required to find the optimum solution. In this regard, the mt model, detailed in [16], is a model that addresses this exact challenge.

Adapting the mt Model for School Attendance
The mt model, adapted for school attendance, is developed below and in essence evaluates options, for example: Option $(J_i, O)$ and $(J_k, O)$, based on both the support and confidence of that option in the database. It is clear that $P(J_i) \ge P(J_i, O)$ for all $i = 1, \dots, m$, hence the underlying principle of the mt model is that it evaluates the "effort" required to make $P(J_i, O)$ = minsup, which is considered to be the "desired" state of $P(J_i, O)$. Note that in this instance, the "desired" state is equivalent to the maximum session and overall absenteeism. Also note that minsup is user-defined, and is governed by the Apriori principle, i.e., $P(J_i, O) \le \text{minsup} \le P(J_i)$ for all $i = 1, \dots, m$. The number of absences required for a $P(J_i, O)$ combination to reach the "desired" state is given by Equation (1), where $|T_t|$ is the number of transactions in database $T_t$.
$$\text{Number of absences required for "desired" state} = (\text{minsup} - P(J_i, O)) \cdot |T_t| \qquad (1)$$
Given that not all children absent in $J_i$ will also be absent in $O$, the "market target" referred to in [16], or in this instance, the pupil target, may thus be defined as the number of absences required in $J_i$ such that the number of absences required for the "desired" state of $(J_i, O)$ in $T_t$ is reached. This is stated mathematically in Equation (2).
$$\text{pupil target} = \frac{\text{Number of absences required for "desired" state}}{\text{conf}(J_i \rightarrow O)} \qquad (2)$$
Thus the mt equation, given in Equation (3), is obtained by combining Equations (1) and (2), and dividing both sides by the minimum support, i.e., the physical number of absences equivalent to minsup in database $T_t$.
$$mt = \frac{\text{pupil target}}{\text{minsup} \cdot |T_t|} = \frac{\text{minsup} - P(J_i, O)}{\text{minsup} \cdot \text{conf}(J_i \rightarrow O)} \qquad (3)$$
Note that mt is a normalised parameter, and is given by pupil target/minimum support.
From Equation (3), it is evident that options that have the lowest mt value require the lowest "effort" to reach the "desired" state, and are thus considered the best choices for a given minsup. From a practical perspective, this implies that the school targets the school session that has the greatest propensity to lead to overall below-average school attendance. There are also practical constraints of managing a school that must be considered. In this regard, initiatives must target all, or the majority of school pupils to ensure fairness, and given that most initiatives are largely fixed costs (e.g., the effort in planning activities is similar whether the audience is 100 or 250), it makes sense to target the session which impacts overall absenteeism the most [20]. The most impactful session is the session which has the lowest mt value as it requires the least "effort" to reach the "desired" state.
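A short worked example may help to show how the mt value separates two sessions when support and confidence point in different directions. It assumes that the mt value takes the form obtained by dividing the pupil target by the minimum support as described above, i.e., (minsup − P(J_i, O)) / (minsup · conf(J_i → O)). The figures are hypothetical and are not taken from the school's data: minsup = 0.30; session J_k has P(J_k) = 0.60 and P(J_k, O) = 0.24, giving a confidence of 0.40; session J_c has P(J_c) = 0.30 and P(J_c, O) = 0.21, giving a confidence of 0.70.

```latex
\begin{align*}
mt(J_k, O) &= \frac{\text{minsup} - P(J_k, O)}{\text{minsup} \cdot \text{conf}(J_k \rightarrow O)}
            = \frac{0.30 - 0.24}{0.30 \times 0.40} = 0.50 \\
mt(J_c, O) &= \frac{\text{minsup} - P(J_c, O)}{\text{minsup} \cdot \text{conf}(J_c \rightarrow O)}
            = \frac{0.30 - 0.21}{0.30 \times 0.70} \approx 0.43
\end{align*}
```

Under these assumed figures, J_c has the lower mt value and would be the preferred target, even though J_k has both the higher session support and the higher joint support.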

Algorithm for Identifying Target Sessions Using the mt Model
Applying the mt model to identify the best sessions to target is relatively straightforward. The mt value is computed for each $(J_i \rightarrow O)$ combination and the one with the lowest mt value is the best session to target. The steps of the proposed algorithm are detailed in Algorithm 1.

Algorithm 1: Identifying target sessions using the mt model.
1. Create the dataset, $T_t$, from $T$ that contains the attendance records of all pupils that are on the roll for the entire period, and where their attendance has been below the required level in at least one session.
2. Using an ARM/FIM algorithm (e.g., Apriori or ECLAT) with a low support and confidence, find supp($J_i$) and conf($J_i \rightarrow O$) for all sessions.
3. Calculate the mt value for each $(J_i, O)$ combination using an appropriate value for minsup.
4. Order sessions based on mt values, with the session that has the lowest mt value being the best session to target.
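The ranking steps of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the names `mt_value` and `rank_sessions`, the session labels, and the toy records are hypothetical; support and confidence are obtained by direct counting rather than by an Apriori or ECLAT implementation; and the mt value is assumed to be (minsup − P(J_i, O)) / (minsup · conf(J_i → O)).

```python
from typing import Dict, List, Tuple


def mt_value(joint_support: float, confidence: float, minsup: float) -> float:
    """Assumed form of the mt value: normalised effort for P(J_i, O) to reach minsup."""
    return (minsup - joint_support) / (minsup * confidence)


def rank_sessions(records: List[Dict[str, int]],
                  sessions: List[str],
                  minsup: float,
                  overall: str = "O") -> List[Tuple[str, float]]:
    """Return (session, mt) pairs ordered from lowest to highest mt value."""
    n = len(records)
    ranked = []
    for session in sessions:
        supp_session = sum(r[session] for r in records) / n
        supp_joint = sum(r[session] * r[overall] for r in records) / n
        if supp_session == 0 or supp_joint == 0:
            continue  # confidence is undefined or zero, so mt cannot be computed
        conf = supp_joint / supp_session
        ranked.append((session, mt_value(supp_joint, conf, minsup)))
    return sorted(ranked, key=lambda pair: pair[1])


# Hypothetical binary absenteeism database for ten pupils
# (1 = below the required level in that session or overall).
records = (
    [{"Mon-AM": 1, "Fri-PM": 0, "O": 1}] * 2
    + [{"Mon-AM": 1, "Fri-PM": 0, "O": 0}] * 4
    + [{"Mon-AM": 0, "Fri-PM": 1, "O": 1}] * 3
    + [{"Mon-AM": 0, "Fri-PM": 1, "O": 0}] * 1
)

ranking = rank_sessions(records, ["Mon-AM", "Fri-PM"], minsup=0.4)
best_session = ranking[0][0]
```

In this toy data Monday AM is the most frequently flagged session, yet Friday PM receives the lower mt value because it co-occurs far more often with poor overall attendance.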

Experimental Process
Experiments were conducted based on the well-known action research process as detailed in [31] and outlined in Figure 1. As per [31], the process begins by defining the context and purpose by asking the question why is this project required or desirable? However, it is the diagnosing phase that usually proves to be the most challenging as it involves identifying the possible issues or the most impactful issue, which is sometimes not obvious [32]. Consequently, data analytics is often leveraged to simplify this task through the use of models and algorithms to process data into information [32]. In this regard, the mt model forms part of the diagnosing phase. Note that the action research process is cyclical and actions taken have to be regularly evaluated against the context and purpose, which could also change over time [31]. Research was conducted at Willen Primary School with the context and purpose of improving overall school attendance to be above or in line with the national requirement, which is currently set at 96% in the UK [4,22]. The mt model (part of the diagnosing phase) was then used to identify the session which was most impactful to overall school absence. Options for possible action were brainstormed with school leadership and evaluated in the planning action stage. Following this, selected actions were carried out at the school over several months (taking action stage) with the impact on overall school attendance then assessed in the evaluating action stage.

Willen Primary School
Willen Primary School (WPS) is a mixed, 2-form entry primary school on the North-eastern side of Milton Keynes, catering for 4 to 11-year-old children. The school has a capacity of 420 pupils and had 366 pupils on its roll at the end of July 2019, with approximately 35% of its pupils coming from outside the school's catchment area [20,33]. The school was rated "Good" by the UK's Office for Standards in Education, Children's Services and Skills (Ofsted) in its last inspection, which was conducted in November 2017 [34]. Whilst the inspector cited very good attendance management practices by the school leadership, he did note that further improvement should be made [34]. Given this, the school has continued to fervently promote the importance of good attendance and explored the use of novel approaches to address the issue of absenteeism, giving rise to this study [20].

Diagnosing
School attendance data for the previous three academic years, i.e., 2015/16, 2016/17, and 2017/18, were used as the basis for improving school attendance in 2018/2019. For the sake of completeness, detailed attendance and school roll data are provided in Appendix A. These data were first scrubbed to remove pupils that either joined the school after the start of the academic year or left the school during the academic year, thus producing a dataset $T$ for each academic year $W$, as further discussed in Sections 3.2.1 and 3.2.4. Subsequently, data were further filtered to produce $T_t$ by selecting those pupils $U_t$ who had an overall attendance of less than 96% (the required national average) and/or who were absent at least three times in any given session during the academic year. Given that the cardinality of sessions was generally between 34 and 39 per year, it was not practical to filter these sessions at the 96% level as it was too restrictive (equivalent to two absences per session). It is not uncommon for some children to have up to two absences for some sessions and still have an overall attendance of at least 96% [20]. The restriction on analysing pupils that were present for the entire academic year was placed to ensure that the data analysis process was not unfairly skewed. For example, consider a scenario that occurs fairly regularly: Some pupils may enrol at WPS after the start of the academic year, have 100% attendance for two weeks, and then transfer to another school (possibly one that is closer to their home) [20]. In this case, these pupils will have 100% attendance and be treated analytically as the same as pupils who had 100% attendance for the entire academic year. Consequently, these pupils were excluded from the analysis. The size of $T_t$ for each academic year is detailed in Appendix A.
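The filtering described above can be sketched in Python as follows. This is a minimal illustration and not the procedure actually used at the school: the `PupilRecord` fields, the function name `build_target_dataset`, and the example pupils are hypothetical, and the session rule is read here as "three or more absences in a given session flags that session".

```python
from dataclasses import dataclass
from typing import Dict, List

REQUIRED_ATTENDANCE = 0.96   # national requirement cited in the text
SESSION_ABSENCE_LIMIT = 3    # absences in one session that flag the session


@dataclass
class PupilRecord:
    pupil_id: str
    on_roll_all_year: bool            # False if the pupil joined late or left early
    overall_attendance: float         # fraction of sessions attended in the year
    absences_by_session: Dict[str, int]


def build_target_dataset(pupils: List[PupilRecord]) -> Dict[str, Dict[str, int]]:
    """Return binary absenteeism rows for pupils with at least one flag."""
    target = {}
    for pupil in pupils:
        if not pupil.on_roll_all_year:
            continue  # part-year pupils are excluded so they do not skew the analysis
        row = {session: int(count >= SESSION_ABSENCE_LIMIT)
               for session, count in pupil.absences_by_session.items()}
        row["O"] = int(pupil.overall_attendance < REQUIRED_ATTENDANCE)
        if any(row.values()):
            target[pupil.pupil_id] = row
    return target


# Hypothetical pupils, for illustration only.
pupils = [
    PupilRecord("A", True, 0.98, {"Mon-AM": 1, "Fri-PM": 0}),
    PupilRecord("B", True, 0.94, {"Mon-AM": 4, "Fri-PM": 1}),
    PupilRecord("C", False, 1.00, {"Mon-AM": 0, "Fri-PM": 0}),
    PupilRecord("D", True, 0.97, {"Mon-AM": 0, "Fri-PM": 3}),
]

target = build_target_dataset(pupils)
```

Pupil A is dropped for having no flags and pupil C for not being on the roll all year, which leaves pupils B and D in the target dataset.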
$T_t$ for each academic year was then analysed using a FIM algorithm in R, with a minsup of 0.3 and a minconf of 0.3 to prune rare rules, similar to the process outlined in [16]. The values of minsup and minconf were chosen to be low enough to capture all essential rules but high enough to eliminate superfluous rules. Given that minsup and minconf are user-defined parameters, the choice of an appropriate value is typically based on the context. Whilst the values of minsup and minconf have practical significance in some sectors and settings, e.g., in grocery retail where they are used to identify popular products [16], in this context they are used to simplify the data processing by reducing the number of rules produced. The choice of 0.3 was based on trial and error, which is typically the case in data processing applications. An initial test pass on the dataset for the 2017/18 academic year using minsup = 0.4 resulted in some essential rules being pruned, e.g., Wed-PM and Thur-PM, hence minsup was adjusted to be lower than 0.4. It should be noted that choosing a value lower than 0.3 will still achieve the objective, but will increase processing effort. For completeness, the number of rules extracted per academic year is detailed in Appendix A. Following this step, the output from the FIM stage was further analysed, using Microsoft Excel, to compute the mt value for each frequent itemset, from which the best target session was identified.
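To show how a minimum support threshold prunes itemsets, the following is a small level-wise (Apriori-style) frequent itemset miner in Python. It is not the R routine used in the study, only a sketch of the general technique; the function name `frequent_itemsets` and the toy transactions are hypothetical, and an itemset is kept when its support is greater than or equal to minsup, which is an assumption about how the threshold is applied.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def frequent_itemsets(transactions: List[FrozenSet[str]],
                      minsup: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least `minsup`."""
    n = len(transactions)

    def supp(itemset: FrozenSet[str]) -> float:
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item seen in the data.
    candidates = [frozenset([item])
                  for item in sorted({i for t in transactions for i in t})]
    frequent: Dict[FrozenSet[str], float] = {}
    while candidates:
        survivors = {}
        for candidate in candidates:
            s = supp(candidate)
            if s >= minsup:
                survivors[candidate] = s
        frequent.update(survivors)
        # Join step: merge surviving k-itemsets into (k + 1)-item candidates.
        keys = list(survivors)
        size = len(keys[0]) + 1 if keys else 0
        candidates = list({a | b for a, b in combinations(keys, 2)
                           if len(a | b) == size})
    return frequent


# Hypothetical flagged sessions per pupil; "O" marks poor overall attendance.
toy = [
    frozenset({"Mon-AM", "O"}),
    frozenset({"Mon-AM", "O"}),
    frozenset({"Mon-AM"}),
    frozenset({"Fri-PM", "O"}),
    frozenset({"Mon-AM", "Fri-PM"}),
]

found = frequent_itemsets(toy, minsup=0.3)
```

With minsup = 0.3 the pair (Mon-AM, O) survives with a support of 0.4, while (Fri-PM, O) is pruned at 0.2; raising minsup to 0.5 would also prune (Mon-AM, O), which mirrors the trade-off described above when 0.4 proved too high.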

Planning Action
Given that the school has strict obligations, guidelines, and a strategic agenda that it must adhere to, it was realised that a multi-prong approach had to be undertaken with regards to planned actions that would improve attendance and validate the targeting approach proposed in this study. These planned actions were over and above what the school was currently doing to monitor and promote attendance. Hence a two-pronged approach was adopted with (1) session-targeting focused on demonstrating that session (and overall) attendance can be improved by targeting identified session(s) and (2) overall attendance improvement initiatives focused on improving attendance in line with the strategic and statutory obligations of the school.
Several alternatives were considered by school leadership and based on their experience, the best two selected were: (1) Focus on shorter periods with prize-based rewards for full attendance and (2) create more exciting initiatives for targeted sessions. The selected initiatives were consistent with the tiered approach described in [26].

Taking Action
Apart from continuing to fulfil its statutory and strategic objectives with regards to attendance (including dealing with persistent absenteeism, promoting and fostering a good environment for improved attendance, and dealing with truancy) the school implemented the two initiatives outlined in Section 4.2.3.
Initiative One (I1) focused on increasing the frequency and perceived meaningfulness of the rewards for full attendance so that pupils could both feel tangibly rewarded for full attendance and know that they can always be eligible for rewards in the next reward period should they not win in the current or previous period. I1 commenced at the start of the Spring term in January 2019, with all pupils that had full attendance for the month placed in a draw to win one of eight tickets to a popular, local trampoline park. The reward was meaningful to the pupils as it was something that they enjoyed and it was something that was not always available to them due to cost constraints [20]. Given this, there was considerable excitement from pupils when the initiative was introduced. Initiative Two (I2) was geared towards targeting the sessions that had the largest impact on poor attendance. Exciting activities were conducted during the most impactful session throughout the Summer term starting at the end of April 2019. These activities, which were centred on a common theme and designed to be in line with the learning objectives, involved the entire school and included elements that the pupils would consider exciting [20]. Further details on I2 are provided in Section 5.

Evaluating Action
Following the implementation of the initiatives, the pupil attendance records for the 2018/19 academic year were analysed using Microsoft Excel and compared with previous years to quantify the impact of I1 and I2. This then fed into school planning operations for the 2019/20 academic year. We adopted a simple, inference-based approach by establishing two null hypotheses, and by inference drew conclusions on the 2018/19 year. The null hypotheses are stated as follows: • H

Identifying Target Sessions
The average attendance for all pupils who were on the school roll for the entire academic year was calculated using Microsoft Excel, with the results presented in Table 1.
From Table 1 it can be seen that the school did not achieve the required overall average attendance of 96% in any of the previous three academic years. Furthermore, attendance in the morning (AM) sessions was lower than in the afternoon (PM) sessions, with Monday AM being consistently the most poorly attended session across the years. This is consistent with theories on separation anxiety, where young children often dislike going back to school after spending long periods away from school with their parents and family, and on school withdrawal [22,25]. Separation anxiety may be exacerbated when parental collusion occurs (school withdrawal) and parents keep pupils at home for fear that they may become further distressed [1,21,22].
Whilst Monday AM is the most frequently missed session based on average percentages, as shown in Table 1, attendance management is not only about increasing the overall average attendance but is centred on addressing the session that is most impactful to overall attendance, which in turn impacts pupil performance [1]. It is possible that the most frequently absent session is not the most impactful, as children absent in this session could return to school in the next session and have perfect attendance for the rest of the week, and generally have good academic performance as well. Thus, any interventions aimed at improving attendance in these sessions are likely to be less effective, as they will be targeting children that already have good overall attendance. This may take focus (and valuable resources) away from other sessions that may have marginally better attendance but are fraught with problem absenteeism that is impacting pupil performance and overall school morale. Indeed, executing "misguided" intervention programs can also have detrimental impacts on staff and parents. Staff may lose faith in their ability to improve school attendance and performance, and lose morale as their hard work may go unrewarded. At the same time, parents whose children are generally good attendees may feel unduly victimised for occasional absences, particularly where such absences are obligatory, e.g., medical appointments or bereavement [20]. Furthermore, given that these interventions are focused on a session that generally comprises occasional absenteeism, they are unlikely to make a significant impact on the children that are chronically absent. Tracking and improving absenteeism, especially chronic absenteeism, is a key performance metric of a school's performance management framework within the UK [20,22,34].
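The per-session averages discussed above were produced in Microsoft Excel. Purely as an illustration, the same calculation can be sketched in Python. The function names, the data layout (one list of present/absent marks per session), and the marks themselves are hypothetical and are not taken from the WPS records.

```python
def session_attendance(marks):
    """Return the percentage attendance for each session.

    `marks` maps a session label (e.g. "Mon-AM") to a list of booleans,
    one per pupil-session, where True means the pupil was present.
    """
    return {s: 100.0 * sum(v) / len(v) for s, v in marks.items() if v}


def lowest_session(averages):
    """Return the label of the session with the lowest average attendance."""
    return min(averages, key=averages.get)


# Hypothetical marks for three sessions (not the study's data).
example_marks = {
    "Mon-AM": [True] * 18 + [False] * 2,   # 18 of 20 present
    "Mon-PM": [True] * 19 + [False] * 1,   # 19 of 20 present
    "Fri-AM": [True] * 19 + [False] * 1,   # 19 of 20 present
}
example_averages = session_attendance(example_marks)
print(example_averages, lowest_session(example_averages))
```

Note that this only ranks sessions by how often they are missed, which is not necessarily the same as ranking them by their impact on overall attendance.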
As a result, it becomes important to target the session that is most impactful to overall attendance, and the use of the mt model, as detailed in Section 3, is one effective way of achieving this objective. The mt value for each session was calculated on each T_t for the previous three academic years, as per the process outlined in Section 3, with the results detailed in Tables 2-4. Some sessions were automatically eliminated, consistent with Lemma 1 in [16], as both their corresponding support and confidence were less than those of other sessions in the same year. As noted in Lemma 1 in [16] and adapted for this study, if supp(J_i, O) and conf(J_i → O) are less than supp(J_k, O) and conf(J_k → O) respectively, then (J_k, O) is the better choice, and (J_i, O) can thus be eliminated. In Table 2 for the 2017/18 academic year, Fri-AM had both higher support and confidence than every other session except Mon-AM; hence there was no need to compute the mt value for any session other than Fri-AM and Mon-AM. The mt model in Equation (3) was used to decide the better target session between Fri-AM and Mon-AM, with a minsup value of 0.550 (the lower support between Fri-AM and Mon-AM) being used. Mon-AM had the lower mt value and hence was selected as the best session to target.
The negative value for mt was also interesting to note. In practical terms, it implied that there were more records in T_t that contained both Mon-AM and O below the required levels than records that contained Fri-AM below the required level. Thus, any initiative to resolve absenteeism on Fri-AM will always be less impactful than one addressing absenteeism on Mon-AM. Hence all other sessions except Mon-AM were considered to be rare rules as there exists a (J_i, O) combination under consideration with P(J_i, O) > minsup. This is not always the case and the scenarios were quite different for the 2015/16 and 2016/17 academic years.
From Table 3 for the 2016/17 academic year, only Mon-AM, Mon-PM, and Fri-PM were shortlisted, as the other rules were determined to be rare. The mt values were computed for each of the shortlisted sessions, with minsup set at 0.549 (the lowest support between Fri-PM, Mon-AM, and Mon-PM). Mon-AM was found to be the most impactful session to overall below-average attendance. Similarly, from Table 4 for the 2015/16 academic year, Mon-AM was found to be the most impactful session, with minsup set at 0.589. Given that Monday AM was found to be the most frequent and the most impactful session in the three academic years analysed, it can be concluded that the poor attendance on Monday AM may be attributed to a combination of school refusal (e.g., due to separation anxiety) and school withdrawal/truancy, where the return to school may not be seen as being as exciting as the weekend that just passed [1]. Therefore, an easier, more exciting start to the school week (initiated by the school) may prove successful in addressing this issue.
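The elimination step adapted from Lemma 1 can be illustrated with a minimal Python sketch. It assumes the standard association-rule definitions (support as the fraction of records containing both the session item and O, confidence as that count divided by the number of records containing the session item) and represents each record as a set of item labels. The record layout, function names, and example records are hypothetical, and the sketch does not implement the mt model of Equation (3), which would still be needed to choose among the sessions that survive.

```python
def support_confidence(records, session, outcome="O"):
    """Return (supp(session, outcome), conf(session -> outcome))."""
    n = len(records)
    both = sum(1 for r in records if session in r and outcome in r)
    antecedent = sum(1 for r in records if session in r)
    supp = both / n if n else 0.0
    conf = both / antecedent if antecedent else 0.0
    return supp, conf


def prune_dominated(stats):
    """Drop every session whose support AND confidence are both lower
    than those of some other session (the Lemma 1 style elimination)."""
    kept = {}
    for name, (s, c) in stats.items():
        dominated = any(
            s2 > s and c2 > c
            for other, (s2, c2) in stats.items()
            if other != name
        )
        if not dominated:
            kept[name] = (s, c)
    return kept


# Hypothetical records: each set lists the sessions flagged for one pupil,
# plus "O" if the pupil is below the required overall attendance.
example_records = [
    {"Mon-AM", "O"},
    {"Mon-AM", "Fri-AM", "O"},
    {"Fri-AM", "O"},
    {"Fri-AM", "O"},
    {"Fri-AM"},
    {"Tue-PM", "O"},
    {"Tue-PM"},
    {"Tue-PM"},
]
example_stats = {
    s: support_confidence(example_records, s)
    for s in ("Mon-AM", "Fri-AM", "Tue-PM")
}
shortlist = prune_dominated(example_stats)
print(sorted(shortlist))
```

In this toy data Tue-PM is eliminated because another session beats it on both measures, while Mon-AM and Fri-AM both survive because each beats the other on one measure only. This mirrors the 2017/18 situation in which a further criterion was required to separate two remaining candidates.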

Early Warning System
The very high confidence values (>0.9 and in some cases = 1) were also of significant note, as they suggested that any pupil that was absent at least three times in any one session was very likely to have below the required overall attendance. This could be a good tool for the school to use in tackling absenteeism, as it may be used to identify pupils that are at risk of falling below the requirement, consistent with the recommendation in [26]. Furthermore, it could be used as part of conversations with parents and pupils in addressing their beliefs and misconceptions about attendance, which is consistent with the recommendations in [1,2,4] for improving attendance through leveraging analytics. This fact-based approach is more likely to resonate well with parents and may negate any possible insinuations by parents that their families are being victimised or treated unfairly by teachers and school leadership [1,2].
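A minimal Python sketch of such an early warning check is given below. It flags any pupil with at least three absences in a single session, following the threshold mentioned above. The input format (a list of pupil and session pairs, one per recorded absence), the function name, and the pupil identifiers are hypothetical.

```python
from collections import Counter


def at_risk_pupils(absences, threshold=3):
    """Return {pupil: [sessions]} for pupils with at least `threshold`
    absences in any one session.

    `absences` is an iterable of (pupil_id, session) pairs, one pair per
    recorded absence.
    """
    counts = Counter(absences)
    flagged = {}
    for (pupil, session), count in counts.items():
        if count >= threshold:
            flagged.setdefault(pupil, []).append(session)
    return flagged


# Hypothetical absence log (not the study's data).
example_absences = [
    ("P01", "Mon-AM"), ("P01", "Mon-AM"), ("P01", "Mon-AM"),
    ("P02", "Mon-AM"), ("P02", "Fri-PM"), ("P02", "Tue-AM"),
    ("P03", "Fri-AM"), ("P03", "Fri-AM"),
]
print(at_risk_pupils(example_absences))
```

Here only P01 is flagged. P02 also has three absences, but they are spread across different sessions and so do not meet the per-session threshold.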

Evaluating the Impacts of Initiatives I1 and I2
I1 and I2 were conducted as detailed in Section 4.2.3. Following the results of the analysis conducted as part of Section 5.1, the school decided to target Mondays, with the emphasis on the Monday AM session, as part of I2. The Monday Matters initiative was launched in the Summer term of 2019 and consisted of an "M-themed" program for five of the 10 Mondays during the term. The themes were selected by the school staff as they would resonate well with the pupils. The five themed Mondays were: Move-It Monday, Muffin Monday, Mindfulness Monday, Mask Monday, and Movie Monday. For each themed Monday, pupils were allowed to come to school appropriately dressed, e.g., in sports kits on Move-It Monday, and participate in a range of planned activities related to that theme, which were also linked to the work that was being done in the classroom.

I1: Frequent Rewards for Full Attendance
Draws were held every month during the Spring and Summer terms of 2018/19, except for April, for all pupils that had full attendance during the month. The April draw was omitted given that April had fewer than 10 school days. From Table 5, it is evident that the more meaningful rewards for full attendance over shorter periods contributed to a significant improvement in overall attendance for the Spring and Summer terms in 2018/19, with the attendance for every session being considerably higher than the attendance in the previous three years. This result was consistent with the findings in [1].
Table 6 presents Monday attendance data for the Summer term of the 2015/16, 2016/17, 2017/18, and 2018/19 academic years. The number of Mondays fluctuates from year to year due to the timing of Easter, which also influences the half-term break, typically held towards the end of May. From Table 6 it can be seen that the average attendance for Mondays in the Summer term of 2018/19 was significantly higher than in previous years. Furthermore, not only was the 2018/19 attendance higher, but it was also above the required 96% target, the first time that this was the case in four years. The range and median for the data also showed the strength of the 2018/19 attendance data when compared to previous years. The range in 2018/19 was over half that of 2015/16, indicative of a consistently high Monday attendance throughout the term. There were some concerns from school leadership about the "stickiness" of Monday Matters events (where having an event every other Monday fosters good attendance on other Mondays and indeed other days of the week), and whilst there were spikes in attendance on Monday Matters days, attendance during the other Mondays was quite good, as evidenced by the data in Table 6.
These findings are consistent with other studies that noted that in general, pupils are "creatures of habit" who thrive on routine, and are thus likely to sustain good attendance once a routine is established [1,2,4].
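The mean, median, and range used to compare Monday attendance across years can be computed as in the following Python sketch. The function name and the weekly percentages are hypothetical and do not reproduce the values in Table 6.

```python
import statistics


def summarise(percentages):
    """Return the mean, median, and range of a list of attendance
    percentages (one value per Monday in a term)."""
    return {
        "mean": statistics.mean(percentages),
        "median": statistics.median(percentages),
        "range": max(percentages) - min(percentages),
    }


# Hypothetical Monday attendance percentages for one term.
example_mondays = [94.0, 97.0, 95.0, 98.0]
example_summary = summarise(example_mondays)
print(example_summary)
```

A smaller range alongside a higher mean and median is what indicates attendance that is both better and more consistent across the term.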

Evaluating the Overall Improvement in School Attendance
The full-year attendance comparison is presented in Table 7. It can be seen that the initiatives in the Spring and Summer terms of 2018/19 contributed to an improvement in whole-school attendance for the full academic year. Indeed, WPS achieved the required attendance target of 96% for the first time in four years in 2018/19. Data in Table 7 also reveal the success of the Monday Matters initiative on the full-year attendance data. Monday AM and PM sessions showed the largest increase in attendance, with increases of 1.8 and 2.1 percentage points respectively. As a result, Mondays no longer have the worst-performing AM and PM sessions, and the focus now shifts towards Fridays, where the underlying reasons for poor attendance may be quite different. Unlike Monday absenteeism, which is influenced to some extent by separation anxiety, Friday absenteeism may be more influenced by school withdrawal where parents may: (1) want to extend the weekend or start holidays earlier to beat the rush and/or save on costs, and (2) sometimes assume that Fridays are typically low-value school days in which limited learning takes place and hence pursue other activities outside school [21,23]. Hence, the action plan to tackle Friday absenteeism must be geared more towards school withdrawal, as opposed to the Monday Matters initiative, which was focused on tackling both school refusal and school withdrawal.
One argument that parents do make on Friday absence is that their child(ren) have excellent attendance in all other sessions and that these occasional absences should not impact the child and the school. While it is well-documented that every absence impacts pupil learning, the question arose of whether Friday sessions have now become the most impactful sessions to overall absence [1,4,21]. In line with this, the analysis detailed in Sections 4.2.2 and 5.1.1 was conducted on the 2018/19 dataset. It was clear from the results in Table 8 that Friday is now the most impactful day to overall below-required attendance, with Fri-AM being the most impactful session. For the first time in the four academic years analysed, Mon-AM is no longer the most impactful session to overall below-required attendance.

Persistent Absenteeism
The impacts of initiatives I1 and I2 on persistent absenteeism (attendance < 90%) were also analysed, with the results presented in Table 9. Persistent absenteeism at WPS has been significantly higher than the national average for at least the last three years, despite regular and close monitoring by the school's leadership team (including governors) and the school's attendance officer. However, the level of persistent absenteeism decreased significantly in 2018/19 and was lower than the national average for persistent absenteeism of 8.2%. This is a significant improvement and is consistent with previous studies that sought to tackle the problem of chronic (persistent) absenteeism, in particular [1]. Indeed, some of the approaches for tackling persistent absenteeism discussed in [1] have been leveraged in the development of I1 and I2, including the concept of making rewards more frequent and meaningful.
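Using the definition above (attendance below 90%), the persistent absenteeism rate for a cohort can be computed as in this minimal Python sketch. The function name and the pupil percentages are hypothetical and are not the values behind Table 9.

```python
def persistent_absence_rate(attendance_pct, threshold=90.0):
    """Return the percentage of pupils whose attendance is strictly
    below `threshold` (persistent absentees)."""
    if not attendance_pct:
        return 0.0
    persistent = sum(1 for p in attendance_pct if p < threshold)
    return 100.0 * persistent / len(attendance_pct)


# Hypothetical full-year attendance percentages for ten pupils.
example_cohort = [98.5, 89.0, 92.0, 96.0, 85.5, 99.0, 90.0, 97.0, 94.0, 100.0]
example_rate = persistent_absence_rate(example_cohort)
print(example_rate)
```

A pupil at exactly 90% is not counted here because the comparison is strict, matching the "attendance < 90%" definition used in this section.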

Statistical Testing of the Improvements in School Attendance
Statistical testing was conducted using the approach outlined in Section 4.2.5 and the data presented in Table 7. For H_0^(1), the Kruskal-Wallis test showed that there was no statistically significant difference in attendance, H = 2.61, p > 0.01, hence we accept the null hypothesis, while for H_0^(2), the Kruskal-Wallis test showed that there was a statistically significant difference in attendance, H = 21.46, p < 0.01, hence we reject the null hypothesis. Based on this, we thus accept H_0^(1) and reject H_0^(2) to conclude that the initiatives in 2018/19 had a positive impact on overall attendance.
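For readers unfamiliar with the Kruskal-Wallis test, the following Python sketch shows how the H statistic and an approximate p-value can be computed using only the standard library. It is an illustration under stated assumptions, not the procedure used to produce the figures above: it uses average ranks for ties with the usual tie correction, and approximates the p-value with a chi-square distribution on k - 1 degrees of freedom. The function names and the example groups are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df,
    built up from the closed forms for df = 1 and df = 2."""
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(*groups):
    """Return (H, p) for two or more groups of observations."""
    data = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(data)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and data[j][0] == data[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[data[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:  # all observations identical
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical attendance figures for three years (not the Table 7 data).
example_h, example_p = kruskal_wallis([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(example_h, example_p)
```

The null hypothesis of no difference between groups is rejected at the 0.01 level only when the returned p-value is below 0.01. A small H therefore corresponds to a large p-value and to retaining the null hypothesis.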

Conclusions
The mt model, described in Equation (3) and detailed in [16], was adapted to improve school attendance at WPS. The algorithm detailed in Section 3.2.4, which included the mt model, was used to identify the school session that was most impactful to overall below-required average attendance. In line with this, the previous three years' attendance data from WPS were analysed and it was found that the Monday AM session was consistently the most impactful session to overall below-required average attendance. Two initiatives were carried out at WPS that were based on approaches in previous studies and the collective wisdom of WPS leadership and staff [1,2,20]. Initiative I1 provided more frequent and meaningful rewards for full attendance, while I2 focussed on improving Monday attendance through the use of themes that were known to be exciting for the pupils.
Both I1 and I2 resulted in a significant improvement in attendance at WPS, with attendance in 2018/19 being at its highest over the past four academic years. Overall average attendance for the 2018/19 academic year was at the required target of 96%, whilst the combined Spring and Summer term attendance was higher at 96.2%. Monday attendance during the Summer term also improved significantly from an average and range perspective. The average Summer term Monday attendance in 2018/19 was significantly higher than in the three previous years at 96.5%, while its range was significantly lower at 2.8%, implying that attendance on Mondays was consistently better throughout the term.
Analysis of the 2018/19 data using the mt model revealed that Monday AM was no longer the most impactful session to overall below-required attendance; instead, it is now Friday AM. The underlying dynamics as to why this is the case may also include a shift away from school refusal and more towards school withdrawal (parentally condoned absence), which is underpinned by a variety of reasons including cheaper holidays [20,23]. Addressing this is considered to be part of future work and is detailed in Section 6.2.

Theoretical Implications
The proposed approach, which includes the mt model underpinned by well-grounded theory and concepts in tackling absenteeism as detailed in [1,26], provides a novel, simple, yet effective way to tackle the well-known problem of absenteeism in schools. The implementation of two easy-to-action initiatives has demonstrated a significant improvement in attendance. This study also contributes to the body of knowledge on MBA, in particular, its use in a wide range of sectors including retail, medicine, and now education [16,32].

Practical Implications
The proposed algorithm detailed in Section 3.2.4 enables schools to easily identify and tackle issues around pupil attendance. This study considered the impact of sessions on attendance, but this approach could be extended to identify other factors impacting attendance including the impact of subjects or topics being taught and the impact of pupil demographics.
The algorithm can also be used to identify other issues at schools, for example, factors impacting pupil progress. These factors (which may include attendance, demographics, attentiveness in class, and completion of homework) could be quantified using a simple 1 to 5 ranking scale and analysed using the mt model to identify and rank the impact of these factors on pupil progress. Whilst this may be seen to be similar to the work in [17], this approach will add further value by quantifying the impact of each factor on overall progress, as opposed to only ranking their association.

Future Work
Future work has been divided into two parts, namely future work for the school and future work for the authors.

Future Work for the School
The school will continue to use the model to keep attendance above the required target. Both the I1 and I2 initiatives are planned for the 2019/20 academic year. At the same time, the school should consider tackling Friday absenteeism, given that Friday is now its new problematic school day. Given that the dynamics may be slightly different, as outlined in Section 5.3, the school should explore a new series of initiatives, perhaps entitled "Fun-d-mental" Fridays, where the focus is still on fun and excitement but also includes the "mental" aspect, which emphasises the need for pupils and parents to treat Friday as an essential learning day. Furthermore, the play on the word "fundamental" also emphasises that Fridays are a key part of overall learning (fundamental to learning), as they usually involve a consolidation of the week's work, where the various concepts and pieces of work that pupils have learned during the week are brought together to both evaluate pupils' learning and demonstrate (to them) how all the learning fits together. It should be noted that schools already use Fridays in this way, for example: "Big Write" or "Cold Write" to consolidate the week's writing activities, as well as arithmetic testing to assess pupils' learning and ability to apply the mathematical concepts learned during the week [20].

Future Work for the Authors
The authors have realised, through this study, that school leaders and staff have predominantly been trained in the pedagogical aspects of education and thus do not possess advanced skills in analytics. In light of this, the authors will investigate automating the proposed approach and incorporating it, together with a graphical user interface with customisable analytical fields, into a software program so that school practitioners can benefit from the use of the model across a variety of school fields (e.g., attendance, progress, behaviour, etc.) without the need to conduct detailed programming and data mining by themselves. The authors will also consider rolling out the approach and software program to other schools so that the benefits and lessons learned at WPS can be shared and maximised.
Author Contributions: All authors made significant contributions throughout this piece of research and agreed to submit the manuscript in the current form. The first author made a major contribution in terms of writing and implementing software. All the authors contributed in terms of conceptualisation, writing, and revising the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Acknowledgments:
The authors would like to thank the leadership and staff of Willen Primary School for permitting us to use their data and for their efforts in supporting this study, in particular, Ms Emma Warner (attendance officer), Ms Carrie Matthews (headteacher), and Ms Sarah Orr (deputy headteacher).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Data Tables
Detailed data tables are provided that include the school roll, population size for this study, attendance data, and the number of extracted rules.

Appendix A.1. School Roll and Study Population Size
The school roll and study population size are presented in Table A1. Note that children who joined the roll during the school year were removed from the study population to prevent skewed data, as discussed in Section 4.2.2. Children in the reception year join the school's attendance roll once they turn 5 years old, which almost always occurs after the start of the school year.