Understanding Student Behavior in a Flipped Classroom: Interpreting Learning Analytics Data in the Veterinary Pre-Clinical Sciences

Abstract: The flipped classroom has been increasingly employed as a pedagogical strategy in the higher education classroom. This approach commonly involves pre-class learning activities that are delivered online through learning management systems that collect learning analytics data on student access patterns. This study sought to utilize learning analytics data to understand student learning behavior in a flipped classroom. Three key parameters were analyzed: the number of online study sessions for each individual student, the size of the sessions (number of topics covered), and the first time students accessed their materials relative to the relevant class date. The relationship between these parameters and academic performance was also explored. The study revealed that patterns of student access changed throughout the course period, and that most students did access their study materials before the relevant classroom session. A k-means clustering analysis showed that consistent early access to learning materials was associated with improved academic performance in this context. Insights derived from this study informed iterative improvements to the learning design of the course. Similar analyses could be applied to other higher education learning contexts as a feedback tool for educators seeking to improve the online learning experience of their students.
This study yielded several interesting insights into the learning behavior of students in our flipped classroom. We found that patterns of access do change over time across the course, associated both with the nature of the material (degree of difficulty) and the timing of assessments. Most students in this context did access their study materials before their classroom session, and our clustering analysis demonstrated that consistent early access to online learning materials was associated with improved learning outcomes.
While the specific outcomes described here may not be translatable to all learning contexts in higher education, the study demonstrates how educators can undertake a basic analysis of their own learning analytics data to derive meaningful outcomes. As a result of the outcomes of this study, the educators created new targeted online resources for specific topics, and communicated to students (using this evidence base) that consistently accessing their learning materials early will assist them to maximize their academic performance in the subject. In this way, learning analytics data has been utilized to inform iterative improvements in the design and delivery of the course.


Introduction
The flipped classroom delivery mode has been increasingly reported in the higher education literature over the past decade [1][2][3]. In this delivery mode, students are provided with learning materials before attending class, and are required to undertake preparatory study prior to attendance [4]. Classroom time is allocated to engaging students in synchronous active learning activities. In higher education, these active learning activities have particularly been used as an opportunity for students to apply their knowledge to solve relevant disciplinary problems [5]. This student-centered pedagogical approach fosters student engagement in independent learning and offers greater flexibility and personalization of student learning.
Studies of flipped classroom implementation in higher education have focused on student satisfaction as a primary outcome measure [1,3]. These studies have consistently reported increased student satisfaction, with some also reporting increased academic performance and improved attendance when compared to traditional didactic delivery modes [6][7][8][9][10]. It is surmised that improvements in academic performance are due to heightened engagement in active learning activities during class time; however, few studies have robustly evaluated the student learning behaviors which may lead to improved academic outcomes. The delivery mode does present some new challenges to the learning experience of students. It places a greater responsibility on the learner to undertake the preparatory study prior to attending class, requiring self-regulated learning skills [11]. Variable student engagement with pre-class learning materials leads to variability in student preparedness, and hence in students' ability to actively engage with the class activities. Videos are the most commonly reported form of pre-class learning activity [3]; however, a range of other types of learning resources can be utilized. Moreover, the quality and design of these resources (including videos) may significantly impact student engagement. Therefore, the design of pre-class activities is critical to the success of the flipped classroom model, but many higher education teachers are not experienced in the optimal design of online learning materials.
Learning content provided online as pre-class materials is often delivered through an institutional learning management system (LMS). These systems provide a platform for distributing learning materials to students, and many also collect data on the access behavior of students. This provides an opportunity to undertake learning analytics, which has been defined as, "The measurement, collection, analysis, and reporting of data about learners and their contexts, for the purposes of understanding and optimizing learning and the environments in which it occurs" by the first International Conference on Learning Analytics [12]. Through the study of learning analytics, investigators aim to gain insights into the effectiveness of different elements of the online learning design, and how this design impacts on student behavior in interacting with the materials [13][14][15]. Additionally, the data provides an opportunity to evaluate patterns in behavior of individual students or groups of students. This can provide opportunities for educators to better understand how their students learn using online materials and to identify and assist at-risk students [16][17][18].
A range of variables derived from LMS data have been used by investigators to understand student learning behavior, including frequency of log-in, assignment submission time, and access to specific learning materials [19][20][21][22][23]. More recently, patterns of LMS usage in the flipped classroom have been reported [24][25][26][27][28][29]. AlJarrah et al. [25] found that high performing computer science students in a flipped classroom demonstrated more frequent access to course materials overall compared to medium and low performing students. A study of pre-clinical medical students studying histology in a flipped classroom [26] evaluated engagement with a single online module and showed that 32% of students did not access the online module before the classroom session. That study questioned whether students in a busy curriculum are able to prepare effectively, and suggested that this variable preparation may have led to variability in the experience of small group learning in class. Other recent studies have utilized data clustering approaches to describe student behaviors. Jovanovich et al. [27] found clusters of online student behavior that were consistent with profiles of student learning strategies, and found an association between learning strategy and course performance. Studies of this type have highlighted the importance of the contextual nature of student learning situated within online and classroom learning experiences. For this reason, further evidence across a range of learning contexts is required to make generalizable conclusions.
The objective of this study was to examine patterns of students' behavior in interacting with online learning materials in a flipped classroom delivery mode, and to evaluate whether these patterns of behavior were linked to academic outcomes. The study evaluates key parameters within available LMS data, and details the data analysis methods used to produce meaningful visualizations for educators. The study is set within a pre-clinical veterinary medicine classroom. The outcomes from this study will assist educators to understand how they could evaluate the online learning behavior of students in their flipped classroom settings, to inform iterative improvements to learning design and pedagogical approaches.

Materials and Methods
This study was approved by a University of Melbourne Human Research Ethics committee, project number 1648462.1.

Course Design and Classroom Context
The context for this study is a cohort of students studying a pre-veterinary sciences course titled 'Foundations of Animal Health' which was delivered across a 12-week study period, and has been previously described [30]. This course covers a range of foundational topics including animal nutrition, animal toxicology, animal thermoregulation, and animal behavior. This course is presented in a flipped classroom delivery mode, with students receiving weekly learning materials presented online as approximately eight short (5-15 min) video segments together with approximately six complementary online learning activities. This online content was released week by week during the course, with students gaining access to the materials five days before the relevant live class. These online components of the course were delivered to students through the institutional learning management system (LMS) Blackboard Learn™ (2009, Blackboard Inc., Washington, DC, USA).
The live classes were structured as small group case-based learning activities that engaged students working in groups of five to seven on applied problems of relevance to the weekly online materials. The sessions were facilitated by instructors, including the educator who presented in the online videos available for that week. At the end of each week, an interactive face-to-face feedback session was held, where the lecturer provided feedback on the learning activities of the week and answered any student questions. Table 1 provides a summary of the weekly cadence in this flipped classroom.

Table 1. Summary of the weekly cadence in the flipped classroom.
Pre-Class Learning | Class Day | Feedback Session
Series of short videos, with short online learning activities (available four days before class) | Small group case-based learning tasks | Interactive large group question and answer session facilitated by staff

The assessment for the course consisted of three intra-semester quizzes and one final examination. The first of the intra-semester quizzes, held at the end of Week 3, was intended purely for formative feedback and the result did not contribute to the final grade for the course. The subsequent two intra-semester quizzes (held at the end of Weeks 6 and 10) each contributed 15% towards the final student grade. These quizzes consisted of both multiple-choice style questions and short written answer style questions. The final examination was weighted 70% and also consisted of multiple-choice style questions and short written answer questions. This examination covered material from across all 12 weeks of the course.

Data Collection and Description
Learning analytics data was retrospectively collected from the institutional LMS for analysis. All students studying the course (228 in total) were included in the analysis. The data comprised four main parameters, as shown in Table 2. Firstly, each entry identifies the student through their unique username (Student) and the material, out of the 12 weeks, that was being studied (Item). The time at which the material was accessed is also recorded (Access Time). Finally, the parameter SessionID provides a unique identifying number for the student's study session, which can be used to detect cases where the student was studying materials from multiple weeks simultaneously. In addition to the LMS data, the (numerical) grades the students achieved in the subject were also collected at the end of the semester.

Data Engineering
The data first needed to be examined and engineered into an appropriate and meaningful form. This step typically involves tasks such as handling missing data and identifying potentially incorrectly recorded entries. For this study, the handful of missing or inconsistent entries were removed from the dataset. Another important aspect of data engineering is to determine whether the available data satisfy the requirements of the analysis or need to be further manipulated. A significant shortcoming of the available dataset is that it contains no information on the duration of the students' study sessions. As a result, a new entry is created whenever, for example, a student refreshes their webpage, so the dataset contains multiple entries with the same student and item information but with timestamps that differ by perhaps a few minutes. In the context of this problem, such entries should not be treated as different study sessions, since outside factors such as internet issues or accidental page refreshes would generate multiple entries that provide little valuable information. Therefore, to provide a more accurate representation of student study behavior, data entries within 6-h periods were combined into a single study session.
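The merging step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record layout (student, item, access time), the function name merge_entries, and the sample log are hypothetical, and anchoring the 6-h window at the first entry of each session is an assumption, since the text does not state how the window was applied.

```python
from datetime import datetime, timedelta

# The 6-hour window comes from the text; how it is anchored is an assumption.
SESSION_WINDOW = timedelta(hours=6)


def merge_entries(entries, window=SESSION_WINDOW):
    """Collapse repeated (student, item) log entries into study sessions.

    An entry starts a new session only if it falls at least `window`
    after the start of that student's current session on the same item.
    """
    sessions = []
    session_start = {}  # (student, item) -> start time of current session
    for student, item, ts in sorted(entries):
        key = (student, item)
        start = session_start.get(key)
        if start is None or ts - start >= window:
            session_start[key] = ts
            sessions.append((student, item, ts))
    return sessions


# Made-up log entries for illustration only.
log = [
    ("s001", "Week 1", datetime(2020, 3, 2, 9, 0)),
    ("s001", "Week 1", datetime(2020, 3, 2, 9, 4)),    # e.g. a page refresh
    ("s001", "Week 1", datetime(2020, 3, 2, 18, 30)),  # more than 6 h later
    ("s002", "Week 1", datetime(2020, 3, 3, 20, 15)),
]
merged = merge_entries(log)
print(len(merged))
```

A gap-based rule (start a new session after 6 h of inactivity) would be an equally plausible reading and would only change the comparison against the previous entry rather than the session start.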

Data Features/Characteristics
Three key parameters were identified for further analysis (as summarized in Table 3). The first is the number of study sessions each student had, which is a measure of how many times the student accessed the study materials. The second is the size of the sessions, meaning how many topics the student was studying in each session. This feature identifies the style of studying, either sequential (week by week, one topic at a time) or parallel (studying multiple topics at the same time). The third, and a particularly important feature of this dataset, is the first time of access (FTOA), i.e., when the student first accessed each week's material. The learning materials for each week are made available to students in advance of the classroom sessions, and there is an expectation that students have reviewed these materials before class. FTOA is treated as a categorical parameter, with the student having accessed the material "Before the class day", "On the class day", "After the class day", or "Never". To incorporate this feature in numerical analyses, these categories are given a score of −1, 0, 1, and 2, respectively, meaning that the lower a student's score, the more consistently the student accessed learning materials before the relevant class session.
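As an illustration of the FTOA scoring, the sketch below maps each week's first access to the scores given above (−1, 0, 1, 2) and averages them over the weeks. The function names and the example dates are hypothetical and are not taken from the study data.

```python
from datetime import date

# Scores as given in the text; the dictionary keys are shorthand labels.
FTOA_SCORES = {"before": -1, "on": 0, "after": 1, "never": 2}


def ftoa_score(first_access, class_day):
    """Score one week's first time of access relative to the class day."""
    if first_access is None:  # material never accessed
        return FTOA_SCORES["never"]
    if first_access < class_day:
        return FTOA_SCORES["before"]
    if first_access == class_day:
        return FTOA_SCORES["on"]
    return FTOA_SCORES["after"]


def mean_ftoa(first_accesses, class_days):
    """Average the weekly FTOA scores for one student."""
    scores = [ftoa_score(a, c) for a, c in zip(first_accesses, class_days)]
    return sum(scores) / len(scores)


# Made-up example: four weeks for a single student.
class_days = [date(2020, 3, 5), date(2020, 3, 12),
              date(2020, 3, 19), date(2020, 3, 26)]
accesses = [date(2020, 3, 3), date(2020, 3, 12), date(2020, 3, 21), None]

weekly = [ftoa_score(a, c) for a, c in zip(accesses, class_days)]
average = mean_ftoa(accesses, class_days)
print(weekly, average)
```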

Data Analysis
Two layers of data analysis are performed in this study. In the first, the analysis uses visualization to gain a better understanding of the student behavior. This includes, for example, the patterns with which the students access the available online material (frequency, consistency, one topic at a time or multiple topics simultaneously, etc.), and how these changed throughout the semester, including in relation to the intra-semester assessments. The second layer of the analysis aims to identify whether, and to what degree, these patterns of student behavior affected academic performance (measured by final grade). To undertake this analysis, an unsupervised machine learning technique, specifically k-means clustering [31,32], is used to group students of similar behavior together and identify each group's characteristics, for example the average grade. A study comparing clustering algorithms for learning analytics in similar datasets concluded that k-means was amongst the best performing partition algorithms overall [33]. The students are grouped using the selected features, which are parameters derived from the available data that characterize the students' online behavior.

Visualizing the Data
Since the data are characterized by three identified attributes/features, the initial step was to gain an understanding of the distribution of the data for these attributes. Each of the features contains daily or weekly data throughout the semester; however, to create a meaningful plot, either the average or the sum of these values over the semester has been utilized (for example, a student's average FTOA score is the average of their FTOA scores for each week/material). Figure 1 presents this information in the form of a histogram for each feature, showing that the FTOA and the size of sessions are relatively non-skewed, meaning that the values for most students fall around the middle of the range, while the number of sessions is more skewed to the right, indicating that a small number of students undertook considerably more sessions than the majority (in this case up to more than double the number of sessions).
Educ. Sci. 2020, 10, x FOR PEER REVIEW 5 of 14

Figure 2 shows the total number of sessions daily over the course of the semester, indicating the day when new material is introduced (Week 1, 2, etc.), as well as the days of the three quizzes (Quiz 1 not contributing to the final mark) and the exam. The data show that generally the days where new material was introduced resulted in higher student activity as did the quizzes and exam, with Quiz 3 resulting in the highest student activity. It is also interesting to note that higher activity is identified in the second half of the semester, after the Easter Holidays.
Further visualization of the data, more specific to each of the three identified features, can be seen in Figure 3. Firstly, Figure 3a shows the FTOA for each of the 12 different class materials, with different colors representing whether students accessed the material before the day of the lecture, on the same day as the lecture, after the day of the lecture, or never. Most students accessed the material either before or on the day of the lecture, with only Weeks 7, 10, and 11 breaking the pattern. Figure 3b shows the total number of sessions for each class, showing distinctly that the materials delivered in teaching Weeks 2, 4, and 6 were accessed most often. Lastly, Figure 3c shows the average size of the session over the semester, which followed a relatively linear increase. This means that as more material became available to the students, on average they tended to work concurrently on multiple weeks' materials.
The next phase of the analysis investigated the relationship between these identified attributes and the performance of the students. A simple approach to this question is to investigate the correlation between these features and the final grade received on an individual student basis, using the average or sum of values for the three features over the 12 teaching weeks as explained earlier in this section. The resulting plots can be seen in Figure 4 for each of the three features, indicating the correlation coefficient with the final grade. The results show that on this individual basis there exists no correlation between the separate features and the final grade. Because many different variables can affect each student, relating one particular feature (for example, how many sessions they initiated in total) to the final grade provides no clear insight into how that feature affects individual student academic performance.
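A per-student correlation check of this kind could be sketched as below. The text does not specify which correlation coefficient was used, so Pearson's r is assumed here; the function name and the sample values are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up aggregates: total sessions and final grade for five students.
total_sessions = [80, 95, 120, 150, 200]
final_grades = [71, 58, 77, 64, 69]
r = pearson_r(total_sessions, final_grades)
print(round(r, 3))
```

The same function would be applied once per feature (average FTOA, total sessions, average session size) against the final grade.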

Clustering the Data into Student Behavioral Groups
As has been shown, investigating student behavior and the resulting performance at an individual level yields no clear insights, likely due to the high variance and large number of parameters affecting this performance. An alternative approach to modeling student behavior is to group students together based on similar behavioral patterns. To achieve this, clustering is used to identify groups of students with similar behavior, in this case using the k-means clustering algorithm, as mentioned in Section 2.5. As an initial approach, each of the three features is used separately to perform the clustering, utilizing the score for each week per student instead of an aggregate (average/sum) as previously. This means that, for example, when using the number of sessions as a feature, the data entry for each student consists of 12 values, each being the sum of sessions per teaching week. It is also important to note that when clustering is performed, the number of clusters or groups needs to be specified beforehand. For this study, between four and six are considered reasonable numbers of potential behavioral groups based on previous research [27]. Therefore, all three values (four, five, and six clusters) are used in the analysis, since using more than one fixed value reduces the margin for error. The results for the clustering based on each of the three features can be observed in Figure 5, where the clusters have been plotted based on the aggregate feature value of each group, with indicative trendlines and regression coefficient values. The values over each cluster represent the number of students within that cluster.
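To make the per-feature clustering concrete, the sketch below runs a plain k-means (Lloyd's algorithm) on 12-value weekly vectors and then reports the mean grade of each cluster. It is a simplified stand-in, not the authors' code: the initialization by random sampling, the function names, the toy data, and the use of k = 2 (chosen only because the toy set has six students, whereas the study used four to six clusters) are all assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(rows, k, iterations=100, seed=0):
    """Plain k-means: returns a cluster label per row and the centroids."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_vector(members)
    return labels, centroids


# Made-up weekly FTOA scores (12 teaching weeks) for six students.
weekly_ftoa = [
    [-1] * 12,
    [-1] * 10 + [0, 0],
    [-1] * 8 + [0] * 4,
    [1] * 12,
    [1] * 10 + [2, 2],
    [1] * 8 + [2] * 4,
]
grades = [78, 74, 70, 62, 58, 54]  # made-up final grades

labels, _ = kmeans(weekly_ftoa, k=2)
cluster_grades = {}
for label, grade in zip(labels, grades):
    cluster_grades.setdefault(label, []).append(grade)
mean_grade = {lab: sum(g) / len(g) for lab, g in cluster_grades.items()}
print(mean_grade)
```

In practice a library implementation with multiple restarts would usually be preferred, since k-means results depend on the initial centroids.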
Figure 5. Clustering results when using each of the three parameters separately for three different numbers of clusters/groupings. Each plot shows trendlines based on the average/sum value for each cluster, and the number above each cluster represents the number of students within that cluster.
It can be seen that the FTOA is an excellent indicator of student performance at the cluster level, with the cluster average values following a linear trend. The results suggest that the lower the value of FTOA, which indicates students who accessed material before the class day and were therefore likely prepared for the class, the better the final grade they received. On the other hand, high values of FTOA, which indicate students who accessed the material mostly after the lecture days, were associated with poorer overall results. This observation is consistent regardless of the number of clusters, with R² > 0.9. It should be clarified that this pattern is not obvious when looking at the students at an individual level (Figure 4) because of the variability. Each cluster represents a similar behavior between students, who achieved different overall results. What is shown here are the aggregates of the results within each cluster, which follow a pattern.
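The cluster-level trendline and its R² could be computed along the following lines. This is a sketch using ordinary least squares on cluster averages; the fitting method is an assumption (the text only reports trendlines and regression coefficients), and the cluster values below are invented for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Made-up cluster aggregates: mean FTOA and mean final grade per cluster.
cluster_ftoa = [-0.8, -0.4, 0.0, 0.4, 0.8]
cluster_grade = [76, 72, 69, 64, 61]

slope, intercept, r_squared = linear_fit(cluster_ftoa, cluster_grade)
print(round(slope, 2), round(intercept, 2), round(r_squared, 3))
```

A negative slope here corresponds to the pattern described above: clusters with lower average FTOA (earlier access) have higher average grades.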
The second feature utilized is the number of sessions the students initiated, indicating how many times they logged into the online system to view the material. The results for this feature show a non-linear relationship with some interesting observations. Looking at the lower end of the x-axis, a relatively linear relationship can be seen for the first few clusters in each case, while for the higher number of sessions the trendline tends to become flatter.
Lastly, the size of the sessions is investigated, which indicates how many different classes (each class being the material presented in one of the 12 teaching weeks) the student was studying simultaneously (in one study session). The results show an interesting pattern that follows a bell curve and suggests that the students with values around the middle of the x-axis for the size of the sessions performed the best.
We then investigated the relationships between the three features to observe how these potentially influence the observed clustering results. These can be seen in Figure 6, which shows no significant correlation between any of the features, with only a small correlation between FTOA and session number. However, a closer look reveals some interesting observations. In Figure 6a, for low values of the session number the average FTOA also tends to be high, indicating that students who did not study in many sessions also did not generally study on time; based on the clustering results, both behaviors indicate poor performance. Prompt students, with a negative FTOA value, had a range of session numbers. Moreover, Figure 6c shows that prompt students mostly had a score for the session number in the middle of the range (between 2 and 3) which, based on the clustering results, indicates students who performed well in the subject.

Combined Clustering
The final part of the analysis combines the three chosen features with the aim of identifying more general behavioral groups. To achieve this, each of the three features is categorized, and the results are combined into categories that encompass all three features. The categories were identified from the results in Figure 5, particularly the trendlines. For example, the bell curve of the average session size feature indicates that there exists a lower limit (~1.9) and an upper limit (~2.3) under/over which students generally perform less well; therefore, three categories are used. The logarithmic/decaying trend of the session number indicates a limit (~130) over which not much difference is observed in the results, and therefore the data are split into two categories for this feature. Finally, the FTOA feature follows a linear trend, meaning that any number of categories could be suitable. In this study three categories are identified, which can be interpreted as students being "mostly prepared" (FTOA < −0.5), "somewhat prepared" (−0.5 < FTOA ≤ 0.25), and "unprepared" (FTOA > 0.25).
The results of this analysis are presented in Figure 7, showing the created groups/clusters when combining all three features. The size of the circles represents the number of students within the cluster and the color the average final mark the students in the cluster received, as an indicator of performance. This analysis revealed that there are several combinations of features that described very few students, for example unprepared students who registered large numbers of sessions, or students who were mostly prepared with a small average session size. A large group of students were found to be somewhat prepared with a medium session size, and regardless of their total session number they performed above average in their final grade.
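The categorization described above can be illustrated with a short sketch. This is not the code used in the study: the function names, the category labels for session size and session number, and the tuple layout of the input are hypothetical, and the thresholds are the approximate values quoted in the text. The text leaves a student with FTOA exactly equal to −0.5 unassigned; the sketch assumes such a student is "somewhat prepared".

```python
from collections import defaultdict
from statistics import mean

# Approximate thresholds quoted in the text (read from the Figure 5 trendlines).
SIZE_LOW, SIZE_HIGH = 1.9, 2.3               # average session size limits
SESSIONS_LIMIT = 130                         # session number limit
FTOA_PREPARED, FTOA_UNPREPARED = -0.5, 0.25  # FTOA limits


def size_category(avg_size):
    """Three categories for average session size (labels are hypothetical)."""
    if avg_size < SIZE_LOW:
        return "small"
    if avg_size > SIZE_HIGH:
        return "large"
    return "medium"


def sessions_category(n_sessions):
    """Two categories for total session number (labels are hypothetical)."""
    return "high" if n_sessions > SESSIONS_LIMIT else "low"


def ftoa_category(ftoa):
    """Three preparedness categories; FTOA == -0.5 is assumed 'somewhat prepared'."""
    if ftoa < FTOA_PREPARED:
        return "mostly prepared"
    if ftoa <= FTOA_UNPREPARED:
        return "somewhat prepared"
    return "unprepared"


def combine(students):
    """Group students by their combined category.

    students: iterable of (avg_session_size, n_sessions, ftoa, final_mark).
    Returns {category_tuple: (number_of_students, average_final_mark)}.
    """
    groups = defaultdict(list)
    for avg_size, n_sessions, ftoa, mark in students:
        key = (ftoa_category(ftoa),
               sessions_category(n_sessions),
               size_category(avg_size))
        groups[key].append(mark)
    return {key: (len(marks), mean(marks)) for key, marks in groups.items()}
```

In a plot such as Figure 7, the first element of each returned pair would correspond to the circle size and the second to the circle color.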

Discussion
This study sought to utilize learning analytics data to better understand the learning behavior of students in a flipped classroom. Using three key parameters, this exploratory study investigated how the time of first access to online study materials, the number of online study sessions, and the size of study sessions (number of topics accessed) related to the academic outcomes (final course grade) achieved by students. Our initial analyses visualized the distribution of study sessions across the semester (Figure 2). This yielded some expected results, with peaks of activity occurring in association with summative assessments (Quizzes 2 and 3, and the final examination). There was an overall trend towards increasing activity in the second half of the semester, which may also indicate that students were motivated by assessment when accessing their learning materials as they approached the final exam. Non-uniform patterns of access to online learning resources, with a tendency for increased activity temporally associated with assessments, have been reported previously in medical education [34] and in a flipped classroom management course [35]. An analysis of student access to online preparatory materials in a flipped classroom in engineering also revealed that students employed assessment-motivated learning strategies [27]. Further investigations, including student interviews or focus groups, would be needed to better understand the learning strategies motivating our students' extra study sessions at these peak times, and whether these strategies lead to improved performance.
The flipped classroom delivery mode requires students to access their learning materials and undertake preparatory study before they attend class. For this reason, the time of first access (FTOA) of study materials was an important parameter in this learning context. As shown in Figure 3a, most students accessed learning materials for the first time either before or on the day of class throughout the semester. This is consistent with other reported studies of the flipped classroom, which have found that the majority of students do access online materials before class [26,36,37]. Exceptions were Weeks 7, 10, and 11, where most students accessed their learning materials after the class day.
Week 7 occurred immediately after Quiz 2 and an associated peak in activity accessing previous learning materials. It may be that after the quiz, students turned their attention to other study activities and neglected to prepare for this class. Another dip in frequency of access before class occurred in Week 10 and persisted into Week 11. Week 10 also showed a peak in total sessions (Figure 2), so this observation could also be linked to assessment-driven behaviors, as the final quiz occurred in Week 10. Students may have spent an increased number of study sessions revising previous materials rather than maintaining preparedness for the immediate class. These observations have important implications for educators. In particular, the dip in preparedness immediately after the scheduled quiz sessions could impact significantly on student learning in those specific classes. This would be a pertinent time to remind students of the importance of preparedness, and to facilitate their return to normal study habits as quickly as possible following the assessment.
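To make the first-access measure discussed above concrete, the following is a minimal sketch of how such a value could be derived from access timestamps. It is not the study's code. It assumes that FTOA is measured in whole calendar days relative to the class date (negative values meaning access before the class day, zero meaning access on the day of class) and that a student's overall FTOA is the average across classes; the function names and input layout are hypothetical.

```python
from datetime import date, datetime
from statistics import mean


def ftoa_days(access_times, class_date):
    """Signed offset, in whole calendar days, of the earliest access from the class date.

    Negative: first access before the class day; 0: on the class day;
    positive: after the class day. Assumes access_times is non-empty.
    """
    first_access = min(access_times)
    return (first_access.date() - class_date).days


def average_ftoa(classes):
    """Average FTOA for one student.

    classes: list of (access_times, class_date) pairs, one per class session.
    """
    return mean(ftoa_days(times, day) for times, day in classes)
```

A student with no recorded access for a class would need separate handling, since the sketch assumes at least one access record per class.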
The distribution of the number of online learning sessions for each week of study (Figure 3b) and the average size of study sessions across the course period (Figure 3c) also have important implications for educators. In our study, there were significantly more online study sessions for Weeks 2, 4, and 6. Previous studies investigating the frequency of video-watching in flipped classrooms have concluded that repeated viewing indicated increased difficulty of the study material [26,38]. In the context of this study, Weeks 2, 4, and 6 covered applied nutrition topics in the course, which are areas that students have found more challenging based on assessment performance. Therefore, these learning analytics data point to the likelihood that increased sessions may be associated with increased difficulty of the study material. Educators may therefore use the data to identify areas in the curriculum where students could benefit from additional supportive resources. Although we expected that the number of different study topics in a session (size of study sessions) would increase as the course progressed, it was surprising that this increased to between 3 and 4 by the final quarter of the study period. By studying these multiple topics concurrently, students may have been drawing links or comparisons between topics, or refining their study notes across multiple topics. Further investigations would be needed to understand why students chose to study several topics concurrently, and to determine whether the design of the learning materials encouraged this behavior.
While we found that no single parameter displayed a simple correlation at the individual level with the final grades achieved by students (Figure 4), clustering of students based on behavior over the duration of the course did yield some interesting results. When between four and six clusters were created based on the FTOA parameter, the clusters demonstrated a strong relationship (R² > 0.9) with the average final grade (Figure 5). This finding suggests that in our learning context, consistent early access of learning materials (lower average FTOA) is associated with improved performance. To the best of the authors' knowledge, this is the first report of a link between consistent before-class access and improved outcomes in a flipped classroom context. In contrast, AlJarrah et al. [25] found that high and low performing students in a flipped classroom showed no difference in temporal access to material. It may be that our finding is highly context-specific, as it is particularly dependent on the nature of the relationship between the pre-class online materials and the classroom activities. Our findings occurred in the context of utilizing videos as the primary pre-class learning resource and small group case studies as active learning activities in the classroom, and results may differ where the flipped classroom learning design is substantially different. Familiarity with pre-class materials was critically required for active engagement in our classroom sessions, and higher performing students may have recognized this and regulated their study strategy accordingly. This information has been provided as guidance to future students studying the course who wish to maximize their academic performance.
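The cluster-level analysis described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: it runs a plain one-dimensional k-means (Lloyd's algorithm) on each student's average FTOA with a deterministic, quantile-based initialization chosen here for reproducibility, then computes R² for a straight-line fit between each cluster's mean FTOA and its mean final grade. All function names are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm in one dimension.

    Initial centers are taken at evenly spaced positions in the sorted data
    (an assumption made here so that the result is deterministic).
    Returns (labels, centers).
    """
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def r_squared(xs, ys):
    """Coefficient of determination of a simple linear fit of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def cluster_means(ftoa, grades, k):
    """Mean FTOA and mean final grade for each non-empty cluster."""
    labels, _ = kmeans_1d(ftoa, k)
    pairs = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            pairs.append((mean(ftoa[i] for i in idx), mean(grades[i] for i in idx)))
    return pairs
```

Calling `cluster_means` for k = 4, 5, and 6 and passing the resulting pairs to `r_squared` would mirror the range of cluster counts reported above. Because R² is computed on a handful of cluster averages rather than on individual students, a high value is compatible with the weak individual-level correlation noted in Figure 4.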
The clustering based on total sessions showed a pattern of diminishing returns as the number of sessions reached a threshold. This would suggest that, up to a threshold level, increasing the number of study sessions does improve performance, but beyond that threshold additional sessions do not confer further improvements. It has been previously reported that in an online learning environment, keeping pace with the class schedule ('pacing') and regularly accessing course materials improves academic achievement [39], and it is likely that these behaviors are reflected in the students with higher numbers of study sessions. Clustering based on average session size revealed a less significant relationship with academic performance, with values around the middle of the range associated with better performance. We observed that the cluster of students with an average session size over 2.5 performed more poorly than their peers, which may indicate that attempting to multitask across more than two different topics in a single study session could be detrimental to learning outcomes. It may also be that students with a high average session size had fewer study sessions overall, so the effect may be related to a reduced number of study sessions. This finding is again of interest to educators, who could utilize this understanding to provide guidance to students on study strategies as they progress through the course.
The effect of session size was further elucidated in the combined clustering results (Figure 7). A large cluster of mostly prepared students with a large number and size of sessions achieved a lower average final grade than a similar behavioral group with a medium session size. Further, it was interesting to observe that the best performing (albeit small) cluster in this analysis had a low number of total sessions, a medium session size, and were mostly prepared for class. This could indicate that higher performing students did not require repeated study sessions and did not study large numbers of topics concurrently within a study session. Future studies involving collection of qualitative data will be important to evaluate how student behavioral groups differentially utilized online learning materials to aid their learning.
This study is subject to several limitations, as the data available through the LMS had several important constraints. The data are generated by students clicking in the LMS to access their learning materials but do not provide information about how (or if) students used those resources as a study tool. The time stamp of access to materials indicated only the starting time of access and provided no information about the duration of the study session. We noted that in many cases there were multiple access records in very short timeframes, likely indicating students clicking in and out of materials or refreshing their browser. Because of this, we created a 6-hour timeframe for a single study session, and this decision framed the absolute values for the total number of sessions and the size of the sessions (number of topics per session). We acknowledge that some students may have actually undertaken multiple short sessions within that timeframe; however, we believe that this would not significantly alter the outcomes of the study other than increasing the absolute value for those parameters. Further, students had the opportunity to download their study materials, so some may have only accessed material once through the LMS but subsequently held multiple study sessions using downloaded materials. It is also possible that multiple students may have studied in groups from a single device, with only a single student logged in to the LMS and therefore registering a study session within the data. Because of these limitations we have made some key assumptions in our interpretation of the data, including that students were using the resources to actively engage in study at each session. Despite these limitations, some key relationships have been identified in the data which we believe are meaningful and actionable for educators. Further research, including mixed methods approaches, will be important to better understand how students are utilizing the resources as part of their study strategy.
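The 6-hour session rule described above can be illustrated with a minimal sketch. The exact rule is not specified beyond the 6-hour timeframe, so the sketch makes an assumption: a session opens at a student's first access record, and every record within 6 hours of that opening time belongs to the same session. A gap-based rule (a new session after 6 hours of inactivity) would be an equally plausible reading and could give different counts. Function names and the record layout are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_WINDOW = timedelta(hours=6)  # timeframe for a single study session


def sessionize(records):
    """Group one student's access records into study sessions.

    records: iterable of (timestamp, topic) pairs for a single student.
    Assumption: a session spans 6 hours from its first access record.
    Returns a list of dicts with the session start and the set of topics.
    """
    sessions = []
    for timestamp, topic in sorted(records):
        if sessions and timestamp - sessions[-1]["start"] < SESSION_WINDOW:
            # Repeated clicks or refreshes on one topic collapse into the set.
            sessions[-1]["topics"].add(topic)
        else:
            sessions.append({"start": timestamp, "topics": {topic}})
    return sessions


def summarize(records):
    """Return (total number of sessions, average session size in topics)."""
    sessions = sessionize(records)
    if not sessions:
        return 0, 0.0
    avg_size = sum(len(s["topics"]) for s in sessions) / len(sessions)
    return len(sessions), avg_size
```

Widening or narrowing `SESSION_WINDOW` changes the absolute session counts and sizes, which is the sensitivity acknowledged in the limitations above.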
This study yielded several interesting insights into the learning behavior of students in our flipped classroom. We found that patterns of access do change over time across the course, associated both with the nature of the material (degree of difficulty) and the timing of assessments. Most students in this context did access their study materials before their classroom session, and our clustering analysis demonstrated that consistent early access to online learning materials was associated with improved learning outcomes. While the specific outcomes described here may not be translatable to all learning contexts in higher education, the study demonstrates how educators can undertake a basic analysis of their own learning analytics data to derive meaningful outcomes. As a result of the outcomes of this study, the educators created new targeted online resources for specific topics, and communicated to students (using this evidence base) that consistently accessing their learning materials early will assist them to maximize their academic performance in the subject. In this way, learning analytics data has been utilized to inform iterative improvements in the design and delivery of the course.