Online and In-Class Evaluation of a Music Theory E-Learning Platform

Abstract: This paper presents a new version and a three-month evaluation of the Troubadour platform, an open-source music theory ear training platform. Through interviews with teachers, we gathered the most-needed features that would aid their use of the platform. In the new version of the Troubadour platform, we implemented different types of interaction, including class management, recurring homework and challenges. Previous research has shown a significant improvement in the students' performance while using the platform. However, given the short time span of the previous experiments, it remained unclear whether these results can be attributed to the novelty bias. To evaluate the efficacy of the platform beyond its novelty bias, we performed a three-month-long evaluation experiment on the students' interaction through questionnaires and platform-collected data on their engagement. During the experiment, the students attended the school through online courses during the first part of the evaluation, and in-class in the second part. In this paper, we investigate the students' engagement during the three-month period, explore the influence of the platform's use in the in-class versus the online learning process, analyze the students' self-reports on their practice habits and compare them with the collected data. The results showed high student engagement during the lockdown period, while the in-class process showed a decrease in the platform's use, unveiling the students' need for such a platform as a complementary learning channel in remote learning.


Introduction
Modern teaching approaches across a variety of educational fields have been enhanced by gamification, both in real-world [1,2] and online environments [3,4]. When they first emerged about two decades ago, new online learning environments increased the accessibility of the learning process; however, this accessibility did not necessarily result in increased performance [5]. Nevertheless, the effects of gamification were positive both in terms of the learners' engagement and motivation [6,7], and consequently, their performance [8]. Moreover, online learning environments may be needed to prevent knowledge loss in future situations similar to the recent COVID-19 outbreak, where in-class learning is not available or permitted [9].
Gamified e-learning is also expanding into non-technical subjects, such as music training (e.g., [10]), with several million users. Gamification is included in various platforms, from mobile and web applications to hardware solutions [11] and virtual reality applications [12,13]. With the rise of e-learning's popularity at the beginning of the millennium, several solutions and approaches have been presented over the past two decades in music theory and ear training [14]. Nevertheless, music theory training solutions are not as popular as instrument training solutions. Moreover, formal music training still mostly employs conventional in-class techniques of music theory training, with a teacher giving either direct or delayed feedback on the student's work. In the last two years, the in-class activities have moved to online meeting platforms [15,16], such as Microsoft Teams and Zoom, further limiting the individual feedback by the teacher and exposing the need for online music theory training tools that would benefit the students in their training. Music theory training also differs significantly from other topics of music-related learning. Music theory exercises, such as melodic and rhythmic dictation, are more difficult to perform individually through homework without immediate feedback. On the other hand, these types of exercises could be automatically evaluated through an e-learning system. Tools aiding the music theory training could complement the existing in-class activities, while enabling students to practice theoretical skills individually at home.
The Troubadour platform (https://trubadur.si) was developed by Pesek et al. [17] to aid the music theory ear-training process. The Troubadour platform is an online educational platform for music theory training, publicly available both as an online service and as source code (Source code available at https://bitbucket.org/ul-fri-lgm/troubadour_production (accessed on 1 June 2022)). The platform employs basic elements of gamification to motivate users, such as direct feedback, scoring, levels, leaderboards, avatars, and badges. As a service that is complementary to the existing instrument and music-theory training applications, the main focus of the Troubadour platform is to aid students in ear training while following the progression of the curriculum according to the tempo of the individual student, teacher or class. In this aspect, the platform offers personalization techniques, which aid the users in their training. Ear training does not receive as much attention from developers and the target audience as the most popular instrument training applications. This difference in popularity is somewhat expected, as both instrument and theory training target a significantly larger audience of beginners and amateur or self-taught musicians, while ear training is more commonly practiced among the professionals in middle- and higher-education music institutions. Nevertheless, the ICT support for this specific aspect of music training is much needed and poses a significant opportunity to aid musicians in their professional musical training.
The first version of the Troubadour platform (named Trubadur (Available at: https://trubadur.si/ (accessed on 1 June 2022)) in Slovenian language) was developed and evaluated in collaboration with the Conservatory of Music and Ballet, Ljubljana, Slovenia. The conservatory represents one of the two intermediate-level music institutions attended by the students, who mostly pursue music-performance-related academic careers. While the platform focuses on students who already possess some music knowledge, students at the beginning of their academic enrollment (especially the first-year students) significantly differ in their level of music theory proficiency. Addressing the potential engagement of these students with an online platform for music theory learning, therefore, poses a challenge.
The platform's impact was evaluated in two short periods, analyzing the students' performance in interval [17] and rhythmic dictation exercises [18]. The analysis showed an increase in performance among the students who engaged with the platform. The students also reported positive feedback about the platform in both studies. Nevertheless, given the relatively short duration of both evaluation experiments, the results may partly reflect the novelty bias of the platform. Both previous studies also introduced novel forms of music training on the platform.
In this paper, we implemented additional interaction approaches into the platform, namely class management, homework and challenges. The administration module was extended to provide an insight for the teachers and now includes individual and aggregated statistics on the students' performance. In the described experiment, we collect and analyze students' engagement, interaction and direct feedback with the platform over a three-month evaluation period. The goal of this research is to evaluate the benefits and remaining challenges of the platform in an attempt to gather user feedback past the novelty bias and observe the dynamics of the students' engagement throughout the experimental time period. Because of COVID-19 restrictions on public life, the students attended the music school online (March-April 2021) and in-class (May-June 2021) during this time period. We, therefore, expected valuable insights into students' engagement in both learning modalities (online versus in-class).

Related Work
Interest in ICT-supported education has grown in recent decades [19]. In music training and education, a variety of approaches have been presented in theoretical or prototype form [14], ranging from computer-assisted learning (such as instrument training and music theory training) to assistive tools for music-related training and performance, such as score following (e.g., Dorfer et al. [20]).
Zanetti [21] proposed a web-based ear training approach for pitch discrimination. Using the proposed solution, the authors collected the participants' performance data, as well as their feedback. The authors also reported the proposed approach had a positive effect on participants' achievement. They also concluded that a large-scale longitudinal study would be necessary to fully explore the effects of musical intelligence and music aptitude on musical achievement (p. 6). Kiraly [22] proposed a computer-aided approach for solfeggio training. They reported the presented approach was perceived as positive for students' motivation and emphasized the future use of teacher-computer to train in a space-time-independent environment, as opposed to the traditional space-time-fixed classroom approach. Riley [23] provided an overview of the possibilities of mobile and tablet devices through their experience and integration of the devices in the teaching/learning process at their high school by using commercial products and solutions for online classes, music training and music generation. Seddon [24] reported participants' experience in the in-class to online transition and observed positive experience in a music-training environment. In the last two years, Gillis [16] reported a shift from traditional to remote learning due to the COVID-19 pandemic. While several aspects of learning can be replicated remotely, it is more difficult for music-related training to transition to remote environments without proper tools.
Among the commercial instrument training solutions, Yousician [10], Simply Piano (https://www.joytunes.com/ (accessed on 25 February 2022)) and Synthesia [25] have been popular among amateur musicians for instrument training. Piano and guitar are the most popular instruments on these platforms, with several millions of users cumulatively. Although music theory training solutions are not as popular as instrument training, there are several online products for music theory training, such as Tenuto (https://www.musictheory.net/products/tenuto (accessed on 25 February 2022)) and ABRSM Theory Works, which are mobile applications with quiz-like exercises, and MusicTheoryPro, a mobile application with various games for interval, chord and key signature training. By offering more course-like content, the Musicca (https://musicca.com (accessed on 26 February 2022)) platform is also designed for in-class and home use. However, it heavily relies on individual use and does not offer teacher-related features, such as classes. The Teoria platform (https://teoria.com (accessed on 24 February 2022)) offers highly customizable exercises, but leaves the progression through different difficulties for the individual user to choose and track through their learning process. Other platforms, such as EdX (https://www.edx.org/learn/music-theory (accessed on 25 February 2022)), Liberty Park Music (https://www.libertyparkmusic.com/learn-music-theory/ (accessed on 25 February 2022)) and Coursera (https://www.coursera.org/ (accessed on 28 February 2022)), offer online courses in the form of conventional class lectures and offline homework.
Nevertheless, the commercial solutions are not directly applicable to formal training for several reasons. First, there are usually differences between curricula in formal training. Most of the aforementioned solutions target individual learning without the teacher's role in the learning process. While commercial solutions offer a full range of difficulty levels, the individual exercises are not modifiable by an individual teacher. Second, the commercial products include payment models, which are difficult to finance in public and state-funded music schools, as well as in private schools. Third, the solutions mentioned above are available in English, whereas other languages are not supported or are available only in a small subset of the most popular languages, which excludes students from non-English-speaking countries and communities.

Troubadour Platform
Troubadour is an open-source online platform for music ear training. The platform aims to become a complementary tool for training and homework, providing instant feedback and a framework for guided training for students and teachers. The platform contains several gamification elements, including instant feedback, levels, leaderboards and avatars. The platform includes interval and rhythmic dictation applications, which reduce the teacher's preparation workload and address the lack of instant feedback on the students' homework. The application generates, reviews and corrects the exercises, giving immediate feedback to the students, and an insight into students' performance to the teacher. The students can immediately see which part of their work was incorrect and improve their understanding of the exercise. In addition, the teacher's input is not required at any training step for the student to obtain the feedback. However, several additions would further aid the teacher-student interaction. Class management would be useful to group students of the same real-life class into separate groups. Additionally, class-oriented homework and individual challenge creation would further support the teachers' interaction with the students and aid the tool's inclusion in the in-class environment.

Previous Evaluations of the Platform
The first version of the Troubadour platform provided a tool for students to consolidate their ear training further and overcome individual challenges through motivation for ear training. Initially, the interval dictation application was implemented into the platform (Figure 1). We conducted A/B testing to evaluate the platform's impact on the students' performance. The students of two classes in the first and second year were split into two groups. The control group (14 students) did not use the platform, whereas the test group (19 students) did. The test group used the platform for one month. At the end of the testing period, both groups took a conventional exam. The results showed that students who actively used the Troubadour platform were more successful. First-year students in the test group achieved 9.2% better results on average than those who trained only in a conventional way. Meanwhile, the difference in success among the second-year students was smaller: the test group students performed 1% better. The platform's data were also analyzed in the experiment, including the number of incorrect submissions of the exercise, the number of deletions or additions of notes, and the number of playback replays. The results showed all metrics decreased proportionally with the number of solved exercises.
Later, in an updated version of the platform, a new rhythmic dictation application was developed [18]. We developed an algorithm that generates meaningful rhythmic patterns. Rhythmic pattern generation was more difficult than interval sequence generation, specifically in avoiding meaningless rhythmic sequences, which would demotivate the users. Each game in the rhythmic dictation application consisted of four rhythmic sequences. The student could fill in the perceived rhythm via the rhythmic keyboard (Figure 2). The user could also enable or disable the metronome and adjust the tempo and the playback volume.
The rhythmic dictation application was also tested in an experiment similar to the one for the interval dictation application. During the experiment, the authors interviewed four students to assess the prototype user interface and improve the user experience. After this stage, the improved application was tested on a larger test group. The students were again divided into a control and a test group, with 24 students in the control group and 23 in the test group. During the experiment, the control group practiced rhythmic dictation in a conventional way, while the test group used the platform in-class and at home. The experiment lasted five weeks.
Within the experiment, the test group students also responded to a questionnaire about the user experience and the functionalities of the platform. All students agreed the platform was easy to use. Almost all students found the exercises challenging enough and quickly got used to the new rhythmic application. The questionnaire at the end of the test period also showed that the test group students almost completely stopped practicing in a conventional way as a result of the use of the platform.
At the end of the evaluation period, the students were again evaluated through a conventional exam. The difference between the test and control groups among the first-year students was insignificant, while the second-year test group students achieved significantly better grades than the control group (average grade of 4.44 for the test group versus 3.58 for the control group). The hypothesis that the use of the platform has a positive effect on the students' performance was therefore confirmed [18].

Troubadour Platform Version 2.0
First, we gathered a list of functionalities in collaboration with the teachers. Specifically, we focused on their interaction with the platform as a complementary tool for their in-class and remote-class work with the students.

Gathering Teachers' Feedback
The platform provided several desired features to both the students and the teachers, such as automatic generation of exercises, difficulty levels and instant feedback. Nevertheless, there were still several challenges to integrating the platform better into the learning process. First, the teachers had no option of creating homework that would be fixed to the desired difficulty level and would generate one set of exercises for an individual class. Additionally, the teachers had no insight into the generated exercises, nor were they able to modify or change them. Second, the teachers expressed the need to form challenges, which would allow the students to create a series of exercises with the difficulty levels and exercise duration chosen by the students as the creators, available for participation by any platform user, regardless of their school/class enrollment. Last, the language of the platform was integrated directly into the source code. Students who used a single instance of the platform could not personalize the language, which was specifically challenging for international students attending the classes. The platform should support internationalization to accommodate these students.

Technical Updates to the Platform
The platform's source code is available on Bitbucket (https://bitbucket.org/ul-fri-lgm/troubadour_production (accessed on 1 June 2022)). The backend was written in Laravel 5.7, which was supported until September 2019. The framework needed to be updated to the latest Laravel version, namely 8.0. We used two package management tools, Composer and the Node Package Manager (npm), to upgrade the framework. Most of the platform's logic for the REST API was written in controllers, which sent data to the platform's frontend. Thus, the controllers communicated with the database directly, which is not in line with the MVC guidelines.
We created a new REST application programming interface (API) to obtain better-quality code that will be scalable and manageable over time. The new version has a more transparent source code, which is faster to read and easier to understand. Documentation for the REST API was also generated in this process. The documentation is written according to the OpenAPI 3.0 standard. It is an online document that offers the developer an easy and interactive overview of the exposed endpoints through which the backend and the frontend communicate. All calls can be tested within the API documentation. The new REST API was exposed via a new handle (http://trubadur.si/api/v2 (accessed on 1 June 2022)) to retain backward compatibility with the first version (http://trubadur.si/api/v1 (accessed on 1 June 2022)).
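The coexistence of the two API versions under separate path prefixes can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the platform's backend uses Laravel; the `games` resource, the handler names and the payload shapes are hypothetical and are not taken from the Troubadour source code.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[], dict]


# Hypothetical handlers: the payload shapes are invented for illustration.
def list_games_v1() -> dict:
    return {"games": []}


def list_games_v2() -> dict:
    return {"data": [], "meta": {"version": 2}}


# Each (version, resource) pair maps to its own handler, so the old
# clients of v1 keep working while new clients use v2.
ROUTES: Dict[Tuple[str, str], Handler] = {
    ("v1", "games"): list_games_v1,
    ("v2", "games"): list_games_v2,
}


def dispatch(path: str) -> dict:
    """Resolve a path such as '/api/v2/games' to the handler of that version."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "api":
        raise LookupError(f"unknown path: {path}")
    handler = ROUTES.get((parts[1], parts[2]))
    if handler is None:
        raise LookupError(f"unknown path: {path}")
    return handler()
```

Under this assumption, a change in the response format only affects the `v2` handlers, which is the property that a versioned handle is meant to provide.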

Development of a Learning Management System within the Platform
Based on the teachers' requirements, the learning management system should support the following user stories:
• The user with sufficient (teacher) permissions can create a classroom within the platform. The classroom can be named and its access limited only to users who are added to it;
• Homework assignments can be created within the classroom. Each assignment should be assigned a deadline, difficulty and number of exercises. The assignment should not be accessible after the deadline;
• The teacher can preview the generated exercises and modify (re-generate) each exercise if needed;
• The teacher can observe the statistical data about the completed assignments, including the gathered points and time spent, per user and in an aggregate form.
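The first two user stories above can be sketched as a small data model. This is a minimal illustration in Python under our own assumptions, not the platform's Laravel implementation; the class names, the field names and the sample classroom are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set


@dataclass
class Homework:
    title: str
    deadline: datetime
    difficulty: int
    exercise_count: int

    def is_accessible(self, now: datetime) -> bool:
        # The assignment should not be accessible after the deadline.
        return now <= self.deadline


@dataclass
class Classroom:
    name: str
    teacher_id: int
    member_ids: Set[int] = field(default_factory=set)
    homework: List[Homework] = field(default_factory=list)

    def can_access(self, user_id: int) -> bool:
        # Access is limited to the teacher and the users added to the classroom.
        return user_id == self.teacher_id or user_id in self.member_ids

    def open_homework(self, user_id: int, now: datetime) -> List[Homework]:
        if not self.can_access(user_id):
            return []
        return [h for h in self.homework if h.is_accessible(now)]


# Invented sample data for illustration only.
room = Classroom("1A", teacher_id=1, member_ids={10, 11})
room.homework.append(
    Homework("Intervals", datetime(2021, 3, 15, 23, 59), difficulty=2, exercise_count=5)
)
```

In this sketch, a student who is not enrolled in the classroom and a student who opens it after the deadline both receive an empty list of assignments.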
The students were given a new view within the platform to access the classrooms. In addition, the teacher was also given access to the classroom administration module (https://trubadur.si/administration).
On the landing page view of the administration module, the teacher can observe the basic statistical data for the current week (Figure 3a).
A list of their classrooms is available to the teachers in the side menu (Figure 3b). Each classroom also contains information about the enrolled students. Classrooms can also be sorted and searched (top right corner). The teacher can then access a list of all homework assignments in a detailed classroom overview. Each assignment can be modified or deleted (Figure 4a). The teacher can observe individual statistical data for each assignment (Figure 4b), including the time spent and the score achieved by each enrolled student. Aggregate homework assignment data is also available, such as the average points achieved and the average time spent per assignment. This view, therefore, offers the teacher a detailed insight into their students' homework. The time-related statistical data, in particular, cannot be observed in a conventional homework assignment.
The challenges follow the same methodology as classrooms, except they do not limit the users in creating or joining a challenge. All parameters are made available to the challenge creator, in this case a student or a teacher. These parameters include the level of difficulty, the deadline date and the number of games. The statistical data are made available to the challenge creator at the end of the challenge. The intention of this model is to enable the students to produce their own challenges and more freely interact with their peers by posing challenges to each other, which may differ from their current stage in the curriculum.
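The aggregate assignment statistics described in this section (average points and average time spent) amount to a simple reduction over the individual submissions. The following Python sketch illustrates the idea; the `Submission` record, its fields and the sample values are hypothetical and do not reflect the platform's database schema.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Submission:
    student: str
    points: int
    seconds_spent: int


def summarize(submissions: List[Submission]) -> Dict[str, float]:
    """Aggregate one assignment: number of students, mean points, mean time."""
    if not submissions:
        return {"students": 0, "avg_points": 0.0, "avg_seconds": 0.0}
    return {
        "students": len(submissions),
        "avg_points": mean(s.points for s in submissions),
        "avg_seconds": mean(s.seconds_spent for s in submissions),
    }


# Invented sample data for illustration only.
SAMPLE = [
    Submission("a", 80, 300),
    Submission("b", 60, 420),
    Submission("c", 100, 180),
]
SUMMARY = summarize(SAMPLE)
```

The per-student rows of the teacher's view correspond to the individual `Submission` records, while the aggregate row corresponds to the returned summary.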

Internationalization
We also made backend improvements for greater scalability and a better user experience. As past research has shown that the platform is useful to students of the Conservatory of Music and Ballet Ljubljana, one of the goals became to make the platform usable by international students and to allow other developers to adjust the platform to their curriculum and language. At the time, the platform supported only the Slovenian language, which was directly integrated into the source code (last git commit). Initially, internationalization was not considered an important feature since two instances of the platform were maintained separately, one in the Slovenian language and the second in the English language. We, therefore, developed a localization module to support different languages and the personalization of the user interface without running separate instances of the platform. We used the Vue i18n plugin. We first extracted the translatable strings from all Vue files and replaced them with plug-in method calls that take the translation key as a parameter. We then created the files for the translations. We generated two translations, which are now made available in the Git repository (sl.json and en.json). Future translations to additional languages are also supported in the codebase.
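The key-based lookup that replaces hard-coded strings can be sketched as follows. This Python sketch only illustrates the general technique and is not the Vue i18n API; the catalog contents, the translation keys and the fallback to English are assumptions made for the example and do not reproduce the actual sl.json and en.json files.

```python
import json
from typing import Dict

# Invented sample catalogs standing in for sl.json and en.json;
# the keys and strings are hypothetical.
CATALOGS: Dict[str, Dict[str, str]] = {
    "en": json.loads('{"homework.title": "Homework", "game.score": "Score: {points}"}'),
    "sl": json.loads('{"homework.title": "Domača naloga"}'),
}


def translate(key: str, locale: str, fallback: str = "en", **params: object) -> str:
    """Look up a translation key in the user's locale, then in the fallback locale."""
    for lang in (locale, fallback):
        template = CATALOGS.get(lang, {}).get(key)
        if template is not None:
            return template.format(**params)
    # An unknown key is returned unchanged so that missing strings stay visible.
    return key
```

With such a scheme, adding a language only requires a new catalog file, and each user can select a locale without a separate instance of the platform.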

Experiment
After developing the desired platform functionalities, we evaluated the long-term user experience along with the newly implemented functionalities. Following the teachers' suggestions, the platform now included more detailed statistics on the users' performance. Additional statistical data shown to the teachers included time spent solving individual questions, number of attempts and mistakes, number of repetitions and replays, and number of correct and incorrect answers. The students' performance proved to be valuable information on the long-term engagement with the platform.
In addition to passively collected data, we also gathered the students' responses through questionnaires at the beginning and towards the end of the evaluation period. The evaluation period lasted for three months, between March and June 2021.

Questionnaire 1
The goal of the first questionnaire was mostly to obtain basic information about the platform users. In addition to the basic demographic data, we asked users about their interaction with others while training in music theory and what tools they use for training. The questions about the social interactions were self-statements, answered on a Likert scale between 1 (strongly disagree) and 5 (strongly agree), except for questions no. 3 and no. 11-13 (marked by *), which asked the students for a specific response. The questions were presented as follows:
1. I like studying with friends.
2. If I don't understand something, I ask a friend rather than a teacher.
3. * Which tool (application, social network) most often helps you learn together?
4. Working with the above tool makes it easier for me to explain to a friend.
5. With the aforementioned tool, I can easily talk to a friend about interval dictation tasks (problems, solutions).
6. With the aforementioned tool, I can easily talk to a friend about rhythmic dictation tasks (problems, solutions).
7. I like to compete with friends in solving such exercises.
We were also interested in how the students trained before they were invited to use the Troubadour platform. Specifically, we asked them about their motivation for solving certain types of tasks and how frustrating it is that they do not receive an immediate response to a solved task. They responded to the following statements:
9. I enjoy solving interval dictation homework.
10. I dislike not getting immediate feedback on my homework.
The last set of questions referred to the time they spent solving tasks. We also wanted to learn whether they used any application or solution for their training.
11. * Do you use any application to solve tasks or prepare for an assessment? If so, with which?
12. * How many hours a week do you spend solving rhythmic dictation exercises?
13. * How many hours a week do you spend solving interval dictation exercises?

Questionnaire 2
We also conducted a second survey at the end of the evaluation period. The final questionnaire aimed to gather information about any potential change in the students' habits during the period. We also gathered information about their experience using the Troubadour platform.
The questionnaire consisted of the following questions:
1. * How many times a week do you visit the online platform Troubadour?
2. * Do you spend more time per week learning and practicing than before?
3. How do you agree with the statement that using the Troubadour platform is easy?
4. * Did this interactive homework (or interactive assignments) make music theory training easier?
5. * List 3 things you do not like about the platform.
6. * List 3 things you like about the platform.
7. * Would you like to tell us something else about the online platform?
The final questionnaire was used to confirm the hypothesis that interactive homework through the Troubadour platform is more engaging to students than the conventional way of completing homework.

Analysis of the Gathered Data
The application was actively used by students of the Conservatory of Music and Ballet, Ljubljana, Slovenia. They ranged in age from 15 to 21 years. The evaluation period lasted from 1 March 2021 to 1 June 2021. We conducted two surveys that helped us collect user responses. We also monitored the user data on the platform to gather information about the students' performance and improvement during the evaluation period. The purpose of the analysis is to evaluate the long-term engagement and impact and to gather suggestions for future improvements to the platform. We gathered the questionnaire responses from two classes (first and second year) of 20 students in total. The classes included students from both jazz and classical programmes. The data collected by the platform also tracked users who did not participate in the questionnaires. These students (about 60) were invited to the platform by word of mouth from their peers and the teachers. No specific invitation or presentation was given to these students; however, the platform was open for anyone to register, and we did not limit access to the platform for other students during the evaluation period.

External Factors
Several key factors impacted this experiment. These factors are mostly related to the COVID-19 pandemic and the presence/absence of the students in class. Consequently, the students attended the school online from March to May and in-class from May until June.
1. Social interaction between the students was also limited to online communication, which may have influenced the amount of communication about the platform. While the students attended the classes online, their workload and, consequently, their motivation varied compared to the in-class situation;
2. Teachers also reported that the general workload for students (including in-class and homework activity) was lower during the lockdown period; however, the execution of this request was left to the individual teacher and varied between teachers. As a result, students may have devoted more time to other subjects and, consequently, less time to music theory practice;
3. The end of the evaluation period also correlated with the final exams and the end of the school year. During this period, the students' focus shifted to the final exams. In this study, we also experienced lower student engagement during the last week of the evaluation;
4. The individual student progress was limited to lower levels at the beginning of the evaluation period. Due to direct student feedback, we unlocked all training levels during week 8 of the evaluation period. This addition of levels is also reflected in the statistical data.

Evaluation
During the evaluation period, the teachers and the students used the platform independently, without interaction from the authors of this study. Five classes were created within the platform. Homework was given to each class individually. However, no requirement or incentive was given to students to complete the homework.
Statistical data on user activity on the platform was the basis for measuring engagement. New registrations, logins and the number of solved exercises were tracked.
Eighty users registered on the platform in total (Figure 5). Most students registered at the beginning of the evaluation period, when we presented the platform. During the first week, 45 user accounts were created. Later on, there were 35 new registrations, which is more than expected. We believe these registrations are based on word of mouth among the students since the previously registered students suggested the platform to their peers.
The average number of logins during the first week was around 11 per day (Figure 6). This was the first week when all the students got to know the platform during the lessons. Then, the weekly average stabilized for a few weeks at around five logins per day. A major decline was observed during the weeks after 1 May. At that time, the number of logins averaged around two to three per day. Based on the discussion with the teachers, this can be attributed to the fact that the students' workload and stress increased after the school resumed its courses in-class. The number of logins also varied significantly between individual days. Students most commonly logged into the platform twice a week. We attribute this phenomenon to homework deadlines.
We also observed the time spent solving individual exercises (Figure 7). We expected it to decrease over time as students quickly mastered the way of solving the exercises at the given difficulty levels. During the first week, students spent an average of 7.1 min per exercise. The following week, the value dropped to 6.35 min. In the following weeks, students spent 2 to 3 min solving an exercise. Solving time was therefore reduced by more than 50%. This drop seemed too steep, and the students' feedback indicated that the difficulty levels unlocked at the time were too easy. An individual student could unlock the more difficult levels by completing several exercises at a lower level. However, this number was, in retrospect, set too high.
Consequently, the low difficulty affected student engagement. At the teachers' suggestion, we unlocked all levels for the students after week 8. From then on, the students no longer had to advance to a higher level and could choose for themselves at which difficulty they wanted to practice. The difference was immediately noticeable: the students were more engaged once they could choose the difficulty themselves. Additionally, the time spent per exercise increased again due to the increased difficulty.
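The weekly login averages and the reduction in solving time discussed above can be derived from raw activity logs along the following lines. This is only a minimal sketch: the function names, the start date and the sample login data are hypothetical and are not taken from the platform's code base; only the solving times (7.1 min and 2 to 3 min) come from the text.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_avg_logins_per_day(login_dates, start):
    """Map week index (0 = first week) to the mean number of logins per day."""
    logins_per_week = Counter((d - start).days // 7 for d in login_dates)
    return {week: n / 7 for week, n in sorted(logins_per_week.items())}


def relative_reduction(before, after):
    """Fractional drop from `before` to `after` (0.5 means a 50% reduction)."""
    return (before - after) / before


# Hypothetical log: 14 logins spread over week 0 and 7 logins in week 1.
start = date(2021, 3, 1)  # assumed start date, for illustration only
logins = [start + timedelta(days=i % 7) for i in range(14)]
logins += [start + timedelta(days=7 + i) for i in range(7)]
averages = weekly_avg_logins_per_day(logins, start)

# Solving times reported in the text: 7.1 min in week 1, later 2 to 3 min.
low = relative_reduction(7.1, 3.0)
high = relative_reduction(7.1, 2.0)
print(averages, round(low, 2), round(high, 2))
```

With the reported solving times, the reduction lies between roughly 58% and 72%, which is consistent with the statement that solving time dropped by more than 50%.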
Based on the results, we can conclude that the difficulty levels and constraints should be changed to better accommodate the students' learning curve. One straightforward change would be a faster progression through the levels, letting users advance more easily while still allowing them to practice easier levels if they feel the progression is too fast. This change would probably benefit the more talented and knowledgeable students who desire more difficult challenges.
In total, 800 games were played on the platform during this time (Figure 8). Each game consisted of two exercises. The number of games varied weekly, with about eight games per day successfully solved on average. The most games were played during the fourth week, with a peak of 58 exercises on Wednesday of that week. A gradual decline in the number of games is seen after this week. Most of the games were played on weekdays, most frequently at the beginning of the week, and fewer on weekends.
The analysis of the students' engagement on the platform showed interesting results. We were first interested in the distribution of students by the number of games played. We assumed that one group of students engaged only with the homework, while another group, challenged by the platform, engaged beyond the scope of the exercises provided by the teacher. Figure 9 shows that about 40 users solved only the exercises that were part of the homework. These users are visible on the left side of the figure and played up to about six games. A more encouraging result is the number of students on the right side of the figure, as almost half of the students played additional exercises. These students played 10 or more games on the platform, which we attribute to their engagement with the platform. Two students also solved over 100 games on the platform.
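The split between homework-only users and users who practiced beyond the homework can be expressed as a simple grouping by the number of games played. The sketch below assumes the thresholds mentioned above (up to about six games for homework only, 10 or more for additional practice); the function name, the intermediate group and the sample data are hypothetical.

```python
def group_by_games_played(games_per_user, homework_max=6, extra_min=10):
    """Count users per engagement group based on their number of games."""
    groups = {"homework_only": 0, "in_between": 0, "beyond_homework": 0}
    for games in games_per_user.values():
        if games <= homework_max:
            groups["homework_only"] += 1
        elif games >= extra_min:
            groups["beyond_homework"] += 1
        else:
            # Users between the two thresholds are kept separate
            # rather than forced into either group.
            groups["in_between"] += 1
    return groups


# Hypothetical per-user game counts, for illustration only.
sample = {"u1": 2, "u2": 6, "u3": 8, "u4": 12, "u5": 104}
groups = group_by_games_played(sample)
print(groups)
```

Keeping the thresholds as parameters makes it easy to check how sensitive the group sizes are to the approximate cut-off values read from the figure.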
Based on the average time spent per exercise (4 min), we can conclude that the students spent about 6 h on the platform. We also conclude that more than half of the students started using the platform independently. Both results are encouraging in terms of the students' long-term motivation.
The platform also recorded user interaction data during gameplay, allowing us to analyze the students' progress. We recorded the number of times the student added or erased notes within an exercise. We expected the number of additions and deletions to decrease week by week and then remain steady as the students grew accustomed to the interface and the difficulty levels. We also recorded the number of times the students tried to submit an answer and had to correct it, and the number of times they replayed the recorded playback. Weekly averages are shown in Figure 10.
Figure 10a shows the average number of event additions per exercise. The students wrote between 8 and 14 note events per exercise. The values are lower at the beginning and start to rise in the third week, when most students had already unlocked level 1.2. During week 3, most newly registered students became accustomed to the platform, followed by fewer additions on average during week 4. Between weeks 4 and 8, the value did not change significantly. In week 9, when all difficulty levels were unlocked on the platform, the value jumped significantly. The students started to discover more difficult tasks in the following weeks, as the number of additions first dropped and then gradually increased.
The weekly average number of deletions (Figure 10b) follows a similar trend to the number of additions. We assumed that both values would correlate, as the students who added a greater number of notes also performed more deletions while completing the exercise and reaching the correct answer. Again, we see a period of platform adoption in the first four weeks, followed by a slight drop in the average. The students also moved away from the trial-and-error approach to exercise solving, as more corrections and deletions yielded fewer points. They therefore started listening to the playback more carefully and more often. As the assignments became more difficult in the final weeks, the number of deleted (incorrect) notes also increased.
Figure 10c shows the weekly average number of playback replays per exercise. While the value correlated with the task difficulty, it also increased over time due to the penalization of retry attempts. After the induction period, the weekly average declined again by week 8, to around 2.5 replays. Later, the number of replays increased.
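The assumed relationship between additions and deletions could be checked with a Pearson correlation coefficient over the weekly averages. The following is a minimal sketch of that calculation; the weekly values are hypothetical placeholders and not the measured data from Figure 10.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical weekly averages per exercise, for illustration only.
additions = [8.0, 9.5, 14.0, 11.0, 11.5]
deletions = [2.0, 2.5, 4.0, 3.0, 3.2]
r = pearson(additions, deletions)
print(round(r, 3))
```

A coefficient close to 1 would support the assumption that weeks with more additions also show more deletions; with only a handful of weekly averages, such a value should be read as descriptive rather than as a significance test.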
The average number of solving attempts shows how many times students attempted to complete an exercise. The platform allowed a maximum of 12 attempts. When this value is exceeded, the application progresses to the next exercise, while the user is not awarded any points for the previous exercise. The students attempted to solve the exercise about twice on average. Therefore, most of them succeeded in the first or second attempt. In week 9, there is an increase in value, which correlates with the availability of more challenging levels.
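The attempt limit described above can be sketched as follows. The limit of 12 attempts and the rule that an unsolved exercise yields no points come from the text; the point values, the per-retry penalty and the function name are assumptions made purely for illustration and do not reflect the platform's actual scoring.

```python
MAX_ATTEMPTS = 12  # attempt limit stated in the text


def score_exercise(attempt_outcomes, base_points=100, retry_penalty=5):
    """Score one exercise from a sequence of submission outcomes.

    attempt_outcomes: booleans, True meaning a correct submission.
    Returns (solved, points, attempts_used). Attempts beyond MAX_ATTEMPTS
    are ignored; if no counted attempt is correct, no points are awarded.
    base_points and retry_penalty are hypothetical values.
    """
    used = 0
    for correct in list(attempt_outcomes)[:MAX_ATTEMPTS]:
        used += 1
        if correct:
            points = max(base_points - retry_penalty * (used - 1), 0)
            return True, points, used
    return False, 0, used


print(score_exercise([True]))
print(score_exercise([False, True]))
print(score_exercise([False] * 12 + [True]))
```

In this sketch, a correct answer on the thirteenth submission is not counted, mirroring the behavior in which the application moves on to the next exercise once the limit is reached.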
In general, the analysis of these data indicates that students actively used the platform and found it engaging. After we unlocked all difficulty levels in week 9, the engagement improved, as the students could tackle more challenging exercises, resulting in higher motivation. Consequently, the difficulty levels should be adjusted to this desire for more difficult tasks, and the level progression should follow the students' fast-growing proficiency more closely.

Questionnaire Results
The authors attended two classes at the Conservatory of Music and Ballet Ljubljana via Zoom. We presented the Troubadour platform and described the features of the in-platform games, challenges and homework. We helped the students interested in using the platform to register and start an exercise. Concurrently, we asked the registered students to complete the first questionnaire.

First Questionnaire Results
With the first questionnaire, we gathered the basic user information and their interaction with online and mobile tools for music training. After class, we talked with the students about rhythmic dictation and their home exercising. The students reported that rhythmic dictation, along with interval and harmonic dictation, was difficult to practise at home. The results of the first questionnaire are shown in Figure 11. A minority of students disliked studying with friends (Figure 11a). The majority of students (13 students) also agreed or strongly agreed that if they did not understand something, they would prefer to discuss the problem with friends (Figure 11b).
The students reported that social networks, such as Discord and Facebook, were most often used while training. However, they did not support the claim that these networks helped them solve the dilemmas of rhythmic and interval dictation. Most of the students disliked competing with friends (Figure 11c). Therefore, although students like to help each other, they are not competitive.
When asked which homework assignments they prefer to attempt, several students indicated that they prefer to practice rhythmic dictation (Figure 11d,e). From the conversation held with the students after the presentation, they reported that they find it difficult to practise rhythmic dictation at home in a conventional manner.
Interestingly, the students did not find the lack of immediate response to traditional homework problematic (Figure 11f) at the beginning of this experiment. Only two students agreed they would like an immediate response to the homework. The rest were indifferent or disagreed with the statement. The opinion can also be attributed to the fact that at the time, they were already accustomed to having their homework checked the next day in class by the teacher and not earlier.
When we asked students which apps they used to aid their homework, they listed only a few: the TonedEar app and the teoria.com website, which were listed by two students. From this, we can conclude that very few students use applications that could help them with homework and music theory training. We hypothesize that the underlying reason is the lack of native-language support. Other reasons could be the applications' lack of flexibility to adapt to the students' curriculum and their lack of motivational features, such as gamification elements.
The students reported spending 53 and 55 min on average per week on rhythmic and interval dictation, respectively. The small number can be attributed partly to the high workload of students in class over the week with this and other courses. Another reason is that students had to practise these tasks on paper, which could be frustrating.

Second Questionnaire Results
With the second questionnaire, we gathered the students' feedback on the platform's usability and their suggestions on further improving the games and the platform. We assumed that the students would use the platform weekly, depending on their engagement and commitment to improving their theoretical knowledge. Since the platform was evaluated over a shorter period in the previous studies [17,18], we extended the evaluation period to obtain the students' responses beyond the initial novelty-based engagement.
In total, 18 students responded to the second questionnaire. Since no incentive was given to the students to participate in this questionnaire or to use the platform, the number of responses exceeded our expectations. Most students visited the platform once or twice a week, which corresponds with the data collected. Given the amount of time the students reported spending on music theory training in the first questionnaire and the number of games played on the platform every week, the results roughly coincide with the students' responses.
Interestingly, the majority of students did not report spending more time in music theory training, compared to the time before using the platform. Considering the first question, we can conclude the students began using the platform instead of the conventional training.
Using the Troubadour platform was easy for most students. Only one student disagreed with the statement, while four students were undecided about the statement. Twenty-three students either agreed or strongly agreed with the statement.
The question about interactive exercises making the training easier for students also yielded interesting results. Interactive exercises did not make the training easier for three students, while five students neither agreed nor disagreed with the statement. Although the majority (19 students) stated the training was easier, these results indicate the platform could be further improved to support the students who disagreed.
To conclude, we asked the students what bothered them about the platform and what they liked. Some of the most common observations are listed in Table 1.
The students' responses, shown in Table 1, indicated several aspects which can be improved in the platform. Overall, the results indicate that most students were still inclined to use the platform after three months, indicating their engagement surpassed the novelty bias.
Table 1 (entries as extracted):
Scoring and a sense of competition.
Problems setting the tempo for rhythmic tasks.
A simple rhythmic dictation application that offers good functionality for listening training.
Monotony of exercises. Some are too easy.
Inability to change the sound (from piano to another instrument).
Levels and badges that give the user the opportunity to improve.

Self-Reported Versus Collected Data
The students self-reported the average amount of time spent on rhythmic dictation practice. The estimates varied between 0 and 3 h weekly. Most students reported a value between 0.5 and 1 h, with an average of 53 min spent on rhythmic dictation exercises. In the second questionnaire, the students were asked whether they spent more or less time practicing within the platform compared to their average practice before the experiment. About three-quarters of the students reported spending less time or the same amount of time practicing as before. The data collected on the platform showed a strong correlation between the reported amount and the actual amount of time spent practicing on the platform in weeks 1-8 (50 min on average). However, the amount dropped drastically after the students returned to the in-school education process, down to 24 min in weeks 9-13. We attribute this drop to the situational effect of returning to school: the students spent more time on exams and tests to compensate for the time lost during the lockdown. This was corroborated by the teachers, who also experienced a significant decrease in students' attention in class due to the overwhelming number of exams that took place in May 2021. Alternatively, the ability to communicate in person could have led the students back to conventional practice with their peers, since most students were inclined to study with friends. Most students also self-reported using the platform once or twice a week. The collected data based on the number of sessions per user confirm this self-report.
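The gap between the self-reported and the logged practice time can be quantified as a simple percentage difference. The sketch below uses only the averages given above (53 min self-reported, 50 min logged in weeks 1-8 and 24 min logged in weeks 9-13); the function name is hypothetical.

```python
def percent_difference(logged, reported):
    """Difference of the logged value from the reported one, in percent."""
    return 100 * (logged - reported) / reported


REPORTED_MIN = 53     # self-reported weekly average (first questionnaire)
LOGGED_REMOTE = 50    # platform average, weeks 1-8
LOGGED_IN_CLASS = 24  # platform average, weeks 9-13

remote_gap = percent_difference(LOGGED_REMOTE, REPORTED_MIN)
in_class_gap = percent_difference(LOGGED_IN_CLASS, REPORTED_MIN)
phase_drop = percent_difference(LOGGED_IN_CLASS, LOGGED_REMOTE)
print(round(remote_gap, 1), round(in_class_gap, 1), round(phase_drop, 1))
```

Under these figures, the logged time in weeks 1-8 is within about 6% of the self-report, whereas the in-class period falls roughly 55% below it, a drop of about 52% relative to the remote period.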

Discussion
Compared to previous research on the Troubadour platform, this experiment was conducted over three months to avoid the novelty-bias effect on the students' motivation and engagement with the platform. Since the platform previously did not incorporate functionalities that would support organised interaction with the teachers, we added homework and challenge modules, which included new challenges weekly. The students quickly became acquainted with the platform and accepted it as a music-training tool. Even though the experimental setup was designed to interfere minimally with the established in-class and out-of-class learning, the number of user registrations exceeded the initial number of potential students: more students than recruited used the platform within the first month of the experiment. The numbers of logins and games played indicate that the students still used the platform weekly after three months. Overall, we conclude that the students' engagement with the platform persisted after three months, and we assume it will persist after the conclusion of the experiment. However, the frequency of the platform's use could also be attributed to the individual student's engagement with the subject in general, as Szopinski and Bachnik found that (self-reported) high engagement correlates with a higher assessment of online learning [26]. Whether this correlation also implies causation has yet to be determined in this case.
We noticed that progression through the levels was too slow during the experiment. This began to affect the students' engagement negatively and was verbally reported by the students to the authors. One of the key improvements of the platform is therefore to redefine the progression boundaries between the difficulty levels. The progression should be limited by the number of games played, but not so restrictively that it diminishes the students' motivation. A second option would be to unlock all levels at this stage of formal training. This option could diminish the motivation of beginners, while the more advanced students would not respond negatively to the accessibility of all difficulty levels. This finding is consistent with the engagement responses collected by [27], where making materials available at the beginning of the semester for the students to work at their own pace was reported as one of the key recommendations for retaining the students' engagement and improving their results.
The COVID-19 guidelines, which affected the students' presence in the school, also significantly impacted the students' engagement with the platform. While some decline in their engagement was expected, the students continued using the platform for ear training. We attribute this to the fact that homework ear training is difficult without ICT-supported tools, to the teachers' continued creation of non-obligatory homework through the platform, and to the students' familiarity with the platform, which they continued using at home.
From the results of the second questionnaire, we learned that the students did not like the platform's design and deemed it too childish. As new applications and features were added, the platform retained its original design, which is potentially better suited to a younger audience (elementary-school students). Previous research (e.g., [28]) indicated that the user experience is a determining factor in both the students' engagement with LMSs and their perception of online learning. Moreover, Maslov et al. [29] reported that user experience could influence the students' performance. Considering the students' results in previous experiments, we can state with a certain level of confidence that the user experience of the presented platform is not critically low. However, this feedback shows that the UX is also not at the students' desired level. To this end, the platform should be redesigned with a more modern look, based on a more thorough analysis of the user experience for the conservatory-level target group. In this respect, our future work will consist of redesigning the platform in a more modern mobile-app style, as shown in Figure 12.
In the case of the presented experiment, the analyzed data reflect a specific group of students, already enrolled in a mid-level music institution. While we are working on expanding the platform towards a younger target audience (namely, elementary music school and music classes within the general elementary school curriculum), the presented results should not be generalized to that population, which implies only internal validity of the presented results [30]. Developing a high-engagement platform for a younger audience with different levels of interest in music will most probably result in lower average engagement. Therefore, the user experience with the platform will pose a bigger challenge in our future development.

Remaining Challenges of the Experiment
Given the results, several aspects of this study should be further explored. While the impact on the students' performance was previously evaluated and proved positive in rhythmic and interval dictation exams, the influence of using an ICT tool instead of conventional pen and paper on the students' other competencies has yet to be explored, similarly to the writing versus typing problem (e.g., [31–33]).
In terms of the physical location of the students, we observed that the platform was used during remote-class and in-class learning. The homework, given to the students, was not included in their performance evaluation in a traditional manner (i.e., homework grades). However, the homework was highly correlated with the in-class work, and the teachers could partially influence the students' motivation. While this influence does not diminish the results of this study, they should not be generalized to the platform's use for individual or curriculum-independent training.
Regarding the students' feedback on their experience with the platform, several comments were given on the visual appearance of the platform. There is a possibility that a different design of the platform could more positively influence the students' engagement during the experiment. Since these comments were collected after the experiment, this hypothesis has yet to be evaluated.
Due to the small number of students attending the Conservatory, the number of participants in this study is smaller than desired. Additionally, the COVID-19 situation made communication with the students less accessible and less reliable than in the previous studies performed in this environment. On the other hand, this situation also allowed the authors to observe the potential differences in the students' engagement in the two learning environments (remote and in-class) that were employed due to the COVID-19 pandemic. This insight would be rather difficult to obtain in a non-pandemic situation since it would demand a significant disruption of the teaching process and would also be impossible to conduct without the explicit approval of the Ministry of Education. We therefore believe that, while the data collection in this study was not as methodical as intended due to the reported limitations, it provided an insight that will probably, and hopefully, not be reproducible in a real environment for some time.

Conclusions and Future Work
The paper presented a further development of the Troubadour platform. The platform was developed for music ear training and includes applications for interval and rhythmic dictation exercising. Since the platform did not incorporate functionalities that would accommodate long-term interaction with the teachers, we added homework and challenge modules, which included new challenges weekly. The administrative module enables the teachers to observe and react based on the provided insights. The homework and challenge modules automatically generated exercises and motivated students every week.
To evaluate the students' engagement with the platform and gather insights into long-term use, we collected data from students over a three-month period. The students were given exercises, depending on their class, on a weekly basis. No incentive was given for participation, and the amount of homework completed by a student did not influence their in-class grade or give any bonus. The collected data showed high student engagement throughout the period. The evaluation period was also impacted by external factors related to the COVID-19 pandemic. The students were taught remotely for the first two months (March-April) and in-class for the remainder (May-June). Interestingly, the students used the platform for ear training both during the first part, when the regular lessons were taught remotely, and during the second part of the evaluation period, when the learning process moved back to the classroom. The platform has proven a sufficient aid for home ear training, which conventionally demanded significant effort from the teacher to prepare and grade and offered the student neither instant feedback nor an unlimited supply of automatically generated exercises.
The newly developed modules for classes, homework and challenges have proven to significantly aid the teachers in assigning exercises. Nevertheless, there are several functionalities still to be implemented. The next development cycle will include harmonic dictation exercises, which are currently unavailable on the platform and will complete the three fundamental aspects of ear training. Additionally, we will extend the existing applications using different styles of training. For example, the rhythmic dictation application will include a response by tapping, given a displayed rhythmic sequence.
Similarly, the interval dictation application will also have an inverted mode, where the user observes the interval sequence and sings or hums the response. These extensions will include porting the application to a native mobile application. The API presented in this paper is already available. It can support mobile application integrations and other forms of use in different educational tools, including existing LMSs and virtual reality headsets.
The design of the platform's user interface emerged as one of the key challenges that need to be tackled. We collected students' feedback and identified that the design style should be improved, since the current design was initially focused on teenage users and now seems more appropriate for elementary school students. Initial wireframes have already been made, and development is currently underway with a designer, based on teacher and student feedback. The new design is expected to be made available in the late spring of 2022, along with new applications for ear training.
While the platform currently includes only ear-training applications, its support can be extended into music theory and instrument training. For example, the aforementioned inverted exercises could take a user's humming or tapping as an input instead of typing. Furthermore, the exercises could be adapted to support singing or an instrument as input, further extending the usability of the platform for remote practice. The decision to switch from a web-only platform to a web and mobile implementation in the near future is also based on the ability to process the audio input for such tasks efficiently.