Research on Personalized Recommendation Methods for Online Video Learning Resources

Abstract: It is not easy to quickly find learning materials of interest among the vast amount of online learning materials. The purpose of this study is to identify students' interests from their online learning behaviors and to recommend related video learning materials. For students who leave no evaluation record on the learning platform, the association rule algorithm from data mining is used to find the videos they are interested in and recommend them. For students who have evaluation records on the platform, we use the item-based collaborative filtering algorithm from machine learning, with the Pearson correlation coefficient as the similarity measure, to find highly similar video materials and then recommend the learning materials they are interested in. The two methods apply to different situations, so all students on the learning platform can receive recommendations. In application, our methods can reduce search time, improve the stickiness of the platform, solve the problem of information overload, and meet the personalized needs of learners.

Introduction
E-learning (electronic learning) breaks through the limitations of time and space, so that learners can obtain learning materials, watch learning videos, and study as if they were in a school classroom. For some subjects, online learning can even be better than traditional teaching. For example, in the field of medicine, students can repeatedly watch videos of an operation, which is better than the teacher's live demonstration. Thanks to its ease of use and unlimited access, e-learning has improved students' clinical learning process through virtual visual support [1]. As a supplement to traditional teaching, video teaching also has its advantages [2]. In 2020, the sudden COVID-19 epidemic forced all kinds of distance learning platforms to develop quickly. It is, however, not easy for learners to quickly find the information they need in the vast amount of e-learning resources. Personalized recommendations can effectively help learners obtain learning resources that fit their personal needs. Intelligent recommendation systems are increasingly widely used in today's era of big data [3,4]; recommendation algorithms are applied in online shopping, online movie watching, and news pushing. At present, however, most e-learning systems have no recommendation function. This work adds personalized recommendation of video learning resources to an existing e-learning platform, based on data mining and machine learning algorithms.
Learners usually watch videos on the e-learning platform. Some learners know exactly what they want to watch. Some learners do not know which materials are more useful for them and just search at random.
If the e-learning platform is equipped with an intelligent recommendation system, it can recommend more interesting learning videos according to users' preferences, optimize the learning experience, improve learning efficiency, and enhance the stickiness of the platform. We use a machine learning algorithm to mine the ratings or evaluations given by learners who have watched a video. Learners' interests are identified, and the corresponding learning videos are recommended to them. Some learners, however, do not score or evaluate a video after watching it. In this case, we use the Apriori algorithm from data mining to find the associations among the learning materials and recommend video materials to the corresponding learners. For learners who have not yet watched any videos, we recommend videos with a large number of views in the subject category they chose at registration; this case is not described further. In summary, our contributions in this study are as follows.
• For viewers who do not give a score, we give the detailed process of searching for strong associations among the video learning materials with the Apriori algorithm from data mining. According to these strong associations, videos are recommended to the related learners.
• If evaluations from the learners can be found on the e-learning platform, the item-based collaborative filtering algorithm is used to find highly similar videos with the Pearson correlation coefficient method, and the corresponding videos are then recommended to the related learners.
• These two methods complement each other. Most systems use only one recommendation method; by using two, all kinds of users can get personalized recommendations. A system that adopts both methods therefore serves more of its users than a system that relies on only one of them.
The rest of the paper is arranged as follows. Section 2 reviews related work on the reform of education and teaching with data mining and artificial intelligence technology. Section 3 introduces the collaborative filtering algorithm and the Apriori algorithm. Section 4 presents how we implement personalized recommendation, with the detailed processes. Section 5 draws conclusions about our work.

Related Works
It is not uncommon for data mining technology to be applied to education and teaching research. Kausar Samina et al. [5] used a clustering method to analyze students' learning behavior and proposed a personalized e-learning system architecture. The system adjusts the teaching content to each student's learning situation, finds the best learning state, and thereby helps learners improve their learning ability. The hidden patterns it uncovers can provide decision support for educational administrators in the reform of the education and teaching system. This is a typical application of data mining in educational decision-making, and it also provided good inspiration for our work. Aldowah, Hanan, and Al Samarraie et al. [6] explained that learning analysis, behavior analysis, and visual analysis in higher education are the results of mining educational data. These results help to formulate strategies for the reform of education and teaching in colleges and universities. Our research also analyzes students' behavior, but our goal is to find out what students are interested in learning. Clustering analysis is a good tool for this kind of study. Antonenko Pavlo D. et al. [7] employed hierarchical and nonhierarchical clustering methods to analyze the behavior characteristics of learners in e-learning, which reflects the advantages of clustering analysis in educational research. Aher and Lobo [8] employed K-means and the Apriori algorithm to find the correlation among courses and recommend courses to freshmen. Their work finds pure curriculum relevance, which differs from our personalized recommendation: we try to find the interests of learners in order to recommend content. Li et al. [9] used data mining to evaluate teaching quality and analyzed the results with a decision tree. The results show that attributes such as curriculum, credits, weekly hours, and number of students have an impact on teaching quality.
These patterns provide decision support for teaching management and talent optimization in higher education. In [10], teaching quality is also evaluated: the authors established an evaluation model for English classroom teaching based on particle swarm optimization (PSO) and the extreme learning machine (ELM) algorithm. This study helps to improve the quality of English teaching and the level of teaching management. Liao, Soohyun Nam and others [11] used a support vector machine binary classifier trained on the teaching data of a course to predict the teaching effect of the next semester, so that students who are not performing well can be helped in time and the teaching as a whole develops healthily. This is a typical case of artificial intelligence assisted instruction. It is not uncommon for artificial intelligence to be used in education and teaching research. Machine learning algorithms are usually used together with data mining algorithms to analyze teaching data [12]. According to research on student selection criteria, machine learning algorithms, especially the support vector machine, are very suitable for student selection modeling. Some researchers have employed artificial neural networks to develop decision support systems. For example, in the research of Fayoum, Ayman G, and Hajjar, Amjad Fuad [13], the assessment of students' academic indicators is based on academic performance, and the data source is integrated into an autonomic neural network to verify the accuracy of the system. The system can provide students with academic planning and suggestions, and provide academic decision support for managers. This is a typical example of artificial intelligence technology assisting decision-making in education. We are also inspired by this, using artificial intelligence technology to assist education in order to give students a better learning effect. Some scholars use machine learning to classify online learning students.
Lykourentzou and Giannoukos et al. [14] tested three machine learning techniques. The results show that the combination of feed forward neural network, support vector machine, and probabilistic ensemble simplified fuzzy ARTMAP is better than other methods in total accuracy, sensitivity, and precision. Data mining and artificial intelligence technology are also used in research on personalized training and teaching [15]: students' pre-school information is analyzed to extract features, and a personalized training model based on artificial intelligence is then proposed to predict the development of students. The model can be used for the personalized teaching of college students and also in career planning. Chen [16] likewise studied the knowledge mastered by individual learners and then put forward learning strategies for each individual. This kind of targeted guidance makes students progress very fast. Chrysafiadi Konstantina and Virvou Maria [17] developed a dynamic learning platform that arranges the course schedule according to the needs of each learner and thereby achieves truly personalized online teaching. Each learner has their own schedule and finally completes the training course with high quality. That work realizes personalized learning for students and also provides us with good ideas. The purpose of all education and teaching related research is to make students learn more easily and efficiently, and our research is no exception. The above literature provides us with good research ideas, but none of the approaches can be used directly. Our recommendation function finds the associations among video learning materials with the Apriori algorithm from data mining and the similarity between materials with the collaborative filtering algorithm from machine learning, and then recommends the materials to the relevant learners. Our aim is to reduce the time students spend searching for learning materials, improve learning efficiency, and increase the stickiness of the platform.

Preliminary Notions
This section introduces the background knowledge used in the rest of the study.

Collaborative Filtering Algorithm
Among the many intelligent recommendation algorithms, collaborative filtering is the most common one in commercial applications. Its principle is to find the correlation between users or between items from the users' preference data for items, and to recommend items to a user based on that correlation [18]. Collaborative filtering algorithms can be divided into two categories: user-based collaborative filtering and item-based collaborative filtering.
User-based collaborative filtering finds other users whose interests are similar to those of a given user and then recommends the products those similar users pay attention to. For example, if user A and user B both give high scores to item 1, item 2, and item 3, we can assume that they have similar interests. If user A then gives a high score to item 4, item 4 can be recommended to user B.
Item-based collaborative filtering finds similar items. We use the features of an item to find other items that are similar to it, so that similar items can be recommended to users who are interested in that item. For example, suppose book 1 and book 2 were both bought by readers A, B, and C. We then consider book 1 and book 2 to be strongly similar and infer that readers who like book 1 also like book 2. When a buyer purchases book 1, book 2 can be recommended to them.
This system uses item-based collaborative filtering to recommend relevant video learning materials to the related users.
Whether collaborative filtering is user-based or item-based, its principle is to find the similarity between data. There are three common measures of similarity: Euclidean distance, cosine similarity, and the Pearson correlation coefficient. Here, we use the Pearson correlation coefficient.

Principle of Pearson Correlation Coefficient Method
The Pearson correlation coefficient r is a statistic that measures the degree of linear correlation between two variables [19]. Its value lies in the range [−1, 1]. A positive value indicates a positive correlation between the two variables, and a negative value indicates a negative correlation. The greater the absolute value of r, the stronger the correlation. Assuming that the two variables are X and Y, r is calculated as follows:

$$r = \frac{\operatorname{cov}(X, Y)}{\sqrt{D(X)}\,\sqrt{D(Y)}} \tag{1}$$

where cov(X, Y) is the covariance of variables X and Y, and D(X) and D(Y) are the variances of variables X and Y, respectively.
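As a small numerical illustration of this definition, the following Python sketch computes r as the covariance divided by the product of the two standard deviations. The function name `pearson_r` and the score lists are hypothetical and chosen only for illustration; they are not data from the learning platform.

```python
import math

def pearson_r(x, y):
    """Pearson correlation r = cov(X, Y) / (sqrt(D(X)) * sqrt(D(Y)))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and variances; the 1/n factors cancel in the ratio.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

# Hypothetical scores that the same four learners gave to three videos.
scores_video_1 = [5, 4, 2, 1]
scores_video_2 = [4, 5, 1, 2]   # similar preferences to video 1
scores_video_3 = [1, 2, 4, 5]   # reversed preferences relative to video 1

r_12 = pearson_r(scores_video_1, scores_video_2)
r_13 = pearson_r(scores_video_1, scores_video_3)
print(round(r_12, 3), round(r_13, 3))
```

Videos 1 and 2 are rated in a similar way and obtain a positive coefficient, while videos 1 and 3 are rated in exactly opposite order and obtain a coefficient at the negative end of the range.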

Association Rules
Some learners do not give a score after watching a video, so the collaborative filtering algorithm cannot be used for them. Instead, we look for correlations among the videos that learners have watched. Association rules are used to find the correlations among learning materials, and the related learning materials are recommended to the relevant learners [20]. Association rule analysis is a classical and practical technique in data mining. By analyzing which things frequently occur together, we can find the rules and connections between them. A classic example is beer and diapers. An analysis of Wal-Mart customers' transaction data found that beer and diapers, two seemingly unrelated commodities, appeared in the same transaction quite frequently. A survey showed that in the United States it is usually the husband who buys diapers after his wife gives birth, and men also tend to buy beer for themselves. As a result, beer and diapers were put in the same place to make shopping easier for customers and to increase sales [21].
The key step in association rule mining is finding the frequent itemsets, that is, the itemsets that appear frequently across all transaction records. Two concepts are involved: support degree and confidence degree [22]. The support degree is the probability that a set of items appears together in the transaction records, for example, the probability that commodity X and commodity Y, or commodities X, Y, and Z, are purchased at the same time. Assuming that the total number of transaction records is M, the support degree is given by formula (2):

$$\mathrm{Support}(X, Y) = P(XY) = \frac{\mathrm{number}(XY)}{M} \tag{2}$$

where number(XY) is the number of transaction records that contain both X and Y.
The confidence degree is the conditional probability that commodity Y is purchased given that commodity X is purchased, that is, the probability that a customer who purchases X will also buy Y:

$$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(XY)}{P(X)}$$
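The two measures can be illustrated with a few lines of Python. This is only a sketch: the helper names `support` and `confidence` and the five shopping baskets are invented for illustration, with M = 5 transaction records.

```python
def support(transactions, items):
    """Fraction of the M transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) = support(X and Y) / support(X)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets (M = 5), invented for illustration.
baskets = [
    {"beer", "diapers", "milk"},
    {"beer", "diapers"},
    {"diapers", "bread"},
    {"beer", "bread"},
    {"milk", "bread"},
]

sup = support(baskets, {"beer", "diapers"})         # 2 of the 5 baskets
conf = confidence(baskets, {"diapers"}, {"beer"})   # 2 of the 3 diaper baskets
print(sup, conf)
```

In this toy data, beer and diapers are bought together in two of five baskets, and two of the three customers who bought diapers also bought beer, so a rule "diapers ⇒ beer" would pass a minimum support of 0.4 and a minimum confidence of 0.5.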

Application of Association Rules
If the system holds no videos scored by a learner, we can still mine the associations among learning materials from the videos that the learner has watched. Here, videos that a learner watched for no fewer than two minutes are selected: if a video is watched for two minutes or more, we consider the learner to be interested in its content. A learner who is interested in a video may also open it, leave after less than two minutes because something else temporarily comes up, and open it again later. Therefore, a video that the learner has opened no fewer than two times is selected as well. According to these two criteria, we read the relevant data from the learning platform and store it in an Excel table, called the learner-video table, as the collection of transactions. To keep the presentation short, Table 1 shows only five learners as an example. L1 denotes learner 1, and the first row of the table indicates that the videos watched by learner 1 for no fewer than two minutes, or opened no fewer than two times, are: what is data mining, a priori algorithm explanation, classification by decision tree, and the basic principle of random forest.

Table 1. Learner-video table.

| Learner | Video Name |
| --- | --- |
| L1 | what is data mining, a priori algorithm explanation, classification by decision tree, the basic principle of random forest |
| L2 | what is data mining, a priori algorithm explanation, support vector machine, regression analysis |
| L3 | what is data mining, a priori algorithm explanation, support vector machine, web crawler, machine learning |
| L4 | what is data mining, a priori algorithm explanation, machine learning, BP neural network model |
| L5 | what is data mining, machine learning, convolutional neural network, Bayesian classifier |

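The two selection criteria used to build the learner-video table can be sketched as follows. The watch-log layout (learner, video, seconds watched per session) and the function name `build_transactions` are assumptions made for illustration; the platform's real log format is not given in the paper. The sketch also assumes that the two-minute rule applies to the longest single viewing of a video.

```python
from collections import defaultdict

MIN_SECONDS = 120   # "no fewer than two minutes"
MIN_OPENS = 2       # "no fewer than two times"

# Hypothetical watch log: (learner, video, seconds watched in one session).
watch_log = [
    ("L1", "what is data mining", 300),
    ("L1", "web crawler", 40),
    ("L1", "regression analysis", 50),
    ("L1", "regression analysis", 70),
    ("L2", "what is data mining", 30),
    ("L2", "support vector machine", 600),
]

def build_transactions(log):
    """Return {learner: set of videos} for videos meeting either selection rule."""
    longest = defaultdict(int)   # longest single viewing per (learner, video)
    opens = defaultdict(int)     # number of times the video was opened
    for learner, video, seconds in log:
        key = (learner, video)
        longest[key] = max(longest[key], seconds)
        opens[key] += 1
    transactions = defaultdict(set)
    for (learner, video), count in opens.items():
        if longest[(learner, video)] >= MIN_SECONDS or count >= MIN_OPENS:
            transactions[learner].add(video)
    return dict(transactions)

transactions = build_transactions(watch_log)
print(transactions)
```

Here "web crawler" is dropped for L1 (one short viewing), while "regression analysis" is kept because it was opened twice even though each viewing was short.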
We set the minimum support degree to 0.4 and the minimum confidence degree to 0.5. The video names above are represented by letters: A denotes what is data mining, B denotes a priori algorithm explanation, C denotes classification by decision tree, D denotes the basic principle of random forest, E denotes support vector machine, F denotes regression analysis, G denotes web crawler, H denotes machine learning, I denotes BP neural network model, J denotes convolutional neural network, and K denotes Bayesian classifier. Thus, Table 1 is transformed into Table 2. The iterative method of the Apriori algorithm is then used to find the frequent itemsets. First, we list all itemsets containing exactly one item together with their support degrees, which gives Table 3.

Table 3. 1-itemsets and their support degrees.

| Itemsets | Support |
| --- | --- |
| {A} | 1.0 |
| {B} | 0.8 |
| {C} | 0.2 |
| {D} | 0.2 |
| {E} | 0.4 |
| {F} | 0.2 |
| {G} | 0.2 |
| {H} | 0.6 |
| {I} | 0.2 |
| {J} | 0.2 |
| {K} | 0.2 |

The items whose support is below the minimum support are pruned, which gives the frequent 1-itemsets in Table 4.

Table 4. Frequent 1-itemsets.

| Itemsets | Support |
| --- | --- |
| {A} | 1.0 |
| {B} | 0.8 |
| {E} | 0.4 |
| {H} | 0.6 |

We join the frequent 1-itemsets to obtain candidate 2-itemsets and again prune those whose support is below the minimum support. The resulting frequent 2-itemsets are shown in Table 5. Joining and pruning are repeated in this way until no new (K + 1)-itemset is produced, which gives Table 6. In other words, among these five learner records, A and H appear together three times, A and B appear together four times, and A, B, and E appear together twice. These itemsets satisfy the minimum support degree of 0.4. The largest frequent itemsets contain three items, for example {A, B, E}; by the same count, A, B, and H also appear together twice (learners L3 and L4), so {A, B, H} meets the threshold as well. We can then calculate the confidence of the rules derived from these itemsets. This small example only illustrates how association rules are used to find strongly associated videos and then recommend them; in practice, we use a much larger sample.
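The join-and-prune iteration can be reproduced on the five learner records of Table 1, encoded with the letters A to K, using the short Python sketch below. It is a minimal illustration of the Apriori idea written for this example, not the implementation used on the platform; the function names are chosen here for illustration.

```python
from itertools import combinations

# The five learner records of Table 1, encoded with the letters A to K.
transactions = [
    {"A", "B", "C", "D"},       # L1
    {"A", "B", "E", "F"},       # L2
    {"A", "B", "E", "G", "H"},  # L3
    {"A", "B", "H", "I"},       # L4
    {"A", "H", "J", "K"},       # L5
]
MIN_SUPPORT = 0.4

def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def apriori():
    """Return {k: {frozenset: support}} for all frequent k-itemsets."""
    items = sorted(set().union(*transactions))
    current = {frozenset([i]) for i in items}
    frequent, k = {}, 1
    while current:
        # Keep only the candidates that reach the minimum support.
        level = {c: support(c) for c in current if support(c) >= MIN_SUPPORT}
        if not level:
            break
        frequent[k] = level
        # Join step: merge frequent k-itemsets into (k + 1)-item candidates,
        # then prune candidates that have an infrequent k-item subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

frequent = apriori()
for size, level in frequent.items():
    print(size, sorted(("".join(sorted(s)), v) for s, v in level.items()))
```

On these five records the sketch stops at itemsets of size three and reports {A, B, E} and {A, B, H}, each with a support of 0.4. The confidence of a rule such as {A, B} ⇒ {E} then follows as 0.4 / 0.8 = 0.5, which just reaches the minimum confidence of 0.5.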

Data Reading
Before the algorithm is applied, the relevant data is read from the online learning platform, analyzed, and processed.
The video data is read out and stored in two Excel tables. The first, "video learning materials table.XLS", stores 5792 learning videos; its fields are "video Number", "video name", and "video category". The second, "Evaluation Score table.XLS", stores 136,800 scores; its fields are "Learner number", "video Number", and "score". The data is read with pandas. Because the original table is too large, Table 7 shows only its first ten rows. The two tables can be associated through the "video Number" field and merged into a single table with the merge() function.
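The reading and merging step might look like the following sketch. To keep it self-contained, two small in-memory tables stand in for the Excel files; the rows and the category values are invented for illustration, and with the real files one would load each table with `pd.read_excel` instead.

```python
import pandas as pd

# Small in-memory stand-ins for the two Excel tables (invented rows).
videos = pd.DataFrame({
    "video Number": [1, 2, 3],
    "video name": ["big data analysis technology", "machine learning", "web crawler"],
    "video category": ["data science", "data science", "programming"],
})
scores = pd.DataFrame({
    "Learner number": [101, 101, 102, 103],
    "video Number": [1, 2, 1, 3],
    "score": [5, 4, 4, 2],
})

# Show at most the first ten rows of the score table.
print(scores.head(10))

# Join each score with the metadata of the video it refers to.
merged = pd.merge(scores, videos, on="video Number")
print(merged)
```

Because every score row refers to exactly one video, the merged table has one row per score, now carrying the video name and category alongside the learner number and score.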

Data Analysis
The data in the consolidated Table 8 is grouped by video name, and the mean() function is used to calculate the average score of each video. The averages are sorted from high to low. The count() function is used to count how many times each video was scored, and the result is added as the field "scoring times". Table 9 shows only five rows of the output. From Table 9, it can be seen that videos with fewer evaluations also tend to have lower scores, and videos with higher scores have usually been evaluated more times. The videos with very low ratings are removed.
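A minimal pandas sketch of this aggregation is given below. The merged table is a hypothetical stand-in, and the cut-off `MIN_MEAN_SCORE` used to remove very low-rated videos is an assumed value, since the paper does not state the one it used.

```python
import pandas as pd

# Hypothetical merged table: one row per (learner, video) score.
merged = pd.DataFrame({
    "video name": ["machine learning"] * 3 + ["web crawler"] * 2 + ["regression analysis"],
    "score": [5, 4, 5, 3, 4, 1],
})

# Average score per video, sorted from high to low.
summary = (merged.groupby("video name")["score"].mean()
           .sort_values(ascending=False).to_frame("score"))

# Number of scores each video received, aligned on the video name.
summary["scoring times"] = merged.groupby("video name")["score"].count()

# Drop videos with very low average ratings (the cut-off is an assumed value).
MIN_MEAN_SCORE = 2.0
summary = summary[summary["score"] >= MIN_MEAN_SCORE]
print(summary.head())
```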
Assuming that a student gives a high score to big data analysis technology, we need to find videos that are highly similar to it to recommend to the student.
To analyze the data in different ways, such as by sum or count, we convert the original data into a pivot table. The learner number is used as the index, the video name as the columns, and the video score as the values. We use the describe() function to view the descriptive statistics of the table; the results are shown in Table 10. The rows of the table represent different learners and the columns represent different videos. Because the table is too large, only five columns and the first four rows are shown as an example. NaN indicates that the learner did not give a score for the video. The table is sparse: the number of video learning materials is large, but each student evaluates only a few of them.
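The conversion to a learner-by-video table can be sketched with `pivot_table`. The five score rows are hypothetical; they only show how combinations without a score end up as missing values.

```python
import pandas as pd

# Hypothetical merged scores: one row per (learner, video) evaluation.
merged = pd.DataFrame({
    "Learner number": [101, 101, 102, 102, 103],
    "video name": ["big data analysis technology", "machine learning",
                   "big data analysis technology", "web crawler",
                   "machine learning"],
    "score": [5, 4, 4, 2, 3],
})

# Rows: learners, columns: videos, cells: scores (NaN where no score exists).
user_video = merged.pivot_table(index="Learner number",
                                columns="video name",
                                values="score")
print(user_video)

# Per-video descriptive statistics (count, mean, std, quartiles, ...).
print(user_video.describe())
```

Even in this tiny example four of the nine cells are empty, which mirrors the sparsity of the real table.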

System Construction
After data processing, correlation analysis can be carried out. Taking "big data analysis technology" as an example, we show which similar videos should be recommended to learners who have watched this video.
The scores for big data analysis technology are extracted from Table 10, and the first seven rows are displayed with the head() function in Table 11. The Pearson correlation coefficient between big data analysis technology and each of the other videos is calculated with the corrwith() function. If no learner has evaluated both videos, the covariance in the Pearson correlation coefficient cannot be calculated, and the correlation coefficient is empty. Removing these null values turns Table 11 into Table 12. Using the merge() function with row-index alignment, the correlation coefficients and the evaluation times of each video are then combined in one table, as shown in Table 13. With such a large video collection, only a limited number of videos are evaluated by learners, and many videos have very few evaluations, so the statistics may be biased and fail to show the real situation. We therefore set a minimum number of evaluation times, with a threshold of 50. The sort_values() function is then used to arrange the table in descending order of correlation coefficient; the results are shown in Table 14. After the threshold is applied, the five videos with the highest Pearson correlation coefficient with big data analysis technology are the basic algorithm of data mining, BP neural network model, deep learning algorithm, support vector machine, and machine learning. Therefore, based on 10,078 scores of 5671 videos in the original data, the item-based collaborative filtering algorithm with the Pearson correlation coefficient as the similarity measure leads to the conclusion that big data analysis technology is highly similar to the basic algorithm of data mining and BP neural network model. It can be considered that learners who are willing to learn big data analysis technology will also like the basic algorithm of data mining and BP neural network model. Accordingly, the basic algorithm of data mining and BP neural network model can be recommended to learners who rate big data analysis technology highly, and big data analysis technology can be recommended to users who rate the basic algorithm of data mining or BP neural network model highly.
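The whole similarity step can be sketched on a toy score table as follows. The scores are hypothetical, the threshold is lowered from 50 to an assumed `MIN_RATINGS = 4` so that the toy data passes it, and the evaluation counts are attached by index-aligned column assignment rather than by merge(), which is a simplification made here.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Hypothetical learner x video score table (NaN = no score given).
user_video = pd.DataFrame({
    "big data analysis technology": [5, 4, 2, 1, 5, nan],
    "machine learning":             [5, 5, 1, 2, 4, 3],
    "web crawler":                  [1, 2, 5, 4, nan, 2],
    "regression analysis":          [nan, nan, nan, nan, nan, 4],
}, index=[101, 102, 103, 104, 105, 106])

target = user_video["big data analysis technology"]

# Pearson correlation of every video column with the target video.
corr = user_video.corrwith(target).to_frame("correlation")
corr = corr.dropna()   # videos with no common raters give NaN

# Attach the number of scores per video and keep well-evaluated videos only.
corr["scoring times"] = user_video.count()
MIN_RATINGS = 4        # the paper uses 50; lowered here for the toy data
result = (corr[corr["scoring times"] >= MIN_RATINGS]
          .drop(index=target.name)          # a video is trivially similar to itself
          .sort_values("correlation", ascending=False))
print(result)
```

In this toy table "machine learning" comes out as the most similar video to the target, "web crawler" is strongly negatively correlated, and "regression analysis" is discarded because no learner scored both it and the target.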

Conclusions
At present, artificial intelligence algorithms are widely used in all trades and professions. The collaborative filtering algorithm is also very common in commercial applications [3,4,23]. It can mine the content or goods that users are interested in and make personalized recommendations. If the recommended content or product matches the user's needs, it can optimize the user experience and create additional revenue. In this work, personalized recommendation effectively saves learners' search time, improves the stickiness of the platform, and solves the problem of information overload. For students who have left evaluations in the system, the collaborative filtering algorithm from machine learning is used, with the Pearson correlation coefficient to measure the similarity between video materials, and similar videos are recommended to interested students. For students who have not left an evaluation, we use data mining to find the associations among videos according to browsing behavior; these strong associations are the basis of our recommendation. Our methods can be used on any platform with video learning materials. On the one hand, they capture students' personalized information and make it easier for students to search for materials, creating a supportive personalized learning environment. On the other hand, they improve the stickiness of the platform and optimize the platform's data management. A general recommendation system uses only one recommendation method. We use two methods so that all kinds of students on the learning platform receive recommendations, which lets a recommendation system based on our methods serve more of its users than one built on a single method.