Article

Intelligent Teaching Recommendation Model for Practical Discussion Course of Higher Education Based on Naive Bayes Machine Learning and Improved k-NN Data Mining Algorithm

1 Postdoctoral Innovation Practice Base of Sichuan Province, Leshan Vocational and Technical College, Leshan 614000, China
2 Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education, Beijing 100039, China
3 Research Center for the Protection and Development of Local Cultural Resources, Xihua University, Chengdu 610039, China
4 Department of Military Logistic, Army Logistics Academy, Chongqing 401331, China
5 Chongqing Vocational Institute of Engineering, Chongqing 402260, China
* Authors to whom correspondence should be addressed.
Information 2025, 16(6), 512; https://doi.org/10.3390/info16060512
Submission received: 31 March 2025 / Revised: 24 May 2025 / Accepted: 17 June 2025 / Published: 19 June 2025
(This article belongs to the Special Issue AI Technology-Enhanced Learning and Teaching)

Abstract

Aiming at the existing problems in practical teaching in higher education, we construct an intelligent teaching recommendation model for a higher education practical discussion course based on naive Bayes machine learning and an improved k-NN data mining algorithm. Firstly, we establish the naive Bayes machine learning algorithm to achieve an accurate classification of the students in the class and then group the students based on this classification. Then, relying on the student grouping, we use the matching features between the students’ interest vector and the practical topic vector to construct an intelligent teaching recommendation model based on an improved k-NN data mining algorithm, in which the optimal complete binary encoding tree for the discussion topic is modeled. Based on this encoding tree, an improved k-NN recommendation model is established to match the interests of each student group and recommend discussion topics. The experimental results show that our proposed recommendation algorithm (PRA) can accurately recommend discussion topics for different student groups, match the interests of each group to the greatest extent, and improve the students’ enthusiasm for participating in practical discussions. Compared with the control algorithms, the user-based collaborative filtering recommendation algorithm (UCFA) and the item-based collaborative filtering recommendation algorithm (ICFA), the PRA achieves higher accuracy, recall rate, precision, and F1 value under both single-dataset and multiple-dataset experimental conditions, and it shows better recommendation performance and robustness.

1. Introduction

The goal of higher education is to cultivate innovative and applied talents [1]. It focuses on cultivating students’ interests and improving their practical abilities, encouraging students to transform their theoretical knowledge into practical skills [2]. Based on the training objectives of higher education, course design includes two parts: theoretical teaching and practical teaching [3]. Practical teaching focuses on cultivating and examining the students’ mastery of a certain skill [4]. Traditionally, the basic teaching mode of practical courses is “teacher transferring knowledge”, which means that the teachers design the teaching content while the students passively receive it [5]. The process of this teaching mode is relatively simple. The teachers firstly determine and analyze the theoretical course content, and then extract the skills and requirements involved in the course, determine the practical content based on the skills and requirements, and divide the practical content into several implementation steps, which will be completed by the student groups. At the end of the course, the entire practical teaching activity is summarized by the whole class [6]. This process mainly involves three steps: firstly, the design of the practical content; secondly, grouping students based on the practical topics; and thirdly, the implementation of the practical teaching [7]. Analyzing the design and implementation process of practical teaching in the traditional mode, it can be concluded that the teachers play a leading role in this process. The design of the practical topics, the student grouping, and the implementation and evaluation of practical teaching are all independently completed by the teachers. Students participate in the entire practical course, receive the practical content designed by the teachers, and complete the practical content [8]. This mode has the features of teacher control, prominent theme, and strong pertinence. It can effectively explore the core content of the theoretical course, transform the theoretical teaching content into practical content, guide students to complete the practice in order, and improve the students’ skills in a targeted manner [9].
In addition, there are some other commonly used teaching methods in practical courses, such as the scenario simulation method, the experimental operation method, the task-driven method, the game teaching method, the visiting teaching method, and so on. In the scenario simulation method, the teachers design the teaching scenarios, and the students are divided into groups to perform, identify, and solve problems [10]. In the experimental operation method, the teachers design the experimental content and process, and the students are divided into groups to complete the experiment. Students are required to perform hands-on operations to discover and solve problems during the experimental process [11]. In the task-driven method, the teachers design specific tasks, the students are divided into groups, and the teachers ask each group to complete different tasks to achieve the specified goals [12]. In the game teaching method, the teachers design games with simulated scenarios, in which the students are grouped together to participate and solve problems during the game process [13]. In the visiting teaching method, the teachers lead the students to visit the teaching site, in which the students are divided into groups to discuss and solve problems [14]. These practical teaching methods have their own features, but they all rely on three aspects: the teaching content design, the student grouping, and the practical teaching implementation. Based on the teaching objectives and contents, the three core elements of practical teaching are analyzed, and the following problems in the traditional practical teaching mode are summarized.
The design of the practical content is based on the teachers’ teaching experience and their mastery of the teaching content, and the practical topics are specified only through qualitative description. There is no clear quantitative method to explore and visualize the core elements involved in the teaching content, nor a quantitative standard or executable procedure to guide the teachers in extracting the practical elements, resulting in low accuracy in the design of the practical topics.
Grouping students is based on the teachers’ experience and the students’ free matching, without strict grouping criteria or a scientifically quantified grouping method. The grouping process is highly random and casual, and the features and interests of group members vary greatly, resulting in low accuracy of grouping.
The implementation of the practical teaching lacks personalized interest mining or a method for matching and recommending practical content based on the interest mining results, resulting in a low degree of matching between the practical topics and the interests of each group’s members. This ultimately leads to a low quality of practical teaching and poor classroom teaching effectiveness.
In response to the above problems, the practical teaching mode needs to be optimized from three aspects: quantitative modeling of the core elements of the practical topics, a quantitative grouping algorithm model for the students, and a matching and recommendation algorithm model for the practical topics [15]. Firstly, it is necessary to determine the practical teaching topics based on the theoretical teaching content, explore the core element labels of the practical teaching topics, and construct a label quantification model. Secondly, based on the label quantification model, the student interest mining model and the algorithm model for the student grouping are constructed to achieve accurate classifications of students, so that students within the same group all have the closest interests. We establish a practical topic recommendation algorithm based on the student grouping model, with interest mining and matching as the core methods. It aims to accurately recommend the practical topics for each group; then, the teachers and students make decisions based on the recommendation results [16]. According to the problem analysis and the modeling principle, we construct an intelligent teaching recommendation model for a higher education practical discussion course based on naive Bayes machine learning and an improved k-NN data mining algorithm. The main work includes the following aspects.
(1) We analyze the research background of “artificial intelligence plus education” and the current status and existing problems of practical teaching methods. In response to these problems, we propose an intelligent teaching recommendation model based on naive Bayes machine learning and an improved k-NN data mining algorithm and summarize its advantages.
(2) We construct the student grouping algorithm based on naive Bayes machine learning. Firstly, we establish the training set model for the naive Bayes machine learning algorithm and collect the training dataset from the previous classes. Secondly, we set up a naive Bayes machine learning model to group the students in a class, so that the students in each group have the closest interests.
(3) We build a teaching recommendation model based on the improved k-NN data mining algorithm on top of the class grouping. Firstly, an optimal complete binary encoding tree for the discussion topic is constructed, and the feature attributes of the discussion topic are matched against the interests of each group of students. The binary encoding tree is established on the spatial coordinate system of the discussion topic, and the recommendation model is built on this coordinate system.
(4) We design experiments to validate the proposed algorithm, demonstrating its feasibility from three aspects as detailed in the following sections: “Results and Analysis on the Naive Bayes Grouping”, “Results and Analysis on the Proposed Teaching Recommendation Algorithm”, and “Results and Analysis of the Comparative Experiment”. Compared with the traditional collaborative filtering algorithms, the proposed algorithm has higher accuracy, recall rate, precision, and F1 value.

2. Related Works and the Advantages, Application Purpose, Formulated Requirements, and Constraints of the Proposed Model

2.1. Related Works

In the field of artificial intelligence development, the integration of intelligent algorithms into teaching practice is currently a hot research topic. The research on teaching recommendation algorithms mainly focuses on the combination of recommendation algorithms with teaching platform development, teaching design, teaching evaluation, and other aspects. Ren [17] constructed an e-learning-based recommendation model incorporating the user’s previous behaviors, which are used to mine the user’s interest data and recommend Chinese learning resources. The model has been experimentally tested and shows good performance. Fu et al. [18] designed a multimedia system based on Chinese language teaching by combining data mining techniques and recommendation algorithms. The research optimized and improved the information resource library of the system from the perspective of users and added system functions to meet the personalized needs of certain users. It enhances the efficiency of the teaching system. Yin [19] developed an improved collaborative filtering algorithm that combined users’ social relationships and behavioral characteristics to recommend relevant teaching content for music teaching platforms. The algorithm applies the users’ social relationships to construct the similarity calculation formula and uses the behavioral feature data as the basis for recommendation calculation, improving the accuracy of the recommendation algorithm. Liu [20] established a new mode of college English teaching based on the personalized recommendation of teaching resources and developed a teaching recommendation model based on a collaborative filtering algorithm to improve the accuracy of the recommendation algorithm. Zhang et al. [21] established a recommendation model for the online teaching of Chinese as a foreign language based on user interest similarity using a collaborative filtering recommendation algorithm. Three experiments were designed to prove that the constructed algorithm has good recommendation performance. Ying [22] developed an interactive AI virtual teaching resource recommendation algorithm based on similarity measurements. The core idea is to mine the users’ similarity and find similar user neighbors for the current users, thereby recommending the teaching resources to the neighboring users and improving the accuracy of the recommendation results. Liu [23] developed a collaborative filtering recommendation algorithm based on the user, content, and student profiles. It analyzed the users’ previous behaviors to construct their interest profiles and incorporated a recommendation algorithm to build a personalized classroom teaching model. Lu [24] used information retrieval technology to optimize a recommendation algorithm and combined two methods for recommendation. The main steps that affect the recommendation results in recommendation algorithms include similarity calculation, nearest neighbor selection method, rating prediction method calculation, etc., which can accurately retrieve the content that users are interested in and greatly improve the diversity of recommended content. Gavrilovic et al. [25] constructed two discrete-element heuristic algorithms to group students in e-learning. It groups students with different knowledge levels and recommends teaching content to each group of students, thus improving the overall efficiency of the online learning process.
Baig et al. [26] proposed an efficient knowledge-graph-based recommendation framework that can provide personalized e-learning recommendations for existing or new target learners with sufficient previous data of the target learners. Bhaskaran et al. [27] developed an intelligent recommendation system by using clustering based on splitting and conquering strategies, which can automatically adapt to learners’ needs, interests, and knowledge levels; provide intelligent suggestions by evaluating the ratings of frequent sequences; and offer the optimal recommendations for the learners. Nachida et al. [28] designed a new educational recommendation model: EDU-CF-GT, which is based on the universal CF-GT model. The constructed model can adapt to the complexity of the education field and improve learning efficiency by simplifying resource acquisition. Bustos López et al. [29] developed an educational recommendation system that combines collaborative filtering with sentiment detection technology to recommend educational resources to users based on their preferences/interests and user emotions detected by facial recognition technology. Amin et al. [30] proposed a new personalized course recommendation model, which was implemented by an intelligent electronic learning platform. This model aims to collect data on the students’ academic performance, interests, and learning preferences and use the information to recommend the most beneficial courses for each student. Lin et al. [31] developed a deep learning recommendation system, which includes augmented reality (AR) technology and learning theory, for non-professional students with different learning backgrounds to learn. It can effectively improve the students’ academic performance and optimize their computational thinking ability. Chen et al. [32] designed a reliable personalized teaching resource recommendation system for online teaching under large-scale user access. It combines collaborative filtering and unit closure association rules, achieving the reliable recommendation of personalized teaching resources. Qu [33] constructed a personalized system and joint recommendation technology for English teaching resources. It improved the traditional joint recommendation algorithms and proposed hybrid recommendation. The recommendation system has good effectiveness and stability in both performance and practical applications. Wang et al. [34] designed a recommendation system based on multiple collaborative filtering hybrid algorithms and evaluated the performance of the recommendation system through teaching practice. The experiment proved that this hybrid method has certain advantages in recommending Chinese learning resources, with high accuracy in recommending learning resources.
Based on the analysis of the existing research, it can be concluded that the current research on teaching recommendation systems mainly focuses on the combination of recommendation algorithms with teaching platform development, teaching design, teaching evaluation, and other aspects. In collaborative filtering algorithms, the users’ previous behaviors are mined, and teaching resources are recommended to users. The accuracy of the teaching resource recommendations is increased by improving the recommendation algorithms. However, there are several problems with these types of research methods. Firstly, most of the research only learns from the users’ perspectives, mining their interests and establishing the recommendation models without quantitatively modeling the teaching contents, teaching topics, or teaching resources. The teaching objects targeted by the models are not clearly quantified, and the matching relationship between the user interests and the teaching objects is vague. Secondly, some research explores the behaviors of previous users or analyzes their browsing behaviors to obtain interest tendencies. This method obtains approximate user behaviors rather than precise interests, and the recommendation results cannot fully match the current users’ interests, resulting in low accuracy. Thirdly, some researchers design recommendation models based on collaborative filtering algorithms. The collaborative filtering algorithm itself has limitations. The user-based collaborative filtering recommendation algorithm (UCFA) searches for the approximate users, while the item-based collaborative filtering recommendation algorithm (ICFA) searches for the approximate items. These methods are both approximate searching algorithms, and the recommendation results are also based on approximate matching, resulting in low accuracy.

2.2. The Advantages, Application Purpose, Formulated Requirements, and Constraints of the Proposed Model

2.2.1. The Innovations and Advantages of the Proposed Model

Regarding the problems of the related works, our proposed teaching recommendation model based on naive Bayes machine learning and an improved k-NN data mining algorithm has the following innovations. Firstly, we quantitatively mine the teaching contents and topics and construct a quantitative matrix to label the core elements of practical teaching, so that the teaching objects have clear quantitative expressions, which is more in line with the data structure required for building the recommendation algorithm. Secondly, we construct a machine learning algorithm and use the naive Bayes classification model to classify the interests of students in the class, achieving the teaching grouping. The model can accurately group class students based on the teaching content labels, the student feature attributes, and the teaching topic categories, providing the grouping criteria and standards for the practical classroom teaching. Thirdly, the establishment of the recommendation model is based on an improved k-NN data mining algorithm, which achieves the optimal matching for group interests and practical teaching topics, providing a basis for student group matching, recommending the most suitable teaching topics for the students’ interests, and providing decision support for the teachers to implement the practical teaching. Fourthly, the recommendation model based on the improved k-NN data mining algorithm has higher accuracy than the collaborative filtering recommendation algorithms, and the recommendation results are more in line with the students’ interests and needs.
The constructed recommendation model has the following advantages. Firstly, in the existing research on teaching recommendation systems, most studies focus on recommending teaching resources, student courses, learning plans, etc. However, there is a lack of research specifically on recommendation systems for organizing the practical discussion courses in universities. This model can effectively solve this problem. Secondly, compared with the fuzzy recommendation implemented by traditional collaborative filtering algorithms, this model is based on accurate mining of the student interest labels and the teaching topic labels, achieving precise matching between the two labels. The recommended results are closer to the students’ interests and have higher accuracy. Thirdly, compared with the large-scale model and universal features of traditional teaching recommendation systems, this model can achieve personalized teaching in small classes, divide students into groups according to their interests, and enable the teachers to accurately quantify the interest tendencies of the group members. Based on the label matching, personalized practice content can be recommended and designed for the students, achieving personalized thematic discussions and effectively improving the students’ learning interests and the teaching quality.

2.2.2. The Application Purpose, Formulated Requirements, and Constraints of the Proposed Model

1. The Application Purpose and Formulated Requirements
The constructed teaching recommendation model has a completely different application purpose from the recommendation models for course selection and teaching planning discussed in the literature review. Firstly, the constructed model does not address which courses the students should choose to achieve their semester goals; instead, it is based on a course that has already been offered and is intended to design the specific teaching method for that course. In the model, the teaching process of the course must include practical discussion classes, and the model is used to accurately recommend the theme of a certain discussion class in the course. The recommendation model is specifically designed for teaching activities such as student interest grouping, discussion topic classification, discussion topic matching, and recommendation in the practical discussion courses. Secondly, the constructed model is not designed to recommend the most suitable courses for the students’ specific majors. Instead, based on the established course plan, it helps the teachers and the students design the discussion topics, determine the specific student groups, and recommend the specific discussion topics for each group in the practical teaching of the courses. Thirdly, the constructed model is not aimed at how to choose specific learning issues but recommends the most suitable discussion topics for the different interest groups based on the determined learning contents and themes, in order to achieve problem-oriented discussion teaching. Thus, we summarize the application purpose and formulated requirements of the constructed teaching recommendation model as follows:
(1) It is used for a specific course.
(2) The course must include the teaching process of practical discussion.
(3) During the teaching process, it is specifically used for a practical discussion class.
(4) It is used for student interest mining, student interest grouping, discussion topic classification, and recommendation in the practical discussion class.
(5) It is suitable for a small class with a moderate number of students (such as small classes of 10–20 students). Additionally, the number of student groups should not be too large, and the number of students in each group should not be too large, either. When there are a large number of students in a class (such as 50 students), it is necessary to split the class or increase the class hours to complete the teaching task in batches and ensure the teaching quality.
(6) The number of discussion topics is limited to one, and the number of grouped topics derived from the discussion topic cannot exceed five.
2. The Model Features in Practical Teaching Cases
Based on the application purpose and the formulated requirements of the model, combined with the specific scenarios of practical teaching, the constructed teaching recommendation model has the following features and advantages in practical teaching cases compared to the traditional recommendation models:
(1) Accuracy. Traditional recommendation models commonly use collaborative filtering algorithms to implement recommendations, which are based on the previous users’ interests or current users’ behaviors to recommend objects that are close to the current users’ preferences. It is an approximate recommendation. The constructed teaching recommendation system matches the discussion topics based on the current interests of students and has built a strict mathematical model for matching interests, which has the feature of accuracy. In the teaching recommendation system, the recommended objects are the discussion topics, and the users are the students. The feature of accuracy is reflected in the modeling process of feature engineering in the recommendation system, which includes the following features:
  • Primary feature labels of the recommended objects: the designed discussion topic for the course;
  • Secondary feature labels of recommended objects: the grouped discussion topics determined by the discussion topic, with each group representing an interest tendency;
  • Primary feature labels for students: based on the secondary feature labels of recommended objects, we design the labels that students are interested in, and let the students judge their degree of preference for the labels;
  • Student classification labels: corresponding to the secondary feature labels of the recommended objects;
  • Discussion contents and feature labels: we further subdivide the group discussion topics into several discussion contents, quantify the labels of the discussion contents, and use them to match the student interest labels.
(2) Personalization. The constructed recommendation model has personalized features, achieving precise matching and recommendation between the student interest labels and the discussion contents, meeting the personalized interests of each student. Its feature of personalization is manifested in the internal algorithmic logic of the recommendation system outputting the discussion contents that match the interests. It includes the following features:
  • Collection and quantification of discussion content labels: we determine the discussion contents based on the discussion topic, then collect discussion content labels, and quantify the labels;
  • Collection and quantification of student interest labels: regarding the designed discussion contents of student classification, we collect interest labels and quantify the labels;
  • Build the matching model: we construct the matching model between the discussion content labels and the student interest labels to achieve the personalized recommendation.
Based on the analysis of the features of the recommendation model, we compare the features of the constructed recommendation model with those presented in related works, and the results are shown in Table 1. From the feature comparison results in Table 1, it can be concluded that the constructed model has great innovation and advantages compared to the related research. It can realize the student interest grouping and recommend specific teaching contents based on the interest grouping.
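To make the feature-engineering labels above concrete, the sketch below shows one possible encoding of a discussion topic and a student. The field names, the example course theme, and the numeric preference scale are hypothetical illustrations rather than part of the model definition.

```python
# Illustrative schema for the recommendation system's feature labels
# (hypothetical field names and values).
from dataclasses import dataclass

@dataclass
class DiscussionTopic:
    primary_label: str                      # primary feature label: the designed discussion topic
    grouped_topics: list[str]               # secondary feature labels: one interest tendency per group
    content_labels: dict[str, list[float]]  # quantified labels of the subdivided discussion contents

@dataclass
class Student:
    name: str
    interest_labels: dict[str, float]       # degree of preference for each secondary feature label
    classification_label: str = ""          # assigned group, corresponding to a secondary label

topic = DiscussionTopic(
    primary_label="Rural tourism development",
    grouped_topics=["culture", "economy", "ecology"],
    content_labels={"culture": [0.9, 0.3, 0.5], "economy": [0.2, 0.8, 0.6], "ecology": [0.4, 0.1, 0.7]},
)
student = Student(name="S(1)", interest_labels={"culture": 0.8, "economy": 0.4, "ecology": 0.2})
```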
3. The Constraints of the Model
Based on the analysis of the application purpose and the formulated requirements, it can be concluded that the constructed teaching recommendation model has certain constraints in its application. Firstly, it is applicable to courses that include practical discussion teaching, while for courses that do not include the practical discussion teaching (such as theoretical courses), this recommendation model is not applicable. Secondly, there are constraints on the specific objectives and discussion contents of the practical teaching. It is suitable for classes that include practical activities such as exploration, debate, and discussion in order to train the students’ oral expression ability, logical thinking ability, and teamwork ability. It is particularly suitable for subjects in the humanities and social sciences such as tourism, education, philosophy, history, and sociology, as well as group discussions on solutions in the subjects of the natural sciences. This method is not suitable for practical activities such as writing computer programs, building algorithms, executing engineering projects, conducting chemical and physical experiments, etc. Overall, the recommendation model is suitable for oral debates and discussions but not for practical applications that require hands-on experience. Thirdly, the algorithm design of the recommendation model is suitable for teaching small classes. For classes with a large number of students, in order to ensure the teaching quality, it is necessary to divide the classes or increase class hours to complete the teaching tasks in batches.

3. Methodology

3.1. Class Grouping Algorithm Based on Naive Bayes Machine Learning

The naive Bayes machine learning algorithm is a classification algorithm built on the basis of constructing a training set for the feature attribute objects containing independent properties, with the goal of quantifying the posterior probability of the object to be classified belonging to a certain class [35]. For the practical teaching content of the same teaching course, the interest classification of the previous class of students provides the raw structured data for constructing the naive Bayes machine learning algorithm [36]. We firstly construct a training set model for the naive Bayes machine learning algorithm, and then establish a class grouping algorithm based on the naive Bayes machine learning. The modeling principle of the algorithm, as well as how the algorithm realizes the student grouping, are interpreted as follows:
(1) The first step: The naive Bayes machine learning algorithm is a supervised learning algorithm. Therefore, for a certain discussion course, we select an adequate number of students and their label data from the previous classes that have organized the same discussion course to construct a training set for the naive Bayes machine learning algorithm. The training set includes the recommendation system labels described in Section 2.2.2, namely the primary feature labels of students and the classification labels of students. We then quantify the labels [37].
(2) The second step: We collect the student data from the current class to be classified, including the primary feature labels of the students, and then quantify the student labels.
(3) The third step: We use the collected student data from the previous classes to construct the naive Bayes machine learning algorithm. Based on the algorithm, we input the student label data from the current class (a student who is to be classified) and calculate the posterior probability of each classification label for the student. The student’s assigned group is the group with the highest posterior probability [38].
  • The first step is used to interpret the data collected from the previous classes to build the algorithm.
  • The second step is used to interpret the data collected from the current class (a class that will be organized to have a discussion course, and its students will be grouped).
  • The third step is used to interpret how the student data in the current class is used in the naive Bayes machine learning algorithm.
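The three steps above can be prototyped with an off-the-shelf categorical naive Bayes classifier before the from-scratch construction in Sections 3.1.1 and 3.1.2. The sketch below is only illustrative: the integer label encoding, the example data, and the use of scikit-learn’s CategoricalNB (whose built-in smoothing plays a role similar to the perturbation factor introduced later) are our assumptions, not part of the original model.

```python
# Minimal sketch of the three-step grouping workflow (illustrative only).
# Assumption: student interest labels L(1)..L(k) are encoded as small
# non-negative integers, and group labels T(i) as integers 0..p-1.
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Step 1: training data from previous classes (rows = students,
# columns = k quantified interest labels), plus their group labels T(i).
X_prev = np.array([
    [2, 0, 1, 3],
    [2, 1, 1, 3],
    [0, 3, 2, 1],
    [0, 3, 2, 0],
    [1, 2, 0, 2],
])
y_prev = np.array([0, 0, 1, 1, 2])           # classification labels T(i)

# Step 2: quantified label vector of a current-class student to be grouped.
x_new = np.array([[2, 1, 1, 2]])

# Step 3: fit naive Bayes on the previous classes and assign the student
# to the group with the highest posterior probability.
clf = CategoricalNB(alpha=1.0)               # alpha smooths zero counts
clf.fit(X_prev, y_prev)
posterior = clf.predict_proba(x_new)[0]
print("posterior P(T(i)|S(x)):", posterior)
print("assigned group:", clf.predict(x_new)[0])
```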

3.1.1. Training Set Model for Naive Bayes Machine Learning Algorithm

To establish the naive Bayes machine learning algorithm, it is necessary to select the individual students from the previous classes who meet the naive Bayes classification criteria and then construct the quantitative labels for the core elements of the practical teaching topics to determine the classified labels of the individual students. We construct the feature vector and the data matrix for the students in the previous classes, quantify the data matrix for the students in the previous class, select the individual students who meet the conditions of the naive Bayes machine learning algorithm, and then establish the training set model. We first construct the relevant definitions and then establish the naive Bayes machine learning training set model. Figure 1 shows the training set model of the constructed naive Bayes machine learning algorithm.
Definition 1.
The sample student  S ( i )  and the student feature vector  S ( i ) . Naive Bayes machine learning is a supervised learning algorithm that requires collecting data from the previous courses and establishing a training set. We define an arbitrary student selected from the previous teaching class, who participates in the practical teaching and has feature attributes and interests, as the sample student, denoted as  S ( i ) . We construct a one-dimensional vector to store the teaching interest labels selected and quantified by the students. The dimension is denoted as  1 × k , and its elements store the practical teaching topic elements. This vector is the student feature vector, denoted as  S ( i ) .
Definition 2.
The classification label  T ( i )  for the discussion topic. The labels used for classification in supervised learning are the key elements of naive Bayes machine learning. We define the student classification label determined by the topic content in practical teaching as the classification label for constructing the naive Bayes machine learning, denoted as  T ( i ) . After determining the classification label, the student feature vector is expanded into dimension  1 × ( k + 1 ) , and the last element of the vector stores the label  T ( i ) .
Definition 3.
The student data matrix  S [ i ] . We select sample students from all the previous students who could be used as the training set for constructing the naive Bayes machine learning algorithm, and store them in a matrix in the order of a certain algorithm. We define the matrix as the student data matrix, denoted as  S [ i ] .
Definition 4.
The training set model T r for the naive Bayes machine learning. According to the storage rules for the feature labels and classification labels in the naive Bayes training set, the vector S ( i ) is topologically transformed to generate a training set for constructing the naive Bayes machine learning, denoted as T r . The training set stores the student labels and classifications that have undergone data preprocessing.
The algorithm for constructing the training set model for the naive Bayes machine learning based on the previous student teaching data is as follows (Appendix A.1):
Step 1: Randomly select N number of students S ( i ) from the previous classes, each of whom is an independent individual with independent and different interests, which meet the conditions S ( i ) ∩ S ( j ) = ∅ , 0 < i , j ≤ N and i , j , N ∈ ℕ . We initialize the N number of students S ( i ) as S ( 1 ) , S ( 2 ) , …, S ( N ) .
Step 2: Establish the feature vector S ( i ) for the students in the previous classes, 0 < i ≤ N , i , N ∈ ℕ . A vector S ( i ) represents a student S ( i ) . Formula (1) is the constructed vector model S ( i ) .
$$S(i) = \left[\, L(1), \ldots, L(i), \ldots, L(k) \mid T(i) \,\right] \quad (1)$$
Step 2.1: Determine the k number of the core attributes L ( i ) for the practical teaching topics, 0 < i ≤ k , i , k ∈ ℕ . Each core attribute L ( i ) is an independent feature that satisfies L ( i ) ∩ L ( j ) = ∅ . Attribute L ( i ) has a quantifiable interval or discrete value.
Step 2.2: Determine p number of discussion topics based on the practical content and note it as a classification label T ( i ) , 0 < i ≤ p , i , p ∈ ℕ .
Step 2.3: Establish a 1 × ( k + 1 ) dimensional vector S ( i ) with row rank r a n k ( S ( i ) ) r o = 1 and column rank r a n k ( S ( i ) ) c o = k + 1 . The composition of the vector elements is as follows:
(1) The first element to the no. k element store the attributes L ( i ) , corresponding to the elements S ( x , j ) = L ( i ) , 0 < i , j ≤ k , in which x represents the no. x student element.
(2) The no. k + 1 element stores T ( i ) , corresponding to the element S ( x , k + 1 ) = T ( i ) , 0 < i ≤ p , in which x represents the no. x student element.
Step 3: Establish the previous class student data matrix S [ i ] . According to the number N of students S ( i ) , we set the row rank r a n k ( S [ i ] ) r o and column rank r a n k ( S [ i ] ) c o of the n × n dimensional matrix, satisfying r a n k ( S [ i ] ) r o = N + 1 , r a n k ( S [ i ] ) c o = N + 1 . Formula (2) is the constructed matrix model S [ i ] .
$$S[i] = \begin{bmatrix} S(1) & \cdots & S(\max r_1) \\ S(r_2) & \cdots & S(\max r_2) \\ \vdots & & \vdots \\ S(r_{\max}) & \cdots & S(\max r_{\max}) \end{bmatrix} \quad (2)$$
Step 3.1: Store the N number of marked students S ( i ) in the matrix S [ i ] in increasing order of rows i and columns j . The storage status meets the following conditions:
(1) If N < n × n , then the remaining n × n − N number of elements S [ i , j ] in the matrix S [ i ] are stored as 0;
(2) If N = n × n , then the matrix S [ i ] is full rank.
Step 3.2: The elements S [ i , j ] in the matrix S [ i ] correspond to the students S ( i ) , and the vectors S ( i ) are quantified based on the previous data records. For the student S ( i ) , if any element S ( i , j ) in the vector S ( i ) satisfies S ( i , j ) = 0 , it indicates that the student has not participated in any classification or practical course. Then, there are:
(1) For S [ i , j ] in the matrix S [ i ] , if the corresponding student S ( i ) ~ S ( i ) satisfies S ( i , j ) = 0 , then set S [ i , j ] = 0 ;
(2) For S [ i , j ] in the matrix S [ i ] , if the corresponding student S ( i ) ~ S ( i ) satisfies S ( i , j ) ≠ 0 , then set S [ i , j ] = 1 .
Step 3.3: Mark all the elements of S [ i , j ] = 1 in the matrix S [ i ] with a count of N S . According to the modeling process, there is N S ≤ N .
Step 4: Build the training set model for the naive Bayes machine learning. Construct a N S × ( k + 1 ) dimensional matrix T r based on the vector S ( i ) dimension 1 × ( k + 1 ) . Formula (3) is the constructed model T r .
$$T_r = \begin{bmatrix} L(S(1),1) & \cdots & L(S(1),i) & \cdots & L(S(1),k) & T(S(1)) \\ L(S(2),1) & \cdots & L(S(2),i) & \cdots & L(S(2),k) & T(S(2)) \\ \vdots & & \vdots & & \vdots & \vdots \\ L(S(N_S),1) & \cdots & L(S(N_S),i) & \cdots & L(S(N_S),k) & T(S(N_S)) \end{bmatrix} \quad (3)$$
Step 4.1: Extract the quantified matrix S [ i ] and extract N S number of elements S [ i , j ] = 1 .
Step 4.2: Define the 1 × ( k + 1 ) dimensional empty vector S ( i ) = 0 with the row rank r a n k ( S ( i ) ) r o = 1 and the column rank r a n k ( S ( i ) ) c o = k + 1 .
Step 4.3: Initialize the row r o = 1 and expand the rows of the vector S ( i ) . If r o < N s , continue to execute and note r o = r o + 1 ; if r o = N s , complete the execution and output T r .
Step 4.4: Store the quantified labels of the N S number of students into T r . The storage rule is:
(1) The matrix row corresponds to one student vector S ( i ) ~ S ( i ) ;
(2) The element T r ( i , j ) represents the no. j attribute label L ( i ) of the no. i student and satisfies the constraint condition 0 < i ≤ N s , 0 < j ≤ k , i , j , N s , k ∈ ℕ ;
(3) The last column c o = k + 1 of the matrix T r stores the noted classification labels T ( i ) of students S ( i ) ~ S ( i ) .
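As a compact illustration of Steps 1–4, the following sketch assembles the training set T r from hypothetical previous-class records: each retained student contributes one row of k quantified labels plus a classification label in the last column, as in Formula (3). The variable names and example records are illustrative assumptions.

```python
# Sketch: building the naive Bayes training set T_r (Formula (3)).
# Each previous-class student S(i) is a vector of k quantified labels L(1..k)
# followed by the classification label T(i); students with an all-zero record
# (no participation in any classification or practical course) are skipped.
import numpy as np

k = 4  # number of core attributes L(i) of the practical teaching topic

# Hypothetical previous-class records: k labels + classification label T(i).
previous_students = [
    [2, 0, 1, 3, 0],
    [0, 0, 0, 0, 0],   # no participation record -> excluded (S[i, j] = 0)
    [0, 3, 2, 1, 1],
    [1, 2, 0, 2, 2],
]

rows = [s for s in previous_students if any(v != 0 for v in s[:k])]
T_r = np.array(rows)              # shape: N_S x (k + 1)

features = T_r[:, :k]             # quantified labels L(S(i), 1..k)
labels = T_r[:, k]                # classification labels T(S(i))
print("N_S =", T_r.shape[0])
print(T_r)
```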

3.1.2. Class Grouping Algorithm

The establishment of the naive Bayes machine learning algorithm is based on the constructed training set T r , with the goal of achieving student classification in the teaching class and grouping the practical teaching based on the classification results. Based on the k number of core attributes L ( i ) determined by the practical teaching topics, as well as the interest labels and classification labels of the students in the previous classes, we construct the naive Bayes machine learning model to calculate the Bayesian posterior probability of each student belonging to the classification label T ( i ) , achieving the classification of the students in the teaching class. According to the modeling principle of the naive Bayes machine learning algorithm, the constructed training set T r consists of two parts: one is the student interest label and quantification module, namely the elements T r ( i , j ) , 0 < i ≤ N s , 0 < j ≤ k , i , j , N s , k ∈ ℕ ; the second is the student classification label module, namely the last column c o = k + 1 of the matrix T r . The columns in the training set T r are independent of each other, which meets the modeling requirements of the naive Bayes machine learning algorithm [39].
Definition 5.
The naive Bayes prior probability model  P ( T ( i ) ) . For the student training set, it is one of the conditions used to construct the naive Bayes machine learning algorithm. It is defined as the probability of the classification  T ( i )  appearing in the total number  N  of samples in the training set.
Definition 6.
The naive Bayes conditional probability density  P ( S ( x ) | T ( i ) ) . For the student training set, it is also one of the conditions used to construct the naive Bayes machine learning algorithm. It is defined as the probability of students  S ( x )  and student feature labels  L ( i )  appearing under the classification conditions  T ( i ) .
Definition 7.
The naive Bayes posterior probability model  P ( T ( i ) | S ( x ) ) . The probability of classifying student  S ( x )  into the classification label  T ( i ) . It is used to determine the final classification of the student  S ( x ) .
Definition 8.
The feature vector  S ( x ) Δ  of the student to be classified. Based on the  1 × ( k + 1 )  dimensional feature vector  S ( i )  of the previous class of students, we construct a  1 × k  dimensional vector with the same attribute labels  L ( i )  for storing and quantifying the feature labels of students to be classified. We define this vector as the feature vector of the student to be classified, denoted as  S ( x ) Δ .
For an arbitrary student S ( x ) to be classified in the teaching class, the classification objective is to match the student S ( x ) with the training set model T r for the naive Bayes machine learning algorithm and the selected classification labels T ( i ) based on the previous classes and use the classification algorithm to classify the student S ( x ) into the classification T ( i ) with the highest Bayesian posterior probability value. The measuring of the interest tendency of the student S ( x ) to be classified by the naive Bayes machine learning algorithm is consistent with the prior probability and previous teaching experience. We suppose that H is the assumption that the student S ( x ) is classified to a classification T ( i ) , and the basic idea of constructing a classification model using Bayes’ theorem is to determine the Bayesian posterior probability P ( T ( i ) | S ( x ) ) of the assumption that the student S ( x ) is to be classified to a classification T ( i ) . Formula (4) is the constructed Bayesian posterior probability model that is used to assume that the student S ( x ) belongs to a classification T ( i ) .
$$P\big(T(i) \mid S(x)\big) = \frac{P\big(S(x) \mid T(i)\big)\, P\big(T(i)\big)}{P\big(S(x)\big)} \quad (4)$$
For the topic classification T ( i ) determined by the practical teaching, 0 < i ≤ p , i , p ∈ ℕ , the naive Bayes machine learning algorithm predicts that a student S ( x ) belongs to the classification T ( i ) with the highest posterior probability by calculating the Bayesian posterior probability. For the p number of classifications T ( i ) set in the sample space, the condition for the student S ( x ) belonging to a certain classification T ( u ) is: if and only if P ( T ( u ) | S ( x ) ) > P ( T ( v ) | S ( x ) ) , in which 0 < u , v ≤ p and u ≠ v . At this moment, the classification T ( u ) relating to the maximum Bayesian posterior probability P ( T ( u ) | S ( x ) ) is the maximum a posteriori assumption. Based on the modeling principle, the naive Bayes machine learning algorithm for the student S ( x ) classification is constructed as follows (Appendix A.2).
Step 1: Determine the quantified vector S ( x ) Δ of the student to be classified, which will be used to construct the naive Bayes machine learning model.
Step 2: Perform the equivalent simplification on the Bayesian posterior probability model.
Step 2.1: Without considering the conditional probability constraints, the possibility of the student S ( x ) in any class T ( i ) is identical. Define the probability P ( S ( x ) ) of the student S ( x ) for all classes T ( i ) as constant, i.e., P ( S ( x ) ) = c o n s t .
Step 2.2: Simplify the Bayesian posterior probability model P ( T ( i ) | S ( x ) ) and convert it to calculate the value P ( S ( x ) | T ( i ) ) P ( T ( i ) ) .
Step 2.3: Set δ ( S ( x ) ) = P ( S ( x ) | T ( i ) ) P ( T ( i ) ) . Calculating and comparing δ ( S ( x ) ) is equivalent to calculating P ( T ( i ) | S ( x ) ) .
Step 3: Retrieve the training set model T r and construct the prior probability model P ( T ( i ) ) for the classification labels.
Step 3.1: Mark the students S ( i ) in the model T r who belong to the classification T ( i ) and record S ( i ) ∈ T ( i ) .
Step 3.2: Initialize r o = 1 and N T ( i ) = 0 , where N T ( i ) denotes the number of students satisfying S ( i ) ∈ T ( i ) , and examine the rows of the matrix T r : (1) if S ( i ) ∈ T ( i ) , then N T ( i ) = N T ( i ) + 1 ; (2) if S ( i ) ∉ T ( i ) , then N T ( i ) = N T ( i ) + 0 .
Step 3.3: Iterate r o = r o + 1 , determine whether the student S ( i ) related to row r o meets S ( i ) ∈ T ( i ) , and update N T ( i ) accordingly.
Step 3.4: Determine the termination conditions: (1) if r o = N S , the searching ends, output the current N T ( i ) ; (2) if r o < N S , continue searching.
Step 3.5: Build a prior probability model for student classification, as shown in Formula (5).
$$P\big(T(i)\big) = \frac{N_{T(i)}}{N_S} \quad (5)$$
Step 3.6: Repeat steps 3.1 to 3.5, traverse T ( i ) for all i ∈ { i | 0 < i ≤ p } , and calculate the prior probability for each T ( i ) .
Step 4: Introduce S ( x ) Δ and quantify the labels L ( i ) to construct the conditional probability density model P ( S ( x ) | T ( i ) ) .
Step 4.1: If the attribute labels in the matrix T r are mutually independent, i.e., L ( S ( i ) , u ) ∩ L ( S ( i ) , v ) = ∅ for u ≠ v , construct a conditional probability density function P ( S ( x ) | T ( i ) ) , as shown in Formula (6).
$$P\big(S(x) \mid T(i)\big) = \prod_{u=1}^{k} P\big(L(u) \mid T(i)\big) \quad (6)$$
Step 4.2: Estimate the probability P ( L ( u ) | T ( i ) ) . Each label in the matrix T r satisfies the condition L ( S ( i ) , u ) ∩ L ( S ( i ) , v ) = ∅ for u ≠ v , which means that each label is a discrete feature. The probability estimate P ( L ( u ) | T ( i ) ) is constructed as Formula (7), in which w ( i , u ) is the number of students with the attribute label L ( u ) in the classification T ( i ) , and w ( i ) is the number of students in the classification T ( i ) .
$$P\big(L(u) \mid T(i)\big) = \frac{w(i,u)}{w(i)} \quad (7)$$
Step 4.3: When the count of a label L ( u ) is w ( i , u ) = 0 , a perturbation factor σ is introduced as an adjustment in order to avoid a conditional probability density value of 0. The conditional probability density model is then constructed as shown in Formula (8).
$$P\big(S(x) \mid T(i)\big) = \prod_{u=1}^{k} \left( \frac{w(i,u)}{w(i)} + \sigma \right), \quad \frac{w(i,u)}{w(i)} + \sigma < 1 \quad (8)$$
Step 5: Calculate the transformation value δ ( S ( x ) ) for the Bayesian posterior probability, traverse T ( i ) for all i ∈ { i | 0 < i ≤ p } , and determine the classification T ( i ) with the maximum δ ( S ( x ) ) as the classification that the student S ( x ) belongs to.
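The classification steps above can be summarized in a short sketch that follows Formulas (5)–(8) and the transformation value δ(S(x)) from Step 2.3: priors from the group counts, label-wise conditional probabilities with the perturbation factor σ, and assignment to the group with the maximum δ. The example arrays and the value of σ are illustrative assumptions.

```python
# Sketch of the grouping algorithm (Formulas (5)-(8)): `features` holds the k
# quantified labels of the N_S previous-class students and `labels` their
# classification labels T(i); `x_new` is the label vector of the student S(x).
import numpy as np

def classify_student(features, labels, x_new, sigma=1e-3):
    """Return the classification T(i) maximizing delta(S(x)) = P(S(x)|T(i)) P(T(i))."""
    N_S = len(labels)
    best_class, best_delta = None, -1.0
    for t in np.unique(labels):
        members = features[labels == t]               # students S(i) belonging to T(i)
        prior = len(members) / N_S                    # Formula (5): N_T(i) / N_S
        likelihood = 1.0
        for u, label_value in enumerate(x_new):       # Formula (6): product over the k labels
            w_iu = np.sum(members[:, u] == label_value)   # count of label L(u) inside T(i)
            w_i = len(members)
            # Formula (8): the perturbation factor sigma keeps the factor
            # nonzero when w(i,u) = 0 (cf. Formula (7) without the adjustment).
            likelihood *= w_iu / w_i + sigma
        delta = likelihood * prior                    # Step 2.3: delta(S(x))
        if delta > best_delta:
            best_class, best_delta = t, delta
    return best_class, best_delta

features = np.array([[2, 0, 1, 3], [0, 3, 2, 1], [0, 3, 2, 0], [1, 2, 0, 2]])
labels = np.array([0, 1, 1, 2])
print(classify_student(features, labels, np.array([0, 3, 2, 2])))
```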

3.2. Teaching Recommendation Model Based on Improved k-NN Data Mining Algorithm

The naive Bayes machine learning algorithm determines the classification T ( i ) of the students in the teaching class, and based on the determined student classification, the teachers encode and group the students in the class. According to the inherent logic of the naive Bayes machine learning algorithm, the students in the same classification tend to have a high degree of closeness in their interest tendencies, while the students in the different classifications tend to have a low degree of closeness in their interest tendencies. According to the course design process, the teachers need to determine the specific practical topic content for each group based on the grouping. Scientific and quantitative recommendation and decision making is the best mode for accurately determining the practical topic content for each group, ensuring that the recommended practical topic content can accurately match the interests of each student in the group. Due to the inclusion of multiple students in each group T ( i ) , the construction of the practical topic recommendation algorithm must consider the precise interests of all students. From the perspective of recommendation algorithm design, it is necessary to construct a collective recommendation model [40]. Based on this fundamental principle, we construct a teaching recommendation model based on the improved k-NN data mining algorithm.
The basic idea of the modeling is as follows. We set the classification result as the p number of groups T ( i ) . Each classification T ( i ) represents one interest, and the number of students in each group T ( i ) is N T ( i ) . The teacher determines h ( i ) number of discussion topics G ( j ) for each group T ( i ) , 0 < j ≤ h ( i ) , j , h ( i ) ∈ ℕ . We build a k-NN data mining algorithm between the N T ( i ) number of students and the h ( i ) number of discussion topics in a group T ( i ) , output the n number of the most matching topics for each student, and then find the k number of topics with the best intersection among the n number of topics as the recommended topics for the group T ( i ) of students. The model outputs the recommended topics for the other groups T ( i ) by using the same algorithm until the recommendations for the p number of groups are completed [41].
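The group-level recommendation logic described above can be sketched as follows: for each student in group T(i), take the n topics with the highest matching values, and then rank topics by how many students’ top-n sets they appear in, returning the k best-supported topics. The score matrix and the parameters n and k here are illustrative assumptions; the matching value itself is defined by Formula (9) in Section 3.2.1.

```python
# Sketch of the group-level recommendation idea: per-student top-n topics,
# then the k topics shared by the most top-n sets are recommended to the group.
from collections import Counter

def recommend_for_group(score, n=3, k=2):
    """score[s][g] = matching value between student s and discussion topic g."""
    votes = Counter()
    for student_scores in score:
        # indices of the n best-matching topics for this student
        top_n = sorted(range(len(student_scores)),
                       key=lambda g: student_scores[g], reverse=True)[:n]
        votes.update(top_n)
    # topics appearing in the most students' top-n sets ("best intersection")
    return [g for g, _ in votes.most_common(k)]

# Hypothetical matching values for a group of 4 students and 5 candidate topics.
score = [
    [0.9, 0.2, 0.7, 0.4, 0.1],
    [0.8, 0.3, 0.6, 0.5, 0.2],
    [0.4, 0.9, 0.8, 0.3, 0.1],
    [0.7, 0.1, 0.9, 0.2, 0.3],
]
print(recommend_for_group(score, n=2, k=2))   # e.g. [2, 0]
```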

3.2.1. The Modeling of the Complete Binary Encoding Tree for Discussion Topic

According to the modeling concept, we first introduce the following definitions for the algorithm. The algorithm is constructed to match the individual students S ( i ) in the group T ( i ) with the discussion topics G ( j ) , and the optimal complete binary encoding tree model for the discussion topic is established.
Definition 9.
The discussion topic feature vector G ( j ) . For the h ( i ) number of discussion topics G ( j ) designed for the group T ( i ) , the labels contained in each discussion topic G ( j ) have a matching relationship with the labels S ( i , t ) contained in the 1 × k dimensional student interest feature vector S ( i ) . The 1 × k dimensional vector composed of k number of quantified labels used to express the characteristics of the discussion topic is defined as the discussion topic feature vector, denoted as G ( j ) . The vector elements are labels G ( j , t ) , 0 < j ≤ h ( i ) , 0 < t ≤ k , and j , h ( i ) , t , k ∈ ℕ .
Definition 10.
The quantization vector S ( i ) q and the quantization vector G ( j ) q . We quantify and normalize the elements S ( i , t ) of the vector S ( i ) based on the interests of the individual students S ( i ) , and the resulting quantified vector is denoted as S ( i ) q , in which the normalized elements are denoted as s ( i , t ) . For the feature vector of the discussion topic, the elements G ( j , t ) are quantified and normalized based on the topic characteristics, and the resulting quantified vector is denoted as G ( j ) q , in which the normalized elements are denoted as g ( i , t ) .
Definition 11.
The discussion topic weight δ G ( j ) . The feature vector of the discussion topic describes the degree to which a certain discussion content covers the requirements of the practical teaching under the specific practical teaching conditions. The reciprocal of the modulus of the quantified vector G ( j ) q of the discussion topic is defined as the discussion topic weight δ G ( j ) , quantified as δ G ( j ) = 1 / ‖ G ( j ) q ‖ . It is the measurement value that covers the requirements of the practical teaching. The higher the weight value is, the higher the coverage will be.
Definition 12.
The discussion topic matching model  f ( S ( i ) q , G ( j ) q ) . Based on the matching feature between the student interest feature vector  S ( i )  and the discussion topic feature vector  G ( j ) , a matching model based on the quantization vector  S ( i ) q  and  G ( j ) q  is constructed to express the matching measurement value between the individual interest of no.  i  student  S ( i )  in the group  T ( i )  and the no.  j  discussion topic  G ( j ) . Formula (9) is the constructed discussion topic matching model.
$$f\big(S(i)_q, G(j)_q\big) = \left( \sum_{t=1}^{k} \big( s(i,t)\, g(i,t) \big)^{q} \right)^{\frac{1}{q}} \quad (9)$$
Definition 13.
The discussion topic spatial coordinate system  x o y G ( i )  and the spatial coordinates  ( x G ( i ) , y G ( i ) ) . Based on the discussion topic weight  δ G ( j )  and the discussion topic matching model  f ( S ( i ) q , G ( j ) q ) , we quantitatively model the spatial distribution of the discussion topics  G ( j )  included in student group  T ( i ) . We construct a spatial coordinate system for the discussion topics with weights  δ G ( j )  as the abscissa and function values  f ( S ( i ) q , G ( j ) q )  as the ordinate, denoted as  x o y G ( i ) . The discussion topic is represented by a coordinate point, whose coordinates are denoted as  ( x G ( i ) , y G ( i ) ) , in which  x G ( i ) = δ G ( j ) ,   y G ( i ) = f ( S ( i ) q , G ( j ) q ) . Figure 2 shows the constructed discussion topic spatial coordinate system:  Figure 2a shows the constructed spatial coordinate system  x o y G ( i ) , and Figure 2b shows an example of the quantified coordinate system containing the discussion topic points.
Definition 14.
The optimal complete binary encoding tree model H T for the discussion topic. The goal of the optimal complete binary tree model is to achieve the ordered storage of parent node and child nodes, output the optimal sorted tree structure, and extract the optimal nodes. We simulate the generation rules of the optimal complete binary tree. Based on the discussion topic spatial coordinate system x o y G ( i ) and the spatial coordinates ( x G ( i ) , y G ( i ) ) , we treat the spatial coordinates ( x G ( i ) , y G ( i ) ) as the codes of the discussion topic in the coordinate system x o y G ( i ) and construct the complete binary tree with the optimal parent node. This tree is defined as the optimal complete binary encoding tree model for the discussion topic, denoted as H T . Figure 3 shows the basic generation rules and the logical structure of the tree model H T generated based on this definition.
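Before the tree-construction steps, the quantities in Definitions 11–13 can be computed directly: the topic weight δ G ( j ) as the reciprocal of the modulus of the quantified topic vector, the matching value f as in Formula (9) (read here as a q-norm aggregation of the element-wise label products), and the resulting coordinate point in x o y G ( i ). The example vectors and the choice q = 2 are illustrative assumptions.

```python
# Sketch: topic weight (Definition 11), matching value (Formula (9)), and the
# resulting coordinate point (x, y) in the coordinate system xoy_G(i).
import numpy as np

def topic_weight(g_q):
    """delta_G(j): reciprocal of the modulus of the quantified topic vector."""
    return 1.0 / np.linalg.norm(g_q)

def matching_value(s_q, g_q, q=2):
    """Formula (9): q-norm aggregation of the element-wise label products."""
    return np.sum((s_q * g_q) ** q) ** (1.0 / q)

# Normalized interest vector of a student S(i) and one topic vector G(j).
s_q = np.array([0.8, 0.1, 0.6, 0.3])
g_q = np.array([0.7, 0.2, 0.9, 0.1])

x = topic_weight(g_q)          # abscissa of the topic point
y = matching_value(s_q, g_q)   # ordinate of the topic point
print("topic coordinates (x, y):", (x, y))
```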
The algorithm steps for constructing the optimal complete binary encoding tree model for the discussion topic are as follows (Appendix A.3).
Step 1: Select the individual student S ( i ) from the group T ( i ) . Quantify and output S ( i ) q . Quantify and output h ( i ) number of G ( j ) q for h ( i ) number of the topics G ( j ) in the group T ( i ) and encode the discussion topics G ( j ) .
Step 1.1: Determine the basic structure of the discussion topic G ( j ) in the coordinate system x o y G ( i ) , including the head-word h e a d and suffix-word S u f f i x . The storage structure is as follows:
{ < h e a d > : discussion topic weight: δ G ( j ) } & { < S u f f i x > : discussion topic matching value: f ( S ( i ) q , G ( j ) q ) }
Step 1.2: Initialize the counter c o u n t = 0 , which is the number of times the discussion topic G ( j ) has been traversed.
Step 1.3: Take j = 1 , initialize the first topic G ( 1 ) , and generate the nodes for G ( 1 ) .
(1) Extract G ( 1 ) q , calculate δ G ( 1 ) , and store δ G ( 1 ) to the related head-word h e a d for G ( 1 ) ;
(2) Extract S ( i ) q , calculate f ( S ( i ) q , G ( 1 ) q ) , and store f ( S ( i ) q , G ( 1 ) q ) to the corresponding suffix-word S u f f i x for G ( 1 ) ;
(3) Complete the storage and generate the storage structure for node G ( 1 ) ;
(4) Note c o u n t = c o u n t + 1 : (i) if c o u n t < h ( i ) , continue searching; (ii) if c o u n t = h ( i ) , the node searching ends, and the algorithm ends.
Step 1.4: For any j , return to Step 1.3 to initialize the no. j topic G ( j ) and generate node G ( j ) . Note c o u n t = c o u n t + 1 : (i) if c o u n t < h ( i ) , continue searching; (ii) if c o u n t = h ( i ) , the node searching ends, and the algorithm ends.
Step 1.5: Iterate the index c o u n t , and stop the iterating when c o u n t = h ( i ) . Output the node storage structures for h ( i ) number of discussion topics G ( j ) .
Step 1.6: Based on the node storage structures for h ( i ) number of discussion topics G ( j ) , establish the spatial coordinate system x o y G ( i ) of the discussion topic under the condition of the individual student S ( i ) interest vector S ( i ) q .
Step 2: Randomly select any topic G ( j 1 ) and store it in the parent node H T ( 1 , 1 ) of the model H T . Randomly select another topic G ( j 2 ) and make the following comparison, in which f ( j ) represents f ( S ( i ) q , G ( j ) q ) , the number j represents the no. j topic G ( j ) , and each function value f ( j ) has the same S ( i ) q .
(1) If f ( j 1 ) f ( j 2 ) , keep the topic G ( j 1 ) storage H T ( 1 , 1 ) unchanged and store G ( j 2 ) in the left child node H T ( 2 , 1 ) of the second layer;
(2) If f ( j 1 ) < f ( j 2 ) , delete H T ( 1 , 1 ) , store G ( j 2 ) in the parent node H T ( 1 , 1 ) of the model H T , and store G ( j 1 ) in the left child node H T ( 2 , 1 ) of the second layer.
Step 3: Introduce arbitrary G ( j 3 ) and make the following comparison:
(1) If f ( j 1 ) f ( j 2 ) :
① If f ( j 1 ) f ( j 2 ) f ( j 3 ) , store G ( j 1 ) and G ( j 2 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 3 ) to the child node H T ( 2 , 2 ) on the right side of the second row;
② If f ( j 1 ) f ( j 3 ) > f ( j 2 ) , store G ( j 1 ) and G ( j 3 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 2 ) to the child node H T ( 2 , 2 ) on the right side of the second row;
③ If f ( j 3 ) > f ( j 1 ) f ( j 2 ) , store G ( j 3 ) and G ( j 1 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 2 ) to the child node H T ( 2 , 2 ) on the right side of the second row;
(2) If f ( j 1 ) < f ( j 2 ) :
① If f ( j 3 ) f ( j 1 ) < f ( j 2 ) , store G ( j 2 ) and G ( j 1 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 3 ) to the child node H T ( 2 , 2 ) on the right side of the second row;
② If f ( j 1 ) < f ( j 3 ) f ( j 2 ) , store G ( j 2 ) and G ( j 3 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 1 ) to the child node H T ( 2 , 2 ) on the right side of the second row;
③ If f ( j 1 ) < f ( j 2 ) < f ( j 3 ) , store G ( j 3 ) and G ( j 2 ) to H T ( 1 , 1 ) and H T ( 2 , 1 ) , and store G ( j 1 ) to the child node H T ( 2 , 2 ) on the right side of the second row.
Step 4: Introduce any G ( j c ) , 3 < c ≤ h ( i ) . Calculate the current f ( j c ) corresponding to G ( j c ) , compare f ( j 1 ) ~ f ( j c ) , and continue storing according to the following storage rules.
(1) Arbitrary node H T ( u , v ) can have a maximum of two child nodes H T ( u + 1 , * ) and a minimum of zero child nodes;
(2) Regarding the binary tree storage rule, any row u of the tree H T contains $2^{u-1}$ child nodes, and G ( j 1 ) ~ G ( j c ) must be stored in the first c nodes of the tree H T , satisfying the following:
① The number of nodes that store G ( j ) meets the requirement $\sum_{u=1}^{u_{\max}} 2^{u-1} \ge c$ . The $u_{\max}$ represents the maximum row that is used to store G ( j ) ;
② For any node H T ( u , v ) for G ( j ) , its left node H T ( u , v − 1 ) must meet the criterion H T ( u , v − 1 ) ≠ ∅ ;
③ For any node H T ( u , v ) for G ( j ) , its right node H T ( u , v + 1 ) must meet the following:
(i) If the last G ( j c ) is currently stored in H T ( u , v ) , the node H T ( u , v + 1 ) on the right does not exist;
(ii) If the current node H T ( u , v ) is not the last G ( j c ) , the right node continues to store, satisfying the condition H T ( u , v + 1 ) ≠ ∅ .
④ If the row u contains a child node that stores G ( j ) , then all nodes in the previous u − 1 rows of the tree H T satisfy the condition H T ( u , v ) ≠ ∅ .
(3) Any node H T ( u , v ) satisfies the following:
① The stored H T ( u , v ) of G ( j ) corresponds to f ( j m ) , the stored G ( j ) of the child nodes H T ( u + 1 , 2 v − 1 ) and H T ( u + 1 , 2 v ) correspond to f ( j k ) and f ( j d ) , and there is f ( j m ) ≥ f ( j k ) ≥ f ( j d ) ;
② The stored H T ( u , v ) of G ( j ) corresponds to f ( j m ) , the stored G ( j ) of the left node H T ( u , v − 1 ) corresponds to f ( j k ) , the stored G ( j ) of the right node H T ( u , v + 1 ) corresponds to f ( j d ) , and there is f ( j k ) ≥ f ( j m ) ≥ f ( j d ) .
Step 5: Continue storing G ( j c ) according to the algorithm from Step 1 to Step 4 until the searching stops at the end of traversal c = h ( i ) . The algorithm ends, producing output H T .
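As a reading aid, the construction of the tree H T can be condensed as follows. This is a sketch under the assumption that the level-order (array) view of a complete binary tree is an acceptable stand-in for the node structure of Figure 3: because Step 4 requires the parent to dominate its children and each left sibling to dominate its right sibling, the tree is equivalent to the topics sorted by decreasing matching value and filled row by row. The helper names (TopicNode, build_encoding_tree) are illustrative, not from the paper.

```python
import numpy as np
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass
class TopicNode:
    topic_id: int   # index j of the discussion topic G(j)
    weight: float   # head-word: delta_G(j) (Definition 11)
    match: float    # suffix-word: f(S(i)_q, G(j)_q) (Formula (9))

def build_encoding_tree(s_q: np.ndarray,
                        topics: Sequence[Tuple[int, np.ndarray]],
                        q: float = 2.0) -> List[TopicNode]:
    # Returns the tree H_T for one student S(i) in level order: the node at
    # index p has its children at indices 2p + 1 and 2p + 2, and the parent
    # node H_T(1,1) sits at index 0.
    nodes = []
    for j, g_q in topics:
        weight = 1.0 / np.linalg.norm(g_q)                          # Definition 11
        match = float(np.sum(np.abs(s_q - g_q) ** q) ** (1.0 / q))  # Formula (9)
        nodes.append(TopicNode(j, weight, match))
    # The ordering rules of Steps 2-4 reduce to a descending sort on the match value.
    return sorted(nodes, key=lambda nd: nd.match, reverse=True)

def top_n_topics(tree: List[TopicNode], n: int) -> List[int]:
    # Definition 15: the first n nodes, counted from the parent node, give the
    # optimal topic vector U(i, j) of the student.
    return [nd.topic_id for nd in tree[:n]]
```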

3.2.2. Recommendation Model Based on the Improved k-NN Data Mining Algorithm

For any student group T ( i ) , the construction of the k-NN data mining algorithm is based on the optimal complete binary encoding tree of the discussion topic. Firstly, we establish the encoding trees H T for all the individual students S ( i ) in the group T ( i ) and search for the first n number of optimal matching topics G ( j ) based on the encoding tree. Secondly, based on the determined n number of optimal matching topics G ( j ) for N T ( i ) number of students S ( i ) , the group T ( i ) is used as a team discussion group to find out the k number of common optimal nearest neighbor topics for N T ( i ) number of students S ( i ) and to recommend them as the optimal topics for the group T ( i ) . Finally, the model outputs the recommended topics for each group T ( i ) generated for the class, and then the teacher and the students jointly determine the practical discussion topic for the practical course [42]. Based on the modeling principle, we construct the related Definitions for the algorithm.
Definition 15.
The optimal topic vector  U ( i , j )  for the student  S ( i ) . Regarding the constructed student optimal encoding tree  H T , the first  n  nodes, counted from the parent node  H T ( 1 , 1 ) , are selected as the topics that best match the student  S ( i ) . When the selection stops at the node  H T ( u , v ) , it satisfies the requirement  $n = \sum_{u'=1}^{u-1} 2^{u'-1} + v$ . Construct a  1 × n  dimensional vector to store the topics  G ( j )  corresponding to the selected  n  nodes of the tree  H T . Define this vector as the optimal topic vector for the student  S ( i ) , denoted as  U ( i , j ) . The  i  in the vector represents the no.  i  student, and the  j  in the vector represents the no.  j  topic  G ( j ) .
Definition 16.
The student optimal topic matrix  U N T ( i ) × n  for the student  S ( i ) . Each student  S ( i )  corresponds to the  n  number of the most matched topics  G ( j ) , and if the group  T ( i )  contains  N T ( i )  number of students  S ( i ) , a  N T ( i ) × n  dimensional matrix is constructed to store all the topics corresponding to  N T ( i )  number of students  S ( i ) . The matrix  U N T ( i ) × n  satisfies the following conditions:
(1) The row rank is r a n k ( U N T ( i ) × n ) r o = N T ( i ) , and the column rank is r a n k ( U N T ( i ) × n ) c o = n ;
(2) The row corresponds to one student S ( i ) , and the column element represents the topic G ( j ) ;
(3) The rows r o ( x ) and r o ( y ) are not linearly related, and the columns c o ( x ) and c o ( y ) are not linearly related;
(4) Any element is a non-zero element;
(5) A matrix U N T ( i ) × n corresponds to a group T ( i ) .
Definition 17.
The discussion topic interest intensity  f G ( j ) . For any group  T ( i )  and the related matrix  U N T ( i ) × n , we construct an algorithm to iterate the frequency of each topic  G ( j )  appearing in the matrix  U N T ( i ) × n , and the frequency of  G ( j )  is defined as the discussion topic interest intensity, denoted as  f G ( j ) . Normalize the intensity  f G ( j )  to obtain the interest intensity weight with a value range of  0 < f G ( j ) < 1 .
Definition 18.
The discussion topic recommendation vector  R ( i , j ) . Extract the  k  number of topics  G ( j )  with the highest intensity  f G ( j )  from the matrix  U N T ( i ) × n  and store them in a  1 × k  dimensional vector in order of decreasing intensity. We define this vector as the discussion topic recommendation vector, denoted as  R ( i , j ) . The  i  represents the no.  i  group  T ( i ) , and the  j  represents the no.  j  topic  G ( j ) .
Based on the basic idea and the related Definitions of the algorithm model, we construct a recommendation model based on the improved k-NN data mining algorithm to recommend the optimal discussion topics G ( j ) for each student group T ( i ) . The constructed algorithm is as follows (Appendix A.4):
Step 1: Determine the group T ( i ) and the student S ( i ) , and quantify the student interest vector S ( i ) q and the topic vector G ( j ) q . Introduce the optimal complete binary encoding tree algorithm for the discussion topics and output the individual encoding trees H T ( i ) for the students S ( i ) .
Step 1.1: Generate the encoding tree H T ( 1 ) for the student S ( 1 ) , and mark the first $n = \sum_{u'=1}^{u-1} 2^{u'-1} + v$ nodes, including the parent node H T ( 1 , 1 ) , in which u is the maximum row of the current tree containing the n marked nodes, while v is the number of marked elements in the no. u row of the tree.
Step 1.2: Using the same algorithm, generate the encoding trees H T ( 2 ) , H T ( 3 ) , …, H T ( N T ( i ) ) for the students S ( 2 ) , S ( 3 ) , …, S ( N T ( i ) ) , and mark the first $n = \sum_{u'=1}^{u-1} 2^{u'-1} + v$ nodes in each tree separately.
Step 2: Output the optimal topic vector U ( i , j ) of the student S ( i ) . Based on each encoding tree H T ( i ) of the student S ( i ) and labeled n number of nodes, extract the corresponding topics G ( j ) for n number of nodes and construct the vector U ( i , j ) .
Step 2.1: Take the n number of optimal topics G ( j ) from the encoding tree H T ( 1 ) of the student S ( 1 ) and store them in the 1 × n dimensional vector U ( 1 , j ) .
Step 2.2: Take the n number of optimal topics G ( j ) from the corresponding encoding trees H T ( 2 ) , H T ( 3 ) , …, H T ( N T ( i ) ) of the students S ( 2 ) , S ( 3 ) , …, S ( N T ( i ) ) , and store them in the 1 × n dimensional vectors U ( i , j ) , in which i = 2 , 3 , … , N T ( i ) .
Step 3: Initialize the matrix U N T ( i ) × n = 0 and initialize the counter c o u n t = 0 . Generate the full rank matrix U N T ( i ) × n :
Step 3.1: Take the element U ( 1 , j ) and store U ( 1 , 1 ) ~ U ( 1 , n ) in the first row of the matrix in sequence, so that the first row is full rank. Note c o u n t = c o u n t + 1 .
Step 3.2: Take the element U ( 2 , j ) and store U ( 2 , 1 ) ~ U ( 2 , n ) in the second row of the matrix in sequence, so that the second row is full rank. Note c o u n t = c o u n t + 1 .
Step 3.3: In line with the same storage rules, take the element U ( i , j ) , and store U ( i , 1 ) ~ U ( i , n ) in the no. i row of the matrix in sequence, so that the no. i row is full rank, in which i traverses 2 < i ≤ N T ( i ) . Note c o u n t = c o u n t + 1 until the traversal is completed when c o u n t = N T ( i ) ; then, the storing process ends.
Step 4: Build a baseline vector B containing h ( i ) number of the discussion topics G ( j ) . The dimension of the vector B is 1 × h ( i ) , and the vector element B ( j ) corresponds to the stored G ( j ) . Introduce vector B to scan the matrix U N T ( i ) × n .
Step 4.1: The row code of the matrix U N T ( i ) × n is u , the column code is v , and the matrix elements are U ( u , v ) . According to the matrix Definition U N T ( i ) × n , there are 0 < u N T ( i ) , 0 < v n , and u , N T ( i ) , v , n N .
Step 4.2: Initialize the counter c o u n t = 0 . Take the first element B ( 1 ) ~ G ( j ) of the vector B and make the following judgment:
(1) For matrix U N T ( i ) × n , take u = 1 and traverse 0 < v n :
(i) If U ( 1 , 1 ) = B ( 1 ) , then c o u n t = c o u n t + 1 ; if U ( 1 , 1 ) B ( 1 ) , then c o u n t = c o u n t + 0 .
(ii) If U ( 1 , 2 ) = B ( 1 ) , then c o u n t = c o u n t + 1 ; if U ( 1 , 2 ) B ( 1 ) , then c o u n t = c o u n t + 0 .
(iii) Using the same iterative method, if U ( 1 , v ) = B ( 1 ) , then c o u n t = c o u n t + 1 ; if U ( 1 , v ) B ( 1 ) , then c o u n t = c o u n t + 0 . Until the traversal is complete, when v = n , output c o u n t and note c o u n t ( 1 ) .
(2) For matrix U N T ( i ) × n , take u = 2 , and traverse 0 < v n . Iterate over the elements U ( 2 , v ) in the second row of the matrix, and output c o u n t , denoted as c o u n t ( 2 ) .
(3) Follow steps (1)–(2) to traverse all rows 2 < u N T ( i ) of the matrix U N T ( i ) × n and output c o u n t ( 3 ) ~ c o u n t ( N T ( i ) ) separately.
(4) Iterate to calculate $\delta(1) = \sum_{i=1}^{N_{T(i)}} count(i)$ , and denote δ ( 1 ) as the interest intensity f G ( j ) [ 1 ] of the element B ( 1 ) .
Step 4.3: Initialize the counter c o u n t = 0 . Take the second element B ( 2 ) ~ G ( j ) of the vector B , iterate in line with the same algorithm as in Step 4.2, and output δ ( 2 ) as the interest intensity f G ( j ) [ 2 ] of the element B ( 2 ) .
Step 4.4: Using the same algorithm, output the interest intensity f G ( j ) [ i ] of the element B ( i ) . Traverse 2 < i h ( i ) , complete the iteration, and output f G ( j ) [ h ( i ) ] before the searching ends. Output the quantified vector B .
Step 4.5: Calculate the normalized interest intensity weight f ¯ G ( j ) [ i ] . Formula (10) is the constructed interest intensity weight model f ¯ G ( j ) [ i ] .
$$\bar{f}_{G(j)}[i] = \frac{f_{G(j)}[i]}{\sum_{i=1}^{n} f_{G(j)}[i]}$$
Step 5: Build a complete binary tree T B based on the vector B by the following algorithm:
Step 5.1: Take B ( 1 ) and store in the parent node T B ( 1 , 1 ) of the tree. Take B ( 2 ) and make a judgment:
(1) If f G ( j ) [ 1 ] f G ( j ) [ 2 ] , store B ( 2 ) in the left child node T B ( 2 , 1 ) of the second layer.
(2) If f G ( j ) [ 1 ] < f G ( j ) [ 2 ] , delete T B ( 1 , 1 ) , and store B ( 2 ) to the parent node T B ( 1 , 1 ) and store B ( 1 ) to the left child node T B ( 2 , 1 ) of the second layer.
Step 5.2: Take B ( 3 ) and make a judgment:
(1) If f G ( j ) [ 1 ] f G ( j ) [ 2 ] .
① If f G ( j ) [ 1 ] f G ( j ) [ 2 ] f G ( j ) [ 3 ] , keep T B ( 1 , 1 ) and T B ( 2 , 1 ) , and store B ( 3 ) to the right child node T B ( 2 , 2 ) ;
② If f G ( j ) [ 1 ] f G ( j ) [ 3 ] > f G ( j ) [ 2 ] , store B ( 3 ) to T B ( 2 , 1 ) , and store B ( 2 ) to T B ( 2 , 2 ) ;
③ If f G ( j ) [ 3 ] > f G ( j ) [ 1 ] f G ( j ) [ 2 ] , delete T B ( 1 , 1 ) and T B ( 2 , 1 ) , and then store B ( 3 ) to T B ( 1 , 1 ) and store B ( 1 ) and B ( 2 ) to T B ( 2 , 1 ) and T B ( 2 , 2 ) .
(2) If f G ( j ) [ 1 ] < f G ( j ) [ 2 ] .
① If f G ( j ) [ 3 ] f G ( j ) [ 1 ] < f G ( j ) [ 2 ] , keep T B ( 1 , 1 ) and T B ( 2 , 1 ) , and store B ( 3 ) to the right child node T B ( 2 , 2 ) .
② If f G ( j ) [ 1 ] < f G ( j ) [ 3 ] f G ( j ) [ 2 ] , store B ( 3 ) to T B ( 2 , 1 ) , and store B ( 1 ) to T B ( 2 , 2 ) .
③ If f G ( j ) [ 1 ] < f G ( j ) [ 2 ] < f G ( j ) [ 3 ] , delete T B ( 1 , 1 ) and T B ( 2 , 1 ) , and store B ( 3 ) to T B ( 1 , 1 ) and B ( 2 ) and B ( 1 ) to T B ( 2 , 1 ) and T B ( 2 , 2 ) .
Step 5.3: Store B ( 1 ) ~ B ( i ) to the first i number of nodes in the binary tree in line with steps 5.1~5.2, meeting the following conditions:
(1) Any node T B ( x , y ) can have a maximum of two child nodes T B ( x + 1 , * ) and a minimum of zero child nodes.
(2) Any row x of the tree T B contains $2^{x-1}$ child nodes, satisfying:
① The number of nodes storing B ( t ) meets the requirement $\sum_{x=1}^{x_{\max}} 2^{x-1} \ge i$ , where $x_{\max}$ represents the maximum row that can be stored currently.
② For any node T B ( x , y ) storing B ( t ) , its left node T B ( x , y − 1 ) must satisfy T B ( x , y − 1 ) ≠ ∅ .
③ For any node T B ( x , y ) storing B ( t ) , the right node T B ( x , y + 1 ) satisfies:
(i) If the current node T B ( x , y ) stores the last B ( i ) , the right node T B ( x , y + 1 ) does not exist.
(ii) If the current node T B ( x , y ) does not store the last B ( i ) , the right node continues to store, satisfying the condition T B ( x , y + 1 ) ≠ ∅ .
④ If the row x contains a child node storing B ( t ) , then all nodes in the previous x − 1 rows of the tree T B satisfy the condition T B ( x , y ) ≠ ∅ .
(3) Any node T B ( x , y ) satisfies:
① The G ( j ) stored in T B ( x , y ) relates to f G ( j ) [ a ] , and the G ( j ) stored in its child nodes T B ( x + 1 , 2 y − 1 ) and T B ( x + 1 , 2 y ) correspond to f G ( j ) [ b ] and f G ( j ) [ c ] . There must be f G ( j ) [ a ] ≥ f G ( j ) [ b ] ≥ f G ( j ) [ c ] .
② The stored G ( j ) in T B ( x , y ) corresponds to f G ( j ) [ a ] . The stored G ( j ) in the left node T B ( x , y − 1 ) corresponds to f G ( j ) [ b ] , while the stored G ( j ) in the right node T B ( x , y + 1 ) corresponds to f G ( j ) [ c ] . There must be f G ( j ) [ b ] ≥ f G ( j ) [ a ] ≥ f G ( j ) [ c ] .
Step 6: Select the corresponding G ( j ) and f G ( j ) [ i ] of the first k nodes of the tree T B , counted from the parent node T B ( 1 , 1 ) ; the topics G ( j ) of these k nodes are the optimal discussion topics recommended to all the students in the group T ( i ) . Build the same recommendation model and binary trees T B for all p number of groups T ( i ) in line with the same algorithm, and output the optimal discussion topics recommended to all p number of groups T ( i ) . The algorithm ends.
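The group-level part of the improved k-NN algorithm (Steps 3 through 6) can likewise be sketched compactly. The snippet below assumes each student's optimal topic vector has already been extracted (for example, with a helper such as top_n_topics in the earlier sketch); the matrix U, the baseline scan of Step 4, and the tree T B then reduce to counting topic frequencies and ranking them, which is what the code does. Function and variable names are illustrative, not from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence

def recommend_for_group(optimal_topic_matrix: Sequence[Sequence[int]],
                        k: int = 3) -> List[int]:
    # optimal_topic_matrix: one row per student of group T(i), each row holding
    # the n best-matching topic ids taken from that student's tree H_T
    # (i.e., the matrix U_{N_T(i) x n} of Definition 16).
    counts: Counter = Counter()
    for row in optimal_topic_matrix:   # Step 4: scan with the baseline vector B,
        counts.update(row)             # accumulating the interest intensity f_G(j)
    total = sum(counts.values())
    # Formula (10): normalized interest intensity weights.
    weights: Dict[int, float] = {j: c / total for j, c in counts.items()}
    # Step 6: the k topics with the highest weight form the recommendation
    # vector R(i, j) for the whole group (the tree T_B is this ranking).
    return sorted(weights, key=weights.get, reverse=True)[:k]

# Hypothetical example: three students, n = 4 best topics each.
print(recommend_for_group([[1, 2, 4, 9], [1, 2, 7, 9], [2, 4, 9, 10]], k=3))
```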

3.2.3. Improvement of the Constructed k-NN Recommendation Algorithm

The basic modeling process of the traditional k-NN algorithm includes three steps: the first step is calculating the distance between the object to be classified and the other objects; the second step is selecting the objects closest to the object to be classified; the third step is confirming the classification, in which the object to be classified is assigned to the class to which the majority of its nearest neighbors belong. Compared with the traditional k-NN algorithm, the constructed k-NN algorithm has significant improvements, mainly reflected in the following aspects:
Firstly, the algorithm’s aim is not to achieve the general classification of objects but to calculate and output the common features of the group of objects based on the classification ideas. Using the grouping as the basic unit, it establishes a matching relationship between each student in the group and the overall discussion topics, and then obtains the common interest topic of the student group through cross statistics. This model is based on the necessary conditions for the student grouping in the discussion courses. By constructing the k-NN algorithm, the student group’s interest matching is achieved based on calculating the matching degree of individual students; thus, it greatly improves the logic of the k-NN algorithm.
Secondly, the matching degree between the students and the discussion topics is obtained by calculating the Minkowski distance between the student interest feature labels and the topic feature labels, rather than the spatial distance of the traditional k-NN algorithm. The dimensionality of the feature attribute is higher, containing more complex feature labels. Therefore, the objective function constructed by the Minkowski distance has a higher dimensionality.
Thirdly, for the algorithm logic that has been greatly improved, in order to help teachers and students master the recommendation process of the discussion topics, we introduce the complete binary encoding tree algorithm into the k-NN algorithm. The goal is to output and visualize the strength ranking of individual student matching discussion topics within the group, so that the selection of the value k directly corresponds to the top k number of nodes. This is another innovation and improvement of the k-NN algorithm.
Fourthly, the complete binary encoding tree algorithm has significant improvements compared to the traditional tree structure, and its data structure is significantly different. Each node contains two storage units, one storing the weight of the discussion topic and the other storing the matching value of the discussion topic. The former describes the degree to which the discussion topic covers the practical teaching requirements, while the latter describes the degree to which the student interests match the discussion topic. The teachers and students use the former unit to understand the strength of the discussion topic in line with the teaching objectives, while the latter unit is used to construct the k-NN algorithm. Therefore, the constructed complete binary encoding tree has significant improvements in both structure and functionality.

4. Experiment and Analysis

For the constructed model, we use a professional course as the research basis. The "Rural Tourism" practical course topic of the Smart Tourism Course is set as the experimental object. We randomly select 20 students from the previous teaching classes who participated in the "Rural Tourism" practical course with high enthusiasm and satisfaction as the training set for constructing the naive Bayes model. By constructing the naive Bayes machine learning model, the current teaching class is divided into several student groups, and then h ( i ) number of discussion topics are designed for each group. By using the proposed k-NN data mining algorithm to output the optimal complete binary encoding trees for the student groups, we determine the quantified coordinates of the discussion topics, output the most matched discussion topics for the students, and ultimately output the optimal discussion topic for each group. Finally, we design a comparative experiment to verify the advantages of our proposed recommendation algorithm over the traditional recommendation algorithms.

4.1. Data Preparation

We determine the classification labels of the naive Bayes machine learning training set model as T ( 1 ) : “Rural Preservation”, T ( 2 ) : “Rural Cuisine”, and T ( 3 ) : “Rural Farming”.
According to the application purpose and the formulated requirements for building the model in Section 2.2.2 of the Introduction, the application scope of the constructed naive Bayes machine learning model is constrained in small classes with 10–20 students. Therefore, the experiment needs to select small-scale sample data to build the model and verify it. The data’s influence on the effectiveness of the experiment is manifested in the following aspects:
(1) The raw training data for building the naive Bayes machine learning algorithm come from small classes of 10–20 students. We collect the interest data and grouping data of students in small classes, making the raw data highly targeted and capable of constructing an accurate machine learning model for small class grouping.
(2) The features of the naive Bayes machine learning algorithm determine that it exhibits high-performance features on small-scale datasets. Compared to the modeling on large-scale datasets, it has less computational complexity and higher accuracy in the output results, ensuring the accuracy of the class grouping.
(3) In practical teaching applications, the naive Bayes machine learning algorithm needs to control the number of students in a class. When the number of students in a class is too large (such as 50 or more), the class needs to be split or the class hours need to be increased to complete the teaching tasks in batches and ensure the teaching quality.
Based on the above conditions, we select 20 students who have participated in the Rural Tourism practical course from the previous classes. We collect interest labels and vectors, determine each student’s group T ( i ) , and construct the training set model as shown in Table 2. In Table 2, S ( i ) represents the selected representative student sample, I-1 represents “the level of preference for cooking”, I-2 represents “the level of preference for reading”, I-3 represents “the level of preference for sports”, I-4 represents “the level of preference for planting”, and I-5 represents “the level of preference for music”. “MFL” represents “most favorite level”, “FL” represents “favorite level”, and “LL” represents “like level”. In the classification, “T1” represents “Rural Preservation”, “T2” represents “Rural Cuisine”, and “T3” represents “Rural Farming”.
We collect data from the students in the experimental class (Class E1). In the small-class teaching course, we select 15 students S ( i ) , and each student’s selection and evaluation of items I-1~I-5 are based on their own interests. We distribute collection forms with the designed questions to the 15 students, and make them select their preference levels for items I-1~I-5 to determine the feature vectors S ( i ) for each student. We input the student feature vectors into the constructed naive Bayes machine learning algorithm to output the student classification T ( i ) . Table 3 shows the collected feature vectors of the experimental class’s students to be classified. In the table, each row represents the feature vector S ( i ) of a student S ( i ) , in which X ( i ) represents the encoding of the student to be classified. I-1 represents “the level of preference for cooking”, I-2 represents “the level of preference for reading”, I-3 represents “the level of preference for sports”, I-4 represents “the level of preference for planting”, and I-5 represents “the level of preference for music”. In each item, “MFL” represents “most favorite level”, “FL” represents “favorite level”, and “LL” represents “like level”.
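As an illustration of how the collected categorical preference levels can be fed into a naive Bayes classifier, the following sketch uses scikit-learn's CategoricalNB. The rows here are hypothetical toy data laid out like Tables 2 and 3, not the collected samples, and the encoder and smoothing settings are assumptions rather than the authors' exact configuration.

```python
from sklearn.preprocessing import OrdinalEncoder
from sklearn.naive_bayes import CategoricalNB

# Hypothetical training rows: preference levels for I-1..I-5 ("MFL"/"FL"/"LL")
# and the known group label T1/T2/T3 of each previous-class student.
train_X = [["MFL", "LL", "FL", "LL", "FL"],
           ["LL", "MFL", "LL", "FL", "MFL"],
           ["FL", "LL", "MFL", "MFL", "LL"]]
train_y = ["T2", "T1", "T3"]
new_X = [["MFL", "FL", "LL", "LL", "FL"]]   # a student X(i) to be classified

# Encode the ordered preference levels as integer categories.
enc = OrdinalEncoder(categories=[["LL", "FL", "MFL"]] * 5)
model = CategoricalNB(alpha=1.0)            # Laplace smoothing
model.fit(enc.fit_transform(train_X), train_y)

print(model.predict(enc.transform(new_X)))        # predicted group T(i)
print(model.predict_proba(enc.transform(new_X)))  # posterior probabilities per class
```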
According to the teaching content of the "Rural Tourism" practical course, we design the discussion topics G ( j ) for each classification T ( i ) and take h ( i ) = 10 ; that is, 10 matched discussion topics are designed for each group of the experimental class ( T ( 1 ) : "Rural Preservation", T ( 2 ) : "Rural Cuisine", and T ( 3 ) : "Rural Farming") as the raw data for constructing the topic recommendation algorithm. According to the topic recommendation algorithm, we design the quantitative labels g ( i , t ) for each group, construct the feature attribute vectors G ( j ) for the discussion topics based on the labels, and then output the quantitative vectors G ( j ) q . The quantization vector represents the measurement value of the topic G ( j ) . Based on the results of the student grouping, within each group we determine the quantitative values s ( i , t ) of the topic feature labels for the classification T ( i ) of the students X ( i ) . By Definition, 0 < g ( i , t ) < 1 and 0 < s ( i , t ) < 1 . Table 4 shows the designed feature labels for each classification T ( i ) . Based on Table 4, for each classification T ( i ) , we choose the tourism city Leshan in Sichuan Province, China, as the research scope to select the relevant scenic spots, and then design the discussion topics G ( j ) shown in Table 5, with 0 < j ≤ 10 , j ∈ N .

4.2. Results and Analysis on the Naive Bayes Grouping

Based on the proposed naive Bayes machine learning algorithm, we input the related data on the collected feature vectors of the students to be classified in the experimental class and obtain the posterior probability values of each student X ( i ) in the class belonging to different classifications T ( i ) . The results are shown in Table 6. Based on the results in Table 6, we output the posterior probability bar chart and curve trend chart shown in Figure 4. The abscissa in Figure 4 represents the student number, and the ordinate represents the posterior probability value.
Figure 4a shows the naive Bayes posterior probability bar charts of each student X ( i ) for the classifications T ( 1 ) , T ( 2 ) , and T ( 3 ) ; Figure 4b shows the corresponding posterior probability trend curves; Figure 4c shows the maximum naive Bayes posterior probability bar chart of each student X ( i ) ; and Figure 4d shows the trend curve of the maximum naive Bayes posterior probability for all students. In Figure 4a–d, blue represents the classification T ( 1 ) , orange represents the classification T ( 2 ) , and green represents the classification T ( 3 ) .
On the other hand, in teaching practice, we organize three experimental classes: classes E1, E2 and E3, each with 15 students and different class members. Among them, the teacher uses the naive Bayes machine learning algorithm to group the class E1 and obtain the student grouping results, while the teacher subjective evaluation method is used to group classes E2 and E3. By organizing a themed discussion course on “Rural Tourism”, the students evaluate the course with the indicators: “grouping satisfaction”, “interest matching satisfaction”, “team collaboration satisfaction”, and “discussion process satisfaction”. The evaluation indicators are divided into three categories: “very satisfied”, “satisfied”, and “dissatisfied”. Based on the students’ evaluations, we calculate the statistical percentage of each indicator in each class and compare them to obtain the results in Table 7.
To test the accuracy of the methods, we use the measurement method of the accuracy indicator in the machine learning algorithm to measure and compare the accuracy of the naive Bayes machine learning algorithm and the teacher subjective evaluation method. The accuracy indicator is shown in Formula (11). In the formula, Y t e s t is the classification label that should be accurate, Y p r e d i c t is the classification label that is predicted to be accurate, s u m represents the total number of accurate labels, and l e n represents the total number of label samples. In the experiment, we set the students who choose “very satisfied” as the accurately predicted classification labels, and the total students in the class as the classification labels that should be accurate. The experiment outputs the calculation results of accuracy, shown in Table 8.
$$Accuracy = \frac{sum\left( Y_{predict} == Y_{test} \right)}{len\left( Y_{test} \right)}$$
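A one-function sketch of Formula (11), assuming the labels are stored in NumPy arrays; the numbers in the example are made up purely for illustration.

```python
import numpy as np

def accuracy(y_test: np.ndarray, y_predict: np.ndarray) -> float:
    # Formula (11): share of samples whose predicted label equals the true label.
    return float(np.sum(y_predict == y_test) / len(y_test))

# Hypothetical check: 12 of 15 students rate their grouping "very satisfied".
y_test = np.array(["VS"] * 15)
y_predict = np.array(["VS"] * 12 + ["S"] * 3)
print(accuracy(y_test, y_predict))   # 0.8
```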
Based on the analysis of the data in Table 6, Table 7 and Table 8 and the results in Figure 4, we can draw the following conclusions:
(1) The grouping result of students in class E1 by the naive Bayes machine learning algorithm is:
T ( 1 ) : “Rural Preservation”: { X ( 2 ) , X ( 5 ) , X ( 7 ) , X ( 10 ) }.
T ( 2 ) : “Rural Cuisine”: { X ( 1 ) , X ( 3 ) , X ( 6 ) , X ( 12 ) , X ( 14 ) , X ( 15 ) }.
T ( 3 ) : “Rural Farming”:{ X ( 4 ) , X ( 8 ) , X ( 9 ) , X ( 11 ) , X ( 13 ) }.
The grouping result of students in class E2 by the teacher’s subjective evaluation method is:
T ( 1 ) : “Rural Preservation”: { X ( 1 ) , X ( 3 ) , X ( 4 ) , X ( 9 ) , X ( 10 ) , X ( 11 ) }.
T ( 2 ) : “Rural Cuisine”: { X ( 6 ) , X ( 7 ) , X ( 8 ) , X ( 14 ) , X ( 15 ) }.
T ( 3 ) : “Rural Farming”: { X ( 2 ) , X ( 5 ) , X ( 12 ) , X ( 13 ) }.
The grouping result of students in class E3 by the teacher's subjective evaluation method is:
T ( 1 ) : “Rural Preservation”: { X ( 1 ) , X ( 7 ) , X ( 12 ) , X ( 13 ) , X ( 14 ) , X ( 15 ) }.
T ( 2 ) : “Rural Cuisine”: { X ( 2 ) , X ( 4 ) , X ( 5 ) , X ( 10 ) , X ( 11 ) }.
T ( 3 ) : “Rural Farming”: { X ( 3 ) , X ( 6 ) , X ( 8 ) , X ( 9 ) }.
(2) The proposed naive Bayes machine learning model can effectively identify the independent feature labels representing the students' preference levels and calculate the posterior probabilities of the students in the different classifications T ( 1 ) , T ( 2 ) , and T ( 3 ) . The probability values show an obvious mutual exclusion feature; that is, there are no equal posterior probabilities among the classifications T ( 1 ) , T ( 2 ) , and T ( 3 ) . This shows that the proposed naive Bayes machine learning model has accurate computational performance and operational capability and can accurately calculate the posterior probability of the student samples in the various classifications.
(3) For any students X ( i ) to be classified, the posterior probability values output by the naive Bayes machine learning model are different. Analyzing the data in Table 6 and the results in Figure 4a, it is concluded that the same student sample has different posterior probability values, which is determined by the classification properties of the naive Bayes machine learning model. The maximum peak among the three peaks corresponds to the classification of the student sample. The classification of the sample student X ( i ) depends on his level of preference for each item, which is independent of the preference levels of other students ¬ X ( i ) for the item. This meets the modeling conditions of the naive Bayes machine learning model, in which the samples have independent features.
(4) Figure 4b shows the posterior probability fluctuation curves of all student samples X ( i ) in each classification T ( i ) , indicating the variation pattern of the posterior probability values of the student samples in each classification. The curves of each classification show a fluctuating trend, indicating that the same learning model produces different posterior probability values for different students within the same classification T ( i ) . The probability value at the peak of a curve corresponds to a higher probability of belonging to that classification. For the different classifications T ( i ) , the classification with the highest peak for the same sample student is the proper classification of that sample student.
(5) We obtain the result in Figure 4c based on the data in Table 6 and the results in Figure 4a,b. We extract the posterior probability with the highest peak value for each sample student X ( i ) from Figure 4a,b and obtain the highest posterior probability belonging to the different classifications. The analysis results show that each sample student only belongs to a unique classification T ( i ) and exhibits a relatively uniform distribution pattern, which is in line with the features of the naive Bayes machine learning model. This proves that the proposed model can uniformly classify the samples and that it conforms to the algorithm features.
(6) Figure 4d shows the highest posterior probability distribution curve of the entire student sample X ( i ) . From the results, it can be concluded that the highest posterior probability value of each student fluctuates greatly, indicating that the naive Bayes machine learning model has an independent action scope for each student and shows significant discrimination. It proves that the proposed machine learning model can effectively classify students, and the posterior probability value corresponding to the classification T ( i ) to which a student X ( i ) belongs has a clear distinction from the posterior probability values of the other classifications.
(7) According to the analysis of the data in Table 7, the total proportion of “very satisfied” and “satisfied” indicators in class E1 is higher than that in class E2 and class E3, while the proportion of “dissatisfied” indicator is lower than that in class E2 and class E3. Overall, students in class E1 are more satisfied with the grouping results and interest matching, which has a positive impact on the team collaboration and discussion process, resulting in higher satisfaction with the group collaboration and discussion process than in class E2 and class E3. It indicates that the constructed naive Bayes machine learning algorithm can effectively achieve student interest grouping, and compared to the teacher subjective evaluation method, the students have a higher satisfaction degree. According to the analysis of the data in Table 8, the accuracy of each indicator in class E1 is higher than in the class E2 and class E3, indicating that the constructed naive Bayes machine learning algorithm has a higher accuracy than the teacher subjective evaluation method.

4.3. Results and Analysis of the Proposed Teaching Recommendation Algorithm

We collect the quantitative vectors S ( i ) q of the student samples within each group T ( i ) based on the classification results. Based on the data in Table 5, we collect the quantitative vectors G ( j ) q of the discussion topics G ( j ) , calculate the weights δ G ( j ) and matching values f ( S ( i ) q , G ( j ) q ) of the discussion topics according to the constructed optimal complete binary encoding tree, and output the corresponding spatial coordinate systems and the optimal complete binary encoding trees for the different groups T ( i ) of the discussion topics. Figure 5 shows the coordinate system and the distribution of points for discussion topics within each group T ( i ) of students. Figure 5a–d represent the coordinate system of the students {a- X ( 2 ) , b- X ( 5 ) , c- X ( 7 ) , d- X ( 10 ) } in the group T ( 1 ) , with each point representing the corresponding topic G ( j ) . Figure 5e–j shows the coordinate system of the students {e- X ( 1 ) , f- X ( 3 ) , g- X ( 6 ) , h- X ( 12 ) , i- X ( 14 ) , j- X ( 15 ) } in the group T ( 2 ) , with each point representing the corresponding topic G ( j ) . Figure 5k–o shows the coordinate system of the students {k- X ( 4 ) , l- X ( 8 ) , m- X ( 9 ) , n- X ( 11 ) , o- X ( 13 ) } in the group T ( 3 ) , with each point representing the corresponding topic G ( j ) . Based on the calculation results in Figure 5, we output the optimal complete binary encoding tree H T for the student discussion topics within each group T ( i ) , as shown in Figure 6. Figure 6a–d show the encoding trees for the students {a- X ( 2 ) , b- X ( 5 ) , c- X ( 7 ) , d- X ( 10 ) } in the group T ( 1 ) , Figure 6e–j show the coding trees for the students {e- X ( 1 ) , f- X ( 3 ) , g- X ( 6 ) , h- X ( 12 ) , i- X ( 14 ) , j- X ( 15 ) } in the group T ( 2 ) , and Figure 6k–o show the coding trees for the students {k- X ( 4 ) , l- X ( 8 ) , m- X ( 9 ) , n- X ( 11 ) , o- X ( 13 ) } in the group T ( 3 ) .
Based on the calculation results in Figure 5 and Figure 6, we introduce the coordinate values and the binary tree values into the constructed improved k-NN data mining algorithm. The current classification T ( i ) contains h ( i ) number of nodes. We set n = 5 , 6 , 7 , 8 . We calculate the weight of the interest intensity f ¯ G ( j ) [ i ] for the topics G ( j ) in each classification T ( i ) . The output results are shown in Table 9, and the trend of interest intensity weights is shown in Figure 7, in which the blue represents the group T ( 1 ) , the orange represents the group T ( 2 ) , the green represents the group T ( 3 ) , the abscissa represents the discussion topic number, and the ordinate represents the interest intensity weight.
By analyzing the results of Figure 5, Figure 6 and Figure 7 and Table 9, the following conclusions can be drawn:
(1) Based on Figure 5, after determining the interest vectors of the students within each group, the coordinates of each discussion topic can be obtained. From the distribution of the topic points in the coordinate system, it can be concluded that the value of the student interest vector is directly related to the degree of dispersion of the topic distribution.
① The distribution of the discussion topics of the student X ( 2 ) in the group T ( 1 ) is relatively concentrated, while the distribution of the discussion topics of other students is relatively scattered, indicating that the first group of topics has a similar matching degree for the student X ( 2 ) , while the matching degree for other students is relatively scattered.
② The distribution of the discussion topics of the student X ( 6 ) in the group T ( 2 ) is relatively concentrated, while the distribution of the discussion topics of other students is relatively scattered, indicating that the second group of topics has a similar matching degree for the student X ( 6 ) , while the matching degree for other students is relatively scattered.
③ The distribution of the discussion topics of the student X ( 9 ) in the group T ( 3 ) is relatively concentrated, while the distribution of the discussion topics of other students is relatively scattered, indicating that the third group of topics has a similar matching degree for the student X ( 9 ) , while the matching degree for other students is relatively scattered.
(2) Regarding Figure 6, the proposed optimal complete binary encoding tree algorithm can sort the matching degree of the discussion topics corresponding to each group of students, and we extract the topics with the highest matching degree from each binary tree and then recommend the topics to the students or groups for discussion. The first row of each node on the tree represents the topic weight, while the second row of each node represents the topic matching degree. Since the matching degrees between the topics and the students’ interests are different, the binary trees of different students within the same group have completely different structures. The topic located at the parent node is the optimal node of the tree, and the corresponding topic best matches the interests of the student.
(3) According to Figure 5 and Figure 6, the corresponding relationships between the students and the optimal topics are as follows:
① Students in T ( 1 ) : {a- X ( 2 ) - G ( 1 ) , b- X ( 5 ) - G ( 6 ) , c- X ( 7 ) - G ( 4 ) , d- X ( 10 ) - G ( 1 ) };
② Students in T ( 2 ) : {e- X ( 1 ) - G ( 6 ) , f- X ( 3 ) - G ( 8 ) , g- X ( 6 ) - G ( 10 ) , h- X ( 12 ) - G ( 10 ) , i- X ( 14 ) - G ( 9 ) , j- X ( 15 ) - G ( 5 ) };
③ Students in T ( 3 ) : {k- X ( 4 ) - G ( 1 ) , l- X ( 8 ) - G ( 3 ) , m- X ( 9 ) - G ( 1 ) , n- X ( 11 ) - G ( 9 ) , o- X ( 13 ) - G ( 9 ) }.
Based on the result, when the teachers arrange individual students to conduct the topic discussions and presentations in the practical courses, they can recommend the topics with the highest matching degree to the students based on the corresponding relationship results.
(4) Regarding Table 9, the weight f ¯ G ( j ) [ i ] of the interest intensity indicates the importance of the topic G ( j ) in the n number of selected topics within the group T ( i ) by the k-NN recommendation algorithm. According to this Definition, take k = 3 , and the recommendation results output by the constructed k-NN recommendation algorithm are as follows:
① When n = 5 , for the first group T ( 1 ) , the recommended optimal topics are {0.150: G ( 1 ) , G ( 2 ) , G ( 4 ) , G ( 9 ) }; for the second group T ( 2 ) , the recommended optimal topics are {0.200: G ( 8 ) ; 0.133: G ( 5 ) , G ( 7 ) }; and for the third group T ( 3 ) , the recommended optimal topics are {0.200: G ( 10 ) ; 0.160: G ( 1 ) , G ( 3 ) }.
② When n = 6 , for the first group T ( 1 ) , the recommended optimal topics are {0.167: G ( 1 ) ; 0.125: G ( 2 ) , G ( 4 ) , G ( 7 ) , G ( 9 ) }; for the second group T ( 2 ) , the recommended optimal topics are {0.167: G ( 8 ) ; 0.139: G ( 9 ) , G ( 10 ) }; and for the third group T ( 3 ) , the recommended optimal topics are {0.167: G ( 3 ) , G ( 4 ) , G ( 10 ) }.
③ When n = 7 , for the first group T ( 1 ) , the recommended optimal topics are {0.143: G ( 1 ) , G ( 7 ) ; 0.107: G ( 2 ) , G ( 4 ) , G ( 5 ) , G ( 9 ) }; for the second group T ( 2 ) , the recommended optimal topics are {0.143: G ( 7 ) , G ( 8 ) ; 0.119: G ( 5 ) , G ( 9 ) , G ( 10 ) }; and for the third group T ( 3 ) , the recommended optimal topics are {0.143: G ( 3 ) , G ( 4 ) , G ( 10 ) }.
④ When n = 8 , for the first group T ( 1 ) , the recommended optimal topics are {0.125: G ( 1 ) , G ( 2 ) , G ( 5 ) , G ( 7 ) , G ( 9 ) }; for the second group T ( 2 ) , the recommended optimal topics are {0.125: G ( 7 ) , G ( 8 ) , G ( 10 ) }; and for the third group T ( 3 ) , the recommended optimal topics are {0.125: G ( 3 ) , G ( 4 ) , G ( 5 ) , G ( 8 ) , G ( 10 ) }.
Analyzing the optimal recommended topics for each group T ( i ) by the k-NN recommendation algorithm under each variable condition n = 5 , 6 , 7 , 8 , it is concluded that the optimal recommended topics for each group are different when the parameters are different. This indicates that our algorithm conforms to the k-NN recommendation mechanism and can recommend reasonable discussion topics. Taking into account the different variable conditions n = 5 , 6 , 7 , 8 and analyzing the most frequently recommended topics for each group, the results are shown as follows:
① The first group T ( 1 ) : { G ( 1 ) , G ( 2 ) , G ( 9 ) };
② The second group T ( 2 ) : { G ( 7 ) , G ( 8 ) , G ( 10 ) };
③ The third group T ( 3 ) : { G ( 3 ) , G ( 4 ) , G ( 10 ) }.
Based on the statistical results, we obtain the following conclusions: recommending the topics G ( 1 ) , G ( 2 ) , and G ( 9 ) to the first group can meet the interests of the group to the greatest extent; recommending the topics G ( 7 ) , G ( 8 ) , and G ( 10 ) to the second group can meet the interests of the group to the greatest extent; and recommending the topics G ( 3 ) , G ( 4 ) , and G ( 10 ) to the third group can meet the interests of the group to the greatest extent.
(5) Regarding Figure 7, under the different variable conditions n = 5 , 6 , 7 , 8 , the interest intensity weights of each group show different fluctuation trends, indicating that the proposed k-NN recommendation algorithm has different effects under the different variable conditions, which conforms to the mechanism of the k-NN data mining algorithm. The proposed algorithm is reasonable. Analyzing the curves of the different parameters, the three sets of curves in Figure 7a,b have a high degree of dispersion, indicating that the proposed recommendation algorithm has a significant difference in the action scope of the interest matching for the three groups of students when n = 5 and n = 6 , resulting in a significant difference in the interest matching. The aggregation degree of the three sets of curves in Figure 7c,d is relatively high, indicating that when n = 7 and n = 8 , the proposed recommendation algorithm has a small difference in the action scope of the interest matching among the three groups of students and the closeness of the interest matching is high.

4.4. Results and Analysis of the Comparative Experiment

4.4.1. Testing and Comparison in the Single Dataset (Class E1)

To verify the advantages of the proposed recommendation algorithm, we design a comparative experiment to evaluate and test the performance of the recommendation algorithm. The commonly used recommendation algorithms include the user-based collaborative filtering algorithm (UCFA) and the item-based collaborative filtering algorithm (ICFA). In the related works, Yin [19], Liu [20], Zhang [21], and Liu [23] all used collaborative filtering algorithms to design their recommendation models. We take the UCFA and ICFA used in the relevant literature as the control group, and construct the proposed recommendation algorithm (PRA) as the experimental group. We evaluate and test the three algorithms by using the recommendation algorithm evaluation indicators. In the control experiment, we set the same experimental conditions and select the parameters n = 5 , 6 , 7 , 8 for the recommended samples, with each group of student samples being the classifying results output by the naive Bayes algorithm in Section 4.2.
The evaluation indicators we use are as follows:
① Accuracy. The proportion of the recommended positive samples to the total number of samples. In the comparative experiment, we use the total frequency of the topic samples provided by each group { G ( j ) , j = 10 } as the total number of samples. The total frequency of the topic samples { G ( j ) , j = 10 } is calculated by multiplying the total number of students in the group by the number of topics (10), resulting in a total frequency of 40 for the group T ( 1 ) , 60 for the group T ( 2 ) , and 50 for the group T ( 3 ) . The recommendation model is based on k-NN, which outputs the possible number of recommended topics under the different n value conditions. Meanwhile, the number of students in each group is different in the experiment. The frequency f G ( j ) of the top three ( k = 3 ) recommended topics in a group under the condition n = 5 , 6 , 7 , 8 is taken as the number of positive samples for the calculation. The accuracy model is constructed as Formula (12).
$$Accuracy = \frac{sum \colon f_{G(j)}}{total \colon f_{G(j)} \mid G(j),\ j = 10}$$
② Recall rate. The proportion of the positive samples recommended by the algorithm among the positive samples that should be recommended. In the comparative experiment, we use the total frequency of the topics counted under the three algorithms of UCFA, ICFA, and PRA with parameters n = 5 , 6 , 7 , 8 for each group as the positive sample size, and the top three topics ( k = 3 ) in the ranking as the final recommended topics to construct the recall rate model, as shown in Formula (13).
$$Recall = \frac{sum \colon f_{G(j)}}{sum \colon f_{G(j)} \mid G(j)_{\max}}$$
The evaluation method is as follows. Firstly, we use the UCFA as the control group, search for the individual students S ( i ) s m p in the overall previous class samples who have the closest interests to each group T ( i ) of the students X ( i ) , and then collect the interest rankings of the students S ( i ) s m p for each discussion topic G ( i ) and output the recommended topics under the set conditions n = 5 , 6 , 7 , 8 . Secondly, we use the ICFA as the control group, and then the “Suburban Tourism” course with the highest association with the “Rural Tourism” course is selected as a similar item among the courses previously taken in the same class. The optimal “Suburban Tourism” recommended topics for each group T ( i ) of the students X ( i ) are chosen, and the “Rural Tourism” topics with the closest feature attributes to the “Suburban Tourism” recommended topics are output as the recommended results.
Table 10 shows the accuracy values of the optimal topics output by the different recommendation algorithms. Table 11 shows the recall rate values of the optimal topics output by the different recommendation algorithms. Figure 8 compares the accuracy values and recall rate values of the optimal topics output by the different recommendation algorithms. Figure 8a–d show the accuracy bar charts of each algorithm, Figure 8e–h show the recall rate bar charts of each algorithm, Figure 8i–l show the accuracy curves of each algorithm, and Figure 8m–p show the recall rate curves of each algorithm. In Figure 8, the a–d, e–h, i–l, and m–p respectively represent n = 5 , 6 , 7 , 8 . The blue represents the first group T ( 1 ) , the orange represents the second group T ( 2 ) , and the green represents the third group T ( 3 ) .
Based on the accuracy and recall rate of the experimental group and control group, we use Formula (14) to calculate the accuracy optimization degree Φ a c and use Formula (15) to calculate the recall rate optimization degree Φ r e , in which the symbol “ e x g . ” represents the experimental group and the symbol “ c o g . ” represents the control group. The calculated results of accuracy optimization degree Φ a c are shown in Table 12, and the calculated results of recall rate optimization degree Φ r e are shown in Table 13.
$$\Phi_{ac}[exg. - cog.] = \frac{Accuracy(exg.) - Accuracy(cog.)}{Accuracy(exg.)} \times 100\%$$
$$\Phi_{re}[exg. - cog.] = \frac{Recall(exg.) - Recall(cog.)}{Recall(exg.)} \times 100\%$$
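Formulas (12) through (15) amount to simple frequency ratios and relative improvements; a hedged sketch (with invented numbers) is:

```python
def frequency_ratio(positive_freq: int, reference_freq: int) -> float:
    # Formulas (12)-(13): accuracy and recall as frequency proportions.
    return positive_freq / reference_freq

def optimization_degree(exg: float, cog: float) -> float:
    # Formulas (14)-(15): relative improvement of the experimental group (PRA)
    # over a control group (UCFA or ICFA), expressed as a percentage.
    return (exg - cog) / exg * 100.0

# Hypothetical values, only to show the arithmetic.
acc_pra, acc_ucfa = 0.30, 0.26
print(frequency_ratio(12, 40))                   # 0.3
print(optimization_degree(acc_pra, acc_ucfa))    # about 13.3 %
```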
Based on an analysis of Table 9, Table 10, Table 11 and Table 12 and Figure 8, the following can be noted:
(1) In the bar chart, for each group T ( i ) , the blue bar corresponding to the PRA is higher than the orange and green bar under the various parameter conditions n . From the graph, it can be concluded that the PRA curve trend is significantly higher than those of the UCFA and ICFA, indicating that the proposed algorithm exhibits higher accuracy and recall rate than the control group algorithms in the different parameters and groups.
(2) Regarding the accuracy of the algorithms, when the parameter values n are different, each recommendation algorithm has different accuracy. Overall, the PRA has a higher accuracy compared to the UCFA and ICFA, indicating that the proposed algorithm has a higher accuracy than the traditional collaborative filtering algorithms. It has a higher probability and reliability in recommending topics that perfectly match the students’ interests compared to the UCFA and ICFA. The results in the accuracy optimization degree show that, compared with the UCFA, the PRA has the lowest accuracy optimization degree of 5.14% and the highest accuracy optimization degree of 13.44%, while compared with the ICFA, the PRA has the lowest accuracy optimization degree of 13.54% and the highest accuracy optimization degree of 17.03%.
(3) Regarding the recall rate of the algorithms, when the parameter values n are different, each recommendation algorithm has different recall rates. Overall, the PRA has a higher recall rate compared to the UCFA and ICFA, indicating that the proposed algorithm has a higher recall rate compared to the traditional collaborative filtering algorithms and has a stronger ability to find out the most matched student interest topics in the overall samples compared to the UCFA and ICFA. The results in recall rate optimization degree show that, compared with the UCFA, the PRA has the lowest recall rate optimization degree of 5.21% and the highest recall rate optimization degree of 13.42%, while compared with the ICFA, the PRA has the lowest recall rate optimization degree of 13.56% and the highest recall rate optimization degree of 17.07%.
(4) In the experiment, the frequency of the optimal recommended topics is used as the basis for calculating the accuracy and recall rate, indicating that under the same overall sample conditions, the proposed algorithm has a higher probability of recommending the most matched topics than the traditional collaborative filtering algorithms. It can maximize the matching of students’ interests and topics, thereby comprehensively improving the students’ learning enthusiasm. On the basis of the same overall recommendation frequency, the proposed algorithm has a higher recall rate compared to the traditional collaborative filtering algorithms, and it has a stronger ability to find out the most matched student interest topics in the overall recommendation topics compared to the UCFA and ICFA.
(5) We conclude with the reasons why the PRA has advantages over the UCFA and ICFA. The algorithm we constructed is based on the naive Bayes machine learning model for the grouping classes. It mines the interests of each student and then directly establishes the matching model between the discussion topics and the student interests, achieving the accurate matching of the student interests, which causes the recommended topics to be as close as possible to the student interests. However, the traditional collaborative filtering algorithms have certain shortcomings. The user-based collaborative filtering algorithm (UCFA) searches for the students whose interests are similar to those of the sample students and then recommends the best topics that the approximate students have participated in to the sample students. The item-based collaborative filtering algorithm (ICFA) searches for the topics that the sample students have previously participated in and recommends the current topics that are similar to the topics they have participated in. Therefore, the UCFA and ICFA are approximate recommendation methods. The recommendations based on similar users or similar items do not involve the interest mining of the current user, nor do they involve the feature mining of the recommended objects. They do not establish a direct matching relationship between the current user and the object. Therefore, the proposed algorithm is superior.

4.4.2. Testing and Comparison on Multiple Datasets (Classes E2 and E3): Robustness Testing

To verify the robustness of the recommendation algorithm, we set up two additional experimental classes for comparative testing (class E2 and class E3). We apply the same experimental conditions as in Section 4.4.1 to the two experimental classes, each with 15 students; the discussion topics, the student grouping algorithm, and the collection method for the student interest labels are identical. We use the PRA, UCFA, and ICFA to recommend discussion topics for the three student groups T ( i ) generated in each of the two classes and then test and compare the accuracy, recall rate, precision, and F 1 value of the recommendation algorithms. The precision represents the proportion of the predicted recommended topics that are ultimately recommended to the group. The F 1 value is calculated with Formula (16), and the proportion is used to calculate the frequency of the recommended topics. Table 14 compares the accuracy between the two classes, Table 15 the recall rate, Table 16 the precision, and Table 17 the F 1 value. Table 18 shows the accuracy optimization, the recall rate optimization, the precision optimization, and the F 1 value optimization of the experimental group compared to the control group.
F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (16)
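As a small illustration of how the reported metrics relate, the sketch below computes precision, recall, and the F 1 value of Formula (16) from hypothetical sets of recommended and interest-matched topics; the set contents are invented for the example.

def precision_recall_f1(recommended: set, relevant: set) -> tuple:
    # `recommended`: topics the algorithm suggested for a group (hypothetical).
    # `relevant`: topics that actually match the group's interests (hypothetical).
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Example: 3 of the 5 recommended topics fall among the 4 topics the group is interested in.
print(precision_recall_f1({"G1", "G3", "G5", "G7", "G9"}, {"G1", "G3", "G4", "G5"}))
# -> (0.6, 0.75, 0.666...)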
According to the data in Table 14, Table 15, Table 16, Table 17 and Table 18, there are significant differences in the ability of the PRA, UCFA, and ICFA to recommend discussion topics that match the students' interests. On average, the PRA has higher accuracy, recall rate, precision, and F 1 values than the UCFA and ICFA in the different groups of the two classes, indicating that the PRA performs better than the UCFA and ICFA in both classes. Based on the data of the two classes, and compared with the UCFA and ICFA on average, the PRA has a lowest accuracy optimization rate of 15.03% and a highest of 48.25%; a lowest recall rate optimization rate of 15.03% and a highest of 48.27%; a lowest precision optimization rate of 7.71% and a highest of 35.32%; and a lowest F 1 value optimization rate of 14.02% and a highest of 42.61%. From the comparative experiment between the two classes, we conclude that the PRA outperforms the UCFA and ICFA in recommendation ability. All indicators demonstrate that the PRA is robust: it delivers stable performance when recommending discussion topics for multiple classes and recommends the discussion topics with the highest interest-matching degree for teachers and students.

5. Conclusions and Prospects

5.1. Conclusions of the Research Work

The integration of artificial intelligence technology into higher education is a hot and cutting-edge issue in educational research. By analyzing the current status and existing problems of practical course teaching models in higher education, we propose a new teaching method: an intelligent teaching recommendation model for practical discussion courses in higher education based on naive Bayes machine learning and an improved k-NN data mining algorithm. Taking the practical classes of higher education courses as the data collection objects, we construct the naive Bayes machine learning model to conduct data mining on the teaching classes to be classified. We group the students in the teaching class and, based on the grouping results, further collect and quantify their interest labels. These labels are matched with the feature attributes of the teaching and discussion topics, and the optimal complete binary encoding tree for the discussion topics is constructed, so that the encoding tree structure serves as a quantitative model for accurately matching the students' interests with the discussion topics. On the basis of the optimal complete binary encoding tree, we then establish a recommendation model based on the improved k-NN data mining algorithm, achieving accurate recommendation of the teaching topics. Through experiments, we demonstrate the feasibility of the constructed algorithm, which has higher accuracy, recall rate, precision, and F 1 values than traditional collaborative filtering algorithms.

5.2. The Practical Applications and Implications of the Proposed Model

5.2.1. The Application Value and Implications

The constructed teaching recommendation model can provide teaching methods and decision support for topic discussion courses in practical university teaching and provides a complete teaching practice procedure. In the aspects of data collection, algorithm modeling, and model operation, the constructed model demonstrates good performance. Its practical application value and implications lie in the following:
(1) It is capable of collecting and mining small-class interest data, implementing class grouping based on the students' interests, and providing teachers with a scientific basis and method for student grouping. The experiment shows that the designed algorithm leads to higher satisfaction than grouping determined by the teachers' subjective evaluation: students' ratings of the indicators "grouping satisfaction", "interest matching satisfaction", "team collaboration satisfaction", and "discussion process satisfaction" are all higher than those obtained with the teachers' subjective evaluation method.
(2) The recommendation of the teaching topics is based on the matching of students’ interest features and discussion topic features. The model recommends the best matched discussion topics for the students’ interests, which can improve student satisfaction, achieve personalized teaching, and effectively enhance teaching quality.
(3) Unlike the approximate recommendations of collaborative filtering algorithms, the constructed teaching recommendation algorithm is a personalized recommendation method based on small-class data, and it outperforms the collaborative filtering recommendation algorithms in accuracy, recall rate, precision, and F 1 value. In teaching practice, the constructed recommendation model yields more accurately matched teaching topics, providing a technical method for teachers to carry out high-quality discussion course teaching.

5.2.2. Principles for Conducting Teaching in Practical Applications

In practical teaching applications, the teaching activities should be implemented according to the following guidelines. Teachers can use a real-time teaching platform to collect data, determine the student groups, and release the discussion topics, guiding students to participate in the topic discussions and complete the teaching tasks.
(1) In terms of data collection, it should first be ensured that the discussion course has been organized and implemented in previous classes and that the model training set has complete and accurate data sources, so that a correct and effective model can be output. Secondly, small-class data, with 10–20 students, should be collected to ensure the accuracy of the model.
(2) In terms of model construction, when constructing the student grouping model, a small class with 10–20 students should be used as the basic data to group students, ensuring high interest matching within each group and low interest matching between different groups. When constructing the recommendation model for the discussion topics, the individual interests of students within the group should be used as the basic data, which are matched with the features of the discussion topic; then, the personalized recommendation can be realized.
(3) In terms of teaching implementation, based on the grouping results and the recommended discussion topics, the teachers should further explore and develop the grouping discussion contents, the teaching process standard, the evaluation standard, and the optimization methods. In the process of course implementation, discussion data collection and post-class evaluation data collection should be performed to provide the data support for the subsequent model’s establishment, model optimization, teaching effectiveness improvement, etc.
(4) In terms of evaluation and optimization, the teachers should design the course evaluation indicators and collect the data generated during the implementation of the discussion course, such as the students’ ratings of the topic features, specific grouping data of students, and students’ evaluation data of the course. By utilizing these data, the accuracy and effectiveness of the model implementation can be further studied, and problems in the teaching process can be identified to further optimize the model and teaching methods.
(5) In terms of data updates, there are two situations: in the first, the discussion topic remains unchanged while the class members change; in the second, the discussion topic changes while the class members remain unchanged. In the first situation, because the discussion topic is unchanged, the constructed naive Bayes machine learning algorithm also remains unchanged. When new students enroll or new members join, it is necessary to form a new class or redetermine the class members, apply the constructed naive Bayes machine learning algorithm to the new class as a unit, generate new groups, and use the k-NN algorithm to recommend discussion topics for each group. In the second situation, if the discussion topic changes (new discussion topics emerge), it is necessary to recollect the previous class data and construct a new naive Bayes machine learning algorithm. Based on the new discussion topics, the current class is regrouped, and the k-NN algorithm is used to recommend new discussion topics for each group.

5.2.3. The Application Method and Process

Based on the research results, we design and provide a specific method and process for applying the model. Teachers can follow these steps to organize discussion courses in practical teaching activities; a minimal code sketch of the resulting end-to-end pipeline is given after the steps.
(1) Teachers determine the primary discussion topic and select a class from the classes that have previously organized the discussion course on this topic as the data source of modeling.
  • The case: in “4. Experiment and Analysis”, the discussion topic is determined as “Rural Tourism”. The data source comes from the selected students of the previous class.
(2) Teachers design the student interest labels (label set A) and determine the student classification labels (label set B).
  • The case: in “4.1. Data Preparation”, I-1, I-2, I-3, I-4, and I-5 form the label set A, while T1, T2, and T3 form the label set B.
(3) Teachers build the naive Bayes training set model based on the student interest label A and classification label B and establish the naive Bayes classification model based on the training set model.
  • The case: in “4.1. Data Preparation”, Table 2 represents the training set model.
(4) For the class that will organize the discussion course, the teachers quantify the interest labels of the class students (corresponding to label set A).
  • The case: in “4.1. Data Preparation”, Table 3 represents the interest labels of students in the class to be classified.
(5) Teachers input the interest label A of the student to be classified into the naive Bayes classification model, and the model outputs the specific classification of the student (corresponding to label set B).
  • The case: the grouping result output in “4.2. Results and Analysis on the Naive Bayes Grouping”.
(6) Based on each classification, teachers determine the group discussion topic (secondary topic), further refine the discussion contents, then output the secondary discussion topic labels and student interest labels (label set C) for the discussion topic.
  • The case: in “4.1. Data Preparation”, Table 4 represents the set of secondary discussion topic labels and student interest labels (label set C), and Table 5 represents the secondary topics.
(7) Based on the interest labels and secondary topic labels of students within the group, teachers establish the k-NN algorithm to output the complete binary encoding tree for the discussion topics and determine the encoding tree for each student.
  • The case: the encoding tree result output in Figure 6.
(8) Based on each student’s encoding tree, teachers output the optimal discussion topic for each classification (student group).
  • The case: the result output in Table 9.
(9) Based on the classification (student group) with the optimal discussion topics, teachers carry out the discussion teaching activities.
(10) Teachers evaluate and provide feedback on the teaching process of the discussion.
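The following compact sketch strings the steps above together on toy data: a Laplace-smoothed naive Bayes step assigns a new student to a group (steps (3)–(5)), and a simple interest-to-topic matching step picks the group topic (steps (7)–(8)). The label encodings, the smoothing value, and the dot-product matching function are assumptions made only for illustration.

import numpy as np

# Hypothetical encodings: interest labels LL/FL/MFL mapped to 0/1/2, groups T1/T2/T3 to 0/1/2.
train_X = np.array([[0, 1, 1, 0, 2], [1, 2, 2, 2, 1], [2, 1, 0, 1, 1], [1, 1, 0, 2, 0]])
train_y = np.array([0, 0, 1, 2])

def naive_bayes_group(x, X, y, n_label_values=3, sigma=1.0):
    # Step (5): choose the group T(i) maximising P(T(i)) * prod_u P(L(u)|T(i)), with smoothing.
    scores = []
    for t in np.unique(y):
        Xt = X[y == t]
        prior = len(Xt) / len(X)
        cond = np.prod([(np.sum(Xt[:, u] == x[u]) + sigma) / (len(Xt) + sigma * n_label_values)
                        for u in range(X.shape[1])])
        scores.append(prior * cond)
    return int(np.unique(y)[int(np.argmax(scores))])

def recommend_group_topic(group_interests, topic_features):
    # Steps (7)-(8): rank topics by their average match with the group members' interest vectors.
    match = group_interests @ topic_features.T      # students x topics matching values
    return int(np.argmax(match.mean(axis=0)))       # index of the best topic for the group

new_student = np.array([0, 1, 2, 0, 1])
group = naive_bayes_group(new_student, train_X, train_y)
topic_features = np.array([[0.8, 0.1, 0.3, 0.6, 0.2], [0.2, 0.7, 0.5, 0.1, 0.9]])
group_interests = np.array([[0.9, 0.2, 0.4, 0.7, 0.1], [0.6, 0.3, 0.2, 0.8, 0.2]])
print(group, recommend_group_topic(group_interests, topic_features))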

5.3. Work Prospect

In future research work, we will optimize the algorithm from the following aspects. Firstly, we will further optimize the constructed naive Bayes machine learning model by expanding the dimension of the feature vector based on the existing student interest features, incorporating more teaching labels covered by professional courses into the student feature vector, so that the algorithm can cover more teaching contents. Secondly, we will further optimize the constructed recommendation algorithm. When constructing an interest-matching model, higher-dimensional student interests and discussion topic features will be integrated to enable recommendation algorithms to cover a wider range of teaching contents. Then, we will deeply explore the correlation between the student interests and the teaching discussion topics, which could further improve the accuracy of the recommendation results.

Author Contributions

Conceptualization, X.Z., L.G. and R.L.; methodology, X.Z., L.G. and R.L.; formal analysis, J.P., R.L. and L.L.; visualization, J.P., R.L. and L.L.; writing—original draft preparation, X.Z. and R.L.; writing—review and editing, X.Z., L.G., R.L., L.L. and J.P.; funding acquisition, X.Z. and L.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education (no. 1411015), the National Social Science Fund of China (no. 2023-skjj-a-001), the Annual Planning Project of Commerce Statistical Society of China (no. 2024STZX04), the Project of Sichuan Ethnic Region Rural Digital Education Research Center (no. MZSJ2004C11), the Project of the Key Research Institution of Social Sciences in Sichuan Province—The Center for the Protection and Development of Local Cultural Resources (no. DFWH2024-012), and the funding project of the Sichuan Ethnic Minority Music Culture Research Center (no. SCMY2024003).

Institutional Review Board Statement

All procedures performed in this study involving human participants were in accordance with the ethical standards of the Declaration of Helsinki and approved by the Institutional Review Board of Leshan Vocational and Technical College (protocol No. 1/06.09.2024). The school principal and the director of the research department approved this research. In accordance with the signed confidentiality agreement, the experimental data will not be disclosed.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations and Mathematical Symbols

The following abbreviations and mathematical symbols are used in this manuscript:
k-NN: k-nearest neighbor
UCFA: User-based collaborative filtering recommendation algorithm
ICFA: Item-based collaborative filtering recommendation algorithm
S(i): The sample student
S(i): The student feature vector
T(i): The classification label for the discussion topic
S[i]: The student data matrix
Tr: The training set model for the naive Bayes machine learning
P(T(i)): The naive Bayes prior probability model
P(S(x)|T(i)): The naive Bayes conditional probability density
P(T(i)|S(x)): The naive Bayes posterior probability model
S(x)Δ: The feature vector of the student to be classified
G(j): The discussion topic feature vector
S(i)q: The quantization vector based on the interests of the individual students
G(j)q: The quantization vector based on the topic features
δG(j): The discussion topic weight
f(S(i)q, G(j)q): The discussion topic matching model
xoyG(i): The discussion topic spatial coordinate system
(xG(i), yG(i)): The discussion topic spatial coordinates
HT: The optimal complete binary encoding tree model for the discussion topic
U(i,j): The optimal topic vector for the student
U_NT(i)×n: The optimal topic matrix for the students of a group
fG(j): The discussion topic interest intensity
R(i,j): The discussion topic recommendation vector

Appendix A

Appendix A.1. Pseudo Code for the Training Set Algorithm for Naive Bayes Machine Learning Model

Input: Sample set of previous class students: D
Output: Training set for naive Bayes machine learning model: T r
Process:
  1:Randomly select N number of student samples S ( i ) from D : { S ( 1 ) , S ( 2 ) , …, S ( N ) }
  2:Establish student vector S ( i ) , 0 < i N
  3:  Confirm k number of attributes L ( i ) for teaching theme
  4:  Identify p number of discussion topics and label them as classification labels T ( i ) , 0 < i p
  5:  Set up 1 × ( k + 1 ) dimension vector S ( i ) , elements 1~ k store L ( i ) , element k + 1 stores T ( i )
  6:Establish a student matrix S [ i ] for the previous class, element is S [ i , j ] , i , j n
  7:For  i = 1 , 2 , , n  do
  8:  For  j = 1 , 2 , , n  do
  9:  Store S ( i ) into S [ i , j ] , i , j n
 10:  If  N < n × n  then
 11:  The remaining n × n N elements S [ i , j ] are stored as 0
 12:  Judge the element S ( i , j ) of vector S ( i )
 13:     If vector S ( i ) element: S ( i , j ) = 0  then matrix element S [ i , j ] = 0
 14:  If vector S ( i ) element: S ( i , j ) 0  then matrix element S [ i , j ] = 1
 15: Note element S [ i , j ] = 1 , count its number N S
 16:End for
 17:Establish N S × ( k + 1 ) dimension matrix T r
 18: Define 1 × ( k + 1 ) dimension empty matrix S ( i ) = 0
 19:  Initialize row r o = 1 , r o = r o + 1 , expand row of vector S ( i ) to N S
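As a companion to the pseudocode above, the following minimal Python sketch assembles a training set in the spirit of Appendix A.1: padded empty records are skipped and each remaining student contributes its k interest labels plus its classification label. The record format and encodings are assumptions made only for illustration.

import numpy as np

def build_training_set(samples, k):
    # `samples`: list of (labels, topic) pairs, where `labels` holds the k quantified
    # interest labels L(1..k) and `topic` is the classification label T(i) (hypothetical format).
    rows = []
    for labels, topic in samples:
        if not any(labels):              # an all-zero record marks an empty slot in the padded matrix
            continue
        rows.append(list(labels)[:k] + [topic])
    return np.array(rows, dtype=object)  # the N_S x (k+1) training matrix Tr

# Two real sample students (k = 3 labels each) and one empty padding record.
Tr = build_training_set([((2, 1, 0), "T1"), ((1, 2, 2), "T2"), ((0, 0, 0), None)], k=3)
print(Tr.shape)   # -> (2, 4): the empty record is dropped; k label columns plus one class column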

Appendix A.2. Pseudo Code for the Class Grouping Algorithm Based on Naive Bayes Machine Learning

Input: Student matrix to be classified: S ( x ) Δ
Output: Student classification: T ( i )
Process:
  1:Quantify vector S ( x ) Δ , determine the quantified value of student labels
  2:Equivalently simplify the Bayesian posterior probability model
  3:  Assume that student S ( x ) Δ has equal probability for all classes T ( i ) , P ( S ( x ) ) = c o n s t
  4:  Equivalently simplify P ( T ( i ) | S ( x ) ) to calculate P ( S ( x ) | T ( i ) ) P ( T ( i ) )
  5:  Set δ ( S ( x ) ) = P ( S ( x ) | T ( i ) ) P ( T ( i ) ) equivalently to calculate P ( T ( i ) | S ( x ) )
  6:Regarding T r , establish prior probability model P ( T ( i ) ) of T ( i )
  7: Note S ( i ) T ( i )
  8: Initialize r o = 1 , N T ( i ) = 0
  9:For  r o = 1 , 2 , , N s  do
 10:  If  S ( i ) T ( i )  then  N T ( i ) = N T ( i ) + 1
 11:    If  S ( i ) T ( i )  then  N T ( i ) = N T ( i ) + 0
 12:End for
 13: Calculate P ( T ( i ) ) = N T ( i ) / N S
 14:Repeat Traversing T ( i ) ~ i i | 0 < i p
 15:Introduce S ( x ) Δ and quantified label L ( i ) , set up P ( S ( x ) | T ( i ) )
 16: Construct the conditional probability density P ( S ( x ) | T ( i ) ) = u = 1 k P ( L ( u ) | T ( i ) )
 17: Probabilistic valuation P ( L ( u ) | T ( i ) ) = w ( i , u ) / w ( i ) . The w ( i , u ) is the number of students with label L ( u ) appearing in the class T ( i ) , and w ( i ) is the number of students in class T ( i )
 18:  Introduce disturbance factor σ , calculate P ( S ( x ) | T ( i ) ) = u = 1 k ( w ( i , u ) / w ( i ) + σ )
19:Sort and output δ ( S ( x ) ) . The related classification T ( i ) is the group for student S ( x ) Δ
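The grouping step can be sketched in Python as below; it follows the simplification in steps 2–5 (compare P(T(i))·∏P(L(u)|T(i)) instead of the full posterior) and the disturbance factor of step 18. The training records, label values, and the value of σ used here are hypothetical.

def classify_student(x_labels, training_records, sigma=0.001):
    # `training_records`: iterable of (labels, topic) pairs forming Tr (hypothetical format).
    # Returns the classification T(i) with the largest P(T(i)) * prod_u P(L(u)|T(i)).
    by_class = {}
    for labels, topic in training_records:
        by_class.setdefault(topic, []).append(labels)
    total = sum(len(members) for members in by_class.values())
    scores = {}
    for topic, members in by_class.items():
        prior = len(members) / total                         # P(T(i)) = N_T(i) / N_S
        likelihood = 1.0
        for u, label in enumerate(x_labels):                 # product over the k labels
            w_iu = sum(1 for m in members if m[u] == label)  # w(i, u)
            likelihood *= w_iu / len(members) + sigma        # disturbance factor avoids zero terms
        scores[topic] = prior * likelihood
    return max(scores, key=scores.get), scores

Tr = [(("LL", "FL", "MFL"), "T1"), (("FL", "MFL", "MFL"), "T1"), (("MFL", "LL", "FL"), "T2")]
print(classify_student(("FL", "MFL", "FL"), Tr))   # -> ('T1', {...})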

Appendix A.3. Pseudo Code for the Algorithm of the Complete Binary Encoding Tree for Discussion Topic

Input: Discussion topics G ( i ) of classification T ( i ) , Individual students within the group S ( i )
Output: Encoding tree H T ( i ) for student individual S ( i )
Process:
  1:Select individual students S ( i ) from the classification T ( i ) , quantify and output S ( i ) q
  2:Regarding the h ( i ) number of discussion topics G ( j ) for classification T ( i ) , quantify and output h ( i ) number of G ( j ) q , encode the discussion topic G ( i )
  3:Determine the tree node structure:
  4:   < h e a d > : discussion topic weight: δ G ( j )
  5:   < s u f f i x > : discussion topic matching value: f ( S ( i ) q , G ( j ) q )
  6:Initialize c o u n t = 0 , encode the discussion topic G ( i )
  7:For  c o u n t = 1 , 2 , , h ( i )  do
  8:  Initialize the no. i topic G ( i ) , generate the tree node for G ( i )
  9:    Extract G ( i ) q , calculate δ G ( i ) , store δ G ( i ) into the related h e a d of G ( i )
 10:    Extract S ( i ) q , calculate f ( S ( i ) q , G ( j ) q ) , store f ( S ( i ) q , G ( j ) q ) into the related S u f f i x of G ( i )
 11:   End for
 12:Initialize the complete binary encoding tree H T ( i ) , including h ( i ) number of nodes. Traverse all nodes.
 13:For  j = 1 , 2 , , x  do
 14:  Compare f ( 1 ) , f ( 2 ) , …, f ( x ) and store them
 15:  For any node H T ( u , v ) , its left node is H T ( u , v − 1 )
 16:  For any node H T ( u , v ) , its right node H T ( u , v + 1 ) meets:
 17:    If the current H T ( u , v ) stores the last G ( j ) , then the right node H T ( u , v + 1 ) does not exist
 18:    If the current H T ( u , v ) does not store the last G ( j ) , then the right node is H T ( u , v + 1 )
 19:  Any node H T ( u , v ) meets:
 20:    ① The G ( j ) stored in H T ( u , v ) corresponds to f ( j m ) , and the G ( j ) stored in the child nodes H T ( u + 1 , 2 v − 1 ) and H T ( u + 1 , 2 v ) correspond to f ( j k ) and f ( j d ) ; there is f ( j m ) ≥ f ( j k ) ≥ f ( j d ) ;
      ② The G ( j ) stored in H T ( u , v ) corresponds to f ( j m ) , the G ( j ) stored in the left node H T ( u , v − 1 ) corresponds to f ( j k ) , and the G ( j ) stored in the right node H T ( u , v + 1 ) corresponds to f ( j d ) ; there is f ( j k ) ≥ f ( j m ) ≥ f ( j d ) .
 21:  Repeat Stop searching until the traversal x = h ( i ) is complete
 22:End for
 23:Output individual student S ( i ) encoding tree H T ( i ) , with tree nodes for sorting discussion topics
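A minimal sketch of the tree-building idea is given below: every topic is scored against one student's quantized interest vector, and the scored nodes are laid out as a complete binary tree in array (level-order) form, so that a parent's matching value is never smaller than its children's. Taking the matching model f(S(i)q, G(j)q) as a weighted dot product is our assumption; the paper only fixes the ordering constraints.

import numpy as np

def build_topic_tree(student_q, topics_q, topic_weights):
    # `student_q`: the student's quantized interest vector S(i)q.
    # `topics_q`: one quantized feature vector G(j)q per discussion topic.
    # `topic_weights`: the topic weights delta_G(j) stored in the node heads.
    # Returns the level-order node list of H_T(i); index p has children 2p+1 and 2p+2,
    # so a descending layout guarantees that a parent's matching value >= its children's.
    scores = [(float(topic_weights[j] * (student_q @ g_q)), j)   # (f value, topic index j)
              for j, g_q in enumerate(topics_q)]
    return sorted(scores, reverse=True)

student_q = np.array([0.7, 0.2, 0.5])
topics_q = np.array([[0.9, 0.1, 0.4], [0.2, 0.8, 0.3], [0.5, 0.5, 0.6]])
topic_weights = [1.0, 0.8, 1.2]
print(build_topic_tree(student_q, topics_q, topic_weights))
# -> approximately [(0.90, 2), (0.85, 0), (0.36, 1)]: the root holds the best-matching topic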

Appendix A.4. Pseudo Code for the Recommendation Algorithm Based on the Improved k-NN Data Mining

Input: Classification T ( i ) and students S ( i ) , student interest vector S ( i ) q and topic vector G ( j ) q , individual student S ( i ) encoding tree H T ( i )
Output: The optimal discussion topic G ( j ) recommended for student classification T ( i )
Process:
  1:For  i = 1 , 2 , , N T ( i )  do
  2: Generate individual student S ( i ) encoding tree H T ( i )
  3: Output the top n number of nodes of each tree H T ( i )
  4:End for
  5:Output the optimal topic vector U ( i , j ) of student S ( i )
  6:  For  i = 1 , 2 , , N T ( i )  do
  7:  Take the n number of optimal topics G ( j ) from the student S ( i ) coding tree H T ( i ) and store them in the 1 × k dimensional vector U ( i , j )
  8:End for
  9:Initialize matrix U N T ( i ) × n = 0 , counter: c o u n t = 0
 10:For  c o u n t = 1 , 2 , , N T ( i )  do
 11:   Take U ( i , j ) element; vector elements U ( i , 1 ) ~ U ( i , n ) are stored into the elements of no. i row
 12:   c o u n t = c o u n t + 1
 13:   End for
 14:Build a baseline vector B containing h ( i ) number of discussion topics G ( j ) , with corresponding storage of G ( j ) for vector element B ( j )
 15:Scan matrix U N T ( i ) × n ; the row is encoded as u , the column is encoded as v ; note that c o u n t is the intensity weight f G ( j ) [ j ] of B ( j )
 16:For row u = 1 , 2 , , N T ( i )  do
 17:  For column v = 1 , 2 , , n  do
 18:    If U ( u , v ) = B ( j ) then  c o u n t = c o u n t + 1
 19:    If U ( u , v ) B ( j ) then  c o u n t = c o u n t + 0
 20:   End for
 21:   End for
 22:Normalize f G ( j ) [ j ] . Output the intensity weight of each element in the vector B
 23:Build a complete binary tree T B , store f G ( j ) [ j ] to nodes in descending order. The top k number of G ( j ) and f G ( j ) [ j ] are recommended to the classification T ( i )
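The aggregation step can be sketched as follows: the top-n nodes of every group member's encoding tree are pooled, the occurrence counts are normalized into the interest intensity weights f_G(j), and the top-k topics are returned for the classification T(i). Parameter names follow the paper; the default values and the input format are assumptions.

from collections import Counter

def recommend_for_group(student_trees, n=5, top_k=1):
    # `student_trees`: maps each student in T(i) to the level-order node list of
    # Appendix A.3, i.e. a list of (matching value, topic) pairs (hypothetical format).
    counts = Counter()
    for tree in student_trees.values():
        for _, topic in tree[:n]:                    # the top-n nodes of this student's tree
            counts[topic] += 1
    total = sum(counts.values())
    intensity = {topic: c / total for topic, c in counts.items()}    # normalized f_G(j)
    ranked = sorted(intensity.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k], intensity                 # top-k recommendation plus all weights

trees = {
    "S1": [(0.9, "G2"), (0.8, "G5"), (0.6, "G1")],
    "S2": [(0.95, "G5"), (0.7, "G2"), (0.5, "G3")],
    "S3": [(0.85, "G5"), (0.6, "G4"), (0.55, "G2")],
}
print(recommend_for_group(trees, n=2, top_k=1))      # -> ([('G5', 0.5)], {...})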

References

  1. Huang, H. The cultivation of quality and ability of environmental design professionals in universities under the digital background. Int. Educ. Res. 2025, 8, 25. [Google Scholar] [CrossRef]
  2. Sunardi, S.; Hermagustiana, I.; Rusmawati, D. Tension between theory and practice in literature courses at university-based educational institutions: Strategies and approaches. J. Lang. Teach. Res. 2025, 16, 666–675. [Google Scholar] [CrossRef]
  3. Wu, S.Y. Research on the integration of practical teaching in introduction courses under the background of inter-school cooperation. Educ. Insights 2025, 2, 45–51. [Google Scholar]
  4. Gong, Y.F. Innovation and practice of translation practice course teaching from the perspective of interdisciplinary integration. J. Hum. Arts Soc. Sci. 2025, 9, 103–108. [Google Scholar]
  5. Meng, Y.R.; Cui, Y.; Aryadoust, V. EFL teachers’ formative assessment literacy and developmental Trajectories: A comparative study of face-to-face and blended teaching modes. System 2025, 132, 103694. [Google Scholar] [CrossRef]
  6. Loureiro, A.; Rodrigues, M.O. Student grouping: Investigating a socio-educational practice in a public school in Portugal. Soc. Sci. 2024, 13, 141. [Google Scholar] [CrossRef]
  7. Fu, L.M. Construction of Vocational Education Quality Evaluation Index System from the Perspective of Digital Transformation Based on the Analytic Hierarchy Process of Higher Vocational Colleges in Hainan Province, China. J. Contemp. Educ. Res. 2025, 9, 282–289. [Google Scholar] [CrossRef]
  8. Wang, J.Y. Issues and Improvement Strategies in Group Teaching of Instrumental Performance Courses in Higher Normal Universities. Int. J. New Dev. Educ. 2024, 6, 31–36. [Google Scholar]
  9. Caskurlu, S.; Yalçın, Y.; Hur, J.; Shi, H.; Klein, J. Data-Driven Decision-Making in Instructional Design: Instructional Designers’ Practices and Strategies. TechTrends 2025, prepublish. [Google Scholar] [CrossRef]
  10. Ashcroft, J.; Warren, P.; Weatherby, T.; Barclay, S.; Kemp, L.; Davies, R.J.; Hook, C.E.; Fistein, E.; Soilleux, E. Using a scenario-based approach to teaching professionalism to medical students: Course description and evaluation. JMIR Med. Educ. 2021, 7, e26667. [Google Scholar] [CrossRef] [PubMed]
  11. Rizi, C.E.; Gholami, A.; Koulaynejad, J. The compare the affect instruction in experimental and practical approach (with emphasis on play) to verbal approach on mathematics educational progress. Procedia—Soc. Behav. Sci. 2011, 15, 2192–2195. [Google Scholar] [CrossRef]
  12. Porubän, J.; Nosál’, M.; Sulír, M.; Chodarev, S. Teach Programming Using Task-Driven Case Studies: Pedagogical Approach, Guidelines, and Implementation. Computers 2024, 13, 221. [Google Scholar] [CrossRef]
  13. Heidari-Shahreza, M.A. Pedagogy of play: Insights from playful learning for language learning. Dis. Edu. 2024, 3, 157. [Google Scholar] [CrossRef]
  14. Johnson, O.; Olukayode, Y.A.; Abosede, A.A.; Homero, M.; Gao, X.H.; Kereshmeh, A. Construction practice knowledge for complementing classroom teaching during site visits. Smart Sustain. Built Environ. 2025, 14, 119–139. [Google Scholar]
  15. Wira, G.; Oke, H.; Rizkina, A.P.; Direstu, A. Updating aircraft maintenance education for the modern era: A new approach to vocational higher education. High. Educ. Ski. Work.-Based Learn. 2025, 15, 46–61. [Google Scholar]
  16. Wilkinson, S.D.; Penney, D. Students’ preferences for setting and/or mixed-ability grouping in secondary school physical education in England. Br. Edu. Res. J. 2024, 50, 1804–1830. [Google Scholar] [CrossRef]
  17. Ren, C.J. Immersive E-learning mode application in Chinese language teaching system based on big data recommendation algorithm. Entertain. Comput. 2025, 52, 100774. [Google Scholar] [CrossRef]
  18. Fu, L.W.; Mao, L.J. Application of personalized recommendation algorithm based on sensor networks in Chinese multimedia teaching system. Meas. Sens. 2024, 33, 101167. [Google Scholar] [CrossRef]
  19. Yin, C.J. Application of recommendation algorithms based on social relationships and behavioral characteristics in music online teaching. Int. J. Web-Based Learn. Teach. Technol. 2024, 19, 1–18. [Google Scholar] [CrossRef]
  20. Liu, Y. The application of digital multimedia technology in the innovative mode of English teaching in institutions of higher education. Appl. Math. Nonlinear Sci. 2024, 9. [Google Scholar] [CrossRef]
  21. Zhang, Y.Y.; Guo, H.Y. Research on a recommendation model for sustainable innovative teaching of Chinese as a foreign language based on the data mining algorithm. Int. J. Knowl.-Based Dev. 2024, 14, 1–18. [Google Scholar] [CrossRef]
  22. Ying, F. Interactive AI Virtual Teaching Resource Intelligent Recommendation Algorithm Based on Similarity Measurement on the Internet of Things Platform. J. Test. Eval. 2024, 52, 1650–1662. [Google Scholar]
  23. Liu, Q.L. Construction and application of personalised classroom teaching model of college English combined with recommendation algorithm. Appl. Math. Nonlinear Sci. 2024, 9. [Google Scholar] [CrossRef]
  24. Lu, H. Personalized music teaching service recommendation based on sensor and information retrieval technology. Meas. Sens. 2024, 33, 101207. [Google Scholar] [CrossRef]
  25. Nebojsa, G.; Tatjana, S.; Dragan, D. Design and implementation of discrete Jaya and discrete PSO algorithms for automatic collaborative learning group composition in an e-learning system. Appl. Soft Comput. 2022, 129, 109611. [Google Scholar]
  26. Baig, D.; Nurbakova, D.; Mbaye, B.; Calabretto, S. Knowledge graph-based recommendation system for personalized e-Learning. In Proceedings of UMAP Adjunct ‘24: Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization, Cagliari, Italy, 28 June 2024; pp. 561–566. [Google Scholar]
  27. Sundaresan, B.; Raja, M.; Balachandran, S. Design and analysis of a cluster-based intelligent hybrid recommendation system for e-learning applications. Mathematics 2021, 9, 197. [Google Scholar]
  28. Nachida, R.; Benkessirat, S.; Boumahdi, F. Enhancing collaborative filtering with game theory for educational recommendations: The Edu-CF-GT Approach. J. Web Eng. 2025, 24, 57–78. [Google Scholar] [CrossRef]
  29. Bustos López, M.; Alor-Hernández, G.; Sánchez-Cervantes, J.L.; Paredes-Valverde, M.A.; Salas-Zárate, M.d.P.; Bickmore, T. EduRecomSys: An Educational Resource Recommender System Based on Collaborative Filtering and Emotion Detection. Interact. Comput. 2020, 32, 407–432. [Google Scholar] [CrossRef]
  30. Amin, S.; Uddin, M.I.; Mashwani, W.K.; Alarood, A.A.; Alzahrani, A.; Alzahrani, A.O. Developing a personalized e-learning and MOOC recommender system in IoT-enabled smart education. IEEE Access 2023, 11, 136437–136455. [Google Scholar] [CrossRef]
  31. Lin, P.-H.; Chen, S.-Y. Design and evaluation of a deep learning recommendation based augmented reality system for teaching programming and computational thinking. IEEE Access 2020, 8, 45689–45699. [Google Scholar] [CrossRef]
  32. Chen, W.Q.; Yang, T. A recommendation system of personalized resource reliability for online teaching system under large-scale user access. Mob. Netw. Appl. 2023, 28, 983–994. [Google Scholar] [CrossRef]
  33. Qu, Z.H. Personalized recommendation system for English teaching resources in colleges and universities based on collaborative recommendation. Appl. Math. Nonlinear Sci. 2024, 9. [Google Scholar] [CrossRef]
  34. Wang, T.Y.; Ge, D. Research on recommendation system of online Chinese learning resources based on multiple collaborative filtering algorithms (RSOCLR). Int. J. Hum.-Comput. Interact. 2025, 41, 177. [Google Scholar] [CrossRef]
  35. Luo, Y.L.; Lu, C.L. TF-IDF combined rank factor Naive Bayesian algorithm for intelligent language classification recommendation systems. Syst. Soft Comp. 2024, 6, 200136. [Google Scholar] [CrossRef]
  36. Soheli, F. Classification of academic performance for university research evaluation by implementing modified Naive Bayes algorithm. Procedia Comp. Sci. 2021, 194, 224–228. [Google Scholar]
  37. Ahmad, K.; Ali, B.M.; Hamid, B. A distributed density estimation algorithm and its application to naive Bayes classification. App. Soft Comp. 2020, prepublish. [Google Scholar]
  38. Li, Q.N.; Li, T.H. Research on the application of Naive Bayes and support vector machine algorithm on exercises classification. J. Phys. Conf. Ser. 2020, 1437, 012071. [Google Scholar] [CrossRef]
  39. Gao, H.Y.; Zeng, X.; Yao, C.H. Application of improved distributed naive Bayesian algorithms in text classification. J. Supercomp. 2019, 75, 5831–5847. [Google Scholar] [CrossRef]
  40. Wang, D.Q.; Yang, Q.; Wu, X.L.; Wu, Z.Z.; Zhang, J.W.; He, S.X. Multi-behavior enhanced group recommendation for smart educational services. Discov. Comp. 2025, 28, 49. [Google Scholar] [CrossRef]
  41. Gong, Y.J.; Shen, X.Z. An algorithm for distracted driving recognition based on pose features and an improved KNN. Electronics 2024, 13, 1622. [Google Scholar] [CrossRef]
  42. Bahrani, P.; Bidgoli, B.M.; Parvin, H.; Mirzarezaee, M.; Keshavarz, A.J. A new improved KNN-based recommender system. J. Supercomput. 2024, 80, 800–834. [Google Scholar] [CrossRef]
Figure 1. The constructed training set for the naive Bayes machine learning algorithm.
Figure 2. The constructed discussion topic spatial coordinate system.
Figure 3. The basic generation rules and logical structure of the tree model H T.
Figure 4. The bar chart and curve trend chart of posterior probability. The blue color represents the T(1); the orange color represents the T(2); the green color represents the T(3).
Figure 5. The coordinate system and distribution of points for discussion topics within each group T(i) of students.
Figure 6. The optimal complete binary encoding tree H T for student discussion topics within each group T(i).
Figure 7. The trend chart of interest intensity weight curves of students in different groups with different parameter values. The blue color represents the T(1); the orange color represents the T(2); the green color represents the T(3).
Figure 8. Comparison of the accuracy values and recall rate values of the optimal topics output by the different recommendation algorithms. The blue color represents the PRA; the orange color represents the UCFA; the green color represents the ICFA.
Table 1. Comparison of the relevant features contained in each recommendation model.
Recommendation Model/Method | Discussion (Teaching) Topic | Group Discussion (Teaching) Topic | Student Interest Label | Student Classification Label | Group Discussion (Teaching) Sub-Topic
The constructed model | Design the overall topic for the teaching content | Design each group topic based on the grouping result | Design student interest labels based on each topic | Design student classification labels based on grouping result | Further refine the discussion content for each group based on their respective topics
Literature [17] | Design the overall topic for the teaching content | No user classification mechanism | Obtain users' interest labels | No user classification mechanism | Efficiently screen required learning content and data
Literature [18] | Design the overall topic for the teaching content | No user classification mechanism | Obtain users' interest labels | No user classification mechanism | There is no mechanism to refine the teaching topic
Literature [19] | Design the overall topic for the teaching content | No user classification mechanism | Obtain users' interest labels and behavior labels | No user classification mechanism | Teaching labels can be further subdivided
Literature [20] | Design the overall topic for the teaching content | The experimental group and the control group use the same teaching content | Obtain users' interest labels and behavior labels | The experimental group and the control group use the same teaching content | Efficiently screen required learning content and data
Literature [21] | Design the overall topic for the teaching content | No user classification mechanism | Mine user interest's similarity | No user classification mechanism | Recommend different teaching resources for different learners
Literature [22] | Design the overall topic for the teaching content | Classify learners and identify learning resources | Explore learners' behavioral patterns | Classify learners and identify learning resources | Recommend different teaching resources for different learners
Literature [23] | Design the overall topic for the teaching content | No user classification mechanism, conducting research on individual learners | Obtain student portraits and interests | No user classification mechanism, conducting research on individual learners | Implement the personalized teaching content recommendations
Literature [24] | Design the overall topic for the teaching content | Personalized recommendations for individual learners | Establish interest labels based on users' preference | Personalized recommendations for individual learners | Realize personalized recommendation of teaching resources
Literature [25] | Design the overall topic for the teaching content | Automatically create student interest groups and recommend teaching content based on group recommendations | Obtain students' interest labels | Automatically create student interest groups and recommend teaching content based on group recommendations | Implement teaching content recommendations for different groups of students
Literature [26] | Design the overall topic for the teaching content | Personalized recommendations for individual learners | Obtain student interests based on knowledge graphs | Personalized recommendations for individual learners | Realize personalized recommendation of teaching resources
Literature [27] | Design the overall topic for the teaching content | Automatically analyze and learn learners' styles and features to determine different topics | Obtain users' interest labels | Automatically analyze and learn learners' styles and features to determine different topics | Recommend teaching contents for students in different groups
Literature [28] | Design the overall topic for the teaching content | No user classification mechanism, conducting research on individual learners | Obtain users' interest labels | No user classification mechanism, conducting research on individual learners | Implement the personalized teaching content recommendations
Literature [29] | Design the overall topic for the teaching content | No user classification mechanism, conducting research on individual learners | Obtain users' interest labels | No user classification mechanism, conducting research on individual learners | Implement the personalized teaching content recommendations
Literature [30] | Design the overall topic for the teaching content | No user classification mechanism, conducting research on individual learners | Obtain users' interest labels | No user classification mechanism, conducting research on individual learners | Implement the personalized teaching content recommendations
Literature [31] | Design the overall topic for the teaching content | Divide students into two groups and recommend them by using different methods | Obtain interest data from different groups of students | Divide students into two groups and recommend them by using different methods | Recommend different teaching resources for different learners
Literature [32] | Design the overall topic for the teaching content | No grouping, used for recommendation in large-scale user base | Establish a thematic interest model and explore students' interests | No grouping, used for recommendation in large-scale user base | Implement the personalized teaching content recommendations
Literature [33] | Design the overall topic for the teaching content | No user classification mechanism | Explore and mine user interests | No user classification mechanism | Recommend different teaching resources for different learners
Literature [34] | Design the overall topic for the teaching content | No user classification mechanism | Explore and mine user interests and demands | No user classification mechanism | Recommend different teaching resources for different learners
Table 2. Naive Bayes learning training set model constructed by the experiment.
S(i) | I-1 | I-2 | I-3 | I-4 | I-5 | T(i)
S(1) | LL | FL | FL | LL | MFL | T1
S(2) | FL | MFL | MFL | MFL | FL | T1
S(3) | FL | MFL | B | MFL | FL | T1
S(4) | LL | MFL | LL | MFL | FL | T1
S(5) | MFL | MFL | FL | MFL | LL | T1
S(6) | FL | FL | FL | MFL | MFL | T1
S(7) | LL | LL | MFL | FL | FL | T1
S(8) | MFL | FL | LL | FL | FL | T2
S(9) | MFL | FL | FL | FL | LL | T2
S(10) | FL | LL | LL | FL | FL | T2
S(11) | LL | MFL | MFL | MFL | FL | T2
S(12) | MFL | MFL | FL | LL | MFL | T2
S(13) | FL | FL | LL | FL | MFL | T2
S(14) | FL | FL | LL | MFL | LL | T2
S(15) | LL | FL | FL | MFL | LL | T2
S(16) | MFL | LL | MFL | MFL | LL | T3
S(17) | FL | LL | MFL | FL | LL | T3
S(18) | LL | MFL | LL | MFL | FL | T3
S(19) | MFL | LL | FL | MFL | FL | T3
S(20) | MFL | FL | FL | LL | MFL | T3
Table 3. The collected feature vectors of students to be classified in the experimental class.
S(i) | I-1 | I-2 | I-3 | I-4 | I-5
X(1) | FL | FL | MFL | MFL | LL
X(2) | LL | MFL | MFL | LL | FL
X(3) | FL | FL | FL | MFL | LL
X(4) | LL | LL | MFL | MFL | FL
X(5) | LL | MFL | MFL | MFL | FL
X(6) | FL | FL | LL | MFL | MFL
X(7) | LL | FL | MFL | MFL | MFL
X(8) | MFL | LL | FL | MFL | LL
X(9) | MFL | MFL | FL | LL | LL
X(10) | FL | MFL | LL | MFL | FL
X(11) | LL | LL | MFL | MFL | FL
X(12) | FL | MFL | FL | FL | LL
X(13) | MFL | LL | MFL | FL | FL
X(14) | FL | FL | LL | MFL | MFL
X(15) | MFL | MFL | FL | FL | LL
Table 4. The designed feature labels g(i,t) for each classification (quantization interval: 0 < g(i,t) < 1 for every label).
T(1) "Rural Preservation": g(1,1) Leisure Walk | g(1,2) Physical Exercise | g(1,3) Medical Health Preservation | g(1,4) Swimming Fitness | g(1,5) Cycling Experience | g(1,6) Climbing Mountain Experience
T(2) "Rural Cuisine": g(2,1) Food Making | g(2,2) Food Tasting | g(2,3) Food Science Popularization | g(2,4) Food Expo | g(2,5) Food Festival | g(2,6) Food and Health Preservation
T(3) "Rural Farming": g(3,1) Picking Experience | g(3,2) Fertilization Experience | g(3,3) Fishing Experience | g(3,4) Planting Experience | g(3,5) Harvesting Experience | g(3,6) Drying Experience
Table 5. The designed discussion topics G(j) for each classification T(i).
G(j) | T(1) | T(2) | T(3)
G(1) | Emei Yequan Valley | Niuhua Ancient Town | Jiajiang Fengshan, New Year's Painting Village
G(2) | Jiayang Suoluo Lake | Suji Ancient Town | Jia'e Tea Valley
G(3) | Luomu Ancient Town | Muyu Mountain Villa | Qinghe Village
G(4) | Pingqiang Xiaosanxia | Jinying Mountain Villa | Guihuaqiao Town, Agricultural Park
G(5) | Futian Village | Lianggou, Gaoqiao Town | Xinhua Village
G(6) | Suji Ancient Town | Bagou Ancient Town | Zi Ai Tianyuan Family Farm
G(7) | Kashasha Rural Resort | Qingxi Town | Tianye Farm
G(8) | Fanshen Village, Wutongqiao | Xiba Ancient Town | Tangjiaba Village
G(9) | Shuangshan Village, Zhenxi Town | Huatou Ancient Town | Tianfu Sightseeing Tea Garden
G(10) | Black Bamboo Gully | Luocheng Ancient Town | Si'e Mountain Terraced Fields
Table 6. The posterior probability values of students X(i) belonging to different classifications T(i) output by the naive Bayes machine learning algorithm.
S(i) | T(1) | T(2) | T(3)
X(1) | 1.249 × 10−3 | 1.648 × 10−3 | 0.960 × 10−3
X(2) | 2.000 × 10−3 | 0.146 × 10−3 | 0.320 × 10−3
X(3) | 2.499 × 10−3 | 4.944 × 10−3 | 0.960 × 10−3
X(4) | 2.499 × 10−3 | 0.220 × 10−3 | 2.880 × 10−3
X(5) | 9.996 × 10−3 | 0.439 × 10−3 | 0.960 × 10−3
X(6) | 1.249 × 10−3 | 4.395 × 10−3 | 0.240 × 10−3
X(7) | 2.499 × 10−3 | 0.732 × 10−3 | 0.480 × 10−3
X(8) | 0.416 × 10−3 | 0.989 × 10−3 | 8.640 × 10−3
X(9) | 0.249 × 10−3 | 0.659 × 10−3 | 0.960 × 10−3
X(10) | 4.998 × 10−3 | 2.637 × 10−3 | 0.480 × 10−3
X(11) | 2.499 × 10−3 | 0.220 × 10−3 | 2.880 × 10−3
X(12) | 1.000 × 10−3 | 2.637 × 10−3 | 0.320 × 10−3
X(13) | 0.167 × 10−3 | 0.439 × 10−3 | 2.880 × 10−3
X(14) | 1.249 × 10−3 | 4.395 × 10−3 | 0.240 × 10−3
X(15) | 0.333 × 10−3 | 2.637 × 10−3 | 0.960 × 10−3
Table 7. Satisfaction evaluation results of experimental classes E1, E2, and E3 on the grouping results and discussion courses.
Class | Degree of Satisfaction | Grouping Satisfaction | Interest Matching Satisfaction | Team Collaboration Satisfaction | Discussion Process Satisfaction
E1 | very satisfied and satisfied | 0.867 | 0.867 | 0.800 | 0.800
E1 | dissatisfied | 0.133 | 0.133 | 0.200 | 0.200
E2 | very satisfied and satisfied | 0.800 | 0.667 | 0.667 | 0.600
E2 | dissatisfied | 0.200 | 0.333 | 0.333 | 0.400
E3 | very satisfied and satisfied | 0.600 | 0.533 | 0.667 | 0.600
E3 | dissatisfied | 0.400 | 0.467 | 0.333 | 0.400
Table 8. Accuracy evaluation results of experimental classes E1, E2, and E3 on the grouping results and discussion courses.
Class | Grouping Accuracy | Interest Matching Accuracy | Team Collaboration Accuracy | Discussion Process Accuracy
E1 | 0.667 | 0.600 | 0.533 | 0.667
E2 | 0.533 | 0.400 | 0.467 | 0.400
E3 | 0.400 | 0.333 | 0.400 | 0.333
Table 9. The weight f ¯ G ( j ) [ i ] of interest intensity for topics G ( j ) in classifications T ( i ) .
n = 5
G ( 1 ) G ( 2 ) G ( 3 ) G ( 4 ) G ( 5 ) G ( 6 ) G ( 7 ) G ( 8 ) G ( 9 ) G ( 10 )
T ( 1 ) 0.150 0.150 0.050 0.150 0.100 0.050 0.050 0.050 0.150 0.100
T ( 2 ) 0.100 0.033 0.033 0.067 0.133 0.100 0.133 0.200 0.100 0.100
T ( 3 ) 0.160 0.000 0.160 0.120 0.120 0.040 0.000 0.120 0.080 0.200
n = 6
G ( 1 ) G ( 2 ) G ( 3 ) G ( 4 ) G ( 5 ) G ( 6 ) G ( 7 ) G ( 8 ) G ( 9 ) G ( 10 )
T ( 1 ) 0.167 0.125 0.042 0.125 0.083 0.083 0.125 0.042 0.125 0.083
T ( 2 ) 0.083 0.056 0.028 0.083 0.111 0.083 0.111 0.167 0.139 0.139
T ( 3 ) 0.133 0.000 0.167 0.167 0.133 0.033 0.000 0.133 0.067 0.167
n = 7
G ( 1 ) G ( 2 ) G ( 3 ) G ( 4 ) G ( 5 ) G ( 6 ) G ( 7 ) G ( 8 ) G ( 9 ) G ( 10 )
T ( 1 ) 0.143 0.107 0.071 0.107 0.107 0.071 0.143 0.071 0.107 0.071
T ( 2 ) 0.095 0.048 0.048 0.095 0.119 0.071 0.143 0.143 0.119 0.119
T ( 3 ) 0.114 0.057 0.143 0.143 0.114 0.086 0.029 0.114 0.057 0.143
n = 8
G ( 1 ) G ( 2 ) G ( 3 ) G ( 4 ) G ( 5 ) G ( 6 ) G ( 7 ) G ( 8 ) G ( 9 ) G ( 10 )
T ( 1 ) 0.125 0.125 0.063 0.094 0.125 0.063 0.125 0.094 0.125 0.063
T ( 2 ) 0.104 0.063 0.063 0.104 0.104 0.083 0.125 0.125 0.104 0.125
T ( 3 ) 0.100 0.050 0.125 0.125 0.125 0.100 0.050 0.125 0.075 0.125
Table 10. The accuracy values of the optimal topics output by the different recommendation algorithms.
Algorithm | Group | n = 5 | n = 6 | n = 7 | n = 8
PRA | T(1) | 0.300 | 0.400 | 0.500 | 0.500
PRA | T(2) | 0.233 | 0.267 | 0.450 | 0.300
PRA | T(3) | 0.260 | 0.300 | 0.300 | 0.500
UCFA | T(1) | 0.200 | 0.350 | 0.500 | 0.475
UCFA | T(2) | 0.217 | 0.250 | 0.417 | 0.300
UCFA | T(3) | 0.200 | 0.220 | 0.300 | 0.480
ICFA | T(1) | 0.250 | 0.350 | 0.450 | 0.425
ICFA | T(2) | 0.167 | 0.217 | 0.383 | 0.283
ICFA | T(3) | 0.180 | 0.240 | 0.260 | 0.480
Table 11. The recall rate values of the optimal topics output by the different recommendation algorithms.
n = 5 n = 6 n = 7 n = 8
PRA T ( 1 ) 0.600 0.667 0.714 0.625
T ( 2 ) 0.467 0.444 0.643 0.594
T ( 3 ) 0.520 0.500 0.429 0.531
UCFA T ( 1 ) 0.400 0.583 0.714 0.375
T ( 2 ) 0.433 0.417 0.595 0.375
T ( 3 ) 0.400 0.367 0.429 0.354
ICFA T ( 1 ) 0.500 0.583 0.643 0.625
T ( 2 ) 0.333 0.361 0.548 0.600
T ( 3 ) 0.360 0.400 0.371 0.600
Table 12. The calculated results of accuracy optimization degree Φ a c for the experimental group compared to the control group.
n = 5 n = 6 n = 7 n = 8 Average
Φ a c [ PRA UCFA ] T ( 1 ) 33.33%12.50%0.00%5.00%12.71%
T ( 2 ) 6.87%6.37%7.33%0.00%5.14%
T ( 3 ) 23.08%26.67%0.00%4.00%13.44%
Φ a c [ PRA ICFA ] T ( 1 ) 16.67%12.50%10.00%15.00%13.54%
T ( 2 ) 28.33%18.73%14.89%5.67%16.90%
T ( 3 ) 30.77%20.00%13.33%4.00%17.03%
Table 13. The calculated results of recall rate optimization degree Φ r e for the experimental group compared to the control group.
n = 5 n = 6 n = 7 n = 8 Average
Φ r e [ PRA UCFA ] T ( 1 ) 33.33%12.59%0.00%4.96%12.72%
T ( 2 ) 7.28%6.08%7.47%0.00%5.21%
T ( 3 ) 23.08%26.60%0.00%4.00%13.42%
Φ r e [ PRA ICFA ] T ( 1 ) 16.67%12.59%9.94%15.04%13.56%
T ( 2 ) 28.69%18.69%14.77%5.60%16.94%
T ( 3 ) 30.77%20.00%13.52%4.00%17.07%
Table 14. The comparison of accuracy between the two classes.
Class 1—E2Class 2—E3
n = 5 n = 6 n = 7 n = 8 n = 5 n = 6 n = 7 n = 8
PRA T ( 1 ) 0.250 0.350 0.300 0.600 PRA T ( 1 ) 0.275 0.350 0.575 0.600
T ( 2 ) 0.283 0.350 0.500 0.700 T ( 2 ) 0.267 0.283 0.300 0.500
T ( 3 ) 0.400 0.500 0.600 0.600 T ( 3 ) 0.300 0.400 0.600 0.700
UCFA T ( 1 ) 0.225 0.325 0.300 0.325 UCFA T ( 1 ) 0.250 0.300 0.450 0.425
T ( 2 ) 0.200 0.267 0.400 0.300 T ( 2 ) 0.233 0.267 0.267 0.317
T ( 3 ) 0.200 0.260 0.280 0.400 T ( 3 ) 0.220 0.260 0.320 0.420
ICFA T ( 1 ) 0.250 0.300 0.300 0.325 ICFA T ( 1 ) 0.225 0.325 0.500 0.300
T ( 2 ) 0.183 0.217 0.267 0.283 T ( 2 ) 0.200 0.233 0.267 0.300
T ( 3 ) 0.180 0.260 0.260 0.400 T ( 3 ) 0.220 0.260 0.280 0.420
Table 15. The comparison of recall rate between the two classes.
Class 1—E2Class 2—E3
n = 5 n = 6 n = 7 n = 8 n = 5 n = 6 n = 7 n = 8
PRA T ( 1 ) 0.500 0.583 0.429 0.750 PRA T ( 1 ) 0.550 0.583 0.821 0.750
T ( 2 ) 0.567 0.583 0.714 0.875 T ( 2 ) 0.533 0.472 0.429 0.625
T ( 3 ) 0.800 0.833 0.857 0.750 T ( 3 ) 0.600 0.667 0.857 0.875
UCFA T ( 1 ) 0.450 0.542 0.429 0.344 UCFA T ( 1 ) 0.500 0.500 0.643 0.531
T ( 2 ) 0.400 0.444 0.571 0.375 T ( 2 ) 0.467 0.444 0.381 0.396
T ( 3 ) 0.400 0.433 0.400 0.500 T ( 3 ) 0.440 0.433 0.457 0.525
ICFA T ( 1 ) 0.500 0.500 0.429 0.406 ICFA T ( 1 ) 0.450 0.542 0.714 0.375
T ( 2 ) 0.367 0.361 0.381 0.354 T ( 2 ) 0.400 0.389 0.381 0.375
T ( 3 ) 0.360 0.433 0.371 0.500 T ( 3 ) 0.440 0.433 0.400 0.525
Table 16. The comparison of precision between the two classes.
Class 1—E2Class 2—E3
n = 5 n = 6 n = 7 n = 8 n = 5 n = 6 n = 7 n = 8
PRA T ( 1 ) 0.714 0.778 0.667 0.857 PRA T ( 1 ) 0.647 0.778 0.852 0.774
T ( 2 ) 0.895 0.581 0.833 0.875 T ( 2 ) 0.696 0.586 0.514 0.682
T ( 3 ) 0.833 0.833 0.882 0.789 T ( 3 ) 0.600 0.714 0.882 0.875
UCFA T ( 1 ) 0.563 0.650 0.500 0.731 UCFA T ( 1 ) 0.556 0.700 0.833 0.739
T ( 2 ) 0.500 0.533 0.632 0.486 T ( 2 ) 0.538 0.552 0.432 0.413
T ( 3 ) 0.625 0.542 0.609 0.645 T ( 3 ) 0.579 0.542 0.552 0.656
ICFA T ( 1 ) 0.625 0.571 0.444 0.520 ICFA T ( 1 ) 0.529 0.684 0.741 0.500
T ( 2 ) 0.579 0.520 0.533 0.425 T ( 2 ) 0.500 0.424 0.500 0.563
T ( 3 ) 0.450 0.591 0.481 0.625 T ( 3 ) 0.478 0.542 0.424 0.724
Table 17. The comparison of F 1 value between the two classes.
Class 1—E2Class 2—E3
n = 5 n = 6 n = 7 n = 8 n = 5 n = 6 n = 7 n = 8
PRA T ( 1 ) 0.588 0.667 0.522 0.800 PRA T ( 1 ) 0.595 0.667 0.836 0.762
T ( 2 ) 0.694 0.582 0.769 0.875 T ( 2 ) 0.604 0.523 0.468 0.652
T ( 3 ) 0.816 0.833 0.869 0.769 T ( 3 ) 0.600 0.690 0.869 0.875
UCFA T ( 1 ) 0.500 0.591 0.462 0.468 UCFA T ( 1 ) 0.527 0.583 0.726 0.618
T ( 2 ) 0.444 0.484 0.600 0.423 T ( 2 ) 0.500 0.492 0.405 0.404
T ( 3 ) 0.488 0.481 0.483 0.563 T ( 3 ) 0.500 0.481 0.500 0.583
ICFA T ( 1 ) 0.556 0.533 0.436 0.456 ICFA T ( 1 ) 0.486 0.605 0.727 0.429
T ( 2 ) 0.449 0.426 0.444 0.386 T ( 2 ) 0.444 0.406 0.432 0.450
T ( 3 ) 0.400 0.500 0.419 0.556 T ( 3 ) 0.458 0.481 0.412 0.609
Table 18. The accuracy optimization, recall rate optimization, precision optimization, and F 1 value optimization of the experimental group compared to the control group (calculated as the average value of each group).
Class 1—Average optimization degree
Φ a c Φ r e Φ p r Φ F 1
PRA UCFA T ( 1 ) 15.74%17.79%19.34%19.84%
T ( 2 ) 32.55%32.62%30.25%31.62%
T ( 3 ) 46.17%46.17%27.28%38.41%
PRA ICFA T ( 1 ) 15.03%15.03%27.96%21.25%
T ( 2 ) 44.88%44.88%33.31%40.06%
T ( 3 ) 48.25%48.27%35.32%42.61%
Class 2—Average optimization degree
Φ a c Φ r e Φ p r Φ F 1
PRA UCFA T ( 1 ) 18.57%18.55%7.71%14.02%
T ( 2 ) 16.50%16.54%20.97%18.66%
T ( 3 ) 37.08%37.11%22.51%30.70%
PRA ICFA T ( 1 ) 22.09%22.06%19.69%21.09%
T ( 2 ) 23.44%23.43%18.99%21.88%
T ( 3 ) 38.75%38.77%28.40%34.24%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
