Adapting Fleming-Type Learning Style Classifications to Deaf Student Behavior

Abstract: This study presents the development of a novel integrated data fusion and assimilation technique to classify learning experiences and patterns among deaf students using Fleming's model together with Thai Sign Language. Data were collected from students with hearing disabilities (Grades 7–9) studying at special schools in Khon Kaen and Udon Thani, Thailand. Six classification algorithms were applied; the data were resynthesized and improved through feature selection, and class imbalance was corrected using the synthetic minority oversampling technique. The models were evaluated using 10-fold cross-validation, which showed that the multi-layer perceptron algorithm yields the highest accuracy. These research results are intended for application in further studies involving imbalanced data problems.


Introduction
Education is a vital social catalyst for improving quality of life, and everyone (including those with physical, emotional, or intellectual disabilities) should expect equal access to education. The Royal Thai Government, in support of equal rights and educational opportunities for its people, established the National Education Act, B.E. 2542 (1999). That Act states: "People with physical, mental, intellectual, emotional, social, communication, and learning deficiencies; those with physical disabilities; the crippled; those unable to support themselves; or those destitute or disadvantaged; will have the right and opportunities to receive basic education specifically provided" [1].
The educational provision of special classes for deaf individuals needs to be undertaken carefully. As a starting point, the nature and style of learning should be determined to allow for the enhancement of student skills and the development of appropriate instruction. Therefore, teachers must understand the diversity and learning styles of special learners so as to utilize the right teaching tools and help students learn effectively. For this purpose, digital academic data can be obtained and converted using data mining techniques that elicit significant knowledge from the data [2]. Once mined, the data can be classified into learning style types that improve learning methods. The application of data mining to educational data is referred to as educational data mining (EDM) [3,4]. Some technological approaches, such as e-learning, computer-assisted instruction, the World Wide Web, and multimedia, have also been adopted for classroom use [5,6]. Technology is applied to create learning media for deaf students that are informed by and meet their educational needs.
This study aims to analyze the main factors affecting learning styles by comparing learning type classifications via data synthesis and resolving data imbalance. These efforts resulted in the identification of factors influencing learning styles and the techniques that yielded the highest accuracy from each classification. When learning styles matched needs, the appropriate instructional method was provided to that individual student [7], allowing for improved academic achievement.

Analysis of Learning Factors
Much educational research has explored the extent to which learner variables can predict learning outcomes or future learning behaviors [17]. The investigation of learning styles has significantly enhanced our understanding of how adult learners acquire knowledge via an online curriculum [18]. Electronic catalogs have been designed and built based on the purchasing habits of consumers [19], utilizing VARK to analyze similar information such as gender, age, race, and experience in using both a computer and the internet. Several educational studies have adopted various factors to evaluate the VARK learning styles of first-year students in order to meet various academic demands [20]. Any educational system must be able to support different content and instructional media for all potential learners, thereby providing the most effective learning possible [10]. Such systems must be able to check the correlation between learning style and learning efficacy via problem-solving games and effective teaching strategies [21]. The analysis of gender, age, and education level assists in the determination of learning style [10,20,21]. Questionnaires were provided to elicit the level of satisfaction with VARK learning styles, allowing learners to overcome weaknesses, select their preferred learning style, and improve their academic outcomes [22]. Additionally, other factors such as GPA [10,22], fields of study [21,23], and the student's hometown [20,22] were studied for their effects on selected learning styles. All of these factors affect learning, and understanding and adapting to them results in better learning. These factors were therefore selected for analysis in this research.

Learning Style
Inconsistencies between teaching approaches and student learning styles may result in negative consequences, such as a lack of attention and interaction, and an inability to pass tests and classes, which can eventually result in the student leaving the school. Therefore, educators need to balance their teaching methods with a learning model that responds to a student's needs [24]. Learning style is the personal style that an individual uses in a learning task [25].
Individual learners have different learning styles and evaluate data and information through their perceptions in different ways. A study of engineering students [26] identified four types of learners: the diverger, the assimilator, the converger, and the accommodator [27,28]. These classifications were then extended to the investigation of learning styles in other fields, such as science. These learning styles were divided into five dimensions, each with two opposing poles: (1) sensing/intuitive; (2) visual/verbal; (3) inductive/deductive; (4) active/reflective; and (5) sequential/global [29]. Fleming [30] further proposed a sensory model developed from Eicher's work [31], called VARK: visual (V), aural (A), read/write (R), and kinesthetic (K) learning styles. The VARK model is defined as an individual's way of collecting, managing, and organizing ideas and data through the senses, and it is included in pedagogy due to its relationship with data reception and contribution [32]. Fleming's VARK learning styles are based on four sensory perceptions: visual, learning through seeing; aural, learning through listening and discussing, including music and lectures; read/write, learning through reading and writing; and kinesthetic, learning through body movement, expressing feelings, and touching [30,33,34]. The VARK learning styles have been adopted and applied in other fields, such as business, programming, nursing, online learning, and physiology, and have also been used to check the acceptance of different educational technologies by learners [19,35–40].

Learning for the Deaf
The most common form of communication is verbal expression and normal language [41]. People with a hearing disability can find it difficult to interact verbally with their peers and must be attentive to the sounds around them [42]. Around the world, many deaf and hard-of-hearing people benefit from offline subtitles (e.g., for pre-recorded television programming) or real-time captioning services (e.g., in classrooms, meetings, and at live events) [43]. Although deaf people are unable to hear, their other senses, such as sight, assist them in 'listening' to others. Facial expressions during speech assist the listening process, in particular lip reading [44]. Where hearing individuals can process errors and make corrections to what they hear, lip readers hone their skills for the same purpose. Developing the ability to understand someone speaking using lip reading, as well as learning to 'sign' language, is particularly difficult [45], and, obviously, sign language is only useful for communication between knowledgeable participants.
Sign language within the deaf community consists of body movements, facial expressions, hand positions, and body posture [46–48], which are expressed in three dimensions [49]. Sign languages range from individual ad hoc contexts to official, national sign languages [50]. Thai Sign Language is widely broadcast on several television networks offering content translation, and this exposure further serves to raise the profile and awareness of sign language [51]. Deaf children communicate using sign language and by reading lips.
Because reading plays a significant role in learning [52], deaf individuals must memorize characters rather than sounds. In writing, a deaf student typically writes simple sentences with a fixed structure and no transition and is unable to connect complex sentences [53]. In addition, mistakes are likely [54]. For these reasons, specific learning styles for deaf students are needed. It is very important that teaching and learning is based on the preferences and unique needs of deaf students.

Data Analysis Techniques
Data analysis techniques used to search for patterns, hidden relationships, or existing rules in large data sets are referred to as data mining [55]. Identified patterns may point to particular meanings that can be utilized for certain purposes [56–58]. In this study, the process involved finding patterns and relationships hidden in the deaf student data set. The integrated analysis applied in this study was based on basic data relating to deaf students, learning style information, and algorithms. The algorithm for classifying learning styles helped to build models in response to receiving new information. The various algorithms used to classify the data [59], including decision tree, random forest, Bayesian network, naïve Bayes, k-nearest neighbor (KNN), and multi-layer perceptron, are described in the following subsections.

Decision Tree
Developed from the ID3 algorithm and named after a hierarchical model visually formed in the shape of a tree, decision tree (C4.5) has become the standardized comparative tool for learning algorithms [2]. It represents a learning process, in which data are classified based on their attributes [60]. The algorithm splits observations into multiple branches, also referred to as subsets, based on a decision node with a given criterion [61].
Attributes must be selected as root nodes for this algorithm. A gain criterion is used to select the best attribute. Using information gain reduces the number of classification tests, making the tree less complex. The information gain equation is as follows:

$$I(s_1, s_2, \ldots, s_n) = -\sum_{i=1}^{n} \frac{s_i}{s} \log_2 \frac{s_i}{s} \qquad (1)$$

where s refers to the number of records in the data set; n refers to the total number of different groups in the data set; C_i refers to the i-th group, where i = 1, ..., n; and s_i refers to the number of records in s that belong to group C_i.

Random Forest
Random forest, introduced in 2001 [62], is built by constructing several models using decision trees and randomized variables. Random forest is an effective and well-established method for generating multiple classification and regression trees, based on bagging and random subspaces [63]. The outcomes of the individual trees are combined, and the most frequent prediction is selected as the final outcome. The advantages of this method are forecast precision, user-friendliness, overall effectiveness, and the ability to work with unseen information (as it is less prone to overfitting than other classifiers [64]). Its ease of use derives from it requiring only two parameters: the number of variables in the random subset at each node and the number of trees [65].

Bayesian Network
The Bayesian network classification method, developed from Bayes' theorem [66], is conducted by plotting a probability graph (known as a Bayes net) that reduces the limitations introduced by naïve Bayes (see Section 3.1.4). The Bayesian network is thus used to describe the conditional independence between variables. To enhance learning effectiveness, previous knowledge should be input into the Bayesian belief network (Equation (3)) in the form of a network structure that includes the conditional probability tables. For example, X is conditionally independent of Y (which means the probability of X does not depend on Y) when Z is known; this is written in the following equation:

$$P(X \mid Y, Z) = P(X \mid Z) \qquad (2)$$

In a Bayesian network, each variable has a probability that may derive from a root node or from its relationship with one or more other nodes. A probability involving more than one variable, referred to as a joint probability, is expressed using the following equation:

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\big(x_i \mid \mathrm{Parents}(x_i)\big) \qquad (3)$$

where x_i refers to the variable under consideration and Parents(x_i) refers to the direct parents of x_i.
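The factorization can be illustrated with a minimal sketch, assuming a hypothetical three-variable network in which Z is the only parent of both X and Y. The variable names and probability values are illustrative assumptions and are not taken from the study data.

```python
from itertools import product

# Hypothetical network (assumption): Z is the only parent of X and of Y.
# All probabilities below are illustrative, not values from the study.
p_z = {True: 0.3, False: 0.7}
p_x_given_z = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}  # [z][x]
p_y_given_z = {True: {True: 0.6, False: 0.4}, False: {True: 0.5, False: 0.5}}  # [z][y]

def joint(x, y, z):
    """P(x, y, z) as the product of each variable given its direct parents."""
    return p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]

def conditional_x(x, y, z):
    """P(x | y, z), computed from the joint distribution."""
    denom = sum(joint(xv, y, z) for xv in (True, False))
    return joint(x, y, z) / denom

# The joint probabilities over all eight assignments should sum to 1.
total = sum(joint(x, y, z) for x, y, z in product((True, False), repeat=3))
```

Because X depends only on Z in this sketch, `conditional_x` returns the same value whatever Y is, which is the conditional independence statement in code form.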

Naïve Bayes
Naïve Bayes classification uses statistical probability to forecast membership according to Bayes' theorem [2]. Bayes enhances the learning model by adding training sets, which accounts for its popularity and widespread use [67]. The algorithm is uncomplicated, learns quickly, and can manage a variety of features or classes within a variety of cases [68]. However, the outcomes are effective only when the data features selected are independent of each other, as written in Equation (4):

$$P(C \mid A) = \frac{P(A \mid C)\, P(C)}{P(A)} \qquad (4)$$

where P(C) refers to the prior probability of incident C; P(A) refers to the prior probability of incident A; P(C|A) refers to the probability of incident C when incident A occurs; and P(A|C) refers to the probability of incident A when incident C occurs.
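A minimal sketch of how Bayes' theorem is applied under the independence assumption is given below. The toy records, the feature names, and the use of Laplace (add-one) smoothing are assumptions made for illustration; the study does not state which smoothing, if any, was used.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
    return class_counts, value_counts

def posterior(row, class_counts, value_counts, n_values):
    """Return P(C | A) per class, assuming features are independent given C."""
    total = sum(class_counts.values())
    scores = {}
    for c, count in class_counts.items():
        score = count / total  # prior P(C)
        for j, value in enumerate(row):
            # Laplace-smoothed likelihood P(A_j | C) (smoothing is an assumption)
            score *= (value_counts[(j, c)][value] + 1) / (count + n_values[j])
        scores[c] = score
    evidence = sum(scores.values())  # normalizing constant, the role of P(A)
    return {c: s / evidence for c, s in scores.items()}

# Hypothetical toy data: (gender, grade) -> learning style label
rows = [("F", "7"), ("F", "8"), ("M", "7"), ("M", "9"), ("F", "9"), ("M", "8")]
labels = ["V", "V", "K", "K", "V", "K"]
class_counts, value_counts = train(rows, labels)
probs = posterior(("F", "7"), class_counts, value_counts, n_values=[2, 3])
```

The predicted class is the one with the largest posterior, for example `max(probs, key=probs.get)`.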

K-Nearest Neighbor
KNN [69] is considered one of the ten most suitable algorithms for data classification due to its simple and effective process [59]. KNN adopts the "lazy learner" concept: no classifier is built from the data set in advance; instead, the algorithm stores the data, waits for a query, and only then starts its analysis. The principle of KNN is similar to that of data clustering: the distance between the data point to be predicted and the nearby data points is measured, and the k nearest points are retained. This is a widely used fundamental classification method [70], in which the predicted outcome is determined by the classes of the k nearest neighbors. The distance is measured in the Euclidean style, as the square root of the sum of the squared attribute differences, as shown below:

$$d(X, Y) = \sqrt{\sum_{i=1}^{L} (X_i - Y_i)^2} \qquad (5)$$

where X_1 refers to the first attribute of data point 1 and Y_1 refers to the first attribute of data point 2, in which both data points (X and Y) contain L attributes.
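The distance measurement and the neighbor vote can be sketched as follows. This is an illustrative sketch only: the sample points, the labels, and the default k = 3 are assumptions, and ties are resolved by whichever class `Counter` encounters first.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Square root of the sum of squared attribute differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train_points, train_labels, query, k=3):
    """Label the query with the majority class among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-attribute points with illustrative labels
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9)]
labels = ["V", "V", "V", "K", "K"]
```

For example, `knn_predict(points, labels, (1.1, 1.0))` compares the query with all five stored points at prediction time, which is what makes KNN a lazy learner.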

Multi-Layer Perceptron
The multi-layer perceptron, a type of artificial neural network (ANN), is a data-processing model whose origins lie in the simple ANN of "McCulloch-Pitts" neurons developed in 1943 by McCulloch and Pitts [71]. The concept, inspired by the bioelectric network of neurons and synapses in the human brain, processes data using calculations in a network where several subprocessors work together. Its multi-layer structure is effective in solving complex tasks using supervised learning with backpropagation. The architecture sends data from the input layer to the hidden and output layers, and the errors then flow backwards to adjust every hidden layer, thereby improving the operation.

Sustainability 2022, 14, 4799

Dividing Data to Test the Efficiency of the Classification Model
Cross-validation [72,73] (using self-consistency testing, split testing, and k-fold cross-validation) plays a significant role in measuring the efficiency of a forecasting model. The data are divided into learning and testing data sets for classification. If the cross-validation method is not selected carefully at this stage, the classification outcomes may contain errors. Therefore, k-fold cross-validation was chosen for model performance testing, as it is widely used and provides reliable results [74–76]. Testing a model's efficiency using cross-validation is achieved by dividing the data into k groups (where k is an integer from 2 to n, the number of records), in which each group contains approximately the same amount of data. A single group is then selected as the test group, and the remaining groups become learning groups. The test group is rotated over k rounds until every group has served as the test group once and all data have been classified. The most common choice, 10-fold cross-validation, provides good results but takes some time.
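The rotation of the test group can be sketched as below. The function name `k_fold_indices` is hypothetical, and the sketch assumes records are split in their original order without shuffling or stratification, which the text does not specify. The 82 records match the number of participants reported in the methodology.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists; each record is tested exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    indices = list(range(n))
    # Spread the remainder so fold sizes differ by at most one record.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 82 records split into 10 folds (no shuffling; an assumption of this sketch)
folds = list(k_fold_indices(82, 10))
```

With 82 records the folds cannot be exactly equal, so two folds hold nine records and the remaining eight hold eight records each.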

Information Gain
Feature selection is a useful way to reduce feature space dimensions. By developing data collection and machine learning techniques, feature selection plays an important role in data mining and machine learning. Feature selection can not only extract significant impact factors but can also improve accuracy [77].
Feature selection is conducted by calculating and ordering the weights correlated between features and classes using the following equation:

$$\mathrm{IG}(\mathit{parent}, \mathit{child}) = \mathrm{Entropy}(\mathit{parent}) - \big[p(c_1)\,\mathrm{Entropy}(c_1) + p(c_2)\,\mathrm{Entropy}(c_2) + \cdots\big] \qquad (6)$$

where Entropy(c_1) = −p(c_1) log p(c_1) and p(c_1) refers to the probability of c_1. According to Equation (6), information gain relies on measures of differences and dispersions of data, known as entropy. If the data are highly mixed, the entropy will be high; if the data are very similar, the entropy will be low. More details of entropy can be found in [78].
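A minimal sketch of ranking features by information gain is shown below. It uses the full Shannon entropy (base 2) of each subset, which is an assumption about how the per-class terms are combined; the feature values and labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the parent minus the weighted entropy of the children."""
    total = len(labels)
    children = {}
    for value, label in zip(feature_values, labels):
        children.setdefault(value, []).append(label)
    weighted = sum(len(ch) / total * entropy(ch) for ch in children.values())
    return entropy(labels) - weighted

# Hypothetical toy data: one informative and one uninformative feature
labels = ["V", "V", "K", "K"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
```

A feature that separates the classes perfectly recovers all of the parent entropy, whereas a feature unrelated to the class yields a gain of zero; sorting features by this value gives the ordering used for selection.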

Synthetic Minority Oversampling Technique
The synthetic minority oversampling technique (SMOTE) [79] generates a specified number of synthetic samples from randomly selected minority-class data points using KNN. For each selected point, its k nearest minority-class neighbors are found by calculating the distance between feature vectors, and a synthetic sample is built in the region between the point and one of those neighbors. The differences are multiplied by random numbers between 0 and 1 and added to the features of the starting point:

$$N_{point} = O_{point} + \mathrm{Random}[0, 1] \times \mathrm{distance}(x, y, \ldots, z) \qquad (7)$$

where N_point refers to a newly developed data point of the minority class; O_point refers to a data point of the minority class as a starting point of the distance compared with the neighboring point; Random[0, 1] refers to a random number between 0 and 1; and distance(x, y, ..., z) refers to the distance between the starting point and the neighboring point from attributes x and y to z.
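The interpolation step can be sketched in a few lines. This is a simplified illustration, not the reference SMOTE implementation: the function name `smote_samples`, the default k = 2, the fixed seed, and the sample points are all assumptions.

```python
import math
import random

def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating towards minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        origin = rng.choice(minority)
        # k nearest minority neighbors of the origin (excluding the origin itself)
        neighbors = sorted((p for p in minority if p is not origin),
                           key=lambda p: math.dist(p, origin))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # random number in [0, 1)
        # New point = origin + gap * (neighbor - origin), attribute by attribute
        synthetic.append(tuple(o + gap * (nb - o) for o, nb in zip(origin, neighbor)))
    return synthetic

# Hypothetical minority-class points
minority = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (2.0, 1.5)]
new_points = smote_samples(minority, n_new=5)
```

Each synthetic point lies on the line segment between a minority point and one of its nearest minority neighbors, so the new samples stay inside the region already occupied by the minority class.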

Index of Item-Objective Congruence
Item-objective congruence (IOC) is an innovative procedure that is used to evaluate content validity, objective items, and questions [80]. Analysis of the index of items evaluates research tools, with data being collected via questionnaires. The objectives are presented to three to five experts specializing in data assessment, to consider whether the tools are consistent with the objectives. Content validity, language correctness, and question clarity are checked and the IOC determined.
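As a rough illustration, the index for one item is commonly computed as the mean of the expert ratings, where each expert scores the item +1 (congruent), 0 (unsure), or −1 (not congruent). This scoring scheme is an assumption based on common practice and is not spelled out in the text; the item names and ratings below are hypothetical.

```python
def ioc(ratings):
    """Mean of expert ratings, each +1 (congruent), 0 (unsure), or -1 (not congruent)."""
    if not ratings or any(r not in (-1, 0, 1) for r in ratings):
        raise ValueError("ratings must be a non-empty list of -1, 0, or +1")
    return sum(ratings) / len(ratings)

# Hypothetical ratings from five experts for three questionnaire items
items = {
    "item_1": [1, 1, 1, 1, 1],
    "item_2": [1, 1, 0, 1, 0],
    "item_3": [1, 0, -1, 1, 0],
}
scores = {name: ioc(r) for name, r in items.items()}
```

Items with a low index would be revised or dropped before the questionnaire is used; the cut-off value is a design decision for the research team.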

Research Methodology
This research created a model for classifying the learning styles of deaf students who attended the Udon Thani and Khon Kaen schools for the deaf and who could communicate in Thai Sign Language. The objective was to identify a learning style suitable for each student. A limitation was that these were students from the first group of schools to initiate the teaching of deaf students. This was therefore a niche group, which may have resulted in a smaller population. The Udon Thani and Khon Kaen schools are special schools for deaf students and were selected because they have the largest number of students in the northeastern region of Thailand. The research was endorsed by the Office of the Khon Kaen University Ethics Committee in Human Research to run from 3 February 2021 to 20 January 2023. The hypothesis was that deaf students were able to determine their appropriate learning style from the relevant factors.
Although Fleming's learning style model has been widely used to classify learning styles, it has never been applied to classify the learning styles of deaf students or to Thai Sign Language (TSL) as a communication language for the deaf. This hybrid learning style for the deaf is VRK + TSL. The research work plan of the VRK + TSL Rule Model is shown in Figure 1.

Work Plan of the VRK + TSL Rule Model
Analysis of the VRK + TSL learning pattern classification was categorized into three parts:

1. The related predictor of VRK + TSL learning (Figure 2)
Exact predictors for VRK + TSL learning have not yet been determined. Several literature sources have concluded that "bunches" of data should be employed (see Section 2.2) and we adapted these factors [81] to analyze deaf students. Selection of the most appropriate predictor can be achieved using the following steps:

• A questionnaire is developed to determine the predictor for the VRK + TSL learning style, which investigates the factors that affect learning among the deaf. A Likert scale is then used to evaluate the results [82].
• The questionnaires are analyzed by five experts.
• The expert comments are gathered and averaged scores greater than 3.50 are used to construct the appropriate learning pattern.

2. VRK + TSL learning pattern questionnaires for deaf students (Figure 3)
A questionnaire was constructed based on the VRK + TSL pattern, combining the factors used for analyzing VRK + TSL learning (Step 1) and employing the VARK developed by Neil Fleming. The questionnaire consisted of 16 items with four choices representing four aptitudes: visual, read/write, kinesthetic, and Thai Sign Language, adapted to include VRK + TSL content. The time needed to complete the questionnaire was approximately 30 min. The factors affecting learning among deaf students were analyzed by five experts. Data were collected and analyzed to rate the questionnaires using basic statistics. Content validity, appropriateness of language, and question clarity were also reviewed. The questionnaires exhibited significant ratings (0.60–1.00), implying an index of IOC greater than 0.70. The language was then clarified based on the suggestions of the experts, to ensure participant understanding.
Learning style questionnaires were incorporated into a TSL video together with documents for deaf students in Thai secondary schools. Under the supervision of the National Electronics and Computer Technology Center (NECTEC), a QR code was embedded in the documents to connect with the sign language video, as shown in Figure 4.
The study was reviewed and approved by the Office of the Khon Kaen University Ethics Committee in Human Research. The questionnaires were distributed to students in Grades 7–9 (13–17 years old) in two schools in the northeast of Thailand: School for the Deaf, Khon Kaen, and School for the Deaf, Udon Thani. The objectives and intended benefits of the research were explained to the 82 participants, who were also informed that there were no right or wrong answers and that their class scores would not be affected. The video was also played to aid in student understanding. The students answered the questions based solely on their preferences and were encouraged not to copy from each other.
Figure 5. The innovative procedure for classifying VRK + TSL learning styles.

This study used educational data to extract knowledge from data obtained via the questionnaire and to determine the learning styles of participants. The data were processed before being analyzed using the following steps:

• The questionnaires were collected from the deaf students at the special schools in Khon Kaen and Udon Thani, Thailand, as shown in Table 1.
• The data were then screened and any non-useful content was discarded.
• The data were classified into four learning groups based on the VRK + TSL model and converted into the format necessary for the next processing step.
• Data analysis was conducted using data mining via the decision tree, random forest, Bayesian network, naïve Bayes, multi-layer perceptron, and KNN algorithms. Feature selection was utilized to help select the right features in the analysis, and feature selection was used with SMOTE to solve data imbalance problems.
• After analysis, models were developed and assessed for optimum suitability and effectiveness.

Efficacy Measurement
To test the effectiveness of each model, accuracy, precision (PREC), recall (REC), and F-measure were taken into consideration, based on the confusion matrix [56].
True positive (TP) occurred when the prediction was positive and the actual class was positive.
True negative (TN) occurred when the prediction was negative and the actual class was negative.
False positive (FP) occurred when the prediction was positive but the actual class was negative (i.e., the reality did not refer to a certain object, but the prediction did refer to it).
False negative (FN) occurred when the prediction was negative but the actual class was positive (i.e., the reality referred to a certain object but the prediction did not).
Accuracy is calculated as the number of all correct predictions divided by the total number in the dataset. The best accuracy possible is 1.0 and the worst is 0.0, and it can be defined as

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (8)$$

PREC is calculated as the number of correct positive predictions divided by the total number of positive predictions. It is also called the positive predictive value. The best PREC is 1.0 and the worst is 0.0, and it can be defined as

$$\mathrm{PREC} = \frac{TP}{TP + FP} \qquad (9)$$

REC is calculated as the number of correct positive predictions divided by the total number of positives. It is also called the TP rate or sensitivity. The best REC is 1.0 and the worst is 0.0, and it can be defined as

$$\mathrm{REC} = \frac{TP}{TP + FN} \qquad (10)$$

F-measure is the harmonic mean of PREC and REC, calculated using the following equation:

$$F\text{-measure} = \frac{2 \times \mathrm{PREC} \times \mathrm{REC}}{\mathrm{PREC} + \mathrm{REC}} \qquad (11)$$
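The four measures can be computed directly from the confusion-matrix counts, as in the sketch below. The function name and the example counts are hypothetical; with four learning styles, these measures would be computed per class by treating one style as "positive" and the rest as "negative", which is an assumption about how the multi-class case was handled.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure

# Hypothetical counts for one class treated as "positive"
acc, prec, rec, f1 = classification_metrics(tp=40, tn=30, fp=10, fn=20)
```

Returning 0.0 for an undefined precision or recall is one common convention; other tools report the value as undefined instead.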

Results
Data obtained from students in Grades 7–9 at the two Schools for the Deaf were tested for effectiveness. The results were divided into 5-fold and 10-fold validations (Table 2). Figure 6 shows the division of the data into 5-fold and 10-fold validations together with their varying accuracies for the six algorithms.
The experiments were conducted in three patterns. In the first pattern, the primary data were evaluated for accuracy using the decision tree, random forest, Bayesian network, naïve Bayes, multi-layer perceptron, and KNN classifier algorithms. In the second pattern, the primary data were processed via the feature selection method to find the factors affecting learning patterns. Once the data were resynthesized, accuracy was determined again using the six comparative algorithms. In the third pattern, the primary data were processed via the feature selection method to find the factors affecting learning patterns, and SMOTE was then used to resolve the imbalanced data and resynthesize the data. Next, accuracy was tested again via the six algorithms.
The outcomes indicated that the accuracy of each algorithm increased when the input data were adjusted and the data were divided into 5-fold and 10-fold validations. Using both these methods, the accuracy value increased for every algorithm. Dividing the data 10-fold yielded greater accuracy than 5-fold. The highest accuracy was achieved using the multi-layer perceptron algorithm and the lowest was achieved using the naïve Bayes algorithm.

Discussion
This research investigated classification methods used to predict the learning styles of deaf students, using the decision tree, random forest, Bayesian network, naïve Bayes, multi-layer perceptron, and KNN algorithms together with feature selection and SMOTE. Their efficiency and accuracy were evaluated using various measurement tools. The highest accuracy was achieved by the multi-layer perceptron, random forest, and KNN algorithms when performed together with feature selection using information gain and with SMOTE to rectify imbalanced data [83]. Classifications were conducted to predict learning styles by the selection of the factors that affect learning, and SMOTE was employed to manage and resolve imbalanced data. Oversampling the minority class balanced it with the majority class, further enhancing the effectiveness of the classification in predicting the minority class. The data were also divided to evaluate model effectiveness, using both 5- and 10-fold validations. The data were divided into k groups, each containing approximately the same amount of data. One group was then selected as the test group, and the remaining groups acted as training groups. The test group was rotated over k rounds until every rotation was complete. The techniques mentioned above were used to resolve problems and resulted in better outcomes.

Conclusions
The aim of this research was to classify the learning patterns of deaf students based on their learning behaviors. The data classes were not balanced due to their differences in size. Therefore, to evaluate the effectiveness of each model, the data were divided into 5- and 10-fold validations and tested with each comparative algorithm. The algorithm with the highest accuracy was multi-layer perceptron with 60.9756% accuracy. Data accuracy was further improved via feature selection. The multi-layer perceptron algorithm with 5- and 10-fold classifications yielded accuracy rates of 70.7317% and 69.5122%, respectively. However, imbalanced data could still exist, as the classifications were conducted with both majority and minority classes simultaneously. The data properties from the majority class could overshadow those of the minority class, which would lead to reduced effectiveness of the minority class data. To solve this problem, SMOTE was used to balance the data from both classes via a random process in conjunction with feature selection. The resynthesized process resulted in improved outcomes: the 5-fold classification with multi-layer perceptron presented 71.7391% accuracy, and the 10-fold classification increased accuracy to 76.0870%.
The effectiveness of the decision tree, random forest, Bayesian network, naïve Bayes, multi-layer perceptron, and KNN algorithms was evaluated for the classification of learning patterns. The accuracy of each algorithm increased when feature selection and SMOTE were applied together. Multi-layer perceptron was the algorithm with the highest accuracy when performed in conjunction with feature selection and SMOTE. This study is the first to demonstrate that feature selection with SMOTE can improve the accuracy of an algorithm and solve imbalanced data problems.

• Further education should be based on the classification of learning styles for deaf students, focusing on areas where they have preferences, as being engaged and enjoying the educational process benefits future careers.
• Learning factors for deaf students could be further expanded, potentially leading to the discovery of other more important factors.
• The concept of the model used in this study could be applied to teaching, prediction, or instructional media for deaf students or other learners with special needs.
• The data analysis employed in this research was adapted specifically for deaf students and could be further applied to other groups with imbalanced data, for example, speech-impaired or visually impaired learners.
Finally, this research has allowed us to affirm that behavioral patterns in deaf students are very disparate; students can show different types of behavior that range from permanent involvement to virtual silence.
To maximize their potential, it would be necessary to design methodological strategies that promote active student participation and a collaborative approach to the construction of learning.
Acknowledgments: The authors thank the National Electronics and Computer Technology Center (NECTEC) for the production of the document that embedded the two-dimensional barcode (QR code) within the sign language videos, as well as the entire SQR (Sign Language QR Code for the Deaf) system. The authors would like to acknowledge Publication Clinic KKU, Thailand.

Conflicts of Interest:
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.