Factor Implicit Model of Machine Classification Learning Based on Factor Space Theory

Abstract: The algorithm proposed in this paper can solve not only the binary classification problem but also the multi-classification problem. The combination of categories is defined on the basis of the sweeping serial classification algorithm, and the merging serial classification method of factors, both explicit and implicit, is proposed. The algorithm steps are given, and a numerical example is used for analysis. The results show that the proposed combined serial scanning classification method achieves factor concealment and is feasible and practical. The conclusions of this research on factor concealment in multi-classification learning expand the theory and applications of factor space.


Introduction
In 1982, the Chinese scholar Wang Peizhuang [1] first proposed the concept of factor space. Factor space theory has become an important theoretical basis for deeply analyzing the causal relationships among things and an indispensable mathematical foundation for mechanistic artificial intelligence, providing an important framework for the generation of concepts, mathematical reasoning, and the distinction and judgment of objects.

With the continuous innovation of science and technology and the rapid development of computer network technology in recent years, machine learning [2][3][4][5][6][7] has become an increasingly important part of the field of artificial intelligence and a key research direction of data science. Classification, a main task of machine learning, is served by increasingly rich and accurate algorithms, among which there are many binary classification algorithms [8]. Finding an algorithm that solves the binary classification problem more accurately is an important research direction in the field of artificial intelligence.

The idea of factor concealment in factor space theory can be used to solve the classification problem of machine learning. In this paper, a classification algorithm is proposed on the basis of the factor implicit theory; a factor implicit model is constructed with this algorithm and then used for test classification. The feasibility of this algorithm in finding the direction of the key implicit factor is studied, so that factor space theory can continue to develop into its next stage and better solve classification and discrimination problems in practice. Specifically, a scanning serial classification algorithm is proposed and a factor implicit model is constructed. The results show that the algorithm is feasible and effective.

Scanning Serial Classification Algorithm
Scanning Direction

Definition 1. Given a separable two-class data set, the centers of the two classes are their respective means. When people sort objects, they sweep from one class center to the other, and what they notice in their overall vision is the sweeping direction, which is also the sweeping vector.

Definition 2. Take a class vector w. Let a+ = min{(x_j+, w) | j = 1, ..., l} over the positive class and b− = max{(x_i−, w) | i = 1, ..., k} over the negative class, and let J = a+ − b−, which is called the interval between the two classes with respect to the direction w. If the interval is positive, w is called an explicit factor of the classification, and J* = (a+ + b−)/2 is called the partition wall: the two categories are separated according to the projections of the data on w. If the interval is not positive, so that a+ ≤ b−, then the closed interval [a+, b−] is the mixed domain of the projections of the two-class data in the direction w. The sample points whose projections fall in the mixed domain are called mixed points, and h(w) denotes the number of mixed points in the direction w.
The projection mixed domain is empty if and only if the interval is positive. Given classification sample point sets X− = {x_i− | i = 1, ..., k > 0} and X+ = {x_j+ | j = 1, ..., l > 0}, if the convex hulls they generate in the factor space do not intersect, the pair is called a separable two-class data set.
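Definition 2 can be expressed directly in code. The following is a minimal sketch, assuming the data are given as lists of coordinate tuples; the names `dot` and `interval_along` are illustrative, not from the paper.

```python
# Minimal sketch of Definition 2; helper names are illustrative.
def dot(u, v):
    """Inner product of two equal-length coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))

def interval_along(w, X_neg, X_pos):
    """Project both classes onto w and return (a_plus, b_minus, J).

    a_plus  = minimum projection of the positive class
    b_minus = maximum projection of the negative class
    J = a_plus - b_minus is the interval; J > 0 means w is an
    explicit factor with partition wall (a_plus + b_minus) / 2,
    otherwise [a_plus, b_minus] is the mixed domain.
    """
    a_plus = min(dot(x, w) for x in X_pos)
    b_minus = max(dot(x, w) for x in X_neg)
    return a_plus, b_minus, a_plus - b_minus
```

With J > 0, the partition wall (a_plus + b_minus)/2 separates the two projected classes.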
The sweeping class serial classification algorithm, given a separable two-class data set X− = {x_i− | i = 1, ..., k > 0} and X+ = {x_j+ | j = 1, ..., l > 0}, improves the sweeping class vector step by step from the initial vector w_0 to the explicit and implicit vector w_T.
Step 0: Set t := 0 and select the sweeping class vector w_0 as the initial vector.
Step 1: Project the sample point sets onto w_t and find a_t+ and b_t−. If the mixed domain is empty (the interval is positive), the solution is obtained; stop and output: "the explicit factor is w_t, the partition wall is J*_t". Otherwise, record the mixed domain [a_t+, b_t−] of the projections on w_t, delete all non-mixed points, and update the two class data sets. Set t := t + 1 and go back to Step 1, repeating the process until the mixed domain is empty.

Example 1. For a given two-class data set, find the serial scanning class vector, namely the explicit and implicit vector.
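The steps above can be sketched as follows, assuming (as in the worked example below) that each sweep vector is taken as the difference of the two class centers; `sweep_train` and its output format are illustrative names, not from the paper.

```python
# Sketch of the sweeping serial classification algorithm. Assumption:
# the sweep vector w_t is the difference of the class centers, as in
# the paper's worked example; names are illustrative.
def center(points):
    """Mean of a list of coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sweep_train(X_neg, X_pos, max_iter=100):
    """Return the list of (w_t, a_plus, b_minus) criteria tried, plus the
    partition wall J* of the final explicit factor, for separable data."""
    criteria = []
    for _ in range(max_iter):
        # Sweep vector: from the negative center toward the positive center.
        w = tuple(cp - cn for cp, cn in zip(center(X_pos), center(X_neg)))
        a_plus = min(dot(x, w) for x in X_pos)
        b_minus = max(dot(x, w) for x in X_neg)
        criteria.append((w, a_plus, b_minus))
        if a_plus - b_minus > 0:           # positive interval: explicit factor
            return criteria, (a_plus + b_minus) / 2   # partition wall J*
        # Keep only the mixed points (projections inside [a_plus, b_minus]).
        X_neg = [x for x in X_neg if dot(x, w) >= a_plus]
        X_pos = [x for x in X_pos if dot(x, w) <= b_minus]
    raise RuntimeError("no explicit factor found within max_iter sweeps")
```

Each iteration shrinks the data sets to their mixed points, so the sequence of criteria records exactly the projections a later test point must be checked against.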
Solution. First, find the negative class points falling to the left of the mixed domain. The left endpoint of the mixed domain is a+, with projection −3.8. Since the projections of x_1−, x_2−, and x_3− are all less than −3.8, these points must be judged as negative class points, not mixed points. The right endpoint of the mixed domain is b−, with projection 0. Since (x_3+, w) = 7.6, (x_4+, w) = 18.8, and (x_5+, w) = 33 are all greater than 0, x_3+, x_4+, and x_5+ must be judged as positive class points, not mixed points. Removing the points with a clear category leaves the collection of mixed points as the new two-class data set. The centers of the positive and negative categories are found, the sweep vector w_1 is computed from them, and the interval is recalculated. Because the interval of w_1 is not positive, w_1 is still not an explicit factor; its projection mixed domain is [0, 0].
Continuing this process with the remaining mixed-point data sets X_2− and X_2+, the sweep vector w_2 is found. The projection interval is J_2 = a_2+ − b_2− = 1 > 0, so w_2 is an explicit factor of X_2− and X_2+, with partition wall J*_2 = 0.5.

Test Steps

After the explicit and implicit factor is found and the partition wall is calculated, the explicit and implicit model is established to classify test points. The steps are as follows.
Step 1: Input the test data z = (z_1, ..., z_n).
Step 2: Starting from t = 0, compute the projection (z, w_t) and determine whether it falls in the mixed domain [a_t+, b_t−].
Step 3: If (z, w_t) is not in the mixed domain, then z is a non-mixed point for w_t and its category is determined by its projection: to the left of the mixed domain it is judged negative, and to the right it is judged positive.
Step 4: If (z, w_t) is in the mixed domain, take all mixed points as the new two-class data sets, set t := t + 1, and go back to Step 2, until t = T. Since w_T has a partition wall J*, the category of z is determined by the sign of (z, w_T) − J*. Obtain the solution, then stop and output.

Example 2. Take the two-class data set given in Example 1, input the test points z = (2.5, 1.5) and z' = (2.5, −1.5), and identify their categories.
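The test steps can be sketched as follows, assuming the training stage stored the sequence of criteria as (w_t, a_t+, b_t−) triples together with the final partition wall J*; the names are illustrative, not from the paper.

```python
# Sketch of the test procedure. Assumption: `criteria` is the list of
# (w_t, a_plus, b_minus) triples from training and `wall` is the final
# partition wall J*; names are illustrative.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sweep_classify(z, criteria, wall):
    """Return +1 (positive class) or -1 (negative class) for test point z."""
    for t, (w, a_plus, b_minus) in enumerate(criteria):
        p = dot(z, w)
        if t == len(criteria) - 1:
            # Final explicit factor: decide by the sign of (z, w_T) - J*.
            return 1 if p - wall > 0 else -1
        if p < a_plus:        # left of the mixed domain
            return -1
        if p > b_minus:       # right of the mixed domain
            return 1
        # Otherwise z is a mixed point for w_t: fall through to w_{t+1}.
```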
Solution. Example 1 gives the initial sweeping class vector w_0 = (2.8, 3.8) for the given two-class sample point sets and shows that it is not an explicit factor of the given data set, but has the projection mixed domain [−3.8, 0]. From this, compute the projection (z, w_0) to see whether z has a definite category. Since (z, w_0) = 1.3 falls in the mixed domain of w_0, there is no clear category, so go further down.
From the solution of Example 1, the new data set and the sweep vector w_1 are obtained, and the projection mixed domain is [0, 0]. Since (z, w_1) = 2.5 lies to the right of the mixed domain, z is judged to belong to the positive class.
Similarly, z' is judged to belong to the negative class.

Merging the Sweeping Class Algorithm
Let the data be labeled with K classes, so that the whole training data set X can be written as X = X_1 + ... + X_K; here, the sign "+" represents the union of disjoint sets. Combine the last K − 1 categories into one class, written X_1' = X_2 + ... + X_K, so that X = X_1 + X_1'. Using the sweeping class algorithm SL(X_1, X_1'), obtain the classification criterion separating X_1 and X_1'. Enter a test point z. If it is judged to belong to X_1, its class is determined and the process stops. If z belongs to X_1', then combine the last K − 2 categories into one class, written X_2' = X_3 + ... + X_K, so that X_1' = X_2 + X_2'; using the sweeping class algorithm SL(X_2, X_2'), obtain the classification criterion separating X_2 and X_2'. If z is judged to belong to X_2, its class is determined and the process stops. If z belongs to X_2', then combine the last K − 3 categories into one class, X_3' = X_4 + ... + X_K, and so on, until the last criterion separates X_{K−1} from X_K.
Step 5: Sort out the sequence of criteria.
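The merging serial scheme can be sketched as follows, assuming a binary routine SL(neg, pos) that returns a predicate deciding whether a point falls on the positive side (for instance, one built from the scanning serial algorithm); all names are illustrative, not from the paper.

```python
# Sketch of the merging serial (chained one-vs-rest) multi-class scheme.
# Assumption: SL(neg, pos) is any binary trainer returning a predicate
# for the positive side; names are illustrative.
def merge_serial_train(classes, SL):
    """classes: list [X_1, ..., X_K] of point lists.
    Returns a classifier mapping a point z to a 1-based class index."""
    # criteria[i] separates X_{i+1} from X_{i+1}' = X_{i+2} + ... + X_K.
    criteria = []
    for i in range(len(classes) - 1):
        rest = [p for c in classes[i + 1:] for p in c]
        criteria.append(SL(rest, classes[i]))

    def classify(z):
        for i, is_class_i in enumerate(criteria):
            if is_class_i(z):
                return i + 1          # z judged to belong to X_{i+1}: stop
        return len(classes)           # all criteria rejected: only X_K remains
    return classify
```

The sequence of criteria is exactly the sorted output of Step 5: each test point walks down the chain until some criterion claims it.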