1. Introduction
DTs (Decision Trees) [1,2,3,4] and DRSs (Decision Rule Systems) [5,6,7,8,9] are common tools for structuring and expressing knowledge. They act as classifiers, providing predictions for unseen cases, and are also employed as algorithms in diverse domains such as fault diagnosis and combinatorial optimization. Among classification and knowledge representation models, DTs and DRSs stand out for their high level of interpretability [10,11]. Exploring the connections and transformations between DTs and DRSs has become an important focus of research within computer science. In this work, we continue to develop the syntactic approach to the study of this task proposed in [12,13]. This approach is based on the assumption that only the DRS is available to us, and not the input data. The results of previous studies in this area are summarized mainly in the book [14] and in the conference papers [15,16,17]. Earlier research has focused on the depth of DTs; in contrast, this paper investigates weighted depth as a measure of the time complexity of DTs.
In this work, we study the following realizability problem. Let a DRS be given. For a tuple of feature values, it is required to recognize whether this DRS contains a DR (Decision Rule) that is realizable for this tuple, i.e., a DR for which the left-hand side is true for this tuple. The tuples under consideration can contain values that do not appear in the DRS. This allows us to model the natural situation when the input tuple for the classifier may contain values that are not present in the training data. In addition, we do not allow the presence in the DRS of two DRs such that the set of conditions from the left-hand side of the first DR is a proper subset of the set of conditions from the left-hand side of the second DR; removing the second DR will not change the solution of the realizability problem.
When addressing the considered problem, it is important to note that in real-life situations we do not have direct access to the tuple of feature values. To determine the value of a feature, it is necessary to compute it for the given input, which can be an expensive procedure.
The complexity of finding feature values is determined by the weight function w, which assigns a positive integer weight to each feature present in the DRs from the DRS. The weight is interpreted as the complexity of finding the value of the feature. To minimize the total weight of computed feature values, we consider DDTs (Deterministic Decision Trees) and NDTs (Nondeterministic Decision Trees) solving the considered problem and study their weighted depth.
To clarify the possibilities and limitations of using DTs, we modify examples considered in the papers [15,17] and in Lemma 13.10 of the book [14]. First, we discuss a sequence of DRSs for which the minimum depth of DDTs grows logarithmically in the number of different features in the DRs from the DRS. In such a situation, the use of DTs seems appropriate. A sequence of DRSs is also considered for which the minimum number of vertices in DDTs grows exponentially with the sum of the lengths of the DRs from the DRS. This means that in the general case, instead of constructing the entire DT, its operation on a given tuple of feature values should be modeled using a sufficiently efficient algorithm.
This paper is devoted to the consideration of a new efficient algorithm for modeling the operation of a DDT solving the realizability problem. The weighted depth of this DT is bounded from below by the minimum weighted depth of an NDT and from above by the square of the minimum weighted depth of an NDT.
Similar upper bounds for the depth were obtained in [18,19,20] for Boolean functions (see [21] for details); in [14] for functions of k-valued logic; and in the paper [15] for the problem of finding all realizable DRs. To derive such upper bounds for each of the mentioned cases, an NDT with the minimum depth was considered. Based on this DT, for a given tuple of values of variables or features, the operation of a DDT satisfying the upper bound was described. It is important to note that this description of a DDT cannot be considered an efficient algorithm, since the NDT under consideration can have a huge number of vertices.
Note that this paper is a generalization of a previous conference paper [17]. In [17], we proposed an efficient algorithm for modeling the operation of DDTs that solve the realizability problem. The main contribution was showing that the depth of such DTs is bounded from below by the minimum depth of an NDT and from above by the square of the minimum depth of an NDT. In this paper, we extend that approach by considering the notion of weighted depth, which takes into account the complexity of finding feature values. This generalization better reflects real-world applications, where the complexity of finding feature values might vary.
In the present paper, when creating an algorithm for modeling the operation of a DDT for the realizability problem, it was possible to do without directly using an optimal NDT. As a result, the designed modeling algorithm has a polynomial complexity depending on the length of the description of the DRS.
The structure of the paper is as follows: in Section 2, we provide the key definitions and notation; Section 3 clarifies possibilities and limitations of using DTs; Section 4 considers the minimum weighted depth of NDTs; Section 5 is devoted to the analysis of an efficient algorithm for modeling the operation of a DDT; finally, Section 6 presents brief conclusions.
2. Definitions
This section introduces the notation and key definitions for DRSs and DTs.
2.1. DRSs—Decision Rule Systems
Let , , and . Denote and . Elements of the set will be called features.
A -DR is an expression of the form
where , are pairwise different features from , , and . We denote this DR by r. The number is interpreted as the decision of the DR r. The number t will be called the length of the DR r, denoted . We denote and .
A restricted -DRS S is a finite nonempty set of -DRs such that there are no DRs for which . We denote . We can describe the DRS S by a word in the alphabet such that the indexes of features, their values, and the decisions of DRs are in binary representations. The sign “;” is used to separate DRs. This word will be called the description of the DRS S.
A weight function for the DRS S is a map . The total weight of features in the DR r will be called the weight of the DR r and is denoted .
Let us fix a restricted -DRS S, with which we will work later. We assume for definiteness that and .
For , we denote . We will say that a DR from S is realizable for a tuple if .
For each tuple , we define the value in the following way: if there is a DR from S that is realizable for , then ; otherwise, . The problem of realizability is defined as follows: for a given tuple , it is required to find the value .
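The realizability problem and the weight of a DR can be illustrated with a short Python sketch. The encoding is an assumption made only for this illustration: a DR is a dictionary mapping a 0-based feature index to the value required by its left-hand side (the decision on the right-hand side is ignored, since it plays no role in realizability), a tuple of feature values is a Python sequence, and the weight function is a sequence `w` of positive integers. The names `Rule`, `rule_weight`, `is_realizable`, and `realizability` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed encoding: feature index -> value required by the left-hand side.
Rule = Dict[int, int]


def rule_weight(rule: Rule, w: Sequence[int]) -> int:
    """Total weight of the features occurring in the rule."""
    return sum(w[feature] for feature in rule)


def is_realizable(rule: Rule, values: Sequence[int]) -> bool:
    """True if every condition on the left-hand side holds for the tuple."""
    return all(values[feature] == value for feature, value in rule.items())


def realizability(rules: List[Rule], values: Sequence[int]) -> int:
    """Return 1 if some rule is realizable for the tuple, and 0 otherwise."""
    return 1 if any(is_realizable(rule, values) for rule in rules) else 0
```

For example, with `rules = [{0: 0, 1: 1}, {2: 1}]`, the tuple `(0, 1, 0)` realizes the first rule, while `(1, 1, 0)` realizes neither rule.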
We denote by the set of systems of equations of the form
where , , and . This system will be called inconsistent if there exist such that , , and . If the system of equations is not inconsistent, then it will be called consistent. The total weight of features in the system of equations will be called the weight of the system , denoted . We will say that a tuple is a solution of the equation system if .
We will say that the equation system supports the decision if has no solutions from or if, for any solution of , .
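The notions of consistency, solution, and support can be checked by brute force on small instances. The sketch below is only an illustration under stated assumptions: an equation is a pair `(feature, value)`, a system is a set of such pairs, rules are dictionaries as in the assumed encoding (feature index to required value), and the space of tuples is taken to be all sequences of length `n` over the values `0..k`, where `k` is a value that does not occur in the rules. Counting each distinct feature once in `system_weight` is also an assumption. All function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Set, Tuple

Equation = Tuple[int, int]  # assumed encoding: (feature index, value)
Rule = Dict[int, int]       # assumed encoding: feature index -> value


def is_consistent(system: Set[Equation]) -> bool:
    """A system is inconsistent if one feature is equated to two values."""
    seen: Dict[int, int] = {}
    for feature, value in system:
        if seen.setdefault(feature, value) != value:
            return False
    return True


def system_weight(system: Set[Equation], w: Sequence[int]) -> int:
    """Total weight of the distinct features occurring in the system."""
    return sum(w[feature] for feature in {f for f, _ in system})


def is_solution(system: Set[Equation], values: Sequence[int]) -> bool:
    return all(values[feature] == value for feature, value in system)


def realizability(rules: List[Rule], values: Sequence[int]) -> int:
    hit = any(all(values[f] == v for f, v in r.items()) for r in rules)
    return 1 if hit else 0


def supports(system: Set[Equation], decision: int,
             rules: List[Rule], n: int, k: int) -> bool:
    """Brute force: every solution of the system has the given decision."""
    for values in product(range(k + 1), repeat=n):
        if is_solution(system, values) and realizability(rules, values) != decision:
            return False
    return True
```

An inconsistent system has no solutions, so in this sketch it supports both decisions, in line with the definition above.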
2.2. DTs—Decision Trees
A finite rooted directed tree is a finite directed tree with exactly one vertex that has no incoming edges, referred to as the root. Vertices with no outgoing edges are called leaves, while those that are neither root nor leaves are termed internal vertices. A complete path is a sequence , where is the root, is a leaf, and each connects to for .
A DT over the problem is a finite rooted labeled directed tree with at least two vertices, such that:
The root and its outgoing edges are not labeled.
Every internal vertex of is labeled by a feature from , and the edges leaving it are labeled by elements of .
Each leaf of is labeled with a decision from the set .
A DT over the problem is termed deterministic if exactly one edge leaves the root and, at every internal vertex, the outgoing edges are labeled with pairwise distinct labels.
Let be a DT over the problem . We denote by the set of complete paths in the DT . Let be a complete path in . We associate with this path an equation system . If and , then . Let and (for ) the vertex be labeled with the feature . Let the edge be labeled with the number . Then, . We denote by the decision attached to the vertex . We will say that the complete path accepts the tuple if .
We will say that solves the problem nondeterministically if, for any tuple , there exists a path which accepts the tuple and if for any path , the system of equations supports the decision . In this case, we will also say that is an NDT solving the problem .
We will say that solves the problem deterministically if is a DDT, which solves the problem nondeterministically. In this case, we will also say that is a DDT solving the problem .
For any complete path , we denote by the number of internal vertices in and denote by the total weight of features attached to internal vertices of . The value is called the depth of the DT . The value is called the weighted depth of the DT .
We denote by the minimum weighted depth of an NDT over the problem which solves this problem. We denote by the minimum weighted depth of a DDT over the problem which solves this problem. It is clear that .
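As a small illustration of depth and weighted depth, the following Python sketch computes both from the complete paths of a DT. It assumes, as is standard, that both quantities are taken as the maximum over all complete paths, and it represents each complete path simply by the list of 0-based feature indices attached to its internal vertices. This representation and the function names are assumptions of the sketch.

```python
from typing import List, Sequence


def depth(paths: List[List[int]]) -> int:
    """Maximum number of internal vertices on a complete path."""
    return max(len(path) for path in paths)


def weighted_depth(paths: List[List[int]], w: Sequence[int]) -> int:
    """Maximum total weight of the features on a complete path."""
    return max(sum(w[feature] for feature in path) for path in paths)
```

When every feature has weight 1, the two values coincide, which matches the view of depth as a special case of weighted depth.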
3. Two Sequences of DRSs (Decision Rule Systems)
In this section, we consider two sequences of DRSs. For each DRS in the first sequence, the minimum depth of a DDT is significantly less than the number of features in the DRS. This example shows that using DTs can be reasonable. For each DRS in the second sequence, the number of vertices in any DDT is exponential in the sum of the lengths of the DRs in the DRS. This example shows that rather than constructing the entire DT, it is more reasonable to model its operation for a given tuple of feature values.
Let us begin with the first sequence of DRSs. Let . A complete binary tree of depth t is a finite directed tree with root in which each non-leaf vertex has exactly two outgoing edges and the length of each complete path is equal to t. The vertex set of this tree is naturally divided into levels: for , the ith level contains all vertices that are located at distance i from the root. It is clear that each level i includes vertices. Consequently, the number of non-leaf vertices equals , while the number of leaf vertices is .
Let denote a labeled complete binary tree of depth t, where non-leaf vertices are assigned features and leaf vertices are labeled with integers . For every non-leaf vertex, its outgoing edges are labeled by 0 and 1, respectively. For , we define a DR as follows. Consider a complete path in ending at the leaf vertex labeled j, where for each , the vertex is labeled with feature and the connecting edge is labeled with a number . Then, the corresponding DR is provided by
We denote the set of all such DRs by . It is straightforward that . For the DRS , we will consider tuples of values of features from the set .
Next, let us analyze the problem and show that . We transform the tree into a DDT over the problem . For each , we replace the label j attached to a leaf vertex of with the label 1. For each non-leaf vertex w of , we add to the tree a vertex and an edge that leaves the vertex w and enters the vertex . The edge is labeled with the number 2 and the vertex is labeled with the number 0. We add to the tree a vertex v and an edge d that leaves the vertex v and enters the root of . Both v and d are unlabeled. It can be shown that the obtained DDT solves the problem and has depth t.
Hence, for every , we obtain a DRS satisfying and there exists a DDT solving for which the depth equals t.
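The first sequence can be explored with a short Python sketch. It relies on assumptions that are not taken from the text: non-leaf vertices of the complete binary tree are numbered in heap order (the root is 1, the children of vertex v are 2v and 2v + 1, reached by the edges labeled 0 and 1), the feature attached to vertex v has index v, and a tuple of feature values is a dictionary from feature index to a value in {0, 1, 2}. The function `tree_decision` mimics the DDT described above: the value 2 leads to a leaf labeled 0, and reaching an original leaf yields the decision 1.

```python
from typing import Dict, List, Tuple

Rule = Dict[int, int]  # assumed encoding: feature index -> required value


def build_rules(t: int) -> List[Rule]:
    """One rule per leaf of the complete binary tree of depth t."""
    rules: List[Rule] = []
    for leaf in range(2 ** t, 2 ** (t + 1)):
        rule: Rule = {}
        vertex = leaf
        while vertex > 1:
            parent, edge_label = divmod(vertex, 2)
            rule[parent] = edge_label  # condition read off the path
            vertex = parent
        rules.append(rule)
    return rules


def tree_decision(t: int, values: Dict[int, int]) -> Tuple[int, int]:
    """Walk the tree; return (decision, number of computed features)."""
    vertex, computed = 1, 0
    while vertex < 2 ** t:
        computed += 1
        value = values[vertex]
        if value == 2:          # extra edge added to every non-leaf vertex
            return 0, computed
        vertex = 2 * vertex + value
    return 1, computed
```

Under these assumptions, `build_rules(t)` yields 2^t rules of length t over 2^t - 1 distinct features, while `tree_decision` never computes more than t feature values.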
Now, let us move to the second sequence of DRSs. For any , we define the DRS . Clearly, . For , we consider tuples of values of features from the set .
We denote by the set of tuples such that for . It is clear that and that for any there is no DR from that is realizable for the tuple . Let be a DDT over the problem which solves this problem. Let and . It is clear that there exist complete paths and in such that accepts and accepts . Let us now show that . Let us first assume the contrary, . It is clear that the equation system supports the decision 0. Because , there exists such that the th and th digits of the tuples and are different. Therefore, the features and are not attached to any vertex of the path . Using this fact, it is easy to show that there exists a tuple such that is a solution of the equation system and that the DR from is realizable for the tuple ; however, this is impossible. Therefore, . Thus, there are at least pairwise different complete paths in the tree .
As a result, for each we obtain an example of a DRS which consists of q DRs of length 2 and for which any DDT solving the problem has at least vertices.
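The counting argument for the second sequence can be replayed on a small instance. The concrete form of the DRS used below is an assumption made for illustration only: rule i (0-based) requires both features 2i and 2i + 1 to equal 1, and the set of tuples consists of all tuples in which every such pair is either (0, 1) or (1, 0). The sketch checks that there are 2^q such tuples, that none of them realizes a rule, and that for any two of them there is a tuple that agrees with the first wherever the two agree and yet realizes a rule, so a single complete path cannot accept both.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Rule = Dict[int, int]  # assumed encoding: feature index -> required value


def second_sequence_rules(q: int) -> List[Rule]:
    """Assumed form: q rules of length 2 on pairwise disjoint feature pairs."""
    return [{2 * i: 1, 2 * i + 1: 1} for i in range(q)]


def mixed_tuples(q: int) -> List[Tuple[int, ...]]:
    """All tuples whose i-th pair of digits is (0, 1) or (1, 0)."""
    return [tuple(x for pair in choice for x in pair)
            for choice in product([(0, 1), (1, 0)], repeat=q)]


def realizable(rules: List[Rule], values: Sequence[int]) -> bool:
    return any(all(values[f] == v for f, v in r.items()) for r in rules)


def fooling_tuple(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Change a only on a pair where a and b differ, so a rule fires."""
    i = next(i for i in range(len(a) // 2) if a[2 * i] != b[2 * i])
    c = list(a)
    c[2 * i] = c[2 * i + 1] = 1
    return tuple(c)
```

Since the fooling tuple differs from the first tuple only on features where the two tuples disagree, any path accepting both tuples would also accept it, which contradicts the decision 0 attached to that path.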
4. On the Minimum Weighted Depth of NDTs (Nondeterministic Decision Trees)
In this section, we return to the study of the restricted -DRS with and prove three lemmas related to the minimum weighted depth of an NDT solving the problem .
Let . A system of equations will be called a certificate for the tuple if supports the decision and if is a solution of .
Let . We denote , where is a DR from S with the minimum weight that is realizable for . Let . We denote by a subsystem with the minimum weight of the system such that is inconsistent for each DR .
Lemma 1. For any tuple , the equation system is a certificate for with the minimum weight.
Proof. Let , , , and be a solution of . We now show that is a certificate for if and only if for some DR that is realizable for . If such a DR exists, then evidently is a certificate for . Let there be no such DR. We change the value of each feature in that does not belong to to k. As a result, we obtain a tuple such that and is a solution of . Therefore, is not a certificate for . From here, it follows that the equation system , where is a DR from S with the minimum weight that is realizable for , is a certificate for with the minimum weight.
Let , , and ; additionally, let be a solution of . We now show that is a certificate for if and only if the system of equations is inconsistent for each DR . If is inconsistent for each DR , then evidently is a certificate for . Let the system be not inconsistent for some DR . In , change the values of all features belonging to to values from . As a result, we obtain a tuple from which is a solution of and for which . Therefore, is not a certificate for . From here, it follows that a subsystem with the minimum weight of the system such that is inconsistent for each DR is a certificate for with the minimum weight. □
Lemma 2. .
Proof. Let be an NDT over the problem which solves this problem. Let . Then, has a complete path accepting . This means that is a solution of the equation system . Therefore, the system of equations supports the decision . Thus, is a certificate for , and by Lemma 1 we have . As a result, we obtain .
Let us now describe an NDT over the problem . The set of complete paths of is equal to , where and the leaf vertex of is labeled with . From Lemma 1, it follows that the complete path accepts the tuple and that the equation system supports the decision . Therefore, solves the problem . Thus, . □
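The characterization of minimum-weight certificates used in Lemma 1 and in the proof of Lemma 2 can be turned into a brute-force computation for small instances. The sketch below is an illustration under assumptions, not an efficient procedure: rules are dictionaries from 0-based feature index to required value, `w` is a sequence of positive integer weights, and the space of tuples is taken to be all sequences of length `n` over the values `0..k`. For a tuple with a realizable rule, the minimum certificate weight is the minimum weight of a realizable rule; otherwise it is the minimum weight of a set of features of the tuple that contradicts every rule. The function names are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List, Sequence

Rule = Dict[int, int]  # assumed encoding: feature index -> required value


def min_certificate_weight(rules: List[Rule], w: Sequence[int],
                           values: Sequence[int]) -> int:
    realizable = [r for r in rules
                  if all(values[f] == v for f, v in r.items())]
    if realizable:
        # Decision 1: the cheapest realizable rule is a certificate.
        return min(sum(w[f] for f in r) for r in realizable)
    # Decision 0: pick features whose observed values contradict every rule.
    conflicts = [{f for f, v in r.items() if values[f] != v} for r in rules]
    candidates = sorted(set().union(*conflicts))
    best = None
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            chosen = set(subset)
            if all(chosen & conflict for conflict in conflicts):
                cost = sum(w[f] for f in chosen)
                if best is None or cost < best:
                    best = cost
    return best


def max_min_certificate_weight(rules: List[Rule], w: Sequence[int],
                               n: int, k: int) -> int:
    """Maximum over all tuples of the minimum certificate weight."""
    return max(min_certificate_weight(rules, w, values)
               for values in product(range(k + 1), repeat=n))
```

The search over feature subsets is exponential, so this sketch is meant only for checking the definitions on toy examples.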
For , we denote and .
Corollary 1. .
Lemma 3. .
Proof. Let us show that . It is clear that for each such that , we have . Let . We consider a tuple such that values of features from are equal to corresponding values from and values of all other features are equal to k. It is clear that is the only DR from S that is realizable for . Therefore, . As a result, we obtain . From here, it follows that . □
5. Algorithm Simulating the Operation of a DDT (Deterministic Decision Tree)
In this section, we continue to study the restricted -DRS with . Let . We now consider an algorithm that describes the operation on the tuple of a DDT over the problem which solves this problem. As a result, we obtain the description of a complete path in the DT that accepts the tuple . The set of complete paths of the DT coincides with the set . The text in square brackets below is not a description of the algorithm’s and DT’s actions, and will only be used to prove a statement about Algorithm 1.
Algorithm 1 Simulation of DDT operation
Step 1. For , set . Set . [For any with , set . Set .]
Step 2. If , then returns the decision 0 and stops. If W contains an empty system of equations, then returns the decision 1 and stops. Let and W contain no empty systems of equations. Choose a system from W with the minimum number j. Let be all features from and . The DT computes values of features and obtains the system of equations . For each , we remove from W if the system is inconsistent. Otherwise, we set . [For each , we remove from V if the system is inconsistent. Otherwise, we set .] Return to Step 2.
Remark 1. It is possible to show that the time complexity of Algorithm 1 is polynomial in the length of the description of the DRS S.
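The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: rules are dictionaries from 0-based feature index to required value, listed in the order of their numbers; `w` is a sequence of positive integer weights; and `query` is a hypothetical callback that computes the value of a feature for the current input. The set W is kept as a list of the still-unverified conditions of the surviving rules. The bookkeeping in square brackets (the set V) is omitted, since it is used only in the proof.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Rule = Dict[int, int]  # assumed encoding: feature index -> required value


def simulate_ddt(rules: List[Rule], w: Sequence[int],
                 query: Callable[[int], int]) -> Tuple[int, int]:
    """Return (decision, total weight of the computed features)."""
    # Step 1: one system of equations per rule, in rule order.
    W: List[Rule] = [dict(rule) for rule in rules]
    cost = 0
    while True:
        # Step 2: stopping conditions.
        if not W:
            return 0, cost
        if any(len(system) == 0 for system in W):
            return 1, cost
        # Take the surviving system with the smallest number and
        # compute the values of all features it still mentions.
        observed: Dict[int, int] = {}
        for feature in W[0]:
            observed[feature] = query(feature)
            cost += w[feature]
        # Remove contradicted systems; strip verified equations from the rest.
        updated: List[Rule] = []
        for system in W:
            if any(f in observed and observed[f] != v
                   for f, v in system.items()):
                continue
            updated.append({f: v for f, v in system.items()
                            if f not in observed})
        W = updated
```

Because verified equations are stripped from every surviving system, no feature is computed twice, so the returned cost is the total weight of the features on the simulated complete path.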
Theorem 1. The DDT solves the problem and satisfies the inequality .
Proof. First, we show that solves the problem . Let . It is clear that the complete path in the DT accepts . Let . We can show that the equation system is inconsistent for each DR . Then, . It is clear that for each solution of the system we have . Therefore, supports the decision 0. Let . We can show that for some DR . Then, . It is clear that for each solution of the system , we have . Therefore, supports the decision 1. Thus, the DT solves the problem .
Let us consider the work of the DT on a tuple . By Lemma 3, during each complete repetition of Step 2, the DT computes the values of features for which the total weight is at most . We now show that the number of complete repetitions of Step 2 is at most .
Let , , and . From the definition of , it follows that the system of equations is inconsistent. From here, it follows that before each repetition of Step 2 of Algorithm 1, for each nonempty system and each nonempty system , the system of equations is inconsistent. Using this fact, it can be shown that after each complete repetition of Step 2, each system of equations from V will either be removed from V or its cardinality will decrease by at least 1. Evidently, the cardinality of each system from V before the first repetition of Step 2 is at most . Therefore, after complete repetitions of Step 2, the set V will be empty or will contain only empty systems of equations.
We denote by the system of equations consisting of all features computed by during complete repetitions of Step 2 with their values from .
Let V contain an empty system. Then, for some tuple such that . Thus, the set W is empty and stops. Let V be empty. Then, the system is inconsistent for any such that . Let us show that for some DR . First, assume the contrary, that for each DR , is not a subset of . In the tuple , we change the values of all features that do not belong to to k. We denote the obtained tuple . It is clear that is a solution of the equation system and ; however, this is impossible, as the system is inconsistent. As a result, we have that stops after at most complete repetitions of Step 2. Thus, . □
Corollary 2. .
Proof. This statement follows from the fact that , Theorem 1 and Corollary 1. □
We denote by the maximum weight of a DR from S.
Corollary 3. .
Proof. This statement follows from Theorem 1, Lemma 3, and Corollary 1. □
From Theorem 1, it follows that . We now show that this bound is unimprovable even for depth, i.e., when .
Proposition 1. For any , there exists a restricted -DRS such that , , and .
Proof. Let . We now consider a restricted -DRS , where for the DR is equal to
It is clear that and that the values of all features from the DRs belong to the set . For the DRS , we will consider tuples of values of features from the set .
The length of each DR from is equal to p. Using Lemma 3, we obtain that . Let and , i.e., let all DRs from be non-realizable for the tuple . It is clear that for any two DRs such that . Using this fact, it is easy to show that the minimum cardinality of a certificate for is equal to q. Therefore, .
It is clear that . Let us show that . To this end, we consider a DDT over the problem which solves this problem and for which . Now, let us analyze the work of , which is described as follows. Let compute the value of feature from DR . If the values of all other features from have already been computed, then ; otherwise, . If does not compute the values of all features, then we will not know the solution to the realizability problem. Therefore, and . Thus, . □
It follows from Proposition 1 that we cannot improve the accuracy bound for Algorithm 1 provided by Theorem 1 and that we cannot find algorithms with better accuracy bounds based on parameters and .
6. Conclusions
In this paper, an algorithm for modeling the operation of a DDT solving the problem of realizability is proposed. The running time of this algorithm is polynomial in the length of the description of the DRS under consideration. The weighted depth of the modeled DDT does not exceed the square of the minimum weighted depth of an NDT solving the realizability problem. In the future, we plan to design similar algorithms for some other problems related to DRSs. In the more distant future, we plan to conduct an experimental study of such algorithms.