1. Introduction
DTs (Decision Trees) [1,2,3,4] and DRSs (Decision Rule Systems) [5,6,7,8,9] are common tools for structuring and expressing knowledge. They act as classifiers, providing predictions for unseen cases, and are also employed as algorithms in diverse domains such as fault diagnosis and combinatorial optimization. Among classification and knowledge representation models, DTs and DRSs stand out for their high level of interpretability [10,11]. Exploring the connections and transformations between DTs and DRSs has become an important focus of research within computer science. In this work, we continue to develop the syntactic approach to the study of this task proposed in [12,13]. This approach is based on the assumption that only the DRS is available to us, and not the input data. The results of previous studies in this area are summarized mainly in the book [14] and in the conference papers [15,16,17]. Earlier research has focused on the depth of DTs; in contrast, this paper investigates weighted depth as a measure of the time complexity of DTs.
In this work, we study the following realizability problem. Let a DRS be given. For a tuple of feature values, it is required to recognize whether this DRS contains a DR (Decision Rule) that is realizable for this tuple, i.e., a DR for which the left-hand side is true for this tuple. The tuples under consideration can contain values that do not appear in the DRS. This allows us to model the natural situation when the input tuple for the classifier may contain values that are not present in the training data. In addition, we do not allow the presence in the DRS of two DRs such that the set of conditions from the left-hand side of the first DR is a proper subset of the set of conditions from the left-hand side of the second DR; removing the second DR will not change the solution of the realizability problem.
When addressing the considered problem, it is important to note that in real-life situations we do not have direct access to the tuple of feature values. To determine the value of a feature, it is necessary to compute it for the given input, which can be an expensive procedure.
The complexity of finding feature values is determined by the weight function w, which assigns a positive integer weight to each feature present in the DRs from the DRS. The weight of a feature is interpreted as the complexity of finding its value. To minimize the total weight of computed feature values, we consider DDTs (Deterministic Decision Trees) and NDTs (Nondeterministic Decision Trees) solving the considered problem and study their weighted depth.
To clarify the possibilities and limitations of using DTs, we modify examples considered in the papers [15,17] and in Lemma 13.10 of the book [14]. First, we discuss a sequence of DRSs for which the minimum depth of DDTs grows logarithmically in the number of different features in the DRs from the DRS. In such a situation, the use of DTs seems appropriate. We also consider a sequence of DRSs for which the minimum number of vertices in DDTs grows exponentially with the sum of the lengths of the DRs from the DRS. This means that in the general case, instead of constructing the entire DT, its operation on a given tuple of feature values should be modeled using a sufficiently efficient algorithm.
This paper is devoted to the consideration of a new efficient algorithm for modeling the operation of a DDT solving the realizability problem. The weighted depth of this DT is bounded from below by the minimum weighted depth of an NDT and from above by the square of the minimum weighted depth of an NDT.
Similar upper bounds for the depth were obtained in [18,19,20] for Boolean functions (see [21] for details); in [14] for functions of k-valued logic; and in the paper [15] for the problem of finding all realizable DRs. To derive such upper bounds in each of these cases, an NDT with the minimum depth was considered. Based on this DT, for a given tuple of values of variables or features, the operation of a DDT satisfying the upper bound was described. It is important to note that this description of a DDT cannot be considered an efficient algorithm, since the NDT under consideration can have a huge number of vertices.
Note that this paper is a generalization of a previous conference paper [17]. In [17], we proposed an efficient algorithm for modeling the operation of a DDT that solves the realizability problem. The main contribution was showing that the depth of such a DT is bounded from below by the minimum depth of an NDT and from above by the square of the minimum depth of an NDT. In this paper, we extend that approach by considering the notion of weighted depth, which takes into account the complexity of finding feature values. This generalization better reflects real-world applications, where the complexity of finding feature values may vary from feature to feature.
In the present paper, the algorithm for modeling the operation of a DDT for the realizability problem is constructed without directly using an optimal NDT. As a result, the time complexity of the designed modeling algorithm is polynomial in the length of the description of the DRS.
The structure of the paper is as follows: in Section 2, we provide the key definitions and notation; Section 3 clarifies possibilities and limitations of using DTs; Section 4 considers the minimum weighted depth of NDTs; Section 5 is devoted to the analysis of an efficient algorithm for modeling the operation of a DDT; finally, Section 6 presents brief conclusions.
  2. Definitions
This section introduces the notation and key definitions for DRSs and DTs.
  2.1. DRSs—Decision Rule Systems
Let , , and . Denote  and . Elements of the set  will be called features.
A -DR is an expression of the form  where ,  are pairwise different features from , , and . We denote this DR by r. The number  is interpreted as the decision of the DR r. The number t will be called the length of the DR r, denoted . We denote  and .
A restricted -DRS S is a finite nonempty set of -DRs such that there are no DRs  for which . We denote . We can describe the DRS S by a word in the alphabet  in which the indexes of features, their values, and the decisions of DRs are written in binary. The sign “;” is used to separate DRs. This word will be called the description of the DRS S.
A weight function for the DRS S is a map . The total weight of features in the DR r will be called the weight of the DR r and is denoted .
Let us fix a restricted -DRS S, with which we will work later. We assume for definiteness that  and .
For , we denote . We will say that a DR  from S is realizable for a tuple  if .
For each tuple , we define the value  in the following way: if there is a DR from S that is realizable for , then ; otherwise, . The problem of realizability is defined as follows: for a given tuple , it is required to find the value .
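The realizability problem defined above can be illustrated with a minimal sketch. The representation is an assumption made only for illustration: a DR is a pair of a dictionary (feature index to required value) and a decision, a tuple of feature values is a dictionary, and the names `is_realizable` and `realizability` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed representation: (conditions {feature index: required value}, decision).
Rule = Tuple[Dict[int, int], int]


def is_realizable(rule: Rule, values: Dict[int, int]) -> bool:
    """A DR is realizable for a tuple if every equation on its left-hand side holds."""
    conditions, _decision = rule
    return all(values[f] == v for f, v in conditions.items())


def realizability(rules: List[Rule], values: Dict[int, int]) -> int:
    """Return 1 if some DR of the system is realizable for the tuple, else 0."""
    return 1 if any(is_realizable(r, values) for r in rules) else 0


# A small system with two DRs over features 1, 2, 3.
rules = [({1: 0, 2: 1}, 5), ({3: 1}, 7)]
print(realizability(rules, {1: 0, 2: 1, 3: 0}))  # first DR is realizable
print(realizability(rules, {1: 0, 2: 0, 3: 2}))  # value 2 does not occur in the DRs
```

This direct check reads every feature value; the DTs studied in the paper aim to reduce the total weight of the features that have to be computed.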
We denote by  the set of systems of equations of the form  where , , and . This system will be called inconsistent if there exist  such that , , and . If the system of equations  is not inconsistent, then it will be called consistent. The total weight of features in the system of equations will be called the weight of the system , denoted . We will say that a tuple  is a solution of the equation system  if .
We will say that the equation system  supports the decision  if  has no solutions from  or if, for any solution  of , .
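The notions of consistency, weight, and solution of an equation system can be sketched as follows. The encoding is an assumption for illustration: a system is a list of (feature index, value) pairs, a system is treated as inconsistent when it assigns two different values to the same feature, and each feature contributes its weight once. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Equation = Tuple[int, int]  # assumed encoding: (feature index, value)


def is_consistent(system: List[Equation]) -> bool:
    """Consistent means no feature is equated to two different values."""
    seen: Dict[int, int] = {}
    for feature, value in system:
        if seen.setdefault(feature, value) != value:
            return False
    return True


def system_weight(system: List[Equation], weights: Dict[int, int]) -> int:
    """Total weight of the distinct features occurring in the system."""
    return sum(weights[f] for f in {feature for feature, _ in system})


def is_solution(values: Dict[int, int], system: List[Equation]) -> bool:
    """A tuple is a solution if it satisfies every equation of the system."""
    return all(values[f] == v for f, v in system)
```

For example, under this encoding the system `[(1, 0), (1, 2)]` is inconsistent, while `[(1, 0), (2, 1)]` is consistent and has weight 5 for weights `{1: 2, 2: 3}`.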
  2.2. DTs—Decision Trees
A finite rooted directed tree is a finite directed tree with exactly one vertex that has no incoming edges, referred to as the root. Vertices with no outgoing edges are called leaves, while those that are neither root nor leaves are termed internal vertices. A complete path is a sequence , where  is the root,  is a leaf, and each  connects  to  for .
A DT over the problem  is a finite rooted labeled directed tree  with at least two vertices, such that:
- The root and its outgoing edges are not labeled. 
- Every internal vertex of  is labeled by a feature from , and the edges leaving it are labeled by elements of . 
- Each leaf of  is labeled with a decision from the set . 
A DT over the problem  is termed deterministic if exactly one edge leaves the root and, at every internal vertex, the outgoing edges are labeled with pairwise distinct labels.
Let  be a DT over the problem . We denote by  the set of complete paths in the DT . Let  be a complete path in . We associate with this path an equation system . If  and , then . Let  and (for ) the vertex  be labeled with the feature . Let the edge  be labeled with the number . Then, . We denote by  the decision attached to the vertex . We will say that the complete path  accepts the tuple  if .
We will say that  solves the problem  nondeterministically if, for any tuple , there exists a path  which accepts the tuple  and if for any path , the system of equations  supports the decision . In this case, we will also say that  is an NDT solving the problem .
We will say that  solves the problem  deterministically if  is a DDT, which solves the problem  nondeterministically. In this case, we will also say that  is a DDT solving the problem .
For any complete path , we denote by  the number of internal vertices in  and denote by  the total weight of features attached to internal vertices of . The value  is called the depth of the DT . The value  is called the weighted depth of the DT .
We denote by  the minimum weighted depth of an NDT over the problem  which solves this problem. We denote by  the minimum weighted depth of a DDT over the problem  which solves this problem. It is clear that .
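As an illustration of depth and weighted depth, the following minimal sketch computes both for a deterministic tree. It is a simplification under stated assumptions: the unlabeled root is omitted, internal vertices carry a feature index, outgoing edges are keyed by the value they are labeled with, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    decision: int


@dataclass
class Node:
    feature: int
    # Edge label (feature value) -> subtree; distinct keys model determinism.
    children: Dict[int, "Tree"] = field(default_factory=dict)


Tree = Union[Leaf, Node]


def weighted_depth(tree: Tree, weights: Dict[int, int]) -> int:
    """Maximum, over complete paths, of the total weight of the features on the path."""
    if isinstance(tree, Leaf):
        return 0
    return weights[tree.feature] + max(
        weighted_depth(child, weights) for child in tree.children.values()
    )


def depth(tree: Tree) -> int:
    """Maximum number of internal vertices on a complete path (all weights equal to 1)."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(child) for child in tree.children.values())


example = Node(1, {0: Leaf(0), 1: Node(2, {0: Leaf(0), 1: Leaf(1)})})
```

With weights `{1: 2, 2: 3}`, the longest path of `example` queries features 1 and 2, so its weighted depth is 5 while its depth is 2.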
  3. Two Sequences of DRSs (Decision Rule Systems)
In this section, we consider two sequences of DRSs. For each DRS in the first sequence, the minimum depth of a DDT is significantly less than the number of features in the DRS. This example shows that using DTs is reasonable. For each DRS in the second sequence, the number of vertices in any DDT is exponential in the sum of the lengths of the DRs in the DRS. This example shows that rather than constructing the entire DT, it is more reasonable to model its operation for a given tuple of feature values.
Let us begin with the first sequence of DRSs. Let . A complete binary tree of depth t is a finite directed tree with root in which each non-leaf vertex has exactly two outgoing edges and the length of each complete path is equal to t. The vertex set of this tree is naturally divided into  levels: for , the ith level contains all vertices that are located at distance i from the root. It is clear that each level i includes  vertices. Consequently, the number of non-leaf vertices equals , while the number of leaf vertices is .
Let  denote a labeled complete binary tree of depth t, where non-leaf vertices are assigned features  and leaf vertices are labeled with integers . For every non-leaf vertex, its outgoing edges are labeled by 0 and 1, respectively. For , we define a DR  as follows. Consider a complete path  in  ending at the leaf vertex labeled j, where for each , the vertex  is labeled with feature  and the connecting edge  is labeled with a number . Then, the corresponding DR  is given by

We denote the set of all such DRs by . It is straightforward that . For the DRS , we will consider tuples of values of features from the set .
Next, let us analyze the problem  and show that . We transform the tree  into a DDT  over the problem . For each , we replace the label j attached to a leaf vertex of  with the label 1. For each non-leaf vertex w of , we add to the tree  a vertex  and an edge  that leaves the vertex w and enters the vertex . The edge  is labeled with the number 2 and the vertex  is labeled with the number 0. We add to the tree  a vertex v and an edge d that leaves the vertex v and enters the root of . Both v and d are unlabeled. It can be shown that the obtained DDT  solves the problem  and has depth t.
Hence, for every , we obtain a DRS  satisfying  and there exists a DDT  solving  for which the depth equals t.
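The first sequence can be checked on a small case with the following sketch. The indexing is an assumption for illustration: non-leaf vertices are numbered in heap order (root 1, children 2v and 2v+1), vertex v carries feature v, and feature values range over 0, 1, 2, where 2 plays the role of a value not occurring in the DRs. The function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Rule = Tuple[Dict[int, int], int]


def build_rules(t: int) -> List[Rule]:
    """One DR per leaf: the equations read along the root-to-leaf path."""
    rules = []
    for leaf in range(2 ** t):
        vertex, conditions = 1, {}
        for level in range(t):
            bit = (leaf >> (t - 1 - level)) & 1
            conditions[vertex] = bit
            vertex = 2 * vertex + bit
        rules.append((conditions, leaf + 1))
    return rules


def tree_decide(t: int, values: Dict[int, int]) -> Tuple[int, int]:
    """Walk the tree; return (decision, number of features computed)."""
    vertex, queried = 1, 0
    for _ in range(t):
        value = values[vertex]
        queried += 1
        if value not in (0, 1):  # edge labeled 2 leads to a leaf labeled 0
            return 0, queried
        vertex = 2 * vertex + value
    return 1, queried


def naive(rules: List[Rule], values: Dict[int, int]) -> int:
    """Direct realizability check that inspects every DR."""
    return 1 if any(all(values[f] == v for f, v in c.items()) for c, _ in rules) else 0


t = 3
rules = build_rules(t)
n_features = 2 ** t - 1
agree = all(
    tree_decide(t, dict(enumerate(vals, start=1)))[0]
    == naive(rules, dict(enumerate(vals, start=1)))
    for vals in product((0, 1, 2), repeat=n_features)
)
print(len(rules), n_features, agree)
```

For t = 3, the system has 8 DRs over 7 features, yet the walk computes at most 3 feature values for any tuple.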
Now, let us move to the second sequence of DRSs. For any , we define  the DRS . Clearly, . For , we consider tuples of values of features from the set .
We denote by  the set of tuples  such that  for . It is clear that  and that for any  there is no DR from  that is realizable for the tuple . Let  be a DDT over the problem  which solves this problem. Let  and . It is clear that there exist complete paths  and  in  such that  accepts  and  accepts . Let us now show that . Assume the contrary, . It is clear that the equation system  supports the decision 0. Because , there exists  such that the th and th digits of the tuples  and  are different. Therefore, the features  and  are not attached to any vertex of the path . Using this fact, it is easy to show that there exists a tuple  such that  is a solution of the equation system  and that the DR  from  is realizable for the tuple ; however, this is impossible. Therefore, . Thus, there are at least  pairwise different complete paths in the tree .
As a result, for each  we obtain an example of a DRS  which consists of q DRs of length 2 and for which any DDT solving the problem  has at least  vertices.
  4. On the Minimum Weighted Depth of NDTs (Nondeterministic Decision Trees)
In this section, we return to the study of the restricted -DRS  with  and prove three lemmas related to the minimum weighted depth of an NDT solving the problem .
Let . A system of equations  will be called a certificate for the tuple  if  supports the decision  and if  is a solution of .
Let . We denote , where  is a DR from S with the minimum weight that is realizable for . Let . We denote by  a subsystem with the minimum weight of the system  such that  is inconsistent for each DR .
Lemma 1.  For any tuple , the equation system  is a certificate for  with the minimum weight.
 Proof.  Let , , , and  be a solution of . We now show that  is a certificate for  if and only if  for some DR  that is realizable for . If such a DR exists, then evidently  is a certificate for . Let there be no such DR. We change the value of each feature in  that does not belong to  to k. As a result, we obtain a tuple  such that  and  is a solution of . Therefore,  is not a certificate for . From here, it follows that the equation system , where  is a DR from S with the minimum weight that is realizable for , is a certificate for  with the minimum weight.
Let , , and ; additionally, let  be a solution of . We now show that  is a certificate for  if and only if the system of equations  is inconsistent for each DR . If  is inconsistent for each DR , then evidently  is a certificate for . Let the system  be not inconsistent for some DR . In , change the values of all features belonging to  to values from . As a result, we obtain a tuple  from  which is a solution of  and for which . Therefore,  is not a certificate for . From here, it follows that a subsystem  with the minimum weight of the system  such that  is inconsistent for each DR  is a certificate for  with the minimum weight.    □
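The characterization of minimum-weight certificates used in the proof can be sketched by brute force. This is an illustrative, exponential-time sketch under assumptions, not the paper's algorithm: DRs are (conditions, decision) pairs, a tuple is a dictionary, and `min_certificate_weight` is a hypothetical name. For a tuple with a realizable DR, it returns the minimum weight of such a DR; otherwise, it searches for a cheapest set of features whose values contradict every DR.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Rule = Tuple[Dict[int, int], int]


def min_certificate_weight(
    rules: List[Rule], weights: Dict[int, int], values: Dict[int, int]
) -> int:
    realizable = [c for c, _ in rules if all(values[f] == v for f, v in c.items())]
    if realizable:
        # Case of value 1: the cheapest realizable DR yields the certificate.
        return min(sum(weights[f] for f in c) for c in realizable)
    # Case of value 0: each DR must be contradicted by at least one chosen feature.
    conflicts = [{f for f, v in c.items() if values[f] != v} for c, _ in rules]
    features = sorted(set().union(*conflicts))
    best = None
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            chosen = set(subset)
            if all(chosen & conflict for conflict in conflicts):
                weight = sum(weights[f] for f in chosen)
                if best is None or weight < best:
                    best = weight
    return best


rules = [({1: 0, 2: 1}, 0), ({2: 0, 3: 1}, 0)]
weights = {1: 1, 2: 5, 3: 1}
```

For the tuple `{1: 1, 2: 1, 3: 0}`, no DR is realizable; features 1 and 3 together contradict both DRs at total weight 2, which is cheaper than any choice that includes feature 2.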
 Lemma 2.  .
 Proof.  Let  be an NDT over the problem  which solves this problem. Let . Then,  has a complete path  accepting . This means that  is a solution of the equation system . Therefore, the system of equations  supports the decision . Thus,  is a certificate for , and by Lemma 1 we have . As a result, we obtain .
Let us now describe an NDT  over the problem . The set of complete paths of  is equal to , where  and the leaf vertex of  is labeled with . From Lemma 1, it follows that the complete path  accepts the tuple  and that the equation system  supports the decision . Therefore,  solves the problem . Thus, .    □
 For , we denote  and .
Corollary 1.  .
 Lemma 3.  .
 Proof.  Let us show that . It is clear that for each  such that , we have . Let . We consider a tuple  such that values of features from  are equal to corresponding values from  and values of all other features are equal to k. It is clear that  is the only DR from S that is realizable for . Therefore, . As a result, we obtain . From here, it follows that .    □
   5. Algorithm Simulating the Operation of a DDT (Deterministic Decision Tree)
In this section, we continue to study the restricted -DRS  with . Let . We now consider an algorithm that describes the operation on the tuple  of a DDT  over the problem  which solves this problem. As a result, we obtain the description of a complete path  in the DT  that accepts the tuple . The set of complete paths of the DT  coincides with the set . The text in square brackets below is not part of the description of the algorithm's or the DT's actions; it is used only to prove a statement about Algorithm 1.
	  
Algorithm 1 Simulation of DDT operation
Step 1. For , set . Set . [For any  with , set . Set .]
Step 2. If , then  returns the decision 0 and stops. If W contains an empty system of equations, then  returns the decision 1 and stops. Let  and W contain no empty systems of equations. Choose a system  from W with the minimum number j. Let  be all features from  and . The DT  computes values of features  and obtains the system of equations . For each , we remove  from W if the system  is inconsistent. Otherwise, we set . [For each , we remove  from V if the system  is inconsistent. Otherwise, we set .] Return to Step 2.
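The following is a minimal sketch of one reading of Algorithm 1. Because several symbols are missing from the algorithm text as reproduced here, the reading is an assumption: W initially holds the equation system of each DR, Step 2 computes all remaining features of the first system in W, removes every system contradicted by the computed values, and deletes the satisfied equations from the others. The bracketed bookkeeping for V, which is used only in the proof, is omitted. The function name and data representation are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Tuple[Dict[int, int], int]


def simulate_ddt(
    rules: List[Rule], weights: Dict[int, int], values: Dict[int, int]
) -> Tuple[int, int, List[int]]:
    """Return (decision, total weight of computed features, features in query order)."""
    # Step 1: one equation system per DR, kept in the original order.
    W = [dict(conditions) for conditions, _ in rules]
    cost, queried = 0, []
    while True:
        # Step 2: stopping conditions.
        if not W:
            return 0, cost, queried
        if any(len(system) == 0 for system in W):
            return 1, cost, queried
        # Compute all features of the remaining system with the minimum index.
        chosen = list(W[0])
        for f in chosen:
            cost += weights[f]
            queried.append(f)
        known = {f: values[f] for f in chosen}
        updated = []
        for system in W:
            if any(f in known and known[f] != v for f, v in system.items()):
                continue  # contradicted by the computed values: remove it
            updated.append({f: v for f, v in system.items() if f not in known})
        W = updated


rules = [({1: 0, 2: 1}, 0), ({3: 1}, 0)]
weights = {1: 2, 2: 3, 3: 4}
```

Each feature is computed at most once, since computed features are deleted from every surviving system, and each pass through the loop costs at most the weight of one DR.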
Remark 1.  It is possible to show that the time complexity of Algorithm 1 is polynomial in the length of the description of the DRS S.
 Theorem 1.  The DDT  solves the problem  and satisfies the inequality .
 Proof.  First, we show that  solves the problem . Let . It is clear that the complete path  in the DT  accepts . Let . We can show that the equation system  is inconsistent for each DR . Then, . It is clear that for each solution  of the system  we have . Therefore,  supports the decision 0. Let . We can show that  for some DR . Then, . It is clear that for each solution  of the system , we have . Therefore,  supports the decision 1. Thus, the DT  solves the problem .
Let us consider the work of the DT  on a tuple . By Lemma 3, during each complete repetition of Step 2, the DT  computes the values of features for which the total weight is at most . We now show that the number of complete repetitions of Step 2 is at most .
Let , , and . From the definition of , it follows that the system of equations  is inconsistent. From here, it follows that before each repetition of Step 2 of Algorithm 1, for each nonempty system  and each nonempty system , the system of equations  is inconsistent. Using this fact, it can be shown that after each complete repetition of Step 2, each system of equations from V will either be removed from V or its cardinality will decrease by at least 1. Evidently, the cardinality of each system from V before the first repetition of Step 2 is at most . Therefore, after  complete repetitions of Step 2, the set V will be empty or will contain only empty systems of equations.
We denote by  the system of equations consisting of all features computed by  during  complete repetitions of Step 2 with their values from .
Let V contain an empty system. Then,  for some tuple  such that . Thus, the set W is empty and  stops. Let V be empty. Then, the system  is inconsistent for any  such that . Let us show that  for some DR . First, assume the contrary, that for each DR ,  is not a subset of . In the tuple , we change the values of all features that do not belong to  to k. We denote the obtained tuple . It is clear that  is a solution of the equation system  and ; however, this is impossible, as the system  is inconsistent. As a result, we have that  stops after at most  complete repetitions of Step 2. Thus, .    □
 Corollary 2.  .
 Proof.  This statement follows from the fact that , Theorem 1 and Corollary 1.    □
 We denote by  the maximum weight of a DR from S.
Corollary 3.  .
 Proof.  This statement follows from Theorem 1, Lemma 3, and Corollary 1.    □
 From Theorem 1, it follows that . We now show that this bound is unimprovable even for depth, i.e., when .
Proposition 1.  For any , there exists a restricted -DRS  such that , , and .
 Proof.  Let . We now consider a restricted -DRS , where for  the DR  is equal to

It is clear that  and that the values of all features from the DRs  belong to the set . For the DRS , we will consider tuples of values of features from the set .
The length of each DR from  is equal to p. Using Lemma 3, we obtain that . Let  and , i.e., let all DRs from  be non-realizable for the tuple . It is clear that  for any two DRs  such that . Using this fact, it is easy to show that the minimum cardinality of a certificate for  is equal to q. Therefore, .
It is clear that . Let us show that . To this end, we consider a DDT  over the problem  which solves this problem and for which . Now, let us analyze the work of , which is described as follows. Let  compute the value of feature  from DR . If the values of all other features from  have already been computed, then ; otherwise, . If  does not compute the values of all features, then we will not know the solution to the realizability problem. Therefore,  and . Thus, .    □
 It follows from Proposition 1 that we cannot improve the accuracy bound for Algorithm 1 provided by Theorem 1 and that we cannot find algorithms with better accuracy bounds based on parameters  and .
  6. Conclusions
In this paper, we propose an algorithm for modeling the operation of a DDT solving the realizability problem. The running time of this algorithm is polynomial in the length of the description of the DRS under consideration. The weighted depth of the modeled DDT does not exceed the square of the minimum weighted depth of an NDT solving the realizability problem. In the future, we plan to design similar algorithms for other problems related to DRSs and, later, to conduct an experimental study of such algorithms.