Abstract
The study of the relationships between DRSs (Decision Rule Systems) and DTs (Decision Trees) is of considerable interest in computer science. In this paper, we consider classes of DRSs that are closed under specific operations. First, we examine classes that are closed under the removal of features and analyze the functions characterizing the worst-case dependence of the minimum depth of DDTs (Deterministic Decision Trees) and NDTs (Nondeterministic Decision Trees) that solve the task of finding all true DRs in a DRS on the number of different features in the system. Second, we extend our analysis to classes that are closed under the removal of features and rules, studying the worst-case behavior of the minimum DT depth for the task of finding at least one true DR. Third, we investigate classes closed under the removal of features and rules in the context of finding all right-hand sides of true DRs. We prove that, in all three cases, the corresponding functions characterizing the worst-case minimum depth of DTs are either bounded from above by a constant or grow linearly.
Keywords:
closed class; decision rule system; deterministic decision tree; nondeterministic decision tree
MSC:
68Q25
1. Introduction
DTs (Decision Trees) [1,2,3,4,5] and DRSs (Decision Rule Systems) [6,7,8,9,10,11] are common tools for structuring and expressing knowledge. They act as classifiers, providing predictions for unseen cases, and are also employed as algorithms in diverse domains such as fault diagnosis, combinatorial optimization, and beyond. Among classification and knowledge representation models, DTs and DRSs stand out for their high level of interpretability [12,13]. Exploring the connections and transformations between DTs and DRSs has become an important focus of research within computer science.
In this work, we examine classes of DRSs that are closed under certain operations. The operation of feature removal is natural: if we have a DRS but cannot work with one of its features because it is unavailable for some reason, we can remove this feature from the DRS and try to work with the resulting DRS. Similarly, rule removal is a fundamental operation that allows us to disregard certain DRs while maintaining the structure of the system.
One of the main tasks associated with a DRS is to find, given a tuple of feature values, all DRs that are true on this tuple (that have a true left-hand side). To solve this task, we use DDTs (Deterministic Decision Trees) and NDTs (Nondeterministic Decision Trees). In addition, we consider two related tasks: finding at least one true DR in a DRS and finding all right-hand sides of true DRs.
For an arbitrary closed class of DRSs, we investigate functions that describe, in the worst-case scenario, how the minimal depth of DDTs and NDTs required to solve these problems depends on the number of distinct features in the system. Specifically, we analyze the task of finding all true DRs in classes closed under the removal of features, the task of finding at least one true DR in classes closed under the removal of features and rules, and the task of finding all right-hand sides of true DRs in classes closed under the same operations. We prove that, in all three cases, the behavior of these functions is such that they are bounded from above by a constant value or have linear growth.
In this paper, we continue to develop a syntactic approach to the study of the relationships between DRSs and DTs, outlined in [14,15]. This method relies on the assumption that we have access only to the DRS itself, without the underlying input data. An overview of earlier research in this area is presented primarily in one book [16]. In that book, we considered all three tasks: finding all true DRs, finding at least one true DR, and finding all right-hand sides of true DRs. However, unlike in this paper, we did not consider NDTs and classes of DRSs closed under specific operations.
Note that this paper is an extension of two conference papers: in [17], we analyzed the problem of finding all realizable rules, and in [18], we studied the problem of finding at least one realizable rule.
2. Definitions
This section introduces the notation and key definitions for DRSs and DTs.
2.1. DRSs—Decision Rule Systems
Let and . Elements of the set F will be called features. Let . Let .
Let represent the set of equation systems of the form
where , and . We say that the system is inconsistent if there exist indices with , such that but . Otherwise, the system of equations will be consistent.
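The consistency condition above can be illustrated with a minimal Python sketch. The encoding of an equation system as a list of `(feature, value)` pairs and the name `is_consistent` are assumptions made for this example only; they are not notation from the paper.

```python
from typing import Hashable, Iterable, Tuple

# Hypothetical encoding: one equation is a (feature, value) pair.
Equation = Tuple[Hashable, int]


def is_consistent(system: Iterable[Equation]) -> bool:
    """Return True unless some feature is equated to two different values."""
    seen = {}
    for feature, value in system:
        if feature in seen and seen[feature] != value:
            return False  # same feature, different values: inconsistent
        seen[feature] = value
    return True


print(is_consistent([("a1", 0), ("a2", 1), ("a1", 0)]))  # True
print(is_consistent([("a1", 0), ("a2", 1), ("a1", 1)]))  # False
```

Repeating the same equation does not affect consistency in this sketch; only two different values for one feature do.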
Definition 1.
A k-DR (Decision Rule) is defined as an expression of the form
where , are pairwise different features from F, , and .
This DR will be denoted by r. The expression is referred to as the left-hand side, while the value is termed the right-hand side of r. The integer m is called the length of the DR and is written as . Let and . Two DRs, and , will be called equal if and the right-hand sides of the DRs and are equal.
Definition 2.
A k-DRS (Decision Rule System) S is defined as a finite set of k-DRs.
We define , as the set of right-hand sides of all DRs in S, and for a nonempty DRS S. If , then and .
Let and , where . For , denote .
A DR is said to be true for a tuple whenever . We denote by the subset of S consisting of all such rules that hold for .
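To make the notion of a true DR concrete, here is a small Python sketch. It assumes a hypothetical encoding in which a rule is a pair (left-hand side, right-hand side), the left-hand side is a frozenset of `(feature, value)` equalities, and a tuple of feature values is given as a dictionary. The names `is_true` and `true_rules` are illustrative and do not come from the paper.

```python
from typing import Dict, FrozenSet, Hashable, Set, Tuple

# Hypothetical encoding of a decision rule: (left-hand side, right-hand side).
Rule = Tuple[FrozenSet[Tuple[Hashable, int]], int]


def is_true(rule: Rule, assignment: Dict[Hashable, int]) -> bool:
    """A rule is true when every equality of its left-hand side holds."""
    lhs, _ = rule
    return all(assignment[feature] == value for feature, value in lhs)


def true_rules(system: Set[Rule], assignment: Dict[Hashable, int]) -> Set[Rule]:
    """Subset of the system consisting of the rules true for the assignment."""
    return {rule for rule in system if is_true(rule, assignment)}


r1 = (frozenset({("a1", 0), ("a2", 1)}), 0)  # (a1 = 0) and (a2 = 1) -> 0
r2 = (frozenset({("a2", 0)}), 1)             # (a2 = 0) -> 1
S = {r1, r2}
print(true_rules(S, {"a1": 0, "a2": 1}) == {r1})  # True
```

In this sketch, a rule with an empty left-hand side is true for every tuple, because the conjunction over no equalities holds vacuously.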
Definition 3.
All Rules task for a DRS S: given a tuple , determine the subset of rules . This task will be denoted by .
Definition 4.
Some Rules task for a DRS S: given a tuple , determine a subset such that
- Every rule in Z is true for the tuple ;
- If , then no rule from S is true for .
We denote this task as .
Definition 5.
All Decisions task for a DRS S: given , determine a subset such that
- Every rule in Z is true for the tuple ;
- For every , any DR from S with the right-hand side equal to σ is not true for the tuple .
We denote this task as .
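The three tasks defined above differ only in which subset of true rules counts as an acceptable answer. The following Python sketch computes one admissible answer for each task by brute force. The rule encoding (a sorted tuple of `(feature, value)` equalities plus a right-hand side) and the function names are assumptions made for this illustration; the choice of the smallest true rule is arbitrary, and any other true rule would do.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: (left-hand side, right-hand side), with the left-hand
# side stored as a sorted tuple of (feature, value) equalities.
Rule = Tuple[Tuple[Tuple[str, int], ...], int]


def all_rules(system: Set[Rule], delta: Dict[str, int]) -> Set[Rule]:
    """All Rules: every rule of the system that is true for delta."""
    return {r for r in system if all(delta[f] == v for f, v in r[0])}


def some_rules(system: Set[Rule], delta: Dict[str, int]) -> Set[Rule]:
    """Some Rules: a single true rule, or the empty set if no rule is true."""
    true = all_rules(system, delta)
    return {min(true)} if true else set()


def all_decisions(system: Set[Rule], delta: Dict[str, int]) -> Set[Rule]:
    """All Decisions: one true rule for each right-hand side that has one."""
    chosen = {}
    for rule in sorted(all_rules(system, delta)):
        chosen.setdefault(rule[1], rule)  # keep the first rule per decision
    return set(chosen.values())


r1 = ((("a1", 0),), 0)  # (a1 = 0) -> 0
r2 = ((("a2", 1),), 0)  # (a2 = 1) -> 0
r3 = ((("a1", 1),), 1)  # (a1 = 1) -> 1
S = {r1, r2, r3}
delta = {"a1": 0, "a2": 1}
print(all_rules(S, delta) == {r1, r2})   # True
print(some_rules(S, delta) == {r1})      # True
print(all_decisions(S, delta) == {r1})   # True
```

For the All Rules task the answer is unique, whereas the other two tasks generally admit several valid answers.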
We now define the operation of removal of a feature from the DRS S. Let . If , then denote . If , then denote by the DR derived from r by removing from the left-hand side of r the equality containing the feature . We denote . The DRS is the result of applying to S the operation of removal of the feature .
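The removal of a feature can be sketched in Python as follows. The encoding of a rule as a pair (frozenset of `(feature, value)` equalities, right-hand side) and the name `remove_feature` are assumptions for this example. Because the system is stored as a set, rules that become equal after the removal collapse into one, and a rule of length 1 over the removed feature becomes a rule with an empty left-hand side.

```python
from typing import FrozenSet, Hashable, Set, Tuple

# Hypothetical encoding of a decision rule: (left-hand side, right-hand side).
Rule = Tuple[FrozenSet[Tuple[Hashable, int]], int]


def remove_feature(system: Set[Rule], feature: Hashable) -> Set[Rule]:
    """Drop the equality containing `feature` from every rule that has one."""
    result = set()
    for lhs, rhs in system:
        new_lhs = frozenset((f, v) for f, v in lhs if f != feature)
        result.add((new_lhs, rhs))  # rules without the feature stay unchanged
    return result


S = {
    (frozenset({("a1", 0), ("a2", 1)}), 0),  # (a1 = 0) and (a2 = 1) -> 0
    (frozenset({("a2", 1)}), 0),             # (a2 = 1) -> 0
    (frozenset({("a1", 1)}), 1),             # (a1 = 1) -> 1
}
reduced = remove_feature(S, "a1")
print(len(reduced))  # 2: the first two rules coincide after the removal
```

In the example, the number of rules drops from three to two, which shows that feature removal can shrink the system as well as shorten its rules.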
The operation of removal of a rule from the DRS S is defined in a natural way: we can remove from S an arbitrary DR.
Later, we will consider two types of closed classes of DRSs: classes of k-DRSs closed under the removal of features and classes of k-DRSs closed under the removal of features and rules.
It is easy to see that the empty DRS belongs to any closed class.
2.2. DTs—Decision Trees
A finite rooted directed tree is a finite directed tree with exactly one vertex that has no incoming edges, referred to as the root. Vertices with no outgoing edges are called leaves, while those that are neither root nor leaves are termed internal vertices. A complete path is a sequence where is the root, is a leaf, and each connects to for .
Consider S to be a nonempty k-DRS.
Definition 6.
A DT (Decision Tree) over the DRS S is a finite rooted labeled directed tree with at least two vertices, such that
- The root and its outgoing edges are not labeled;
- Every internal vertex of is labeled by a feature from , and the edges leaving it are labeled by elements of ;
- Each leaf of is labeled by a subset of S.
Definition 7.
A DT over S is termed a DDT (Deterministic Decision Tree) if exactly one edge leaves from the root and, at every internal vertex, the outgoing edges are labeled with pairwise distinct labels.
Let be a DT over S. Denote by the set of all complete paths in the DT . For a complete path , we associate an equation system . If and , then . For , each vertex () is labeled by a feature , and the edge is labeled with a value . In this case, . We denote by the collection of DRs attached to the leaf .
Let be a DT over S, , and . The path is said to accept the tuple whenever .
For each of the three tasks under consideration, we give definitions of DTs that solve them nondeterministically. After each definition, we add some explanations about a complete path accepting a tuple in a DT that solves the task. In particular, these definitions and explanations entail Remarks 1, 2, and 3 below.
Definition 8.
We say that nondeterministically solves the task if, for every tuple , there exists a complete path that accepts , and each complete path ξ with a consistent equation system satisfies the following:
- For every , we have .
- For every , the union is inconsistent.
In this situation, is referred to as an NDT (Nondeterministic Decision Tree) solving .
Let solve the task nondeterministically, , , and accept . This means that . From this, it follows that is consistent. Then, for any DR , we have and . Therefore, r is true for .
We also have that, for any DR , the system of equations is inconsistent. Since , it follows that is also inconsistent. Therefore, r is not true for . Thus, .
Remark 1.
Let solve the task nondeterministically, , , and ξ accept . Then, .
Definition 9.
We say that nondeterministically solves the task if, for every tuple , there exists a complete path that accepts , and each complete path ξ with consistent satisfies the following:
- For each , we have .
- If , then is inconsistent for every .
In this case, is called an NDT (Nondeterministic Decision Tree) that solves .
Let solve the task nondeterministically, , , and accept . This means that . From this, it follows that is consistent. Then, for any DR , we have and . Therefore, r is true for .
If , then for any DR , the system of equations is inconsistent. Since , it follows that is also inconsistent. Therefore, r is not true for .
Definition 10.
We say that nondeterministically solves the task if, for every tuple , there exists a complete path that accepts , and each complete path ξ with consistent satisfies the following:
- For all , we have the relation .
- If and the right-hand side of r does not belong to the set , then the system of equations is inconsistent.
In this case, is called an NDT (Nondeterministic Decision Tree) that solves .
Let solve the task nondeterministically, , , and accept . This means that . From this, it follows that is consistent. Then, for any DR , we have and . Therefore, r is true for .
If and the right-hand side of r is not contained in , then the union is inconsistent. Because , this implies that is also inconsistent. Therefore, r is not true for .
Remark 2.
Let , solve the task nondeterministically, , , ξ accept , and . Then, for any DR , the system of equations is inconsistent, and the DR r is not true for .
Remark 3.
Let , solve the task nondeterministically, , , ξ accept , and . Then, .
Definition 11.
Let . We say that solves the task deterministically if it is a DDT (Deterministic Decision Tree) that also solves task in the nondeterministic sense. In this case, is referred to as a DDT solving .
Definition 12.
For each complete path , let denote the number of internal nodes along ξ. The value is defined as the depth of the DT .
Let S be a nonempty k-DRS and let . Define as the minimum depth of a DDT over S that solves , and as the minimum depth of an NDT over S that solves . For the empty system , we set .
3. Main Results
Let , and C be a class of k-DRSs closed under the removal of features if ; otherwise, C is a class of k-DRSs closed under the removal of features and rules. In this section, we investigate the functions and , which are defined as follows. Let , then
These functions characterize how, in the worst case, the minimum depth of DTs solving the task for systems from the closed class C grows with the number of different features in the DRSs. In the case of the function , we consider DDTs, and in the case of the functions , we consider NDTs. We begin by establishing several auxiliary results.
Lemma 1.
For any k-DRS S and , the inequality holds.
Proof.
If the DRS S is empty, then . Let S be a nonempty DRS. It is easy to show that we can construct a DDT solving the task by the sequential computation of all features from . The depth of this DT is equal to . Thus, . □
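The construction used in this proof can be sketched in Python under some stated assumptions: feature values range over `0, ..., k - 1`, a rule is a pair (frozenset of `(feature, value)` equalities, right-hand side), and a tree is a nested dictionary. The names `build_ddt`, `depth`, and `evaluate` are hypothetical, and the unlabeled root required by Definition 6 is omitted for brevity. Each leaf is labeled with the set of all rules true on the path leading to it, which is an admissible answer for each of the three tasks.

```python
def build_ddt(system, k, features=None, assignment=None):
    """Query all features in a fixed order; label leaves with the true rules."""
    if features is None:
        features = sorted({f for lhs, _ in system for f, _ in lhs})
        assignment = {}
    if not features:
        true = {r for r in system if all(assignment[f] == v for f, v in r[0])}
        return {"leaf": true}
    first, rest = features[0], features[1:]
    children = {
        value: build_ddt(system, k, rest, {**assignment, first: value})
        for value in range(k)
    }
    return {"feature": first, "children": children}


def depth(tree):
    """Maximum number of feature-labeled vertices on a root-to-leaf path."""
    if "leaf" in tree:
        return 0
    return 1 + max(depth(child) for child in tree["children"].values())


def evaluate(tree, assignment):
    """Follow the unique path selected by the assignment and return its label."""
    while "leaf" not in tree:
        tree = tree["children"][assignment[tree["feature"]]]
    return tree["leaf"]


r1 = (frozenset({("a1", 0), ("a2", 1)}), 0)
r2 = (frozenset({("a2", 0)}), 1)
tree = build_ddt({r1, r2}, k=2)
print(depth(tree))                                    # 2
print(evaluate(tree, {"a1": 0, "a2": 1}) == {r1})     # True
```

The depth of the resulting tree equals the number of different features in the system, which matches the upper bound stated in the lemma. The tree has exponentially many leaves, so the sketch is meant only for small examples.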
Lemma 2.
Let , and C be a class of k-DRSs closed under the removal of features if ; otherwise, C is a class of k-DRSs closed under the removal of features and rules. Then, for any , the values and are defined and the inequalities hold.
Proof.
Let . If the DRS S is empty, then . Let S be a nonempty DRS. It is clear that any DDT solving the task is an NDT solving the task . Therefore . From Lemma 1, it follows that . From all of the above, the statement of the lemma follows. □
Lemma 3.
Let , and C be a class of k-DRSs closed under the removal of features if ; otherwise, C is a class of k-DRSs closed under the removal of features and rules. If the value for DRSs is bounded from above by a positive constant b, then for any ,
Proof.
Let the value for DRSs be bounded from above by a positive constant b. Using Lemma 1, we obtain that for any , . Therefore, for any . Using Lemma 2, we obtain that for any . □
Lemma 4.
Let C be a class of k-DRSs closed under the removal of features, and let the value for systems not be bounded from above. If the value for systems is not bounded from above, then for any , there exists a DRS from C with that contains a DR of the length n.
Proof.
Let . Let S be a system from C such that , and r be a DR from S, the length of which is at least n. We remove from r (and from S) features and obtain a DR of the length n. We also remove from S all features that do not belong to the set . As a result, we obtain a DRS from C with containing the DR of length n. □
Lemma 5.
Let C be a class of k-DRSs closed under the removal of features, and let the value for systems not be bounded from above. If the value for systems is bounded from above, then for any , there is a DRS from C with that contains n DRs of length 1 with pairwise different features.
Proof.
Let the value for DRSs be bounded from above by a positive integer l.
Let and . We denote by the set of DRs from S, the length of which is equal to t. We know that the value for DRSs is not bounded from above by a constant. We now show that there exists for which the value for DRSs is not bounded from above by a constant. Assume the contrary. Then, there exists a positive constant b such that, for any , the value for DRSs is at most b. It is clear that if . Therefore, for any , . As a result, we obtain that for any , which is impossible.
We denote by the minimum number for which the value for DRSs is not bounded from above by a constant. We will show that .
Let and . The set will be called the type of the DR r. For , we denote by the number of DRs with pairwise different types such that . Denote .
First, we consider the case when the value for DRSs is bounded from above by a positive integer d. Let and S be a DRS from C for which . Denote . Choose a feature and remove from the DRSs Q and S all features with the exception of , which belong to DRs , such that . As a result, we remove at most features and obtain DRSs and . One can show that , , and . If , choose a feature different from and remove from the DRSs and all features with the exception of , which belong to DRs such that . We denote by and the obtained DRSs, etc. We repeat the described procedure n times and obtain DRSs and . In each of these DRSs, there are n pairwise different features and n DRs such that, for , and . Moreover, . Since n is an arbitrary number from , we obtain that the value for DRSs is not bounded from above by a constant. Therefore, .
We now consider the case when the value for DRSs is not bounded from above by a constant. We will prove that . Let us assume the contrary: . Let . Choose a DRS such that . Let and . Remove the feature from the DRS S. As a result, we obtain a DRS from C, which contains a number of DRs of the length . Let us show that these DRs contain at least n pairwise different features. Let us assume the contrary: the considered DRs contain only pairwise different features. Then, the number of different types of DRs of length in S that contain the feature is at most the number of different subsets of the cardinality of the set of considered m features, which is at most . Evidently, , but this is impossible, since . Taking into account that n is an arbitrary number from , we obtain that the value for DRSs is not bounded from above by a constant. We obtained the contradiction. Thus, .
Let and S be a DRS from C for which . We remove from S all features with the exception of n pairwise different features from . We denote by the obtained DRS from C. It is clear that and contains n DRs of length 1 with pairwise different features. □
We will now formulate and prove the main results of this paper.
Theorem 1.
Let C be a class of k-DRSs closed under the removal of features. Then
(a) If the value for DRSs is bounded from above by a positive constant b, then for any .
(b) Otherwise, , for any .
Proof.
(a) This statement follows from Lemma 3.
(b) Let the value for DRSs be not bounded from above by a positive constant. We now consider two possibilities.
(b.1) Let the value for DRSs be not bounded from above by a constant. Let . We now show that . It is clear that . Let . From Lemma 4, it follows that there exists a DRS from C with containing a DR of the length n.
Let be an NDT over that solves the task and satisfies . Consider a tuple for which the DR is true. Then there exists a complete path that accepts the tuple . By Remark 1, the set associated with the leaf of coincides with , the set of rules in S that are true for . In particular, . Hence, it follows that the relation holds, and therefore and . Since , we obtain . From Lemma 2 we have .
(b.2) Let the value for DRSs be bounded from above by a positive integer l. Let . We now show that for any . It is clear that . Let . From Lemma 5, it follows that there exists a DRS from C with that contains n DRs of length 1 with pairwise different features.
Let be an NDT over , which solves the task and for which . Let be a tuple for which the DRs are true. Then there exists a path , which accepts the tuple . From Remark 1, it follows that the set attached to the leaf vertex of coincides with the set of DRs from S that are true for the tuple . In particular, . From here it follows that the relation holds for . Therefore and . Taking into account that , we obtain . From Lemma 2 it follows that . □
Theorem 2.
Let , and C be a class of k-DRSs closed under the removal of features and rules. Then,
(a) If the value for DRSs is bounded from above by a positive constant b, then for any .
(b) Otherwise, , for any .
Proof.
(a) This statement follows from Lemma 3.
(b) Let the value for DRSs be not bounded from above by a positive constant. We now consider two possibilities.
(b.1) Let the value for DRSs be not bounded from above by a constant. Let . We now show that . It is clear that . Let . From Lemma 4, it follows that there exists a DRS from C with containing a DR of length n. Denote by the DRS obtained from by removal of all DRs with the exception of . Then, .
Let be an NDT over , which solves the task and for which . Let be a tuple for which the DR is true. Then there exists a path , which accepts the tuple . Using Remark 2 and taking into account that there is only , and it is true for , we obtain . From here, we have . Therefore, and . Since , we obtain . From Lemma 2 it follows that .
(b.2) Let the value for DRSs be bounded from above by a positive integer l. Let . We now show that for any . It is clear that . Let . From Lemma 5, it follows that there exists a DRS from C with that contains n DRs of the length 1 with pairwise different features. Denote by the DRS obtained from by removal of all DRs with the exception of . Then, .
Let be an NDT over , which solves the task and for which . Let be a tuple for which none of the DRs is true. Then, there exists a path , which accepts the tuple . Using Remark 3, we obtain . By Remark 2, we see that is inconsistent for . Therefore and . Taking into account that , we obtain . From Lemma 2 it follows that . □
Let , and C be a class of DRSs closed under the operation of removal of features only. We now show that in this case the functions and can be bounded from above by a constant even if the value for DRSs is not bounded from above by a constant.
Let B be a finite subset of the set . Denote . Let us consider the class of 2-DRSs. One can show that the class C is closed under the removal of features, and the value for DRSs is not bounded from above by a constant. Let , B be nonempty, and . Then, the DDT depicted in Figure 1 solves the task . Therefore, the functions and are bounded from above by the constant 1.
Figure 1.
DDT (Deterministic Decision Tree) solving the task , .
4. Conclusions
In this paper, for arbitrary closed classes of DRSs, we investigated the functions characterizing the worst-case dependence of the minimum depth of DDTs and NDTs on the number of different features in the DRS for three tasks:
- Finding all true DRs in a DRS, considering classes closed under the removal of features.
- Finding at least one true DR in a DRS, considering classes closed under the removal of features and rules.
- Finding all right-hand sides of true DRs in a DRS, considering classes closed under the removal of features and rules.
It was proven that, in all three cases, the functions describing the worst-case depth of DTs are either bounded from above by a constant or grow linearly. In the future, we plan to extend this study to other tasks involving DRSs and DTs.
Author Contributions
Conceptualization, M.M.; Methodology, K.D. and M.M.; Formal analysis, K.D. and M.M.; Writing—original draft, K.D. and M.M.; Writing—review & editing, K.D. and M.M.; Supervision, M.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by King Abdullah University of Science and Technology (KAUST).
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments
The research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST).
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. AbouEisha, H.; Amin, T.; Chikalov, I.; Hussain, S.; Moshkov, M. Extensions of Dynamic Programming for Combinatorial Optimization and Data Mining; Intelligent Systems Reference Library; Springer: Cham, Switzerland, 2019; Volume 146.
2. Alsolami, F.; Azad, M.; Chikalov, I.; Moshkov, M. Decision and Inhibitory Trees and Rules for Decision Tables with Many-valued Decisions; Intelligent Systems Reference Library; Springer: Cham, Switzerland, 2020; Volume 156.
3. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984.
4. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: Burlington, MA, USA, 1993.
5. Rokach, L.; Maimon, O. Data Mining with Decision Trees: Theory and Applications; Series in Machine Perception and Artificial Intelligence; World Scientific: Singapore, 2007; Volume 69.
6. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A. Logical analysis of numerical data. Math. Program. 1997, 79, 163–190.
7. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A.; Mayoraz, E.; Muchnik, I.B. An implementation of logical analysis of data. IEEE Trans. Knowl. Data Eng. 2000, 12, 292–306.
8. Chikalov, I.; Lozin, V.V.; Lozina, I.; Moshkov, M.; Nguyen, H.S.; Skowron, A.; Zielosko, B. Three Approaches to Data Analysis: Test Theory, Rough Sets and Logical Analysis of Data; Intelligent Systems Reference Library; Springer: Cham, Switzerland, 2013; Volume 41.
9. Fürnkranz, J.; Gamberger, D.; Lavrac, N. Foundations of Rule Learning; Cognitive Technologies; Springer: Berlin/Heidelberg, Germany, 2012.
10. Pawlak, Z. Rough Sets: Theoretical Aspects of Reasoning about Data; Theory and Decision Library: Series D; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1991; Volume 9.
11. Pawlak, Z.; Skowron, A. Rudiments of rough sets. Inf. Sci. 2007, 177, 3–27.
12. Costa, V.G.; Pedreira, C.E. Recent advances in decision trees: An updated survey. Artif. Intell. Rev. 2023, 56, 4765–4800.
13. Molnar, C. Interpretable Machine Learning. A Guide for Making Black Box Models Explainable, 2nd ed.; 2022; Available online: https://freecomputerbooks.com/Interpretable-Machine-Learning-A-Guide-for-Making-Black-Box-Models-Explainable.html (accessed on 14 July 2025).
14. Moshkov, M. Some relationships between decision trees and decision rule systems. In Proceedings of the Rough Sets and Current Trends in Computing, First International Conference, RSCTC’98, Warsaw, Poland, 22–26 June 1998, Proceedings; Polkowski, L., Skowron, A., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 1998; Volume 1424, pp. 499–505.
15. Moshkov, M. On transformation of decision rule systems into decision trees. In Proceedings of the Seventh International Workshop Discrete Mathematics and its Applications, Moscow, Russia, 29 January–2 February 2001; Part 1. Center for Applied Investigations of Faculty of Mathematics and Mechanics; Moscow State University: Moscow, Russia, 2001; pp. 21–26. (In Russian)
16. Durdymyradov, K.; Moshkov, M.; Ostonov, A. Decision Trees Versus Systems of Decision Rules. A Rough Set Approach; Studies in Big Data; Springer: Cham, Switzerland, 2024; Volume 160.
17. Durdymyradov, K.; Moshkov, M. Deterministic and nondeterministic decision trees for decision rule systems from closed classes. In Proceedings of the 24th International Conference on Artificial Intelligence and Soft Computing (ICAISC 2025), Zakopane, Poland, 22–26 June 2025.
18. Durdymyradov, K.; Moshkov, M. On depth of deterministic and nondeterministic decision trees for decision rule systems from closed classes. In Proceedings of the 29th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2025), Osaka, Japan, 10–12 September 2025.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).