Efficient Modeling of Deterministic Decision Trees for Recognition of Realizable Decision Rules: Bounds on Weighted Depth

Durdymyradov, Kerven; Moshkov, Mikhail

doi:10.3390/axioms14110794

Open Access Article

by Kerven Durdymyradov * and Mikhail Moshkov *

Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia

* Authors to whom correspondence should be addressed.

Axioms 2025, 14(11), 794; https://doi.org/10.3390/axioms14110794

Submission received: 1 October 2025 / Revised: 22 October 2025 / Accepted: 25 October 2025 / Published: 28 October 2025

Abstract

In this paper, an efficient algorithm for modeling the operation of a DDT (Deterministic Decision Tree) solving the problem of realizability of DRs (Decision Rules) is proposed and analyzed. For this problem, it is assumed that a DRS (Decision Rule System) is given; for an arbitrary tuple of feature values, it is required to recognize whether there is a DR realizable on this tuple, i.e., a DR for which the left-hand side is true on the tuple. It is shown that the weighted depth of the modeled DDT does not exceed the square of the minimum weighted depth of the NDT (Nondeterministic Decision Tree) solving the realizability problem.

Keywords: system of decision rules; deterministic decision tree; nondeterministic decision tree; weighted depth

MSC: 68Q10; 68Q25; 68T20

1. Introduction

DTs (Decision Trees) [1,2,3,4] and DRSs (Decision Rule Systems) [5,6,7,8,9] are common tools for structuring and expressing knowledge. They act as classifiers, providing predictions for unseen cases, and are also employed as algorithms in diverse domains such as fault diagnosis and combinatorial optimization. Among classification and knowledge representation models, DTs and DRSs stand out for their high level of interpretability [10,11]. Exploring the connections and transformations between DTs and DRSs has become an important focus of research within computer science. In this work, we continue to develop the syntactic approach to the study of this problem proposed in [12,13]. This approach is based on the assumption that only the DRS is available to us, and not the input data. The results of previous studies in this area are summarized mainly in the book [14] and in the conference papers [15,16,17]. Earlier research has focused on the depth of DTs; in contrast, this paper investigates weighted depth as a measure of the time complexity of DTs.

In this work, we study the following realizability problem. Let a DRS be given. For a tuple of feature values, it is required to recognize whether this DRS contains a DR (Decision Rule) that is realizable for this tuple, i.e., a DR for which the left-hand side is true for this tuple. The tuples under consideration can contain values that do not appear in the DRS. This allows us to model the natural situation when the input tuple for the classifier may contain values that are not present in the training data. In addition, we do not allow the presence in the DRS of two DRs such that the set of conditions from the left-hand side of the first DR is a proper subset of the set of conditions from the left-hand side of the second DR; removing the second DR will not change the solution of the realizability problem.

When addressing the considered problem, it is important to note that in real-life situations we do not have direct access to the tuple of feature values. To determine the value of a feature, it is necessary to compute it for the given input, which can be an expensive procedure.

The complexity of finding feature values is determined by the weight function $w$, which assigns a weight $w(f_i)$ that is a positive integer to each feature $f_i$, $i = 1, \dots, n$, present in the DRs from the DRS. The weight $w(f_i)$ is interpreted as the complexity of finding the value of the feature $f_i$. To minimize the total weight of computed feature values, we consider DDTs (Deterministic Decision Trees) and NDTs (Nondeterministic Decision Trees) solving the considered problem and study their weighted depth.

To clarify the possibilities and limitations of using DTs, we modify examples considered in the papers [15,17] and in Lemma 13.10 of the book [14]. First, we discuss a sequence of DRSs for which the minimum depth of DDTs grows logarithmically in the number of different features in the DRs from the DRS. In such a situation, the use of DTs seems appropriate. We also consider a sequence of DRSs for which the minimum number of vertices in DDTs grows exponentially with the sum of the lengths of the DRs from the DRS. This means that, in the general case, instead of constructing the entire DT, its operation on a given tuple of feature values should be modeled using a sufficiently efficient algorithm.

This paper is devoted to the consideration of a new efficient algorithm for modeling the operation of a DDT solving the realizability problem. The weighted depth of this DT is bounded from below by the minimum weighted depth of an NDT and from above by the square of the minimum weighted depth of an NDT.

Similar upper bounds for the depth were obtained in [18,19,20] for Boolean functions (see [21] for details); in [14] for functions of $k$-valued logic, $k \geq 2$; and in the paper [15] for the problem of finding all realizable DRs. To derive such upper bounds for each of the mentioned cases, an NDT with the minimum depth was considered. Based on this DT, for a given tuple of values of variables or features, the operation of a DDT satisfying the upper bound was described. It is important to note that this description of a DDT cannot be considered an efficient algorithm, since the NDT under consideration can have a huge number of vertices.

Note that this paper is a generalization of a previous conference paper [17]. In [17], we proposed an efficient algorithm for modeling the operation of DDTs that solves the realizability problem. The main contribution was showing that the depth of such DTs is bounded from below by the minimum depth of an NDT and from above by the square of the minimum depth of an NDT. In this paper, we extend that approach by considering the notion of weighted depth, which takes into account the complexity of finding feature values. This generalization better reflects real-world applications, where the complexity of finding feature values might vary.

In the present paper, when creating an algorithm for modeling the operation of a DDT for the realizability problem, it was possible to do without directly using an optimal NDT. As a result, the designed modeling algorithm has a polynomial complexity depending on the length of the description of the DRS.

The structure of the paper is as follows: in Section 2, we provide the key definitions and notation; Section 3 clarifies possibilities and limitations of using DTs; Section 4 considers the minimum weighted depth of NDTs; Section 5 is devoted to the analysis of an efficient algorithm for modeling the operation of a DDT; finally, Section 6 presents brief conclusions.

2. Definitions

This section introduces the notation and key definitions for DRSs and DTs.

2.1. DRSs—Decision Rule Systems

Let $N_0 = \{0, 1, 2, \dots\}$, $k \in N_0 \setminus \{0, 1\}$, and $n \in N_0 \setminus \{0\}$. Denote $E_k = \{0, 1, \dots, k - 1\}$ and $F_n = \{f_1, \dots, f_n\}$. Elements of the set $F_n$ will be called features.

An $(n, k)$-DR is an expression of the form

$$(f_{i_1} = \sigma_1) \land \dots \land (f_{i_t} = \sigma_t) \to \gamma,$$

where $t \in N_0 \setminus \{0\}$, $f_{i_1}, \dots, f_{i_t}$ are pairwise different features from $F_n$, $\sigma_1, \dots, \sigma_t \in E_k$, and $\gamma \in N_0$. We denote this DR by $r$. The number $\gamma$ is interpreted as the decision of the DR $r$. The number $t$ will be called the length of the DR $r$, denoted $l(r)$. We denote $Ft(r) = \{f_{i_1}, \dots, f_{i_t}\}$ and $K(r) = \{f_{i_1} = \sigma_1, \dots, f_{i_t} = \sigma_t\}$.

A restricted $(n, k)$-DRS $S$ is a finite nonempty set of $(n, k)$-DRs such that there are no DRs $r, r' \in S$ for which $K(r) \subset K(r')$, i.e., for which $K(r)$ is a proper subset of $K(r')$. We denote $Ft(S) = \bigcup_{r \in S} Ft(r)$. We can describe the DRS $S$ by a word in the alphabet $\{0, 1, (, ), \land, f, \to, ;\}$ such that the indexes of features, their values, and the decisions of DRs are in binary representations. The sign “;” is used to separate DRs. This word will be called the description of the DRS $S$.

A weight function for the DRS $S$ is a map $w : Ft(S) \to N_0 \setminus \{0\}$. The total weight of features in the DR $r$ will be called the weight of the DR $r$ and is denoted $w(r)$.

Let us fix a restricted $(n, k)$-DRS $S$, with which we will work later. We assume for definiteness that $S = \{r_1, \dots, r_m\}$ and $Ft(S) = F_n$.

For $\bar{v} = (v_1, \dots, v_n) \in E_{k+1}^n$, we denote $K(S, \bar{v}) = \{f_1 = v_1, \dots, f_n = v_n\}$. We will say that a DR $r_i$ from $S$ is realizable for a tuple $\bar{v} \in E_{k+1}^n$ if $K(r_i) \subseteq K(S, \bar{v})$.

For each tuple $\bar{v} \in E_{k+1}^n$, we define the value $dec(\bar{v})$ in the following way: if there is a DR from $S$ that is realizable for $\bar{v}$, then $dec(\bar{v}) = 1$; otherwise, $dec(\bar{v}) = 0$. The problem of realizability $R(S)$ is defined as follows: for a given tuple $\bar{v} \in E_{k+1}^n$, it is required to find the value $dec(\bar{v})$.
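The definitions of realizability and of $dec(\bar{v})$ can be illustrated by a minimal Python sketch. The representation of a DR as a pair (dictionary of conditions, decision), the function names `is_realizable` and `dec`, and the example DRS are assumptions made for this illustration and do not appear in the paper.

```python
from typing import Dict, List, Tuple

# Assumed encoding: a DR is (conditions, decision), where conditions maps
# a 1-based feature index i to the required value of f_i.
Rule = Tuple[Dict[int, int], int]

def is_realizable(rule: Rule, v: Tuple[int, ...]) -> bool:
    """True if K(r) is a subset of K(S, v), i.e., every condition holds on v."""
    conditions, _decision = rule
    return all(v[i - 1] == sigma for i, sigma in conditions.items())

def dec(rules: List[Rule], v: Tuple[int, ...]) -> int:
    """Return 1 if some DR is realizable for v, and 0 otherwise."""
    return int(any(is_realizable(r, v) for r in rules))

# Hypothetical restricted (3, 2)-DRS:
#   (f1 = 0) and (f2 = 1) -> 5
#   (f3 = 0) -> 7
S = [({1: 0, 2: 1}, 5), ({3: 0}, 7)]

print(dec(S, (0, 1, 2)))  # the first DR is realizable for this tuple
print(dec(S, (2, 1, 1)))  # no DR is realizable; the value 2 lies outside E_2
```

The second tuple contains the value 2, which does not occur in any DR of the example; this mirrors the use of tuples from $E_{k+1}^n$ rather than $E_k^n$.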

We denote by $ES(S)$ the set of systems of equations of the form

$$\alpha = \{f_{i_1} = \sigma_1, \dots, f_{i_t} = \sigma_t\},$$

where $t \in N_0$, $f_{i_1}, \dots, f_{i_t} \in F_n$, and $\sigma_1, \dots, \sigma_t \in E_{k+1}$. This system will be called inconsistent if there exist $p, q \in \{1, \dots, t\}$ such that $p \neq q$, $i_p = i_q$, and $\sigma_p \neq \sigma_q$. If the system of equations $\alpha$ is not inconsistent, then it will be called consistent. The total weight of features in the system of equations $\alpha$ will be called the weight of the system $\alpha$, denoted $w(\alpha)$. We will say that a tuple $\bar{v} = (v_1, \dots, v_n) \in E_{k+1}^n$ is a solution of the equation system $\alpha$ if $\alpha \subseteq K(S, \bar{v})$.

We will say that the equation system $\alpha$ supports the decision $d \in \{0, 1\}$ if $\alpha$ has no solutions from $E_{k+1}^n$ or if, for any solution $\bar{v} \in E_{k+1}^n$ of $\alpha$, $dec(\bar{v}) = d$.

2.2. DTs—Decision Trees

A finite rooted directed tree is a finite directed tree with exactly one vertex that has no incoming edges, referred to as the root. Vertices with no outgoing edges are called leaves, while those that are neither root nor leaves are termed internal vertices. A complete path is a sequence $\xi = u_1, e_1, \dots, u_m, e_m, u_{m+1}$, where $u_1$ is the root, $u_{m+1}$ is a leaf, and each $e_i$ connects $u_i$ to $u_{i+1}$ for $i = 1, \dots, m$.

A DT over the problem $R(S)$ is a finite rooted labeled directed tree $G$ with at least two vertices, such that:

The root and its outgoing edges are not labeled.
Every internal vertex of $G$ is labeled by a feature from $Ft(S)$, and the edges leaving it are labeled by elements of $E_{k+1}$.
Each leaf of $G$ is labeled with a decision from the set $\{0, 1\}$.

A DT over the problem $R(S)$ is termed deterministic if exactly one edge leaves the root and, at every internal vertex, the outgoing edges are labeled with pairwise distinct labels.

Let $G$ be a DT over the problem $R(S)$. We denote by $CP(G)$ the set of complete paths in the DT $G$. Let $\xi = u_1, e_1, \dots, u_t, e_t, u_{t+1}$ be a complete path in $G$. We associate with this path an equation system $K(\xi) \in ES(S)$. If $t = 1$ and $\xi = u_1, e_1, u_2$, then $K(\xi) = \emptyset$. Let $t \geq 2$ and, for $j = 2, \dots, t$, let the vertex $u_j$ be labeled with the feature $f_{i_j}$ and the edge $e_j$ be labeled with the number $\sigma_j \in E_{k+1}$. Then, $K(\xi) = \{f_{i_2} = \sigma_2, \dots, f_{i_t} = \sigma_t\}$. We denote by $d(\xi)$ the decision attached to the vertex $u_{t+1}$. We will say that the complete path $\xi$ accepts the tuple $\bar{v} \in E_{k+1}^n$ if $K(\xi) \subseteq K(S, \bar{v})$.

We will say that $G$ solves the problem $R(S)$ nondeterministically if, for any tuple $\bar{v} \in E_{k+1}^n$, there exists a path $\xi \in CP(G)$ which accepts the tuple $\bar{v}$ and if, for any path $\xi \in CP(G)$, the system of equations $K(\xi)$ supports the decision $d(\xi)$. In this case, we will also say that $G$ is an NDT solving the problem $R(S)$.

We will say that $G$ solves the problem $R(S)$ deterministically if $G$ is a DDT that solves the problem $R(S)$ nondeterministically. In this case, we will also say that $G$ is a DDT solving the problem $R(S)$.

For any complete path $\xi \in CP(G)$, we denote by $h(\xi)$ the number of internal vertices in $\xi$ and denote by $h_w(\xi)$ the total weight of features attached to internal vertices of $\xi$. The value $h(G) = \max\{h(\xi) : \xi \in CP(G)\}$ is called the depth of the DT $G$. The value $h_w(G) = \max\{h_w(\xi) : \xi \in CP(G)\}$ is called the weighted depth of the DT $G$.

We denote by $h_w^a(S)$ the minimum weighted depth of an NDT over the problem $R(S)$ which solves this problem. We denote by $h_w^d(S)$ the minimum weighted depth of a DDT over the problem $R(S)$ which solves this problem. It is clear that $h_w^a(S) \leq h_w^d(S)$.

3. Two Sequences of DRSs (Decision Rule Systems)

In this section, we consider two sequences of DRSs. For each DRS in the first sequence, the minimum depth of a DDT is significantly less than the number of features in the DRS. This example shows that using DTs is reasonable. For each DRS in the second sequence, the number of vertices in any DDT is exponential in the sum of the lengths of the DRs in the DRS. This example shows that, rather than constructing the entire DT, it is more reasonable to model its operation for a given tuple of feature values.

Let us begin with the first sequence of DRSs. Let $t \in N_0 \setminus \{0, 1\}$. A complete binary tree of depth $t$ is a finite rooted directed tree in which each non-leaf vertex has exactly two outgoing edges and the length of each complete path is equal to $t$. The vertex set of this tree is naturally divided into $t + 1$ levels: for $i = 0, \dots, t$, the $i$th level contains all vertices that are located at distance $i$ from the root. It is clear that each level $i$ includes $2^i$ vertices. Consequently, the number of non-leaf vertices equals $2^0 + \dots + 2^{t-1} = 2^t - 1$, while the number of leaf vertices is $2^t$.

Let $H_t$ denote a labeled complete binary tree of depth $t$, where non-leaf vertices are assigned features $f_1, \dots, f_{2^t - 1}$ and leaf vertices are labeled with integers $1, \dots, 2^t$. For every non-leaf vertex, its outgoing edges are labeled by 0 and 1, respectively. For $j = 1, \dots, 2^t$, we define a DR $r_j$ as follows. Consider a complete path $u_0, e_0, \dots, u_{t-1}, e_{t-1}, u_t$ in $H_t$ ending at the leaf vertex labeled $j$, where for each $i = 0, \dots, t - 1$, the vertex $u_i$ is labeled with feature $f_{l_i}$ and the connecting edge $e_i$ is labeled with a number $v_i \in E_2$. Then, the corresponding DR $r_j$ is given by

$$(f_{l_0} = v_0) \land \dots \land (f_{l_{t-1}} = v_{t-1}) \to j.$$

We denote the set of all such DRs by $R_t = \{r_1, \dots, r_{2^t}\}$. It is straightforward to see that $|Ft(R_t)| = 2^t - 1$. For the DRS $R_t$, we will consider tuples of values of features from the set $E_3^{2^t - 1}$.

Next, let us analyze the problem $R(R_t)$ and show that $h^d(R_t) \leq t$, where $h^d$ denotes the minimum depth of a DDT solving the problem. We transform the tree $H_t$ into a DDT $G_t$ over the problem $R(R_t)$. For each $j = 1, \dots, 2^t$, we replace the label $j$ attached to a leaf vertex of $H_t$ with the label 1. For each non-leaf vertex $w$ of $H_t$, we add to the tree $H_t$ a vertex $u_w$ and an edge $e_w$ that leaves the vertex $w$ and enters the vertex $u_w$. The edge $e_w$ is labeled with the number 2 and the vertex $u_w$ is labeled with the number 0. We add to the tree $H_t$ a vertex $v$ and an edge $d$ that leaves the vertex $v$ and enters the root of $H_t$. Both $v$ and $d$ are unlabeled. It can be shown that the obtained DDT $G_t$ solves the problem $R(R_t)$ and has depth $t$.

Hence, for every $t \in N_0 \setminus \{0, 1\}$, we obtain a DRS $R_t$ satisfying $|Ft(R_t)| = 2^t - 1$ for which there exists a DDT $G_t$ solving $R(R_t)$ with depth equal to $t$.
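As an illustration only, the construction of $R_t$ and the operation of $G_t$ can be sketched in Python. The heap-style numbering of features (the children of the vertex labeled $f_i$ are labeled $f_{2i}$ and $f_{2i+1}$), the left-to-right numbering of leaves, and the names `build_R`, `dec`, and `run_G` are assumptions of this sketch; the text does not fix a particular labeling of $H_t$.

```python
from itertools import product

def build_R(t):
    """DRs of R_t as (conditions, decision); conditions map feature index -> value."""
    rules = []
    for j in range(1, 2 ** t + 1):
        node, conds = 1, {}
        for level in range(t):
            bit = ((j - 1) >> (t - 1 - level)) & 1  # edge label on the path to leaf j
            conds[node] = bit
            node = 2 * node + bit
        rules.append((conds, j))
    return rules

def dec(rules, v):
    """Brute-force value dec(v): 1 if some DR is realizable for v, else 0."""
    return int(any(all(v[i - 1] == s for i, s in c.items()) for c, _ in rules))

def run_G(t, v):
    """Follow G_t on the tuple v; return (decision, number of computed features)."""
    node = 1
    for computed in range(1, t + 1):
        value = v[node - 1]
        if value == 2:            # extra edge labeled 2 leads to a leaf labeled 0
            return 0, computed
        node = 2 * node + value   # edges labeled 0 and 1 as in H_t
    return 1, t                   # every original leaf of H_t is relabeled with 1

t = 3
rules = build_R(t)
n = 2 ** t - 1
# Exhaustive check over E_3^n that G_t agrees with dec on this small instance.
print(all(run_G(t, v)[0] == dec(rules, v) for v in product(range(3), repeat=n)))
```

On any tuple, `run_G` computes at most $t$ of the $2^t - 1$ features, which is the point of the first example.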

Now, let us move to the second sequence of DRSs. For any $q \in N_0 \setminus \{0\}$, we denote by $U_q$ the DRS $\{(f_{2i-1} = 0) \land (f_{2i} = 0) \to 0 : i = 1, \dots, q\}$. Clearly, $|Ft(U_q)| = 2q$. For $U_q$, we consider tuples of values of features from the set $E_3^{2q}$.

We denote by $\Delta$ the set of tuples $(v_1, \dots, v_{2q}) \in \{0, 2\}^{2q}$ such that $\{v_{2i-1}, v_{2i}\} = \{0, 2\}$ for $i = 1, \dots, q$. It is clear that $|\Delta| = 2^q$ and that for any $\bar{v} \in \Delta$ there is no DR from $U_q$ that is realizable for the tuple $\bar{v}$. Let $G$ be a DDT over the problem $R(U_q)$ which solves this problem. Let $\bar{v}, \bar{\sigma} \in \Delta$ and $\bar{v} \neq \bar{\sigma}$. It is clear that there exist complete paths $\xi$ and $\tau$ in $G$ such that $\xi$ accepts $\bar{v}$ and $\tau$ accepts $\bar{\sigma}$. Let us now show that $\xi \neq \tau$. Assume the contrary, i.e., $\xi = \tau$. It is clear that the equation system $K(\xi)$ supports the decision 0. Because $\bar{v} \neq \bar{\sigma}$, there exists $i \in \{1, \dots, q\}$ such that the $(2i-1)$th and $(2i)$th components of the tuples $\bar{v}$ and $\bar{\sigma}$ are different. Since $\xi$ accepts both tuples, the features $f_{2i-1}$ and $f_{2i}$ are not attached to any vertex of the path $\xi$. Using this fact, it is easy to show that there exists a tuple $\bar{\gamma} \in E_3^{2q}$ such that $\bar{\gamma}$ is a solution of the equation system $K(\xi)$ and the DR $(f_{2i-1} = 0) \land (f_{2i} = 0) \to 0$ from $U_q$ is realizable for the tuple $\bar{\gamma}$; however, this is impossible. Therefore, $\xi \neq \tau$. Thus, there are at least $2^q$ pairwise different complete paths in the tree $G$.

As a result, for each $q \in N_0 \setminus \{0\}$ we obtain an example of a DRS $U_q$ which consists of $q$ DRs of length 2 and for which any DDT solving the problem $R(U_q)$ has at least $2^q$ vertices.
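The set $\Delta$ used in this argument can be enumerated directly for small $q$. The following Python sketch checks the two facts stated above, namely that $|\Delta| = 2^q$ and that no DR from $U_q$ is realizable for a tuple from $\Delta$. The helper names `build_U`, `delta`, and `realizable` are hypothetical, and the check is a sanity test on one value of $q$, not a proof.

```python
from itertools import product

def build_U(q):
    """Conditions of the DRs of U_q: (f_{2i-1} = 0) and (f_{2i} = 0), i = 1..q."""
    return [{2 * i - 1: 0, 2 * i: 0} for i in range(1, q + 1)]

def delta(q):
    """All tuples in {0, 2}^{2q} whose i-th pair is (0, 2) or (2, 0)."""
    return [tuple(x for pair in choice for x in pair)
            for choice in product([(0, 2), (2, 0)], repeat=q)]

def realizable(conds, v):
    """True if every condition f_i = s of the DR holds on the tuple v."""
    return all(v[i - 1] == s for i, s in conds.items())

q = 4
tuples = delta(q)
print(len(tuples) == 2 ** q)                                     # |Delta| = 2^q
print(any(realizable(c, v) for c in build_U(q) for v in tuples)) # no DR is realizable
```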

4. On the Minimum Weighted Depth of NDTs (Nondeterministic Decision Trees)

In this section, we return to the study of the restricted $(n, k)$-DRS $S = \{r_1, \dots, r_m\}$ with $Ft(S) = \{f_1, \dots, f_n\}$ and prove three lemmas related to the minimum weighted depth of an NDT solving the problem $R(S)$.

Let $\bar{v} \in E_{k+1}^n$. A system of equations $\alpha \in ES(S)$ will be called a certificate for the tuple $\bar{v}$ if $\alpha$ supports the decision $dec(\bar{v})$ and if $\bar{v}$ is a solution of $\alpha$.

Let $dec(\bar{v}) = 1$. We denote $cer(\bar{v}) = K(r_i)$, where $r_i$ is a DR from $S$ with the minimum weight that is realizable for $\bar{v}$. Let $dec(\bar{v}) = 0$. We denote by $cer(\bar{v})$ a subsystem with the minimum weight of the system $K(S, \bar{v})$ such that $cer(\bar{v}) \cup K(r_i)$ is inconsistent for each DR $r_i \in S$.

Lemma 1.

For any tuple $\bar{v} \in E_{k+1}^n$, the equation system $cer(\bar{v})$ is a certificate for $\bar{v}$ with the minimum weight.

Proof.

Let $\bar{v} \in E_{k+1}^n$, $dec(\bar{v}) = 1$, $\alpha \in ES(S)$, and $\bar{v}$ be a solution of $\alpha$. We now show that $\alpha$ is a certificate for $\bar{v}$ if and only if $K(r_i) \subseteq \alpha$ for some DR $r_i \in S$ that is realizable for $\bar{v}$. If such a DR exists, then evidently $\alpha$ is a certificate for $\bar{v}$. Let there be no such DR. We change the value of each feature in $\bar{v}$ that does not belong to $\alpha$ to $k$. As a result, we obtain a tuple $\bar{v}' \in E_{k+1}^n$ such that $dec(\bar{v}') = 0$ and $\bar{v}'$ is a solution of $\alpha$. Therefore, $\alpha$ is not a certificate for $\bar{v}$. From here, it follows that the equation system $K(r_i)$, where $r_i$ is a DR from $S$ with the minimum weight that is realizable for $\bar{v}$, is a certificate for $\bar{v}$ with the minimum weight.

Let $\bar{v} \in E_{k+1}^n$, $dec(\bar{v}) = 0$, and $\alpha \in ES(S)$; additionally, let $\bar{v}$ be a solution of $\alpha$. We now show that $\alpha$ is a certificate for $\bar{v}$ if and only if the system of equations $\alpha \cup K(r_i)$ is inconsistent for each DR $r_i \in S$. If $\alpha \cup K(r_i)$ is inconsistent for each DR $r_i \in S$, then evidently $\alpha$ is a certificate for $\bar{v}$. Let the system $\alpha \cup K(r_i)$ be consistent for some DR $r_i \in S$. In $\bar{v}$, change the values of all features belonging to $r_i$ to values from $K(r_i)$. As a result, we obtain a tuple $\bar{v}'$ from $E_{k+1}^n$ which is a solution of $\alpha$ and for which $dec(\bar{v}') = 1$. Therefore, $\alpha$ is not a certificate for $\bar{v}$. From here, it follows that a subsystem $\alpha$ with the minimum weight of the system $K(S, \bar{v})$ such that $\alpha \cup K(r_i)$ is inconsistent for each DR $r_i \in S$ is a certificate for $\bar{v}$ with the minimum weight. □

Lemma 2.

$h_w^a(S) = \max\{w(cer(\bar{v})) : \bar{v} \in E_{k+1}^n\}$.

Proof.

Let $G$ be an NDT over the problem $R(S)$ which solves this problem. Let $\bar{v} \in E_{k+1}^n$. Then, $G$ has a complete path $\xi$ accepting $\bar{v}$. This means that $\bar{v}$ is a solution of the equation system $K(\xi)$. Therefore, the system of equations $K(\xi)$ supports the decision $dec(\bar{v})$. Thus, $K(\xi)$ is a certificate for $\bar{v}$, and by Lemma 1 we have $h_w(\xi) \geq w(cer(\bar{v}))$. As a result, we obtain $h_w^a(S) \geq \max\{w(cer(\bar{v})) : \bar{v} \in E_{k+1}^n\}$.

Let us now describe an NDT $G_0$ over the problem $R(S)$. The set of complete paths of $G_0$ is equal to $\{\xi_{\bar{v}} : \bar{v} \in E_{k+1}^n\}$, where $K(\xi_{\bar{v}}) = cer(\bar{v})$ and the leaf vertex of $\xi_{\bar{v}}$ is labeled with $dec(\bar{v})$. From Lemma 1, it follows that the complete path $\xi_{\bar{v}}$ accepts the tuple $\bar{v}$ and that the equation system $K(\xi_{\bar{v}})$ supports the decision $dec(\bar{v})$. Therefore, $G_0$ solves the problem $R(S)$. Thus, $h_w^a(S) \leq \max\{w(cer(\bar{v})) : \bar{v} \in E_{k+1}^n\}$. □

For $b \in \{0, 1\}$, we denote $h^b(S) = \max\{|cer(\bar{v})| : \bar{v} \in E_{k+1}^n, dec(\bar{v}) = b\}$ and $h_w^b(S) = \max\{w(cer(\bar{v})) : \bar{v} \in E_{k+1}^n, dec(\bar{v}) = b\}$.

Corollary 1.

$h_w^a(S) = \max\{h_w^0(S), h_w^1(S)\}$.

Lemma 3.

$h_w^1(S) = \max\{w(r_i) : r_i \in S\}$.

Proof.

Let us show that $\{cer(\bar{v}) : \bar{v} \in E_{k+1}^n, dec(\bar{v}) = 1\} = \{K(r_1), \dots, K(r_m)\}$. It is clear that for each $\bar{v} \in E_{k+1}^n$ such that $dec(\bar{v}) = 1$, we have $cer(\bar{v}) \in \{K(r_1), \dots, K(r_m)\}$. Let $r_i \in S$. We consider a tuple $\bar{v} \in E_{k+1}^n$ such that values of features from $r_i$ are equal to corresponding values from $r_i$ and values of all other features are equal to $k$. It is clear that $r_i$ is the only DR from $S$ that is realizable for $\bar{v}$. Therefore, $cer(\bar{v}) = K(r_i)$. As a result, we obtain $\{cer(\bar{v}) : \bar{v} \in E_{k+1}^n, dec(\bar{v}) = 1\} = \{K(r_1), \dots, K(r_m)\}$. From here, it follows that $h_w^1(S) = \max\{w(r_i) : r_i \in S\}$. □
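For very small DRSs, the quantities of this section can be computed by exhaustive search, which may help in checking the lemmas on concrete instances. The sketch below is a brute-force illustration under assumed names (`realizable`, `weight`, `cer`, `h_w`) and an assumed toy DRS with weights; it enumerates all tuples and all feature subsets, so it is exponential and is not the efficient algorithm studied in the paper.

```python
from itertools import combinations, product

def realizable(conds, v):
    """True if every condition f_i = s of the DR holds on the tuple v."""
    return all(v[i - 1] == s for i, s in conds.items())

def weight(features, w):
    """Total weight of an iterable of feature indices."""
    return sum(w[i] for i in features)

def cer(rules, w, v):
    """A minimum-weight certificate for v, following the definition of cer(v)."""
    real = [c for c in rules if realizable(c, v)]
    if real:                                  # dec(v) = 1: cheapest realizable DR
        return dict(min(real, key=lambda c: weight(c, w)))
    n, best = len(v), None                    # dec(v) = 0: cheapest subsystem of K(S, v)
    for size in range(1, n + 1):              # that contradicts every DR
        for feats in combinations(range(1, n + 1), size):
            hits_all = all(any(i in c and c[i] != v[i - 1] for i in feats)
                           for c in rules)
            if hits_all and (best is None or weight(feats, w) < weight(best, w)):
                best = feats
    return {i: v[i - 1] for i in best}

def h_w(rules, w, k, n, b):
    """h_w^b(S): maximum weight of cer(v) over tuples v in E_{k+1}^n with dec(v) = b."""
    return max(weight(cer(rules, w, v), w)
               for v in product(range(k + 1), repeat=n)
               if int(any(realizable(c, v) for c in rules)) == b)

# Assumed toy (3, 2)-DRS (decisions omitted) with weights w(f1)=1, w(f2)=5, w(f3)=1.
rules = [{1: 0, 2: 0}, {2: 1, 3: 0}]
w = {1: 1, 2: 5, 3: 1}
print(cer(rules, w, (1, 2, 1)))               # two cheap features beat the single costly f2
print(h_w(rules, w, 2, 3, 1) == max(weight(c, w) for c in rules))  # Lemma 3 on this instance
print(max(h_w(rules, w, 2, 3, 0), h_w(rules, w, 2, 3, 1)))         # h_w^a via Corollary 1
```

For the tuple $(1, 2, 1)$, the single equation $f_2 = 2$ contradicts both DRs but has weight 5, whereas $\{f_1 = 1, f_3 = 1\}$ has weight 2; this shows how a minimum-weight certificate can differ from a minimum-cardinality one.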

5. Algorithm Simulating the Operation of a DDT (Deterministic Decision Tree)

In this section, we continue to study the restricted $(n, k)$-DRS $S = \{r_1, \dots, r_m\}$ with $Ft(S) = \{f_1, \dots, f_n\}$. Let $\bar{v} = (v_1, \dots, v_n) \in E_{k+1}^n$. We now consider an algorithm that describes the operation on the tuple $\bar{v}$ of a DDT $G(S)$ over the problem $R(S)$ which solves this problem. As a result, we obtain the description of a complete path $\tau(\bar{v})$ in the DT $G(S)$ that accepts the tuple $\bar{v}$. The set of complete paths of the DT $G(S)$ coincides with the set $\{\tau(\bar{v}) : \bar{v} \in E_{k+1}^n\}$. The text in square brackets below is not a description of the actions of the algorithm and the DT; it is used only to prove a statement about Algorithm 1.

Algorithm 1 Simulation of DDT operation

Step 1. For $i = 1, \dots, m$, set $Q_i := K(r_i)$. Set $W := \{Q_1, \dots, Q_m\}$. [For any $\bar{v} \in E_{k+1}^n$ with $dec(\bar{v}) = 0$, set $P(\bar{v}) := cer(\bar{v})$. Set $V := \{P(\bar{v}) : \bar{v} \in E_{k+1}^n, dec(\bar{v}) = 0\}$.]
Step 2. If $W = \emptyset$, then $G(S)$ returns the decision 0 and stops. If $W$ contains an empty system of equations, then $G(S)$ returns the decision 1 and stops. Let $W \neq \emptyset$ and $W$ contain no empty systems of equations. Choose a system $Q_j$ from $W$ with the minimum number $j$. Let $f_{i_1}, \dots, f_{i_p}$ be all features from $Q_j$ and $i_1 < \dots < i_p$. The DT $G(S)$ computes values of features $f_{i_1}, \dots, f_{i_p}$ and obtains the system of equations $\alpha = \{f_{i_1} = v_{i_1}, \dots, f_{i_p} = v_{i_p}\}$. For each $Q_l \in W$, we remove $Q_l$ from $W$ if the system $Q_l \cup \alpha$ is inconsistent. Otherwise, we set $Q_l := Q_l \setminus \alpha$. [For each $P(\bar{v}) \in V$, we remove $P(\bar{v})$ from $V$ if the system $P(\bar{v}) \cup \alpha$ is inconsistent. Otherwise, we set $P(\bar{v}) := P(\bar{v}) \setminus \alpha$.]
Return to Step 2.
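The non-bracketed part of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `simulate_ddt`, the dictionary encoding of systems of equations, the `query` callback that models computing a feature value, and the example DRS with its weights are all assumptions. The bracketed bookkeeping with $V$ is omitted because it is used only in the proof.

```python
from typing import Callable, Dict, List, Tuple

def simulate_ddt(rules: List[Dict[int, int]],
                 w: Dict[int, int],
                 query: Callable[[int], int]) -> Tuple[int, int, List[int]]:
    """Return (decision, total weight of computed features, order of computed features)."""
    W = [dict(conds) for conds in rules]      # Step 1: Q_i := K(r_i)
    cost, order = 0, []
    while True:                               # Step 2, repeated until a decision is made
        if not W:
            return 0, cost, order             # W is empty: decision 0
        if any(not Q for Q in W):
            return 1, cost, order             # W contains an empty system: decision 1
        alpha = {}
        for i in sorted(W[0]):                # W keeps its order, so W[0] has minimum index j
            alpha[i] = query(i)               # compute the value of feature f_i
            cost += w[i]
            order.append(i)
        remaining = []
        for Q in W:
            if any(i in alpha and alpha[i] != s for i, s in Q.items()):
                continue                      # Q_l together with alpha is inconsistent: remove
            remaining.append({i: s for i, s in Q.items() if i not in alpha})
        W = remaining

# Assumed toy (3, 2)-DRS (decisions omitted) with weights w(f1)=1, w(f2)=5, w(f3)=1.
rules = [{1: 0, 2: 0}, {2: 1, 3: 0}]
w = {1: 1, 2: 5, 3: 1}
v = (0, 1, 0)
print(simulate_ddt(rules, w, lambda i: v[i - 1]))
```

Each feature is queried at most once, because equations on computed features are removed from every remaining system; passing `query` as a callback keeps the sketch close to the setting in which feature values are expensive to obtain and are computed only on demand.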

Remark 1.

It is possible to show that the time complexity of Algorithm 1 is polynomial depending on the length of the description of DRS S.

Theorem 1.

The DDT $G(S)$ solves the problem $R(S)$ and satisfies the inequality $h_w(G(S)) \leq h^0(S) h_w^1(S)$.

Proof.

First, we show that $G(S)$ solves the problem $R(S)$. Let $\bar{v} \in E_{k+1}^n$. It is clear that the complete path $\tau(\bar{v})$ in the DT $G(S)$ accepts $\bar{v}$. Let $dec(\bar{v}) = 0$. We can show that the equation system $K(\tau(\bar{v})) \cup K(r_i)$ is inconsistent for each DR $r_i \in S$. Then, $d(\tau(\bar{v})) = 0$. It is clear that for each solution $\bar{\sigma} \in E_{k+1}^n$ of the system $K(\tau(\bar{v}))$ we have $dec(\bar{\sigma}) = 0$. Therefore, $K(\tau(\bar{v}))$ supports the decision 0. Let $dec(\bar{v}) = 1$. We can show that $K(r_i) \subseteq K(\tau(\bar{v}))$ for some DR $r_i \in S$. Then, $d(\tau(\bar{v})) = 1$. It is clear that for each solution $\bar{\sigma} \in E_{k+1}^n$ of the system $K(\tau(\bar{v}))$, we have $dec(\bar{\sigma}) = 1$. Therefore, $K(\tau(\bar{v}))$ supports the decision 1. Thus, the DT $G(S)$ solves the problem $R(S)$.

Let us consider the work of the DT $G(S)$ on a tuple $\bar{v} \in E_{k+1}^n$. By Lemma 3, during each complete repetition of Step 2, the DT $G(S)$ computes the values of features for which the total weight is at most $h_w^1(S)$. We now show that the number of complete repetitions of Step 2 is at most $h^0(S)$.

Let $r_i \in S$, $\bar{\sigma} \in E_{k+1}^n$, and $dec(\bar{\sigma}) = 0$. From the definition of $cer(\bar{\sigma})$, it follows that the system of equations $K(r_i) \cup cer(\bar{\sigma})$ is inconsistent. From here, it follows that before each repetition of Step 2 of Algorithm 1, for each nonempty system $Q_l \in W$ and each nonempty system $P(\bar{\sigma}) \in V$, the system of equations $Q_l \cup P(\bar{\sigma})$ is inconsistent. Using this fact, it can be shown that after each complete repetition of Step 2, each system of equations from $V$ will either be removed from $V$ or its cardinality will decrease by at least 1. Evidently, the cardinality of each system from $V$ before the first repetition of Step 2 is at most $h^0(S)$. Therefore, after $h^0(S)$ complete repetitions of Step 2, the set $V$ will be empty or will contain only empty systems of equations.

We denote by $\beta$ the system of equations consisting of all features computed by $G(S)$ during $h^0(S)$ complete repetitions of Step 2 with their values from $\bar{v}$.

Let $V$ contain an empty system. Then, $cer(\bar{\rho}) \subseteq \beta$ for some tuple $\bar{\rho} \in E_{k+1}^n$ such that $dec(\bar{\rho}) = 0$. Since $cer(\bar{\rho}) \cup K(r_i)$ is inconsistent for each DR $r_i \in S$, the set $W$ is empty and $G(S)$ stops. Let $V$ be empty. Then, the system $\beta \cup cer(\bar{\rho})$ is inconsistent for any $\bar{\rho} \in E_{k+1}^n$ such that $dec(\bar{\rho}) = 0$. Let us show that $K(r_i) \subseteq \beta$ for some DR $r_i \in S$. Assume the contrary, i.e., that for each DR $r_i \in S$, $K(r_i)$ is not a subset of $\beta$. In the tuple $\bar{v}$, we change the values of all features that do not belong to $\beta$ to $k$. We denote the obtained tuple by $\bar{v}'$. It is clear that $\bar{v}'$ is a solution of the equation system $\beta$ and $dec(\bar{v}') = 0$; however, this is impossible, as the system $\beta \cup cer(\bar{v}')$ is inconsistent. As a result, we have that $G(S)$ stops after at most $h^0(S)$ complete repetitions of Step 2. Thus, $h_w(G(S)) \leq h^0(S) h_w^1(S)$. □

Corollary 2.

$h_w(G(S)) \leq (h_w^a(S))^2$.

Proof.

This statement follows from the fact that $h^0(S) \leq h_w^0(S)$, Theorem 1, and Corollary 1. □
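Spelled out, the chain of inequalities behind Corollary 2 reads as follows. The justification of the second step is our reading of the definitions: every feature has weight at least 1, so $|cer(\bar{v})| \leq w(cer(\bar{v}))$ for every tuple $\bar{v}$.

```latex
\begin{align*}
h_w(G(S)) &\le h^0(S)\, h_w^1(S)
    && \text{(Theorem 1)} \\
  &\le h_w^0(S)\, h_w^1(S)
    && \text{(since } h^0(S) \le h_w^0(S)\text{)} \\
  &\le \bigl(\max\{h_w^0(S),\, h_w^1(S)\}\bigr)^2 \\
  &= \bigl(h_w^a(S)\bigr)^2
    && \text{(Corollary 1)}.
\end{align*}
```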

We denote by $w^{\max}(S)$ the maximum weight of a DR from $S$.

Corollary 3.

$h_w(G(S)) \leq w^{\max}(S) h_w^a(S)$.

Proof.

This statement follows from Theorem 1, Lemma 3, and Corollary 1. □

From Theorem 1, it follows that $h_w^d(S) \leq h_w^0(S) h_w^1(S)$. We now show that this bound is unimprovable even for depth, i.e., when $h_w = h$.

Proposition 1.

For any $p, q \in N_0 \setminus \{0\}$, there exists a restricted $(pq, 2)$-DRS $S_{pq}$ such that $h^1(S_{pq}) = p$, $h^0(S_{pq}) = q$, and $h^d(S_{pq}) = pq$.

Proof.

Let $p, q \in N_0 \setminus \{0\}$. We now consider a restricted $(pq, 2)$-DRS $S_{pq} = \{r_1, \dots, r_q\}$, where for $i = 1, \dots, q$ the DR $r_i$ is equal to

$$(f_{p(i-1)+1} = 0) \land \dots \land (f_{p(i-1)+p} = 0) \to i.$$

It is clear that $Ft(S_{pq}) = \{f_1, \dots, f_{pq}\}$ and that the values of all features from the DRs $r_1, \dots, r_q$ belong to the set $E_2$. For the DRS $S_{pq}$, we will consider tuples of values of features from the set $E_3^{pq}$.

The length of each DR from $S_{pq}$ is equal to $p$. Using Lemma 3, we obtain that $h^1(S_{pq}) = p$. Let $\bar{v} \in E_3^{pq}$ and $dec(\bar{v}) = 0$, i.e., let all DRs from $S_{pq}$ be non-realizable for the tuple $\bar{v}$. It is clear that $Ft(r_i) \cap Ft(r_j) = \emptyset$ for any two DRs $r_i, r_j \in S_{pq}$ such that $i \neq j$. Using this fact, it is easy to show that the minimum cardinality of a certificate for $\bar{v}$ is equal to $q$. Therefore, $h^0(S_{pq}) = q$.

It is clear that $h^d(S_{pq}) \leq |Ft(S_{pq})| = pq$. Let us show that $h^d(S_{pq}) \geq pq$. To this end, we consider a DDT $G$ over the problem $R(S_{pq})$ which solves this problem and for which $h(G) = h^d(S_{pq})$. We analyze the work of $G$ on a tuple whose feature values are chosen adaptively as follows. Let $G$ compute the value of feature $f_t$ from DR $r_s$. If the values of all other features from $r_s$ have already been computed, then $f_t = 2$; otherwise, $f_t = 0$. If $G$ does not compute the values of all features, then the solution to the realizability problem is not determined. Therefore, $h(G) \geq pq$ and $h^d(S_{pq}) \geq pq$. Thus, $h^d(S_{pq}) = pq$. □
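The adaptive choice of feature values used in this proof can be played out against one concrete strategy. The sketch below, with the hypothetical names `build_S` and `queries_against_adversary`, runs the rule-by-rule strategy of Algorithm 1 on $S_{pq}$ against such an adversary and counts the computed features. It only illustrates the argument for this one strategy and is not a proof of the lower bound for all DDTs.

```python
def build_S(p, q):
    """Conditions of the DRs of S_pq: DR r_i requires f_{p(i-1)+1} = ... = f_{p(i-1)+p} = 0."""
    return [{p * (i - 1) + j: 0 for j in range(1, p + 1)} for i in range(1, q + 1)]

def queries_against_adversary(p, q):
    """Return (decision, number of computed features) for the rule-by-rule strategy."""
    rules = build_S(p, q)
    rule_of = {f: idx for idx, conds in enumerate(rules) for f in conds}
    asked = set()

    def adversary(f):
        # Answer 2 only when f is the last uncomputed feature of its DR, else 0.
        asked.add(f)
        others = set(rules[rule_of[f]]) - {f}
        return 2 if others <= asked else 0

    W = [dict(c) for c in rules]
    queries = 0
    while W and all(W):                       # stop when W is empty or has an empty system
        alpha = {}
        for f in sorted(W[0]):
            alpha[f] = adversary(f)
            queries += 1
        W = [{i: s for i, s in Q.items() if i not in alpha}
             for Q in W
             if all(alpha.get(i, s) == s for i, s in Q.items())]
    return (0 if not W else 1), queries

print(queries_against_adversary(3, 2))        # all p*q features are computed
```

Against this adversary, every DR stays undecided until its last feature is computed, so the strategy computes all $pq$ features before returning the decision 0.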

It follows from Proposition 1 that we cannot improve the accuracy bound for Algorithm 1 provided by Theorem 1 and that we cannot find algorithms with better accuracy bounds based on parameters $h_{w}^{0}(S)$ and $h_{w}^{1}(S)$.
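As an illustration, the smallest instance of the construction from the proof of Proposition 1 with more than one DR and more than one feature per DR is $p = q = 2$. The values below are read off directly from the proposition; the notation $S_{2 \cdot 2}$ for $S_{pq}$ with $p = q = 2$ is introduced here only for this example.

```latex
\begin{align*}
S_{2 \cdot 2} &= \{r_{1}, r_{2}\}, \\
r_{1} &:\ (f_{1} = 0) \land (f_{2} = 0) \to 1, \\
r_{2} &:\ (f_{3} = 0) \land (f_{4} = 0) \to 2, \\
h_{1}(S_{2 \cdot 2}) &= 2, \qquad h_{0}(S_{2 \cdot 2}) = 2, \\
h^{d}(S_{2 \cdot 2}) &= 4 = h_{1}(S_{2 \cdot 2}) \cdot h_{0}(S_{2 \cdot 2}).
\end{align*}
```

In this instance the deterministic depth equals the product of the two parameters, so no DDT for this DRS can have smaller depth than that product.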

6. Conclusions

In this paper, an algorithm for modeling the operation of a DDT solving the problem of realizability is proposed. The running time of this algorithm is polynomial in the length of the description of the DRS under consideration. The weighted depth of the modeled DDT does not exceed the square of the minimum weighted depth of an NDT solving the realizability problem. In the future, we plan to design similar algorithms for some other problems related to DRSs. In the more distant future, we plan to conduct an experimental study of such algorithms.

Author Contributions

Conceptualization, K.D. and M.M.; Methodology, K.D. and M.M.; Formal analysis, K.D. and M.M.; Writing – original draft, K.D. and M.M.; Writing – review & editing, K.D. and M.M.; Supervision, K.D. and M.M.; Funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

Research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST).

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

We are very grateful to the anonymous reviewers for their helpful comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; CRC: New York, NY, USA, 1984.
2. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: Burlington, MA, USA, 1993.
3. Rokach, L.; Maimon, O. Data Mining with Decision Trees—Theory and Applications; Series in Machine Perception and Artificial Intelligence; World Scientific Publishing Co., Inc.: River Edge, NJ, USA, 2007; Volume 69.
4. Murthy, S.K.; Salzberg, S. Decision Tree Induction: How Effective is the Greedy Heuristic? In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, QC, Canada, 20–21 August 1995; Fayyad, U.M., Uthurusamy, R., Eds.; AAAI Press: Washington, DC, USA, 1995; pp. 222–227.
5. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A. Logical analysis of numerical data. Math. Program. 1997, 79, 163–190.
6. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A.; Mayoraz, E.; Muchnik, I.B. An Implementation of Logical Analysis of Data. IEEE Trans. Knowl. Data Eng. 2000, 12, 292–306.
7. Fürnkranz, J.; Gamberger, D.; Lavrac, N. Foundations of Rule Learning; Cognitive Technologies: Arlington, VA, USA, 2012.
8. Pawlak, Z. Rough Sets—Theoretical Aspects of Reasoning about Data; Theory and Decision Library: Series D; Kluwer Academic Publishers: Norwell, MA, USA, 1991; Volume 9.
9. Pawlak, Z.; Skowron, A. Rudiments of rough sets. Inf. Sci. 2007, 177, 3–27.
10. Costa, V.G.; Pedreira, C.E. Recent advances in decision trees: An updated survey. Artif. Intell. Rev. 2023, 56, 4765–4800.
11. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed. 2022. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 2 September 2025).
12. Moshkov, M. Some Relationships between Decision Trees and Decision Rule Systems. In Proceedings of the Rough Sets and Current Trends in Computing, First International Conference, RSCTC’98, Warsaw, Poland, 22–26 June 1998; Proceedings; Lecture Notes in Computer Science. Polkowski, L., Skowron, A., Eds.; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1424, pp. 499–505.
13. Moshkov, M. On transformation of decision rule systems into decision trees. In Proceedings of the Seventh International Workshop Discrete Mathematics and its Applications, Moscow, Russia, 29 January–2 February 2001; Moscow State University: Moscow, Russia, 2001; pp. 21–26. (In Russian).
14. Durdymyradov, K.; Moshkov, M.; Ostonov, A. Decision Trees Versus Systems of Decision Rules: A Rough Set Approach; Studies in Big Data; Springer: Cham, Switzerland, 2024; Volume 160.
15. Durdymyradov, K.; Moshkov, M. Deterministic and nondeterministic decision trees for recognition of all realizable decision rules. In Proceedings of the Intelligent Information and Database Systems. ACIIDS, Kitakyushu, Japan, 23–25 April 2025; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2025; Volume 15684, pp. 3–17.
16. Durdymyradov, K.; Moshkov, M. Deterministic and Nondeterministic Decision Trees for Recognition of Properties of Decision Rule Systems. In Proceedings of the Rough Sets: International Joint Conference, IJCRS 2025, Chongqing, China, 11–13 May 2025; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2025; Volume 15709, pp. 426–435.
17. Durdymyradov, K.; Moshkov, M. Efficient Modeling of Deterministic Decision Trees for Recognition of Realizable Decision Rules. In Proceedings of the 51st International Conference on Current Trends in Theory and Practice of Computer Science SOFSEM 2026, Kraków, Poland, 9–13 February 2026. (submitted).
18. Blum, M.; Impagliazzo, R. Generic Oracles and Oracle Classes (Extended Abstract). In Proceedings of the 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, USA, 27–29 October 1987; IEEE Computer Society: Washington, DC, USA, 1987; pp. 118–126.
19. Hartmanis, J.; Hemachandra, L.A. One-way functions, robustness, and the non-isomorphism of NP-complete sets. In Proceedings of the Second Annual Conference on Structure in Complexity Theory, Cornell University, Ithaca, NY, USA, 16–19 June 1987; IEEE Computer Society: Washington, DC, USA, 1987.
20. Tardos, G. Query complexity, or why is it difficult to separate NP^A ∩ coNP^A from P^A by random oracles A? Combinatorica 1989, 9, 385–392.
21. Buhrman, H.; de Wolf, R. Complexity measures and decision tree complexity: A survey. Theor. Comput. Sci. 2002, 288, 21–43.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
