# Mathematical Foundation of a Functional Implementation of the CNF Algorithm

Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Algorithms 2023, 16(10), 459; https://doi.org/10.3390/a16100459
Original submission received: 14 August 2023 / Revised: 24 September 2023 / Accepted: 24 September 2023 / Published: 27 September 2023
(This article belongs to the Special Issue Mathematical Models and Their Applications IV)

## Abstract

The conjunctive normal form (CNF) algorithm is one of the best known and most widely used algorithms in classical logic and its applications. In its algebraic approach, it repeatedly applies, inside a loop, a well-defined operation related to the “distributivity” of logical disjunction over conjunction. In implementations of this type, each loop iteration compares formulas to decide when to stop. In this article, we explain how to pre-calculate the exact number of loop iterations, thus avoiding the work involved in that comparison. After that, it is possible to concatenate another loop, focused now on the “associativity” of conjunction and disjunction. For that loop too, we explain how to calculate the optimal number of rounds, so that the comparison used to decide when to stop can also be avoided.

## 1. Introduction

The conjunctive normal form (CNF) is famous in the history of thought because it organizes discourse by normalizing potentially chaotic statements through the conjunction of a series of statements in the form of disjunctions of other simple statements, namely atomic statements, or the negation of them.
CNF has proven to be essential in the treatment of the SAT problem (from “satisfiability”, usually abbreviated to SAT), which in turn is at the core of automated theorem proving, thanks to the effectiveness of the resolution rule and what is known about the treatment of Horn clauses (see [1]).
The other great utility of CNF lies in the minimization of Boolean expressions under the condition of being expressed as a product of sums (POS) (see [2,3]). It is clear that the POS criterion is the dual concept of the sum of products (SOP) criterion.
In general, transforming a formula into CNF is the essence of Petrick’s method, an algorithm widely used in various fields: cybernetics, economics, linguistics, philosophy, psychology, etc. For an explanation of the algorithm and detailed applications, see [2] (p. 157), plus extensive comments and applications of the CNF algorithm. In [4] (pp. 69–71), we find a brilliant application of Petrick’s method to finite Boolean algebra.
Given a propositional formula or a Boolean expression, it is possible to obtain for it an equivalent formula or expression, as the case may be, in CNF by semantic means or by algebraic manipulations. Both procedures are described in [5,6,7]; however, algebraic manipulation may be faster.
If we focus on the method of syntactic analysis for obtaining CNF, we will consider propositional logic formulas, without loss of generality, in the appropriate language. In this line of thought, the essence of the algorithm is quite simple: after internalizing negation and eliminating double negation, the main task is to replace subformulas of the form $( α ∨ ( β ∧ γ ) )$ by their equivalent $( ( α ∨ β ) ∧ ( α ∨ γ ) )$ (distributivity). For the sake of efficiency, we will use Polish notation as introduced by Jan Łukasiewicz in [8] (pp. 33–34). Łukasiewicz selected K (resp. A) to represent conjunction (resp. disjunction). Disjunction was originally called “alternation” by Łukasiewicz (in Polish “alternacja”, hence the symbol “A”). The word “conjunction” in Polish is “koniunkcja”, hence the symbol “K”. Our notation in this work, which is classical notation, follows this Łukasiewicz guideline because it has the great advantage of avoiding parentheses while remaining unambiguous. Furthermore, Polish notation is ideal for transitioning to a functional implementation in the Haskell language given the peculiarities of its syntax (see [9]). Since the functor A stands for disjunction and the functor K for conjunction in standard Polish notation, the application of distributivity translates $A α K β γ$ to $K A α β A α γ$.
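As a small illustration of why Polish notation needs no parentheses, the following Python sketch reads a formula such as `A x K y z` into a tree and rewrites the root by distributivity. It is only a sketch under stated assumptions: variables are single characters other than `A` and `K`, formulas are encoded as nested tuples, and the names `parse`, `show` and `distribute_at_root` are illustrative and are not taken from the authors' Haskell implementation.

```python
def parse(polish):
    """Parse a Polish-notation string such as 'AxKyz' into a nested tuple."""
    symbols = polish.replace(" ", "")

    def go(i):
        if i >= len(symbols):
            raise ValueError("unexpected end of formula")
        c = symbols[i]
        if c in ("A", "K"):          # binary functor: read its two arguments
            left, i = go(i + 1)
            right, i = go(i)
            return (c, left, right), i
        return c, i + 1              # assumption: any other symbol is a variable

    tree, end = go(0)
    if end != len(symbols):
        raise ValueError("trailing symbols after a complete formula")
    return tree


def show(f):
    """Write a formula tree back in Polish notation."""
    return f if isinstance(f, str) else f[0] + show(f[1]) + show(f[2])


def distribute_at_root(f):
    """Rewrite A a K b c as K A a b A a c when the root has that shape."""
    if not isinstance(f, str) and f[0] == "A":
        a, r = f[1], f[2]
        if not isinstance(r, str) and r[0] == "K":
            return ("K", ("A", a, r[1]), ("A", a, r[2]))
    return f


print(show(distribute_at_root(parse("A x K y z"))))
```

Unique readability is what lets `parse` work with a single left-to-right pass and no lookahead.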
With regard to the CNF algorithm, the current state of the art can be found, for example, at [5,6]. To fix ideas, we will focus on [5] (p. 26) and look at Algorithm 2.3, called TREE-EQ-CNF. Its pseudocode contains the following snippet:
• ...
• {
•   do {
•    $α_{orig} = α$
•   …
•   } while ($α_{orig} \neq α$);
•   return($α$)
• }
Therefore, we note that the basis of the algorithm is a while loop with the sentinel condition $α_{orig} ≠ α$, where $α_{orig}$ is a copy of the value of $α$ at the start of the current round. Each round of the while loop thus requires a comparison between the formula being transformed in that round and the result of its transformation, and the process stops when the two match. The basis of this work is to avoid all these comparisons, the number of which depends on the case, and instead to carry out a single inspection of the formula initially passed as a parameter to the algorithm; that single inspection, ultimately related to “distributivity”, gives the minimum number of rounds for the correct operation of the loop. However, the formula produced by the CNF algorithm is not satisfactory until it is expressed canonically, with the functors K and A left-loaded. This has to do with “associativity” and leads us a second time to a situation in which the minimum number of rounds of another while loop must be calculated beforehand; that calculation will be carried out with the function akr defined below in Section 4.
The basis of the classical algorithm, which is essentially the substitution of subformulas by equivalent ones, does not change in our work, but we will provide a very formal presentation of the substitution process in the algorithm. That presentation will be based on the theory specialized in recursive constructions, namely the lambda calculus.
Finally, as a result of the work, we provide a functional implementation (https://github.com/ringstellung/CNF, accessed on 23 September 2023) of the algorithm in the Haskell programming language with these and other contributions.
In summary, the sections of this article contain the following. Since the aim of this paper is the manipulation of formulas over a language, in Section 2, we suggest the rigorous definition of language and formula. Four subsets of formulas generated by a non-empty subset of formulas according to appropriate rules are then suggested; essentially they are the support for defining the concept of clause and formula in conjunctive normal forms. The section continues by giving several different concepts of the complexity of a formula. The  rest of the section is devoted to defining the concept of semantic equivalence of formulas, according to classical propositional logic; some examples; and the statement of a classical result on distributivity. Negation is not mentioned because it is not relevant to the theoretical framework of the article. Section 3 is devoted to the treatment of the distributivity that, in a broad sense of the term, classical propositional logic contains between disjunction and conjunction. In that section, the function dak is defined (see Figure 1), which, when applied to formulas, manages to reduce the alternation in them of the connector K over A in a unit; this alternation is measured in each formula by the function alt. The section concludes by showing that the alternation of each formula is precisely the minimum number of applications of dak to it, in order to obtain another equivalent formula in conjunctive normal form. Practical laboratory experience in the field of logical deduction indicates that the associativity of the connectives K and A must be taken into account in order to obtain a canonical form, which, in Polish notation, accumulates these connectives at the beginning of the formula. The rigorous treatment of this matter is the aim of Section 4. In writing it, we were inspired by the structure of Section 3; however, the intrinsic theory is appreciably more complex. 
It is all based on two “measures” on formulas that essentially express how far the formula is from that canonical form; the maximum of these two values is also important. In that section, we justify the expressive capacity of the aforementioned measures to characterise the membership of the subsets of formulas defined in Section 2. The counterpart for the associativity of the dak function is defined, namely the lasc function (see Figure 2). This time, the application of the lasc function decreases the separation measure of the formula to the canonical form by half; in that sense, it serves to make big steps. A certain natural value based on the logarithm in base two gives the minimum number of applications of lasc to the formula to obtain its desired canonical form. The last section, Section 5, is for the conclusions.

## 2. Basic Definitions and Preliminary Results

In this section, we list the main definitions of the basic concepts that will be used in the development of this work, as well as the essential results needed in it.
Here, the concept of sentential or propositional language is that of J.D. Monk in [10] (see also [11,12]), but our functors (in the sense of Jan Łukasiewicz in [8]) will be A and K. Moreover, it is necessary to require that X, the set of atomic or propositional variables, be a countably infinite set; its elements are denoted by the last lower-case letters of the Latin alphabet: x, y, z, etc., with subscripts if necessary. We will call $L_{KA}$ the above propositional language and $P ( L_{KA} )$, or simply $P ( L )$, the set of its formulas or sentences. The reader should be thoroughly familiar with the principle of induction for sentences, the construction sequence for sentences, and the unique readability principle. We will also assume knowledge of the principle of finite induction in its different formulations, as presented, for example, in [6].
The statements of some particular results (lemmas and theorems), in this and subsequent sections, will be given without complete proofs, because some of them are quite evident, and the rest could be carried out using the principle of finite induction, taking into account, in a meticulous and orderly manner, all the possible cases that can occur, in a similar way as it is carried out in the selected proofs included in this paper.
Given a set $Δ$ of formulas of the language $L$, consider the smallest set of formulas that contains it and is closed under the functor A (resp. K); it will be denoted by $A ( Δ )$ (resp. $K ( Δ )$). Thus, the elements of $K ( A ( X ) )$ are exactly the formulas in conjunctive normal form. Clauses are exactly the formulas of the smallest set, $al ( X )$, that contains X and is closed under the functor A whenever it operates on an element of X to its right. The set $kl ( X )$ (resp. $lcnf ( X )$) is the smallest set that contains X (resp. $al ( X )$) and is closed under the functor K whenever it operates on an element of X (resp. $al ( X )$) to its right.
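The closure definitions above translate directly into recursive membership tests. The following Python sketch assumes a hypothetical encoding in which a variable is a string and a compound formula is a tuple `(functor, left, right)`; the predicate names mirror the sets in the text but are otherwise illustrative.

```python
def is_var(f):
    return isinstance(f, str)

def in_A(f):      # A(X): variables closed under A
    return is_var(f) or (f[0] == "A" and in_A(f[1]) and in_A(f[2]))

def in_KA(f):     # K(A(X)): conjunctive normal form
    return in_A(f) or (not is_var(f) and f[0] == "K"
                       and in_KA(f[1]) and in_KA(f[2]))

def in_al(f):     # al(X): A applied only with a variable on its right
    return is_var(f) or (f[0] == "A" and in_al(f[1]) and is_var(f[2]))

def in_kl(f):     # kl(X): K applied only with a variable on its right
    return is_var(f) or (f[0] == "K" and in_kl(f[1]) and is_var(f[2]))

def in_lcnf(f):   # lcnf(X): K applied only with a clause of al(X) on its right
    return in_al(f) or (not is_var(f) and f[0] == "K"
                        and in_lcnf(f[1]) and in_al(f[2]))

left_clause = ("A", ("A", "x", "y"), "z")    # A A x y z
right_clause = ("A", "x", ("A", "y", "z"))   # A x A y z
print(in_al(left_clause), in_al(right_clause))
```

Both sample clauses belong to $A ( X )$, but only the left-nested one belongs to $al ( X )$; this is the distinction that the canonical form of Section 4 relies on.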
Remark 1.
Note that $al ( X ) ∪ kl ( X ) ⊆ lcnf ( X )$ and that $lcnf ( X ) ⊆ K ( A ( X ) )$.
Definition 1.
For all $α ∈ P ( L_{KA} )$, let $comp ( α )$ (complexity) be the natural number defined as follows:
and $comp_{k} ( α )$ (complexity in K) defined by
If necessary, consider $comp_{a} ( α )$ as the dual concept of $comp_{k} ( α )$.
Consider the semantic consequence ⊧ in the classical sense, as in, for example, Ref. [5]. The formulas $α$ and $β$ are equivalent, in symbols $α = β$, iff by definition $α ⊧ β$ and $β ⊧ α$. In the following, we will use the symbol = to indicate equivalence and the symbol ≡ to indicate syntactic equality (verbatim equal). It is well known that if $α = A φ ψ$ and $β = A φ ξ$, then $A φ K ψ ξ = K α β$.
Remark 2.
It is well known, as a basic theorem of classical logic, that if $α = A φ ψ$ (resp. $α = K φ ψ$), then $A φ A ψ ξ = A α ξ$ (resp. $K φ K ψ ξ = K α ξ$).

## 3. Distributivity

The aim now is, given a formula $φ$ in $P ( L )$, to find $φ_{cnf}$ in $K ( A ( X ) )$ such that both are equivalent. This will be the basis of the algorithm, and the first thing to do is to determine how far $φ$ is from $K ( A ( X ) )$. As we shall see shortly, the expected measure is given by the function $alt$, which indicates the alternation, in its formula argument, of the symbols K and A from the innermost subformulas outwards.
Definition 2.
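Let α be any formula in $P ( L )$. The alternation of α, in symbols $alt ( α )$, is defined by
$alt ( α ) = \begin{cases} 0 & \text{if } α ∈ X \\ \max \{ alt ( φ ) , alt ( ψ ) \} & \text{if } α ≡ K φ ψ \\ 1 + \max \{ alt ( φ ) , alt ( ψ ) \} & \text{if } α ≡ A φ ψ \text{ and } K ∈ α \\ 0 & \text{if } α ≡ A φ ψ \text{ and } K ∉ α \end{cases}$
where $K ∈ α$ means that the functor K occurs in α.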
In Lemma 1, we characterise the meaning of “belong to the set $K ( A ( X ) )$” by means of the map alt. As we shall see, the formulas for which $alt ( α ) = 0$ are exactly those of the set $K ( A ( X ) )$.
Lemma 1.
Let $α ∈ P ( L )$. The following statements are equivalent:
1.
$alt ( α ) = 0$.
2.
$α ∈ K ( A ( X ) )$.
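A possible reading of the measure alt in code is sketched below. The recursive clauses are an assumption reconstructed from the identities used in the proof of Theorem 1 (a variable or a K-free formula has alternation 0, K takes the maximum of its arguments, and A adds one whenever K occurs below it); the tuple encoding and the names `has_K`, `alt` and `in_KA` are illustrative.

```python
def has_K(f):
    """True when the functor K occurs somewhere in f."""
    return not isinstance(f, str) and (f[0] == "K" or has_K(f[1]) or has_K(f[2]))

def alt(f):
    """Alternation of K under A (clauses assumed from the proof of Theorem 1)."""
    if isinstance(f, str):
        return 0
    op, left, right = f
    m = max(alt(left), alt(right))
    if op == "K":
        return m
    return 1 + m if has_K(f) else 0   # op == "A"

def in_KA(f):
    """Membership in K(A(X)), i.e. conjunctive normal form."""
    if isinstance(f, str):
        return True
    if f[0] == "A":
        return not has_K(f)
    return in_KA(f[1]) and in_KA(f[2])

samples = [
    "x",
    ("A", "x", "y"),
    ("K", ("A", "x", "y"), "z"),
    ("A", ("K", "x", "y"), "z"),
    ("A", "x", ("A", ("K", "y", "z"), "w")),
    ("K", ("A", "x", ("K", "y", "z")), "w"),
]
for f in samples:
    print(alt(f), in_KA(f))
```

On these samples the two columns agree with Lemma 1: the alternation is 0 exactly for the formulas already in conjunctive normal form.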
Definition 3
(distributivity). Consider the following compound rules on binary relationships between formulas of $P ( L )$:
$x_i →_{dak} x_i \quad (1)$
$\dfrac{A φ ξ →_{dak} α \qquad A ψ ξ →_{dak} β}{A K φ ψ ξ →_{dak} K α β} \quad (2)$
$\dfrac{A ξ φ →_{dak} α \qquad A ξ ψ →_{dak} β}{A ξ K φ ψ →_{dak} K α β} \quad (3)$
$\dfrac{φ →_{dak} φ' \qquad ψ →_{dak} ψ'}{A φ ψ →_{dak} A φ' ψ'} \quad (4)$
$\dfrac{φ →_{dak} φ' \qquad ψ →_{dak} ψ'}{K φ ψ →_{dak} K φ' ψ'} \quad (5)$
where (2)–(4) are applied with the precedence indicated by the order in which they are given. This being so, we define the following map:
$dak : P ( L ) ⟶ P ( L )$
by
$dak ( φ ) ≡ ψ , \text{ provided that } φ →_{dak} ψ$
Remark 3.
dak is a well-defined map, because a clear and unambiguous precedence has been established among the rules on which its definition is based. For example, note that by Rule (2), $dak ( A K x y z ) ≡ K dak ( A x z ) dak ( A y z )$; dak is a recursive process that stops at propositional variables thanks to Rule (1). Moreover, it is clear from Remark 2 that for all $ζ ∈ P ( L )$, $dak ( ζ ) = ζ$ (semantic equality).
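The rules of Definition 3, with their precedence, can be sketched as a single recursive function. The following Python version assumes the hypothetical tuple encoding `(functor, left, right)` with strings as variables; it is not the authors' Haskell code, only a transcription of Rules (1)–(5) tried in the stated order.

```python
def dak(f):
    """One pass of distributivity, trying Rules (1)-(5) in order."""
    if isinstance(f, str):                            # Rule (1): variables
        return f
    op, left, right = f
    if op == "A":
        if not isinstance(left, str) and left[0] == "K":      # Rule (2)
            return ("K", dak(("A", left[1], right)), dak(("A", left[2], right)))
        if not isinstance(right, str) and right[0] == "K":    # Rule (3)
            return ("K", dak(("A", left, right[1])), dak(("A", left, right[2])))
    return (op, dak(left), dak(right))                # Rules (4) and (5)

# The example of Remark 3: dak(A K x y z) is K A x z A y z.
print(dak(("A", ("K", "x", "y"), "z")))
```

Because Rule (2) is tried before Rule (3), a formula such as `A K x y K z w` is first split on its left argument, and the right conjunction is then handled by the recursive calls.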
By Lemma 2, $α ∈ A ( X )$ is a sufficient condition for $dak ( α ) ≡ α$, but it is not a necessary condition. Lemma 3, which is a consequence of Lemma 2, weakens the sufficient condition to $α ∈ K ( A ( X ) )$; that this condition is also necessary will be fully concluded in Corollary 1. The proof of Lemma 2 is straightforward.
Lemma 2.
Let α be a formula in $P ( L )$. If  $α ∈ A ( X )$, then $dak ( α ) ≡ α$.
Lemma 3.
Let $α ∈ P ( L )$. If  $α ∈ K ( A ( X ) )$ then $dak ( α ) ≡ α$.
Proof.
Let us assume that $α ∈ K ( A ( X ) )$ and $comp ( α ) = n$. Reasoning by induction on $comp ( α )$ we will show that $dak ( α ) ≡ α$. As an induction hypothesis, suppose that the implication is true for any formula $β ∈ K ( A ( X ) )$ such that $comp ( β ) < n$. If  $α ∈ K ( A ( X ) )$, then two situations are possible:
• $α ∈ A ( X )$; in this case $dak ( α ) ≡ α$ is what Lemma 2 states.
• There exist formulas $φ$ and $ψ$ in $K ( A ( X ) )$, of complexity lower than that of $α$, such that $α ≡ K φ ψ$. For the calculation of $dak ( α )$, the only rule that can be applied first is Rule (5):
$\begin{aligned} dak ( α ) & ≡ dak ( K φ ψ ) \\ & ≡ K dak ( φ ) dak ( ψ ) && \text{by Rule (5)} \\ & ≡ K φ ψ && \text{induc. hyp.} \\ & ≡ α \end{aligned}$
□
As for Theorem 1, in essence its meaning is that in applying dak to a given formula not in $K ( A ( X ) )$, say $α$, according to the “measure” alt, the result is closer to $K ( A ( X ) )$ than $α$.
Theorem 1.
For all $α ∈ P ( L )$,
Proof.
The proof is by induction on the complexity of the formula $α$. Let $α$ be a formula in $P ( L )$ such that $comp ( α ) = n$. Suppose, as an induction hypothesis, that (6) holds for any formula $β$ such that $comp ( β ) < n$. Several cases are possible:
1.
$α ≡ x ∈ X$; in this case, $dak ( α ) ≡ x ∈ X$ and since $X ⊆ K ( A ( X ) )$, we deduce according to Lemma 1 that $alt ( α ) = 0$, which proves the result in this case.
2.
$α ≡ A K φ ψ ξ$; therefore, $α ∉ K ( A ( X ) )$ and
$alt ( α ) = 1 + \max \{ alt ( K φ ψ ) , alt ( ξ ) \} = 1 + \max \{ alt ( φ ) , alt ( ψ ) , alt ( ξ ) \} \quad (7)$
On the other hand, $dak ( α ) ≡ K dak ( A φ ξ ) dak ( A ψ ξ )$ so that
$alt ( dak ( α ) ) = \max \{ alt ( dak ( A φ ξ ) ) , alt ( dak ( A ψ ξ ) ) \} \quad (8)$
For short, we write $β$ for $A φ ξ$ and $γ$ for $A ψ ξ$. Let us bear in mind the following:
(a)
$K ∈ β$; then $alt ( β ) = 1 + \max \{ alt ( φ ) , alt ( ξ ) \}$ and $β ∉ K ( A ( X ) )$. Since $comp ( β ) < comp ( α )$, the induction hypothesis allows us to establish that:
$\begin{aligned} alt ( dak ( β ) ) & = alt ( β ) − 1 \\ & = 1 + \max \{ alt ( φ ) , alt ( ξ ) \} − 1 && (9) \\ & = \max \{ alt ( φ ) , alt ( ξ ) \} && (10) \end{aligned}$
(b)
$K ∉ β$; then $φ , ξ , β ∈ A ( X )$. According to Lemma 2, $dak ( β ) ≡ β$ and, according to Lemma 1,
$alt ( dak ( β ) ) = alt ( β ) = 0 \quad (11)$
$alt ( φ ) = 0 \quad (12)$
$alt ( ξ ) = 0 \quad (13)$
We will now analyse equality (8) on a case-by-case basis:
(a)
$K ∈ β$ and $K ∈ γ$; then
$\begin{aligned} alt ( dak ( α ) ) & = \max \{ alt ( dak ( β ) ) , alt ( dak ( γ ) ) \} && \text{by (8)} \\ & = \max \{ alt ( φ ) , alt ( ψ ) , alt ( ξ ) \} && \text{by (10)} \\ & = alt ( α ) − 1 \end{aligned}$
(b)
$K ∉ β$ and $K ∈ γ$; then
$\begin{aligned} alt ( dak ( α ) ) & = \max \{ alt ( dak ( β ) ) , alt ( dak ( γ ) ) \} && \text{by (8)} \\ & = \max \{ 0 , alt ( dak ( γ ) ) \} && \text{by (11)} \\ & = alt ( dak ( γ ) ) \\ & = \max \{ alt ( ψ ) , alt ( ξ ) \} && \text{by (10)} \\ & = alt ( ψ ) && \text{by (13)} \\ & = \max \{ 0 , alt ( ψ ) , 0 \} \\ & = \max \{ alt ( φ ) , alt ( ψ ) , alt ( ξ ) \} && \text{by (12) and (13)} \\ & = alt ( α ) − 1 && \text{by (7)} \end{aligned}$
(c)
$K ∈ β$ and $K ∉ γ$; this situation is treated as the case in paragraph 2b.
(d)
$K ∉ β$ and $K ∉ γ$; in this case $β , γ ∈ A ( X )$, so by Lemma 2, $dak ( α ) ≡ K dak ( β ) dak ( γ ) ≡ K β γ ∈ K ( A ( X ) )$ and, as Lemma 1 states, $alt ( dak ( α ) ) = 0$. Moreover, $φ , ψ , ξ ∈ A ( X )$, so by (7), $alt ( α ) = 1$; hence $alt ( dak ( α ) ) = alt ( α ) − 1$.
3.
$α ≡ A ξ K φ ψ$; this situation is treated as the case in paragraph 2.
4.
$α ≡ A φ ψ$; neither $φ$ nor $ψ$ begin with K but $K ∈ α$; without loss of generality, suppose that $alt ( ψ ) ≤ alt ( φ )$, whence $K ∈ φ$ and $φ$ begins with A, i.e., $φ ∉ K ( A ( X ) )$. Then $alt ( α ) = 1 + alt ( φ )$ and
$\begin{aligned} alt ( dak ( α ) ) & = alt ( A dak ( φ ) dak ( ψ ) ) && \text{by Rule (4)} \\ & = 1 + \max \{ alt ( dak ( φ ) ) , alt ( dak ( ψ ) ) \} \\ & = 1 + alt ( dak ( φ ) ) \\ & = 1 + alt ( φ ) − 1 && \text{induc. hyp. and conditions of } φ \\ & = alt ( φ ) \\ & = alt ( α ) − 1 \end{aligned}$
5.
$α ≡ K φ ψ$; without loss of generality, suppose that $alt ( ψ ) ≤ alt ( φ )$. If  $alt ( φ ) = 0$, then $alt ( ψ ) = 0$, $φ , ψ , α ∈ K ( A ( X ) )$ and so $alt ( α ) = 0$ (see Lemma 1). If  $alt ( φ ) ≠ 0$, i.e., $φ ∉ K ( A ( X ) )$, then $α ∉ K ( A ( X ) )$. Thus,
$\begin{aligned} alt ( dak ( α ) ) & = alt ( K dak ( φ ) dak ( ψ ) ) && \text{by Rule (5)} \\ & = \max \{ alt ( dak ( φ ) ) , alt ( dak ( ψ ) ) \} && \text{def. of alt} \\ & = alt ( dak ( φ ) ) \\ & = alt ( φ ) − 1 && \text{induc. hyp. and conditions of } φ \\ & = \max \{ alt ( φ ) , alt ( ψ ) \} − 1 \\ & = alt ( α ) − 1 && \text{def. of alt} \end{aligned}$
□
As a consequence of Theorem 1, it follows that the sufficient condition of Lemma 3 is also a necessary condition.
Corollary 1.
Let α be any formula in $P ( L )$. The following statements are equivalent:
1.
$dak ( α ) ≡ α$.
2.
$α ∈ K ( A ( X ) )$.
3.
$alt ( α ) = 0$.
Given any formula $α$ in $P ( L )$, we now know that it is possible to obtain from it another in conjunctive normal form by iterated application of the dak function. Moreover, the minimum number of iterations required is exactly $alt ( α )$. By Remark 3, we know that this other formula in conjunctive normal form is logically equivalent to $α$, the input formula.
Corollary 2.
For all $α ∈ P ( L )$, the natural number $alt ( α )$ is the smallest natural number m satisfying $dak^{m} ( α ) ∈ K ( A ( X ) )$.
Proof.
The proof is by induction on n according to the predicate $Q ( n )$ of the literal content:
• If $α ∈ P ( L )$ and $n = alt ( α )$, then n is the smallest natural number m satisfying $dak^{m} ( α ) ∈ K ( A ( X ) )$.
The reasoning is as follows:
• $n = 0$; if $α ∈ P ( L )$ and $0 = alt ( α )$, then by Corollary 1, we know that $α ∈ K ( A ( X ) )$, i.e., $dak^{0} ( α ) ∈ K ( A ( X ) )$, since $dak^{0}$ is the identity map. Since 0 is the smallest natural number, the set of natural numbers smaller than it is empty, from which we conclude the assertion.
• Suppose that $0 < n$, that $Q ( n − 1 )$ is true, and that $α ∈ P ( L )$ is fixed but arbitrary under the condition that $n = alt ( α )$. As we know from Theorem 1 and Corollary 1, it holds that
$alt ( dak ( α ) ) = alt ( α ) − 1 = n − 1 \quad (14)$
By (14) and the induction hypothesis, we have, in particular, that
$dak^{n} ( α ) = dak^{n − 1} ( dak ( α ) ) ∈ K ( A ( X ) )$
On the other hand, let m be a natural number, such that $m < n$. Three cases can occur:
*
$m = n − 1$; then:
$alt ( dak^{n − 1} ( α ) ) = alt ( α ) − ( n − 1 ) = n − n + 1 = 1$
so (see Corollary 1) $dak^{n − 1} ( α ) ∉ K ( A ( X ) )$.
*
$0 < m < n − 1$; by Theorem 1, we know that $alt ( dak ( α ) ) = n − 1$, and since $m − 1 < m < n − 1$, by the induction hypothesis, we have
$dak^{m} ( α ) = dak^{m − 1} ( dak ( α ) ) ∉ K ( A ( X ) )$
*
$m = 0$; $dak^{0} ( α ) = α$, and since $alt ( α ) = n > 0$, we deduce that $dak^{0} ( α ) ∉ K ( A ( X ) )$.
By the principle of finite induction, we deduce that $Q ( n )$ is true for any natural number n. Since the function alt can be applied to any formula, the result is true.    □
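Corollary 2 is what allows the comparison-driven loop to be replaced by a counted loop. The Python sketch below contrasts the two styles. It assumes the hypothetical tuple encoding `(functor, left, right)` with strings as variables, a definition of alt reconstructed from the identities used in the proof of Theorem 1, and illustrative function names; it is not the authors' Haskell implementation.

```python
def has_K(f):
    return not isinstance(f, str) and (f[0] == "K" or has_K(f[1]) or has_K(f[2]))

def alt(f):
    # Assumed clauses: 0 on variables and K-free formulas, max under K,
    # 1 + max under A when K occurs below it.
    if isinstance(f, str):
        return 0
    m = max(alt(f[1]), alt(f[2]))
    return m if f[0] == "K" else (1 + m if has_K(f) else 0)

def dak(f):
    if isinstance(f, str):
        return f
    op, left, right = f
    if op == "A":
        if not isinstance(left, str) and left[0] == "K":
            return ("K", dak(("A", left[1], right)), dak(("A", left[2], right)))
        if not isinstance(right, str) and right[0] == "K":
            return ("K", dak(("A", left, right[1])), dak(("A", left, right[2])))
    return (op, dak(left), dak(right))

def cnf(f):
    """Counted loop: inspect the input once, then apply dak alt(f) times."""
    for _ in range(alt(f)):
        f = dak(f)
    return f

def cnf_by_comparison(f):
    """Sentinel loop: compare the formula with its transform in every round."""
    while True:
        g = dak(f)
        if g == f:
            return f
        f = g

f = ("A", "x", ("A", ("K", "y", "z"), "w"))   # A x A K y z w
print(alt(f), cnf(f))
```

For this input the counted loop runs exactly two rounds, whereas the sentinel loop needs a third call to dak and three structural comparisons before it can stop.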

## 4. Associativity

By iterating dak from any formula, we obtain, as we have seen, a formula equivalent to it that is in conjunctive normal form. However, for a given formula in conjunctive normal form there may be several others, also in conjunctive normal form, that are equivalent to it yet syntactically distinct from one another. In this section, we intend to provide an algorithm to select, among all those formulas, one that we will consider to be in canonical form. For practical reasons, we will consider the set $lcnf ( X )$, the set of formulas in conjunctive left normal form, as the one that gathers exactly all the formulas in canonical form.
Definition 4.
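Let $ar : P ( L ) ⟶ \mathbb{Z}$ be defined as follows:
$ar ( α ) = \begin{cases} − 1 & \text{if } α ∈ X \\ \max \{ ar ( φ ) , 1 + ar ( ψ ) \} & \text{if } α ≡ A φ ψ \\ \max \{ ar ( φ ) , ar ( ψ ) \} & \text{if } α ≡ K φ ψ \end{cases}$
and let $kr : P ( L ) ⟶ \mathbb{Z}$ be defined as follows:
$kr ( α ) = \begin{cases} − 1 & \text{if } α ∈ X \\ \max \{ kr ( φ ) , kr ( ψ ) \} & \text{if } α ≡ A φ ψ \\ \max \{ kr ( φ ) , 1 + kr ( ψ ) \} & \text{if } α ≡ K φ ψ \end{cases}$
Also let $akr : P ( L ) ⟶ \mathbb{Z}$ be defined as follows:
$akr ( α ) = \max \{ ar ( α ) , kr ( α ) \}$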
Remark 4.
The maps ar and kr have the properties given in Lemma 4. This lemma characterises the elements of $K ( X )$, $A ( X )$, and X. The particular assignment, in both functions, of the value $− 1$ to the elements of X is made only to adjust the final computations accordingly.
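The two measures can be sketched in a few lines of Python. The recursive clauses used here are an assumption taken from the identities that appear in the proof of Lemma 4 (for ar) and from their dual (for kr): a variable has value −1, and a functor adds one only when it is nested in the right argument of the same functor. The tuple encoding and the function names are illustrative.

```python
def ar(f):
    """Depth of right-nesting of A (assumed clauses, see the lead-in)."""
    if isinstance(f, str):
        return -1
    op, left, right = f
    if op == "A":
        return max(ar(left), 1 + ar(right))
    return max(ar(left), ar(right))

def kr(f):
    """Depth of right-nesting of K, the dual of ar."""
    if isinstance(f, str):
        return -1
    op, left, right = f
    if op == "K":
        return max(kr(left), 1 + kr(right))
    return max(kr(left), kr(right))

def akr(f):
    return max(ar(f), kr(f))

right_nested = ("K", "x", ("K", "y", ("K", "z", "w")))   # K x K y K z w
left_nested = ("K", ("K", ("K", "x", "y"), "z"), "w")    # K K K x y z w
print(akr(right_nested), akr(left_nested))
```

The right-nested conjunction has measure 2, while its left-nested rearrangement has measure 0, which is the situation that the canonical form aims for.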
Lemma 4.
For all $α ∈ P ( L )$:
1.
$α ∈ K ( X )$ if, and only if, $ar ( α ) = − 1$.
2.
$α ∈ A ( X )$ if, and only if, $kr ( α ) = − 1$.
3.
$α ∈ X$ if, and only if, $kr ( α ) = − 1 = ar ( α )$.
4.
$K ( X ) ∩ A ( X ) = X$.
Proof.
Let us prove statement 1. First we will reason by induction according to the complexity of $α$ and according to the predicate $Q ( n )$ of the literal content:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k such that $k < n$, $Q ( k )$ holds. We have the following cases:
• $n = 0$; then let —as the only case of interest— $α ≡ x ∈ X$. By Definition 4, $ar ( α ) = − 1$, so $Q ( 0 )$ is true.
• $n > 0$; if $α ∈ K ( X )$ and $n > 0$, there must exist $φ , ψ ∈ K ( X )$ such that $α ≡ K φ ψ$. Then
so $Q ( n )$ is true in this case.
By the second principle of finite induction, $Q ( n )$ holds for every natural number n, and hence the implication follows. Conversely, let us now consider the predicate $Q ( n )$:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k, such that $k < n$, $Q ( k )$ holds. We have the following cases:
• $n = 0$; then let —as the only case of interest— $α ≡ x ∈ X$. Since $X ⊆ K ( X )$ it follows that $Q ( 0 )$ is true.
• $n > 0$; let $α ∈ P ( L )$ such that $comp ( α ) = n$ and $ar ( α ) = − 1$. In principle, the following are possible:
-
there exist $φ , ψ ∈ P ( L )$, such that $α ≡ K φ ψ$; then
$\begin{aligned} − 1 = ar ( α ) = \max \{ ar ( φ ) , ar ( ψ ) \} & ⇒ ar ( φ ) = − 1 = ar ( ψ ) \\ & ⇒ φ , ψ ∈ K ( X ) && \text{induc. hyp.} \\ & ⇒ α ∈ K ( X ) \end{aligned}$
-
there exist $φ , ψ ∈ P ( L )$, such that $α ≡ A φ ψ$; then
$− 1 = ar ( α ) = \max \{ ar ( φ ) , 1 + ar ( ψ ) \} ≥ 0$; therefore, this case is not possible
so that $Q ( n )$ is true.
By the second principle of finite induction, for any natural number n, $Q ( n )$ holds and hence the implication. Statement 2 can be proved with the same scheme as above. Suppose now that $kr ( α ) = − 1 = ar ( α )$. Since $ar ( α ) = − 1$, we have that $α ∈ K ( X )$ and if there exist $φ , ψ ∈ P ( L )$, such that $α ≡ K φ ψ$, then one would have
which is absurd, so $α ∈ X$. The reciprocal statement is obviously true and it follows that $K ( X ) ∩ A ( X ) = X$.    □
The proof of Lemma 5 is straightforward from Lemma 4.
Lemma 5.
For all $α ∈ P ( L )$:
1.
If $α ∈ K ( X ) ∖ X$ then $0 ≤ kr ( α )$.
2.
If $α ∈ A ( X ) ∖ X$ then $0 ≤ ar ( α )$.
3.
$ar ( α ) = − 1$ and $kr ( α ) = − 1$ if, and only if, $α ∈ X$.
Remark 5.
The respective reciprocal statements of the first two sentences of Lemma 5 are not true. Indeed, $kr ( A K x y x ) = 0$ (resp. $ar ( K A x y x ) = 0$), and yet $A K x y x ∉ K ( X ) ∖ X$ (resp. $K A x y x ∉ A ( X ) ∖ X$).
Lemma 6 characterises the elements of $kl ( X ) ∖ X$ and $al ( X ) ∖ X$. Its proof can be carried out by induction by making a careful distinction of cases.
Lemma 6.
For all $α ∈ P ( L )$,
1.
$ar ( α ) = − 1$ and $kr ( α ) = 0$ if, and only if, $α ∈ kl ( X ) ∖ X$.
2.
$ar ( α ) = 0$ and $kr ( α ) = − 1$ if, and only if, $α ∈ al ( X ) ∖ X$.
Lemma 7.
For all $α ∈ K ( A ( X ) )$, the  following statements are equivalent:
1.
$ar ( α ) = 0$ and $kr ( α ) = 0$.
2.
$α ∈ lcnf ( X ) ∖ ( al ( X ) ∪ kl ( X ) )$.
Remark 6.
The formula $α ≡ A K x y x$ satisfies $ar ( α ) = 0 = kr ( α )$, but  $α ∉ lcnf ( X )$; hence the need for the restriction in the statement of Lemma 7.
By means of akr, the above technical lemmas make it possible to characterise in Theorem 2 the set $lcnf ( X )$ of formulas in left conjunctive normal form. Note how ≤ appears in the statement, again highlighting the subtle role played by $− 1$ in the definition of akr.
Theorem 2.
For all $α ∈ K ( A ( X ) )$, the  following statements are equivalent:
1.
$akr ( α ) ≤ 0$.
2.
$α ∈ lcnf ( X )$.
Proof.
Let us first show that statement 1 is a sufficient condition for statement 2. The following cases are possible:
1.
$ar ( α ) = − 1 = kr ( α )$; as stated in Lemma 5, $α ∈ X$.
2.
$ar ( α ) = − 1$ and $kr ( α ) = 0$; as stated in Lemma 6, $α ∈ kl ( X ) ∖ X$.
3.
$ar ( α ) = 0$ and $kr ( α ) = − 1$; as stated in Lemma 6, $α ∈ al ( X ) ∖ X$.
4.
$ar ( α ) = 0$ and $kr ( α ) = 0$; as stated in Lemma 7, $α ∈ lcnf ( X ) ∖ ( al ( X ) ∪ kl ( X ) )$.
In every case, $α ∈ lcnf ( X )$. Conversely, that statement 1 is a necessary condition for statement 2 follows from the aforementioned lemmas.    □
Now everything is ready to carry out the accumulation of the functors A and K on the left side of the formula without changing its logical meaning. This task will be carried out by the function lasc, defined in Definition 5, by means of convenient iterations. The lasc function is the classical one, but formulated here univocally in a novel recursive way via rules.
Definition 5
(left associativity). Consider the following rules:
$x_i →_{lasc} x_i$
$\dfrac{A φ ψ →_{lasc} α \qquad ξ →_{lasc} β}{A φ A ψ ξ →_{lasc} A α β}$
$\dfrac{K φ ψ →_{lasc} α \qquad ξ →_{lasc} β}{K φ K ψ ξ →_{lasc} K α β}$
$\dfrac{φ →_{lasc} φ' \qquad ψ →_{lasc} ψ'}{A φ ψ →_{lasc} A φ' ψ'}$
$\dfrac{φ →_{lasc} φ' \qquad ψ →_{lasc} ψ'}{K φ ψ →_{lasc} K φ' ψ'}$
where the rules shall be applied with priority from highest to lowest according to the order given. Let us now define the map $lasc : P ( L ) ⟶ P ( L )$ by $lasc ( φ ) ≡ ψ$ if, and only if, $φ →_{lasc} ψ$.
It is clear that for any formula $φ$, $lasc ( φ )$ is equivalent to $φ$ (see Remark 2); therefore, lasc does not alter the logical meaning of the formulas by acting on them, although it does eventually alter their syntax. What is stated in Lemma 8 is obviously true.
Lemma 8.
For all $α ∈ P ( L )$,
1.
If $α ∈ A ( X )$ then $lasc ( α ) ∈ A ( X )$.
2.
If $α ∈ K ( X )$ then $lasc ( α ) ∈ K ( X )$.
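The rules of Definition 5 can be transcribed in the same way as those of dak. The following Python sketch assumes the hypothetical tuple encoding `(functor, left, right)` with strings as variables; the two associativity rules are merged into one branch because they differ only in the functor. It is an illustration of the rules, not the authors' Haskell code.

```python
def lasc(f):
    """One pass of left associativity, associativity rules first."""
    if isinstance(f, str):
        return f
    op, left, right = f
    if not isinstance(right, str) and right[0] == op:
        # op φ (op ψ ξ) becomes op lasc(op φ ψ) lasc(ξ)
        return (op, lasc((op, left, right[1])), lasc(right[2]))
    return (op, lasc(left), lasc(right))   # congruence rules for A and K

print(lasc(("A", "x", ("A", "y", "z"))))   # A x A y z
```

Each call moves one level of right-nesting to the left at every position where the same functor is repeated, which is why several nested occurrences can be handled in the same pass.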
Lemma 9.
For all $α ∈ A ( X )$,
Proof.
The proof is by induction on the complexity of $α$.    □
Remark 7.
It is also evident that for all $α ∈ A ( X )$
As we can see, lasc has a very powerful reducing effect on $akr ( α )$: each application of lasc to the formula $α$ divides the “complexity of the situation” by 2, which inevitably brings in the logarithm in base 2. The information provided by Theorem 3 is crucial in this section, so we will give a detailed proof of it.
Theorem 3.
For all $α ∈ K ( A ( X ) )$:
1.

2.
3.
Proof.
To prove 1, we will reason by induction about the complexity of $α$ using the predicate $Q ( n )$ of the literal content:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k such that $k < n$, $Q ( k )$ holds. We have the following cases:
• $n = 0$; must be $α ∈ A ( X )$, the formula for which is , as set out in Lemma 9.
• $n > 0$; let —as the only case of interest— $α ≡ K φ ρ$ for certain $φ , ρ ∈ K ( A ( X ) )$. Let us distinguish the following cases:
-
$ρ ∈ A ( X )$; then:
-
$ρ ∉ A ( X )$; then $α ≡ K φ K ψ ξ$ for certain $ψ , ξ ∈ K ( A ( X ) )$. In this case,
By the second principle of finite induction, for every natural number n, $Q ( n )$ holds, hence the validity of statement 1. To prove 2, let us reason by induction about the complexity of $α$ according to the predicate $Q ( n )$ of the literal content:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k such that $k < n$, $Q ( k )$ holds. We have the following cases:
• $n = 0$; must be $α ∈ A ( X )$, the formula for which is , as set out in Remark 7.
• $n > 0$; let —as the only case of interest— $α ≡ K φ ρ$ for certain $φ , ρ ∈ K ( A ( X ) )$. Let us distinguish the following cases:
-
$ρ ∈ A ( X )$; then
-
$ρ ∉ A ( X )$; then $α ≡ K φ K ψ ξ$ for certain $ψ , ξ ∈ K ( A ( X ) )$. In this case,
By the second principle of finite induction, for every natural number n, $Q ( n )$ holds, hence the validity of statement 2. Statement 3 is immediate from statements 1 and 2, given that
□
Lemma 10.
Let $α ∈ A ( X )$. The following statements are equivalent:
1.
$α ∈ al ( X )$
2.
$lasc ( α ) ≡ α$
3.
$ar ( α ) ≤ 0$
Proof.
To show that statement 1 implies statement 2, we will reason by induction about the complexity of $α$ according to the predicate $Q ( n )$ of the literal content:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k such that $k < n$, $Q ( k )$ holds. We distinguish the following cases:
• $n = 0$; must be $α ≡ x ∈ X$ and then $lasc ( α ) ≡ lasc ( x ) ≡ x ≡ α$.
• $n > 0$; let —as the only case of interest— $α ≡ A φ x$, where $φ ∈ al ( X )$ and $x ∈ X$. Then
$\begin{aligned} lasc ( α ) & ≡ lasc ( A φ x ) \\ & ≡ A lasc ( φ ) lasc ( x ) \\ & ≡ A φ x && \text{induc. hyp. and Definition 5} \\ & ≡ α \end{aligned}$
hence, $Q ( n )$ holds.
By the second principle of finite induction, $Q ( n )$ holds for every natural number n, hence the validity of statement 2. Let us now suppose that statement 2 is true, i.e., that $α ∈ A ( X )$ and that $lasc ( α ) ≡ α$; then, one has
from which we deduce that $ar ( α ) ∈ { − 1 , 0 }$, i.e., that statement 3 holds. Finally, suppose that $α ∈ A ( X )$ and that $ar ( α ) ≤ 0$. The following cases are possible (note that, according to statement 2 of Lemma 4, necessarily $kr ( α ) = − 1$):
• $ar ( α ) = − 1$ and $kr ( α ) = − 1$; then $α ∈ X ⊆ al ( X )$ (see Lemma 5).
• $ar ( α ) = 0$ and $kr ( α ) = − 1$; then $α ∈ al ( X ) ∖ X$ (see Lemma 6).
This proves that under the assumption of statement 3. the fact $α ∈ al ( X )$ is satisfied, as we sought to prove.    □
In Theorem 4, the effects of alt and lasc are finally combined to characterise the formulas in $lcnf ( X )$.
Theorem 4.
For all $α ∈ P ( L )$, the following statements are equivalent:
1. $α ∈ lcnf ( X )$.
2. $dak ( α ) ≡ α$ and $lasc ( α ) ≡ α$.
3. $alt ( α ) = 0$ and $akr ( α ) ≤ 0$.
Proof.
Let $α$ be any formula in $P ( L )$. Assume what statement 1 states, i.e., that $α ∈ lcnf ( X )$. We will reason by induction about the complexity of $α$ according to the predicate $Q ( n )$ of the literal content:
Suppose, as an induction hypothesis, that n is a natural number and that for any natural number k such that $k < n$, $Q ( k )$ holds. We distinguish the following cases:
• $n = 0$; it must be that $α ≡ x ∈ X$, and then $dak ( α ) ≡ dak ( x ) ≡ x ≡ α$ and, similarly, $lasc ( α ) ≡ α$. It follows that $Q ( 0 )$ is true.
• $n > 0$; let—as the only case of interest— $α ≡ K φ ψ$, where $φ ∈ lcnf ( X )$ and $ψ ∈ al ( X )$. Then
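$$\begin{aligned}
dak ( α ) &≡ dak ( K φ ψ ) \\
&≡ K \, dak ( φ ) \, dak ( ψ ) \\
&≡ K φ \, dak ( ψ ) && \text{(induc. hyp.)} \\
&≡ K φ ψ && \text{(Remark 1 and induc. hyp.)} \\
&≡ α
\end{aligned}$$

$$\begin{aligned}
lasc ( α ) &≡ lasc ( K φ ψ ) \\
&≡ K \, lasc ( φ ) \, lasc ( ψ ) \\
&≡ K φ \, lasc ( ψ ) && \text{(induc. hyp.)} \\
&≡ K φ ψ && \text{(Lemma 10)} \\
&≡ α
\end{aligned}$$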
so we know that $Q ( n )$ is true.
By the second principle of finite induction, $Q ( n )$ is true for every natural number n, hence the validity of statement 2. If we now assume that statement 2 is true, then $alt ( α ) = 0$ and $α ∈ K ( A ( X ) )$ are ensured by Corollary 1. In particular, as a consequence of Theorem 3, we have
and therefore, $ar ( α ) ≤ 0$ and $kr ( α ) ≤ 0$; thus, we have proved 3. Let us finally assume 3 to be true and show that $α ∈ lcnf ( X )$. Since $alt ( α ) = 0$ and again using Corollary 1, we know that $α ∈ K ( A ( X ) )$. According to Theorem 2, since $akr ( α ) ≤ 0$, $α ∈ lcnf ( X )$ must necessarily hold and this is what statement 1 establishes.    □
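The case distinctions used in the proofs of Lemma 10 and Theorem 4 can be read as membership tests. The following Haskell fragment is a minimal sketch under stated assumptions: the type `Formula` and the names `isAl` and `isLcnf` are hypothetical, only literals, `A` and `K` are modelled, and the base case of `isLcnf` (every element of $al ( X )$ is taken to be in $lcnf ( X )$) is an assumption suggested by the proof of Theorem 4, not a transcription of the article's definition or of the repository code.

```haskell
-- Hypothetical formula type in Polish notation (illustrative names only).
data Formula x
  = Lit x                        -- an element of X
  | A (Formula x) (Formula x)    -- A φ ψ: disjunction
  | K (Formula x) (Formula x)    -- K φ ψ: conjunction
  deriving (Eq, Show)

-- α ∈ al(X): α is some x ∈ X, or α is A φ x with φ ∈ al(X) and x ∈ X
-- (the two cases distinguished in the proof of Lemma 10).
isAl :: Formula x -> Bool
isAl (Lit _)         = True
isAl (A phi (Lit _)) = isAl phi
isAl _               = False

-- α ∈ lcnf(X), assumed here to mean: α ∈ al(X), or α is K φ ψ with
-- φ ∈ lcnf(X) and ψ ∈ al(X) (the case treated in the proof of Theorem 4).
isLcnf :: Formula x -> Bool
isLcnf (K phi psi) = isLcnf phi && isAl psi
isLcnf alpha       = isAl alpha
```

Under these assumptions, `A (A (Lit "p") (Lit "q")) (Lit "r")` passes `isAl`, whereas the right-nested `A (Lit "p") (A (Lit "q") (Lit "r"))` does not; this is the kind of formula that lasc is meant to rearrange.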
Definition 6.
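For all $α ∈ K ( A ( X ) )$, $hre ( α )$ is the natural number defined by the equality:
$$hre ( α ) = \begin{cases} 0 & \text{if } akr ( α ) ∈ \{ − 1 , 0 \} \\ ⌊ \log_2 ( akr ( α ) ) ⌋ + 1 & \text{if } 0 < akr ( α ) \end{cases}$$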
Remark 8.
Understanding the evaluation of the expressions from a “lazy” point of view, the following equality should be accepted:
$hre ( α ) = ( 1 − χ_{\{ − 1 , 0 \}} ( akr ( α ) ) ) ( ⌊ \log_2 ( akr ( α ) ) ⌋ + 1 )$
where, of course, $χ_{\{ − 1 , 0 \}}$ is the characteristic function of the set ${ − 1 , 0 }$. Note also that $hre ( α )$ is the number of digits in the (unique) binary expression of $akr ( α )$ when $0 < akr ( α )$, and 0 otherwise.
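The equality of Remark 8 can be checked numerically once a value of $akr ( α )$ is known. The sketch below is illustrative only: the function names are hypothetical, the argument is the integer value of $akr ( α )$ rather than a formula, and guards are used for the case split because multiplication on `Int` evaluates both factors.

```haskell
import Data.Bits (countLeadingZeros, finiteBitSize)

-- Number of binary digits of k, i.e. floor(log2 k) + 1; assumes k > 0.
binaryDigits :: Int -> Int
binaryDigits k = finiteBitSize k - countLeadingZeros k

-- hre as a function of an already computed value of akr(α).
-- The first guard covers akr(α) ∈ {-1, 0}; smaller values are not expected.
hreFromAkr :: Int -> Int
hreFromAkr k
  | k <= 0    = 0
  | otherwise = binaryDigits k

-- The same count, obtained by halving (rounding down) until zero is reached.
halvingsToZero :: Int -> Int
halvingsToZero k = length (takeWhile (> 0) (iterate (`div` 2) k))
```

For example, $akr ( α ) = 5$ has binary expression 101, so `hreFromAkr 5` is 3, and `halvingsToZero 5` visits 5, 2, 1 before reaching 0, again three steps.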
Finally, Corollary 3 below tells us that, for any formula $α$ in $K ( A ( X ) )$, $lasc^{hre ( α )} ( α )$ is a formula in left conjunctive normal form (equivalent to $α$, of course) and that $hre ( α )$ is the smallest number of iterations of lasc that achieves this. The result rests on counting how many times a number x must be halved (rounding down) to reach zero, which is the number of digits of x in binary form. This gives us an estimate of the complexity of our algorithm: the number of iterations is logarithmic in $akr ( α )$, which is very good news. The corresponding result, with its complete proof, is given just below. The proof of the corollary is simple if we rely on this observation, and it can be carried out by inductive reasoning based on a careful distinction of cases.
Corollary 3.
For all $α ∈ K ( A ( X ) )$, $hre ( α )$ is the smallest of the natural numbers m satisfying $lasc^{m} ( α ) ∈ lcnf ( X )$.
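As a purely numerical illustration of how slowly this bound grows (the value 1000 is an arbitrary choice, not taken from the article), suppose that $akr ( α ) = 1000$. Then

$$2^{9} = 512 \le 1000 < 1024 = 2^{10} \;\Longrightarrow\; \lfloor \log_2 1000 \rfloor + 1 = 9 + 1 = 10, \qquad 1000 = 1111101000_2 ,$$

so $hre ( α ) = 10$: by Corollary 3, ten iterations of lasc are enough and nine are not.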

## 5. Conclusions

Algorithm 1 summarises the content of the article. The structure of the work suggests using for loops instead of while loops, and we have done so. The following comments are appropriate:
• As noted earlier, according to Definition 3, Definition 5, and Remark 2, the result of each of the two for loops in Algorithm 1 is a formula equivalent to the one that the loop took as a starting point. Thus, in each application of the algorithm, the result obtained retains the semantic meaning of the formula it took as initial data.
• Each for loop in Algorithm 1 acts if and only if a change in the formula is necessary, i.e., if m > 0. In this case, the pre-calculated length of each of the two loops (respectively, alt, according to Definition 2, and hre, obtained from akr of Definition 4 according to Definition 6) is exactly the minimum necessary to achieve the purpose of the loop. This is guaranteed by Corollary 2 and Corollary 3. In short, under this approach, the length of the two for loops is optimal.
• The algorithm achieves what it sets out to do, as stated in Corollary 2 and Corollary 3.
• In each of the for loops of Algorithm 1, the comparison between the formula resulting from each round and that of the previous one, which the classical formulation of the algorithm (reproduced in summary and pseudocode in Section 1) carried out in order to decide whether the work had been completed, has disappeared. In its place, a single operation appears before the start of each loop, consisting of one traversal and reading of a single formula, namely the input formula of the loop. In the general case, this is a priori an improvement over the classical formulation of the CNF algorithm. Very unfavorable examples can be consulted in the readme.md of our GitHub repository https://github.com/ringstellung/CNF (accessed on 23 September 2023), where one can see how our Haskell implementation copes with them.
• Definition 3 and Definition 5 are a previously unpublished presentation of the well-known core of the CNF algorithm. We believe we have captured, quite concretely, the concept of “recursive substitution” inspired by the usual style of lambda calculus. We believe we have found the appropriate language to express in a very precise way, eliminating all ambiguity, a process usually described with certain formal licenses.
We have no doubt that other improvements in the efficiency of the CNF algorithm are possible, although we have only dealt with one here. In the future, we will investigate how to combine all of them. For this, we will build on some of those already introduced in the Haskell code of our GitHub repository.
The content of this article will be used in the future to act in the field of the classical SAT problem, e.g., to optimally prepare the application of the Davis–Putnam algorithm. We intend to compare this improved algorithm with that of deduction by sequents.
Algorithm 1 CNF simplified algorithm for formulas in $P ( L )$
Require: $α ∈ P ( L )$
Ensure: $β ∈ lcnf ( X )$
procedure CNFsa($α$)
    $m ← alt ( α )$
    for $1 ≤ n ≤ m$ do
        $α ← dak ( α )$
    end for
    $m ← hre ( α )$
    for $1 ≤ n ≤ m$ do
        $α ← lasc ( α )$
    end for
    return $α$
end procedure

In order to show the feasibility and effectiveness of the theoretical advances presented in this article, we have developed a functional implementation in the Haskell language. In it, we also give a preview of the implementation of the Davis–Putnam algorithm. The code can be consulted at https://github.com/ringstellung/CNF (accessed on 23 September 2023). This justifies the fundamental objective of the work, which is set out in its title.
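Independently of the code in the repository, the control structure of Algorithm 1 can be sketched in a few lines of Haskell. This is a minimal sketch and not the authors' implementation: `applyN`, `untilFixed` and `cnfsa` are hypothetical names, and the functions alt, dak, hre and lasc are passed in as parameters because their definitions are not reproduced here. `untilFixed` is included only to contrast the counted loops with the classical stop-by-comparison loop.

```haskell
-- Counted loop: apply f exactly m times (no comparison of results).
applyN :: Int -> (a -> a) -> a -> a
applyN m f x = iterate f x !! max 0 m

-- Classical loop: apply f until two consecutive results coincide.
-- Each round costs one application of f plus one equality test.
untilFixed :: Eq a => (a -> a) -> a -> a
untilFixed f x
  | y == x    = x
  | otherwise = untilFixed f y
  where
    y = f x

-- Shape of Algorithm 1; the four functions are supplied by the caller.
cnfsa
  :: (form -> Int)   -- alt: pre-computed length of the first loop
  -> (form -> form)  -- dak: one round of the first loop
  -> (form -> Int)   -- hre: pre-computed length of the second loop
  -> (form -> form)  -- lasc: one round of the second loop
  -> form
  -> form
cnfsa alt dak hre lasc alpha =
  let beta = applyN (alt alpha) dak alpha   -- first for loop
  in  applyN (hre beta) lasc beta           -- second for loop
```

Note that `cnfsa` needs no `Eq` constraint on formulas, whereas `untilFixed` does, and that `untilFixed` always performs one extra application of the step function to detect that nothing changes any more.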

## Author Contributions

The authors of this article contributed equally to formal analysis, investigation, methodology, project administration, writing—original draft and writing—review & editing. All authors have read and agreed to the published version of the manuscript.

## Funding

This research received no external funding.

## Data Availability Statement

To facilitate comprehension of the article and use of the presented results, and to allow researchers on the topic to make immediate use of the algorithmic improvements, we have made the code of the corresponding Haskell programs public through a GitHub repository: https://github.com/ringstellung/CNF (accessed on 23 September 2023).

## Acknowledgments

The authors would like to thank the staff of the journal Algorithms, in particular its editing assistants, for all the encouragement and support during the process of preparing and submitting the article.

## Conflicts of Interest

The authors declare no conflict of interest.

## Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Meaning |
|---|---|
| CNF | conjunctive normal form |
| DNF | disjunctive normal form |
| PNF | prenex normal form |
| POS | product of sums |
| SOP | sum of products |
| SAT | Boolean satisfiability |
| iff | if and only if |
| def. | definition |
| induc. hyp. | induction hypothesis |
| resp. | respectively |
| p. | page |
| pp. | pages |

## References

1. Tseitin, G.S. On the Complexity of Derivation in Propositional Calculus. In Automation of Reasoning. 2 Classical Papers on Computational Logic 1967–1970; Bolç, L., Bundy, A., Siekmann, J., Sloman, A., Eds.; Springer: Berlin/Heidelberg, Germany, 1983; pp. 466–483.
2. Hill, F.J.; Peterson, G.R. Introduction to Switching Theory and Logical Design; John Wiley & Sons: Hoboken, NJ, USA, 1981.
3. Whitesitt, J.E. Boolean Algebra and Its Applications; Addison-Wesley Publishing Company: Boston, MA, USA, 1962.
4. Gorbátov, V.A. Fundamentos de la Matemática Discreta; Mir: Moscow, Russia, 1988.
5. Büning, H.K.; Lettmann, T. Propositional Logic: Deduction and Algorithms; Cambridge University Press: Cambridge, UK, 1999.
6. Gill, A. Applied Algebra for the Computer Sciences; Prentice-Hall: Hoboken, NJ, USA, 1976.
7. Jackson, P.; Sheridan, D. Clause form Conversions for Boolean Circuits. In SAT 2004, LNCS 3542; Hoos, H.H., Mitchell, D.G., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 183–198.
8. Łukasiewicz, J. Elements of Mathematical Logic; Pergamon Press: Oxford, UK, 1966.
9. Lipovača, M. Learn You a Haskell for Great Good; No Starch Press: San Francisco, CA, USA, 2011.
10. Monk, J.D. Mathematical Logic; Springer: Berlin/Heidelberg, Germany, 1976.
11. Bourbaki, N. Théorie des Ensembles; Eléments de Mathématique; Hermann: Paris, France, 1977.
12. Kunen, K. The Foundations of Mathematics; Mathematical Logic and Foundations; College Publications: Norcross, GA, USA, 2009; Volume 19.
Figure 1. Axiom and rules for the definition of dak.
Figure 2. Axiom and rules for the definition of lasc.

## Share and Cite

MDPI and ACS Style

García-Olmedo, F.M.; García-Miranda, J.; González-Rodelas, P. Mathematical Foundation of a Functional Implementation of the CNF Algorithm. Algorithms 2023, 16, 459. https://doi.org/10.3390/a16100459
