A Finitely Axiomatized Non-Classical First-Order Theory Incorporating Category Theory and Axiomatic Set Theory

Cabbolet, Marcoen J. T. F.

doi:10.3390/axioms10020119

Center for Logic and Philosophy of Science, Free University of Brussels, Pleinlaan 2, 1050 Brussels, Belgium

Axioms 2021, 10(2), 119; https://doi.org/10.3390/axioms10020119

Submission received: 9 April 2021 / Revised: 4 June 2021 / Accepted: 9 June 2021 / Published: 14 June 2021

Abstract

It is well known that Zermelo-Fraenkel Set Theory (ZF), despite its usefulness as a foundational theory for mathematics, has two unwanted features: it cannot be written down explicitly due to its infinitely many axioms, and it has a countable model due to the Löwenheim–Skolem theorem. This paper presents the axioms one has to accept to get rid of these two features. For that matter, some twenty axioms are formulated in a non-classical first-order language with countably many constants: to this collection of axioms is associated a universe of discourse consisting of a class of objects, each of which is a set, and a class of arrows, each of which is a function. The axioms of ZF are derived from this finite axiom schema, and it is shown that it does not have a countable model—if it has a model at all, that is. Furthermore, the axioms of category theory are proven to hold: the present universe may therefore serve as an ontological basis for category theory. However, it has not been investigated whether any of the soundness and completeness properties hold for the present theory: the inevitable conclusion is therefore that only further research can establish whether the present results indeed constitute an advancement in the foundations of mathematics.

Keywords:

foundations of mathematics; set theory; category theory; non-classical logic

MSC:

03B60; 03E65; 18A15

1. Introduction

1.1. Motivation

Zermelo–Fraenkel set theory, which emerged from the axiomatic set theory proposed by Zermelo in [1] by implementing improvements suggested independently by Skolem and Fraenkel, is without a doubt the most widely accepted foundational theory for mathematics. However, it has two features that we may call “unwanted” or “pathological”. Let us, in accordance with common convention, use ZFC to denote the full theory, ZF for the full theory minus the axiom of choice (AC), and let us use ZF(C) in statements that are to hold for both ZF and ZFC; in a sentence, the two pathological features are then the following:

(i): The number of axioms of ZF(C) is infinite, as a result of which ZF(C) cannot be written down explicitly;
(ii): A corollary of the downward Löwenheim–Skolem theorem [2,3] is that ZF(C) has a countable model (if it has a model at all), as a result of which we have to swallow that there is a model of ZF(C) in which the powerset of the natural numbers is countable.

However, let us be more specific, starting with the infinite axiomatization of ZF(C). One might have the impression that ZF(C) has just eight or nine axioms: eight for ZF, and nine for ZFC, which includes AC. This impression, however, is false. The crux is that ZF(C) contains two axiom schemata, which are sometimes colloquially referred to as “axioms”: each axiom schema consists of infinitely many axioms. The first of these is the separation axiom schema, usually abbreviated as SEP. A standard formulation of SEP is the following:

\forall X \exists Y \forall z (z \in Y \Leftrightarrow z \in X \land Φ)

(1)

This means: for any set X there is a set Y, such that any z is an element of Y if and only if z is an element of X and $Φ$ is true. With such an axiom, we can construct a subset Y of X. The fact is, however, that $Φ$ is a metavariable: it is a variable that stands for any well-formed formula not open in X. So, in fact, we have a separation axiom for every suitable well-formed formula that we substitute for $Φ$ in Equation (1). For example, if we substitute “$z = \emptyset$” for “$Φ$” in Equation (1), then we have a separation axiom. The second axiom schema of ZF(C) is the replacement axiom schema, usually abbreviated as REP. A standard formulation of REP is the following:

\forall x \exists! y Φ (x, y) \Rightarrow \forall X \exists Y \forall y (y \in Y \Leftrightarrow \exists x \in X (Φ (x, y)))

(2)

This means: if for every set x there is a unique set y such that the relation $Φ (x, y)$ is true, then for any set X there is a set Y made up of all the elements y for which there is an element x in X such that the relation $Φ (x, y)$ is true. Again, $Φ$ is a metavariable but now it stands for any well-formed functional relation. So, in fact, we have a replacement axiom for every well-formed functional relation that we substitute for $Φ$ in Equation (2). For example, if we substitute “$y = {x, \emptyset}$” for “$Φ (x, y)$” in Equation (2), then we have a replacement axiom. Ergo, ZF(C) is infinitely axiomatized: as a consequence, ZF(C) cannot ever be written down explicitly. That is our first pathological feature of ZF(C).

Now let us turn to the corollary of the downward Löwenheim–Skolem theorem that if ZF(C) has a model, it has a countable model; of all people, Zermelo himself considered this a pathological feature of the theory [4]. This countable model, let us call it “M”, thus consists of a universe $| M |$, which is a countable family of sets $m_{1}, m_{2}, m_{3}, \dots$ such that for every axiom A of ZF(C), its translation $A^{*}$ in the language of the model M is valid in M. That is, we have

⊧_{M} A^{*}

(3)

for the translation $A^{*}$ of any axiom A of ZF(C) in the language of M. Elaborating, the countable universe $| M |$ contains at least the following sets:

(i): The natural numbers $0, 1, 2, \dots$ defined as sets;
(ii): The countable set of natural numbers $N = {0, 1, 2, \dots}$;
(iii): Countably many subsets $S_{1}, S_{2}, S_{3}, \dots$ of the set $N$;
(iv): A countable set K that contains the above subsets of $N$, so $K = {S_{1}, S_{2}, S_{3}, \dots}$.

Now the powerset axiom of ZF(C), usually abbreviated POW, is the following:

\forall X \exists Y \forall z (z \in Y \Leftrightarrow z \subset X)

(4)

This means: for any set X there is a set Y made up of the subsets of X. The set Y is then usually denoted by $Y = P (X)$. However, in our model M, this translates as: for any set X in $| M |$ there is a set Y in $| M |$ made up of all the subsets of X in $| M |$. As a result, we have

⊧_{M} K = P (N)

(5)

So, in our model M the powerset of the natural numbers is a countable set. Intuitively we think that the powerset of $N$ is uncountable and contains subsets of $N$ other than the $S_{j}$’s in the universe $| M |$ of the model M. However, the crux is that these “other” subsets of $N$ are not in $| M |$. Löwenheim and Skolem have proven that the existence of such a model M is an inevitable consequence of the standard first-order axiomatization of ZF. This is a famous result in the foundations of mathematics and one that is completely counterintuitive to boot. The proof is not constructive, that is, there exists no specification of how the set K can be constructed, but nevertheless we have to swallow that ZF(C) has a model M in which a countable set K is the powerset of the natural numbers. This is the second pathological feature of ZF(C).
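The idea that a powerset is always taken relative to a universe can be illustrated with a finite toy analogue. The following Python sketch is only an illustration under loud assumptions: the three-element set `N`, the hand-picked `universe`, and the helper names `all_subsets` and `powerset_in` are hypothetical, and the toy universe is of course not a model of ZF(C).

```python
from itertools import combinations

def all_subsets(s):
    """Return every subset of the finite set s (the 'external' powerset)."""
    items = sorted(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

# Hypothetical toy universe: the set N together with only SOME of its subsets.
N = frozenset({0, 1, 2})
universe = {N, frozenset(), frozenset({0}), frozenset({1, 2})}

def powerset_in(universe, X):
    """The 'powerset' of X as seen from inside the universe:
    only those subsets of X that are themselves in the universe."""
    return {S for S in universe if S <= X}

K = powerset_in(universe, N)     # plays the role of the set K in Equation (5)
missing = all_subsets(N) - K     # the "other" subsets, absent from the universe
```

Here `K` contains four sets although `N` has eight subsets when viewed from outside; the remaining four subsets are simply not available inside the toy universe.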

The purpose of this paper is to present the axioms one has to accept such that the axioms of ZF can be derived from the new axioms (we omit a discussion of AC) but such that these two pathological features are both removed. The collection of these new axioms will henceforth be referred to with the symbol $T$ (a Gothic “T”): it is thus the case that $T$ has a finite number of non-logical axioms of finite length.

1.2. Related Works

While it is emphasized that it is not the purpose of this paper to review every idea ever published in set theory, it is true that at least one set theory that can be finitely axiomatized has already been suggested, namely Von Neumann–Gödel–Bernays set theory (NGB). NGB is provably a mere conservative extension of ZF and as such “it seems to be no stronger than ZF” [5]. However, NGB shares the second of the two aforementioned unwanted features with ZF: if it has a model at all, it also has a countable model. In that regard, Von Neumann has been quoted stating

“At present we can do no more than note that we have one more reason here to entertain reservations about set theory and that for the time being no way of rehabilitating this theory is known”.
[6]

That shows that Von Neumann too considered the countable model a pathological feature. That being said, a finite theory in the same language as ZF (without extra objects) and as strong as ZF was already proved impossible by Montague in the 1950s [7]. This result is a landmark in the historical development of the foundations of mathematics: even though it may not be stated anywhere explicitly, it has ever since become accepted that there is no standard solution to “our” problem identified in the beginning. That is, the general consensus is that it is impossible to develop a foundational theory for mathematics in standard first-order language that lacks both pathological features of ZF, all the more so because the downward Löwenheim–Skolem theorem pertains to standard first-order theories in general. It is therefore not surprising that this is not an active research field: those working in the foundations of mathematics are well aware of “our” problem, but ongoing research in set theory focuses on other topics such as, for example, large cardinals, forcing, and inner models; see, e.g., [8,9,10] for some recent works. That being said, the finitely axiomatized theory $T$ that we present here is a nonstandard solution to “our” problem: it entails a rather drastic departure from the language, ontology, and logic of ZF. Earlier, a radical departure was already suggested by Lawvere, who formulated a theory of the category of sets: here the ∈-relation has been defined in terms of the primitive notions of category theory, that is, in terms of mappings, domains, and codomains [11]. The present theory $T$, however, is more of a “marriage” (for lack of a better term) between set theory and category theory: the ∈-relation is maintained as an atomic expression, while the notion of a category is built in as the main structural element. So, regarding the philosophical position on the status of category theory, here neither Lawvere’s position, that category theory provides the foundation for mathematics [12], Mayberry’s position, that category theory requires set theory as a foundation [13], nor Landry’s position, that category theory provides (all of) the language for mathematics [14], is taken: instead, the present position is that category theory is incorporated in the foundations.

1.3. Informal Overview of the Main Result

Category theory being incorporated means that the universe of (mathematical) discourse is not Cantor’s paradise of sets, but a category:

Definition 1.

The universe of discourse is a category $C$ consisting of:

(i): A proper class of objects, each of which is a set;
(ii): A proper class of arrows, each of which is a function.

□

As to the meaning of the term “universe of discourse” in Definition 1, the following quote from a standard textbook is interesting:

“Instead of having to say the entities to which the predicates might in principle apply, we can make things easier for ourselves by collectively calling these entities the universe of discourse”.
[15]

So, a theory without a universe of discourse is nothing but a schema of meaningless strings of symbols, which are its axioms; an inference rule is then nothing but a rule by which (other) meaningless strings of symbols can be “deduced” from those axioms. Such a notion of “theory” might be acceptable from the perspective of a formalist, but here the position is taken that theories, that is, strings of symbols, without meaning are not interesting. So, to the present theory $T$ is associated a Platonic universe of discourse (in this case the category $C$ of Definition 1), which we can think of as being made up of the things that satisfy the axioms of $T$. That does not mean that for every thing in the universe of discourse a constant needs to be included in the formal language: the vocabulary contains only countably many constants. However, $T$ has been formulated with an intended model in mind: its universe is then a “Platonic imitation” of the universe of discourse. Below we briefly elaborate on the universe of discourse using set-builder notation: strictly speaking this is not a part of the formal language, but given its status as a widely used tool for the description of sets it is suitable for an informal introductory exposition that one can hold in the back of one’s mind.

For starters, the primitive notion of a set is, of course, that it is an object made up of elements: a thing $α$ being an element of a set S is formalized by an irreducible ∈-relation, that is, by an atomic expression of the form $α \in S$. The binary predicate “∈” is part of the language for $T$: there is no need to express it in the language of category theory, since that does not yield a simplification. (Hence the language of the present theory is not reduced to the language of category theory.) In ZF we then have the adage “everything is a set”, meaning that if we have $x \in y$, then x is a set too. Here, however, that adage remains valid in this proper class of objects only to the extent that all the objects are sets; the adage does not hold for all elements of all sets. That is, we will assume that every object of the category is either the empty set or an object that contains elements, but an element of an object, if any, can either be the empty set, or again an object made up of elements, or a function: a function is then not a set. A number of constructive set-theoretical axioms then describe in terms of the ∈-relation which sets there are at least in this proper class of sets; these axioms are very simple theorems of ZF that hardly need any elaboration.

As to the notion of a function, in the framework of ZF a function is identified with its graph. However, as hinted at above, here we reject that set-theoretical reduction. First of all, functions are objects sui generis. If we use simple symbols like X and Y for sets, then a composite symbol $f_{X}$ (to be pronounced “f-on-X”) can be used for a function on X: in the intended model, a function is a thing $f_{X}$ where the “f” in “$f_{X}$” stands for the graph of the function and the “X” in “$f_{X}$” for its domain. To give an example, let the numbers 0 and 1 be defined as sets, e.g., by $0 := \emptyset$ and $1 := {\emptyset}$, and let, for sets x and y, a two-tuple $〈 x, y 〉$ be defined as a set, e.g., by $〈 x, y 〉 := {x, {x, y}}$; then the composite symbol ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}}$ refers to the function on the set ${0, 1}$ whose graph is the set ${〈 0, 1 〉, 〈 1, 0 〉}$. However, we have $f_{X} \neq Y$ for any function on any domain X and for any set Y, so

{〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}} \neq {〈 0, 1 〉, 〈 1, 0 〉}

(6)

That is, the function ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}}$ is not identical to its graph ${〈 0, 1 〉, 〈 1, 0 〉}$, nor, in fact, to any other set Y. At this point one might be inclined to think that the whole idea of functions on a set as objects sui generis is superfluous and should be eliminated in favor of the idea that functions are identified with their graphs (which are sets). That, however, has already been tried in an earlier stage of this investigation: it turned out to lead to unsolvable difficulties with the interpretation of the formalism, cf. [16]. The crux is that the main constructive axiom of the theory “produces” things referred to by a symbol $F_{X}$: one gets into unsolvable difficulties if one tries to interpret these things as sets. Here a constructive axiom is an axiom that, when certain things are given (e.g., one or two sets or a set and a predicate), states the existence of a uniquely determined other thing [17].

However, functions are not completely different from sets. Contrary to a set, a function in addition has a domain and a codomain (both are sets) and it also does something: namely, it maps every element in its domain to an element in its codomain. This first aspect, that a function “has” a domain and a codomain, can be expressed in the language of category theory: an atomic formula $f_{X} : Y ↠ Z$ expresses that the function $f_{X}$ has domain Y and codomain Z. In accordance with existing convention, the two-headed arrow “↠” expresses that $f_{X}$ is a surjection: the codomain is always the set of the images of the elements of the domain under $f_{X}$. Since such an expression $f_{X} : Y ↠ Z$ is irreducible in the present framework, it requires some function-theoretical axioms to specify when such an atomic formula is true and when it is not. For example, for the function ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}}$ discussed above we have

{〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}} : {0, 1} ↠ {0, 1}

(7)

while other expressions ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}} : Y ↠ Z$ are false. Again, at this point one might be inclined to think that these expressions $f_{X} : Y ↠ Z$ are superfluous, because the notation $f_{X}$ already indicates that X is the domain of $f_{X}$. The crux, however, is that these expressions are essential to get to a finite axiomatization: we may, then, have the opinion that it is obvious from the notation $f_{X}$ that X is the domain of $f_{X}$, but only an expression $f_{X} : X ↠ Z$ expresses this fact; it is, thus, an axiom of the theory that $f_{X} : Y ↠ Z$ is only true if $Y = X$. In particular, this expression is not true if $Y ⊊ X$.
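The distinction between a function and its graph, and the truth conditions of the expression $f_{X} : Y ↠ Z$, can be mimicked in a small Python sketch. This is only an informal illustration and not part of the formal theory: the class `Function`, the helper `has_type`, and the use of Python tuples for two-tuples (instead of the set-theoretical encoding used above) are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Function:
    """A toy function 'f-on-X': a graph together with a domain (hypothetical model)."""
    graph: frozenset    # set of (argument, value) pairs
    domain: frozenset

    def image(self):
        """The set of images of the elements of the domain."""
        return frozenset(value for (_, value) in self.graph)

def has_type(f, Y, Z):
    """Toy reading of f : Y ↠ Z: true only if Y is the domain of f
    and Z is exactly the set of images (so f is onto Z)."""
    return Y == f.domain and Z == f.image()

# The function on {0, 1} that swaps 0 and 1.
swap = Function(graph=frozenset({(0, 1), (1, 0)}), domain=frozenset({0, 1}))

is_its_graph = (swap == swap.graph)                              # analogue of (6)
typed_ok = has_type(swap, frozenset({0, 1}), frozenset({0, 1}))  # analogue of (7)
typed_bad = has_type(swap, frozenset({0}), frozenset({0, 1}))    # Y a proper subset of X
```

In this sketch `is_its_graph` is false, `typed_ok` is true, and `typed_bad` is false, in line with the requirement that the expression holds only if $Y = X$.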

The second aspect, that for any set X any function $f_{X}$ “maps” an element y of its domain to an element z of its codomain, is expressed by another atomic formula $f_{X} : y \mapsto z$. In the framework of ZF this is just a notation for $〈 y, z 〉 \in f_{X}$, but in the present framework this is also an irreducible expression: therefore, it requires some more function-theoretical axioms to specify when such an atomic formula is true and when not. The idea, however, is this: given a set X and a function $f_{X}$, precisely one expression $f_{X} : y \mapsto z$ is true for each element y in the domain of $f_{X}$. For example, we have

{〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}} : 0 \mapsto 1

(8)

{〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}} : 1 \mapsto 0

(9)

for the above function ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}}$. Importantly, each function is thus a total function. However, for any set X, any total function $f_{X}$ with codomain Y, and any set Z for which $X ⊊ Z$, we may view $f_{X}$ as a partial function on Z. We might express that with a formula $f_{X} : Z ↛ Y$, which we may define in the language of our theory $T$ by

f_{X} : Z ↛ Y \Leftrightarrow f_{X} : X ↠ Y \land X \subset Z

(10)

We then have $f_{X} : z \mapsto y$ for some $y \in Y$ if $z \in X$, and $\forall u (f_{X} : z ↛ u)$ if $z \in Z \setminus X$. We will not use the notion of a partial function to build our theory $T$, but this shows that we can still use the notion of a partial function in the framework of $T$.
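The mapping expressions and the defined notion of a partial function can likewise be sketched informally in Python. The sketch assumes a toy representation of a total function on X as a dictionary whose keys are exactly the elements of X; the helper names `maps` and `is_partial_on` are hypothetical, and the symbol ⊂ in Equation (10) is read here as (not necessarily proper) inclusion.

```python
def maps(f, y, z):
    """Toy reading of f : y ↦ z for a total function f given as a dict."""
    return y in f and f[y] == z

def is_partial_on(f, Z, Y):
    """Toy analogue of Equation (10): f : X ↠ Y and X ⊂ Z,
    where X is the set of keys of f."""
    X = set(f)
    return set(f.values()) == set(Y) and X <= set(Z)

swap = {0: 1, 1: 0}      # the function on {0, 1} from Equations (8) and (9)
Z = {0, 1, 2}            # a set properly containing the domain {0, 1}

partial = is_partial_on(swap, Z, {0, 1})
# For z = 2 in Z \ X there is no u with swap : 2 ↦ u.
undefined_at_2 = not any(maps(swap, 2, u) for u in Z)
```

For each element of the domain exactly one mapping expression holds, while for the element 2 outside the domain none does.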

All the above can be expressed with a dozen and a half very simple axioms; these can, in fact, all be reformulated in the framework of ZF. The present axiomatic schema does, however, contain “new mathematics” in the form of the second axiom of a pair of constructive, function-theoretical axioms. The first one is again very simple and merely states that given any two singletons $X = {x}$ and $Y = {y}$, there exists an “ur-function” that maps the one element x of X to the one element y of Y. An ur-function is thus a function with a singleton domain and a singleton codomain. The above function ${〈 0, 1 〉, 〈 1, 0 〉}_{{0, 1}}$ is thus not an ur-function but the function referred to by the symbol ${〈 0, 1 〉}_{{0}}$ is: we have

{〈 0, 1 〉}_{{0}} : {0} ↠ {1}

(11)

{〈 0, 1 〉}_{{0}} : 0 \mapsto 1

(12)

That said, the second of the pair of constructive, function-theoretical axioms is a new mathematical principle: it states that given any family of ur-functions $f_{{j}}$ indexed in a set Z, there exists a sum function $F_{Z}$ such that the sum function maps an element j of Z to the same image as the corresponding ur-function $f_{{j}}$. The formulation of this principle requires, however, a new non-classical concept that can be called a “multiple quantifier”. This new concept can be explained as follows. Suppose we have defined the natural numbers $0, 1, 2, \dots$ as sets, suppose the singletons ${0}, {1}, {2}, \dots$ exist as well, and suppose we also have the set of natural numbers $ω = {0, 1, 2, \dots}$. We can, then, consider variables $f_{{0}}, f_{{1}}, f_{{2}}, \dots$ ranging over all ur-functions on a singleton of a natural number as indicated by the subscript; so, $f_{{0}}$ is a variable that ranges over all ur-functions on the singleton of 0. In the standard first-order language of ZF, we have the possibility to quantify over all ur-functions on a singleton ${a}$ by using a quantifier $\forall f_{{a}}$, and we have the possibility to use any finite number of such quantifiers in a sentence. For example, we can have a formula

\forall f_{{0}} \forall f_{{1}} \forall f_{{2}} Ψ

(13)

meaning: for all ur-functions on ${0}$, for all ur-functions on ${1}$, and for all ur-functions on ${2}$, $Ψ$. We can then introduce a new notation by the following postulate of meaning:

{(\forall f_{{j}})}_{j \in {0, 1, 2}} Ψ \Leftrightarrow \forall f_{{0}} \forall f_{{1}} \forall f_{{2}} Ψ

(14)

This formula can be read as: for any family of ur-functions indexed in ${0, 1, 2}$, $Ψ$. So ${(\forall f_{{j}})}_{j \in {0, 1, 2}}$ is then a multiple quantifier that, in this case, is equivalent to three quantifiers in standard first-order language. The next step is now that we lift the restriction that a multiple quantifier has to be equivalent to a finite number of standard quantifiers: with that step we enter into non-classical territory. In the language of $T$ we can consider a formula like

{(\forall f_{{i}})}_{i \in ω} Ψ

(15)

The multiple quantifier is equivalent to a sequence $\forall f_{{0}} \forall f_{{1}} \forall f_{{2}} \dots$ of infinitely many standard quantifiers in this case. However, in the present framework the constant $ω$ in Formula (15) can be replaced by any constant, yielding non-classical multiple quantifiers equivalent to an uncountably infinite number of standard quantifiers.

To yield meaningful theorems, the subformula $Ψ$ of Formula (15) has to be open in infinitely many variables $f_{{0}}, f_{{1}}, f_{{2}}, \dots$, each of which ranges over all ur-functions on a singleton of a natural number. This is achieved by placing a conjunctive operator $⋀_{i \in ω}$ in front of a standard first-order formula $Ψ (f_{{i}})$ that is open in a composite variable $f_{{i}}$, yielding an expression

⋀_{i \in ω} Ψ (f_{{i}})

(16)

Syntactically this is a formula of finite length, but semantically it is the conjunction of a countably infinite family of formulas $Ψ (f_{{0}}), Ψ (f_{{1}}), Ψ (f_{{2}}), \dots$. Together with the multiple quantifier from Formula (15) it yields a sentence ${(\forall f_{{i}})}_{i \in ω} ⋀_{i \in ω} Ψ (f_{{i}})$: semantically it contains a bound occurrence of the variables $f_{{0}}, f_{{1}}, f_{{2}}, \dots$. The non-classical “sum function axiom” constructed with these formal language elements is so powerful that it allows one to derive the infinite schemas SEP and REP of ZF from just a finite number of axioms.
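For a finite index set, the combination of a multiple quantifier, a conjunctive operator, and the sum function can be spelled out mechanically. The Python sketch below is a finite toy analogue only: ur-functions are modelled as one-entry dictionaries, their possible images are restricted to a small hypothetical list `values` (in the theory they range over all things), and the names `ur_functions_on`, `families`, and `sum_function` are assumptions made for the example.

```python
from itertools import product

def ur_functions_on(j, values):
    """All toy ur-functions on the singleton {j}: one per possible image."""
    return [{j: v} for v in values]

def families(index_set, values):
    """Every family of ur-functions indexed in index_set,
    i.e. one choice of an ur-function on {j} for each index j."""
    indices = sorted(index_set)
    for choice in product(*(ur_functions_on(j, values) for j in indices)):
        yield dict(zip(indices, choice))

def sum_function(family):
    """Toy sum function F_Z: maps each index j to the image of j
    under the corresponding ur-function in the family."""
    return {j: f[j] for j, f in family.items()}

Z = {0, 1, 2}
values = ["a", "b"]   # hypothetical finite range of images

# Finite analogue of a multiple quantifier followed by a conjunction:
# for every family (f_j), for all j in Z, F_Z maps j to the image of j under f_j.
holds = all(
    all(sum_function(family)[j] == family[j][j] for j in Z)
    for family in families(Z, values)
)
number_of_families = sum(1 for _ in families(Z, values))
```

With three indices and two possible images there are eight families, and the outer `all` plays the role of the three standard quantifiers of Formula (13); the non-classical step of the theory consists in allowing index sets for which no such finite enumeration exists.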

That said, the literature already contains a plethora of so-called infinitary logics; see [18] for a general overview and [19,20] for an example of some recent work. In the framework of such an infinitary logic, one typically can form infinitary conjunctions of a collection $Σ$ of standard first-order formulas indexed by some set S. So, if $Σ = {ϕ_{j} | j \in S}$ is such a collection of formulas, then a formula $⋀ Σ$ stands for $\land_{j \in S} ϕ_{j}$, which resembles Formula (16). That way, one can, for example, form the conjunction of all the formulas of the SEP schema of ZF: that yields a finite axiomatization of set theory. However, such an infinitary conjunction cannot be written down explicitly: the string of symbols “$\land_{j \in S} ϕ_{j}$” is an informal abbreviation that is not part of the formal language, that is, it is not a well-formed formula. In the present case, however, we are only interested in well-formed formulas of finite length, which can be written down explicitly. This is achieved by building restrictions into the definition of the syntax, so that a conjunctive operator like $⋀_{x \in ω}$ only forms a well-formed formula if it is put in front of an atomic expression of the type $f_{{x}} : x \mapsto t (x)$ where both $f_{{x}}$ and $t (x)$ are terms with an occurrence of the same variable x that occurs in the conjunctive operator: that way, one obtains typographically finite expressions that are semantically infinitary conjunctions. So, the language for our theory $T$ is much more narrowly defined than the language of (overly?) general infinitary logics: as it turns out, that suffices for our present aims.

That brings us to the next point, which is to discuss the (possible) practical use of the new finite theory $T$ as a foundational theory for mathematics, that is, as a framework for proofs in mathematics. The practical usefulness of the schema lies therein that it (i) provides an easy way to construct sets and (ii) that categories like Top, Mon, Grp, etc., which are subjects of study in category theory, can be viewed as subcategories of the category of sets and functions of Definition 1, thus providing a new approach to the foundational problem identified in [21]. While that latter point (ii) hardly needs elaboration, the former (i) does. The crux is that one does not need to apply the non-classical sum function axiom directly: what one uses is the theorem (or rather: the theorem schema) that given any set X, we can construct a new function on X by giving a function prescription. So, this is a philosophical nuance: in ZF one constructs a new object with the ∈-relation, but in the present framework one can construct a new function $f_{X}$ not with the ∈-relation but by simply defining which expressions $f_{X} : y \mapsto z$ are true for the elements y in the domain. On account of the sum function axiom, it is then a guarantee that the function $f_{X}$ exists. So, given any set X one can simply give a defining function prescription

f_{X} : y \overset{def}{⟼} ı z Φ (y, z)

(17)

(where the iota-term $ı z Φ (y, z)$ denotes the unique thing z for which the functional relation $Φ (y, z)$ holds) and it is then a guarantee that $f_{X}$ exists. Furthermore, having constructed the function, we have implicitly constructed its graph and its image set. Ergo, giving a function prescription is constructing a set. The non-classical axiom, which may be cumbersome to use directly, stays thus in the background: one uses the main theorem of $T$.
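The step from a function prescription to a graph and an image set can be made concrete for a finite set. The following Python sketch is merely an informal illustration: the helper `define_function`, the example set `X`, and the prescription “y is mapped to its remainder modulo 2” are hypothetical, and a Python callable stands in for the iota-term of Formula (17).

```python
def define_function(X, prescription):
    """Toy analogue of a function prescription on X: 'prescription(y)' plays
    the role of the unique z for which the functional relation holds.
    Returns the graph (as a set of pairs) and the image set."""
    graph = {(y, prescription(y)) for y in X}
    image = {z for (_, z) in graph}
    return graph, image

X = {0, 1, 2, 3}
graph, image = define_function(X, lambda y: y % 2)
```

Giving the prescription thus yields two sets at once: the graph, with one pair per element of `X`, and the image set collecting the values that occur.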

What remains to be discussed is that $T$, as mentioned in the first paragraph, entails the axioms of ZF and does not have a countable model. As to the first-mentioned property, it will be proven that the infinite axiom schemas SEP and REP of ZF, translated in the formal language of $T$, can be derived from the finitely axiomatized theory $T$: that provides an argument for considering $T$ to be not weaker than ZF. Secondly, even though the language of $T$ has only countably many constants, it will be shown that the validity of the non-classical sum function axiom in a model $M$ of $T$ has the consequence that the downward Löwenheim–Skolem theorem does not hold: if $T$ has a model $M$, then $M$ is uncountable. This is a significant result that does not hold in the framework of ZF: it provides, therefore, an argument for considering $T$ to be stronger than ZF.

1.4. Point-by-Point Overview

(i): Zermelo–Fraenkel set theory is the most widely accepted foundational theory for mathematics, but the problem is that it has two unwanted features:
–: It has infinitely many axioms;
–: It has a countable model in which the powerset of the naturals is countable;
(ii): Since the 1950s it has been generally accepted that this problem has no standard solution;
(iii): This paper presents a nonstandard solution, which opens up an entirely new research field that may be called “infinitary mathematical logic”;
(iv): The main result, the non-classical theory $T$, incorporates category theory and axiomatic set theory in a single framework;
(v): Category theory is incorporated by the ontological assumption that the universe of $T$ is a class of sets and a class of functions: this universe satisfies the axioms of category theory;
(vi): Set theory is incorporated in the sense that the axioms of ZF can be derived from the axioms of $T$;
(vii): $T$ can be considered stronger than ZF since it lacks the two unwanted features:
–: $T$ has finitely many axioms of finite length;
–: If $T$ has a model, it is uncountable;
(viii): $T$ is introduced here as a candidate for a foundational theory for mathematics, the corresponding philosophy of mathematics being that mathematics (i.e., all of mathematics) can be viewed as the collection of valid inferences within the framework of $T$;
(ix): The true merit of $T$ as a foundational theory for mathematics has yet to be established by a program for further research: negative results may invalidate the idea that $T$ is suitable as a foundational theory for mathematics.

This concludes the introductory discussion. The remainder of this paper is organized as follows. The next section axiomatically introduces this finitely axiomatized non-classical theory $T$. The section thereafter discusses the theory $T$ (i) by deriving its main theorem, (ii) by deriving the axiom schemas SEP and REP of ZF, (iii) by showing that the downward Löwenheim–Skolem theorem does not hold for the non-classical theory $T$, (iv) by showing that the axioms of category theory hold for the class of arrows of the universe of discourse meant in Definition 1, and (v) by addressing the main concerns for inconsistency. The final section states the conclusions.

2. Axiomatic Introduction

2.1. Formal Language

First of all, a remark. In standard first-order logic, the term “quantifier” refers both to the logical symbols “∀” and “∃” and to combinations like $\forall x$, $\exists y$ consisting of such a symbol and a variable. While that may be unproblematic in standard logic, here the logical symbols “∀” and “∃” are applied in different kinds of quantifiers. Therefore, to avoid confusion we will refer to the symbol “∀” by the term “universal quantification symbol” and to the symbol “∃” by the term “existential quantification symbol”.

Definition 2.

The vocabulary of the language

L_{T}

of

T

consists of the following:

(i): The constants ∅ and $ω$ , to be interpreted as the empty set and the first infinite ordinal;
(ii): The constants $1_{\emptyset}$ , to be interpreted as the inactive function;
(iii): Simple variables $x, X, y, Y, \dots$ ranging over sets;
(iv): For any constant $\hat{X}$ referring to an individual set, composite symbols $f_{\hat{X}}, g_{\hat{X}}, \dots$ with an occurrence of the constant $\hat{X}$ as the subscript are simple variables ranging over functions on that set $\hat{X}$ ;
(v): For any simple variable X ranging over sets, composite symbols $f_{X}, g_{X}, \dots$ with an occurrence of the variable X as the subscript are composite variables ranging over functions on a set X;
(vi): Simple variables $α, β, \dots$ ranging over all things (sets and functions);
(vii): The binary predicates “∈” and “=”, and the ternary predicates “ $(.) : (.) ↠ (.)$ ” and “ $(.) : (.) \mapsto (.)$ ”;
(viii): The logical connectives of first-order logic $\neg, \land, \lor, \Rightarrow, \Leftrightarrow$ ;
(ix): The universal and existential quantification symbols $\forall, \exists$ ;
(x): The brackets “(” and “)”. □

Remark 1.

To distinguish between the theory

T

and its intended model, boldface symbols with a hat like

\hat{X}

,

\hat{α}

, etc., will be used to denote constants in the vocabulary of the theory

T

, while underlined boldface symbols like

\underset{̲}{X}

,

\underset{̲}{α}

, etc., will be used to denote individuals in the intended model of

T

. □

Definition 3.

The syntax of the language

L_{T}

is defined by the following clauses:

(i): If t is a constant, or a simple or a composite variable, then t is a term;
(ii): If $t_{1}$ and $t_{2}$ are terms, then $t_{1} = t_{2}$ and $t_{1} \in t_{2}$ are atomic formulas;
(iii): If $t_{1}$ , $t_{2}$ , and $t_{3}$ are terms, then $t_{1} : t_{2} ↠ t_{3}$ and $t_{1} : t_{2} \mapsto t_{3}$ are atomic formulas;
(iv): If $Φ$ and $Ψ$ are formulas, then $\neg Φ$ , $(Φ \land Ψ)$ , $(Φ \lor Ψ)$ , $(Φ \Rightarrow Ψ)$ , $(Φ \Leftrightarrow Ψ)$ are formulas;
(v): If $Ψ$ is a formula and t a simple variable ranging over sets, over all things, or over functions on a constant set, then $\forall t Ψ$ and $\exists t Ψ$ are formulas;
(vi): If X and $f_{\hat{X}}$ are simple variables ranging respectively over sets and over functions on the set $\hat{X}$ , $f_{X}$ a composite variable with an occurrence of X as the subscript, and $Ψ$ a formula with an occurrence of a quantifier $\forall f_{\hat{X}}$ or $\exists f_{\hat{X}}$ but with no occurrence of X, then $\forall X [X \ \hat{X}] Ψ$ and $\exists X [X \ \hat{X}] Ψ$ are formulas. □

Remark 2.

Regarding clause (vi) of Definition 3,

[u \ t] Ψ

is the formula obtained from

Ψ

by replacing t everywhere by u. This definition will be used throughout this paper. Note that if

Ψ

is a formula with an occurrence of the simple variable

f_{\hat{X}}

, then

[X \ \hat{X}] Ψ

is a formula with an occurrence of the composite variable

f_{X}

. □

Definition 4.

The language

L_{T}

contains the following special language elements:

(i): If t is a simple variable ranging over sets, over all things, or over functions on a constant set, then $\forall t$ and $\exists t$ are quantifiers with a simple variable;
(ii): If $f_{X}$ is a composite variable, then $\forall f_{X}$ and $\exists f_{X}$ are quantifiers with a composite variable.

A sequence like

\forall X \forall f_{X}

can be called a double quantifier. □

The scope of a quantifier is defined as usual: note that a quantifier with a composite variable can only occur in the scope of a quantifier with a simple variable. A free occurrence and a bounded occurrence of a simple variable are also defined as usual; these notions can be simply defined for formulas with a composite variable.

Definition 5.

Let

f_{X}

be a composite variable with an occurrence of the simple variable X; then

(i): An occurrence of $f_{X}$ in a formula $Ψ$ is free if that occurrence is neither in the scope of a quantifier with the composite variable $f_{X}$ nor in the scope of a quantifier with the simple variable X;
(ii): An occurrence of $f_{X}$ in a formula $Ψ$ is bounded if that occurrence is in the scope of a quantifier with the composite variable $f_{X}$ .

A sentence is a formula with no free variables—simple or composite. A formula is open in a variable if there is a free occurrence of that variable. A formula that is open in a composite variable

f_{X}

is also open in the simple variable X. □

Definition 6.

The semantics of any sentence without a quantifier with a composite variable is as usual. Furthermore,

(i): A sentence $\forall X Ψ$ with an occurrence of a quantifier $\forall f_{X}$ or $\exists f_{X}$ with the composite variable $f_{X}$ is valid in a model $M$ if and only if for every assignment g that assigns an individual set $g (X) = \underset{̲}{X}$ in $M$ as a value to the variable X, the sentence $[\underset{̲}{X} \ X] Ψ$ is valid in $M$ ;
(ii): The sentence $[\underset{̲}{X} \ X] Ψ$ obtained in clause (i) is a sentence without a quantifier with a composite variable, hence with usual semantics.

The semantics of a sentence

\exists X Ψ

with an occurrence of a quantifier

\forall f_{X}

or

\exists f_{X}

is left as an exercise. □

After the introduction of the “standard” first-order axioms of

T

, the language

L_{T}

will be extended to enable the formulation of the desired non-classical formulas.

2.2. Set-Theoretical Axioms

Below the set-theoretical axioms are listed; these are all standard first-order formulas. Due to the simplicity of the axioms, comments are kept to a bare minimum.

Axiom 1.

Extensionality Axiom for Sets (EXT): two sets X and Y are identical if and only if they have the same things (sets and functions) as elements.

\forall X \forall Y (X = Y \Leftrightarrow \forall α (α \in X \Leftrightarrow α \in Y))

(18)

□

Axiom 2.

For any set X, any function

f_{X}

is not identical to any set Y:

\forall X \forall f_{X} \forall Y (f_{X} \neq Y)

(19)

□

Axiom 3.

A set X has no domain or codomain, nor does it map any thing to an image:

\forall X \forall α \forall β (\neg (X : α ↠ β) \land \neg (X : α \mapsto β))

(20)

□

These latter two axioms establish that sets are different from functions on a set and do not have the properties of functions on a set.

Axiom 4.

Empty Set Axiom (EMPTY): there exists a set X, designated by the constant ∅, that has no elements.

\exists X (X = \emptyset \land \forall α (α \notin X))

(21)

□

Axiom 5.

Axiom of Pairing (PAIR): for every thing

α

and every thing

β

there exists a set X that has precisely the things

α

and

β

as its elements.

\forall α \forall β \exists X \forall γ (γ \in X \Leftrightarrow γ = α \lor γ = β)

(22)

□

Remark 3.

Using set-builder notation, the empty set can be interpreted as the individual

{}

in the intended model. Furthermore, given individual things

\underset{̲}{α}

and

\underset{̲}{β}

in the universe of the intended model, the pair set of

\underset{̲}{α}

and

\underset{̲}{β}

can then be identified with the individual

{\underset{̲}{α}, \underset{̲}{β}}

. Note that this is a singleton if

\underset{̲}{α} = \underset{̲}{β}

. □

Remark 4.

The sets in the present theory cannot be viewed as multisets, the objects of multiset theory [22]. In the framework of our theory

T

, we have for any thing

α

in the category of sets and functions that

{α} = {α, α} = {α, α, α}

. In multiset theory, this is not the case: an element may occur more than once in a multiset. We then have

{α, α} \neq {α, α, α}

for multisets

{α, α}

and

{α, α, α}

. □
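
The contrast drawn in Remark 4 can be illustrated with a minimal Python sketch. It assumes that `frozenset` stands in for the sets of the theory and that `collections.Counter` stands in for multisets; the name `alpha` is a hypothetical placeholder for an arbitrary thing and is not notation of the theory.

```python
from collections import Counter

alpha = "a"  # hypothetical stand-in for an arbitrary thing

# Sets: repeated listing of an element is irrelevant (extensionality).
set_two = frozenset([alpha, alpha])
set_three = frozenset([alpha, alpha, alpha])
sets_equal = set_two == set_three == frozenset([alpha])

# Multisets: the multiplicity of an element matters.
multi_two = Counter([alpha, alpha])
multi_three = Counter([alpha, alpha, alpha])
multisets_equal = multi_two == multi_three
```

Under these assumptions `sets_equal` holds while `multisets_equal` does not, which mirrors the distinction made in the remark.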

Definition 7. (Extension of vocabulary of

L_{T}

.)

If t is a term, then

{(t)}^{+}

is a term, to be called “the singleton of t”—which may be written as

t^{+}

if no confusion arises. In particular, if

α

is a variable ranging over all things and x a variable ranging over all sets, then

α^{+}

is a variable ranging over all singletons and

x^{+}

a variable ranging over all singletons of sets. We thus have

\forall α \forall β (β = α^{+} \Leftrightarrow \exists X (β = X \land \forall γ (γ \in X \Leftrightarrow γ = α)))

(23)

Likewise for

x^{+}

. □

Notation 1.

On account of Definition 7, the constants

\emptyset^{+}, \emptyset^{+ +}, \emptyset^{+ + +}, \dots

are contained in the language

L_{T}

. Therefore, we can introduce the (finite) Zermelo ordinals at this point as a notation for these singletons:

\{\begin{matrix} 0 : = \emptyset \\ 1 : = \emptyset^{+} \\ 2 : = \emptyset^{+ +} \\ ⋮ \end{matrix}

(24)

According to the literature, the idea stems from unpublished work by Zermelo in 1916 [23]. □
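
As an informal illustration of Notation 1, the finite Zermelo ordinals can be sketched in Python with nested `frozenset` values. This is only a finite model of the notation, not part of the theory; the function names `singleton` and `zermelo` are assumptions introduced for the sketch.

```python
EMPTY = frozenset()  # plays the role of the constant for the empty set


def singleton(t):
    """Return the singleton {t} of a hashable thing t."""
    return frozenset([t])


def zermelo(n):
    """Return the n-th finite Zermelo ordinal: 0 is {}, and n+1 is {n}."""
    result = EMPTY
    for _ in range(n):
        result = singleton(result)
    return result


zero, one, two = zermelo(0), zermelo(1), zermelo(2)
```

In this sketch every Zermelo ordinal other than 0 has exactly one element, namely its predecessor.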

Axiom 6.

Sum Set Axiom (SUM): for every set X there exists a set Y made up of the elements of the elements of X.

\forall X \exists Y \forall α (α \in Y \Leftrightarrow \exists Z (Z \in X \land α \in Z))

(25)

□

Remark 5.

Given an individual set

\underset{̲}{X}

in the universe of the intended model, the sum set of

\underset{̲}{X}

can be denoted by the symbol

⋃ \underset{̲}{X}

and, using set-builder notation, can be identified with the individual

{α | \exists Z \in \underset{̲}{X} (α \in Z)}

. □

Axiom 7.

Powerset Axiom (POW): for every set X there is a set Y made up of the subsets of X.

\forall X \exists Y \forall α (α \in Y \Leftrightarrow \exists Z (Z \subset X \land α = Z))

(26)

□

Remark 6.

Given an individual set

\underset{̲}{X}

in the universe of the intended model, the powerset of

\underset{̲}{X}

can be denoted by the symbol

P (\underset{̲}{X})

and, using set-builder notation, be identified with the individual

{x | x \subset \underset{̲}{X}}

. □
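
For finite sets, the sum set and the powerset described in Remarks 5 and 6 can be computed directly. The following Python sketch assumes that every element of the argument of `sum_set` is itself a set (in the theory, elements that are functions contribute nothing, since functions have no elements); the function names are hypothetical.

```python
from itertools import chain, combinations


def sum_set(x):
    """Return the set of all elements of the elements of x (each element a frozenset)."""
    return frozenset(chain.from_iterable(x))


def power_set(x):
    """Return the set of all subsets of the finite set x."""
    items = list(x)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


X = frozenset([frozenset([1, 2]), frozenset([2, 3])])
union_of_X = sum_set(X)
subsets_of_union = power_set(union_of_X)
```

The sketch only covers finite sets; POW and SUM themselves are of course not restricted in this way.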

Axiom 8.

Infinite Ordinal Axiom (INF): the infinite ordinal

ω

is the set of all finite Zermelo ordinals.

0 \in ω \land \forall α (α \in ω \Rightarrow α^{+} \in ω) \land \forall β \in ω (\nexists γ \in ω (β = γ^{+}) \Leftrightarrow β = \emptyset)

(27)

□

Remark 7.

The set

ω

in INF is uniquely determined. In the intended model, the set

ω

can be denoted by the symbol

N

and, using set-builder notation, be identified with the individual

N : = {{}, {{}}, {{{}}}, \dots}

. □

Axiom 9.

Axiom of Regularity (REG): every nonempty set X contains an element

α

that has no elements in common with X.

\forall X \neq \emptyset \exists α (α \in X \land \forall β (β \in α \Rightarrow β \notin X))

(28)

□
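
The condition stated by REG can be checked mechanically for finite sets. The sketch below assumes sets built from nested `frozenset` values; such values are well-founded by construction, so the check is expected to succeed for every nonempty example. The function name `has_disjoint_element` is hypothetical.

```python
def has_disjoint_element(x):
    """Check whether some element of x has no elements in common with x."""
    return any(all(beta not in x for beta in alpha) for alpha in x)


zero = frozenset()
one = frozenset([zero])
two = frozenset([one])
sample = frozenset([zero, one, two])
sample_is_regular = has_disjoint_element(sample)
```

A set violating REG, such as a set that is its own element, cannot be built from `frozenset` values at all, which is the finite counterpart of what the axiom excludes.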

Definition 8.

For any things that are

α

and

β

, the two-tuple

〈 α, β 〉

is the pair set of

α

and the pair set of

α

and

β

; using the iota-operator we get

〈 α, β 〉 : = ı x (\forall γ (γ \in x \Leftrightarrow γ = α \lor \exists Z (γ = Z \land \forall η (η \in Z \Leftrightarrow η = α \lor η = β))))

(29)

□

A simple corollary of Definition 8 is that for any things that are

α

and

β

, the two-tuple

〈 α, β 〉

always exists. There is, thus, no danger of nonsensical terms involved in the use of the iota-operator in Definition 8.

Remark 8.

Given individual things

\underset{̲}{α}

and

\underset{̲}{β}

in the universe of the intended model, the two-tuple

〈 \underset{̲}{α}, \underset{̲}{β} 〉

can, using set-builder notation, be identified with the individual

{\underset{̲}{α}, {\underset{̲}{α}, \underset{̲}{β}}}

. □
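
The identification in Remark 8 can be mirrored in a short Python sketch. It assumes hashable Python values as stand-ins for things and `frozenset` as a stand-in for sets; the function name `two_tuple` is hypothetical.

```python
def two_tuple(alpha, beta):
    """Return the set {alpha, {alpha, beta}} that represents the two-tuple of alpha and beta."""
    return frozenset([alpha, frozenset([alpha, beta])])


p = two_tuple(0, 1)
q = two_tuple(1, 0)
diagonal = two_tuple(0, 0)  # the inner pair set collapses to the singleton {0}
```

In this sketch the order of the components matters, as one expects of a two-tuple: `p` and `q` are different sets.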

In principle, these set-theoretical axioms suffice: the function-theoretical axioms in the next section provide other means for the construction of sets.

2.3. Standard Function-Theoretical Axioms

Axiom 10.

A function

f_{X}

on a set X has no elements:

\forall X \forall f_{X} \forall α (α \notin f_{X})

(30)

□

Remark 9.

One might think that Axiom 10 destroys the uniqueness of the empty set. However, that is not true. It is true that a function on a set X and the empty set share the property that they have no elements, but the empty set is the only set that has this property: Axiom 2 guarantees, namely, that a function on a set X is not a set! □

Axiom 11.

General Function-Theoretical Axiom (GEN-F): for any nonempty set X, any function

f_{X}

has a set Y as domain and a set Z as codomain, and maps every element

α

in Y to a unique image

β

:

\forall X \forall f_{X} (X \neq \emptyset \Rightarrow \exists Y \exists Z (f_{X} : Y ↠ Z \land \forall α \in Y \exists! β (f_{X} : α \mapsto β)))

(31)

□

Axiom 12.

For any set X, any function

f_{X}

has no other domain than X:

\forall X \forall f_{X} \forall α (α \neq X \Rightarrow \forall ξ \neg (f_{X} : α ↠ ξ))

(32)

□

Axiom 13.

For any set X, any function

f_{X}

does not take a thing outside X as argument:

\forall X \forall f_{X} \forall α \notin X \forall β \neg (f_{X} : α \mapsto β)

(33)

□

Remark 10.

Axiom 11 dictates that precisely one expression

f_{X} : α \mapsto β

is true for each

α \in X

. This does not a priori exclude that such an expression can also be true for another thing

α

not in X. However, by Axiom 13 this is excluded. □

Axiom 14.

For any nonempty set X and any function

f_{X}

, the image set is the only codomain:

\forall X \forall f_{X} (X \neq \emptyset \Rightarrow \forall β (f_{X} : X ↠ β \Rightarrow \exists Z (β = Z \land \forall γ (γ \in Z \Leftrightarrow \exists η \in X (f_{X} : η \mapsto γ)))))

(34)

(This is the justification for the use of the two-headed arrow “↠”, commonly used for surjections.) □

Remark 11.

Note that these first function-theoretical axioms already provide a tool to construct a set: if we can construct a new function

f_{X}

on a set X from existing functions (an axiom will be given further below), then these axioms guarantee the existence of a unique codomain made up of all the images of the elements of X under

f_{X}

. □

Remark 12.

Given an individual set

\underset{̲}{X}

and an individual function

{\underset{̲}{f}}_{\underset{̲}{X}}

in the universe of the intended model, this unique codomain can be denoted by

{\underset{̲}{f}}_{\underset{̲}{X}} [\underset{̲}{X}]

or

cod ({\underset{̲}{f}}_{\underset{̲}{X}})

, and, using set-builder notation, be identified with the individual

{β | \exists α \in \underset{̲}{X} ({\underset{̲}{f}}_{\underset{̲}{X}} : α \mapsto β)}

in the universe of the intended model. Furthermore, given a thing

\underset{̲}{α}

in

\underset{̲}{X}

, its unique image under

{\underset{̲}{f}}_{\underset{̲}{X}}

can be denoted by the symbol

{\underset{̲}{f}}_{\underset{̲}{X}} (\underset{̲}{α})

. □
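
The content of Axioms 11 to 14 can be illustrated for finite sets with a minimal Python sketch. It assumes that a function on a set X is given by an explicit table of images; the class name `FunctionOnSet` and its methods are hypothetical helpers and not notation of the theory.

```python
class FunctionOnSet:
    """A function on a finite set X, given by a table of images (illustrative only)."""

    def __init__(self, domain, mapping):
        domain = frozenset(domain)
        if set(mapping) != set(domain):
            # Exactly the elements of X have an image (compare Axioms 11 and 13).
            raise ValueError("mapping must be defined on exactly the domain")
        self.domain = domain
        self._mapping = dict(mapping)

    def __call__(self, alpha):
        """Return the unique image of alpha under the function."""
        return self._mapping[alpha]

    def codomain(self):
        """Return the image set, which is the only codomain (compare Axiom 14)."""
        return frozenset(self._mapping.values())


f = FunctionOnSet({0, 1, 2}, {0: "a", 1: "a", 2: "b"})
image_set = f.codomain()
```

The design choice to compute the codomain from the table, rather than to store it separately, reflects that the image set is the only codomain.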

Notation 2.

At this point we can introduce expressions

f_{X} : X \to Y

, to be read as “the function f-on-X is a function from the set X to the set Y”, by the postulate of meaning

f_{X} : X \to Y \Leftrightarrow \exists Z (f_{X} : X ↠ Z \land Z \subset Y)

(35)

This provides a connection to existing mathematical practices. □

Axiom 15.

Inverse Image Set Axiom (INV): for any nonempty set X and any function

f_{X}

with domain X and any codomain Y, there is for any thing

β

a set

Z \subset X

that contains precisely the elements of X that are mapped to

β

by

f_{X}

:

\forall X \neq \emptyset \forall f_{X} \forall Y (f_{X} : X ↠ Y \Rightarrow \forall β \exists Z \forall α (α \in Z \Leftrightarrow α \in X \land f_{X} : α \mapsto β))

(36)

□

Remark 13.

Note that INV, in addition to GEN-F, also provides a tool to construct a set: if we have constructed a new function

f_{X}

with domain X from existing functions, then this axiom guarantees the existence of the inverse image set of any thing

β

in the codomain of

f_{X}

. □

Remark 14.

Given an individual set

\underset{̲}{X}

, an individual function

{\underset{̲}{f}}_{\underset{̲}{X}}

, and an individual thing

\underset{̲}{β}

in the universe of the intended model, the unique inverse image set can be denoted by the symbol

{\underset{̲}{f}}_{\underset{̲}{X}}^{- 1} (\underset{̲}{β})

and can, using set-builder notation, be identified with the individual

{α | α \in \underset{̲}{X} \land {\underset{̲}{f}}_{\underset{̲}{X}} : α \mapsto \underset{̲}{β}}

in the universe of the intended model. □
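
For a function on a finite set, the inverse image set of Remark 14 can be computed as follows. The sketch assumes that the function is represented by a Python `dict` whose keys are the elements of X; the names `inverse_image` and `f` are hypothetical.

```python
def inverse_image(mapping, beta):
    """Return the set of all alpha in X that the function maps to beta."""
    return frozenset(alpha for alpha, image in mapping.items() if image == beta)


f = {0: "a", 1: "a", 2: "b"}  # a hypothetical function on X = {0, 1, 2}
fibre_of_a = inverse_image(f, "a")
fibre_of_c = inverse_image(f, "c")  # "c" is not an image, so the result is empty
```

Note that the inverse image set is a subset of X for any thing whatsoever, and is empty for things outside the codomain.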

Axiom 16.

Extensionality Axiom for Functions (EXT-F): for any set X and any function

f_{X}

, and for any set Y and any function

g_{Y}

, the function

f_{X}

and the function

g_{Y}

are identical if and only if their domains are identical and their images are identical for every argument:

\forall X \forall f_{X} \forall Y \forall g_{Y} (f_{X} = g_{Y} \Leftrightarrow X = Y \land \forall α \forall β (f_{X} : α \mapsto β \Leftrightarrow g_{Y} : α \mapsto β))

(37)

□

Axiom 17.

Inactive Function Axiom (IN-F): there exists a function

f_{\emptyset}

, denoted by the constant

1_{\emptyset}

, which has the empty set as domain and codomain, and which does not map any argument to any image:

\exists f_{\emptyset} (f_{\emptyset} = 1_{\emptyset} \land f_{\emptyset} : \emptyset ↠ \emptyset \land \forall α \forall β \neg (f_{\emptyset} : α \mapsto β))

(38)

□

Note that there can be no other functions on the empty set than the inactive function

1_{\emptyset}

, since the image set is always empty: the atomic expression

f_{\emptyset} : \emptyset ↠ A

cannot be true for any nonempty set A.

Axiom 18.

Ur-Function Axiom (UFA): for any things

α

and

β

there exists an ur-function

f_{α^{+}}

with domain

α^{+}

and codomain

β^{+}

that maps

α

to

β

:

\forall α \forall β \exists f_{α^{+}} (f_{α^{+}} : α^{+} ↠ β^{+} \land f_{α^{+}} : α \mapsto β)

(39)

□

Remark 15.

Given individual things

\underset{̲}{α}

and

\underset{̲}{β}

in the universe of the intended model, the ur-function on

{\underset{̲}{α}}

that maps

\underset{̲}{α}

to

\underset{̲}{β}

can, using set-builder notation, be identified with the individual

{〈 \underset{̲}{α}, \underset{̲}{β} 〉}_{{\underset{̲}{α}}}

in the universe of the intended model. Note that the graph of the ur-function is guaranteed to exist. □

Axiom 19.

Axiom of Regularity for Functions (REG-F): for any set X and any function

f_{X}

with any codomain Y,

f_{X}

neither takes itself as argument nor has itself as image:

\forall X \forall f_{X} \forall Y (f_{X} : X ↠ Y \Rightarrow \forall α (\neg (f_{X} : f_{X} \mapsto α) \land \neg (f_{X} : α \mapsto f_{X})))

(40)

□

Remark 16.

As to the first part, Wittgenstein already mentioned that a function cannot have itself as argument [24]. The second part is to exclude the existence of pathological “Siamese twin functions”, e.g., the ur-functions

f_{X}

and

h_{Y}

given, using set-builder notation, by

\begin{matrix} f_{X} : {h_{Y}} ↠ {f_{X}}, f_{X} : h_{Y} \mapsto f_{X} \end{matrix}

(41)

\begin{matrix} h_{Y} : {f_{X}} ↠ {h_{Y}}, h_{Y} : f_{X} \mapsto h_{Y} \end{matrix}

(42)

We thus have

dom (f_{X}) = X = {h_{Y}}

and

dom (h_{Y}) = Y = {f_{X}}

; if one tries to substitute that in the above equations, then one gets “infinite towers”. These functions may not be constructible from the axioms, but they could exist a priori in the category of sets and functions: to avoid that, we practice mathematical eugenics and prevent them from occurring with REG-F. See Figure 1 for an illustration. The name “Siamese twin functions” is derived from the name “Siamese twin sets” for sets A and B satisfying

A \in B \land B \in A

, as published by Muller in [25]. □

2.4. The Non-Classical Function-Theoretical Axiom and Inference Rules

Definition 9.

The vocabulary of

L_{T}

as given by Definition 2 is extended:

(i): With symbols “ı”, the iota-operator, and “⋀”, the conjunctor;
(ii): For any constant $\hat{X}$ denoting a set, with enough composite symbols ${\hat{f}}_{α^{+}}$ , ${\hat{h}}_{β^{+}}$ , … such that each of these is a variable that ranges over a family of ur-functions indexed in $\hat{X}$ .

The syntax of

L_{T}

as given by Definition 3 is extended with the following clauses:

(iii): If ${\hat{f}}_{α^{+}}$ is a variable as in clause (ii) of Definition 9, then ${\hat{f}}_{α^{+}}$ is a term;
(iv): If t is a term and $u_{t^{+}}$ is a composite term with an occurrence of t, and $β$ is a variable ranging over all things, then $ı β (u_{t^{+}} : t \mapsto β)$ is an iota-term denoting the image of t under the ur-function $u_{t^{+}}$ ;
(v): If $\hat{X}$ is a constant designating a set, $α$ a simple variable ranging over all things, and $Ψ (α)$ an atomic formula of the type $t : t^{'} \mapsto t^{''}$ that is open in $α$ , then $⋀_{α \in \hat{X}} Ψ (α)$ is a formula;
(vi): If $Φ$ is a formula with a subformula $⋀_{α \in \hat{X}} Ψ (α)$ as in (v) with an occurrence of a composite variable $f_{α^{+}}$ , then ${(\forall f_{α^{+}})}_{α \in \hat{X}} Φ$ and ${(\exists f_{α^{+}})}_{α \in \hat{X}} Φ$ are formulas;
(vii): If X is a simple variable ranging over sets, and $Ψ$ a formula with no occurrence of X but with a subformula ${(\forall f_{α^{+}})}_{α \in \hat{X}} Φ$ as in (vi), then $\forall X [X \ \hat{X}] Ψ$ and $\exists X [X \ \hat{X}] Ψ$ are formulas with a subformula ${(\forall f_{α^{+}})}_{α \in X} [X \ \hat{X}] Φ$ ;
(viii): If X is a simple variable ranging over sets, and $Ψ$ a formula with no occurrence of X but with a subformula ${(\exists f_{α^{+}})}_{α \in \hat{X}} Φ$ as in (vi), then $\forall X [X \ \hat{X}] Ψ$ and $\exists X [X \ \hat{X}] Ψ$ are formulas with a subformula ${(\exists f_{α^{+}})}_{α \in X} [X \ \hat{X}] Φ$ . □

Remark 17.

Concerning the iota-operator in clause (iv) of Definition 9, we thus have

\forall α \forall f_{α^{+}} \forall γ (γ = ı β (f_{α^{+}} : α \mapsto β) \Leftrightarrow f_{α^{+}} : α \mapsto γ)

(43)

Note that upon assigning constant values to

α

and

f_{α^{+}}

, the term

ı β (f_{α^{+}} : α \mapsto β)

always refers to an existing, unique thing: there is thus no danger of nonsensical terms involved in this use of the iota-operator. □

Definition 10.

The following special language elements are added:

(i)

If

\hat{X}

is a constant designating a set, X and

α

simple variables ranging over sets and over all things, respectively, and

f_{α^{+}}

a composite variable ranging over ur-functions on

α^{+}

, then

${(\forall f_{α^{+}})}_{α \in \hat{X}}$ is a multiple universal quantifier;
${(\exists f_{α^{+}})}_{α \in \hat{X}}$ is a multiple existential quantifier;
${(\forall f_{α^{+}})}_{α \in X}$ in the scope of a quantifier $\forall X$ is a universally generalized multiple universal quantifier;
${(\forall f_{α^{+}})}_{α \in X}$ in the scope of a quantifier $\exists X$ is an existentially generalized multiple universal quantifier;
${(\exists f_{α^{+}})}_{α \in X}$ in the scope of a quantifier $\forall X$ is a universally generalized multiple existential quantifier;
${(\exists f_{α^{+}})}_{α \in X}$ in the scope of a quantifier $\exists X$ is an existentially generalized multiple existential quantifier;

(ii)

If

\hat{X}

is a constant designating a set, X a simple variable ranging over sets, and

α

a simple variable ranging over all things, then

$⋀_{α \in \hat{X}}$ is a conjunctive operator with constant range;
$⋀_{α \in X}$ is a conjunctive operator with variable range. □

Concerning the language elements in clause (i), if ʞ and

ʞ^{'}

are existential or universal quantification symbols, then we can generally say that

{(ʞ f_{α^{+}})}_{α \in \hat{X}}

is a multiple quantifier and that

{(ʞ f_{α^{+}})}_{α \in X}

in the scope of a quantifier

ʞ^{'} X

is a generalized multiple quantifier.

Definition 11.

If

\hat{X}

is a constant designating a set and

⋀_{α \in \hat{X}} Ψ

is a subformula of a formula

Φ

, then

Ψ

is the scope of the conjunctive operator; furthermore,

(i): If there is an occurrence of a variable $α$ and/or ${\hat{f}}_{α^{+}}$ in the scope of the conjunctive operator, then a formula $⋀_{α \in \hat{X}} Ψ$ has a semantic occurrence of each of the constants $\hat{α}$ referring to a thing in the range of the variable $α$ , and/or of each of the constant ur-functions ${\hat{u}}_{{\hat{α}}^{+}}$ over which the variable ${\hat{f}}_{α^{+}}$ ranges; a subformula $⋀_{α \in \hat{X}} Ψ$ has thus to be viewed as the conjunction of all the formulas $[\hat{α} \ α] [{\hat{u}}_{{\hat{α}}^{+}} \ {\hat{f}}_{α^{+}}] Ψ$ with $\hat{α} \in \hat{X}$ .
(ii): If there is an occurrence of a composite variable $f_{α^{+}}$ in the scope of the conjunctive operator, then the subformula $⋀_{α \in \hat{X}} Ψ (f_{α^{+}})$ has a free semantic occurrence of each of the simple variables $f_{{\hat{α}}^{+}}$ ranging over ur-functions on the singleton of $\hat{α}$ with $\hat{α} \in \hat{X}$ —the formula $⋀_{α \in \hat{X}} Ψ (f_{α^{+}})$ has thus to be viewed as the conjunction of all the formulas $[\hat{α} \ α] [f_{{\hat{α}}^{+}} \ f_{α^{+}}] Ψ$ . □

Definition 12.

If

\hat{X}

is a constant designating a set, ʞ an existential or universal quantification symbol, and

{(ʞ f_{α^{+}})}_{α \in \hat{X}} Ψ

a subformula of a formula

Φ

, then

Ψ

is the scope of the multiple quantifier; likewise for the scope of the generalized multiple quantifiers of Definition 10. If a formula

Ψ

has a free semantic occurrence of each of the simple variables

f_{{\hat{α}}^{+}}

with a constant

\hat{α} \in \hat{X}

, then a formula

{(ʞ f_{α^{+}})}_{α \in \hat{X}} Ψ

has a bounded semantic occurrence of each of the simple variables

f_{{\hat{α}}^{+}}

with a constant

\hat{α} \in \hat{X}

. A non-classical formula

Ψ

without free occurrences of variables is a sentence. If X is a simple variable ranging over sets and

Ψ

is a sentence with an occurrence of a multiple quantifier

{(ʞ f_{α^{+}})}_{α \in \hat{X}}

and with no occurrence of X, then

\forall X [X \ \hat{X}] Ψ

and

\exists X [X \ \hat{X}] Ψ

are sentences with an occurrence of a generalized multiple quantifier

{(ʞ f_{α^{+}})}_{α \in X}

. □

Axiom 20.

Sum Function Axiom (SUM-F): for any nonempty set X and for any family of ur-functions

f_{α^{+}}

indexed in X, there is a sum function

F_{X}

with some codomain Y such that the conjunction of all mappings by

F_{X}

of

α

to its image under the ur-function

f_{α^{+}}

holds for

α

ranging over X:

\forall X \neq \emptyset {(\forall f_{α^{+}})}_{α \in X} \exists F_{X} \exists Y (F_{X} : X ↠ Y \land ⋀_{α \in X} F_{X} : α \mapsto ı β (f_{α^{+}} : α \mapsto β))

(44)

□

With SUM-F, all non-logical axioms of the present non-classical theory have been introduced. However, rules of inference must still be given to derive meaningful theorems from SUM-F. The rules of inference that follow therefore have to be seen as part of the logic.

Inference Rule 1.

Nonstandard Universal Elimination:

\forall X Ψ ({(ʞ f_{α^{+}})}_{α \in X}, ⋀_{α \in X}) ⊢ [\hat{X} \ X] Ψ \quad \text{for any constant } \hat{X}

(45)

where

Ψ ({(ʞ f_{α^{+}})}_{α \in X}, ⋀_{α \in X})

is a formula with an occurrence of a generalized multiple quantifier and of a conjunctive operator with variable range, and where

[\hat{X} \ X] Ψ

is a formula with an occurrence of a multiple quantifier

{(ʞ f_{α^{+}})}_{α \in \hat{X}}

and of a conjunctive operator

⋀_{α \in \hat{X}}

with constant range. □

Thus, from SUM-F we can deduce a formula

{(\forall f_{α^{+}})}_{α \in \hat{X}} \exists F_{\hat{X}} \exists Y (F_{\hat{X}} : \hat{X} ↠ Y \land ⋀_{α \in \hat{X}} F_{\hat{X}} : α \mapsto ı β (f_{α^{+}} : α \mapsto β))

(46)

for any constant

\hat{X}

designating a set.

Inference Rule 2.

Multiple Universal Elimination:

{(\forall f_{α^{+}})}_{α \in \hat{X}} Φ (f_{α^{+}}) ⊢ [{\hat{f}}_{α^{+}} \ f_{α^{+}}] Φ

(47)

where

Φ (f_{α^{+}})

is a formula with an occurrence of the same composite variable

f_{α^{+}}

that also occurs in the preceding multiple universal quantifier

{(\forall f_{α^{+}})}_{α \in \hat{X}}

, and where

{\hat{f}}_{α^{+}}

is a variable as meant in clause (ii) of Definition 9. □

Thus, from a sentence (46), which is an instance of SUM-F derived by inference rule 1, we can derive a formula

\exists F_{\hat{X}} \exists Y (F_{\hat{X}} : \hat{X} ↠ Y \land ⋀_{α \in \hat{X}} F_{\hat{X}} : α \mapsto ı β ({\hat{f}}_{α^{+}} : α \mapsto β))

(48)

for each variable

{\hat{f}}_{α^{+}}

ranging over a family of ur-functions indexed in

\hat{X}

. Note that the range of such a variable

{\hat{f}}_{α^{+}}

is constructed by assigning to each of the simple variables

f_{{\hat{α}}^{+}}

semantically occurring in Formula (46) a constant value

{\hat{u}}_{{\hat{α}}^{+}}

.

Inference Rule 3.

Nonstandard Rule-C:

\exists t Φ ⊢ [\hat{t} \ t] Φ

(49)

where t is a simple variable x ranging over sets or a simple variable

f_{\hat{X}}

ranging over functions on a constant set, and where

\hat{t}

is a constant in the range of t that does not occur in

Φ

but for which

[\hat{t} \ t] Φ

holds. If

Φ

has an occurrence of a generalized multiple quantifier

{(ʞ f_{α^{+}})}_{α \in t}

, then

[\hat{t} \ t] Φ

has an occurrence of a multiple quantifier

{(ʞ f_{α^{+}})}_{α \in \hat{t}}

; if

Φ

has an occurrence of a conjunctive operator

⋀_{α \in t}

with variable range, then

[\hat{t} \ t] Φ

has an occurrence of a conjunctive operator

⋀_{α \in \hat{t}}

with constant range. □

Thus, from SUM-F we can deduce a formula

\exists Y ({\hat{F}}_{\hat{X}} : \hat{X} ↠ Y) \land ⋀_{α \in \hat{X}} {\hat{F}}_{\hat{X}} : α \mapsto ı β ({\hat{f}}_{α^{+}} : α \mapsto β)

(50)

which is a conjunction of a standard first-order formula and a non-classical formula with an occurrence of the new constant

{\hat{F}}_{\hat{X}}

, designating the sum function on

\hat{X}

, in the scope of a conjunctive operator. Of course, this conjunction

Ψ \land Φ

is true if and only if both its members are true. This requires one more inference rule.

Inference Rule 4.

Conjunctive Operator Elimination:

⋀_{α \in \hat{X}} Ψ (α) ⊢ [\hat{α} \ α] Ψ (α)

(51)

where

Ψ (α)

is a formula of the type

t : t^{'} \mapsto t^{''}

that is open in

α

, and

\hat{α}

any constant designating an element of

\hat{X}

. □

Thus, from the right member of the conjunction (50) we can derive an entire schema, consisting of one standard first-order formula

{\hat{F}}_{\hat{X}} : \hat{α} \mapsto ı β ({\hat{u}}_{{\hat{α}}^{+}} : \hat{α} \mapsto β)

(52)

for each constant

{\hat{u}}_{{\hat{α}}^{+}}

. So given an infinitary conjunction

⋀_{α \in \hat{X}} {\hat{F}}_{\hat{X}} : α \mapsto ı β ({\hat{f}}_{α^{+}} : α \mapsto β)

, the sentences (52) derived by rule 4 are true for each constant

{\hat{u}}_{{\hat{α}}^{+}}

semantically occurring in Formula (50).

Remark 18.

Given an individual set

\underset{̲}{X}

and a variable

{\underset{̲}{f}}_{α^{+}}

that ranges over a family of ur-functions indexed in

\underset{̲}{X}

in the universe of the intended model, the unique sum function

{\underset{̲}{F}}_{\underset{̲}{X}}

for which

⋀_{α \in \underset{̲}{X}} {\underset{̲}{F}}_{\underset{̲}{X}} : α \mapsto ı β ({\underset{̲}{f}}_{α^{+}} : α \mapsto β)

(53)

can, using set-builder notation, be identified with the individual

{\underset{̲}{F}}_{\underset{̲}{X}} = {〈 α, β 〉 | α \in \underset{̲}{X} \land β = {\underset{̲}{f}}_{α^{+}} (α)}_{\underset{̲}{X}}

(54)

in the universe of the intended model. The graph of

{\underset{̲}{F}}_{\underset{̲}{X}}

, which can be identified with the set

{〈 α, β 〉 | α \in \underset{̲}{X} \land β = {\underset{̲}{f}}_{α^{+}} (α)}

, is certain to exist, see Theorem 1 (next section). So, constructing a sum function is a means to constructing a set. □

Example 1.

Consider the infinite ordinal

ω

from Axiom 8: its elements are the finite ordinals 0, 1, 2, … Applying Nonstandard Universal Elimination, we thus deduce from SUM-F that

{(\forall f_{α^{+}})}_{α \in ω} \exists F_{ω} \exists Y (F_{ω} : ω ↠ Y \land ⋀_{α \in ω} F_{ω} : α \mapsto ı β (f_{α^{+}} : α \mapsto β))

(55)

On account of the ur-function Axiom 18 we have

\forall x \in ω \exists f_{x^{+}} (f_{x^{+}} : x \mapsto x)

(56)

That is, for any finite ordinal x there is an ur-function on the singleton of x that maps x to itself. Let the variable

{\hat{f}}_{α^{+}}^{1}

range over these identity ur-functions; applying Multiple Universal Elimination to the sentence (55) then yields a sentence

\exists F_{ω} \exists Y (F_{ω} : ω ↠ Y \land ⋀_{α \in ω} F_{ω} : α \mapsto ı β ({\hat{f}}_{α^{+}}^{1} : α \mapsto β))

(57)

Introducing the new constant

1_{ω}

by applying Rule-C to the sentence (57) and substituting

ı β ({\hat{f}}_{α^{+}}^{1} : α \mapsto β) = α

then yields the conjunction

1_{ω} : ω ↠ ω \land ⋀_{α \in ω} 1_{ω} : α \mapsto α

(58)

By applying Conjunctive Operator Elimination to the right member of this conjunction (58), we obtain the countable schema

\{\begin{matrix} 1_{ω} : 0 \mapsto 0 \\ 1_{ω} : 1 \mapsto 1 \\ 1_{ω} : 2 \mapsto 2 \\ ⋮ \end{matrix}

(59)

This example demonstrates, strictly within the language of

T

, how SUM-F and the inference rules can be used to construct the identity function on

ω

from a family of ur-functions indexed in

ω

. □
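
A finite analogue of the construction in Example 1 can be sketched in Python. The sketch assumes a finite cut-off `N` in place of the infinite set ω, represents an ur-function by a one-entry `dict`, and represents a family of ur-functions indexed in X by a `dict` from indices to ur-functions; all names are hypothetical and the code is an illustration, not a derivation within the theory.

```python
def ur_function(alpha, beta):
    """Return the ur-function on the singleton of alpha that maps alpha to beta."""
    return {alpha: beta}


def sum_function(family):
    """Combine a family of ur-functions indexed in X into one function on X.

    `family` maps each alpha in X to the ur-function chosen for that alpha.
    """
    total = {}
    for alpha, ur in family.items():
        if set(ur) != {alpha}:
            raise ValueError("each ur-function must live on the singleton of its index")
        total[alpha] = ur[alpha]  # the image of alpha under its ur-function
    return total


N = 5  # finite cut-off; the set omega itself is infinite
identity_family = {n: ur_function(n, n) for n in range(N)}
identity_on_prefix = sum_function(identity_family)
codomain = frozenset(identity_on_prefix.values())
```

Choosing a different family, for instance one that maps each index to its successor, yields a different sum function on the same set by the same procedure.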

Remark 19.

Summarizing, it has thus to be understood:

(i): That SUM-F is a typographically finite sentence;
(ii): That an instance (46) of SUM-F, deduced by applying Nonstandard Universal Elimination, is a typographically finite sentence;
(iii): That a Formula (48), deduced from an instance of SUM-F by applying Multiple Universal Elimination, is a typographically finite sentence;
(iv): That a conjunction (50), deduced by applying Rule-C to a sentence deduced from SUM-F by successively applying Nonstandard Universal Elimination and Multiple Universal Elimination, is a typographically finite sentence.

We thus get that SUM-F being true means that every instance (46) of SUM-F obtained by Nonstandard Universal Elimination is true; that an instance (46) of SUM-F being true means that for any variable

{\hat{f}}_{α^{+}}

ranging over a family of ur-functions indexed in

\hat{X}

, Formula (48) is true; and that a non-classical Formula (48) with an occurrence of a variable

{\hat{f}}_{α^{+}}

being true means that, after applying Rule-C, the schema of standard formulas (52) obtained by Conjunctive Operator Elimination is true—one true standard formula obtains for every ur-function

{\hat{u}}_{{\hat{α}}^{+}}

in the range of the variable

{\hat{f}}_{α^{+}}

. □

This concludes the axiomatic introduction of the non-classical theory

T

. Since we are primarily interested in the theorems that can be derived from the axioms of

T

, no rules have been given for the introduction of (multiple) quantifiers or conjunctive operators. Below such rules are given for the sake of completeness, but these will not be discussed. (These rules are not a part of the current axiomatic system, but we can consider adding these rules if we want to extend the current system.)

Inference Rule 5.

Conjunctive Operator Introduction:

{[I (α) \ α] Ψ (α)}_{I (α) \in \hat{X}} ⊢ ⋀_{α \in \hat{X}} Ψ (α)

(60)

where

Ψ (α)

is a formula of the type

t : t^{'} \mapsto t^{''}

that is open in

α

, and

{[I (α) \ α] Ψ (α)}_{I (α) \in \hat{X}}

is a possibly infinite collection of formulas, each of which is obtained by interpreting the variable

α

as a constant

I (α) \in \hat{X}

and replacing

α

in

Ψ (α)

everywhere by

I (α)

. □

Note that the collection of formulas

{[I (α) \ α] Ψ (α)}_{I (α) \in \hat{X}}

in Equation (60) is itself not a well-formed formula of the language

L_{T}

, but each of the formulas in the collection is.

Inference Rule 6.

Multiple Universal Introduction:

Φ (⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}})) ⊢ {(\forall f_{α^{+}})}_{α \in \hat{X}} [f_{α^{+}} \ {\hat{f}}_{α^{+}}] Φ

(61)

where

Φ (⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}}))

denotes a formula

Φ

with a subformula

⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}})

(implying that

Ψ

is an atomic formula of the type

t : t^{'} \mapsto t^{''}

), and where the variable

{\hat{f}}_{α^{+}}

ranges over an arbitrary family of ur-functions indexed in

\hat{X}

. □

Inference Rule 7.

Multiple Existential Introduction:

Φ (⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}})) ⊢ {(\exists f_{α^{+}})}_{α \in \hat{X}} [f_{α^{+}} \ {\hat{f}}_{α^{+}}] Φ

(62)

where

Φ (⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}}))

denotes a formula

Φ

with a subformula

⋀_{α \in \hat{X}} Ψ ({\hat{f}}_{α^{+}})

(implying that

Ψ

is an atomic formula of the type

t : t^{'} \mapsto t^{''}

), and where the variable

{\hat{f}}_{α^{+}}

ranges over a specific family of ur-functions indexed in

\hat{X}

. □

Remark 20.

Nonstandard Universal Quantification, i.e., the rule

Ψ (\hat{X}) ⊢ \forall X [X \ \hat{X}] Ψ

(63)

for a non-classical formula

Ψ

with an occurrence of an arbitrary constant

\hat{X}

, and Nonstandard Existential Quantification, i.e., the rule

Ψ (\hat{X}) ⊢ \exists X [X \ \hat{X}] Ψ

(64)

for a non-classical formula

Ψ

with an occurrence of a specific constant

\hat{X}

, are the same as in the standard case, but with the understanding that upon quantification a multiple quantifier

{(ʞ f_{α^{+}})}_{α \in \hat{X}}

in

Ψ

becomes a generalized multiple quantifier

{(ʞ f_{α^{+}})}_{α \in X}

in

[X \ \hat{X}] Ψ

, and a conjunctive operator

⋀_{α \in \hat{X}}

with constant range in

Ψ

becomes a conjunctive operator

⋀_{α \in X}

with variable range in

[X \ \hat{X}] Ψ

. □

3. Discussion

3.1. Main theorems

Theorem 1.

Graph Theorem: for any set X and any function

f_{X}

with any codomain Y, there is a set Z that is precisely the graph of the function

f_{X}

—that is, there is a set Z whose elements are precisely the two-tuples

〈 α, β 〉

made up of arguments and images of the function

f_{X}

. In a formula:

\forall X \forall f_{X} \forall Y (f_{X} : X ↠ Y \Rightarrow \exists Z \forall ζ (ζ \in Z \Leftrightarrow \exists α \exists β (ζ = 〈 α, β 〉 \land f_{X} : α \mapsto β)))

(65)

□

Proof.

Let

\hat{X}

be an arbitrary set, and let the function

{\hat{f}}_{\hat{X}}

be an arbitrary function on

\hat{X}

. On account of GEN-F (Axiom 11), for any

α \in \hat{X}

there is then precisely one

β

such that

{\hat{f}}_{\hat{X}} : α \mapsto β

. Using Definition 8, there exists then for each

α \in \hat{X}

a singleton

{〈 α, β 〉}^{+}

such that

{\hat{f}}_{\hat{X}} : α \mapsto β

. On account of the ur-function axiom (Axiom 18), there exists then also an ur-function

u_{α^{+}} : α^{+} ↠ {〈 α, β 〉}^{+}, u_{α^{+}} : α \mapsto 〈 α, β 〉

for each

α \in \hat{X}

. Thus, on account of SUM-F there is a sum function

{\hat{G}}_{\hat{X}}

with some codomain Z such that

{\hat{G}}_{\hat{X}}

maps every

α \in \hat{X}

precisely to the two-tuple

〈 α, β 〉

for which

{\hat{f}}_{\hat{X}} : α \mapsto β

. On account of GEN-F, the codomain Z of

{\hat{G}}_{\hat{X}}

exists, and on account of Axiom 14 it is unique: this codomain is precisely the graph of

{\hat{f}}_{\hat{X}}

. Since

\hat{X}

and

{\hat{f}}_{\hat{X}}

were arbitrary, the Graph Theorem follows from universal generalization. □

In the intended model of

T

, there is thus no risk involved in identifying a function f with the individual

{graph (f)}_{dom (f)}

, where

graph (f)

is the graph of f and

dom (f)

the domain of f, cf. Remark 18.

Theorem 2.

Main Theorem: for any nonempty set X, if there is a functional relation

Φ (α, β)

that relates every

α

in X to precisely one

β

, then there is a function

F_{X}

with some codomain Y that maps every

η \in X

to precisely that

ξ \in Y

for which

Φ (η, ξ)

. In a formula, using the iota-operator:

\forall X \neq \emptyset (\forall α \in X \exists! β Φ (α, β) \Rightarrow \exists F_{X} \exists Y (F_{X} : X ↠ Y \land \forall η \in X (F_{X} : η \mapsto ı ξ Φ (η, ξ))))

(66)

□

Proof.

Let

\hat{X}

be an arbitrary nonempty set. Suppose then, that for every

α \in \hat{X}

we have precisely one

β

such that

Φ (α, β)

. On account of the ur-function axiom (Axiom 18), for an arbitrary constant

\hat{α} \in \hat{X}

there exists then also an ur-function

{\hat{u}}_{{\hat{α}}^{+}}

for which

{\hat{u}}_{{\hat{α}}^{+}} : \hat{α} \mapsto ı β Φ (\hat{α}, β)

(67)

Let the variable

{\hat{f}}_{α^{+}}

range over these ur-functions. We then deduce from SUM-F by applying Nonstandard Universal Elimination and subsequently Multiple Universal Elimination that

\exists F_{\hat{X}} \exists Y (F_{\hat{X}} : \hat{X} ↠ Y \land ⋀_{α \in \hat{X}} F_{\hat{X}} : α \mapsto ı β ({\hat{f}}_{α^{+}} : α \mapsto β))

(68)

By subsequently applying Rule-C and Conjunctive Operator Elimination we then deduce the schema

{\hat{F}}_{\hat{X}} : \hat{α} \mapsto ı β Φ (\hat{α}, β)

(69)

for the sum function

{\hat{F}}_{\hat{X}}

. Generalizing this schema we obtain

\forall η \in \hat{X} ({\hat{F}}_{\hat{X}} : η \mapsto ı ξ Φ (η, ξ))

(70)

We thus obtain

\exists F_{\hat{X}} \exists Y (F_{\hat{X}} : \hat{X} ↠ Y \land \forall η \in \hat{X} (F_{\hat{X}} : η \mapsto ı ξ Φ (η, ξ)))

(71)

Since the functional relation was assumed, we get

\forall α \in \hat{X} \exists! β Φ (α, β) \Rightarrow Ψ

where

Ψ

is Formula (71). Since

\hat{X}

was an arbitrary nonempty set, we can quantify over nonempty sets. This gives precisely the requested Formula (66). □

Remark 21.

Theorem 2 is an infinite schema, with one formula for every functional relation

Φ

. The point is this: given a set X, on account of this theorem we can construct a function

f_{X}

by giving a function prescription—what we then actually do is define an ur-function for every

α \in X

; the function

f_{X}

then exists on account of SUM-F. Furthermore, by constructing the function we construct its graph, which exists on account of Theorem 1. Generally speaking, if we define an ur-function for each singleton

α^{+} \subset X

, then we do not yet have the graphs of these ur-functions in a set. However, in the present framework, the set of these graphs is guaranteed to exist. Ergo, giving a function prescription is constructing a set! □
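For a finite set, "giving a function prescription" in the sense of Theorem 2 can be sketched in Python as follows. This is an illustration under stated assumptions only: the helper names `iota` and `function_from_relation` are hypothetical, and the iota-operator is restricted to a finite pool of candidates, whereas in T it ranges over the whole universe.

```python
def iota(predicate, candidates):
    """Definite description: the unique candidate that satisfies `predicate`."""
    matches = [c for c in candidates if predicate(c)]
    if len(matches) != 1:
        raise ValueError("predicate is not satisfied by exactly one candidate")
    return matches[0]


def function_from_relation(X, phi, candidates):
    """Build a function on the nonempty finite set X from a functional relation phi.

    For every a in X an ur-function a -> iota b. phi(a, b) is defined first;
    the function is then the sum of these ur-functions. Returns the function
    (as a dict) and its codomain, taken to be the image set.
    """
    if not X:
        raise ValueError("X must be nonempty")
    ur_family = {
        a: {a: iota(lambda b, a=a: phi(a, b), candidates)} for a in X
    }
    F = {a: ur_family[a][a] for a in X}
    return F, frozenset(F.values())


X = frozenset({1, 2, 3})
phi = lambda a, b: b == a * a  # functional relation: b is the square of a
F, Y = function_from_relation(X, phi, range(20))
print(F, sorted(Y))
```

If `phi` fails to relate some element of `X` to exactly one candidate, `iota` raises an error, which corresponds to the antecedent of Formula (66) not being satisfied.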

3.2. Derivation of SEP and REP of ZF

We start by proving that the infinite axiom schema SEP of ZF is a theorem (schema) of our theory

T

:

Theorem 3.

Separation Axiom Scheme of ZF:

\forall X \exists Y \forall α (α \in Y \Leftrightarrow α \in X \land Φ (α))

Proof.

Let

\hat{X}

be an arbitrary set and let

Φ

be an arbitrary unary relation on

\hat{X}

. On account of the ur-function axiom (Axiom 18), for an arbitrary constant

\hat{α} \in \hat{X}

there exists then an ur-function

{\hat{u}}_{{\hat{α}}^{+}}

for which

\{\begin{matrix} {\hat{u}}_{{\hat{α}}^{+}} : \hat{α} \mapsto 1 & \text{if } Φ (\hat{α}) \\ {\hat{u}}_{{\hat{α}}^{+}} : \hat{α} \mapsto 0 & \text{if } \neg Φ (\hat{α}) \end{matrix}

(72)

Let the variable

{\hat{f}}_{η^{+}}

range over these ur-functions. On account of Theorem 2 we then get

\exists F_{\hat{X}} \forall η \in \hat{X} (F_{\hat{X}} : η \mapsto ı β ({\hat{f}}_{η^{+}} : η \mapsto β))

(73)

Let this sum function be designated by the constant

{\hat{F}}_{\hat{X}}

. On account of INV (Axiom 15), the inverse image set

{\hat{F}}_{\hat{X}}^{- 1} (1)

exists: we then have

\forall α (α \in {\hat{F}}_{\hat{X}}^{- 1} (1) \Leftrightarrow α \in \hat{X} \land Φ (α))

. Theorem 3 then obtains from here by existential generalization and universal generalization. □
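The route taken in this proof, a characteristic function followed by an inverse image, can be sketched for finite sets. The Python code below is an informal illustration only; the helper name `separate` and the modeling of sets as `frozenset` and of functions as dictionaries are assumptions.

```python
def separate(X, phi):
    """Return the subset of X whose elements satisfy the unary relation phi.

    Mirrors the proof: every a in X gets an ur-function mapping a to 1 if
    phi(a) holds and to 0 otherwise; the sum function F is the characteristic
    function, and the requested subset is the inverse image of 1 under F.
    """
    ur_family = {a: {a: 1 if phi(a) else 0} for a in X}
    F = {a: ur_family[a][a] for a in X}
    return frozenset(a for a in X if F[a] == 1)


X = frozenset(range(10))
evens = separate(X, lambda a: a % 2 == 0)
print(sorted(evens))
```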

Proceeding, we prove that the infinite axiom schema REP of ZF is a theorem (schema) of our theory

T

:

Theorem 4.

Replacement Axiom Scheme of ZF:

\forall X (\forall α \in X \exists! β Φ (α, β) \Rightarrow \exists Z \forall γ (γ \in Z \Leftrightarrow \exists ξ (ξ \in X \land Φ (ξ, γ))))

Proof.

Let

\hat{X}

be an arbitrary set and let there for every

α \in \hat{X}

be precisely one

β

such that

Φ (α, β)

. Then on account of Theorem 2, a sum function

{\hat{F}}_{\hat{X}}

exists for which

\forall α \in \hat{X} ({\hat{F}}_{\hat{X}} : α \mapsto ı β Φ (α, β))

(74)

On account of Axiom 14, the codomain of

{\hat{F}}_{\hat{X}}

is the image set; denoting this by

{\hat{F}}_{\hat{X}} [\hat{X}]

we then have

\forall γ (γ \in {\hat{F}}_{\hat{X}} [\hat{X}] \Leftrightarrow \exists ξ (ξ \in \hat{X} \land Φ (ξ, γ)))

(75)

Since the functional relation

Φ

on the arbitrary set

\hat{X}

was assumed, this is implied by

\forall α \in \hat{X} \exists! β Φ (α, β)

. We write out this implication: Theorem 4 then obtains by existential generalization and universal generalization. □

These two theorem schemas provide an argument for considering our theory

T

to be not weaker than ZF. However, we have strictly speaking not yet proven that every result in ZF about sets automatically translates to the present framework, in which not all things are sets.

Remark 22.

Should further research on

T

reveal unintended consequences that render it inconsistent or otherwise useless, there is still the possibility to remove SUM-F from

T

and add the above theorem schemas 3 and 4 as axioms to

T

. That still gives a theory—although a standard one with infinitely many axioms—that incorporates set theory and category theory into a single framework. □

3.3. Model theory

Definition 13.

A model

M

of the present theory

T

consists of the universe

| M |

of

M

, which is a concrete category made up of a nonempty collection of objects (sets) and a nonempty collection of arrows (functions on sets), and the language

L_{M}

of

M

, which is the language

L_{T}

of

T

extended with a constant for every object and for every arrow in

| M |

, such that the axioms of

T

are valid in

M

. □

In standard first-order logic it is well defined what it means that a formula is “valid” in a model M. This notion of validity translates to the framework of

T

for all standard formulas. However, it remains to be established what it means that SUM-F and non-classical consequences thereof are valid in a model

M

of

T

. Recall that symbols referring to individuals in

| M |

will be underlined to distinguish these individuals from constants of

T

.

Definition 14. (Validity of non-classical formulas):

(i): A sentence $\forall X \neq \emptyset Ψ$ with a non-classical subformula $Ψ$ , such as the sum function axiom, is valid in a model $M$ of $T$ if and only if for every assignment g that assigns an individual nonempty set $g (X) = \underset{̲}{X}$ in $| M |$ as a value to the variable X, $[\underset{̲}{X} \ X] Ψ$ is valid in $M$ ;
(ii): A sentence ${(\forall f_{α^{+}})}_{α \in \underset{̲}{X}} Φ$ with an occurrence of an individual nonempty set $\underset{̲}{X}$ of $| M |$ , such as an instance of SUM-F, is valid in a model $M$ of $T$ if and only if for every “team assignment” g that assigns an individual ur-function $g (f_{{\underset{̲}{α}}^{+}}) = {\underset{̲}{u}}_{{\underset{̲}{α}}^{+}}$ in $| M |$ as a value to each variable $f_{{\underset{̲}{α}}^{+}}$ semantically occurring in $Φ$ , the sentence $[{\underset{̲}{f}}_{α^{+}}^{g} \ f_{α^{+}}] Φ$ with the variable ${\underset{̲}{f}}_{α^{+}}^{g}$ ranging over the family of ur-functions ${({\underset{̲}{u}}_{α^{+}})}_{α \in \underset{̲}{X}}$ is valid in $M$ ;
(iii): A sentence $\exists t Υ$ with an occurrence of a simple variable t ranging over sets or over functions on a set $\underset{̲}{X}$ and with $Υ$ being a non-classical formula, such as the sentences that can be obtained by successively applying Nonstandard Universal Elimination and Multiple Universal Elimination to SUM-F, is valid in a model $M$ of $T$ if and only if for at least one assignment g that assigns an individual function $g (t) = {\underset{̲}{F}}_{\underset{̲}{X}}$ or an individual nonempty set $g (t) = \underset{̲}{Y}$ as value to the variable t, the sentence $[g (t) \ t] Υ$ is valid in $M$ ;
(iv): A sentence $⋀_{α \in \underset{̲}{X}} Ψ ({\underset{̲}{f}}_{α^{+}}, α)$ is valid in a model $M$ of $T$ if and only if for every assignment g that assigns an individual ur-function $g ({\underset{̲}{f}}_{α^{+}}) = {\underset{̲}{u}}_{{\underset{̲}{α}}^{+}}$ from the range ${({\underset{̲}{u}}_{α^{+}})}_{α \in \underset{̲}{X}}$ of the variable ${\underset{̲}{f}}_{α^{+}}$ and an individual $\underset{̲}{α}$ as values to the variables ${\underset{̲}{f}}_{α^{+}}$ and $α$ respectively, the sentence $[\underset{̲}{α} \ α] [{\underset{̲}{u}}_{{\underset{̲}{α}}^{+}} \ {\underset{̲}{f}}_{α^{+}}] Ψ$ is valid in $M$ .

This defines the validity of the non-classical formulas that can be deduced from SUM-F in terms of the well-established validity of standard first-order formulas. □

Proposition 1.

If

T

has a model

M

, then

M

is not countable.

Proof.

Suppose

T

has a model

M

, and

M

is countable. That means that there are only countably many subsets of

N = {0, 1, 2, \dots}

in

M

, and that the powerset

P (N)

in

M

contains those subsets: we thus assume that there are subsets of

N

that are “missing” in

M

. Let

\underset{̲}{A}

be any subset of

N

that is not in

M

, and let

\underset{̲}{h} \in \underset{̲}{A}

. All numbers

0, 1, 2, \dots

are in

M

(including

\underset{̲}{h}

), so for an arbitrary number

\underset{̲}{n} \in N

there is thus on account of the ur-function axiom (Axiom 18) an ur-function on

{\underset{̲}{n}}

that maps

\underset{̲}{n}

to

\underset{̲}{n}

and an ur-function on

{\underset{̲}{n}}

that maps

\underset{̲}{n}

to

\underset{̲}{h}

. Since

N

is in

M

, we get on account of SUM-F and Nonstandard Universal Elimination that

⊧_{M} {(\forall f_{p^{+}})}_{p \in N} \exists F_{N} \exists Y (F_{N} : N ↠ Y \land ⋀_{p \in N} F_{N} : p \mapsto ı q (f_{p^{+}} : p \mapsto q))

(76)

Equation (76) being valid in

M

means thus that for any team assignment g, there is a sum function

{\underset{̲}{F}}_{N}^{g}

in

| M |

for which

\forall p \in N ({\underset{̲}{F}}_{N}^{g} : p \mapsto ı q ({\underset{̲}{f}}_{p^{+}}^{g} : p \mapsto q))

(77)

where the variable

{\underset{̲}{f}}_{p^{+}}^{g}

ranges over ur-functions

g (f_{0^{+}}), g (f_{1^{+}}), g (f_{2^{+}}), \dots

That said, the crux is that there is a team assignment

g^{*}

which assigns to the variables

f_{0^{+}}, f_{1^{+}}, f_{2^{+}}, \dots

, the constants

g^{*} (f_{0^{+}}), g^{*} (f_{1^{+}}), g^{*} (f_{2^{+}}), \dots

such that for all

n \in ω

we have

\{\begin{matrix} g^{*} (f_{n^{+}}) : n \mapsto n & \text{if } n \in \underset{̲}{A} \\ g^{*} (f_{n^{+}}) : n \mapsto \underset{̲}{h} & \text{if } n \notin \underset{̲}{A} \end{matrix}

(78)

To see that, note that it is a certainty that there is at least one team assignment

g^{0}

such that Equation (78) is satisfied for

n = 0

. Namely, there is (i) at least one team assignment g such that

g (f_{0^{+}}) = {〈 0, 0 〉}_{{0}}

so that

g (f_{0^{+}}) : 0 \mapsto 0

, and (ii) at least one team assignment

g^{'}

such that

g^{'} (f_{0^{+}}) = {〈 0, \underset{̲}{h} 〉}_{{0}}

so that

g^{'} (f_{0^{+}}) : 0 \mapsto \underset{̲}{h}

. Now assume that there is a team assignment

g^{k}

such that Equation (78) is satisfied for

n \in {0, 1, \dots, k}

: it is then a certainty that there is at least one team assignment

g^{k + 1}

such that Equation (78) is satisfied for

n \in {0, 1, \dots, k + 1}

. Namely, there is (i) at least one team assignment

g_{k + 1}^{k}

such that for

n \in {0, 1, \dots, k}

we have

g_{k + 1}^{k} (f_{n^{+}}) = g^{k} (f_{n^{+}})

and such that for

n = k + 1

we have

g_{k + 1}^{k} (f_{n^{+}}) = {〈 n, n 〉}_{{n}}

so that

g_{k + 1}^{k} (f_{{(k + 1)}^{+}}) : k + 1 \mapsto k + 1

, and (ii) at least one team assignment

g_{\underset{̲}{h}}^{k}

such that for

n \in {0, 1, \dots, k}

we have

g_{\underset{̲}{h}}^{k} (f_{n^{+}}) = g^{k} (f_{n^{+}})

but such that for

n = k + 1

we have

g_{\underset{̲}{h}}^{k} (f_{n^{+}}) = {〈 n, \underset{̲}{h} 〉}_{{n}}

so that

g_{\underset{̲}{h}}^{k} (f_{{(k + 1)}^{+}}) : k + 1 \mapsto \underset{̲}{h}

. By induction, there is thus a team assignment

g^{*}

such that Equation (78) is satisfied for all

n \in ω

. Ergo, there is a sum function

{\underset{̲}{F}}_{ω}^{*}

for which

\forall p \in ω ({\underset{̲}{F}}_{ω}^{*} : p \mapsto g^{*} (f_{p^{+}}) (p))

(79)

However, then we get

{\underset{̲}{F}}_{ω}^{*} [ω] = \underset{̲}{A}

, so

\underset{̲}{A}

is in

M

, contrary to what was assumed. Ergo, if

T

has a model, it is not countable. □
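The mechanism used in this proof, recovering a subset as the image set of a sum function built from a suitable team assignment, can be checked on a finite stand-in for the natural numbers. The Python sketch below is only an illustration under assumptions: the names `team_assignment` and `image_of_sum_function` are hypothetical, the domain has four elements, and the distinguished element h is chosen as the minimum of A. It says nothing about countability, which only arises in the infinite case.

```python
from itertools import combinations


def team_assignment(domain, A, h):
    """Finite analogue of g*: the ur-function on {n} maps n to n if n is in A,
    and to h otherwise (cf. Equation (78)). Requires h to be an element of A."""
    if h not in A:
        raise ValueError("h must be an element of A")
    return {n: {n: (n if n in A else h)} for n in domain}


def image_of_sum_function(domain, g):
    """Image set of the sum function obtained from the team assignment g."""
    return frozenset(g[p][p] for p in domain)


domain = range(4)  # finite stand-in for N; the size 4 is an arbitrary assumption
nonempty_subsets = [
    frozenset(c) for r in range(1, 5) for c in combinations(domain, r)
]

# For every nonempty subset A, build the team assignment and take the image set.
recovered = {
    image_of_sum_function(domain, team_assignment(domain, A, min(A)))
    for A in nonempty_subsets
}
print(len(recovered), "nonempty subsets recovered")
```

In this finite setting every nonempty subset of the domain is recovered as an image set, which is the finite counterpart of the step from Equation (79) to the conclusion that the subset is in the model.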

Remark 23.

Proposition 1 is a significant result that does not hold in ZF: given Theorems 3 and 4, this provides an argument for considering the present theory

T

to be stronger than ZF. The crux here is that the non-classical sentence (76) has to be valid in

M

: the notion of validity of Definition 14 entails that there are uncountably many variables

{\underset{̲}{f}}_{p^{+}}^{g}

, ranging over a family of individual ur-functions indexed in

N

, in the language of the model. As a result, the subsets of

N

that can be constructed within the model are non-denumerable—a model of

T

in which Multiple Universal Elimination, inference rule 2, applies for at most countably many variables

{\underset{̲}{f}}_{p^{+}}^{g}

is thus nonexistent. In other words, the Löwenheim–Skolem theorem does not apply because

T

is a non-classical first-order theory, meaning that

T

cannot be reformulated as a standard first-order theory. □

Proposition 2.

T

is not a second-order theory.

Proof.

Let us assume that

T

is a second-order theory. That is, let us assume that the use of a multiple quantifier

{(\forall f_{{α}})}_{α \in \hat{X}}

amounts to second-order quantification. With such a multiple quantifier, we de facto quantify over all functional relations on the set

\hat{X}

—note, however, that we do not quantify over all functional relations on the universe of sets! However, this has an equivalent in ZF: if

\hat{X}

is a constant (a set), and

{\hat{Y}}^{\hat{X}}

is the set of all functions from

\hat{X}

to a set

\hat{Y}

, then with the quantifier

\forall B \forall f \in B^{\hat{X}}

in a sentence

\forall B \forall f \in B^{\hat{X}} Ψ

(80)

we de facto quantify over all functional relations on the set

\hat{X}

too. Ergo, if

T

is a second-order theory, then ZF is a second-order theory too. However, ZF is a first-order theory and not a second-order theory. So by modus tollens,

T

is not a second-order theory. □

As an additional heuristic argument, we can also directly compare second-order quantification and the present non-classical first-order quantification with a multiple quantifier

{(\forall f_{{α}})}_{α \in \hat{X}}

. Let us first look at second-order quantification with a quantifier

\forall Φ

where the variable

Φ

ranges over functional relations on the class of sets. An arbitrary individual functional relation

\hat{Φ}

has the entire proper class of things as its “domain”, so

\hat{Φ}

corresponds to a proper class of ur-functions: for an arbitrary thing

\hat{α}

there is a

\hat{Φ}

-related ur-function

{\hat{u}}_{{\hat{α}}^{+}}

for which

{\hat{u}}_{{\hat{α}}^{+}} : \hat{α} \mapsto ı β \hat{Φ} (\hat{α}, β)

. A quantifier

\forall Φ

is thus equivalent to a proper class of simple quantifiers

\forall f_{{\hat{α}}^{+}}

ranging over ur-functions on the singleton of a thing

\hat{α}

. The universe of

T

, however, does not contain a set

\hat{U}

of all things, so there is no multiple quantifier

{(\forall f_{α^{+}})}_{α \in \hat{U}}

which would be equivalent to quantifier

\forall Φ

: a multiple quantifier

{(\forall f_{α^{+}})}_{α \in \hat{X}}

is at most equivalent to an infinite set of simple quantifiers

\forall f_{{\hat{α}}^{+}}

and the degree of infinity is then bounded by the notion of a set. Thus, since a set does not amount to a proper class, a multiple quantifier

{(\forall f_{α^{+}})}_{α \in \hat{X}}

does not amount to second-order quantification. See Figure 2 for an illustration.

This heuristic argument can also be formulated in another way. Let

\hat{U}

and

\hat{X}

be two nonempty disjoint sets, and let us then compare a truly second-order quantification

\forall Φ

over all functional relations on the class of sets with our non-classical first-order quantification with a multiple quantifier

{(\forall f_{{α}})}_{α \in \hat{X}}

. In the first case, given any individual functional relation

\hat{Φ}

we can construct image sets of both

\hat{U}

and

\hat{X}

that contain precisely the images of the elements of

\hat{U}

or, respectively, of

\hat{X}

under the functional relation

\hat{Φ}

. In the second case, however, given any family of ur-functions

{({\hat{f}}_{{α}})}_{α \in \hat{X}}

we can construct the image set of

\hat{X}

that for each element

α

of

\hat{X}

contains the image of

α

under the ur-function

{\hat{f}}_{α^{+}}

, but we cannot construct any image set of

\hat{U}

. That shows that our non-classical first-order quantification is not the same as second-order quantification.

3.4. The Axioms of Category Theory

In Definition 1 it has been assumed that the universe of sets and functions is a category. In this section we prove that the axioms of category theory for the arrows indeed hold for the functions (which are the arrows of the present category). That means that we must prove the following:

(i): That domain and codomain of any function on any set are unique;
(ii): That, given sets X and Y and functions $f_{X}$ and $g_{Y}$ with $Y = f_{X} [X]$ , there is a function $h_{X} = g_{Y} \circ f_{X}$ such that $h_{X}$ maps every $α \in X$ to the image under $g_{Y}$ of its image under $f_{X}$ ;
(iii): That for any set X there is a function $1_{X}$ such that $f_{X} \circ 1_{X} = f_{X}$ and $1_{f_{X} [X]} \circ f_{X} = f_{X}$ for any function $f_{X}$ on X.

Ad(i): domain and codomain of a function

f_{X}

are unique

This has already been proven in Section 2.3. GEN-F (Axiom 11) guarantees that for any set X, any function

f_{X}

has at least one domain and at least one codomain. Axiom 12 guarantees that no other thing (set or function) than X is a domain of

f_{X}

. Furthermore, Axiom 14 guarantees that no other thing (set or function) than the image set

f_{X} [X]

is a codomain of

f_{X}

. That proves uniqueness of domain and codomain.

Ad(ii): existence of the composite of two functions

Given a set X, a function

f_{X}

and a function

g_{Y}

with

Y = f_{X} [X]

, there is for every

α \in X

precisely one ur-function

h_{α^{+}}

such that

h_{α^{+}} : α \mapsto ı β (g_{Y} : f_{X} (α) \mapsto β)

(81)

So, there is a sum function

H_{X}

that maps each

α \in X

precisely to its image under the ur-function

h_{α^{+}}

for which

ı ξ (h_{α^{+}} : α \mapsto ξ) = ı β (g_{Y} : f_{X} (α) \mapsto β)

. This sum function is precisely the composite

H_{X} = g_{Y} \circ f_{X}

. The proof that function composition is associative is omitted.

Ad(iii): existence of an identity function on any set X

Given a set X, there is for every

α \in X

precisely one ur-function

1_{α^{+}}

for which

1_{α^{+}} : α \mapsto α

(82)

Therefore, there is a sum function

F_{X}

that maps every

α \in X

to

ı ξ (1_{α^{+}} : α \mapsto ξ) = α

. This sum function is the requested function

1_{X}

. The proof that this sum function

1_{X}

satisfies the properties that

g_{X} \circ 1_{X} = g_{X}

and

1_{g_{X} [X]} \circ g_{X} = g_{X}

for any function

g_{X}

on X, is omitted.

This shows that the axioms for a category hold for the proper class of functions.
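Since the proofs of associativity and of the identity laws are omitted, it may be helpful to check these properties on a small example. The following Python sketch does so for finite functions modeled as dictionaries; it is a sanity check under assumptions, not a proof, and the helper names `identity` and `compose` as well as the sample functions `f`, `g`, and `k` are hypothetical.

```python
def identity(X):
    """Identity function on the finite set X, as a dict."""
    return {a: a for a in X}


def compose(g, f):
    """Composite of g after f, defined only when the domain of g equals the
    image set of f (functions here are surjective onto their codomain)."""
    if set(g) != set(f.values()):
        raise ValueError("domain of g must equal the image set of f")
    # Each a is sent to the image under g of its image under f.
    return {a: g[f[a]] for a in f}


f = {1: "x", 2: "y", 3: "x"}
g = {"x": 10, "y": 20}
k = {10: True, 20: False}

h = compose(g, f)
assoc_ok = compose(k, compose(g, f)) == compose(compose(k, g), f)
right_unit_ok = compose(f, identity(f.keys())) == f
left_unit_ok = compose(identity(set(f.values())), f) == f
print(h, assoc_ok, right_unit_ok, left_unit_ok)
```

The check in `compose` reflects the condition that the domain of the second function is the image set of the first, as required in item (ii).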

3.5. Concerns Regarding Inconsistency

We address the main concerns regarding inconsistency, namely that the existence of a set of all sets or of a set of all functions might be derivable from

T

.

Conjecture 1.

The category of sets and functions does not contain a set of all sets.

Heuristic argument:

A set

\hat{U}

of all sets does not exist (i) because REG (Axiom 9) excludes that

\hat{U}

exists a priori, and (ii) because SUM-F, the only constructive axiom of

T

that is not a theorem of ZF, excludes that

\hat{U}

exists by construction. The crux is that one must first have constructed the set X before one can construct a sum function

F_{X}

: the set X is thus a regular set, and by applying SUM-F one cannot create a new set with a higher cardinality than the set X because the graph of

F_{X}

contains precisely one element for each element of X. It is the same for the image set

F_{X} [X]

: it cannot have a higher cardinality than X. Therefore, SUM-F does not allow the construction of a set

\hat{U}

of all sets. □

Conjecture 2.

The category of sets and functions on sets does not contain a set

\hat{Ω}

of all functions.

Heuristic argument:

Suppose that we have a set

\hat{Ω}

such that

\forall X \forall f_{X} (f_{X} \in \hat{Ω})

(83)

Then on account of Theorem 3 we can single out the subset

{\hat{Ω}}_{1}

of all identity functions:

\forall α (α \in {\hat{Ω}}_{1} \Leftrightarrow α \in \hat{Ω} \land \exists X (α = 1_{X}))

(84)

However, then there also exists the identity function

1_{{\hat{Ω}}_{1}}

, for which

1_{{\hat{Ω}}_{1}} : {\hat{Ω}}_{1} ↠ {\hat{Ω}}_{1}, 1_{{\hat{Ω}}_{1}} : 1_{{\hat{Ω}}_{1}} \mapsto 1_{{\hat{Ω}}_{1}}

(85)

This latter feature that

1_{{\hat{Ω}}_{1}}

maps itself to itself contradicts the axiom of regularity for functions, REG-F (Axiom 19). Ergo, there is no set

\hat{Ω}

of all functions. □

4. Concluding Remarks

4.1. Limitations of the Present Study

First of all, the syntax of the formal language for the theory

T

has been defined in such a way that the axioms of

T

are well-formed formulas. While this definition of the syntax has been checked for obvious mistakes, a limitation of the present study is that it has not been checked exhaustively for unintended consequences. That is, it may turn out that details of the present definition of the syntax require revision to avoid the situation that this or that “weird” formula becomes a well-formed formula.

As to the axioms of

T

, the axiom of regularity for functions, Axiom 19, has been formulated to rule out the existence of the pathological functions mentioned in its formulation. While the existence of certain other pathological objects, such as a set X and a function

f_{X}

for which

X = {f_{X}}

, is also ruled out as a corollary of this axiom, it cannot be excluded that a creative mind can come up with even more pathological objects that are not ruled out by the two axioms of regularity of

T

. That is, it may turn out that the axioms of regularity require revision to avoid the situation that

T

has a model in which certain pathological objects exist a priori.

Furthermore, since

T

is a non-classical theory there is the obvious risk that

T

is not (relatively) consistent. While the theory

T

has been checked for the most obvious concerns regarding inconsistency—we have argued that the category of sets and functions cannot contain a set of all sets nor a set of all functions—it has not been checked exhaustively for inconsistency. That is, further research may reveal that

T

has unintended consequences which render it inconsistent. In such a case the approach would be to resolve the inconsistency by a revision of the axioms of

T

. However, if that fails, we still have the prospect of a fallback position: as outlined in Remark 22, we can remove the non-classical part of

T

and add the infinite theorem schemas for SEP and REP as axioms; the standard first-order theory thus obtained then no longer solves the problem identified in the introduction, but it still incorporates category theory and set theory in a single framework.

A further limitation of this study is that the axiom of choice has been left out. We can easily express AC in the language

L_{T}

as

\forall X \neq \emptyset (Θ (X) \Rightarrow \exists f_{X} \forall Z \in X \forall γ (f_{X} : Z \mapsto γ \Rightarrow γ \in Z))

(86)

where

Θ (X)

stands for

\forall α \in X \exists Y (α = Y \land \exists η (η \in Y)) \land \forall U \in X \forall V \in X (U \neq V \Rightarrow \neg \exists β (β \in U \land β \in V))

(87)

However, the question is whether this has to be added as an axiom or whether it can be derived as a theorem of

T

—we certainly have for any

Z \in X

that there is an ur-function

f_{Z^{+}}

such that

ı ξ (f_{Z^{+}} : Z \mapsto ξ) \in Z

. We leave this as a topic for further research.
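To make the statement concrete, the following Python sketch builds a choice function for a finite family, reading Θ(X) as: every element of X is a nonempty set, and distinct elements of X are disjoint. The names `theta` and `choice_function` are hypothetical, and picking the minimum of each member is an arbitrary assumption. For a finite family no choice principle is needed, so the sketch does not bear on the open question of whether AC is derivable in T.

```python
def theta(X):
    """Check that all members of X are nonempty and pairwise disjoint."""
    members = list(X)
    all_nonempty = all(len(Z) > 0 for Z in members)
    pairwise_disjoint = all(
        members[i].isdisjoint(members[j])
        for i in range(len(members))
        for j in range(i + 1, len(members))
    )
    return all_nonempty and pairwise_disjoint


def choice_function(X):
    """Return a function on X that maps each member Z to an element of Z."""
    if not X or not theta(X):
        raise ValueError("X must be a nonempty family satisfying theta")
    # One ur-function per member Z, mapping Z to one of its elements.
    ur_family = {Z: {Z: min(Z)} for Z in X}
    # The choice function is the sum of these ur-functions.
    return {Z: ur_family[Z][Z] for Z in X}


X = frozenset({frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})})
f_X = choice_function(X)
for Z in X:
    print(sorted(Z), "->", f_X[Z])
```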

Last but not least, another limitation of the present study is that the metamathematics of

T

have not been studied. That is, it has not been investigated whether the calculus has any of the various soundness and completeness properties. This is left as a topic for further research—(dis-)proving that these properties hold is a sizeable research project in itself.

4.2. Aesthetic Counterarguments

The present theory

T

immediately gives rise to at least three purely aesthetic arguments for rejection.

First of all, the universe of

T

contains sets and functions, the latter being objects sui generis: this entails a departure from the adage “everything is a set” that holds in the framework of ZF(C) and that will be enough to evoke feelings of dislike among mathematical monists who hold the position that set theory, in particular ZF or ZFC, has to be the foundation for mathematics.

Secondly, although the universe of

T

is a category, the formal language of

T

contains ∈-relations

t_{1} \in t_{2}

as atomic formulas: an ∈-relation is, thus, not reduced to a mapping in the language of category theory and that fact alone will be enough to evoke feelings of dislike among mathematical monists who hold the position that category theory has to be the foundation for mathematics.

Thirdly, the language of

T

entails a rather drastic departure from standard first-order language: that will be enough to evoke feelings of dislike among those who attach a notion of beauty to the standard first-order language of ZF(C) or who consider that the language of category theory is all of the language of mathematics.

These arguments can be dismissed as nonmathematical, but those who experience these feelings of dislike will nevertheless reject the theory

T

straightaway as not suitable as a foundational theory for mathematics.

4.3. Main Conclusions

The main conclusion is that the aim stated in the introduction has been achieved: a theory

T

, with a vocabulary containing countably many constants, has been introduced which lacks the two unwanted, or pathological, features of ZF. Each axiom of

T

is a typographically finite sentence, so, contrary to what is the case with infinitary logics, each axiom can be written down explicitly. More than that:

T

is finitely axiomatized, so contrary to what is the case with ZF, the entire theory

T

can be written down explicitly on a piece of paper. In addition, it has been shown that

T

, contrary to ZF, does not have a countable model—if it has a model at all, that is. This failure of the downward Löwenheim–Skolem theorem for

T

is due to the non-classical nature of

T

.

Furthermore, three reasons can be given as to why

T

might be applicable as a foundational theory for mathematics. First of all, it has been proven that the axioms of ZF, translated into the language

L_{T}

of

T

, can be derived from

T

. While we acknowledge that this result does not automatically imply that all theorems about sets derived within the framework of ZF are necessarily also true in the framework of

T

because in the latter framework sets exist whose elements are not sets, it nevertheless shows that the tools available in the framework of ZF for constructing sets are also available in the framework of

T

. Secondly, it has been proved that the axioms of a category hold for the universe of discourse that is associated to

T

, which is a category of sets and functions: this universe might then serve as the ontological basis for the various (large) categories studied in category theory. Thirdly,

T

is easy to use in everyday mathematical practice because for any set X we can construct a function

f_{X}

by giving a defining function prescription

f_{X} : X ↠ Y, f_{X} : α \overset{def}{⟼} ι β Φ (α, β)

where

Φ

is some functional relation:

T

then guarantees that

f_{X}

exists, as well as its graph, its image set, and the inverse image sets for every element of its codomain—ergo, giving a defining function prescription is a tool for constructing sets.
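
The idea of a defining function prescription can be mimicked in a small finite setting. The following is a minimal sketch under stated assumptions: the names `define_function`, `image_set`, and `inverse_image` are hypothetical, the domain and the pool of candidate values are finite, and the relation Φ is given as a Python predicate. It illustrates only the bookkeeping (graph, image set, inverse images), not the axioms of T.

```python
def define_function(domain, phi, candidates):
    """Build the graph of f_X from a relation phi(alpha, beta).
    For each alpha in the domain, the unique beta among the candidates
    with phi(alpha, beta) is selected (the role of the iota-operator)."""
    graph = {}
    for alpha in domain:
        witnesses = [beta for beta in candidates if phi(alpha, beta)]
        if len(witnesses) != 1:
            raise ValueError(f"phi is not functional at {alpha!r}")
        graph[alpha] = witnesses[0]
    return graph


def image_set(graph):
    """The set of all values taken by the function."""
    return frozenset(graph.values())


def inverse_image(graph, beta):
    """The set of all arguments that are mapped to beta."""
    return frozenset(a for a, b in graph.items() if b == beta)


# Example: X = {0, 1, 2, 3} and phi(alpha, beta) iff beta is alpha mod 2.
X = {0, 1, 2, 3}
f_X = define_function(X, lambda a, b: b == a % 2, range(4))
Y = image_set(f_X)
```

Taking Y to be the image set makes the resulting function a surjection onto Y, matching the two-headed arrow in the prescription; a relation that fails to be functional on X is rejected rather than silently repaired.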

That being said, there are also arguments why

T

cannot be generally accepted as a new foundational theory for mathematics right now by a collective act of instant rationality. The strongest of these may be that the metamathematics have not been studied: contrary to the purely aesthetic arguments against

T

, negative results in that direction may yield a decisive argument to reject

T

as not suitable as a foundational theory for mathematics. Therefore, this topic should be given highest priority in further research; the prospect is that it should become clear within a few years whether or not the various soundness and completeness properties apply.

The bottom line is that the present results are rather avant-garde and that further research is necessary to establish whether the non-classical theory

T

introduced in this paper constitutes an advancement in the foundations of mathematics. The proven fact that

T

lacks the pathological features of ZF may provide a reason for such further research, but it is emphasized that it may turn out to be a dead end. That is to say: the present marriage of set theory and category theory—as we called it in the introduction—may look promising from a certain perspective, but it still may end in divorce.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The author wants to thank Harrie de Swart (Tilburg University), Jean Paul van Bendegem, and Colin Rittberg (both Free University of Brussels) for their useful comments. This research has been facilitated by the Foundation Liberalitas (The Netherlands).

Conflicts of Interest

The author declares no conflict of interest.

References

Zermelo, E. Untersuchungen über die Grundlagen der Mengenlehre. I. Math. Ann. 1908, 65, 261–281.
Löwenheim, L. Über die Auflösung von Gleichungen im logischen Gebietekalkul. Math. Ann. 1910, 68, 169–207.
Skolem, T. Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre. In Proceedings of the 5th Scandinavian Congress of Mathematicians, Helsinki, Finland, 4–7 July 1922; pp. 217–232.
Ebbinghaus, H.D. Ernst Zermelo: An Approach to His Life and Work; Springer: Berlin, Germany, 2007.
Mendelson, E. An Introduction to Mathematical Logic, 4th ed.; Chapman & Hall: London, UK, 1997.
Van Dalen, D.; Ebbinghaus, H.D. Zermelo and the Skolem Paradox. Bull. Symb. Log. 2000, 6, 145–161.
Montague, R. Contributions to the Axiomatic Foundations of Set Theory. Ph.D. Thesis, University of California, Berkeley, CA, USA, 1957.
Bagaria, J.; Koellner, P.; Woodin, W.H. Large cardinals beyond choice. Bull. Symb. Log. 2019, 25, 283–318.
Gitman, V.; Hamkins, J.D.; Holy, P.; Schlicht, P.; Williams, K.J. The exact strength of the class forcing theorem. J. Symb. Log. 2020, 85, 869–905.
Barton, N.; Caicedo, A.E.; Fuchs, G.; Hamkins, J.D.; Reitz, J.; Schindler, R. Inner-Model Reflection Principles. Stud. Log. 2020, 108, 573–595.
Lawvere, F.W. An Elementary Theory of the Category of Sets. Proc. Natl. Acad. Sci. USA 1964, 52, 1506–1511.
Lawvere, F.W. The Category of Categories as a Foundation of Mathematics. In Proceedings of the Conference on Categorical Algebra, La Jolla, CA, USA, 7–12 June 1965; Eilenberg, S., Harrison, D.K., Röhrl, H., MacLane, S., Eds.; Springer: Berlin, Germany, 1966; pp. 1–20.
Mayberry, J. What is required of a foundation for mathematics? Philos. Math. 1994, 2, 16–35.
Landry, E. Category Theory: The Language of Mathematics. Philos. Sci. 1999, 66, S14–S27.
Gamut, L.T.F. Logic, Language and Meaning; Chicago University Press: Chicago, IL, USA, 1991; Volume 1, p. 71.
Cabbolet, M.J.T.F. Finitely Axiomatized Set Theory: A nonstandard first-order theory implying ZF. arXiv 2014, arXiv:1402.1017.
Bernays, P. Axiomatic Set Theory; Dover Publications Inc.: Mineola, NY, USA, 1968.
Bell, J.L. Infinitary Logic. In The Stanford Encyclopedia of Philosophy, Winter 2016 ed.; Zalta, E.N., Ed.; 2016; Available online: https://plato.stanford.edu/archives/win2016/entries/logic-infinitary/ (accessed on 12 June 2021).
Stronkowski, M.M. Axiomatizations of universal classes through infinitary logic. Algebra Univers. 2018, 79, 26.
Džamonja, M. Chain Logic and Shelah’s Infinitary Logic. arXiv 2020, arXiv:1908.01177v3.
Cabbolet, M.J.T.F. The Importance of Developing a Foundation for Naive Category Theory. Thought 2015, 4, 237–242.
Blizard, W.D. Multiset theory. Notre Dame J. Form. Log. 1989, 30, 36–66.
Levy, A. Basic Set Theory; Springer: Berlin, Germany, 1979; p. 52.
Wittgenstein, L. Tractatus Logico-Philosophicus; Kegan Paul, Trench, Trubner & Co.: London, UK, 1922; p. 3.333.
Muller, F.A. Sets, Classes, and Categories. Br. J. Philos. Sci. 2001, 52, 539–573.

Figure 1. Venn diagram of the Siamese twin functions

f_{X}

and

h_{Y}

. The left oval together with the black point inside it is a Venn diagram representing the singleton

\{f_{X}\}

; the right oval together with the black point inside it is a Venn diagram representing the singleton

\{h_{Y}\}

. The upper arrow represents the mapping of

f_{X}

to

h_{Y}

by

h_{Y}

, the lower arrow the mapping of

h_{Y}

to

f_{X}

by

f_{X}

.

Figure 2. Illustration of the heuristic argument. In both diagrams (a,b), all things in the universe of T are for illustrative purposes represented on the horizontal and vertical axes. In diagram (a), the dotted black line represents an arbitrary functional relation

\hat{Φ}

: each dot corresponds to a constant ur-function as indicated, so the dotted line is equivalent to a proper class of ur-functions. In diagram (b) it is indicated of which things on the horizontal axis the set

\hat{X}

is made up, and each of the black dots within the red oval corresponds to a constant ur-function: the dotted line segment is thus equivalent to a set of ur-functions. So, a multiple quantifier

{(\forall f_{α^{+}})}_{α \in \hat{X}}

cannot be equivalent to a quantifier

\forall Φ

.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Cabbolet, M.J.T.F. A Finitely Axiomatized Non-Classical First-Order Theory Incorporating Category Theory and Axiomatic Set Theory. Axioms 2021, 10, 119. https://doi.org/10.3390/axioms10020119
