1. Introduction
The term malware is used to characterize any type of malign or malicious software, regardless of how it works, its intent, or how it is distributed. There are several types of malware, such as viruses, worms, ransomware, scareware, adware, spyware, and fileless malware, each with its own characteristics, operation, and spread strategy.
In this paper, we address the problem of algorithmically categorizing programs as either malware or non-malware using deductive procedures and formal proof systems. Our research focuses on examining the deductive capabilities of formal systems in the context of generating theorems capable of classifying all computer programs as either malware or non-malware. It is based on the fundamental principles of computability and recursive function theory, which examine the theoretical solvability of problems using the universal Turing Machine as a model of effective computation. The foundations of these theories were laid down in the seminal paper authored by Alan Turing (see [
1]).
Formal proofs regarding the inherent difficulty of algorithmically detecting malicious software have been proposed in the past. These proofs specifically pertain to a critical category of such entities, namely computer viruses. A virus is a malware program, modeled as a Turing Machine, whose purpose is to replicate itself into other programs and, thus, propagate the infection. The formal investigation of the properties of computer programs that operate as viruses, as well as of the computational difficulty of detecting them, was introduced in the mid-80s with the influential work of Fred Cohen (see [
2,
3]). Cohen first presented a precise, formal definition of the distinctive behaviours exhibited by a virus computer program. He then proceeded to demonstrate that it is computationally impossible to construct a program (i.e., a Turing Machine) that can identify viruses (or, more precisely, Turing Machines that operate as viruses).
In particular, Cohen defined a virus as a Turing Machine that creates a copy of itself inside other Turing Machines or, in Turing Machine terminology, embeds its transition function into the transition function of its target. In this way, it can replicate itself indefinitely. Cohen then proved that the fundamental undecidable problem of deciding whether a given Turing Machine halts on a given input, known as the Halting Problem (that is, deciding the language corresponding to the Halting Problem), is reducible to the problem of deciding whether a given Turing Machine operates as a virus. In other words, the Halting Problem language is reducible to the “virus” language. As the former language (problem) is undecidable, the latter is also undecidable. Thus, Cohen proved that, in the context of his definition of a virus, it is impossible to construct a Turing Machine (program) that can detect every virus, since if such a Turing Machine existed, it would render decidable a language which is, provably, undecidable.
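Cohen's reduction can be sketched informally in modern code. The sketch below is purely illustrative: `is_virus` stands for a hypothetical total virus detector, and `run`/`replicate` are placeholder operations, not real APIs.

```python
# Illustrative sketch of Cohen's reduction from the Halting Problem to
# virus detection. `is_virus`, `run`, and `replicate` are hypothetical
# placeholders, not real functions.

def build_candidate(program_source: str, program_input: str) -> str:
    """Return the source of a program Q that first simulates the given
    program on the given input and only afterwards self-replicates.
    Q exhibits virus behaviour iff the simulated program halts."""
    return (
        f"run({program_source!r}, {program_input!r})  # may loop forever\n"
        f"replicate()  # reached only if the simulation halts\n"
    )

def decide_halting(program_source: str, program_input: str) -> bool:
    # If a total virus detector `is_virus` existed, this function would
    # decide the Halting Problem -- contradicting its undecidability.
    return is_virus(build_candidate(program_source, program_input))
```

Since `build_candidate` is computable, the existence of `is_virus` would make `decide_halting` a decider for the Halting Problem, which is the contradiction Cohen exploits.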
In this paper, we propose a definition of a generic malware entity which is intuitive, precise, and reasonably restricted (so as to facilitate a theoretical analysis), in accordance with Cohen’s virus paradigm. We also deploy the Turing Machine as the computational model of malware programs and, thus, define malware as a program, or Turing Machine, that performs at least one action from a predetermined set of actions that define malware behaviour. These actions are referred to as “states” in the Turing Machine definition.
It is worth mentioning here that, in the proposed model, malware behaviour is not manifested by merely locating the actions in the code of a program through, e.g., syntactic analysis or pattern matching. Rather, in our approach, only the actual execution of such actions is considered to manifest malware behaviour, as the threatening character of a malware’s actions becomes apparent only during or after their execution. Admittedly, malware behaviour can be complex and intricate, as in DoS (Denial of Service) attacks, or may even comprise multiple combined steps (e.g., ransomware, which encrypts files on the victim’s computer and asks for a ransom in order to release the decryption key). Thus, the proposed model is rather limited in capturing all potential types of malware behaviour. Nevertheless, we opted for a simple malware model so that we could use deep and well-established results about undecidability provided by the Theory of Computation and, as a result, present preliminary theoretical results based on a widely accepted computational model and its capabilities. In essence, one can use our results to conclude that, since detecting even the proposed simple malware type is undecidable, detecting more complex malware behaviour is also expected to be undecidable.
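The execution-based notion can be made concrete with a minimal sketch; the action names and the set `MALICIOUS` below are illustrative assumptions, not part of the formal model. A program is flagged only when a designated action is actually executed, not when it merely appears in its code.

```python
# A minimal sketch of the execution-based notion of malware used in this
# paper. The set MALICIOUS and the action names are illustrative
# assumptions, not part of the formal model.

MALICIOUS = {"delete_system_files", "exfiltrate_data"}

def manifests_malware(executed_actions):
    """Report whether any designated malicious action was actually
    performed during execution."""
    return any(action in MALICIOUS for action in executed_actions)

# A program whose *code* contains a malicious action behind a branch
# that is never taken manifests no malware behaviour at runtime:
assert manifests_malware(["open_file", "read_config"]) is False
assert manifests_malware(["open_file", "delete_system_files"]) is True
```

The difficulty, of course, is that obtaining the executed-action trace of an arbitrary program requires running it, which is where undecidability enters.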
2. Our Contributions
In the context outlined above, the main outcome of previous research efforts is that the malware/non-malware classification problem is characterized by theoretical undecidability, or undecidability in principle. This indicates that no algorithm (Turing Machine) exists that is capable of determining, for all programs, whether they are malware or non-malware according to a given formal definition of a malware program. This was first shown in the pioneering work of Fred Cohen in [
2,
3], who considered the problem of detecting a virus through its self-replication behaviour. Although this result is fundamental and groundbreaking, it nevertheless does not provide deeper insights with respect to the nature of other potential malware behaviours as well as other aspects of the inherent undecidability of detecting, effectively, such behaviours.
The main goal of our work is to demonstrate, beyond this and similar undecidability results, that given any anti-malware Turing Machine acting as a detector of malware behaviour, it is possible to construct, algorithmically and systematically, Turing Machines (programs) that elude definitive classification as either malware or non-malware by the given anti-malware program. In other words, regardless of the malware detection strategy adopted, it is possible to construct Turing Machines, of both the malware and the non-malware variety, that elude definitive classification.
In essence, our whole approach is a humble example of results similar in nature to Gödel’s famous deep discovery about the incompleteness of consistent formal systems. In particular, Gödel’s First Incompleteness Theorem asserts that for any given consistent formal system powerful enough to be able to support statements about itself (e.g., formal systems that encompass Peano’s Arithmetic), there exist self-referential statements that cannot be proven true or false within the formal system. (For an accessible exposition of Gödel’s Incompleteness Theorems and their implications for formal systems, the reader is directed to [
4].) Moreover, these statements are actually true, albeit unprovable, only under the condition that the formal system at hand is consistent.
Motivated by the significant implications of the incompleteness theorem, we demonstrate that the Turing Machines which have an undecidable malware or non-malware characterization within a consistent formal system are, actually, non-malware. Nevertheless, it is impossible to establish this fact only through the application of the formal methods permitted by the formal system. In other words, we demonstrate that the Turing Machines under consideration operate as non-malware when the formal system is consistent.
Our departure point, however, is that we provide and discuss a fundamental shift: from identifying malware through isolated operations or Turing Machine states to characterizing its behaviour as sequences of otherwise benign and fully legitimate actions (computer operations), i.e., as sequences of state transitions within a Turing Machine framework. This perspective acknowledges that malware is not merely defined by singular malicious instructions but by the structured execution of a series of benign operations that collectively result in a harmful outcome. By defining malware in terms of sequences of state transitions, the detection problem becomes equivalent to the Halting Problem, which is provably undecidable. This undecidability emerges from the need to predict the entire execution evolution of a program, rather than identifying specific instructions in isolation, thereby rendering any formal classification of malware fundamentally incomplete. Theorem 5 in
Section 5 shows, formally, that this approach preserves the undecidability status of the Halting Problem when sequences of actions are considered. Based on these findings, Theorem 7 proves the undecidability results for the malware classification problem based on sequences of actions of Turing Machines.
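The shift from isolated actions to sequences can be illustrated with a toy example (the operation names are illustrative assumptions): each operation is benign on its own, and only a particular ordering constitutes the malicious behaviour.

```python
# Toy illustration of sequence-based malware behaviour: each operation
# below is benign in isolation; only the ordered combination matches a
# (hypothetical) ransomware pattern. Operation names are assumptions.

RANSOM_PATTERN = ["read_file", "encrypt", "delete_original", "demand_payment"]

def contains_subsequence(trace, pattern):
    """True iff `pattern` occurs as a (not necessarily contiguous)
    subsequence of the executed trace."""
    it = iter(trace)
    return all(op in it for op in pattern)  # `in` consumes the iterator

backup_tool = ["read_file", "encrypt", "write_backup"]
ransomware  = ["read_file", "encrypt", "log", "delete_original", "demand_payment"]

assert not contains_subsequence(backup_tool, RANSOM_PATTERN)
assert contains_subsequence(ransomware, RANSOM_PATTERN)
```

Crucially, the check presupposes an already-completed execution trace; deciding whether an arbitrary program ever produces such a trace is precisely where the Halting Problem intervenes.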
This approach, as detailed in
Section 5, is more realistic than conventional static analysis methods because it reflects the dynamic execution of malware, which often involves obfuscation, delayed execution, and conditional triggers that cannot be captured through syntactic pattern recognition alone. Consequently, our approach also highlights an inherent limitation in malware detection systems, as their classification powers are confined by the undecidability of tracing malware behaviors across sequential computational states, leading to the inevitable evasion of certain threats.
With respect to our strategy, in order to investigate more deeply the undecidability properties of malware, we direct our attention towards formal systems, specifically those that possess the computational capacity to support statements regarding the problem of determining whether a particular Turing Machine halts or does not halt when provided with a specific input. We demonstrate that for any such consistent (i.e., it does not contain or lead to contradictions) formal system there exists an infinite recursively enumerable set of Turing Machines (i.e., programs) that provably cannot be classified, by the deductive powers of the formal system, either as malware or non-malware.
The approach of this paper, as theoretical and detached from reality as it may appear, can nevertheless be cast in a realistic context and applied in several real-world settings, providing a framework for studying the fundamental powers and limitations of anti-malware programs, as well as potential malware construction algorithms that result directly from these limitations, such as stealth or zero-day malware. To show this potential, let us consider a specific anti-malware program. As with any program in any programming language, its functionality can be simulated by a Turing Machine. In turn, this Turing Machine can be converted into an equivalent formal system. Thus, the anti-malware program can be converted into an equivalent formal system with exactly the same deductive powers and limitations.
Based on this analogy, Theorem 8 provides an approach which, given a formal system (i.e., a model of an anti-malware Turing Machine or program), allows the construction of a Turing Machine (program) that cannot be classified either as malware or non-malware within this formal system. The basis for this result is the Turing Machines described in Theorem 3 and (as a generalization) in Theorem 4, which can be algorithmically constructed for any given formal system and whose properties are unprovable within the system. Thus, in general, no formal system (i.e., anti-malware Turing Machine) can correctly classify all programs as malware or non-malware. This result implies that Turing Machines (actually, an infinity of them) can be effectively constructed to evade detection by a given formal (anti-malware) system. We also show that if the formal system is consistent, then these Turing Machines are non-malware, but this is impossible to prove within the given formal system.
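The flavour of this construction can be conveyed with a diagonalization sketch, assuming a hypothetical total classifier `classify` standing in for the formal (anti-malware) system: whatever verdict the classifier returns about the constructed program, the program behaves contrary to it, so no verdict can be correct.

```python
# Diagonalization sketch: `classify` models a hypothetical total
# anti-malware classifier (a stand-in for the formal system). The
# constructed program consults the classifier about itself and behaves
# contrary to its verdict. Verdict strings and behaviours are
# illustrative only.

def make_evader(classify):
    def evader():
        if classify(evader) == "malware":
            return "acts benign"  # refutes a "malware" verdict
        return "performs a designated malicious action"  # refutes "non-malware"
    return evader

# Whatever the toy classifier answers, the evader contradicts it:
assert make_evader(lambda p: "malware")() == "acts benign"
assert (make_evader(lambda p: "non-malware")()
        == "performs a designated malicious action")
```

In the paper's setting the self-reference is achieved via the recursion theorem rather than a literal callback, but the contradiction has the same shape.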
We next apply Theorem 10, stated in
Section 8, which reveals how one can construct an infinity of malware programs that evade the detection powers of the given formal system. This, in turn, can serve as a warning to the developers of the anti-malware program with respect to its identified limitations, which render it incapable of detecting a wide range (actually, an infinite number) of malware programs, as Corollary 4, also from
Section 8, states. Actually, these malware programs can be considered zero-day exploits that take advantage of inherent vulnerabilities of the anti-malware program that are probably not known to its developers. This lack of knowledge is mostly due to the complexity of understanding, formally and fully, the workings of these programs, which may even rely on heuristic and machine learning approaches (and, thus, may not be fully understandable or predictable).
Moreover, our approach provides a systematic and fully automated way to uncover zero-day vulnerabilities, and to reveal the exact form of the corresponding series of malware programs that exploit them, for any given anti-malware software. Zero-day vulnerabilities are unknown to software developers at the time of software release; they may exist in the software applications themselves (which is the case considered in this paper), in the operating system, or in third-party applications. When a zero-day vulnerability is exploited by malware, the antivirus software has no signature or detection mechanism in place to recognize and block that specific malware variant. As a result, the malware can bypass traditional antivirus defenses and infect systems undetected. Antivirus software vendors continuously update their detection algorithms and signature databases to detect new malware variants, including zero-day threats, by analyzing the patterns, behaviours, and characteristics of known malware samples and by employing advanced heuristics and machine learning techniques to identify suspicious or malware-related activities (see [
5]). Despite these efforts, there is always a time window of risk between the discovery of a zero-day malware and the deployment of effective countermeasures, during which zero-day malware poses a significant challenge to antivirus programs and cybersecurity professionals. This underscores the importance of a multi-layered approach to cybersecurity, including proactive threat hunting, intrusion detection systems, network segmentation, user awareness training, timely software patching, and vulnerability management, to mitigate the risks posed by zero-day malware.
Then, this paper generalizes these concepts to the definition of the obfuscator in the context of malware detection and classification. Specifically, it extends the given results on the undecidability of the malware/non-malware classification problem by demonstrating that specific types of malware can be constructed which exploit inherent formal system limitations through obfuscation mechanisms based exactly on these limitations. In
Section 9, Definition 11, we define a malware obfuscator as a special type of program that, due to unprovable properties within a formal system, enables the embedding of malware that “piggybacks” on the obfuscator to evade detection. This contribution is significant because it moves beyond traditional notions of obfuscation, which are often limited to syntactic transformations or encryption techniques, and instead frames obfuscation as a fundamental property of computation and formal reasoning. It is shown that within any given formal system there exist programs that cannot be definitively classified as malware or non-malware due to the incompleteness of formal systems, an idea reminiscent of Gödel’s famous incompleteness theorems. This establishes that obfuscation is not only a practical challenge but a theoretical inevitability, reinforcing the argument that malware detection faces intrinsic computational limitations. We further illustrate that the act of detecting malware in an obfuscation-resistant setting is formally equivalent to the Halting Problem, making it undecidable. This insight deepens the understanding of why anti-malware technologies face major difficulties against sophisticated threats, as even the most advanced detection mechanisms are fundamentally constrained by the limitations of the formal deductive system underlying a particular anti-malware strategy.
In addition, a concrete practical example of how obfuscators can be used to construct malware programs that evade formal classification is provided. This construction involves embedding a malware program within a specially designed obfuscator, which possesses certain properties unprovable within a given formal system modeling an anti-malware Turing Machine (or program). The key insight is that the obfuscator itself is not inherently malware, but it provides the computational context within which malware can hide and remain undetectable by the anti-malware Turing Machine. More specifically, based on obfuscators, a hidden malware can be constructed as follows, for a given anti-malware Turing Machine viewed as a formal system: we combine (i) a Turing Machine (the obfuscator) with a set of properties that cannot be proved within the formal system; and (ii) a malware program that executes at least one sequence of actions defined as malicious. The composite Turing Machine activates the malware part conditionally, based on computational properties of the obfuscator Turing Machine that cannot, however, be determined within the given formal system. Thus, no anti-malware program operating under that system can conclusively prove whether the composite program is malware or benign. For concreteness, we provide a real example of an obfuscator within the formal system of Zermelo–Fraenkel Set Theory with the Axiom of Choice, or ZFC. More specifically, we deploy a 7910-state Turing Machine proposed in the literature that acts as an obfuscator for ZFC with respect to its halting status. With this example, which is realizable in practice, we show that the undecidability status of malware classification is not merely a remote theoretical possibility but a real risk deeply rooted in the very limitations of formal reasoning.
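A schematic rendering of this construction, with all names as illustrative placeholders, looks as follows:

```python
# Schematic rendering of the obfuscator construction. All names are
# illustrative placeholders; in the paper, the obfuscator is a Turing
# Machine whose halting status is unprovable in the formal system
# modelling the anti-malware program.

def composite_program(obfuscator, payload):
    """Run the obfuscator first; the payload executes only if the
    obfuscator halts. If it never halts, the composite is non-malware,
    but proving that requires settling the obfuscator's halting status,
    which the formal system (by assumption) cannot do."""
    obfuscator()   # e.g., a machine whose halting is independent of ZFC
    payload()      # a designated malicious action sequence

# Harness with a trivially halting obfuscator, for illustration only:
log = []
composite_program(lambda: log.append("obfuscate"),
                  lambda: log.append("malicious action"))
assert log == ["obfuscate", "malicious action"]
```

The composite is thus "gated" on a property the formal system cannot decide, which is exactly what blocks a definitive malware/non-malware verdict.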
With respect to our preliminary work in [
6], in the present paper we extend our previous results in order to address the construction of malware programs that can act like “stealth” malware. They can be embedded in special non-malware programs called obfuscators, which have certain unprovable properties within a formal system. Malware programs can hide themselves in obfuscators and remain undetectable, taking advantage of the unprovable properties of the obfuscators of formal systems.
We would like to emphasize, at this point, that the problems addressed in our work cannot be resolved directly using Rice’s Theorem (see [
7]), which is an extremely powerful tool for proving undecidability results. One reason is that the formulation of the malware detection problem is based on properties or actions of the targeted Turing Machines and not on properties of the language they accept, whereas Rice’s Theorem yields undecidability results related to the properties of the languages accepted by Turing Machines. Moreover, Rice’s Theorem cannot be deployed towards the algorithmic construction of specific Turing Machines having a concrete behaviour that is undecidable (malware detection, in our case).
Before we proceed, we should remark that theoretical impossibility does not imply impossibility in practice, since Turing Machines are idealized models of computers with unlimited computational resources. Indeed, the finite nature of real computers and programs renders all undecidable problems decidable by simple (but highly inefficient in practice) brute-force approaches. Thus, theoretical impossibility results may not translate readily into impossibility results in practice, which explains the fact that many anti-malware programs exist today that are very effective in detecting malware programs. For instance, some well-known methods for malware detection are signature-based scanning, heuristic analysis, real-time behavioural monitoring, and sandbox analysis. Novel approaches to malware detection and categorization using grayscale images of malware files and deep learning methods have also been proposed (e.g., Convolutional Neural Networks in [
8]).
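The brute-force argument can be sketched under a simplifying assumption: model a real (finite) machine as a deterministic step function over a finite state space. A non-terminating run must then revisit a configuration, so halting becomes decidable by bookkeeping.

```python
# Sketch of why undecidability vanishes on finite machines: a real
# computer has finitely many configurations, so a non-terminating run
# must revisit one, which a brute-force simulator can detect. Here a
# "machine" is a deterministic step function on its (finite) state space.

def halts(step, start, halt_states):
    """Decide halting for a deterministic transition function over a
    finite state space by detecting a repeated configuration."""
    seen, state = set(), start
    while state not in halt_states:
        if state in seen:
            return False          # configuration repeated: infinite loop
        seen.add(state)
        state = step(state)
    return True

# The machine 0 -> 1 -> 0 -> ... never reaches the halting state 2:
assert halts(lambda s: (s + 1) % 2, 0, {2}) is False
# Counting upward from 0 reaches 2 and halts:
assert halts(lambda s: s + 1, 0, {2}) is True
```

The catch, of course, is that "configuration" includes all of memory, so the state space, while finite, is astronomically large; this is why the decision procedure is hopelessly inefficient in practice.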
With respect to the organization of the rest of the paper, in
Section 3 we discuss papers related to the proposed framework and how our approach differs from them. In
Section 4, we provide the main elements of Recursive Function Theory which will be used to derive our results. In
Section 6, we prove that deciding whether a given program (Turing Machine) is a malware or non-malware program is undecidable. Furthermore, in
Section 7, we show that given any anti-malware program modeled as a Turing Machine or a formal system, we can construct, systematically, infinitely many programs (Turing Machines) that evade characterization as either malware or non-malware programs. In addition, in
Section 8, we show how to construct programs (Turing Machines) that are malware programs and which evade a formal characterization as such by the given anti-malware program. In
Section 9, based on the results of
Section 8, we provide a generic definition of such malware programs as obfuscators and point to specific, practical constructions existing in the related literature. Finally, in
Section 10 we summarise our results, while in
Section 11 we provide directions for further explorations of the power of anti-malware programs based on the theoretical framework proposed in our work.
3. Related Work
The approach presented in the paper extends and complements previous works on malware detection, particularly those focused on obfuscation, metamorphic transformations, and heuristic-based analysis. While earlier studies, such as the ones cited in this paper, explore practical challenges in malware detection, ranging from NP-completeness in static detection methods to statistical and machine learning-based identification techniques, our paper adopts a fundamentally different perspective by addressing the inherent theoretical limitations of malware classification within formal systems. Unlike studies that investigate heuristic and statistical models to enhance detection, our work establishes a rigorous computability-theoretic framework to prove that malware detection is not only computationally infeasible but formally undecidable. In particular, it builds on Cohen’s foundational work on malware undecidability by generalizing his virus definition to encompass all forms of malware behavior. Furthermore, it extends previous discussions on obfuscation by introducing a novel paradigm where malware exploits the unprovable properties of obfuscators, leveraging the incompleteness of formal systems to evade classification. Although our approach is not AI-based, it complements AI approaches to malware detection by providing a fundamental understanding of their limitations. AI-driven methods, such as deep learning and heuristic-based detection, rely on pattern recognition and probabilistic inference rather than formal proof systems. However, since malware can be systematically designed to evade classification within any consistent formal system, our results suggest that AI-based detection mechanisms may also be inherently constrained by the same theoretical limits. 
This theoretical foundation does not replace AI-based methods but instead provides insight into why AI techniques may struggle against adversarially generated malware, which can be constructed in ways that align with undecidable problems. By framing malware classification as a Gödelian incompleteness problem rather than simply an issue of computational complexity, the paper augments existing research with a perspective that is both more general and more fundamental, shedding light on the intrinsic weaknesses of both formal and algorithmic malware detection methods, including AI-driven solutions.
The study in [
9] highlights the computational challenges posed by advanced code obfuscation techniques in metamorphic viruses, demonstrating that reliable static detection of such viruses is an NP-complete problem. Their work focuses on the use of metamorphic transformations to evade signature-based detection by altering the virus’s code without changing its functionality.
Similarly, the study in [
10] extends the discussion by investigating methods for detecting undetectable and metamorphic computer viruses. In particular, the authors propose heuristic-based techniques aimed at detecting viruses that leverage polymorphic and metamorphic transformations to avoid detection.
The study in [
11], on the other hand, takes a more comprehensive approach by evaluating the use of statistical and machine learning models to analyse the structural patterns in metamorphic viruses, presenting practical solutions to identify such threats.
In contrast to these approaches, which primarily address the practical challenges of malware detection from a computational and heuristic perspective, our approach focuses on the inherent theoretical limitations of formal systems in algorithmically classifying programs as malware or non-malware. By grounding our analysis in computability theory and leveraging the concept of Turing Machines, we demonstrate that the problem of distinguishing malware from non-malware is not merely computationally infeasible, but fundamentally undecidable in principle. Unlike the work [
9], which focuses on the limitations of static analysis, or the statistical methods considered in [
11], our framework highlights the limitations of formal deductive systems, showing that consistent formal systems are incapable of definitively classifying certain programs.
Furthermore, while ref. [
9], as well as ref. [
11], emphasize the obfuscation and metamorphic transformations that render malware undetectable in practice, we take the argument a step further by constructing Turing Machines that elude definitive classification within consistent formal systems. Specifically, we introduce a novel paradigm where malware can exploit the unprovable properties of obfuscators, effectively hiding within special programs and taking advantage of the incompleteness of formal systems, akin to Gödel’s Incompleteness Theorem.
While the aforementioned works rely on practical and computational approaches, such as heuristic analysis, statistical modeling, and static detection techniques, our research delves into the theoretical, or in-principle, undecidability of malware detection by any approach. By providing a systematic framework for constructing stealth malware that exploits the limitations of formal systems (i.e., anti-malware strategies), we demonstrate that the challenges of malware detection are deeply rooted in the inherent properties of computation and logical deduction. This theoretical foundation not only complements but also expands upon the practical insights offered by [
9,
10,
11] as well as similar approaches revealing new dimensions in the ongoing research focused on the nature of undetectable (or hard to detect) malware [
12,
13,
14,
15,
16,
17].
Our approach for identifying and describing, algorithmically, the inherent (infinitely many) vulnerabilities of anti-malware programs cannot be considered an Artificial Intelligence-based, or AI-based, approach. Nevertheless, with respect to related work, our investigation uncovered several AI/ML research papers whose results are strongly related to our approach, in the sense that they use certain algorithmic approaches (i.e., formal systems, in our viewpoint), such as heuristics, machine learning, and computational intelligence, to identify vulnerabilities of anti-malware programs and, especially, to automate exploit generation against anti-malware applications. In essence, the cited papers focus on a task similar to ours, i.e., generating, in an automated or algorithmic way, vulnerabilities and exploits for anti-malware programs and security protection methods. Our work, however, is far more general, since it can be applied to any anti-malware program through its transformation into a formal system. The proposed approach also provides an automated process which generates an infinity of malware programs as zero-day exploits of a given anti-malware program. Below, we provide a brief overview of related work in this context, which is also summarized in
Table 1.
The authors in [
18] propose an approach for Android malware detection based on Generative Adversarial Networks, or GANs. They use GANs to generate synthetic malware samples, which are then used to augment the training datasets of machine learning-based detection systems. Their results demonstrate a systematic way to address the challenge of limited training data in malware detection: by algorithmically generating synthetic or artificial malware samples, the proposed approach improves the robustness and effectiveness of machine learning-based detection models.
The authors of [
19] present a method for automated malware generation using Dynamic Symbolic Execution, or DSE. They propose a technique that systematically explores program paths to generate threat inputs that trigger vulnerabilities, resulting in the creation of new malware variants. The main contribution of this paper is that automated malware generation using DSE offers a systematic approach to identifying vulnerabilities and creating exploit payloads. This automated malware generation process can be leveraged for both offensive and defensive purposes in cybersecurity.
The authors of [
20] provide an overview of various obfuscation techniques employed by malware authors to evade detection by security mechanisms. The paper outlines different obfuscation methods, such as code obfuscation, data obfuscation, and control obfuscation, highlighting their significance in malware development. Additionally, it discusses the challenges faced by malware analysts in detecting and analyzing obfuscated malware. The survey serves as a valuable resource for understanding the tactics used by malware creators to circumvent security measures.
The authors of [
21] discuss how signature-based intrusion detection systems (IDSs) can be evaded by polymorphic worms, which vary their payloads in every infection attempt. In this paper, they propose Honeycyber, a system for automated signature generation for zero-day polymorphic worms. The system is able to generate signatures to match most polymorphic worm instances with low false positives and low false negatives.
The authors of [
22] propose AEG (Automatic Exploit Generation), a framework that utilizes machine learning techniques to tailor exploit payloads to target applications. AEG analyzes applications’ vulnerabilities and generates payloads optimized for successful exploitation, demonstrating the effectiveness of AI-based approaches in automating the process of exploit generation. By leveraging machine learning, AEG can adapt exploit payloads to specific target environments, thereby increasing the likelihood of successful attacks.
The authors in [
23] use DeepLocker as a proof of concept to show how next-generation malware could leverage artificial intelligence. DeepLocker is a malware generation engine that a malware author could use to empower traditional malware samples, such as WannaCry, with artificial intelligence. A deep Convolutional Neural Network (CNN) is deployed to customize a malware attack by combining a benign application and a malware sample to generate a hybrid malware that bypasses detection by exposing (mimicking) benign behaviours. These techniques can also produce stealthy malware payloads within benign applications that conceal their malware actions. DeepLocker represents a significant advancement in the field of automated malware generation, as it demonstrates the potential of AI-based techniques to create sophisticated and evasive malware strains. The paper also highlights the need for enhanced cybersecurity measures to detect and mitigate such threats.
The authors in [
24] present Stegomalware, a systematic survey of malware hiding and detection in images, machine learning models, and related research challenges. Stegomalware involves hiding and detecting malware in images. It explores various methods used for concealing malware code or data within digital images, as well as techniques for detecting such hidden malware. The authors also discuss the role of machine learning models in stegomalware detection, and highlight research challenges in this area.
In [
25], the authors focus on the risks and implications associated with the “weaponization” of malware. The authors discuss how malware code can be used as a tool for cyber-warfare or criminal activities, highlighting the importance of understanding and mitigating such malware threats on a large scale.
In [
26], the authors provide an overview of Artificial Intelligence methods deployed in malware development. The paper covers various AI techniques, such as machine learning, deep learning, and natural language processing, and their roles in developing malware detection, analysis, and mitigation strategies.
In [
27], the authors explore the potential implications of using artificial intelligence in the development and execution of malware. The paper discusses how AI technologies could empower malware authors to systematically create more sophisticated and evasive threats, posing challenges for cybersecurity professionals.
The authors of [
28] present a survey on artificial intelligence techniques deployed in malware development as next-generation threats. The authors focus on the role of artificial intelligence in shaping a new, more aggressive landscape of malware threats. The paper also discusses specific AI-driven techniques used by cybercriminals to develop and deploy advanced malware strains.
In [
29], the authors explore the potential threats posed by artificially intelligent cyberattacks. The paper discusses how AI technologies could be leveraged by malicious actors to automate and enhance the effectiveness of cyberattacks (especially malware), requiring new defense strategies to counter them.
In summary, our approach can be seen as a generalization of all such approaches, providing a generic theoretical framework for studying anti-malware applications and identifying, at an early stage before their deployment, several of their inherent vulnerabilities, which may lead to the construction of zero-day exploits and malware strains with stealth properties (see also
Section 9 on obfuscators).
Moreover, the viewpoint that a malicious piece of code inside a program consists of a sequence of legitimate primitive operations which, when executed in that order, lead the system to a risky state has some important implications. Since an OS and/or a processor provides large sets of legitimate primitive operations, a malware developer can produce infinitely many (in accordance with our results) sequences of primitive operations that render the system unstable or inoperable at the end of their execution. Our approach provides a generic theoretical framework for proving that the construction of such zero-day exploits and malware strains with stealth properties is indeed a real possibility in any formal approach to identifying malware. This may be seen as a justification of the anti-malware strategy focused on AI and ML techniques, which can be viewed as approaches that “learn” malicious behaviour through feature vectors composed of malware and non-malware strains (i.e., sequences of primitive operations), attempting to capture benign and malicious behaviour within an infinity of possible behaviours, both benign and malicious.
4. Recursive Function Theory and Turing Machines with, Formally, Unprovable Properties
We will proceed on the assumption that the reader has a fundamental level of knowledge in the principles and results of Recursive Function Theory, for which we direct the reader to the relevant sources such as, for instance [
30,
31,
32,
33]. For completeness, however, we include in what follows some basic definitions and results which we will need for stating and proving our results. In our brief presentation of the basic elements of Recursive Function Theory, we adhere to the presentation style used in the excellent and well-known book on the topic by Hopcroft and Ullman [
32]. Before we go into the specifics, we would like to point out that there are other, more realistic computational models that might be used for our purposes. However, given that our approach is based on the ultimate model of computation, the Turing Machine, we have chosen to derive our results for this model in order to keep our discussions as straightforward and accessible as possible. Extensions to other computation models are not hard to achieve. In this context, it is important to emphasize that Computability and Recursive Function Theory do not depend on any particular computation paradigm. The model of computation can be any reasonable model, including the Turing Machine, the λ-terms (in the λ-Calculus formalism), and the μ-recursive functions, as well as, from a more practical standpoint, any real computer programming language. If we ignore practical issues related to differences in the efficiency of computations on each model, it has been demonstrated that the theoretical computing capabilities of all of these different computation models are equivalent (see, e.g., [
32]).
Below, for completeness, we provide a brief introduction to the basic concepts and results from Computability Theory which are relevant to this paper.
Definition 1 (Turing Machine)
. A Turing Machine is a theoretical computational model consisting of an infinite tape, a tape head that can read and write symbols, and a finite set of states. Formally, a Turing Machine is defined as a tuple M = (Q, Σ, Γ, δ, q₀, q_accept, q_reject), as follows:
Q is a finite set of states,
Σ is the input alphabet (excluding the blank symbol),
Γ is the tape alphabet (including the blank symbol ⊔), where Σ ⊆ Γ,
δ : Q × Γ → Q × Γ × {L, R} is the transition function,
q₀ ∈ Q is the initial state,
q_accept ∈ Q is the accepting state, and
q_reject ∈ Q is the rejecting state, where q_reject ≠ q_accept.
Definition 2 (Oracle Turing Machine)
. An oracle Turing Machine is an extension of a standard Turing Machine that has access to an oracle, which is a black box capable of solving a specific decision problem instantaneously. Formally, an oracle Turing Machine is defined as a tuple M^O = (Q, Σ, Γ, δ, q₀, q_accept, q_reject, O), as follows:
Q is a finite set of states,
Σ is the input alphabet (excluding the blank symbol),
Γ is the tape alphabet (including the blank symbol ⊔),
δ is the transition function, which may depend on the response of the oracle,
q₀ ∈ Q is the initial state,
q_accept ∈ Q is the accepting state,
q_reject ∈ Q is the rejecting state, where q_reject ≠ q_accept,
O is an oracle for a language L ⊆ Σ*. The machine can query the oracle with a string w by writing w on a special query tape and entering a special query state. The oracle then provides an answer (typically, whether w ∈ L) instantaneously.
The machine can use the oracle O as a subroutine during its computation, and the oracle’s answers can affect the machine’s subsequent transitions and halting behaviour.
Oracle Turing Machines are a theoretical abstraction in Computability Theory that expands upon the idea of classical Turing Machines. These machines are outfitted with an “oracle”, a theoretical instrument capable of immediately solving certain decision problems. The oracle functions as a black box, providing answers without revealing the underlying computation. These machines are mostly utilized in the field of computational complexity, specifically for understanding complexity classes such as P, NP, and PSPACE (see [
32] for a formal presentation of these classes). By equipping a Turing Machine with an oracle, researchers can investigate the impact that access to solutions of specific problems has on computational power. This is especially advantageous for investigating the connections between complexity classes. By comparing the problem-solving capabilities of a Turing Machine with and without an oracle, researchers can acquire valuable insights into the unresolved question of whether P equals NP, a fundamental inquiry in the field of computer science. Oracle Turing Machines also play a crucial role in the idea of relativization, which investigates how the validity of specific computational claims might be influenced by the selection of an oracle. Such relativized arguments are significant for understanding the limitations of proof techniques, for example those which may be deployed for resolving the famous, longstanding, open problem in Complexity Theory of whether P = NP
or not (see [
32] for an excellent presentation of relativization and its important consequences on the P versus NP
question). Furthermore, these machines play a crucial role in determining the relationships between problems, particularly in identifying problems that are complete for specific complexity classes, that is, the most challenging problems within those classes. This understanding facilitates the classification of problems according to their computational complexity and resource demands.
Definition 3 (Turing Decidable Set)
. A set A ⊆ Σ* over an alphabet Σ is called Turing decidable (or recursive) if there exists a Turing Machine M that, given any string w ∈ Σ*, halts and accepts if w ∈ A, and halts and rejects if w ∉ A. In other words, A is Turing decidable if there exists a Turing Machine that decides membership in A.
Definition 4 (Turing Acceptable Set)
. A set A ⊆ Σ* over an alphabet Σ is called Turing acceptable (or recursively enumerable) if there exists a Turing Machine M that, given any string w ∈ Σ*, halts and accepts if w ∈ A. If w ∉ A, the machine M may either halt and reject or loop indefinitely. In other words, A is Turing acceptable if there exists a Turing Machine that enumerates the elements of A by accepting them.
In theoretical computer science, the terms Turing acceptable, Turing recognizable, and recursively enumerable (often abbreviated as RE) are synonymous and describe the same concept. Specifically, a set A ⊆ Σ* is considered Turing acceptable or recursively enumerable if there exists a Turing Machine M such that the following conditions are met:
For every string w ∈ A, the machine M halts and accepts.
For every string w ∉ A, the machine M either halts and rejects or does not halt (runs indefinitely).
This definition implies that a Turing Machine can enumerate the elements of the set A by accepting exactly those strings that are in A. However, if a string is not in A, the behaviour of the machine is not required to be consistent: it may either reject by halting or run forever.
Thus, the concepts of Turing acceptable, Turing recognizable, and recursively enumerable are interchangeable and all refer to sets for which a Turing Machine can recognize or enumerate the elements, even if it does not halt on all inputs.
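The distinction between deciding and merely accepting can be made concrete with a small sketch. The following Python toy is our illustration, not part of the formalism; the step cutoff stands in for a computation that would never halt. It contrasts a decider for the even numbers with a recognizer that finds members by unbounded search:

```python
# Toy illustration: a decider halts on every input, while a recognizer
# is only guaranteed to halt (and accept) on members of the set.

def decider_even(n):
    """Decides A = {even naturals}: always halts with True/False."""
    return n % 2 == 0

def recognizer_even(n, max_steps=1000):
    """Recognizes the same set by unbounded search: it halts and accepts
    members, and would loop forever on non-members (the loop is cut off
    after max_steps only to keep the sketch runnable)."""
    k = 0
    while k < max_steps:          # stands in for 'while True'
        if 2 * k == n:            # the search succeeds only for even n
            return True
        k += 1
    raise TimeoutError("still running -- the machine never rejects")
```

Here `recognizer_even(6)` halts and accepts, while on input 7 the search would run forever; the `TimeoutError` merely makes the divergence observable in finite time.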
Definition 5 (Recursive Function)
. A function f : ℕ^k → ℕ is called recursive (or computable) if there exists a Turing Machine that computes f. That is, for every k-tuple (n₁, …, n_k) of natural numbers, the machine halts with f(n₁, …, n_k) written on the tape.
Definition 6 (Partial Recursive Function)
. A function f : ℕ^k → ℕ ∪ {⊥} is called partial recursive if there exists a Turing Machine that computes f, where ⊥ indicates that the machine does not halt (i.e., the function is not defined for some inputs). If f(n₁, …, n_k) = ⊥, the machine runs forever on input (n₁, …, n_k).
Recursive functions, or total recursive functions, are functions that map natural numbers to natural numbers and can be computed by a Turing Machine that halts on every input. Computability Theory relies on the foundation of these functions, which are built from a set of essential operations (the zero function, the successor function, and the projection functions) and are closed under composition, primitive recursion, and minimization. A function is said to be primitive recursive if it can be defined without the minimization operator, which ensures that it always produces a result in a finite number of steps. Partial recursive functions expand the concept of computability to encompass functions that may not halt and provide a result for every input, indicating that the corresponding Turing Machine may not terminate for certain inputs. This class includes all recursive functions as well as some that are not total. The concept of partial recursive functions is crucial for understanding the limits of what can be computed, as it encompasses problems that are undecidable, such as the Halting Problem, where the function in question is not defined for every input (see [
32]).
Definition 7 (Many-One Reduction)
. Let A and B be sets of strings over an alphabet Σ. A function f : Σ* → Σ* is a many-one reduction (or mapping reduction) from A to B if f is a total computable function such that, for all w ∈ Σ*, w ∈ A if and only if f(w) ∈ B. This is denoted as A ≤_m B.
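As a toy instance of Definition 7 (our example; the sets are chosen purely for illustration), the set of odd numbers many-one reduces to the set of even numbers via the total computable function f(n) = n + 1, since n is odd if and only if n + 1 is even:

```python
# Toy many-one reduction: A = {odd naturals} reduces to B = {even naturals}
# via the total computable map f(n) = n + 1.

def f(n):
    return n + 1            # the reduction function

def in_even(n):             # a decider for B = {even naturals}
    return n % 2 == 0

def in_odd(n):              # decide A through the reduction: x in A iff f(x) in B
    return in_even(f(n))
```

A single application of f followed by one membership query to B answers the query for A, which is exactly the shape of a many-one reduction.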
Definition 8 (Turing Reduction)
. A set A is Turing reducible to a set B, denoted A ≤_T B, if there exists an oracle Turing Machine which, given any w and access to an oracle for B, can decide whether w ∈ A. The oracle for B is a hypothetical device that can instantly determine membership of any element in B.
In this paper, we deploy many-one reductions. We used Turing reductions in [
34] to prove the undecidability of a problem defined to capture the evasion powers of Panopticons modelled as special types of Turing Machines with respect to the properties of the information they gather (i.e., sets they accept). It is interesting to note that, as the experience of working on the present and that paper showed, when one investigates, formally, the detection problem for entities based on their syntactic structure or description (e.g., malware), many-one reductions are more suitable to handle the inherent difficulty of the problem, while for entities whose detection is based on the properties of the information they process or language they accept (e.g., Panopticons, as defined and studied in [
34]), Turing reductions are more appropriate.
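The contrast can be sketched in code: a many-one reduction asks a single transformed question and passes the answer through, while a Turing reduction may query the oracle several times and combine the answers. A hypothetical Python sketch (the sets and all names are ours):

```python
# Toy Turing reduction: deciding A = {n : n and n+2 are both prime}
# with an oracle for B = {primes}. Two oracle queries plus a conjunction,
# so this is a Turing reduction that is not a many-one reduction.

def prime_oracle(n):
    """Stands in for the black-box oracle for B."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def twin_prime_start(n, oracle=prime_oracle):
    # Query the oracle twice and post-process the answers.
    return oracle(n) and oracle(n + 2)
```

Because the final answer is computed from several oracle answers rather than passed through from one query, no single mapping f could play the same role.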
In addition, the computational procedures or programs that may be created within the context of the Turing Machine formalism or a real programming language can be listed or enumerated effectively, i.e., systematically (for an exposition of how this enumeration can be performed for Turing Machines, see [
32]). This indicates that there is a mechanism that can enumerate all of the programs in any language or computation formalism, such as the Turing Machine, one by one, in such a way that all programs appear at some point of the enumeration process (see [
32]).
Also, it is a well-known result of computability theory that the number of arguments in a function (i.e., inputs to a program) does not matter for studying the power of a computation model. This is because some of the arguments may always be embedded (i.e., hard-wired) in the program itself in an easy fashion, reducing the total number of its arguments. This is stated, formally, in the following result, which is a simplified form of Kleene’s
S-m-n theorem (see [
30]), applicable to functions with two arguments (it easily generalizes to any number of arguments).
Theorem 1 (simplified form of Kleene’s
S-m-n theorem)
. Let ψ(x, y) be a partial recursive function. Then, there is a total recursive function σ of one variable, such that φ_{σ(x)}(y) = ψ(x, y) for all x and y. That is, if σ(x) is considered as the integer (code) representing some TM M_{σ(x)}, then M_{σ(x)} on input y computes ψ(x, y).
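In a real programming language, the S-m-n construction amounts to hard-wiring an argument by program specialization. A minimal Python sketch of this idea (all function names are ours):

```python
# Sketch of the S-m-n idea: sigma takes a two-argument program and a
# value x, and effectively produces a one-argument program with x
# hard-wired in, computing y -> psi(x, y).

def psi(x, y):                      # an arbitrary two-argument program
    return x * 10 + y

def sigma(two_arg_program, x):
    """Total, effective transformation: specialize the first argument."""
    def specialized(y):
        return two_arg_program(x, y)
    return specialized

p = sigma(psi, 4)                   # p computes y -> psi(4, y)
```

In the theorem, σ returns the *code* of the specialized machine rather than the machine itself; the closure above plays that role in the sketch.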
Further to this result, we will use another fundamental result from recursive function theory. This result is the Recursion Theorem, according to which every total (i.e., defined for all values in its domain) recursive function that maps the set of all Turing Machine encodings (i.e., suitably encoded Turing Machines or programs) onto Turing Machine indices has a fixed point. Formally, this result is stated below by Theorem 2.
In our approach, we fix a powerful enough formal system F, e.g., one that includes Peano’s Arithmetic. The only requirement of such a system is that within it we can write statements about the properties of natural numbers (for us, these numbers are Turing Machine indices or encodings). We deploy a known result from Recursive Function Theory, which states that given such a formal system F, we can effectively, i.e., algorithmically, construct a Turing Machine M possessing this property: no proof exists in F for the statement that asserts that the Turing Machine M, when given any specific input, halts and, at the same time, no proof exists that M does not halt. The proof of the theorem can be found in [
32], Chapter 8. In what follows, we will denote by M_F this Turing Machine (there may be several Turing Machines with this property, however) whose halting status is unprovable using the deductive powers of F. As expected, the specifics of M_F depend heavily on F. However, the crucial fact in our approach is that the Turing Machine M_F can be constructed effectively, i.e., using an algorithmic procedure (but not necessarily efficiently, i.e., fast, which does not affect our results).
First, we state the Recursion Theorem (see [
32]), which is central to our arguments that are discussed in
Section 6.
Theorem 2. For any total recursive function σ there exists an n such that φ_n(x) = φ_{σ(n)}(x), for all x, where φ_n is the function that is computed by the Turing Machine M_n.
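A concrete programming-language counterpart of this self-reference is a quine, a program that reconstructs its own source text; the Recursion Theorem guarantees that such fixed points exist in every acceptable programming system. A minimal Python sketch:

```python
# A quine-style fixed point: a two-line program whose execution rebuilds
# its own source text in its variable `source`, illustrating the
# self-reference guaranteed by the Recursion Theorem.
template = 'template = {!r}\nsource = template.format(template)'
source = template.format(template)

namespace = {}
exec(source, namespace)             # run the program given by `source`
# the executed program reconstructs exactly its own text
assert namespace["source"] == source
```

The `{!r}` conversion inserts the template into itself in quoted form, which is what lets the program describe its own description.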
Using this theorem, the following result, which is also important for our approach discussed in
Section 6, is proven in [
32] (first proven in a more general form in [
35]):
Theorem 3. Given a formal system F, we can construct a Turing Machine M_F for which no proof exists in F either that it halts or that it does not halt.
The following corollary is not stated in [
32]. However, it is discussed in [
35], and it is not difficult to show that it provides evidence for a weakness of F which can be deployed to construct Turing Machines whose malware status, either malware or non-malware, cannot be proven within F.
Corollary 1. For the n₀-th Turing Machine M_{n₀}, which is the machine constructed in the proof of Theorem 3 and computes the function φ_{n₀}, it holds that it does not halt on any input j if and only if F is consistent.
Theorem 3 can be generalized for other undecidable properties of Turing Machines and, in particular, properties about their outputs when viewed as integer computation procedures. In this context, we give the definition of non-trivial properties of sets of integers:
Definition 9 (Non-trivial properties of integers)
. A non-trivial property of the set ℕ of positive integers is any subset P of ℕ such that P ≠ ∅ and P ≠ ℕ.
As an example, a non-trivial property of the set of integers is the set of primes, i.e., the set characterizing the property of an integer being prime. We can show the following, in a similar fashion to Theorem 3:
Theorem 4. Given a formal system F and a non-trivial property P of integers, we can construct a Turing Machine such that, when it is given any j as input, no proof exists in F either that its output is in P or that it is not in P.
Proof. Given F and P, we construct a Turing Machine M which computes a function ψ of two inputs, defined as follows. Fix two integers p₀ ∈ P and p₁ ∉ P, which exist since P is non-trivial. As in the proof of Theorem 3, the Turing Machine M operates by enumerating proofs in F. Then, one of the following three cases may be true:
- 1.
A proof is found that the Turing Machine M_i with input j gives as output an integer not in P. Then M outputs the value ψ(i, j) = p₀ ∈ P.
- 2.
A proof is found that the Turing Machine M_i with input j gives as output an integer in P. Then M outputs the value ψ(i, j) = p₁ ∉ P.
- 3.
None of the two proofs above can be deduced from F. Then the value ψ(i, j) of M will be undefined since M never halts.
Using the S-m-n theorem and the Recursion Theorem, as in the proof of Theorem 3, we arrive at the following equality for an integer n₀ which can be effectively constructed:

φ_{n₀}(j) = ψ(n₀, j) for all j,

where ψ denotes the two-input function computed by M above. The Turing Machine M_{n₀} is the Turing Machine computing the function φ_{n₀}. There are three possible cases for the value of φ_{n₀}(j):
- 1.
φ_{n₀}(j) = p₀, where p₀ ∈ P. This implies that a proof was found by the n₀-th Turing Machine, i.e., by M_{n₀} itself, that M_{n₀} outputs a value not in P, which is a contradiction since it must hold that φ_{n₀}(j) = p₀ ∈ P.
- 2.
φ_{n₀}(j) = p₁, where p₁ ∉ P. This implies that a proof was found by the n₀-th Turing Machine, i.e., by M_{n₀} itself, that M_{n₀} outputs a value in P, which is a contradiction since it must hold that φ_{n₀}(j) = p₁ ∉ P.
- 3.
φ_{n₀}(j) is undefined. This implies that neither a proof that M_{n₀} outputs a value in P nor a proof that M_{n₀} outputs a value not in P can be provided by the formal system F.
The only case from these three that does not lead to a contradiction is the third. Thus, a Turing Machine M_{n₀} with the desired property has been effectively constructed, according to the conclusion of the theorem. □
Turing Machines, such as the ones constructed in the proofs of Theorems 3 and 4, can be used, as we will show below, for constructing Turing Machines whose malware status cannot be proven within any given formal system.
5. A Generic Abstract Computational Model for Malware
This section discusses how malware, as a computational artifact, can be modeled using Turing Machines. It outlines the correspondence between TM operations and malware behaviour, also addressing the complexities of representing malicious action sequences. Furthermore, it explores how Turing Machines can represent the sequence of operations that characterize malware behaviour, distinguishing these from benign behaviours and discussing the theoretical challenges in modelling these distinctions.
The computation of a TM is a sequence of configurations, each defined by the current state, the tape contents, and the head position. The computation of a malware, however, in a realistic computing machine is a sequence of CPU instructions. In what follows, we provide the details of modelling malware actions as sequences of states and transitions between successive states and then discuss how to relate this theoretical model of malware with realistic computing machines.
A malware, like any computer program, is a computational process consisting of sequences of instructions designed to perform specific (often malicious) tasks. The malicious nature of a malware typically arises from the sequence of operations, rather than the operations themselves, many of which are benign in isolation. Examples of such operations are the following:
Scanning filesystems (e.g., SCAN or OPEN FILE);
Encrypting files (e.g., ENCRYPT);
Deleting files (e.g., DELETE);
Extracting data from a file and sending it over the Internet (e.g., EXTRACT-AND-TRANSMIT).
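The point that maliciousness resides in the ordering rather than in the individual operations can be sketched as a simple trace check. In the following Python toy, the operation names follow the examples above, but the specific "ransomware" pattern chosen here is our assumption:

```python
# Each primitive operation is benign in isolation; only a particular
# ordering is flagged. The pattern below is illustrative.

BENIGN_OPS = {"SCAN", "OPEN_FILE", "ENCRYPT", "DELETE", "EXTRACT_AND_TRANSMIT"}
RANSOMWARE_PATTERN = ["SCAN", "ENCRYPT", "DELETE"]

def contains_pattern(trace, pattern):
    """True iff `pattern` occurs in `trace` as a (not necessarily
    contiguous) subsequence, i.e., the operations fire in that order."""
    it = iter(trace)
    return all(op in it for op in pattern)

backup_trace = ["SCAN", "OPEN_FILE", "ENCRYPT", "EXTRACT_AND_TRANSMIT"]
ransom_trace = ["SCAN", "OPEN_FILE", "ENCRYPT", "DELETE"]
```

Both traces use only operations from the same benign vocabulary, yet only `ransom_trace` matches the malicious ordering.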
In order to cast malicious actions into the Turing Machine formalism we need to map these actions to Turing Machine operations. This can be accomplished as follows:
Tape Symbols: Represent entities such as files, data blocks, or network packets.
States: Model the different stages of malware execution (e.g., SCAN, ENCRYPT, DELETE).
Transition Function: Encodes the logic of malware execution, specifying how the system transitions between stages based on input.
The malicious character of a malware is manifested by specific sequences or patterns of states, such as, for instance, the following:

SCAN → ENCRYPT → DELETE

This sequence of malicious actions can be modelled by transitions between TM states:

q_SCAN → q_ENCRYPT → q_DELETE
However, a key distinction between Turing Machines and modern (stored program) computers lies in how they manage operations and states:
Turing Machine States: Represent conditions within the computation, guiding transitions based on the current symbol.
Computers execute CPU instructions: The instructions correspond to discrete operations, such as arithmetic or data movement, typically executed in sequence.
While TM states are abstract and encode computational logic, CPU instructions are explicit commands intended for direct execution. These two fundamental concepts can be associated as follows:
A TM state may simulate a single or multiple CPU instructions by encoding their logic in its transitions.
A CPU’s program counter can be viewed as analogous to a TM’s state, determining the next operation to execute.
Thus, in order to simulate a CPU instruction set with a Turing Machine:
- 1.
Define the CPU’s instruction set as a finite set of symbols.
- 2.
Map each instruction to a sequence of TM states and transitions.
- 3.
Use the tape to represent memory, with symbols encoding registers and data.
For example, consider a standard CPU instruction, such as “ADD R1, R2”. When implemented by a Turing Machine, this instruction may be transformed into a set of transitions and changes on the tape of the following form:

q_fetch → q_add → q_store

This sequence of Turing Machine operations reflects the fact that an “ADD” instruction normally consists of three basic steps: (i) fetch the operands, (ii) compute the sum, and (iii) store the sum. That is, this sequence of Turing Machine transitions simulates what “ADD” does: it fetches data from registers R1 and R2, computes the sum, and writes the result back to memory (i.e., the tape).
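As an illustration of how such an addition can actually be carried out on a tape, the following Python sketch (our construction, not taken from the paper) simulates a tiny Turing Machine that adds two unary numbers, mirroring the fetch/compute/store decomposition:

```python
# A Turing Machine that adds two unary numbers: the tape "11+111"
# (2 and 3 in unary) becomes "11111" (5 in unary).

RULES = {  # (state, symbol) -> (new_state, write, move)
    ("fetch", "1"): ("fetch", "1", +1),   # scan over the first operand
    ("fetch", "+"): ("join",  "1", +1),   # turn the separator into a 1
    ("join",  "1"): ("join",  "1", +1),   # scan over the second operand
    ("join",  "_"): ("trim",  "_", -1),   # hit the blank: go back one cell
    ("trim",  "1"): ("halt",  "_", 0),    # erase the surplus 1 and halt
}

def run(tape_str):
    tape = dict(enumerate(tape_str))      # sparse tape, "_" is blank
    state, head = "fetch", 0
    while state != "halt":
        new_state, write, move = RULES[(state, tape.get(head, "_"))]
        tape[head] = write
        state, head = new_state, head + move
    cells = "".join(tape.get(i, "_") for i in range(len(tape_str)))
    return cells.strip("_")
```

The three states play the roles of fetch (scan the operands), compute (merge the two blocks of 1s), and store (clean up the surplus symbol and leave the result on the tape).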
Consequently, simulating different computer (CPU) architectures can be accomplished by encoding their instruction sets and control logic:
Define a universal TM that interprets CPU-specific instruction sets stored on its tape.
Encode different CPU architectures as specific configurations or subroutines within the TM.
Use a lookup table on the tape to simulate varying instruction formats (e.g., x86, ARM).
This approach highlights the universality of TMs and their ability to emulate any computational process, including the behaviours of modern CPUs.
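The lookup-table idea can be sketched as a single interpreter parameterized by an instruction table, so that different (toy) architectures are just different tables. All opcode names and calling conventions below are invented for illustration:

```python
# One interpreter, many (toy) instruction sets: the table maps opcodes
# to their semantics, playing the role of the lookup table on the tape.

def make_machine(table):
    def execute(program, registers):
        for opcode, *args in program:
            table[opcode](registers, *args)
        return registers
    return execute

TOY_ISA_A = {   # "x86-like" two-operand style: dst += src
    "ADD": lambda r, d, s: r.__setitem__(d, r[d] + r[s]),
    "MOV": lambda r, d, v: r.__setitem__(d, v),
}
TOY_ISA_B = {   # "ARM-like" three-operand style: dst = a + b
    "ADD": lambda r, d, a, b: r.__setitem__(d, r[a] + r[b]),
    "MOV": lambda r, d, v: r.__setitem__(d, v),
}

run_a = make_machine(TOY_ISA_A)
run_b = make_machine(TOY_ISA_B)
```

Swapping the table changes the simulated architecture while the interpreter, like the universal TM, stays fixed.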
5.1. The Sequence Traversal Problem and Its Equivalence with the Halting Problem
As we discussed above, the main motivation behind the approach of modeling malware behaviour as a sequence of Turing Machine states and transitions between them was that the operations performed by malware are indistinguishable from benign processes when considered in isolation. For instance, scanning filesystems (
SCAN) and encrypting files (
ENCRYPT) can be part of both legitimate software (e.g., antivirus software or secure backups) and malware (e.g., ransomware). Accordingly, a legitimate backup tool might execute the following sequence:

SCAN → ENCRYPT → TRANSMIT

while a ransomware might execute the following sequence:

SCAN → ENCRYPT → DELETE
Consequently, as we focus our attention on sequences of states as malicious actions, we will need an undecidability result related to sequences of states and not only the states themselves. To this end, we define the Sequence Traversal Problem for Turing Machines.
Sequence Traversal Problem
Input: A Turing Machine M with a transition function δ, an input w, and a specific sequence of states q₁, q₂, …, q_k with specified transitions δ(q_i, a) = (q_{i+1}, b, D), where a is the input symbol, b is the output symbol, and D ∈ {L, R} is the direction of head movement.
Output: Does M traverse the sequence q₁, q₂, …, q_k in order, applying the specified transitions, during its operation on w?
We prove that this problem is equivalent to the Halting Problem (and, thus, that it is undecidable) via the following two reductions:
5.1.1. Reduction of the Halting Problem to the Sequence Traversal Problem
Theorem 5. The Halting Problem reduces to the Sequence Traversal Problem.
Proof. Given a Turing Machine M and an input w, we construct a new Turing Machine M′ as follows:
- 1.
M′ first simulates M on w.
- 2.
If M halts on w, M′ transitions through a predefined sequence of states q₁, q₂, …, q_k with specific transitions.
- 3.
If M does not halt, M′ never enters or traverses the sequence.
It holds that M′ traverses the sequence with the specified transitions if and only if M halts on w. □
5.1.2. Reduction of the Sequence Traversal Problem to the Halting Problem
Theorem 6. The Sequence Traversal Problem reduces to the Halting Problem.
Proof. We are given a Turing Machine M, an input w, and a specific sequence of states q₁, q₂, …, q_k with specified transitions. We construct a new Turing Machine M′ as follows:
- 1.
M′ simulates M on w.
- 2.
During the simulation, M′ checks whether M begins traversing the sequence with the specified transitions.
- 3.
If M does not traverse the sequence, M′ enters an infinite loop.
- 4.
If M completes the sequence with the specified transitions, M′ halts.
We observe that M′ halts if and only if M traverses the sequence with the specified transitions. □
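The construction of Theorem 5 can be made concrete in a step-bounded toy (bounded, since genuine halting detection is impossible): a wrapper machine replays the run of M and, only if M halts, walks through a marker sequence, so traversal detection would decide halting. A Python sketch with invented names:

```python
# Step-bounded toy of the Theorem 5 construction. A "machine run" is a
# generator of states; if it finishes within the budget, the wrapper
# appends the marker sequence of states.

MARKER = ["s1", "s2", "s3"]

def m_prime(machine_run, budget=100):
    """Yield the states of the wrapper M'."""
    steps = 0
    for state in machine_run:
        yield state
        steps += 1
        if steps >= budget:
            return                  # M still running: marker never reached
    yield from MARKER               # M halted: traverse the marker states

def halting_machine():
    yield "q0"; yield "q1"          # halts after two steps

def looping_machine():
    while True:
        yield "q_loop"

def traverses(states, seq):
    it = iter(states)
    return all(s in it for s in seq)
```

Within the budget, the wrapper traverses the marker sequence exactly when the simulated machine halts, which is the equivalence the reduction relies on.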
In the next sections, we will derive all the results based on the archetypal undecidable problem, the Halting Problem, for convenience and simplicity. However, based on the discussion above, they are readily transferable to the more realistic Sequence Traversal Problem.
6. Theoretical Impossibility of a Complete Formal Malware/Non-Malware Program Classification
Modelling malware as a Turing Machine highlights the computational nature of malicious behaviours, focusing on sequences of simple operations that correspond to particular states of a TM rather than on each individual operation, as we have already discussed in
Section 5. A sequence of simple benign operations/states that either damages a system or harms its users is considered a malicious action. Based on this concept, we can form composite states as ordered sets of certain simple states. A composite state can be either benign or malicious depending on its effect (system damage or user harm) after its appearance. This approach effectively abstracts malware into fundamental computational constructs, i.e., Turing Machines, allowing for a deep and systematic analysis.
Based on the above discussion, we give a simple formal definition of malware that generalizes Cohen’s ideas about viruses. In doing so, we extend in a straightforward way the standard Turing Machine model in order to model malware entities.
Definition 10 (TM Model for a malware entity)
. A Turing Machine for a malware is a septuple, defined as M = (Q, q₀, Q_m, F, Σ, Γ, δ), where Q is a finite set of states, q₀ ∈ Q is a distinguished state called the start state, Q_m ⊂ Q is a distinguished set of states linked to malware actions, F ⊂ Q is a set of final states with Q_m ∩ F = ∅, Σ is a finite set of symbols called the input alphabet, Γ is a finite set of symbols called the tape alphabet (actually Σ along with some special symbols like the space, i.e., Σ ⊂ Γ), and δ is a partial function from Q × Γ to Q × Γ × {L, R} called the transition function.
Actually, Q_m corresponds to the set of the malicious composite states discussed above that, when executed, manifest malware behaviour. We assume that transitions from states in Q_m do not change the Turing Machine’s tape contents, i.e., they are purely interactions with the external environment of the Turing Machine and can affect only the environment. As the external environment of a TM we consider the environment where the TM operates (it could be any other harmless TM under any known OS). Actually, we consider that a malware TM that does not modify its tape may evade detection more easily since it is not easy to trace its internal operation. For the same reason, we also consider that q₀ ∉ Q_m and F ∩ Q_m = ∅, i.e., a malware TM always starts from a harmless state and halts harmlessly in order to evade its detection.
Thus, based on the proposed formal malware model, malware actions are defined as a set of states considered threatening or harmful according to the state of the art (at each point in time) in malware technology and malware evolution, while anti-malware actions are encompassed in the definition of the malware recognition problem. In this context, an anti-malware system’s goal is the solution of the malware recognition problem, i.e., deciding whether a given Turing Machine’s description corresponds to a malware in the sense that the given Turing Machine will, eventually, reach one of the malware states.
Based on the discussion above, malware behaviour is formally manifested by the execution (not simply the existence in the Turing Machine’s description) of a specific sequence of actions (e.g., publishing secret information about an entity, downloading information illegally, etc.) that are reflected by reaching, during its operation, states in the set Q_m. We stress the word “execution” in order to preclude situations where a false alarm is raised for a “malware” program which merely contains the states in Q_m without ever invoking them. Such programs actually operate normally without ever executing any actions characteristic of malware behaviour.
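The “execution, not mere presence” requirement can be sketched as follows: a machine’s transition table may mention states of Q_m, yet a run is flagged only if it actually enters one. A Python toy with illustrative names:

```python
# A machine is flagged as malware only if a run actually visits a state
# in Q_m, not merely because Q_m states appear in its transition table.

Q_M = {"q_exfiltrate", "q_encrypt_all"}

def visited_states(transitions, start, tape_symbols):
    """Run a toy state machine over the input and collect visited states."""
    state, seen = start, {start}
    for sym in tape_symbols:
        state = transitions.get((state, sym), state)
        seen.add(state)
    return seen

def flags_as_malware(transitions, start, tape_symbols):
    return bool(visited_states(transitions, start, tape_symbols) & Q_M)

# q_exfiltrate appears in the table but is reachable only on symbol "x"
TABLE = {("q0", "a"): "q1", ("q0", "x"): "q_exfiltrate"}
```

On input "aa" the machine never leaves its benign states and raises no alarm, although `q_exfiltrate` is present in its description; on input "x" the malicious state is actually entered.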
The Malware Detection Problem
Input: A description of a Turing Machine (program).
Output: If the input Turing Machine behaves like malware according to Definition 10, output True. Otherwise, output False.
More formally, if we let the malware language denote the set of Turing Machine encodings which are malware according to Definition 10, then we want to decide this language, i.e., to design a Turing Machine that, given an encoding, decides whether it belongs to the malware language or not according to this definition. We will show that the language of the Halting Problem is recursive in the malware language. This implies that if we had a decision procedure for the malware language, then this procedure could also be used for deciding the Halting Problem, a problem (language) that is undecidable. Thus, no decision procedure exists for the malware language, either. More formally, the following can be proved in a similar context (for the proof, please see [6,34]):
Theorem 7 (Theoretical impossibility of malware detection). The language of malware Turing Machine encodings is undecidable.
7. Construction of Potentially Malware Programs Which Evade Formal Characterization
We now turn to actually constructing a particular Turing Machine which cannot be classified as malware or non-malware by purely formal procedures within any given consistent formal system.
Theorem 8 (malware/non-malware classification-resistant programs). Let a consistent formal system be given. Then we can construct a Turing Machine for which there is no proof in the system that it behaves as malware and no proof that it does not behave as malware.
Proof. Let the middle part be a Turing Machine whose halting status on any given input j cannot be proven within the formal system in either direction, i.e., “halts” or “does not halt”. Such a Turing Machine exists by Theorem 3 and is of the kind constructed there. A new Turing Machine of one input, together with the set of malware states, is constructed. It is composed of three parts: the first part is a non-malware Turing Machine; the second part is the halting-unprovable machine just described; and the third part is a malware Turing Machine.
The construction details of these Turing Machines are not hard, but they are tedious, so we will provide a rather high-level description. The second part is the Turing Machine implied by Theorem 3 (see [32,35] for the construction details). The non-malware part can be any Turing Machine that simply does not use any of the malware states, e.g., a Turing Machine that computes a simple arithmetic function. Finally, the malware part executes, during its operation, at least one of the malware states. It is not hard to construct such a Turing Machine; e.g., it can be a Turing Machine that, immediately after leaving the start state, executes one more step involving a malware state before halting (i.e., reaching a final state).
With respect to its operation, the composite Turing Machine first activates the first part, which may ignore the input, say j, and operates with its non-malware behaviour, i.e., it never visits malware states during its operation. Then, the second part is activated with input j; by construction, it does not use malware states either. Finally, the third part starts operating, exhibiting malware behaviour by visiting at least one malware state during its operation.
Suppose now that a proof exists in the formal system that the composite Turing Machine is malware. By its construction, the only way for it to demonstrate malware behaviour is to activate its third part, the malware part. This, in turn, can occur only if the second part halted on input j. Thus, the same proof that the composite machine is malware also serves as a proof that the second part halts on input j.
Suppose, on the other hand, that a proof exists in the formal system that the composite Turing Machine is not malware. By its construction, this can happen only if the malware part is never activated during its operation. In turn, this can happen only if the second part does not halt on input j. Thus, again, the same proof that the composite machine is not malware also serves as a proof that the second part does not halt on input j. □
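The three-part control flow of this construction can be sketched in a few lines of Python. This is an illustration of ours with placeholder names; in particular, the middle part is replaced by a trivially halting stub, since the whole point of the real construction is that the middle machine's halting status is unprovable in the formal system and therefore cannot be exhibited here.

```python
MALWARE_STATES_VISITED = []          # models "reaching a state in the malware set"

def benign_part(j):
    """Part 1: harmless arithmetic; uses no flagged states."""
    return j * j

def placeholder_middle(j):
    """Part 2: stands in for the Theorem 3 machine whose halting status
    on input j is unprovable in the formal system. Stub: halts at once."""
    return None

def malware_part():
    """Part 3: models executing a malware state."""
    MALWARE_STATES_VISITED.append('s_m')

def composite(j):
    benign_part(j)           # provably harmless prefix
    placeholder_middle(j)    # gate with unprovable halting status
    malware_part()           # reached if and only if the gate halts
```

The reduction is visible in the control flow: any argument that `malware_part` runs must establish that `placeholder_middle` returned, i.e., that the gate machine halted on j, which is exactly the unprovable fact.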
From Theorem 8 we have the following corollary:
Corollary 2. Given any formal system , there exist infinitely many effectively constructible Turing Machines for which there is no proof in that they behave as malware and no proof that they do not behave so.
Proof. Observe that, in the effective, i.e., algorithmic, construction process described in Theorem 8, the non-malware part can be any of the countably infinitely many Turing Machines that simply avoid the malware states, and the malware part can be any of the countably infinitely many Turing Machines that visit at least one malware state during their operation. The middle part stays fixed (it depends only on the formal system). □
Finally, in the same spirit as Corollary 1, we prove the following about the Turing Machine of Theorem 8, which actually shows that it, as well as the infinitely many Turing Machines built around it in Corollary 2, is not malware, although no proof of this fact exists within the formal system if the latter is consistent.
Corollary 3. For the Turing Machine constructed in Theorem 8, it holds that it is not malware if and only if the formal system is consistent.
Proof. The proof is essentially the same as the proof of Corollary 1, since the composite machine contains the Theorem 3 machine and is constructed in such a way that it is not malware if and only if that machine does not halt on the particular input j. □
8. Construction of Guaranteed Malware Programs That Evade Formal Characterization
Corollary 3 shows that the Turing Machine of Theorem 8 cannot be malware if the formal system is consistent, a condition which is certain if a formal consistency proof exists, or very likely if the system has been carefully designed and widely tested in various contexts. If a consistency proof exists for the system, then the machine is actually harmless, although no proof exists within the system that it is non-malware.
In this section, we show that, given a formal system, it is possible to construct an infinite, recursively enumerable set of Turing Machines that are actually malware, regardless of the consistency of the system, but this is impossible to prove within it if it is consistent (otherwise, anything can be trivially proved). We begin with the following theorem, given in [35]:
Theorem 9. Given a formal system, there exists an algorithm, or Turing Machine, which can be explicitly given and whose running time is quadratic in its input length, but there is no proof in the formal system that it runs in quadratic time.
Based on this result, we can prove the following in a way analogous to the proof of Theorem 8.
Theorem 10 (malware/non-malware classification-resistant programs). Let a consistent formal system be given. Then we can construct a Turing Machine which will always exhibit malware behaviour and for which there is no proof within the system that it indeed behaves as malware (i.e., the formal methods of the system cannot detect this malware Turing Machine or program).
Proof. The proof follows the same idea as the proof of Theorem 8. Let the first part be a Turing Machine with quadratic running time for which there is no proof in the formal system that it runs in quadratic time, as stated in Theorem 9. A new Turing Machine of one input, together with the set of malware states, is composed of three parts: the first part is the Theorem 9 machine, the second part is a non-malware Turing Machine, and the third part is a malware Turing Machine. The construction details of the latter two are, again, not hard to state, but they are tedious and do not contribute to the understanding of the ideas of the proof; thus, we refer to the proof of Theorem 8 for a high-level description of these two Turing Machines. The operation details and properties of the Theorem 9 machine can be found in [35], but the ideas on which it is based are similar to the ideas behind Theorems 2 and 3.
With respect to its operation, the composite Turing Machine initially activates the first part, which starts operating on its input string, say j, according to its own transition function. By construction, it never visits malware states during its operation, i.e., it does not exhibit malware behaviour. Moreover, it is not hard to include in the composite machine certain transitions that implement a “clock”, or counter, which counts the number of steps, i.e., the computation time, of the first part.
After the first part stops, the composite machine checks the value of the step counter. If the value equals the quadratic bound guaranteed by Theorem 9, then the composite machine activates the malware part; otherwise, it activates the non-malware part.
Suppose now that a proof exists within the formal system that the composite Turing Machine is malware, i.e., that it exhibits malware behaviour, when activated, by visiting malware states. By its construction, it can exhibit malware behaviour only by activating its third part, which is a malware Turing Machine. This, in turn, can occur only if the first part halted on input j within the quadratic bound, which is always true by its construction (see Theorem 9). Thus, if a proof existed in the formal system that the composite machine is malware, the same proof could be used to show that the first part halts within the quadratic bound on input j. Such a proof does not exist, according to Theorem 9. Therefore, no proof exists in the formal system that the composite machine is malware, although it actually is a malware program. □
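The clock-and-dispatch mechanism of this proof can be sketched as follows. This is our illustration with assumed names; the counting loop stands in for the Theorem 9 machine, which really does run in quadratic time even though, within the formal system, no proof of that fact exists.

```python
C = 2                                     # illustrative constant for the c*n^2 bound

def clocked_quadratic(j):
    """Placeholder for the Theorem 9 machine: runs exactly C*len(j)**2 steps."""
    steps = 0
    for _ in range(C * len(j) ** 2):
        steps += 1                        # each pass models one TM step
    return steps

def dispatch(j):
    """Composite machine: clock the first part, then branch on the count."""
    n = len(j)
    steps = clocked_quadratic(j)
    if steps == C * n * n:                # always true, by construction
        return 'malware'                  # activate the malware part
    return 'benign'                       # activate the non-malware part (dead branch)
```

Since the equality check always succeeds, `dispatch` always takes the malware branch; yet a formal proof of that would amount to a proof of the quadratic running time, which Theorem 9 rules out.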
Note that if the formal system is consistent, there can be no proof in it that the composite machine is non-malware either, since this would imply that the first part exceeded the quadratic bound, which is not true (see Theorem 9). Again, similarly to Corollary 2 resulting from Theorem 8, we have the following corollary as an immediate consequence of Theorem 10:
Corollary 4. To any formal system there correspond infinitely many effectively constructible malware Turing Machines for which there is no proof in the system that they are, indeed, malware.
Proof. Again, in the construction process of the malware Turing Machine described in Theorem 10, the non-malware part can be any of the countably infinitely many Turing Machines that avoid all the malware states, and the malware part can be any of the countably infinitely many Turing Machines that visit at least one malware state during their operation. The Theorem 9 machine stays fixed and depends only on the specifics of the formal system. □
9. Generalization and Practical Constructions: Obfuscator Turing Machines
We observe that Theorems 8 and 10, which gave the construction of the Turing Machines whose malware status cannot be proved within a given formal system , share the same idea: (i) first, a specific Turing Machine M is constructed that has a particular property which is impossible to prove within , and (ii) then, another Turing Machine is constructed, incorporating the workings of M, such that any proof within that it acts as a malware or non-malware translates directly to a proof for the (unprovable in ) property of M.
Thus, the Turing Machines of Theorems 3 and 9, which possess certain properties unprovable in a given formal system, led in Theorems 8 and 10, respectively, to the construction of Turing Machines whose malware status is unprovable in that system. In some sense, given any formal system, we can effectively construct Turing Machines which can act as obfuscators for constructing malware Turing Machine programs that are undetectable by the formal system, much like stealth malware programs behave by “hiding” their behaviour from anti-malware software.
Thus, we propose the following definition in the context of malware detection based on formal systems and their proof procedures:
Definition 11. Given a formal system, an obfuscator Turing Machine, or simply obfuscator, of the system is a Turing Machine possessing a property that cannot be proved within the system.
Examples of obfuscators are the Turing Machines of Theorems 3 and 9 that we discussed in this paper. As we suggested, obfuscators can be used to construct malware Turing Machines which are undetectable by anti-malware Turing Machines (programs) whose malware detection procedures are based on the axioms and deductive methods of a particular formal system.
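The two-step recipe shared by Theorems 8 and 10 can be captured by a generic wrapper, sketched below. The names `wrap`, `obfuscator`, `benign`, and `payload` are hypothetical, and the obfuscator is simplified to a boolean gate (in the actual constructions the gate is the halting or running-time behaviour of the machine itself): whichever branch the wrapper takes, a proof about its malware status would yield a proof of the system-unprovable property of the gate.

```python
from typing import Callable

def wrap(obfuscator: Callable[[str], bool],
         benign: Callable[[], None],
         payload: Callable[[], None]) -> Callable[[str], None]:
    """Return a program whose observable behaviour is governed by the obfuscator:
    the gate's truth value has no proof either way in the target formal system."""
    def wrapped(j: str) -> None:
        if obfuscator(j):
            payload()     # malware behaviour: visit a flagged state
        else:
            benign()      # harmless behaviour: avoid flagged states
    return wrapped

# Demo with a trivially decidable stand-in gate (odd input length):
log = []
demo = wrap(obfuscator=lambda j: len(j) % 2 == 1,
            benign=lambda: log.append('benign'),
            payload=lambda: log.append('payload'))
demo('101')               # odd length, so the payload branch is taken
```

In the demo the gate is, of course, provable; the point of the pattern is that substituting a genuinely unprovable gate transfers the unprovability to the wrapper's malware status.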
With respect to practical constructions of obfuscators, one can actually construct the Turing Machines of Theorems 3 and 9, as long as one has the description of the target formal system. Although the process is tedious, it is worth undertaking. However, there already exists a concrete obfuscator construction in the literature which can give us an idea of the structure, operation, and size of obfuscator Turing Machines. In [36], a complete description of a Turing Machine with 7910 states and a two-symbol alphabet is provided (with its full transition function), whose behaviour is independent of Zermelo–Fraenkel set theory with the Axiom of Choice (ZFC). More specifically, this Turing Machine has the property of never halting, much like the Turing Machine of [35], but this property is impossible to prove within ZFC. Thus, this Turing Machine is an obfuscator for ZFC. In the GitHub repository [37], there are construction details for a Turing Machine with fewer states, namely 1919, that essentially leads to the same results as the “lengthier” 7910-state Turing Machine.
10. Conclusions
This paper investigated the feasibility of a complete, in principle, classification of all programs as either malware or non-malware, under a plausible definition, based on the deductive power of formal systems. Extending Cohen’s seminal work, we demonstrated that there is no algorithm capable of categorizing all programs as malware or non-malware in general. Thus, the malware identification problem is undecidable under the malware definition we proposed. Moreover, we demonstrated that, to each given formal system, there corresponds an infinite, recursively enumerable set of Turing Machines which cannot be categorized as malware or non-malware within that system. The specifics of these Turing Machines depend on the details of the formal system, but the important fact that we have proven is that, given the formal system details, this infinity of Turing Machines can be effectively enumerated. Based on Theorem 8 and Corollary 2, it can be deduced that, in theory, it is impossible to formally classify this infinite number of programs as malware or not, regardless of the formal system employed for this purpose. This suggests that there is an infinite number of potentially malware programs that cannot be proven to be such using any formal system, regardless of its expressiveness and capability. Furthermore, an algorithm exists to systematically enumerate the members of this set. This could potentially enable malicious entities to create programs whose malware status is undetectable through formal methods alone, i.e., algorithmically. All that is required is a basic understanding of the formal system utilized to categorize programs as malware or non-malware.
Furthermore, as demonstrated in Corollary 3, it can be concluded that all of these programs are actually non-malware, unless the formal system utilized for the classification task is inconsistent. Therefore, despite the innocuous nature of these programs, they cannot be classified as such using any formal system. It is plausible that the Turing Machines constructed through the use of Corollary 2 are malware programs, given also the conclusions from Corollaries 1 and 3, if the formal system is, in fact, inconsistent. Therefore, the perplexing (and alarming) situation could be the following: does the program being evaluated, based on a formal system’s deductive power, truly qualify as non-malware, as Corollary 3 certifies, or is the formal system actually inconsistent, rendering Corollary 3’s guarantees invalid? We should stress the point that establishing the consistency of a given formal system is an extremely challenging problem. For instance, there have been cases of previously proposed formal systems (e.g., ML, as documented in [38]) that were subsequently proven to be inconsistent (Rosser proved, rather surprisingly, that ML is actually an inconsistent formal system). Moreover, there are formal systems that are widely employed in mathematics today whose consistency status remains unknown (e.g., Zermelo–Fraenkel set theory with the Axiom of Choice (ZFC)).
Assuming, now, that the formal system is known to be consistent, the situation changes drastically. Now, we are able to construct an infinity of malware programs which cannot be classified as such by the deductive power of the given formal system. In other words, we can construct an infinity of actually malware programs that evade capture by methods based on the formal system, i.e., no proof exists in the system that any of these programs is, indeed, malware and, thus, harmful. Based on Theorem 10 and Corollary 4 and the particular Turing Machine from Theorem 9, we have at our disposal a construction algorithm for infinitely many Turing Machines that are malware and for which no formal proof exists in the formal system that they are malware programs. Although Theorem 8 and Corollary 2, as discussed above, lead to a similar result, there is an important difference. The “prototype” Turing Machine of Theorem 8 does not exhibit malware behaviour, while the “prototype” Turing Machine of Theorem 10 does exhibit such behaviour and, thus, is indeed dangerous as a malware program. Nevertheless, no guarantee, i.e., no formal proof, exists in the formal system about their behaviour either as non-malware or malware programs, respectively.
11. Future Work
As goals for our future research, the theoretical decidability of the Malware Detection Problem can be investigated under other, more complex definitions of malware behaviour based, for instance, on the actions of the program (i.e., properties of sequences of specific computational steps) or the languages that malware Turing Machines accept (this latter approach is pursued further in [34] in the case of Panopticon Turing Machines). Our view is that the study of the theoretical decidability of detecting any computational entity (such as malware) can benefit from formally defining the targeted entities’ behaviour using a computational formalism (e.g., Turing Machines). In this way, the deep findings of computability theory can help derive interesting results about the difficulty, in principle, of detecting such entities. We hope that our work can contribute to pursuing this line of research further.
Last, we cite the abstract of Ken Thompson’s excellent Turing Award lecture (see [39]), which summarizes in the best possible way the conclusion of our work, i.e., that no fully automated or algorithmic solution can provide a complete characterization of all possible programs as malware or non-malware and, thus, become the “perfect” anti-malware application: To what extent should one trust a statement that a program is free of Trojan horses? Perhaps it is more important to trust the people who wrote the software.