Coinductive Natural Semantics for Compiler Verification in Coq

(Coinductive) natural semantics is presented as a unifying framework for the verification of total correctness of compilers in Coq, with the feature that a verified compiler can be obtained as a final product. In this way, we have a simple, easy, and intuitive framework in which to carry out the verification of a compiler using a proof assistant, covering both terminating and non-terminating computations (total correctness).


Introduction
This paper tackles the problem of compiler verification in proof assistants. At present, a number of long-term projects deal with several aspects of this issue, for instance, CompCert C [1][2][3][4][5], CertiCoq [6], and IRIS [7]. In this work, we specifically address the verification of total correctness of compilers of functional languages in Coq. We use total correctness in the sense of Leroy [8] and Gregoire and Leroy [9]: correctness of both terminating and non-terminating computations.
In the literature, ad-hoc verifications are traditionally used, that is, verifications that employ more than one distinct formalism.
This situation calls for a solution that abstracts away everything needed into a single unifying framework that simplifies (and perhaps could even make it possible to automate) this task. By unifying framework we mean a single formalism able to define each of the components of a compiler, namely: source language semantics, intermediate language semantics, abstract machine semantics, and translations. In fact, in this work we offer (coinductive) natural semantics as a simple, easy, and intuitive unifying framework for carrying out (total correctness) compiler verification in Coq. Whenever we use 'framework' to refer to the (coinductive) natural semantics unifying framework as presented in this work (and only to it), we mean 'unifying framework'. We thus remark that a single formalism, (coinductive) natural semantics, is sufficient to conduct this task, as opposed to the usual verifications in the literature, where more than one distinct formalism is needed to accomplish the same goal (see Section 1.1 for a discussion of the related work). Traditionally, in the literature, small-step semantics is used in the machine. That is why, as an alternative target machine, we offer the original small-step Modern SECD machine (Sections 2.2 and 3.2), extended to support all Mini-ML features, in particular with native recursion support, mainly to compare it with our solution, in which a big-step machine is used, i.e., the big-step version of our Modern SECD machine.

Related Work
Since the inception of the CompCert C project [1][2][3][4][5], led by Leroy, there has been great progress in the literature dedicated to compiler verification using proof assistants, Coq in particular. In this work, we specifically address the verification of functional programming languages; in particular, the verification of compilers from functional programming languages to abstract machines.
An unusual technique, presented by Hardin et al. [16], to carry out the verification of a compiler from a functional language to an abstract machine is to use small-step semantics both in the source language and in the abstract machine, together with a decompilation function and a measure to establish correctness. The idea of this technique is to perform a bottom-up simulation in which every machine transition corresponds to zero or one source-level reductions. The machine states are mapped back to source-level expressions using a decompilation function. More precisely, if from a machine state s a state s′ is reached via a machine transition s → s′, and e is the source language expression corresponding to the state s via decompilation, then there exists an expression e′ corresponding to s′ via decompilation such that e = e′ or e reduces to e′ via source language small-step semantics, e → e′. When the machine performs a transition from one state to another and the decompilation of both states corresponds to the same expression in the source language, the machine performs a silent transition. To guarantee that there are not infinitely many silent machine transitions, a measure defined on the machine states is used; i.e., if s → s′ and the decompilation of s and s′ corresponds to the same expression e, then the measure of s is greater than that of s′.
Gregoire and Leroy [9] and Gregoire [17] use this technique to verify a compiler from a strong-reduction lambda calculus to an abstract machine in Coq; more precisely, to verify the correctness of a compiler from the Calculus of Inductive Constructions (CIC) to a variant of the ZAM machine [18] (adapted to support weak symbolic reduction), obtaining a verified compiler-based implementation to evaluate Coq terms. In addition, they show that this compiler-based implementation is, as expected, more efficient than the original Coq interpreter. More recently, Kunze et al. [19] employ a very similar technique to verify the correctness of a compiler from a call-by-value lambda calculus to an abstract machine in Coq.
However, Leroy [8,14] and Leroy and Grall [15] point out that a correctness proof using this technique is difficult, and also that the definition of a decompilation is complicated, hard to reason about, and hard to extend (especially for optimizing compilation phases). In consequence, they propose a solution based on big-step semantics. In fact, they state that proving semantic preservation for compilers both for terminating and diverging programs using big-step semantics is the original motivation of their work.
The technique of Leroy [14] and Leroy and Grall [15] consists of using (coinductive) big-step semantics in the source language but small-step semantics in the machine. In this way, for the termination case, if a source language expression e evaluates to v via big-step semantics, e ⇒ v, then reducing the machine code c via the transitive closure →+ of the small-step semantics takes the machine to a state with v_m at the top of the stack, where c corresponds to the compilation of e and v_m is the machine value corresponding to v. For the non-termination case, if e diverges under the coinductive big-step semantics, then c also diverges in the machine. Leroy and Grall mention that their technique provides a simpler way to prove semantic preservation, in particular for the non-termination case. Currently, it is well known [14,15,20,21] that big-step semantics is easier and more convenient for compiler correctness proofs, and also for efficient interpreters [21]. Thus, we have, on one hand, that Leroy and Grall's main motivation is to use big-step semantics for compiler correctness proofs and, on the other, that big-step semantics has proved to be easier and more convenient for such proofs. Our aim is to take big-step semantics to its deepest consequences, exploiting it where it has proved to be useful. This is why we propose (coinductive) natural semantics as a framework for compiler verification.
(Coinductive) natural semantics as a framework for compiler verification in Coq, as proposed in this paper, is a technique very similar in spirit to that of Leroy and Grall, but it goes further: (coinductive) big-step semantics is used not only in the source language but also in the target machine (recall that Leroy and Grall employ small-step semantics in the machine). Furthermore, to the best of the authors' knowledge, this is the first time that coinductive natural semantics is proposed and used to define non-terminating computations in an abstract machine. In this way, we obtain a technique based entirely on (coinductive) natural semantics for the correctness verification in Coq of a compiler from a functional language to an abstract machine.
Establishing correctness is even easier and more intuitive, since natural semantics is also used in the machine. If a source language expression e evaluates to a value v via the source language natural semantics, e ⇒ v, then c evaluates to a final machine state with v_m at the top of the stack via the machine natural semantics, s ⊢ c ⇒ v_m ⋅ s, where c is the compilation of e via natural semantics, e ⇓ c, v_m is the compilation of v via natural semantics, v ⇓ v_m, and s is any machine stack. If e diverges via coinductive natural semantics, e ⇒∞, then c also diverges via the machine coinductive natural semantics, s ⊢ c ⇒∞.
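The shape of these correctness statements can be rendered as a Coq sketch. Every name below (expr, eval, exec, push, and so on) is an illustrative assumption abstracted as a Parameter, not the formalization itself:

```coq
(* Hedged sketch of the total-correctness statements; all names are
   illustrative parameters, not the definitions of the formalization. *)
Parameter expr val code mval stack : Set.

Parameter eval        : expr -> val -> Prop.        (* e  => v            *)
Parameter diverge     : expr -> Prop.               (* e  =>inf, coinductive *)
Parameter compile     : expr -> code -> Prop.       (* e  |_ c (translation) *)
Parameter compile_val : val -> mval -> Prop.        (* v  |_ v_m          *)
Parameter exec        : stack -> code -> stack -> Prop.  (* s |- c => s'  *)
Parameter mdiverge    : stack -> code -> Prop.      (* s |- c =>inf       *)
Parameter push        : mval -> stack -> stack.     (* v_m . s            *)

(* Termination case: evaluation in the source is matched by execution
   in the machine, leaving the compiled value on top of any stack s. *)
Definition correct_terminating : Prop :=
  forall e v c vm s,
    eval e v -> compile e c -> compile_val v vm ->
    exec s c (push vm s).

(* Non-termination case: source divergence implies machine divergence. *)
Definition correct_diverging : Prop :=
  forall e c s,
    diverge e -> compile e c -> mdiverge s c.
```

Note how both directions are expressed with a single formalism: only (coinductive) natural-semantics-style predicates appear in the statements.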
We can note here how (coinductive) natural semantics alone is sufficient to establish correctness; we do not need to use any other distinct formalism. A potential use of this framework is to take it as the basis to verify a conventional compiler-to-abstract-machine implementation of (the core of) a realistic functional language such as OCaml. The official INRIA OCaml implementation comes with two compilers [22]: the first one generates code for the ZAM machine, and the second one generates C-- code. We speculate that this framework can also be used as the basis to verify the compiler that generates C--, since Dargaye [23] already uses big-step semantics (although not as a unifying framework, and only tackling terminating computations) to verify a compiler from Mini-ML to Cminor (an early intermediate language of the CompCert C compiler). The idea of generating Cminor (or some other CompCert C intermediate language) code instead of C-- is immediate since, in this way, we can connect the compiler's back-end to CompCert C and obtain verified assembly code as the final result. This use takes on more relevance if we take into account that Coq itself is an OCaml program (even though some portions of Coq are verified in Coq [24][25][26], the extracted verified OCaml code will eventually run on an OCaml implementation).
Another line of work is dedicated to systematically deriving an abstract machine from a lambda calculus [27][28][29][30][31][32][33]. The general idea in these works is to start from a lambda calculus and carry out a series of transformations until the desired abstract machine is obtained. One of the most exploited transformations in some of these works is refocusing [34], although a great variety of transformations is used. The compilation correctness is a direct consequence of the correctness of the transformations. Some of them, in addition, address Coq formalization [29][30][31][32][33]. The closest works to ours are those which, starting from a natural semantics of a lambda calculus, derive an abstract machine [27,30]. Specifically, the work most similar in nature to ours is [30], where the STG machine is derived from a natural semantics of a lazy lambda calculus and the derivation is formalized in Coq. However, [30] only tackles the case of terminating computations.
In all these works, the emphasis is on the corresponding machine derivation. In contrast, in a functional language implementation, the target abstract machine is usually designed by hand and only then (if at all) proved correct w.r.t. the source lambda calculus semantics (see, for example, [18]). Hence, (coinductive) natural semantics as a framework, as presented in this paper, is best suited to verify functional language implementations (which target abstract machines), since it assumes that the target machine (and the intermediate languages) are given (not to be derived).
Moreover, if for some reason (for example, semantic justification of the target abstract machine) it is considered relevant to systematically derive the target abstract machine from the source calculus, we conjecture that the corresponding derivation can also be carried out in our (coinductive) natural semantics framework. This is because each transformation which leads to the derived machine could be seen as an (intermediate) translation and be defined in natural semantics. Also, the input and output language of each transformation could be seen as an (intermediate) language and its corresponding semantics be defined in natural semantics. Certainly, the derived abstract machine would be a big-step machine.
Other works tackle the verification of a small functional language in Coq, but to the authors' knowledge, none of them uses (coinductive) natural semantics as a unifying framework; instead, they use ad-hoc verifications. For instance, Chlipala [35] offers a compiler from a small impure functional language to an idealized assembly language. He starts from de Bruijn notation and employs natural semantics for the source and target languages, but not to specify the compilation; his effort only covers terminating computations. Benton and Hur [36] deal with the compilation of a small typed functional language to the SECD machine, but they use denotational semantics for the source language and small-step semantics for the target machine. In addition, Benton and Hur employ a biorthogonality step-indexed logical relation to establish correctness. As mentioned before, Dargaye [23] develops a compiler from Mini-ML to Cminor, but it is not designed to be a standalone general-purpose Mini-ML implementation. Instead, it was conceived to work only on the code generated by the Coq extraction mechanism. The Coq extraction mechanism generates code in a real-life functional language, by default OCaml, but it is also able to generate Scheme and Haskell code. This is why, in Dargaye's work, it only makes sense to cover terminating computations, since Coq's calculus, the Calculus of Inductive Constructions, is strongly normalizing [37], meaning that in Coq all computations must terminate. For this reason, all code extracted from Coq should be terminating; in Coq, this property is ensured by Coq's type checker [26]. The code generation translation performed by the Coq extraction mechanism is not verified, although some efforts are conducted in this direction [6,24,25,38,39,40].
The CertiCoq project [6,40,41] aims to provide a verified extraction pipeline from the core language of Coq, Gallina, to machine language. Therefore, in CertiCoq it also only makes sense to cover terminating computations. This fact is explicitly stated in [6]: '... we can restrict our reasoning to terminating programs since Coq is strongly normalizing. This way we avoid backward simulations (forward simulations proofs are much simpler) and avoid proving preservation of divergence'. Similarly, Savary Bélanger [40] indicates: 'In CertiCoq, we are only concerned with terminating programs: Gallina is strongly normalizing, and our proof of correctness ensures that programs do not acquire non-terminating behaviors along the way'.
Instead of producing machine code directly, CertiCoq generates Clight (a CompCert C intermediate language) code. Hence, it uses CompCert C as a verified compiler back-end to produce machine language. This way, the CertiCoq compiler performs a series of phases from Gallina to Clight. In CertiCoq, the (intermediate) language semantics and proofs of correctness are based on big-step semantics (for terminating computations). However, big-step semantics is refined with other notions, such as step-indexed logical relations and context-based semantics [40,42], to account for additional properties, for instance, compositionality. In addition, the idea of adapting this technique to be useful for general-purpose programming languages is barely mentioned in [40]; for this purpose, Savary Bélanger [40] suggests employing small-step semantics. For their part, Paraskevopoulou and Appel [42], in order to prove closure conversion correctness, already extend this technique to cover non-terminating computations under certain conditions. Closure conversion is a phase performed by CertiCoq.
Our (coinductive) natural semantics framework is best suited to verify usual functional language to abstract machine implementations, since it accounts for both terminating and non-terminating computations (total correctness). In addition, it can express terminating and non-terminating computations in an abstract machine. In contrast, by design [6], CertiCoq covers only terminating evaluations on one hand and, on the other, it targets Clight, which is why no abstract machine is used. This situation reflects the fact that our (coinductive) natural semantics framework and CertiCoq pursue different goals: while our (coinductive) natural semantics framework is a framework for conducting total-correctness compiler verification in Coq, CertiCoq is a verified compiler (from Coq's core calculus to Clight). Hence, offering an infrastructure to perform compiler verification is not an explicit main objective of CertiCoq [6], even though the infrastructure and techniques developed to verify the CertiCoq compiler could be adapted to verify other compilers as well.
Step-indexed logical relations as shown by Ahmed [43] serve to establish contextual equivalence between programs. We remark that step-indexed logical relations provide a way to deal with two compiler problems in particular; specifically, compositionality and secure compilation.
In [44], Ahmed and Blume show how to use step-indexed logical relations together with small-step semantics to deal with a notion of secure compilation. Ahmed and Blume demonstrate their method by applying it to a typed closure conversion transformation. Patrignani et al. [45] offer a recent survey of the formal approaches and techniques used in secure compilation. Certainly, this survey includes, in particular, works that employ step-indexed logical relations. Abate et al. [46] study generalizations of trace-based compiler correctness criteria, including some which account for secure compilation.
To account for compositionality, Perconti and Ahmed [47] propose the use of a language in which all the languages involved in a compilation pipeline can be embedded. Then, using a step-indexed logical relation and small-step semantics, compositional compiler correctness is established in terms of the combined language. For their part, Neis et al. [48] introduce parametric inter-language simulations (PILS) as a technique particularly suited to compositional compiler verification for higher-order imperative languages. In particular, they demonstrate their technique with Pilsner, a verified compositional compiler from an ML-like language to an assembly-like language. Patterson and Ahmed [49] provide a framework for expressing different notions of compiler correctness, especially those which consider compiler compositionality.
Dreyer et al. [50], in order to avoid tedious, error-prone, and obscuring step-indexed arithmetic, propose to 'hide' the indices, internalizing them into a logic, instead of using them explicitly. The idea is to replace indices with a modal operator, this way obtaining a modal logic which they name LSLR. In particular, this idea is reused in IRIS. IRIS [7][51][52][53] is a concurrent separation logic framework implemented and verified in Coq. In this regard, Krebbers et al. [53] comment: 'We also show that the step-indexed "later" modality of Iris is an essential source of complexity, in that removing it leads to a logical inconsistency'. Recently, Linn Georges et al. [54] formalize a capability machine in IRIS. As Linn Georges et al. [54] point out, capability machines are promising targets for secure compilers. Hence, the idea of extending IRIS to be used as a secure compilation framework is immediate; in particular, to verify secure compilers from high-level concurrent languages to capability machines. However, to the authors' knowledge, IRIS has never been used in this manner. A very similar goal is pursued by Cuellar et al. [55] and Cuellar [56], but extending CompCert C. To this end, they introduce the Concurrent Permission Machine (CPM). Certainly, C (with concurrency) is the source language in these works.
In retrospect, on one hand, step-indexed logical relations have proved to be useful, in particular for secure compilation, compiler compositionality, and concurrency; on the other hand, natural semantics has proved to be easier and more convenient than other formalisms (for instance, small-step semantics) for compiler correctness proofs. Hence, we speculate that natural semantics and step-indexed logical relations can be combined into a single formalism that has the best properties of each. In other words, we envisage the ambitious goal of reaching a single formalism that features secure compilation, compositional compilation, and concurrency, and is as simple, easy, and intuitive as possible.
Currently, our (coinductive) natural semantics framework does not account for secure compilation, compositionality, or concurrency. However, we conjecture that step-indexed logical relations can be adopted in it to address some, or even all, of these features. The price paid for this effort would be dealing with the known complexity of step-indexed logical relations (although it could be ameliorated, for instance, by internalizing the indices in a natural semantics modal logic). At present, our (coinductive) natural semantics framework is simple, easy, and intuitive.
The only one of these works that presents the verification of the correctness of a compiler is Leroy's (an ad-hoc verification). This means that (coinductive) natural semantics is not used in any of them as a unifying framework for the verification of compiler correctness: it is not used in the definition of the semantics of the machine (nor in that of its interpreter), it is not used to define the translations, and it is not used (in both the source language and the target language) to establish or to prove the correctness of the translations. What each of them does is present a coinductive natural semantics of a high-level language (which would usually correspond to the source language of a compiler), and it is this aspect that we review next.
Leroy [14] first expresses finite computations ('evaluation') with natural semantics and infinite computations ('divergence') with coinductive natural semantics, separately; this solution is clear and clean. Afterwards, he offers an alternative solution in which finite and infinite computations are expressed in a single coinductive natural semantics ('coevaluation'); however, this semantics does not behave well, in the sense that, on one hand, there are infinite computations that it is not able to express and, on the other, there are infinite computations that coevaluate to any value v. Nakata and Uustalu [57] remark that this behavior appears accidental and undesired. Nakata and Uustalu [57] define a coinductive natural semantics of the While language that expresses finite and infinite computations; the semantics is carefully designed in an ad-hoc manner, and small-step semantics is embedded within it. Additionally, Nakata and Uustalu define an interpreter using the trace monad and show that it is correct with respect to this semantics. Nakata and Uustalu's work [57] is the only one of the related works presented here in which an interpreter is presented.
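To make the 'coevaluation' problem concrete, here is a hedged Coq sketch on a pure call-by-value lambda calculus in de Bruijn notation; the definitions and names are ours, for illustration, not Leroy's exact ones. The lemma, provable by a guarded cofix whose derivation repeats the application rule forever, exhibits the ill behavior: the diverging term omega coevaluates to every value.

```coq
Require Import List.

(* Pure lambda calculus with de Bruijn indices (illustrative names). *)
Inductive tm : Set :=
  | Var : nat -> tm
  | Abs : tm -> tm
  | App : tm -> tm -> tm.

Inductive value : Set :=
  | Clos : tm -> list value -> value.   (* closure: body + environment *)

(* 'Coevaluation': the same rules as evaluation, read coinductively. *)
CoInductive coeval : list value -> tm -> value -> Prop :=
  | coeval_var : forall env i v,
      nth_error env i = Some v -> coeval env (Var i) v
  | coeval_abs : forall env b,
      coeval env (Abs b) (Clos b env)
  | coeval_app : forall env a b body cenv varg v,
      coeval env a (Clos body cenv) ->
      coeval env b varg ->
      coeval (varg :: cenv) body v ->
      coeval env (App a b) v.

Definition delta := Abs (App (Var 0) (Var 0)).
Definition omega := App delta delta.    (* diverges under evaluation *)

(* The ill behavior: omega coevaluates to EVERY value v.  Provable by
   a guarded cofix that unfolds coeval_app forever; proof omitted. *)
Lemma omega_coeval_any : forall v, coeval nil omega v.
Admitted.
```

Because the coinductive reading places no constraint tying the derivation to an actual computation, the infinite derivation can "promise" any result, which is exactly the accidental behavior Nakata and Uustalu criticize.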
Charguéraud [20] introduces pretty-big-step semantics, a semantics based on Leroy's 'coevaluation'. Unfortunately, pretty-big-step semantics inherits the ill behavior of 'coevaluation'. In turn, Bach Poulsen and Mosses [21] define flag-based big-step semantics, based on pretty-big-step semantics. Unfortunately, flag-based big-step semantics, through pretty-big-step semantics, also inherits the ill behavior of 'coevaluation'.
In this work, we present (coinductive) natural semantics as a framework for the verification of total correctness of compilers in Coq. Once we have a simple, easy, clear, and intuitive solution for this task, we can seek to improve it in the future. In particular, we use a natural parameter in the interpreters to bound the recursion. Recently, Leroy [58] has defined an interpreter of While using the partiality monad in Coq; we plan to adopt the partiality monad in our Mini-ML compiler and in the framework in general to avoid the use of this parameter.
On the other hand, we can seek to reach a single coinductive natural semantics '⇒co' able to express both terminating and non-terminating computations. Charguéraud [20] mentions that, in principle, this semantics can be used directly to prove total correctness of the translations; however, he points out that the conclusion of the correctness theorem is usually of the form ∃v′. …, and that the current support for coinduction in Coq only allows using coinductive predicates in the conclusion. In particular, it does not allow using the existential quantifier '∃' or the connective '∧' when a proof is done by coinduction. Bach Poulsen and Mosses [21] raise a similar criticism of the current coinduction support in Coq. Fortunately, our (coinductive) natural semantics is ideal here: when using (inductive) natural semantics '⇒' to express finite computations and coinductive natural semantics '⇒∞' to express infinite computations (separately), the proof of the termination case, where the conclusion requires an ∃ and an ∧, can be done by induction, whereas in the non-termination case neither ∃ nor ∧ is required in the conclusion, only the coinductive predicate ⇒∞, so it can be proved by coinduction (with the current support of Coq).
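The restriction can be seen in a minimal Coq sketch (ours, for illustration): a goal whose conclusion is a coinductive predicate can be proved by cofix, whereas a goal of the form ∃v. … ∧ … cannot, since ex and and are inductive and the cofix call would not be guarded by a constructor of the coinductive predicate.

```coq
(* A toy coinductive 'divergence' predicate on nat. *)
CoInductive diverges : nat -> Prop :=
  | step : forall n, diverges (S n) -> diverges n.

(* The conclusion IS the coinductive predicate, so cofix applies:
   the corecursive call CIH is guarded by the constructor 'step'. *)
Lemma all_diverge : forall n, diverges n.
Proof.
  cofix CIH. intro n. apply step. apply CIH.
Qed.

(* By contrast, a goal such as
     forall e, exists v, eval e v /\ P v
   cannot be attacked with cofix: 'exists' and '/\' are inductive,
   so no productive corecursive proof term can be formed. *)
```

This is precisely why splitting the semantics into '⇒' (proved by induction, where ∃ and ∧ are unproblematic) and '⇒∞' (proved by coinduction, with the coinductive predicate alone in the conclusion) fits Coq's current coinduction support.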
Even so, it would be possible to aim at having a single semantics in order to have a more concise definition. If so, the framework could automate the translation from it to the two separate semantics (⇒ and ⇒∞); also, the framework could establish and prove the equivalence between the former and the union of the latter two semantics. Having arrived at these two semantics, the current results of the framework can be used. The central problem is that (to the authors' knowledge), to date, there is no single coinductive natural semantics in the literature that expresses finite and infinite computations and behaves well. The first author, based on Leroy's 'coevaluation', has succeeded in defining a single coinductive natural semantics (of the pure lambda calculus extended with constants) that expresses terminating and non-terminating computations and behaves well in Coq. He has also proved the equivalence of this semantics with the union of the two semantics (⇒ and ⇒∞) that express, respectively, finite and infinite computations separately. Apparently, this result is sound [59], and we plan to present it in future works.
Continuing, it would be possible to tackle the problem of decreasing the number of rules necessary in a coinductive natural semantics definition. This is the main goal of pretty-big-step semantics and flag-based big-step semantics. To this end, and going further, we envisage that the results in this work and those of pretty-big-step semantics and flag-based big-step semantics (future work and perhaps other works as well) can be integrated into a coinductive natural semantics framework having all the desired properties of each of them. In other words, it is our intention that the resulting coinductive natural semantics framework synthesize all the major advances in natural semantics.

Contributions
Our main general and specific contributions are:
• (Coinductive) natural semantics as a framework for the verification of total correctness of compilers in Coq, such that a working standalone verified compiler can be obtained as a final product; that is, a compiler sound and complete with respect to its (coinductive) natural semantics specification and correct with respect to semantic preservation of the specified translation.
• A systematic method to obtain either a sound or complete interpreter (Sections 2.1 and 3.1) or compiler (Section 2.3.1), as applicable, from a (coinductive) natural semantics specification.
• The use of coinductive natural semantics to specify non-terminating computations in an abstract machine (Section 3).
• An algorithm to translate from (an abstract representation of) the (coinductive) natural semantics specification of a total correct compiler to its corresponding formalization in Coq (Section 4).

The strategy for the presentation is, first, to tackle the termination case using natural semantics (Section 2), and then the non-termination case using coinductive natural semantics (Section 3). Throughout this work, the method we use is to present each of the compiler's components together with its corresponding Coq formalization; in this way, it is intended that, when our compiler is finished, we will have the necessary intuition behind the algorithm to go from a total correct compiler in the abstract to Coq (Section 4). Finally (Section 5), we present our conclusions.

Natural Semantics
We will first tackle, in this section, the case in which the computations are finite (terminating) using natural semantics.

MiniML dB
To begin with, we introduce the source language, Mini-ML, in de Bruijn notation, which is essentially the pure lambda calculus extended with naturals, Booleans, arithmetic and comparison operators, local definitions, conditionals, and native recursion by means of a fixed-point operator. Its abstract syntax is the following:

Before carrying out the encoding of our definitions in Coq, it is important to highlight some of Coq's features. Coq is based on the Calculus of Inductive Constructions, i.e., a lambda calculus with a sophisticated and expressive type system. Since it is a lambda calculus, it can be used as a logic, but also as a programming language; i.e., we can prove propositions, but also write programs. This distinction is made explicit by using the types Prop and Set, respectively. Roughly, it can be said that when a term in Coq has type Prop, it is used as logic, and if a term has type Set, then it is used as a program. In fact, this explicit syntactic distinction between Prop and Set was introduced by the Coq extraction mechanism [60] to distinguish between those terms with computational content and those without it (Paulin-Mohring [60] calls 'Spec' what would later be called 'Set'). In this way, if a term in Coq has a Set type, the extraction mechanism can generate a program, written in a general-purpose programming language such as OCaml, related to this term; in contrast, from a term with a Prop type it is not possible to extract any program at all.
The abstract syntax of MiniML dB can be coded in Coq as first-order abstract syntax using an Inductive definition with type Set. For conciseness, throughout this article, we will show only the essential parts of the formalization. The full formalization can be consulted in [61].
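As an illustration, the Inductive definition could look as follows; this is a hypothetical reconstruction, and the constructor names of the actual formalization in [61] may differ:

```coq
(* Hypothetical sketch of the MiniML_dB abstract syntax in Coq. *)
Inductive op : Set :=
  | Plus | Minus | Times | Le | Eq.

Inductive MML_dB : Set :=
  | Var_dB  : nat -> MML_dB                          (* de Bruijn index  *)
  | Num_dB  : nat -> MML_dB                          (* natural literal  *)
  | Bool_dB : bool -> MML_dB                         (* Boolean literal  *)
  | Op_dB   : op -> MML_dB -> MML_dB -> MML_dB       (* arith./compar.   *)
  | If_dB   : MML_dB -> MML_dB -> MML_dB -> MML_dB   (* conditional      *)
  | Let_dB  : MML_dB -> MML_dB -> MML_dB             (* local definition *)
  | Abs_dB  : MML_dB -> MML_dB                       (* abstraction      *)
  | App_dB  : MML_dB -> MML_dB -> MML_dB             (* application      *)
  | Fix_dB  : MML_dB -> MML_dB.                      (* fixed point      *)
```

Note that, thanks to de Bruijn notation, abstraction carries no variable name: Abs_dB binds the index 0 in its body.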
Hence, the Coq extraction mechanism can be used with the corresponding Extraction command, which gives as a result the abstract syntax of MiniML dB written in OCaml. This way, we can notice how an Inductive definition with type Set in Coq corresponds to an ADT in OCaml.
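A small self-contained example of this correspondence (the type tiny is ours, for illustration only):

```coq
Require Extraction.

(* An Inductive in Set maps, under extraction, to an OCaml ADT. *)
Inductive tiny : Set :=
  | TNum : nat -> tiny
  | TApp : tiny -> tiny -> tiny.

Extraction tiny.
(* Coq prints, roughly:
     type tiny =
     | TNum of nat
     | TApp of tiny * tiny            *)
```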
To define the natural semantics of MiniML dB , we first need to define its values by means of environments and closures.
The environments serve as (implicit) associations from variables (represented by de Bruijn indices) to values. In this manner, as expressed by the predicate Ω ⊢ i → v, the value of a variable represented by the index i is at the i-th position in the environment (a sequence of values).
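A hedged sketch of the values and of the lookup predicate Ω ⊢ i → v; here, term abstracts the MML_dB syntax as a Parameter, and the constructor names are illustrative, not necessarily those of [61]:

```coq
(* 'term' stands for the MiniML_dB abstract syntax. *)
Parameter term : Set.

(* Values: naturals, Booleans, and closures pairing a function body
   with its captured environment (a list of values). *)
Inductive MML_dB_val : Set :=
  | Num_val  : nat -> MML_dB_val
  | Bool_val : bool -> MML_dB_val
  | Clos_val : term -> list MML_dB_val -> MML_dB_val.

(* The value of index i is the i-th element of the environment. *)
Inductive lookup : list MML_dB_val -> nat -> MML_dB_val -> Prop :=
  | lookup_O : forall v env, lookup (v :: env) 0 v
  | lookup_S : forall v w env i,
      lookup env i v -> lookup (w :: env) (S i) v.
```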
The natural semantics of MiniML dB is inductively defined by the predicate Ω ⊢ d ⇒ v, which can be read as follows: in the environment Ω, the expression d evaluates to the value v.
The Ω environment is supposed to contain the values of the free variables in d. The natural semantics of MiniML dB is defined as follows:

A natural semantics definition can be seen as an inductive logical proposition; hence, it can be encoded in Coq as an Inductive definition with type Prop. This way, the MiniML dB semantics can be written as follows:

A natural semantics definition, in general, is a relation; therefore, non-determinism is the general case. Hence, in case a definition of this kind is deterministic, a lemma that expresses this property must be formally established, in the same way we do it for the MiniML dB semantics:

If a relation is deterministic, then it can be stated as a function. Consequently, it can be encoded as a function in a programming language. In particular, in Coq, we can write a recursive function employing Fixpoint. For instance, the function corresponding to the MiniML dB natural semantics is written as follows:

Please note that this function is actually an interpreter. To guarantee termination, we added the natural parameter depthR, which indicates the recursion depth (depthR is also called 'fuel' by some authors; see, for example, [24,25,62,63]). This is because the CIC is strongly normalizing [37], which means, from a programming-language perspective, that all computations must terminate. In Coq, this means that all functions must be total and terminating.
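These definitions can be sketched, self-contained, on a pure lambda-calculus fragment; all names below are ours for illustration, and the actual formalization in [61] covers the full MiniML dB syntax:

```coq
Require Import List.

(* A pure lambda-calculus fragment in de Bruijn notation. *)
Inductive tm : Set :=
  | Var : nat -> tm
  | Abs : tm -> tm
  | App : tm -> tm -> tm.

Inductive value : Set :=
  | Clos : tm -> list value -> value.

(* The natural semantics as an Inductive definition in Prop. *)
Inductive eval : list value -> tm -> value -> Prop :=
  | eval_var : forall env i v,
      nth_error env i = Some v -> eval env (Var i) v
  | eval_abs : forall env b,
      eval env (Abs b) (Clos b env)
  | eval_app : forall env a b body cenv varg v,
      eval env a (Clos body cenv) ->
      eval env b varg ->
      eval (varg :: cenv) body v ->
      eval env (App a b) v.

(* The corresponding interpreter in Set: a total function thanks to
   the fuel parameter depthR. *)
Fixpoint interp (depthR : nat) (env : list value) (t : tm)
  : option value :=
  match depthR with
  | 0 => None                          (* out of fuel *)
  | S n =>
    match t with
    | Var i => nth_error env i
    | Abs b => Some (Clos b env)
    | App a b =>
      match interp n env a, interp n env b with
      | Some (Clos body cenv), Some varg => interp n (varg :: cenv) body
      | _, _ => None
      end
    end
  end.
```

The structural recursion is on depthR, not on the term, which is what lets Coq accept the Fixpoint even though evaluation of App recurses on a term (the closure body) that is not a subterm of the input.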
From a verification perspective, a logical proposition serves as a formal specification that a program must comply with. In this case, the logical proposition is the Inductive definition in Prop, whereas the program is the interpreter in Set. Then, to verify that our interpreter is sound with respect to the MiniML dB natural semantics, we must prove the following lemma: Conversely, to verify that our interpreter is complete with respect to the MiniML dB natural semantics, we must prove this lemma: Now we write factorial of 5 in MiniML dB as an example program: then, we can evaluate it in our MiniML dB interpreter, inside Coq, as follows: obtaining the expected result: = Some (Num_dB 120) : option MML_dB_val Next, we can use the extraction mechanism as follows: Extraction MML_dB_NS_interpreter.
In this way, we obtain a verified interpreter in OCaml, sound and complete with respect to the MiniML dB natural semantics, ready to be used in real life.
The reader may question the 'double' task of maintaining both definitions, in Prop and in Set. On the one hand, if we stay in the logical part, in Prop, a verified compiler to be used in real life cannot be obtained; on the other hand, definitions using the Set type force us to work with total, terminating functions.
Let us recall that natural semantics is, in general, inherently relational and non-deterministic; therefore, to write a natural semantics definition as a function, we must ensure that it is total and deterministic. Although this holds in some particular cases, we think that if a natural semantics definition is written directly as a function, the essence of natural semantics vanishes.
Also, Coq automatically generates induction principles from inductive definitions, which is not the case for functions. These induction principles are useful, as they can be applied through the induction tactic while doing a proof. In some scenarios this is an advantage, especially, of course, when a proof is done by induction.
Regarding the remarks above, we give definitions in Prop to retain the essence of natural semantics and to take advantage of the induction principles generated by Coq. We also give the corresponding definitions in Set, mainly to obtain verified implementations.

Modern SECD Machine
Leroy [8,14] introduces the Modern SECD, a machine based on Landin's SECD [64], with two main differences: first, it does not use a dump; instead, it makes use of frames in the stack to support function calls; second, it uses de Bruijn indices to access the environment.
The original Modern SECD offers only natural constants, local definitions, abstraction, and application support. Due to this, we offer an extended version of the MSECD supporting Booleans, arithmetic and comparison operators, conditionals, and native recursion by means of recursive closures. It is worth mentioning that we based the conditional support on Henderson's SECD presentation [65].
The instructions of the extended MSECD are the following: Notice the distinction between 'i' and 'i', the former denotes a machine instruction, whereas the latter denotes a de Bruijn index.
The code is defined as an instruction sequence: The values of the machine and its environment are defined as follows: Besides these values, frames should be storable in the stack. Therefore, the stack values and the stack of the machine are defined as follows: The MSECD small-step semantics is a transition relation between machine configurations. We present it in Table 1.
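For illustration, an instruction type in this spirit might be declared as follows (a hedged sketch; the names and the exact instruction set are assumptions, not the paper's definitions):

```coq
(* Hypothetical sketch of part of an extended MSECD instruction set. *)
Inductive instr : Set :=
  | IConst (n : nat)        (* push a natural constant *)
  | IAcc (i : nat)          (* push the i-th environment value *)
  | IAdd                    (* pop two naturals, push their sum *)
  | IClos (c : list instr)  (* push a closure with body c *)
  | IApp                    (* call the closure on the stack *)
  | IRet.                   (* pop a frame: return from a call *)

Definition code := list instr.
```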
To codify this semantics in Coq, we write: Let m 1 , m 2 , and m 3 be MSECD machine configurations; the transitive closure of the small-step semantics is defined inductively as follows: In Coq, this transitive closure is written as follows:
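The transitive closure of an abstract step relation can be defined inductively along these lines (a generic sketch, not the paper's exact encoding):

```coq
(* Generic sketch: transitive closure of a step relation over
   abstract machine configurations. *)
Section Closure.
  Variable conf : Type.
  Variable step : conf -> conf -> Prop.

  Inductive plus : conf -> conf -> Prop :=
    | plus_one  : forall m1 m2,
        step m1 m2 -> plus m1 m2
    | plus_left : forall m1 m2 m3,
        step m1 m2 -> plus m2 m3 -> plus m1 m3.
End Closure.
```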

Compilation
Leroy [14] defines the compilation from the pure lambda calculus extended with constants to MSECD machine code as a function. Here, we extend his work to all the MiniML dB language constructs:

Correctness
The correctness of this compilation can be established by semantic preservation, for which it is necessary to extend the compilation to values and environments of the machine, as shown below: In this way, if an expression d is evaluated to a value v in an Ω environment, it is expected that its compilation d is evaluated to v in the Ω environment. However, to prove this result, it is necessary to strengthen the hypothesis (we will see in Section 2.3.2 that this is not necessary when natural semantics is used). Here, to strengthen the hypothesis is to concatenate any code c at the end of the compilation d , so that when the evaluation of d finishes, v is expected to be at the top of the stack, with the code c remaining to evaluate. This is formally expressed in the following theorem, formulated by Leroy [14]: for all codes c and stacks s.
Proof outline. The proof is conducted by induction on the derivation of Ω ⊢ d ⇒ v. The base cases, where d is a natural, a Boolean, a nameless variable, an abstraction, or a fixed point (recursive abstraction), are straightforward, since the corresponding compilation of d is a single machine instruction whose evaluation is performed by a single machine step → that mimics the MiniML dB evaluation of d.
For the inductive cases, where d is an arithmetic or comparison expression, a conditional, a local definition, or an application, the proof follows the structure of the derivation Ω ⊢ d ⇒ v. The key idea is to use the transitivity of →⁺ together with the induction hypothesis, while evaluating the intermingled single instructions which appear in the compilation of d by performing the corresponding machine step →.
The complete proof can be consulted in Appendix A. This theorem is written in Coq as Theorem compile_eval.

Big-Step MSECD Machine
This section introduces our big-step version of the Modern SECD machine. This machine is strongly based on our extended version of the original small-step MSECD. Unlike the small-step MSECD, due to the high level of abstraction of natural semantics, it is not necessary to use stack frames at all, and therefore the return instructions are also unnecessary (neither IRet, which returns from a function call, nor IJoin, which returns from a conditional). Having said this, we can affirm that the use of natural semantics directly impacts the machine design, specifically the machine's components.
The instructions of the big-step MSECD machine, as well as its code, are the following: A machine state is a pair (∆, s), where ∆ is a machine environment and s a stack. The machine natural semantics is defined by the following two mutually dependent predicates. The first one, for machine code, can be read as follows: if the machine is in a state (∆, s) and a code c is given, evaluating c takes it to the state (∆ f , s f ). The second one, for instructions, can be read as follows: if the machine is in a state (∆ 1 , s 1 ) and an instruction i is given, evaluating i takes it to the state (∆ 2 , s 2 ). However, the entry point for the semantics is the predicate for code. The environment ∆ is supposed to contain the values of the free variables (represented by IAcc i instructions) in c, whereas the environment ∆ 1 is supposed to contain the value of the free variable in i (if i is an instruction IAcc i). The natural semantics of the machine is the following: The encoding of the natural semantics of the machine is similar to that of the natural semantics of MiniML dB (it is only necessary to use a mutually dependent definition in Coq, in correspondence with the mutually dependent predicates of the machine natural semantics). The same applies to the interpreter and its respective lemmas of soundness and completeness regarding the machine natural semantics. The formalization details can be consulted at [61].
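A mutually dependent big-step semantics in this style can be sketched for a two-instruction machine as follows (hypothetical names; the paper's machine has many more instructions and richer values):

```coq
(* Hypothetical sketch of the two mutually dependent predicates:
   one for codes, one for single instructions. A state is an
   environment/stack pair. *)
Inductive instr : Set := IConst (n : nat) | IAdd.
Definition stack := list nat.
Definition state := (list nat * stack)%type.

Inductive exec_code : state -> list instr -> state -> Prop :=
  | exec_nil  : forall st, exec_code st nil st
  | exec_cons : forall st1 st2 st3 i c,
      exec_instr st1 i st2 -> exec_code st2 c st3 ->
      exec_code st1 (i :: c) st3
with exec_instr : state -> instr -> state -> Prop :=
  | exec_const : forall E s n,
      exec_instr (E, s) (IConst n) (E, n :: s)
  | exec_add : forall E s n1 n2,
      exec_instr (E, n2 :: n1 :: s) IAdd (E, n1 + n2 :: s).
```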
We highlight that the machine natural semantics has the property enunciated in the following lemma: Lemma 1. Let ∆, ∆ 1 , ∆ 2 be machine environments; s, s 1 , s 2 stacks; and c 1 , c 2 machine codes. If Proof outline. By induction on the derivation ∆, s ⊢ c 1 ⇒ (∆, s 1 ). The base case is when c 1 is the empty code, c 1 = [], which follows simply by hypothesis. The inductive case is when c 1 is not empty, c 1 ≠ [], which is proved by applying the induction hypothesis and the ⇒ definition.
This lemma is useful to prove compilation correctness (Section 2.3.2). A detailed proof can be seen in Appendix A.

Compilation
Using natural semantics, the compilation from MiniML dB to code of the big-step Modern SECD machine is defined by the following predicate: d ⇓ c, meaning that the MiniML dB expression d is compiled into the machine code c.
Regarding the Coq encoding of this compilation, it is analogous to that of the MiniML dB semantics; that is, it is done with an Inductive definition with type Prop: It is noteworthy that this time, instead of being an interpreter, the function is a compiler, since it translates an expression instead of evaluating it. Also, it is not necessary to add a natural parameter to bound the recursion, given that the translation is decidable. This fact is guaranteed in Coq by using structural recursion (based on syntax) on the expression d.
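A compiler defined by structural recursion, needing no fuel parameter, can be sketched on a toy fragment (hypothetical names, not the paper's definitions):

```coq
(* Hypothetical compiler sketch: structural recursion on the
   expression guarantees termination without a fuel parameter. *)
Require Import List. Import ListNotations.

Inductive exp : Set := ENum (n : nat) | EPlus (d1 d2 : exp).
Inductive instr : Set := IConst (n : nat) | IAdd.

Fixpoint compile (d : exp) : list instr :=
  match d with
  | ENum n => [IConst n]
  | EPlus d1 d2 => compile d1 ++ compile d2 ++ [IAdd]
  end.
```

Note the postfix shape of the output (operands first, operator last), which matters later for the guard-condition discussion.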
The following lemma expresses that the compiler is sound regarding the natural semantics definition of the compilation: Lemma Compilation_NS_compiler_soundness: Conversely, the next lemma expresses that the compiler is complete regarding the natural semantics definition of the compilation:

Correctness
To establish correctness, we extend the compilation to values and environments once more, so that afterwards we can formulate semantic preservation.
To formulate correctness, we expect that if a nameless expression d is evaluated to a value v in an environment Ω, and if c is the code resulting from the compilation of d and ∆ is the compilation of Ω, then there must exist a machine value v m that corresponds to the compilation of v such that, when c is evaluated starting with the machine in a state (∆, s), for any stack s, the evaluation takes the machine to the state (∆, v m ⋅ s). Now, the correctness theorem is enunciated.

Theorem 2 (Correctness for termination).
Let Ω be a nameless environment, ∆ a machine environment, d a nameless expression, c a machine code, and v a nameless value. If then there exists a machine value v m such that v v m and, for all stacks s, Proof outline. We proceed by induction on the derivation of Ω ⊢ d ⇒ v. The base cases, where d is a natural, a Boolean, a nameless variable, an abstraction, or a fixed point (recursive abstraction), are straightforward. In these cases, we exhibit a v m such that v v m ; since c is the result of the compilation of d, c is a single machine instruction, hence ∆, s ⊢ c ⇒ (∆, v m ⋅ s) follows simply by definition. For the inductive cases, where d is an arithmetic or comparison expression, a conditional, a local definition, or an application, the main idea is to use the induction hypothesis in tandem with Lemma 1. In this way, the machine evaluation follows the structure of the Ω ⊢ d ⇒ v derivation, and the proof is simple and intuitive.
The complete proof can be consulted in Appendix A. This theorem is written as follows in Coq: We can immediately notice that, due to the unifying use of natural semantics to define each of the compiler's components (source language, compilation, and machine), the source language is mapped down, in a transparent way, to the target language (in this case, machine code) by means of the compilation. In this manner, establishing correctness turns out to be easier, clearer, simpler, and more intuitive than using an ad-hoc solution. For instance, in this case, it was unnecessary to define beforehand a closure of a relation, and it was also unnecessary to strengthen the hypothesis to prove the correctness theorem, in contrast with the use of a function to define the compilation and small-step semantics in the machine.

Coinductive Natural Semantics
In this section, we will address the case in which the computations do not terminate, for which we will use coinductive natural semantics.

MiniML dB
In general, coinduction allows us to reason about infinite structures. In this way, taking into account the natural semantics design, we can employ a coinductive definition to express infinite evaluations of a language, in this case of MiniML dB . Following Leroy [14], we define the coinductive natural semantics for divergence (infinite evaluations) by the following predicate: which can be read: in the Ω environment, the evaluation of the expression d diverges, i.e., is infinite or non-terminating. That is, the infinite evaluations of MiniML dB are defined by the coinductive interpretation of the following rules: Adopting the convention of Leroy [14], double horizontal lines denote coinductive interpretation, whereas single horizontal lines denote inductive interpretation.
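The shape of such a coinductive divergence predicate can be sketched on a toy language with an explicitly looping construct (a hedged sketch, not the MiniML dB rules):

```coq
(* Hypothetical sketch: coinductive divergence for a tiny language
   where ELoop evaluates to itself. *)
Inductive exp : Set := ENum (n : nat) | ELoop | EPlus (d1 d2 : exp).

Inductive eval : exp -> nat -> Prop :=
  | eval_num  : forall n, eval (ENum n) n
  | eval_plus : forall d1 d2 n1 n2,
      eval d1 n1 -> eval d2 n2 -> eval (EPlus d1 d2) (n1 + n2).

CoInductive diverges : exp -> Prop :=
  | div_loop : diverges ELoop -> diverges ELoop
  | div_plus_l : forall d1 d2,
      diverges d1 -> diverges (EPlus d1 d2)
  | div_plus_r : forall d1 d2 n1,
      eval d1 n1 -> diverges d2 -> diverges (EPlus d1 d2).

(* A guarded cofix proof that ELoop diverges: a constructor is
   applied before the coinduction hypothesis is used. *)
Lemma loop_diverges : diverges ELoop.
Proof. cofix H. apply div_loop. exact H. Qed.
```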
The coinduction support in Coq is based on the work of Giménez [66]. In particular, Coq has native support for coinductive definitions. Just as a natural semantics definition can be encoded in Coq as an Inductive definition with type Prop, a coinductive natural semantics definition can be encoded in Coq as a CoInductive definition with type Prop. Hence, the MiniML dB semantics for divergence can be written in Coq as follows: | MML_dB_CNS_AppF: In the same manner as shown earlier in Section 2.1, where we verified that our interpreter MML_dB_NS_interpreter is sound regarding the MiniML dB natural semantics, here we must verify that it is sound regarding the MiniML dB coinductive natural semantics for non-termination; that is, we must prove the next lemma: This lemma states that if the evaluation of d does not terminate, then whatever the value n of the fuel is, the interpreter will necessarily, eventually, run out of fuel (None means that the interpreter ran out of fuel).
Conversely, to verify that our interpreter is complete regarding the MiniML dB coinductive natural semantics, we must prove this lemma: This lemma says that if the interpreter runs out of fuel, then either there is no finite evaluation of d, or a finite evaluation of d does exist, but more fuel is needed for the interpreter to be able to compute it. The proof of this lemma in Coq requires classical reasoning.

Modern SECD Machine
Let us now see how to express non-terminating computations in a machine. Leroy [14] uses small-step semantics to express infinitely many transitions in the MSECD. He defines the transition relation ∞ → coinductively as follows: This relation can be written in Coq in the following manner: However, using this definition directly, it is not possible to prove the correctness of the compilation in the case of non-termination. The reason is that the Coq coinduction mechanism imposes the guard condition, which requires (at least) one rule (a constructor) of a coinductive definition to be used before the coinduction hypothesis is employed during a proof by coinduction. The solution offered by Leroy [14] is to define an auxiliary relation, equivalent to the previous definition, with which the proof can be carried out.
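The guard condition itself can be illustrated with a standard stream example:

```coq
(* A small demo of Coq's guard condition on corecursive definitions. *)
CoInductive stream : Type := SCons (n : nat) (t : stream).

(* Accepted: the corecursive call occurs directly under a constructor. *)
CoFixpoint zeros : stream := SCons 0 zeros.

(* Rejected by the guard condition (unproductive):
   CoFixpoint bad : stream := bad. *)
```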
The most important property of the ∞ → n relation, for our purposes, is that it allows the machine to remain in the same configuration at most a finite number n of times ( ∞ → n -sleep rule). This rule is crucial, as it is what makes it possible to comply with the guard condition when carrying out the correctness proof. At some point before n reaches 0, or necessarily when n arrives at 0, at least one transition ( ∞ → n -perform rule) must be performed; in exchange for making a transition, the value of n is reset to any natural n ′ , i.e., the possibility is given (again) to remain in the same configuration (this time, at most n ′ times). The relation ∞ → and the relation ∞ → n are equivalent, as stated in the following lemma: A more detailed proof can be consulted in Appendix A.
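The sleep/perform mechanism can be sketched generically over an abstract step relation (a hedged sketch of the idea, not Leroy's exact definition):

```coq
(* Hypothetical sketch of the auxiliary relation: the machine may
   "sleep" (stay put) at most n times before it must perform a
   transition, which resets the budget to any n'. *)
Section Forever.
  Variable conf : Type.
  Variable step : conf -> conf -> Prop.   (* one machine transition *)

  CoInductive forever_n : nat -> conf -> Prop :=
    | fn_sleep : forall n m,
        forever_n n m -> forever_n (S n) m
    | fn_perform : forall n n' m1 m2,
        step m1 m2 -> forever_n n' m2 -> forever_n n m1.
End Forever.
```

The fn_sleep constructor lets a coinductive proof satisfy the guard condition even when no machine transition is immediately available.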

Compilation Correctness
To carry out the correctness proof, it is necessary to define a measure indicating how many times the machine can remain in the same configuration, based on the constructs of the language. The measure offered by Leroy (extended to cover all of MiniML dB ) is the following: This is because it is possible that an evaluation step of a MiniML dB expression d does not correspond, one to one and in the same order, to a transition while evaluating the compilation of d in the machine, causing the machine to stay at the same configuration a number of times bounded by the measure of d before performing a transition.
In this way, we are ready to state the correctness for the non-termination case, using the auxiliary relation ∞ → n and strengthening the hypothesis, by the following lemma: , Ω , s) ∞ → d for all codes c and stacks s.
Proof outline. By coinduction. The main idea of the proof is to use Theorem 1 to evaluate the finite parts of d, to apply the coinduction hypothesis on the infinite parts of d, and to employ ∞ → n -sleep and ∞ → n -perform rules as convenient.
A complete proof of this lemma can be consulted in Appendix A. This makes it possible to formulate the correctness theorem directly with the ∞ → relation, as Leroy [14] does, as follows: , Ω , s) ∞ → for all codes c and stacks s.
Proof outline. The result is an immediate deduction from Lemma 3 followed by an application of Lemma 2.
A step-by-step proof can be seen in Appendix A. This theorem is written in Coq as follows: Theorem compile_evalinf:

Big-Step MSECD Machine
This section introduces coinductive natural semantics to express non-terminating computations (infinite evaluations) in a machine. We illustrate its use with our big-step Modern SECD machine.

Rules of Non-terminating Computations
As with natural semantics (for finite evaluations, Section 2.3), the coinductive natural semantics for divergence (infinite evaluations) is defined by the following two mutually dependent predicates: The first one reads: in the state (∆, s), the instruction i diverges; the second one reads: in the state (∆, s), the code c diverges.
Thus, the infinite evaluations of the machine are defined by the coinductive interpretation of the following rules: The coinductive natural semantics of the machine is encoded in Coq in an analogous manner to the coinductive natural semantics of MiniML dB (it is just necessary to use a coinductive mutually dependent definition in Coq that parallels the mutually dependent predicates of the machine coinductive natural semantics). Likewise, the respective lemmas that express that the machine interpreter is sound and complete regarding the machine coinductive natural semantics are analogous to those of the MiniML dB interpreter. These formalizations can be seen in detail in [61].
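Rules 1 and 2 can be sketched generically as follows (the instruction-divergence predicate is abstracted as a section variable here for brevity, whereas the paper's version is mutually coinductive with the code predicate):

```coq
(* Hypothetical sketch: a code diverges if its first instruction
   diverges (rule 1), or if the first instruction terminates and the
   remaining code diverges (rule 2). *)
Section CodeDiv.
  Variables instr state : Type.
  Variable exec_instr : state -> instr -> state -> Prop.
  Variable instr_diverges : state -> instr -> Prop.

  CoInductive code_diverges : state -> list instr -> Prop :=
    | cd_head : forall st i c,                   (* rule 1 *)
        instr_diverges st i ->
        code_diverges st (i :: c)
    | cd_tail : forall st1 st2 i c,              (* rule 2 *)
        exec_instr st1 i st2 -> code_diverges st2 c ->
        code_diverges st1 (i :: c).
End CodeDiv.
```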
We now give a brief explanation of the rules. When evaluating a code, the machine must begin by evaluating the first instruction i. The evaluation of i can be finite (rule 2) or infinite (rule 1). In the case of rule 1, if the first instruction diverges, then the complete code diverges. How can an instruction diverge?
Let us recall (Section 2.3) that, due to the high level of abstraction of the big-step MSECD, in the case of termination the ISel and IApp instructions are fully evaluated in a single big step, which includes the evaluation of their sub-codes; this is why, for these instructions, it is necessary to define rules expressing the possibility that their corresponding sub-codes diverge (rules 3-6). In the case of rule 2, if the evaluation of the first instruction terminates but the remaining code diverges, then the complete code (including the first instruction) diverges.
We can note here that, in principle, only rule 2 is necessary to express divergence in the machine since, intuitively, an instruction performs only a single basic operation, and this rule is the analogue of the small-step semantics transition relation ∞ → . However, as already mentioned, our big-step machine has two instructions, namely ISel and IApp, which are high-level (and therefore evaluate differently from their small-step counterparts, performing not only single basic operations but a big-step sub-code evaluation). This is why these instructions require specific rules, while the remaining instructions perform, in fact, only a single basic operation; for instance, IConst n pushes n on the stack. This is why the remaining instructions do not need specific rules.
In this way, the machine computations that do not terminate are completely defined. However, as in the case of the MSECD small-step semantics, we are facing, once more, the problem with Coq's guard condition; similarly, we cannot prove correctness directly using this relation. To solve this problem, we present a variant of Leroy's solution adapted to coinductive natural semantics: we define an auxiliary relation, equivalent to the previous one, which allows proving correctness. Below, we present the auxiliary relation: The rule ( ∞ ⇒ n -sleep) is the improved analogue of Leroy's rule ( ∞ → n -sleep) since, in addition, it expresses that if the initial code c 1 of a code diverges, then, no matter what the remaining code c 2 is, the whole code diverges. The importance of this improvement is that it allows proving correctness without the need to strengthen the hypothesis. The rule ( ∞ ⇒ n -perform) is analogous to Leroy's rule ( ∞ → n -perform). We can note that rule 2 disappears (it is no longer necessary) because it is subsumed by the ( ∞ ⇒ n -perform) rule. The remaining rules (1 and 3-6) remain essentially unchanged, except that, instead of the relation ∞ ⇒ for code, they use the relation ∞ ⇒ n with any natural n.
It is worth remarking that the ∞ ⇒ relation defines the machine computations that do not terminate in the expected way, and it would be ideal to work directly with it in Coq; however, because of the Coq guard condition, this is not possible. This is why we have defined the auxiliary relation, just to overcome the Coq guard condition. Since we have defined the auxiliary relation (more precisely, the ∞ ⇒ n relation for code and the corresponding relation for instructions) in a very similar way to the ∞ ⇒ relation, we have used a very similar notation; nonetheless, we should be careful and notice the distinction between the two relations.
The following lemma states that the two original mutually dependent relations are equivalent to the two auxiliary mutually dependent relations.

Lemma 4.
Let ∆ be a machine environment, s a stack, i a machine instruction, c a machine code, and n any natural. The if part consists of the following two cases: Assuming 2, since the instructions' definitions of the two relations are analogous, the proof proceeds simply by case analysis; the cases are proved directly by definition, using 2 to obtain the required ∆, s ⊢ c ∞ ⇒ n code premises.

2.
If ∆, s ⊢ c ∞ ⇒ then ∆, s ⊢ c ∞ ⇒ n . By coinduction. The main idea is to use the coinduction hypothesis together with the ∞ ⇒ n -perform rule, and to apply 1 to obtain ∆, s ⊢ i ∞ ⇒ premises when necessary.
The only if part consists of the following two cases: Assuming 4, the proof is analogous to that of 1, going in the opposite direction (and using 4 instead of 2).

If
The proof proceeds by coinduction.
The key idea is to use the coinduction hypothesis together with the ∞ ⇒ n definition, in particular with the ∞ ⇒ n -perform rule, employing the assumption in the step where it is required, and applying 3 to obtain the ∆, s ⊢ i ∞ ⇒ premises when necessary.
The details of this proof can be consulted in Appendix A.

Compilation Correctness
To prove correctness, we must use the auxiliary relation ∞ ⇒ n along with a measure. By an argument analogous to that for the small-step MSECD, the same measure also works here. In this way, the correctness for the non-termination case can be formulated by the following lemma: Lemma 5 (Correctness for non-termination (auxiliary)). Let Ω be a nameless environment, ∆ a machine environment, d a nameless expression, c a machine code. If then, for all stacks s, Proof outline. We proceed by coinduction. The main idea is to mimic the MiniML dB evaluation of d in the machine while evaluating c. To carry out this idea, for the finite parts of d (if any), we employ Theorem 2, whereas for the infinite parts of d we apply the coinduction hypothesis. In addition, we use the ∞ ⇒ n definition, including the ∞ ⇒ n -sleep and ∞ ⇒ n -perform rules, as necessary.
A complete proof of this lemma appears in Appendix A. Finally, we enunciate the correctness theorem for the non-termination case of the machine, using the ∞ ⇒ relation directly in the following manner: Theorem 4 (Correctness for non-termination).
Let Ω be a nameless environment, ∆ a machine environment, d a nameless expression, c a machine code. If then, for all stacks s, Proof outline. The result is a direct consequence of Lemma 5 followed by Lemma 4.
A detailed step-by-step proof can be found in Appendix A. This theorem is written in Coq as follows: Theorem Compilation_CNS_correctness:

Abstract to Coq Translation Algorithm
At this point, by means of our Mini-ML compiler, we have shown how the corresponding Coq formalization can be realized from the (coinductive) natural semantics definition of each compiler component. It is our intention here to generalize this method and write it formally as an algorithm.
Algorithm 1 expresses how to translate a (coinductive) natural semantics definition of a compiler to its corresponding formalization in Coq.
We observe that the steps of the algorithm possess a high level of abstraction. This is favorable in the sense that it provides freedom in how to actually implement them. We can even take advantage of this freedom by applying previous work: for instance, applying the results in [67], step 14 could be performed by generating a function from the Inductive definition I N corresponding to the natural semantics N.
Analyzing the algorithm, as we can note in step 35, the case in which the target language R of a translation T is a postfix representation requires a special treatment that merits explanation. Let V be the source language of T, d a construct of V, and c the translation of d into R. If R is a prefix representation, then, when reasoning about the evaluation of c in Coq, a constructor s associated with the construct must necessarily be used at the front, when starting the evaluation; in this way, the guard condition is fulfilled. Instead, if R is a postfix representation, then a constructor s associated with the construct must necessarily be used, but behind, at the end of the evaluation; in this way, Coq's guard condition is not fulfilled, since it requires the constructor to be used at the start (at the front). For instance, let Plus e1 e2 be a construct in MiniML and let Plus_dB d1 d2 be its translation into MiniML dB ; then the evaluation of Plus_dB d1 d2 will start using a Plus_dB constructor associated with the addition, at the start, and Coq's guard condition will be satisfied. Instead, let Plus_dB d1 d2 be a construct in MiniML dB and c1++c2++IAdd its translation into code of the big-step MSECD; then the evaluation of c1++c2++IAdd will not start using an IAdd constructor associated with the addition, even though, potentially, it will eventually be used at the end. Therefore, in this latter case, we must find a way to express, and convince Coq's guard condition, that the constructor will in fact be used, but at the end of the evaluation of c.
This is exactly what the solution presented in Section 3.3 does: if the auxiliary relation is used, then initially the sleep rule constructor can be used, which allows starting the evaluation of c without using a constructor s associated with the construct, while ensuring that, potentially, eventually, such a constructor s will in fact be used (which is expressed in the constructor corresponding to the perform rule); the measure function applied to d indicates the number of constructors that will be used at the end of the evaluation of c. This is certainly a weakness of Coq's guard condition, which turns out to be too inflexible in this case, and that is why we have had to resort to an indirect solution.
In fact, in our compiler, in the MiniML to MiniML dB translation it was not necessary to use this indirect solution at all, because MiniML dB is a prefix representation. Instead, it was indeed necessary to use it in the MiniML dB to big-step MSECD machine code generation, because machine code is a postfix representation.

Algorithm 1: Translation of a Compiler Definition to Coq (first part)
...
9    Emit an Inductive definition I N with type Prop;
10   foreach rule r ∈ N do
11     Add the constructor s, corresponding to the rule r, to I N ;
12   end
13   Emit a Lemma that states N determinism;
14   Emit a Fixpoint function (interpreter) i that mimics the N natural semantics;
15   Emit a Lemma enunciating that the interpreter i is sound regarding N natural semantics;
16   Emit a Lemma enunciating that the interpreter i is complete regarding N natural semantics;
17   Let CoN be the coinductive natural semantics of L;
18   Emit a CoInductive definition C CoN with type Prop;
19   foreach rule r ∈ CoN do
20     Add the constructor s, corresponding to the rule r, to C CoN ;
21   end
22   Emit a Lemma enunciating that the interpreter i is sound regarding CoN coinductive natural semantics;
23   Emit a Lemma enunciating that the interpreter i is complete regarding CoN coinductive natural semantics;
24   Emit an Extraction command with the interpreter i as argument;
25 end
26 ...

Algorithm 1: Translation of a Compiler Definition to Coq (second part)
26 foreach translation T do
27   Emit an Inductive definition I T with type Prop;
28   foreach translation rule r ∈ T do
29     Add the constructor s, corresponding to rule r, to I T ;
30   end
31   Emit a Lemma that states T determinism;
32   Emit a Fixpoint function (compiler) c that mimics the T translation;
33   Emit a Theorem enunciating the translation T correctness for termination;
34   Let R be the target language of the translation T;
35   if R is a postfix representation then
36     Emit an auxiliary CoInductive definition C ′ CoN analogous to C CoN , but including a natural n as additional term;
37     Add the constructor s, corresponding to the adapted improved sleep rule, to C ′ CoN ;
38     Add the constructor p, corresponding to the adapted perform rule, to C ′ CoN ;
39     foreach rule r ∈ CoN do
40       if r is subsumed by the improved sleep rule or the perform rule then
41         Remove the constructor s, corresponding to the rule r, from C ′ CoN ;
42       end
43     end
44     Emit a Lemma that states the equivalence between C CoN and C ′ CoN ;
45     Let V be the source language of T;
     ...
53     Emit a Lemma enunciating the translation T correctness for non-termination (using C ′ CoN and h);
54   end
55   Emit a Theorem enunciating the translation T correctness for non-termination (using C CoN );
56   Emit an Extraction command with the c compiler as argument;
57 end

Conclusions
Natural semantics is a simple, easy, and intuitive formalism widely known and used in the literature to define the semantics of programming languages.
In this work, we extended (coinductive) natural semantics to present it as a unifying framework for the verification of total correctness of compilers in Coq. In this way, we present a solution to the problem of having a simple, easy, clear, and intuitive framework to perform this task in this proof assistant. By means of this framework, one can obtain a standalone executable verified compiler.
Although we have not illustrated it here, natural semantics can also be used to express and verify the static semantics of a language. For instance, in [68] the Mini-ML static semantics is verified (although it is not possible to obtain a verified semantic analyzer). In future work, we plan to extend this use of natural semantics to make it possible to obtain a verified semantic analyzer.
To have a full compiler framework, we must address lexing and parsing too. We envision that natural 'semantics' can also be used to perform these tasks. This inspiration comes from the observation that, as stated by Kahn [11], natural deduction is at the heart of natural semantics, so we looked for a natural deduction-based parsing strategy; fortunately, a parsing technique with these features has existed for a long time. In the logic programming community, parsing as deduction [69,70] is a well-known and established natural deduction-based parsing framework. Therefore, since both natural semantics and parsing as deduction are based on natural deduction, we believe that we can abstract both into a single formalism able to express both syntax and semantics. In this way, we would achieve natural 'semantics' as a full compiler verification framework in Coq.

Acknowledgments: The first author sincerely thanks Xavier Leroy for inspiring him to work with coinductive natural semantics in Coq; in particular, for revising his preliminary results regarding the equivalence of co ⇒ with the union of ⇒ and ∞ ⇒ ; and, finally, for encouraging him to write up his results and extend his work. The first author is grateful to David de Frutos-Escrig for providing advice during this work. Special thanks to Veronica Dahl, who offered guidance and useful insight into Prolog parsing techniques. Our deepest gratitude goes to our anonymous reviewers, whose accurate, appropriate, and constructive comments have led to a valuable improvement of this work.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A. Proofs
for all codes c and stacks s.

Proof. By induction on
Base cases: (i) d = n. By hypothesis, Ω ⊢ n ⇒ n. Since by definition n = IConst n and n = n, we must prove (IConst n ⋅ c, Ω, s) +→ (c, Ω, n ⋅ s), which follows directly from the definition of the machine small-step transition → for the IConst n instruction.
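The transition invoked in this base case can be rendered in Coq roughly as follows (a sketch with hypothetical type names; machine values are collapsed to nat and the environment type is simplified):

```coq
Inductive instr : Type := IConst (n : nat).
Definition code  := list instr.
Definition env   := list nat.
Definition stack := list nat.

(* Executing IConst n pushes n on the stack, leaving the code tail and
   the environment untouched, exactly as required by the base case. *)
Inductive step : code * env * stack -> code * env * stack -> Prop :=
| step_IConst : forall n c W s,
    step (IConst n :: c, W, s) (c, W, n :: s).
```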
The remaining cases follow similarly: in each case, unfold the machine transition → for the corresponding instruction, use transitivity of +→ where needed, and apply the induction hypothesis.
Inductive case: (a) Apply the induction hypothesis to ∆ 1 , s 1 ⊢ c ⇒ (∆ 2 , s 2 ) and to ∆ 2 ,

Theorem A2 (Correctness for termination). Let Ω be a nameless environment, ∆ a machine environment, d a nameless expression, c a machine code, and v a nameless value. If Ω ⊢ d ⇒ v, d ⇓ c, and Ω ∆, then there exists a machine value v m such that v v m and, for all stacks s, ∆, s ⊢ c ⇒ (∆, v m ⋅ s).

Proof. Base cases: (i) d = n. Hypotheses: Ω ⊢ n ⇒ n, n ⇓ c, Ω ∆. We claim that there exists v m = n; the proof of n n follows by definition. From the hypothesis n ⇓ c, by definition of ⇓, necessarily c = IConst n. We are now in a position to prove the main goal ∆, s ⊢ IConst n ⇒ (∆, n ⋅ s), which follows by the definition of ⇒ for IConst.
(ii) d = b. Analogous to case i.
Hypotheses: Ω ⊢ b ⇒ b, b ⇓ c, Ω ∆. We claim that there exists v m = b; the proof of b b follows by definition. From the hypothesis b ⇓ c, by definition of ⇓, necessarily c = IConstb b. We are now in a position to prove the main goal ∆, s ⊢ IConstb b ⇒ (∆, b ⋅ s), which follows by the definition of ⇒ for IConstb.
(iii) The proof relies on the fact that if Ω ⊢ i → v, Ω ∆, and v v m, then ∆ ⊢ i → v m, which is proved by straightforward induction on Ω ⊢ i → v. (Alternatively, it can be proved by induction on Ω.) (iv) d = λ.d. Analogous to case i.
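The auxiliary fact used in case (iii) can be sketched in Coq as follows (hypothetical names: the value correspondence is abstracted as an arbitrary relation R, environments as lists, and lookup as nth_error):

```coq
Require Import List.

(* If environments are related pointwise, looking up index i in the first
   yields a value related to the one found at i in the second. *)
Lemma lookup_related :
  forall (R : nat -> nat -> Prop) (i : nat) (W D : list nat) (v : nat),
    Forall2 R W D ->
    nth_error W i = Some v ->
    exists vm, nth_error D i = Some vm /\ R v vm.
Proof.
  intros R i; induction i as [|i IH]; intros W D v HR Hl;
    destruct HR as [|x y W' D' Hxy HR']; simpl in *; try discriminate.
  - inversion Hl; subst. exists y. split; [reflexivity|assumption].
  - eapply IH; eauto.
Qed.
```

The induction is on the index i, mirroring the alternative proof by induction on Ω mentioned above.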
We claim that there exists v m = n 1 ⋆ n 2 ; the proof of n 1 ⋆ n 2 n 1 ⋆ n 2 follows by definition.
We are now in a position to prove the main goal ∆, s ⊢ c 1 ⋅ c 2 ⋅ IApp ⇒ (∆, v m ⋅ s).
(ii) If m ∞→ n then m ∞→ . By coinduction.
Hypothesis: m ∞→ n . We have to prove m ∞→ .