Data-Rate Constrained Observers of Nonlinear Systems

In this paper, the design of a data-rate constrained observer for a dynamical system is presented. This observer is designed to function both in discrete time and continuous time. The system is connected to a remote location via a communication channel which can transmit limited amounts of data per unit of time. The objective of the observer is to provide estimates of the state at the remote location through messages that are sent via the channel. The observer is designed such that it is robust toward losses in the communication channel. Upper bounds on the required communication rate to implement the observer are provided in terms of the upper box dimension of the state space and an upper bound on the largest singular value of the system’s Jacobian. Results that provide an analytical bound on the required minimum communication rate are then presented. These bounds are obtained by using the Lyapunov dimension of the dynamical system rather than the upper box dimension in the rate. The observer is tested through simulations for the Lozi map and the Lorenz system. For the Lozi map, the Lyapunov dimension is computed. For both systems, the theoretical bounds on the communication rate are compared to the simulated rates.

In Section 2, we define the class of systems to be observed, introduce the observer notation, and define observability under data-rate constraints. Section 3 introduces the proposed observer. In Section 4, preliminary criteria for observability of the plant are offered, which are converted into a fully analytical form in Section 5. Section 6 illustrates the general theory with two examples: the Lozi map and the Lorenz system. For both systems, the necessary data rates are computed and simulations that confirm the theoretical results are provided.

Problem Statement
In this section, we state the problem. The general setting is that of a dynamical system and two peers connected via a communication channel. Both peers have full knowledge of the dynamics of the system, but only one of them has direct access to the system's current state and fully measures it. The task is to provide estimates of the state to the other peer by sending messages through the communication channel. This channel is discrete (i.e., the set of transmittable messages is finite) and has delays, losses, and a limited data rate. The effects of data-rate constraints and delays are explicitly modeled in this section, whereas the issue of message losses is discussed separately in Remark 1. Two types of delays are considered in turn. A processing delay is incurred since the channel can transmit only a given finite number of bits c per unit time, and so a B-bit message can wholly arrive at the receiving end of the channel not earlier than B/c time units after its transmission commences. A transmission delay is caused by interruptions of the ideal routine of bit transfer, which may result from, e.g., resolving competition with third parties for shared resources of the communication medium or network.
In order to solve the stated problem, we will develop a particular type of observer. In this section, we will only introduce notations concerned with the observer. Operation of its components will be described in the next section.

Observed Dynamical System
We consider a dynamical system {ϕ t } t∈T on an open set S ⊂ R n , paying special attention to a certain subset S 0 ⊂ S. Here,

• T is the set of times, which is either Z + or R + ;
• ϕ t : S → S is the evolution function that gives the system state x(t) = ϕ t (x 0 ) at time t ∈ T, provided that the initial state is x 0 ;
• S 0 is the focus of our interest in the system.
Specifically, we are interested only in trajectories that start in S 0 and remain there afterwards:
x(t) ∈ S 0 ∀t ∈ T. (1)
Assumption 1. The dynamical system at hand is time-invariant: ϕ t ∘ ϕ s = ϕ t+s ∀t, s ∈ T.
Assumption 2. The set S 0 is bounded and forward invariant, i.e., ϕ t (S 0 ) ⊂ S 0 ∀t ∈ T, and its closure S̄ 0 lies in S.
Our main interest is in systems for which complex and possibly chaotic long-horizon dynamics arise from rather regular short-horizon behavior. The last feature is partly substantiated by the following.
Assumption 3. For any t ∈ T, the evolution function ϕ t : S → S is continuously differentiable.
Depending on the choice of the time set T, two types of dynamical systems will be considered.
(1) Discrete-time systems: T = Z + and the system evolves by iterating a map ψ : S → S, i.e., x(t + 1) = ψ(x(t)), so that ϕ t is the t-fold composition of ψ. In this case, Assumption 1 holds, and Assumption 3 is met if and only if ψ is continuously differentiable.
(2) Continuous-time systems: T = R + and the evolution of the system is described by an ordinary differential equation (ODE):
ẋ = f (x), (2)
where f : S → R n is a continuously differentiable vector field. So for any x 0 ∈ S, the solution x(t, x 0 ) of the Cauchy problem x(0) = x 0 for the ODE (2) exists and is unique; it can be extended to the right on the maximal interval [0, T(x 0 )). However, Equation (2) does not necessarily define a dynamical system on S, since T(x 0 ) = ∞ need not hold for all x 0 ∈ S. Insofar as the right-hand side of Equation (2) is not defined outside S, the extendability T(x 0 ) = ∞ means, in particular, that the solution x(t, x 0 ), x 0 ∈ S never leaves the set S; i.e., the set S is forward invariant. So when dealing with an ODE, we always assume that all its solutions that start in S at t = 0 can be extended on [0, ∞) while remaining in S.
The following proposition can be proved by retracing the arguments from Section 2.2 in [37].

Proposition 1.
Whenever the vector field f : S → R n is smooth (i.e., continuously differentiable) and the ODE (2) has the just-stated extendability and invariance properties, this ODE gives rise to a dynamical system {ϕ t } t∈R + on S (ϕ t (x 0 ) := x(t, x 0 )), which satisfies Assumptions 1 and 3. Moreover, ϕ t (x) and its first derivatives, with respect to t and x, are continuous functions of t and x.

Architecture of the Observer, Notations, and General Traits of the Communication Channel
We assume that the current state x(t) is observed in full at a certain measurement site but is needed at time t at a remote location, to which data can be communicated only via a discrete channel. The channel is discrete in the sense that, first, it can carry only messages drawn from a finite set and, second, messages can be communicated only one at a time: while the channel is busy transmitting a previous message, it is closed for the next transmission.
The purpose of the observer is to arrange and manage transmissions across the channel and to finally build, at time t and at the remote location, an estimate x̂(t) of the current state x(t) with a pre-specified exactness. The formal definition of the last notion is as follows.

Definition 1.
A number ε > 0 is called an "exactness of observation" if there exists t 0 < ∞ such that ‖x(t) − x̂(t)‖ ≤ ε ∀t ≥ t 0 .

As is illustrated in Figure 1, an observer O is defined as a composition of a sampler S, a quantizer Q, and a decoder D; the sampler and quantizer together form a coder C, so that O consists of C = (S, Q) and D.

• The sampler and quantizer are built at the measurement site and have access to the dynamics {ϕ t } of the system, the set S 0 , the current state x(t), and the desired exactness of observation ε.
• The decoder is built at the remote site L and has access to the system dynamics {ϕ t }, the set S 0 , the desired exactness of observation ε, and the messages transmitted across the channel.

The roles and structures of the observer components are as follows.
The Sampler generates the (sampling) instants s j ∈ T at which transmission of another message e(s j ) is initiated (Equation (3)). The sampler also builds a finite alphabet A j from which the message e(s j ) is drawn at time s j for subsequent communication across the channel (Equation (4)); the alphabet is thus permitted to depend on s j .
The Quantizer forms the message e(s j ) ∈ A j to be dispatched (Equation (5)).
The Decoder generates the state estimate x̂(t) from the previously received messages, i.e., from the collection {(e(s j ), ŝ j )} j∈J(t) (Equation (6)), where ŝ j is the time when the message e(s j ) arrives at the remote site L and J(t) := {j : ŝ j ≤ t}. If no message has arrived yet, J(t) = ∅ and the meaningless {(e(s j ), ŝ j )} j∈∅ is replaced by an arbitrarily pre-specified symbol, e.g., 0 ∈ Z + .
The observer has to fit the constraints and capabilities of the channel, which are as follows.
(c.1) The channel correctly transfers any message e(s j ) ∈ A j to the receiving end provided that the message processing time τ pr j and the size of the message are in balance:
log 2 |A j | ≤ b(τ pr j ). (7)
Here, |A j | is the number of elements of A j , and b(τ) is a channel-dependent function that gives the number of bits processable by the channel during any time period of length τ.
(c.2) As the processing time increases to infinity, the average number of bits transmittable per unit of time stabilizes and converges to a certain value c ∈ R + , called the (bit-rate) channel capacity:
lim τ→∞ b(τ)/τ = c. (8)
(c.3) The channel is closed for the next message until all bits of the current message e(s j ) have been processed, but is open afterwards.
(c.4) On its way to the destination point L, any message e(s j ) incurs a transmission delay τ tr j :
ŝ j = s j + τ pr j + τ tr j , (9)
where ŝ j is the time when the whole of the message e(s j ) arrives at L, and the processing time τ pr j plays the role of a processing delay here.
(c.5) The transmission delays are upper-bounded: τ tr j ≤ τ tr + < ∞.
To correctly transmit messages, the sampler should balance the chosen alphabet and the message processing time τ pr j in accordance with Equation (7), and respect the requirement: s j+1 ≥ s j + τ pr j .
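As a rough illustration of the constraints (c.1)-(c.5), the following Python sketch computes when a message arrives at the remote site and whether a sampling schedule respects s j+1 ≥ s j + τ pr j . It assumes the simplest channel model b(τ) = c·τ, which is only one of many channels compatible with (c.2); the function names are hypothetical and do not come from the paper.

```python
import math


def processing_time(alphabet_size: int, c: float) -> float:
    """Smallest tau_pr with log2(alphabet_size) <= b(tau_pr), assuming b(tau) = c * tau."""
    return math.log2(alphabet_size) / c


def arrival_time(s_j: float, alphabet_size: int, c: float, tau_tr: float) -> float:
    """Arrival time = dispatch time + processing delay + transmission delay."""
    return s_j + processing_time(alphabet_size, c) + tau_tr


def schedule_is_admissible(instants, alphabet_size: int, c: float) -> bool:
    """Check s_{j+1} >= s_j + tau_pr for all consecutive sampling instants."""
    tau_pr = processing_time(alphabet_size, c)
    return all(nxt >= cur + tau_pr for cur, nxt in zip(instants, instants[1:]))


if __name__ == "__main__":
    # A 256-symbol alphabet needs 8 bits; the capacity c = 4 is an assumed value.
    print(processing_time(256, 4.0))
    print(arrival_time(0.0, 256, 4.0, tau_tr=0.5))
    print(schedule_is_admissible([0.0, 2.0, 4.0], 256, 4.0))
```

Under this linear model, enlarging the alphabet lengthens the processing delay logarithmically, which is the trade-off the sampler has to balance.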

Observability via Channels with Limited Bit-Rate Capacity
The objective of the observer is to guarantee observability in the sense of the following definition.

Definition 2.
A system {ϕ t } t∈T is said to be observable on the set S 0 via a communication channel if, for any ε > 0, there exists an observer of the form given by Equations (3)-(6) that operates via this channel and ensures the requested exactness of observation ε for any trajectory satisfying Equation (1).
Observability is classically defined as a property of the system itself. However, in the current context, the finite data rate makes observability critically dependent on the employed communication channel. So, following [18,35,36,38], observability is introduced as a property of the pair "system + channel". In Definition 2, the reference to the existence of an observer conveys the idea of utilizing the properties of the system and the capabilities of the channel as effectively as possible: if their clever use can yield a reliable and exact state estimate at the receiving end of the channel, the pair is called observable.

Design of the Proposed Observer
Since we are interested only in trajectories satisfying Equation (1), our discussion of the observer design is confined to the case where x(s j ) ∈ S 0 ∀j in Equations (3)-(5).
We will introduce an observer that is determined by the following four entities:
(e.1) s + ∈ T: the period s j+1 = s j + s + between consecutive dispatches of messages via the channel;
(e.2) P: a symmetric and positive definite n × n-matrix;
(e.3) δ(ε, s + ) > 0: a function of ε > 0 and s + for which Equation (10) holds, i.e., an estimation error of at most δ(ε, s + ) in the norm ‖·‖ P grows to at most ε over any time horizon of duration up to s + ;
(e.4) {B P δ (q k )} K k=1 : a finite covering of the compact set S 0 with K = K(ε, s + ) balls (with respect to the norm ‖·‖ P ) centered at q k ∈ S 0 , each of radius δ = δ(ε, s + ).
Here, the centers q k may also depend on ε and s + .
Lemma 1. Let Assumptions 2 and 3 hold and, for any τ ∈ T, let the derivatives of ϕ t be bounded over t ∈ T, t ≤ τ and x from some τ-dependent neighborhood of S 0 . Then, a function δ(ε, s + ) > 0 with the property (10) exists.
The proof of this lemma follows from the continuous differentiability of ϕ t and the boundedness of the derivatives.
The proposed observer operates as follows.
(o.1) The sampler S (Equations (3) and (4)) carries out the following actions: it dispatches messages periodically, s j := j s + , and uses the fixed alphabet A j := {1, . . . , K}; i.e., the alphabet substantiates the numbering of the balls from (e.4).
(o.2) The quantizer Q finds an element B P δ (q k ) of the covering from (e.4) that contains x(s j+1 ) = ϕ s + (x(s j )) ∈ S 0 and sends its index k over the channel: e(s j ) := k.
(o.3) The decoder D performs the following operations at time t ∈ T:
- Extracts the index k from the last message received at a time θ ≤ s i , where i := ⌊t/s + ⌋ (if no message has been received yet, k is assigned an arbitrarily pre-specified value, e.g., 1);
- By using the centers from (e.4), forms the current state estimate x̂(t) := ϕ t−s i (q k ). (11)
Several comments on this observer are as follows:
• In (o.2), we do not address the case x(s j+1 ) ∉ S 0 for the reason stated at the onset of the section.

• The proposed design assumes that both the coder and decoder have access to s + from (e.1) and the covering from (e.4).
• The observer uses a fixed alphabet {1, . . . , K}, which is shared by the coder and the decoder.
• The quantizer sends data about the estimate q k not of the current state x(s j ), but of the future state x(s j+1 ), which is computed from the measured x(s j ) by using the known transition map ϕ s + (·).
• The idea behind this relies on the expectation that these data will be received prior to s j+1 and put to use in due time, at t = s j+1 . Then, the exactness of estimation will be δ at this time.
• These data are also used to estimate the state on the subsequent time interval {t ∈ T : s j+1 ≤ t < s j+2 } by applying the matching transition map to the just-discussed estimate at time t = s j+1 . By Equation (10), this guarantees the exactness of estimation ‖x(t) − x̂(t)‖ P ≤ ε on this interval.
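The operation of the coder and decoder described above can be sketched in Python for a toy discrete-time system. Everything specific here is an assumption made for illustration: the logistic map on S 0 = [0, 1] stands in for the plant, P = 1, the covering consists of K = 32 intervals of radius δ = 1/64, the period is s + = 2 steps, and every message is assumed to arrive before the next sampling instant. Since the logistic map has Lipschitz constant 4 on [0, 1], the choice δ = ε/4 is one admissible instance of (e.3) for this horizon.

```python
import math

DELTA = 1.0 / 64.0            # ball radius delta (assumed value)
K = round(1.0 / (2 * DELTA))  # number of balls (intervals) covering [0, 1]
S_PLUS = 2                    # dispatch period s+ in time steps (assumed value)


def phi(x: float, t: int = 1) -> float:
    """t-step transition map of the logistic map x -> 4x(1 - x) (illustrative plant)."""
    for _ in range(t):
        x = 4.0 * x * (1.0 - x)
    return x


def center(k: int) -> float:
    """Center q_k of the k-th ball, k = 1..K."""
    return (2 * k - 1) * DELTA


def quantize(x: float) -> int:
    """Index k of a ball of radius DELTA that contains x in [0, 1]."""
    return min(int(x / (2 * DELTA)), K - 1) + 1


def run_observer(x0: float, n_messages: int) -> float:
    """Return the largest estimation error over all decoded time steps."""
    worst = 0.0
    x_sj = x0
    for _ in range(n_messages):
        x_next = phi(x_sj, S_PLUS)   # coder predicts x(s_{j+1}) from x(s_j)
        k = quantize(x_next)         # message e(s_j) = k
        for dt in range(S_PLUS):     # decoder on [s_{j+1}, s_{j+2})
            estimate = phi(center(k), dt)
            worst = max(worst, abs(phi(x_next, dt) - estimate))
        x_sj = x_next
    return worst


if __name__ == "__main__":
    eps = 4 * DELTA
    print(run_observer(0.2, 500) <= eps)
    print(math.log2(K) / S_PLUS, "bits per time step")
```

In this sketch each message carries log 2 K = 5 bits and is sent every two steps, so the coder needs 2.5 bits per time step; a smaller ε or a longer period s + would require a finer covering.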
In order for the proposed observer to operate correctly via a given communication channel, the message e(s j ) initiated at time s j should be fully processed and received prior to s j+1 . (This, in particular, implies that the messages arrive in order: ŝ j+1 > ŝ j .) Due to (c.1), (c.4), and (c.5), correct operation occurs whenever there exists a solution τ pr ∈ T to the following two inequalities:
log 2 K(ε, s + ) ≤ b(τ pr ), τ pr + τ tr + ≤ s + . (12)
We recall that τ tr + is an upper bound on the transmission delay. In the typical case where K(ε, τ) is an increasing function of τ, and given the freedom to choose s + , Equation (12) reduces to a single inequality:
log 2 K(ε, τ pr + τ tr + ) ≤ b(τ pr ). (13)
In any case, the inequalities depend on both the system (via K(·, ·)) and the channel (via b(·) and τ tr + ). This means that correct operation in fact requires a certain level of conformity between the system and the channel.
The conditions for correct operation will be fleshed out in the next section. We conclude this section with a comment on observability and a remark on how the observer proceeds when a loss occurs.
Observation 1. The following statements are true:
(i) Let the proposed observer operate correctly for given ε > 0 and s + . Then, for any trajectory satisfying Equation (1), the desired exactness of observation ε is ensured with respect to the norm ‖·‖ P ;
(ii) Let a communication channel be given. Also, let any sufficiently small ε > 0 be coupled with some s + so that the proposed observer with these ε and s + operates correctly via the channel at hand. Then, the system {ϕ t } t∈T is observable on the set S 0 via this communication channel.
All observability conditions that will be established in this paper are nothing but implications of (ii) in this observation. This means that these conditions ensure the correct operation of the proposed observer under a proper and feasible choice of its parameters. In other words, whenever these conditions are satisfied, a reliable state estimate can be obtained by means of this observer.

Remark 1.
Suppose that messages may be lost when transmitted over the communication channel. If a loss does occur, the center q k carried by the last received message is used in Equation (11) not only during the intended time interval (from s i to s i+1 ), but also during the subsequent time intervals until the next successful transmission. Certainly, there is no guarantee that the estimation accuracy will be within the desired ε on these extra intervals. However, as soon as a new message arrives, this accuracy is restored due to the very design of the observer. This robustness against losses is achieved without any feedback in the communication channel (i.e., the coder is not notified when losses occur on the channel), unlike many competing schemes [31][32][33][34].
This remark extends to the situation where the message is not lost but corrupted, so that an incorrect q k is occasionally used in Equation (11).
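The behavior described in Remark 1 can be illustrated with a small Python sketch. It reuses an illustrative setting that is not taken from the paper: the logistic map on [0, 1] as the plant, a covering by 32 intervals of radius δ = 1/64, and a period of two steps. The set of lost message indices is hypothetical, and the way a stale index is reused is one possible reading of the remark. The coder never learns about the losses; the decoder keeps the last received index and recovers as soon as a fresh one arrives.

```python
DELTA = 1.0 / 64.0   # ball radius (assumed value)
K = 32               # number of balls covering [0, 1]
S_PLUS = 2           # dispatch period in time steps (assumed value)


def phi(x, t=1):
    """t-step transition map of the logistic map (illustrative plant)."""
    for _ in range(t):
        x = 4.0 * x * (1.0 - x)
    return x


def center(k):
    """Center q_k of the k-th ball, k = 1..K."""
    return (2 * k - 1) * DELTA


def quantize(x):
    """Index of a ball of radius DELTA that contains x in [0, 1]."""
    return min(int(x / (2 * DELTA)), K - 1) + 1


def simulate(x0, n_messages, lost):
    """Return a list of (message index, delivered?, worst error on its interval)."""
    report, x_sj, k = [], x0, 1          # k = 1 is the pre-specified default index
    for j in range(n_messages):
        x_next = phi(x_sj, S_PLUS)       # coder side: no feedback about losses
        delivered = j not in lost
        if delivered:
            k = quantize(x_next)         # a fresh index replaces the stale one
        err = max(abs(phi(x_next, dt) - phi(center(k), dt)) for dt in range(S_PLUS))
        report.append((j, delivered, err))
        x_sj = x_next
    return report


if __name__ == "__main__":
    for j, delivered, err in simulate(0.2, 12, lost={3, 4, 10}):
        print(j, "ok  " if delivered else "LOST", round(err, 4))
```

On intervals served by a delivered message the error stays within 4δ, whereas no such guarantee holds on intervals that rely on a stale index.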

Criteria for Observability of the System
A problem with the conditions (12) and (13) is that they use the function K(·, ·) from (e.4), for which there is a lack of constructive techniques to compute it, or at least to estimate it, from its "parents": the dynamics {ϕ t } and the set S 0 . In this section, we make a first step to overcome this deficit: whereas the function K(·, ·) is a by-product of the interplay of the dynamics and the set, we recast the conditions into a form where separate characteristics of the dynamics and the set are employed.

The Size of Finite Covering
Inspired by (e.4), we start with the question: How many balls of a common radius δ are needed to cover a given bounded set? Though not articulated thus far, our interest in fact focuses on high exactness of estimation, δ ≈ 0. This, in turn, motivates asymptotic analysis as δ → 0. A response to these concerns is partly given by the concept of the upper box-counting dimension d̄ B , which is defined as follows.

Definition 3 ([21]). The upper box-counting dimension d̄ B (F) of a bounded set F ⊂ R n is given by
d̄ B (F) := lim sup δ→0 [log N δ (F)] / [− log δ]. (14)
Here, N δ (F) can be defined in any of the following ways, with all of them resulting in a common value (14):
(i) The smallest number of closed balls of radius δ that cover F;
(ii) The smallest number of closed balls of radius δ and centers in F that cover F;
(iii) The smallest number of cubes of side δ that cover F;
(iv) The number of δ-mesh cubes that intersect F;
(v) The smallest number of sets of diameter at most δ that cover F;
(vi) The largest number of disjoint balls of radius δ with centers in F.
Also, the quantity (14) does not depend on the choice of the norm in (i),(ii), (v), and (vi).
It follows that for arbitrarily small κ > 0, the number of δ-balls with centers in F that are needed to cover F does not exceed δ^(−(d̄ B (F)+κ)) for all sufficiently small δ > 0.
As is well known [21], d̄ B (F) ≤ n for any bounded F ⊂ R n . The box-counting dimension may assume non-integer values; for example, d̄ B (F) = 1/ log 2 3 for the middle-third Cantor set F ⊂ R.
Our particular interest is in dynamical systems and their invariant sets S 0 with d̄ B (S 0 ) < n; this case does hold for some chaotic systems and complex attractors S 0 .
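The Cantor-set value quoted above can be checked with a short Python sketch based on box counting. It is an illustration under simplifying assumptions: the Cantor set is represented by the left endpoints of its level-m construction intervals, boxes are half-open mesh intervals of side 3^(−k) with k ≤ m, and the function names are hypothetical. Integer arithmetic in base 3 keeps the count exact.

```python
import math
from itertools import product


def cantor_points(m: int):
    """Left endpoints of the level-m Cantor intervals, stored as integers n (point = n / 3**m)."""
    return [sum(d * 3**i for i, d in enumerate(digits))
            for digits in product((0, 2), repeat=m)]


def box_count(points, m: int, k: int) -> int:
    """Number of mesh intervals of side 3**-k that contain at least one of the points."""
    return len({n // 3 ** (m - k) for n in points})


def box_dimension_estimate(m: int, k: int) -> float:
    """log N_delta / (-log delta) at the scale delta = 3**-k."""
    n_boxes = box_count(cantor_points(m), m, k)
    return math.log(n_boxes) / (k * math.log(3))


if __name__ == "__main__":
    for k in (2, 4, 6):
        print(k, box_count(cantor_points(8), 8, k), box_dimension_estimate(8, k))
    print(1 / math.log2(3))
```

At every scale 3^(−k) exactly 2^k boxes are occupied, so the ratio in (14) equals log 2 / log 3 = 1/ log 2 3 already at finite scales; for less regular sets the ratio converges only in the limit δ → 0.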

Balance between the Initial and Forthcoming Estimation Exactness
Now, we study relations between the initial exactness δ of the estimate x̂ of the state x and the implied forthcoming exactness ε during a time horizon of duration s + . This study is aimed at building the component (e.3) of the proposed observer. We recall that this component is a function δ(ε, s + ) for which Equation (10) holds.
The growth rate g(S 0 ) of the system {ϕ t } on the set S 0 is defined by Equation (15), where A θ (x) is the Jacobian matrix of the map ϕ θ (·) at point x and log 2 ∞ := ∞. It is well-defined for all sufficiently small δ, since x ∈ S in Equation (15), thanks to the following.
The proof of this lemma is trivial and is therefore omitted. In Equation (15), the limit lim δ→0 exists since the subsequent quantity decays as δ decreases. Since all norms ‖·‖ in the space of n × n-matrices are equivalent, it is easy to see that g(S 0 ) does not depend on the choice of the norm.
Among other components, the proposed observer uses a function δ(ε, s + ) with the special property described in (e.3). Now, we show how such a function can be built from g(S 0 ).
Lemma 3. Let g(S 0 ) < ∞. For any ĝ > g(S 0 ) and any positive definite n × n-matrix P, there exists a function δ(ε, s + ) with the property (e.3) that is given by Equation (16) for all sufficiently small ε > 0 and sufficiently large s + .
The proof of this lemma is provided in Appendix A.

Correct Operation of the Observer and a Criterion for Observability
By bringing the pieces together, we arrive at the following.

Proposition 2.
Suppose that Assumptions 1-3 hold and the system has a finite growth rate g(S 0 ) on the set S 0 . Consider a communication channel with capacity c. If inequality (17) holds, then the system {ϕ t } t∈T is observable on the set S 0 via this communication channel in the sense of Definition 2.
The proof of this proposition is provided in Appendix A. Inequality (17) strongly resembles other inequalities in the context of entropy in dynamical systems that link dimensions, Lyapunov exponents, and entropy (see [23][24][25]).

Remark 2.
The bounded transmission delay τ tr j from Equation (9) and its upper bound from (c.5) do not affect the condition (17) for observability.

Constructive Estimates and Analytical Bounds
In this section, we make the next and final step for obtaining tractable conditions for observability. The road to this is via the development of techniques for assessing growth rate and the box-counting dimension. A technique will be employed in both these cases that is similar in spirit to the second Lyapunov method.

Lyapunov-Like Function
The characteristic trait of the classic Lyapunov function v(·) is its decay along the trajectories of the system. In the current context, we are not interested in such a decay. Instead, our interest is focused on the rate at which an infinitesimally small ball is expanded under the transition mapping ϕ t . The smallness implies that this mapping is well-approximated by the first two terms of its Taylor series, and so the rate in question is nothing but the expansion rate due to the Jacobian matrix A t (x) defined in Equation (15). The deformation of a ball under a linear mapping A is described by the singular values of A; in particular, the largest of them is the norm of A and may be used in Equation (15). If P is a symmetric positive definite matrix and R n is endowed with the P-related norm ‖·‖ P , these values are the square roots of the solutions λ of the algebraic equation det[A ⊤ PA − λP] = 0, repeated in accordance with their algebraic multiplicities and ordered from large to small. With these in mind, we introduce a function v(·) : S → R with special properties whose description uses the t-step increment of this function.
Assumption 4. There exist d ∈ [0, n], a bounded function v : S → R, a constant Λ ≥ 0, and a symmetric positive definite matrix P ∈ R n×n such that inequality (19) holds, where λ 1 (t, x) ≥ · · · ≥ λ n (t, x) are the roots of the algebraic equation
det[A t (x) ⊤ P A t (x) − λP] = 0, (20)
repeated in accordance with their algebraic multiplicities and ordered from large to small, and log 2 0 := −∞.
In the discrete-time case, only t = 1 is concerned in Equation (19). In the continuous-time case, Equation (19) is imposed only within the finite time horizon of duration 1.
With P = I n , the roots of Equation (20) are the squares of the standard singular values of the Jacobian matrix A t (x). For a generic P, the roots λ i (t, x) can also be reduced to standard singular values. Indeed, let U be the symmetric and positive definite "square root" of the symmetric and positive definite matrix P = U 2 . The solutions of Equation (20) are evidently identical to those of det[U −1 A t (x) ⊤ UU A t (x)U −1 − λI n ] = 0, and so the λ i (t, x)'s are the squares of the ordinary singular values of the matrix U A t (x)U −1 . This matrix is similar to A t (x), and so these two matrices represent a common linear transformation in different bases. Thus, the role of P is, in fact, that of a linear coordinate transformation in pursuit of ease of building Λ and v(·).
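The equivalence between the roots of det[A ⊤ PA − λP] = 0 and the squared singular values of U A U −1 can be checked numerically. The sketch below uses NumPy; the matrices A and P are arbitrary illustrative values, and the function names are hypothetical.

```python
import numpy as np


def generalized_roots(A: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Roots of det(A^T P A - lambda P) = 0, i.e., eigenvalues of P^{-1} A^T P A, largest first."""
    lam = np.linalg.eigvals(np.linalg.solve(P, A.T @ P @ A))
    return np.sort(lam.real)[::-1]


def roots_via_coordinate_change(A: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Squared singular values of U A U^{-1}, where U is the symmetric square root of P."""
    w, V = np.linalg.eigh(P)                 # P = V diag(w) V^T with w > 0
    U = (V * np.sqrt(w)) @ V.T               # U = V diag(sqrt(w)) V^T, so U @ U = P
    sigma = np.linalg.svd(U @ A @ np.linalg.inv(U), compute_uv=False)
    return sigma ** 2                        # singular values come sorted, largest first


if __name__ == "__main__":
    A = np.array([[1.0, 2.0], [0.0, 1.0]])   # illustrative Jacobian
    P = np.array([[2.0, 0.5], [0.5, 1.0]])   # illustrative symmetric positive definite matrix
    print(generalized_roots(A, P))
    print(roots_via_coordinate_change(A, P))
```

The square root of the largest root is the norm of A induced by ‖·‖ P , which is why a well-chosen P can give a tighter expansion estimate than the Euclidean norm.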
Assumption 4 will be utilized for assessment of both quantities that we are interested in. Specifically, it will be used with d = 1 and arbitrary Λ to upper-estimate the growth rate (15) of the system; this estimate is given by Λ. With Λ = 0 and some d ∈ [0, n], it will be used to establish an upper bound on the upper box dimension of the invariant set S 0 ; this bound is given by d.
In the case of a continuous time system, the next proposition provides an alternative to computing the transition maps ϕ t , t ∈ (0, 1] and checking infinitely many inequalities (19), each for its own t ∈ (0, 1], when verifying Assumption 4. To state this proposition, we introduce the Jacobian matrix of the right-hand side in Equation (2):

Proposition 3.
Let there exist d ∈ [0, n], a continuously differentiable bounded function w : S → R, a constant Γ ≥ 0, and a symmetric positive definite matrix P ∈ R n×n such that inequality (22) holds, where ẇ(x) = (∂w/∂x) f (x) and γ i (x) are the solutions of the algebraic equation, ordered from largest to smallest (γ 1 (x) ≥ · · · ≥ γ n (x)) and repeated in accordance with their algebraic multiplicity. Then, Assumption 4 holds with the particular P and d of Equation (22), v(x) = w(x)/ ln 2, and Λ = Γ/ ln 2.
This result is proved in Appendix B.

Analytical Upper Bound on the System's Growth Rate and Related Conditions for Observability
Proposition 4. Let Assumptions 1-4 hold with d = 1 and Λ ≥ 0 in the last of them. Then, the growth rate (15) of the system on S 0 obeys the following bound: g(S 0 ) ≤ Λ/2.
The proof of this proposition is provided in Appendix B. By combining Propositions 2 and 4, we arrive at the following.
the system {ϕ t } t∈T is observable on the set S 0 via this communication channel in the sense of Definition 2.
The observation schemes proposed in [39,40] can sometimes work under the channel rates smaller than that given in Theorem 1. This improved rate comes at a price: these schemes are not robust against losses in the communication channel.
In [7], the observer requires some feedback in the channel and a channel rate of the form n log 2 L, where L is the Lipschitz constant of the mapping ϕ 1 . The estimate (23) is less conservative, both because log 2 L ≥ Λ/2 (if L is related to a norm of the form ‖·‖ P ) and n ≥ d̄ B (S), where the inequalities are strict in some cases. Moreover, the scheme from [7] does not enjoy robustness against losses in the communication channel.
Finally, Corollary 6.2.1 of [29] provides an estimate for the topological entropy by using a result from [41], which is identical to our estimate of the rate c with identical assumptions.

Analytical Bounds on the Upper Box Dimension and Final Conditions for Observability
A drawback of Theorem 1 is that it uses the upper box dimension, whereas there are no general techniques to compute this dimension analytically. To compensate for this drawback, we will use results from [28,30] to replace the upper box dimension by its upper estimate in the form of another well-known kind of dimension, i.e., the so-called Lyapunov dimension. The benefit from this is that the latter can be estimated analytically.
We start by introducing the necessary definitions, including those of the Lyapunov dimension of a map in a point, of a map over a set, and of a dynamical system. Next, we will recall the required results from [28,30], and finally, we will provide the general results of this paper, which offers analytical conditions for observability under a finite communication bit-rate.

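Definition 4.
For any t ∈ T, the singular value function of A t (x) of order d ∈ [0, n] at point x ∈ R n is defined as
ω d (A t (x)) := 1 for d = 0,
ω d (A t (x)) := σ 1 (A t (x)) · · · σ ⌊d⌋ (A t (x)) · σ ⌊d⌋+1 (A t (x))^(d−⌊d⌋) for d ∈ (0, n),
ω d (A t (x)) := σ 1 (A t (x)) · · · σ n (A t (x)) for d = n.
Here σ 1 (A) ≥ . . . ≥ σ n (A) are the singular values of the n × n-matrix A.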
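For readers who prefer code, the following Python sketch evaluates a singular value function of fractional order, assuming the standard definition: the product of the ⌊d⌋ largest singular values times the next one raised to the fractional part of d. The function name and the sample singular values are hypothetical.

```python
import math


def singular_value_function(sigmas, d: float) -> float:
    """Product of the floor(d) largest singular values times the next one to the power d - floor(d)."""
    s = sorted(sigmas, reverse=True)
    n = len(s)
    if not 0 <= d <= n:
        raise ValueError("the order d must lie in [0, n]")
    k = math.floor(d)
    value = math.prod(s[:k])          # empty product equals 1 for k = 0
    if k < n:
        value *= s[k] ** (d - k)      # fractional part; skipped when d = n
    return value


if __name__ == "__main__":
    sigmas = [2.0, 0.5, 0.1]          # illustrative singular values of a 3 x 3 Jacobian
    for d in (0, 1, 1.5, 2, 2.5, 3):
        print(d, singular_value_function(sigmas, d))
```

For these sample values the function is at least 1 up to order 2 and drops below 1 for larger orders, which is the kind of threshold behavior that Lyapunov-type dimensions are built on.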

Definition 5 ([30]). For any t ∈ T, the Lyapunov dimension of the map ϕ t (·) at the point x ∈ S is given by

Definition 6 ([30]). For any t ∈ T, the Lyapunov dimension of the map ϕ t (·) with respect to the invariant set S 0 is given by

Definition 7 ([30]). The Lyapunov dimension of the dynamical system {ϕ t } t≥0 with respect to the invariant set S 0 is defined as

For the sake of completeness, we provide the results that we borrowed from [28,30].

Corollary 1 ([28]). Let the hypotheses of Theorem 2 be true. Then for all t ∈ T : t ≥ 1,

The following proposition is essentially a reformulation of Theorem 2 from [30].

Proposition 5.
Let Assumptions 1-3 be true. Suppose also that Assumption 4 holds with some d ∈ [0, n] and Λ = 0. Then, for sufficiently large l > 0, the following inequality is valid: The proof of this proposition is provided in Appendix B. In some cases, the inequalities in Equation (24) take place as equalities. Specifically, the following proposition is valid, which is a reformulation of Proposition 3 and Corollary 3 from [30].
Then, for any compact invariant set S 0 x eq of {ϕ t } t∈T , the following equation holds Now we are in a position to state the main result of the paper, which is clear from Theorem 1 and Proposition 5.
the system {ϕ t } t∈T is observable on the set S 0 via this communication channel in the sense of Definition 2.

Examples
In this section, we apply the previous theory to two celebrated prototypical chaotic systems: the smoothened Lozi map and the Lorenz system. For the smoothened Lozi map, we will compute the Lyapunov dimension and provide a bound on the channel rate above which the associated dynamical system is observable via the channel at hand. We will then test this bound via computer simulations of the proposed observer to show that the established theoretical rates are close to the rates required in practice. For the Lorenz system, we borrow upper estimates of the Lyapunov dimension and the largest singular value of the Jacobian from [42] and [39], respectively, to provide a bound on the channel rate by using Theorem 3. As in the previous example, we will also test this bound via computer simulations.

The Smoothened Lozi Map
The Lozi map [43,44] is a modification of the Henon map. The Lozi map is not continuously differentiable, and so does not meet Assumption 3. We examine its continuously differentiable analog introduced in [45] by smoothing the Lozi map at the fracture point. The respective smoothened map acts according to the following formula ϕ α : where a, b, and α 1 are positive parameters and If 1 + a − b > 0 and α < (a + 1 − b) −1 , the smoothened Lozi map has an equilibrium If 1 − a − b < 0 and α < (a − b − 1) −1 in addition to the previous inequalities, there exists a second equilibrium In this section, we adopt the following.

Assumption 5.
The following inequalities hold: This assumption implies that two equilibria exist, and that they are unstable. Moreover, b < 1 ensures d L ({ϕ t α } t≥0 , K) < 2 for the associated discrete-time dynamical system. We start by giving more insight into the Lyapunov dimension of the smoothened Lozi map.
and Assumption 4 holds with Λ = 0 and d = d. Moreover, if x + ∈ S 0 , inequality (29) holds as equality: The proof of this theorem is provided in Appendix C.
In order to use the observer from Section 3 for the smoothened Lozi map (27), we need to choose a compact and invariant set S 0 . For the original (i.e., non-smooth) Lozi map, such a set exists whenever the following conditions are met [46]: Moreover, when the previous inequalities hold, the set S 0 is the closure of the unstable manifold of the unstable equilibrium x + . It is still unknown whether they guarantee the same for the smoothened Lozi map (27). To the best of the authors' knowledge, no conditions that guarantee the existence of such a set for the map (27) are available in the literature. In the following, we will assume that Equations (30)
By combining Theorem 3 with Theorem 4 and an estimate of the largest singular value of the concerned Jacobian given in [40], we arrive at the following.
Corollary 2. Let Assumption 5 hold, and let S 0 be a compact invariant set of the smoothened Lozi map (27). Then, the associated dynamical system is observable on the set S 0 via any communication channel whose capacity
Proof. Theorem 13 from [18] yields that Assumption 4 holds with d = 1 and Λ = 2 log 2 (√(a 2 + 4b) + a) − 2. Theorems 3 and 4 complete the proof.
Corollary 2 implies that for these parameters, the associated dynamical system is observable for any channel rate above 1.2013 bits/s. To verify whether this lower bound on the channel rate can be improved, we employed the observer from Section 3, whose parameters were experimentally tuned to ensure a pre-specified exactness of observation ε during the first 1000 steps. The following values were considered: ε = 0.5, 0.2, 0.1, 0.05. An accompanying objective of the experimental tuning was to minimize the size K of the alphabet employed for data encoding, or, in other words, the channel capacity c * requested by the observer. The best values of the capacity can be found in Table 1. It shows that for high exactness, the system can be observed with a channel rate slightly below the theoretical estimate. For the lowest exactness, the experimental rate barely exceeds the theoretical bound. In any case, the bound is close to the experimental result.

The Lorenz System
In this section, we apply our previous theoretical results to the Lorenz system. The Lorenz system [47] is a well-known example of a continuous-time system that displays chaotic behavior for certain values of its parameters. The system equations are:

\[
\dot{x} = \sigma(y - x), \qquad \dot{y} = \rho x - y - xz, \qquad \dot{z} = -\beta z + xy, \tag{35}
\]

where σ, ρ, and β are positive parameters. If ρ < 1, the system has a single globally asymptotically stable equilibrium: the origin. For ρ > 1, this equilibrium becomes a hyperbolically unstable saddle point, and two additional equilibria appear. In this paper, we assume ρ > 1. We will apply our findings to the system with a chaotic attractor as an invariant set. As is well known [48], the conditions on the parameters (σ > 0, β > 0, ρ > 1) suffice to ensure the presence of a compact invariant set. Moreover, to compute the Lyapunov dimension, we adopt the following assumption, which is taken from [42].
Assumption 6. Let the following hold: and either or the following equation has two distinct solutions, ν and where ν 1 is the largest root of Equation (36).
Any solution of the Lorenz system that starts at $t = 0$ can be extended on $[0, \infty)$ [29], and thus has the extendability property discussed just after Equation (2). Hence, the differential Equations (35) give rise to a dynamical system on $S := \mathbb{R}^3$ in the sense of Section 2.1.
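As a quick sanity check on the equilibrium structure described above, the following minimal Python sketch evaluates the right-hand side of the Lorenz system at its equilibria. It assumes the standard form of the Lorenz equations with state $(x, y, z)$. The function names `lorenz_rhs` and `lorenz_equilibria` and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def lorenz_rhs(state, sigma, rho, beta):
    """Right-hand side of the Lorenz system in its standard form (assumed here)."""
    x, y, z = state
    return (sigma * (y - x), rho * x - y - x * z, -beta * z + x * y)

def lorenz_equilibria(rho, beta):
    """The origin, plus the two extra equilibria that exist only for rho > 1."""
    points = [(0.0, 0.0, 0.0)]
    if rho > 1:
        r = math.sqrt(beta * (rho - 1))
        points += [(r, r, rho - 1), (-r, -r, rho - 1)]
    return points

# Illustrative parameter values (the classical chaotic regime).
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
for point in lorenz_equilibria(rho, beta):
    # The residual should vanish up to floating-point rounding.
    residual = max(abs(v) for v in lorenz_rhs(point, sigma, rho, beta))
    print(point, residual)
```

For ρ ≤ 1, the helper returns only the origin, which matches the single-equilibrium case mentioned in the text.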

Proposition 7.
Let Assumption 6 hold, and let $S_0$ be a compact invariant set of the Lorenz system. Then, this system is observable on the set $S_0$ via any communication channel whose capacity is:

Proof. Since the right-hand side of the equations in Equation (35) and the matrix P defined by (16) in [39]. So Proposition 3 guarantees that Assumption 4 holds with d = 1 and Λ = 1 As is shown in Section 4 from [42], Assumption 4 also holds with Λ = 0 and Theorem 3 completes the proof.
We have performed simulation studies similar to those carried out for the previous example. Starting from various initial conditions in $S_0$, we have simulated the observer for various chosen values of $\varepsilon$, each with its own chosen $\delta$ and associated covering. In our simulations, we used σ = 10, ρ = 28, β = 8/3, which satisfy Assumption 6. For these parameters, the theoretical rate bound is c > 40.975 bits/s. The results of the simulations can be seen in Table 2. Once again, the experimentally found rate is below or very close to the theoretical rate. This indicates that the theoretical bound predicts the required rate well.
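The quoted figure can be reproduced numerically under two assumptions that are not restated in this section. The first is that the bound has the form $c > (\lambda_1/\ln 2)\, d_L$ with $\lambda_1 = \left(\sqrt{(\sigma-1)^2 + 4\sigma\rho} - (\sigma+1)\right)/2$. The second is that $d_L$ is the known closed-form Lyapunov dimension of the Lorenz system, $d_L = 3 - 2(\sigma+\beta+1)/\left(\sigma+1+\sqrt{(\sigma-1)^2+4\sigma\rho}\right)$. The following Python sketch is only a consistency check under these assumptions. The function name `lorenz_rate_bound` is hypothetical.

```python
import math

def lorenz_rate_bound(sigma, rho, beta):
    """Return (rate bound in bits per unit time, lambda_1, d_L).

    Both formulas below are assumptions made for this sketch; they are
    not restated in this section of the paper.
    """
    disc = math.sqrt((sigma - 1) ** 2 + 4 * sigma * rho)
    lam1 = (disc - (sigma + 1)) / 2                       # assumed growth exponent (nats)
    d_l = 3 - 2 * (sigma + beta + 1) / (sigma + 1 + disc)  # assumed Lyapunov dimension
    return lam1 / math.log(2) * d_l, lam1, d_l

rate, lam1, d_l = lorenz_rate_bound(10.0, 28.0, 8.0 / 3.0)
print(f"lambda_1 = {lam1:.4f}, d_L = {d_l:.4f}, rate bound = {rate:.3f} bits/s")
```

With σ = 10, ρ = 28, β = 8/3, this evaluates to approximately 40.975, which agrees with the bound quoted in the text.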

Conclusions
In this paper, we have presented an observer for both discrete-time and continuous-time nonlinear systems. We have provided bounds on the necessary data-rates to implement the observer. We have proven that this observer can be implemented on any channel with a finite delay parameter and a channel rate $c > \Lambda d_B(S_0)/2$, where $\Lambda/2$ is an upper bound on the largest singular value of the Jacobian and $d_B(S_0)$ is the upper box dimension of the compact invariant set of the system. By combining results from several other papers, we have provided an analytical bound on the channel rate that depends on the Lyapunov dimension, rather than the upper box dimension. These analytical bounds have been computed for the smoothened Lozi map and the Lorenz system. For the smoothened Lozi map, we computed the Lyapunov dimension. Simulations of the observer on both of these systems have shown that the theoretical rate is close to the actual rate required to implement the observer.

Acknowledgments: We dedicate this paper to the memory of Gennady A. Leonov, who passed away on April 23, 2018. The authors acknowledge the many fruitful discussions they held with Professor Leonov.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following notations are used in this manuscript:
$\|x\|_P := \sqrt{x^\top P x}$, with $P$ symmetric and positive definite;
$B_\delta(x)$: the ball in $\|\cdot\|_2$ of radius $\delta$ centered at $x$;
$B^P_\delta(x)$: the ball in $\|\cdot\|_P$ of radius $\delta$ centered at $x$.

Appendix A. Proofs of Section 4
In this appendix, we present the proofs of the results stated in Section 4.
Appendix A.2. Proof of Proposition 2
We first pick $\hat{g} > g(S_0)$ and $\kappa > 0$ so close to $g(S_0)$ and 0, respectively, that We also pick a positive definite $n \times n$ matrix $P$ and borrow the function $\delta(\varepsilon, s_+)$ from Lemma 3. We also consider $\varepsilon > 0$ and $s_+ \in T$ so small and large, respectively, that Equation (16) holds. As was remarked just after Definition 3, the set $S_0$ can be covered by no more than $\delta^{-(d_B(S_0)+\kappa)}$ $\delta$-balls centered in $S_0$ for all small enough $\delta > 0$. By reducing $\varepsilon > 0$, if necessary, we make $\delta = \delta(\varepsilon, s_+)$ small enough in this sense irrespective of $s_+ \in T$. Then, no more than $\delta$-balls centered in $S_0$ are needed to cover $S_0$. Hence, the integer floor of the right-hand side of this equation is the function $K(\varepsilon, s_+)$ from e.4). So, the condition (13) for the correct operation of the observer from Section 3 takes the form By invoking Equations (8) and (A2), we see that the last inequality can be satisfied by picking $\tau_{\mathrm{pr}}$ which is large enough. Then, by picking $s_+ \ge \tau_{\mathrm{pr}} + \tau_{\mathrm{tr}} +$ in accordance with Equation (12), we ensure the correct operation of the observer. The statement ii) from Observation 1 completes the proof.
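The counting step in this proof can be illustrated with a small Python sketch. It assumes that each message simply indexes one ball of the covering, so a covering with at most K balls needs about $\log_2 K$ bits per message, and that a B-bit message needs at least B/c time units on a channel of capacity c (as stated in Section 2). The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math

def covering_bound(delta, box_dim, kappa):
    """Integer floor of delta**-(box_dim + kappa), the ball-count bound used in the proof."""
    return math.floor(delta ** -(box_dim + kappa))

def message_bits(num_balls):
    """Bits needed to index one of num_balls balls (assumes num_balls >= 1)."""
    return max(1, math.ceil(math.log2(num_balls)))

def min_processing_delay(bits, capacity):
    """A B-bit message cannot fully arrive earlier than B / c time units."""
    return bits / capacity

# Hypothetical values: ball radius, dimension estimate, slack kappa, capacity.
K = covering_bound(0.5, 2.0, 0.0)
B = message_bits(K)
print(K, B, min_processing_delay(B, 4.0))
```

Shrinking δ increases the message size only logarithmically in 1/δ, which is why a sufficiently large processing delay can always accommodate it.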

Appendix B. Proofs of Section 5
Appendix B.1. Proof of Proposition 3
For the dynamical system given by Equation (2), the matrices $A_t(x)$ defined in Equation (15) obey the equations Hence, where $M_x(\Delta) \to 0$ as $\Delta \to 0$. Moreover, this convergence is uniform over $x$ from any compact subset $K$ of $S$. Indeed, for $x \in K$, Equation (A3) yields that It remains to note that for all $x \in K$, By invoking Equation (A4), we see that where uniformly over $x \in K$ due to the foregoing since the continuous function $J(x)$ is bounded on the compact set $K$. Let $U = U^\top > 0$ be the positive definite square root of $P$, i.e., $P = U^2$. Equation (20) with $t := \Delta$ can be rewritten by virtue of Equation (A5) as follows Thus we see that $[\lambda_i(\Delta, x) - 1]/\Delta$ are the ordinary eigenvalues of the symmetric matrix Meanwhile, the eigenvalues depend continuously on the symmetric matrix by Corollary 6.3.8 in [50].
where $\omega_i(x, \Delta) \to 0$ as $\Delta \to 0$ uniformly over $x \in K$ and $\eta_i(x)$ are the solutions of the eigenvalue problem The last equation is identical to Equation (22), and so $\eta_i(x) = \gamma_i(x)$. By using the previous equations together, we see that Now we invoke $d$ from Proposition 3. For any $n \times n$ matrix $B$, we denote by $\lambda_i(B)$ the roots of the algebraic equation
\[
\det[B^\top P B - \lambda P] = 0 \tag{A7}
\]
enumerated from large to small, and put By combining the generalized Horn inequality [29] with Lemma 8.1 in [39] (which relates the roots of Equation (A7) to the concept of the singular value of a matrix), we infer that $\omega_d(BC) \le \omega_d(B)\,\omega_d(C)$ for any $n \times n$ matrices $B$, $C$. It follows that for any sequence $B_0, \dots, B_m$ of such matrices Now we pick an arbitrary $t > 0$ and denote $\Delta_r := t/r$, $t_r(j) := j\Delta_r$ for any natural $r$ and $j \in \mathbb{Z}_+$. Since the system is time-invariant, we have As a result, we see that in Equation (19), We proceed by using the elementary inequality $\log_2(1 + x) \le x/\ln 2$ and the quantity $\Omega_i(K, \Delta)$ defined in Equation (A6) for the compact set $K := \{\varphi^\theta(x) : 0 \le \theta \le t\}$, which contains all points of the form $\varphi^{t_r(j)}(x)$, $j = 0, \dots, r$, where $r\Delta_r = t$ by the definition of $\Delta_r = t/r$. Thus where

Now we estimate B by employing Equation (21)
Here, the sum is the Riemann sum of the continuous function $\dot{w}[\varphi^\theta(x)]$ of $\theta \in [0, t]$, and $A$ does not depend on $r$. So by letting $r \to \infty$ and by invoking that ẇ( Thus we have arrived at Equation (19) modulo the definitions of $\Lambda = \Gamma/\ln 2$ and $v(x) = w(x)/\ln 2$ from Proposition 3. It remains to note that the function $v(x)$ is bounded since $w(x)$ has this property by the assumptions of Proposition 3.

Appendix B.2. Proof of Proposition 4
We first note that Equation (19) with d = 1 means that whenever x ∈ S, t ∈ T, 1 ≥ t > 0, we have We are going to show that for any natural r, Equation (A8) holds whenever t ∈ T, 0 < t ≤ r, arguing by induction on r. As a result, we will show that Equation (A8) is valid for all t ∈ T, t > 0.
For $r = 1$, this claim is initially given. Suppose that it is true for some $r$, and consider $t \in T \cap (0, r + 1]$. If $t \in T \cap (0, r]$, Equation (A8) is true by the induction hypothesis. Let $t > r$. Then $t = r + \theta$, where $\theta \in T$, $0 < \theta \le 1$. By putting $y := \varphi^r(x)$ and invoking Assumption 1 and Equation (15), we see that where the start of the second line is due to Equation (18). By using these relations, we arrive at Equation (A8) via adding Equation (A8) with $t := \theta$ to the inequality which holds by the induction hypothesis. Thus, Equation (A8) is true whenever $t \in T$, $t > 0$.

Now we introduce a finite upper bound $\bar{v} \ge |v(x)|$ $\forall x \in S$ on the bounded function $|v(\cdot)|$. Let $x \in S$. Then $\varphi^t(x) \in S$ and so Equation (18) yields that $|\Delta_t v(x)| \le 2\bar{v}$. By Equation (A8), As was remarked, $g(S_0)$ does not depend on the matrix norm $\|\cdot\|$ in Equation (15). Meanwhile, Lemma 2 ensures that $x \in B_\delta(y)$, $y \in S_0 \Rightarrow x \in S$ for all small enough $\delta > 0$. Hence

Since $P$ is positive definite and symmetric, we can decompose it as $P = U^\top U$, where $U$ is nonsingular. Equation (A9) can thus be rewritten as If we premultiply with $U^{-\top}$ and postmultiply with $U^{-1}$, the solutions $\lambda_i$ are unchanged and the equation becomes $\det\left(U^{-\top} A_t(x)^\top U^\top U A_t(x) U^{-1} - \lambda I_n\right) = 0$.
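We thus know that $\lambda_i(t, x)$ are the eigenvalues of the matrix $U^{-\top} A_t(x)^\top U^\top U A_t(x) U^{-1}$, i.e., the squares of the singular values of the matrix $U A_t(x) U^{-1}$. From Equation (4) with the same $d$ and $\Lambda = 0$, we have that
\[
\sum_{i=1}^{\lfloor d \rfloor} \log_2 \lambda_i(t, x) + (d - \lfloor d \rfloor) \log_2 \lambda_{\lfloor d \rfloor + 1}(t, x) < 0, \quad \forall x \in S,\ t \in T : 0 < t \le 1,
\]
which thus also implies that where $e$ is Euler's number. This can be rewritten as
\[
\sum_{i=1}^{\lfloor d \rfloor} \ln \lambda_i(t, x) + (d - \lfloor d \rfloor) \ln \lambda_{\lfloor d \rfloor + 1}(t, x) < 0, \quad \forall x \in S,\ t \in T : 0 < t \le 1.
\]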
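The change of variables used in this argument can be checked numerically. The following Python sketch, restricted to 2 × 2 matrices for simplicity, compares the roots of $\det(A^\top P A - \lambda P) = 0$ with the squared singular values of $U A U^{-1}$, where $P = U^\top U$ is obtained from a Cholesky-type factorization. The matrices `A` and `P` are arbitrary illustrative values and all function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def quadratic_roots(c2, c1, c0):
    """Real roots of c2*x^2 + c1*x + c0 = 0, largest first."""
    disc = math.sqrt(max(c1 * c1 - 4 * c2 * c0, 0.0))
    return sorted([(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)], reverse=True)

def generalized_eigenvalues(a, p):
    """Roots of det(A^T P A - lambda P) = 0 for symmetric positive definite 2x2 P."""
    m = matmul(transpose(a), matmul(p, a))
    det_p = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    c1 = -(m[0][0] * p[1][1] + m[1][1] * p[0][0] - 2 * m[0][1] * p[0][1])
    return quadratic_roots(det_p, c1, det_m)

def squared_singular_values(a, p):
    """Squared singular values of U A U^{-1}, with P = U^T U and U upper triangular."""
    u11 = math.sqrt(p[0][0])
    u12 = p[0][1] / u11
    u22 = math.sqrt(p[1][1] - u12 * u12)
    u = [[u11, u12], [0.0, u22]]
    u_inv = [[1 / u11, -u12 / (u11 * u22)], [0.0, 1 / u22]]
    b = matmul(u, matmul(a, u_inv))
    g = matmul(transpose(b), b)  # eigenvalues of B^T B are the squared singular values
    return quadratic_roots(1.0, -(g[0][0] + g[1][1]), g[0][0] * g[1][1] - g[0][1] * g[1][0])

A = [[-1.7, 1.0], [0.5, 0.0]]   # illustrative matrix
P = [[2.0, 0.5], [0.5, 1.0]]    # illustrative symmetric positive definite matrix
print(generalized_eigenvalues(A, P))
print(squared_singular_values(A, P))
```

The two printed lists should agree up to rounding, which is the statement that the solutions $\lambda_i$ are unchanged by the pre- and postmultiplication.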

Appendix C. Proofs of Section 6
Proof of Theorem 4
Proof. Proposition 5 yields that, to prove the first sentence of the conclusion of Theorem 4, it suffices to justify Assumptions 1-3 and Assumption 4 with $\Lambda = 0$ and $d$ defined in Equation (29) for the discrete-time dynamical system associated with the smoothened Lozi map. Assumptions 1-3 do hold since the map (27) is smooth and $S_0$ is a compact invariant set by the assumptions of Theorem 4. It remains to check Assumption 4, where $t = 1$ in Equation (19) since we are in discrete time now.
We are going to justify Assumption 4 with $P = \mathrm{diag}\{1, b\}$. Since we have that To simplify the notations, we introduce $f := f_\alpha(x)$. Equation (A10) admits two solutions Since by Equation (28), given Assumption 5, we know that $\max_{x \in S_0} \lambda_1(x)$ is always larger than one (in particular, for all $|x| \ge \alpha$, $f = 1$). This implies that $d_L(\{\varphi^t_\alpha\}_{t \in T}, S_0) > 1$ and that $1 < d < 2$. Note that the latter inequality is strict as a result of Assumption 5. After some computations and due to Equation (A11), the left-hand side of Equation (19) becomes $\log_2(\lambda_1(x)) + (d - \lfloor d \rfloor) \log_2(\lambda_2(x))$, which, using Equation (A12), can be upper bounded in the following way To satisfy Assumption 4 with some $d$ and $\Lambda = 0$, we are looking for $d - \lfloor d \rfloor$ such that the left-hand side of the previous equation is negative. We thus obtain the following sufficient condition which, using Assumption 5, can be rewritten as Note that, under the conditions of Assumption 5, $0 < d - \lfloor d \rfloor < 1$ always holds. Since Assumption 4 is verified with $d$ given by Equation (29) and $\Lambda = 0$, the proof is completed by applying Proposition 5 which yields To prove the second part of this theorem, we will use Proposition 6. We consider the following diagonalizing coordinate change matrix   which is nonsingular and well-defined for all parameters satisfying Assumption 5.