1. Introduction
This paper covers two major objectives. Broadly, the first one is to solve one of the remaining issues in [1]. Later in this introduction, this first objective will be presented in detail. The second objective is to exemplify the interest, for economic science, of the algorithm presented in [1] and implemented in the R-package nlstac; see [2]. We developed this algorithm in order to fit data coming from an exponential decay. In this original application of the algorithm, we do not distinguish between fitting data and fitting coefficients, but we hasten to remark that there is no model proposed in the present paper: our goal is to provide a tool that allows the reader to fit the coefficients of a previously chosen model. Therefore, we will illustrate the interest of this algorithm by fitting this pattern in a couple of cases related to different economic problems.
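To fix ideas, the core of the strategy can be sketched in a few lines: the exponent k is the only nonlinear parameter of a model of the form a·exp(k·t) + b, so one can sweep k over a grid and solve for a and b linearly at each step. The following Python sketch (our own illustration of the separable idea behind the algorithm in [1], not the actual nlstac implementation, which is written in R and is considerably more general) fits hypothetical synthetic data; the function name, grid, and data are ours.

```python
import numpy as np

def tac_exponential_fit(t, y, k_grid):
    """Sketch of the separable idea: sweep the nonlinear rate k over a
    grid; for each candidate k the remaining coefficients (a, b) of
    a*exp(k*t) + b are linear and can be obtained by least squares."""
    best = None
    for k in k_grid:
        X = np.column_stack([np.exp(k * t), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        err = np.linalg.norm(X @ coef - y)
        if best is None or err < best[0]:
            best = (err, k, coef[0], coef[1])
    return best[1], best[2], best[3]

# Hypothetical exponential decay: y = 2*exp(-0.7*t) + 1, no noise.
t = np.linspace(0.0, 5.0, 50)
y = 2.0 * np.exp(-0.7 * t) + 1.0
k, a, b = tac_exponential_fit(t, y, np.linspace(-2.0, -0.1, 191))
```

On exact data the sweep recovers k = -0.7 up to the grid step; in practice one refines the grid around the best candidate, which is essentially what nlstac automates.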
The first economic problem deals with demand curves. Economic demand curves can be used to map the relationship between the consumption of a good and its price. When plotting price (actually, the logarithm of the price) against consumption, we obtain a curve with a negative slope, meaning that when the price increases, demand decreases. Hursh and Silberberg proposed in [3] an equation to model this situation; in this paper, we will fit some data using this model.
The second and third economic problems deal with nonlinear time series models. Many financial time series display typical nonlinear characteristics, so many authors (see [4]) apply nonlinear models. Although the TAC (Spanish for CT scan) algorithm was not designed for these kinds of problems, we can obtain good results by using it. In these examples, we will focus on the model that, among all nonlinear time series models, seems to have the most relevance in the literature, namely, the exponential autoregressive model. We show that the coefficients given by nlstac give a realistic approximation of such datasets. In any case, our purpose is not to assess the fitness of any model nor to provide an economic analysis.
Before we get into the first objective, let us outline the structure of this paper. 
Section 2 deals with approximations by means of exponential functions measuring the error with the max-norm when fitting a 
small set of data, i.e., three or four observations. 
Section 3 is devoted to some symmetric cases that could happen. 
Section 4 deals with approximation in general datasets. 
Section 5 gathers two examples about Newton's Law of Cooling and directly applies what has been developed in previous sections. 
Section 6 shows examples related to economics and uses the R-package nlstac for the calculations. Although the section on economics is only one of the six sections of this paper, it is a very important one, and everything we have developed before applies directly to it. What we have been able to do with nlstac gives an idea of how it can approximate the best coefficients for patterns that are usually regarded as unapproachable; see, e.g., [5].
We will focus on the first objective in 
Section 2, 
Section 3, 
Section 4 and 
Section 5. In these sections, we will deal with approximations by means of exponential functions, and we will measure the error with the max-norm. Therefore, when we say that some function 
f is the best approximation for some data 
, we mean that we have 
, with 
 for every 
i, and that 
 is the center of the narrowest band that contains every point and has exponential shape; see 
Figure 1.
We will need the following definition:
Definition 1. A real function defined in a linear topological space X is quasiconvex whenever its value at any convex combination of two points never exceeds the greater of its values at those two points. This definition can be consulted in, for example, [1] or [6].
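For a computational reading of Definition 1: the defining inequality is f(λx + (1-λ)y) ≤ max{f(x), f(y)} for all x, y and all λ in [0, 1]. A minimal Python sketch (our own illustration, not part of the paper's machinery) checks this inequality on a finite sample of points and weights:

```python
import numpy as np

def is_quasiconvex_on_samples(f, xs, lambdas):
    """Check the quasiconvexity inequality
       f(l*x + (1-l)*y) <= max(f(x), f(y))
    on a finite sample of points and convex weights.  This is a
    necessary, sample-based check only, not a proof."""
    for x in xs:
        for y in xs:
            for l in lambdas:
                z = l * x + (1 - l) * y
                if f(z) > max(f(x), f(y)) + 1e-12:
                    return False
    return True

xs = np.linspace(-3, 3, 25)
lambdas = np.linspace(0, 1, 11)
print(is_quasiconvex_on_samples(abs, xs, lambdas))     # |x| is quasiconvex
print(is_quasiconvex_on_samples(np.sin, xs, lambdas))  # sin is not on [-3, 3]
```

Quasiconvex functions need not be convex; what matters here is precisely the dichotomy used below: on the real line, a quasiconvex function either attains a minimum or is monotonic.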
The authors already proved in [
1] that for every 
, there exists the best approximation 
 amongst all the functions of the form 
, with 
. We also showed that, given any dataset 
, the function 
 that assigns to every 
 the error 
 is quasiconvex, so there are two options: either 
 attains its minimum at some 
k or it is monotonic. If 
 is monotonic, then either it is increasing and the minimum would be attained, so to say, at 
, or it is decreasing and attains its minimum at 
. We will also study what happens for positive 
k, so we will need to pay attention not only to the behaviour of the exponentials but also to their limits when 
 and 
; see Proposition 2 and 
Section 3.2.
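The quasiconvexity of the error as a function of the exponent can be observed numerically. The sketch below is our own (hypothetical data, hypothetical helper); for a fixed k and a, the optimal vertical shift centers the residual band, so the max-norm error is half the residual range, and we scan a on a grid rather than minimizing exactly.

```python
import numpy as np

def uniform_error_for_k(t, y, k, a_grid):
    """Approximate best max-norm error of a*exp(k*t) + b for fixed k.
    For fixed k and a, the optimal shift b centers the residual band,
    so the error is half the residual range; a is scanned on a grid."""
    u = np.exp(k * t)
    best = np.inf
    for a in a_grid:
        r = y - a * u
        best = min(best, (r.max() - r.min()) / 2.0)
    return best

# Hypothetical data lying exactly on 3*exp(-0.9*t) + 0.5.
t = np.linspace(0.0, 4.0, 20)
y = 3.0 * np.exp(-0.9 * t) + 0.5
ks = np.linspace(-2.0, -0.2, 37)
errs = [uniform_error_for_k(t, y, k, np.linspace(0.0, 6.0, 601)) for k in ks]
k_best = ks[int(np.argmin(errs))]
```

Plotting errs against ks shows a single trough at k ≈ -0.9: the error decreases, attains its minimum, and increases again, which is the quasiconvex behaviour described above.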
Remark 1. Our main results show that every dataset  with at least four data fulfills one of the following conditions:
- There exists one triple  with  such that  is the best possible approximation, i.e., 
whenever  is a different exponential. - There are two indices  where  is attained and there is some , with , such that . In this case, the best approximation by means of exponentials does not exist and the constant  approximates  better than any strictly monotonic function—in particular any exponential. Therefore, for every , the best approximation with the form  has  and . Exactly the same happens when the maximum is attained at  and the minimum at  and , with . In both cases, the function  is constant. 
- ,  and T attains its second greatest value at . The best approximation by means of exponentials does not exist and every exponential approximates  worse than any function fulfilling  for . The pointwise limit of the best approximations when  takes these values. (The symmetric cases belong to this kind of limit, with  instead of .) If this happens,  increases in  
- There are some  such that the line  approximates  better than any exponential. In this case, each  is the limit when  of the values in  of the best approximations with k as exponent. This happens when there are four indices  or  such that 
This implies that  decreases in 
 Remark 2. What happens with this kind of functions is the following: consider two exponentials that agree at , say . Then, the following are equivalent:
-  for some . 
-  for every . 
-  for some . 
-  for every . 
A visual way to look at this is the following. Consider a wooden slat supported on two points, and imagine we put a load between the supports. When we increase the load, the slat lowers between the two supports but the rest of the slat rises. For these functions, the behaviour is similar: if two of them agree at α and β, then one function is greater than the other in  and lower outside .
Moreover, if  and , then for each  that does not lie in the line defined by  and  and belongs to the set given by (1), there is exactly one exponential h such that ,  and . Of course, if  does not belong to the set given by (1), then there is no monotonic function that fulfils the latter. The existence of such an exponential is a straightforward consequence of [1], Lemma 2.10; we will develop this later, see Proposition 4.

Remark 3. A significant difficulty when approximating datasets with exponentials has been to find conditions determining whether some dataset is worth trying or not. The only way we have found to answer this question has been to identify the most general conditions that ensure that some dataset has one best approximation by exponentials; needless to say, this has been a very thorny problem. The different behaviors described in Remark 1 give a hint of the several different details that we will need to deal with, but there is still some casuistry that we need to break down. Specifically, our main interest in these results comes from the fact that they can be applied to exponential decays, which appear in several real-life problems; the introduction in [1] presents quite a few examples. The typical data that we have worked with are easily recognizable, but we needed to determine when the data may be fitted with a decreasing, convex function, like the exponentials  with . The easiest way we have found is as follows: - If some data  are to be fitted with a decreasing function and we are measuring the error with the max-norm, then the maximum value in T must be attained before the minimum. There may exist more than one index where they are attained, but every appearance of the maximum must lie before every appearance of the minimum. In short, if  and , then . 
- Moreover, if we are going to approximate  with a convex function, the dataset must have some kind of convexity. The only way we have found to state this is as follows: - ♡ “Let  be the line that best approximates . Then  has two maxima and one minimum between them.” - Thanks to Chebyshev’s Alternation Theorem (the polynomial  is the best approximation of the function f in  if and only if there exist  points  where  attains its maximum and ; see, for example, [7], Theorem 8, p. 29 or [8]), we know that the line that best approximates any dataset behaves this way, the opposite way, or as described in Remark 1. Please observe that this theorem would not apply so easily to approximations with general degree polynomials. 
Before we go any further, let us say something about the notation that will be used. For the remainder of the paper, we will always take n as the number of coordinates of  and T, i.e., . In addition,  will fulfil .
Moreover, for any  we will always denote as  the best approximation with the form  with .
To ease the notation, whenever we have some function  and , we will let  denote , and the same will apply to any fraktur character:  would represent the same for  and so on.
Given any vector ,  will denote its maximum and  will denote its minimum.
  2. Small Datasets
In this Section, we show some not too complicated, general results about the behavior of exponentials that will allow us to prove our main results in 
Section 4. We will focus only on the approximation of the simplest datasets, with 
 or 
.
We will begin with Proposition 1, just a slight modification of [
1], Proposition 2.3 that will be useful for the subsequent results. Later, in Lemma 1, we will find the expression of the best approximation for 
 for each fixed 
k (please note that with 
, for every 
k there is an exponential that interpolates the data). In Lemma 2, we study the case 
, determining a technical condition on 
 that ensures that the best approximation exists and is unique; moreover, we determine, in a sense, this best approximation analytically.
Proposition 1 ([
1]). 
Let ,  such that  is the best approximation to T for this k, i.e.,Then, there exist indices  such that .
Conversely, if  and  fulfil this condition, then  is the best approximation to T for this k.
 Lemma 1. Let ,  and . Then, the best approximation to  by means of exponentials has these coefficients:  Proof.  It is clear that 
, and a simple computation shows that 
 also holds. Indeed,
        
By Proposition 1, this is enough to ensure that  and  are optimal. □
 Remark 4. Please observe that  does not depend on .
 Lemma 2. Let  and  such that . Then, there exists a unique exponential , with  and  such that Moreover, this exponential is the best approximation to .
 Proof.  Let 
 be as in the statement. For 
, there exist unique 
 such that 
 and 
. Specifically, 
a is as in (
3) and 
.
Indeed,  means that , so  and each  determines . In the same way,  determines .
Therefore, the equalities (
4) hold if and only if, for some 
, we have 
. Equivalently,
        
Please observe that this equality holds trivially when  and that, as both  and  are positive, we are not trying to divide by 0.
If we put 
, the last equality can be written as
        
We will denote as  the left hand side of this equality.
As we are only interested in positive roots of p, we can divide by  and consider  with  for .
Taking into account that 
, that 
q obviously vanishes at 1 (but this root corresponds to the void case 
 and so 
 and 
 are not defined) and also that the limit of 
 is 
∞ as 
z goes to 
∞, there must exist another 
, maybe 
, such that 
. By Descartes’ rule of signs—see [
9], Theorem 2.2—both 
q and 
p have at most two positive roots, so there is exactly one other positive root of 
p. To determine whether this root is greater or smaller than 1, we can compute the derivative of 
p at 1.
        
        so 
. This is positive provided 
, so the other root of 
p lies between 0 and 1 whenever the condition in the statement is fulfilled.
Therefore, there exists just one 
 for which
        
Now, taking
        
        and 
, we have the function we were looking for.
Moreover, suppose that there exist 
 such that 
 approximates 
 at least as well as 
f. We may suppose that 
. Now, the conditions for 
 can be rewritten as
        
By [
1], Lemma 2.8, this means that 
. □
 Remark 5. This Lemma gives a kind of analytic solution to the best approximation problem, the only obstruction being the determination of the other root of p. In the next section, we do the same for the symmetric cases and, ironically, give actual analytic solutions to the same problem when the data are not well suited to approximation by exponentials.
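Since Descartes' rule of signs does the heavy lifting in the proof above, a tiny sketch may help readers unfamiliar with it. The coefficients of the actual polynomial p depend on the data, so we illustrate the rule on a hypothetical polynomial with a root at 1, mirroring the structure of q:

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, skipping zeros.
    By Descartes' rule of signs, the number of positive real roots
    (with multiplicity) equals this count or is smaller by an even
    number."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)

# Hypothetical q(z) = z**2 - 3*z + 2 = (z - 1)*(z - 2): as in the proof,
# one root at z = 1 and, since there are two sign changes, at most one
# other positive root.
print(sign_changes([1, -3, 2]))
```

The same counting argument is what bounds the positive roots of p and q by two in the proof of Lemma 2.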
   3. Symmetric Cases and Limits
In this section, we focus on those cases that do not match the problem we have in mind but, nevertheless, have their own interest. First, we approach the symmetric cases such as, for example, exponential growths. Second, we approach the limit cases, that is, the ones whose best approximation is not an exponential but the limit as  or  of exponentials. They are not what one expects to find while fitting data that follow an exponential decay, but we have been able to identify when they occur and deal successfully with them.
  3.1. Symmetric Cases
If  and f are as in the statement of Lemma 2, then a moment’s reflection is enough to realize that:
- 1.
- The t-symmetric data  have
             - 
            as its best approximation. 
- 2.
- The T-symmetric data  have
             - 
            as its best approximation. 
- 3.
- The bisymmetric data  have
             - 
            as its best approximation. 
These symmetries correspond to the following:
- 1.
- If , then there are still two changes of sign in the coefficients of p, so it has another positive root. The difference here is that both a and k are positive. Please observe that this means that , so f must be increasing and increases faster for greater t. 
- 2.
- If , then  and . 
- 3.
- If , then everything goes undisturbed, but we have , so the second root of p is greater than 1. This implies that  and . 
  3.2. Limit Cases
Even if the conditions are not fulfilled by any symmetric version of the dataset, the computations made in the proof of Lemma 2 give the answer to the approximation problem:
If , then  is a double root—this corresponds to —and the “exponential” we are looking for is a line with negative slope. Namely, its slope is  and this best approximation is the line given by Chebyshev’s Alternation Theorem.
If , then this “exponential” is a line with positive slope —this is symmetric to the previous case.
If  or , then we have, up to symmetries, three cases:
- i:
- If  and , then the best approximation is a constant, namely, . 
- ii:
- If  then there is no global best approximation, but every exponential approximates  worse than the limit, with , of the best approximations. This limit is
             
            and it also turns out to be a kind of best approximation for every . 
- iii:
- If  then the situation is as follows: As  lies before , any good approximation must be non-increasing. T attains its second greatest value after , so every decreasing function approximates  worse than the function  defined as in (6). Actually,  could be ignored whenever , as we are about to see in the last item. 
Finally, if  and  have different signs, then there is just one change of sign in the coefficients of p, so the only positive root of p is  and , and there is no function fulfilling the statement. More precisely, this situation has two paradigmatic examples: with  and  or with  and .
In the first case, the third point 
 simply does not affect the approximation in the sense that, for every 
k, the exponential 
 that best approximates 
T fulfils
        
Namely, if  is decreasing then , so  is neither  nor . If  is increasing then  for , and this, along with Proposition 1, implies that  cannot be the best approximation.
The second case is similar. Though the point  is relevant for some approximations, it can be ignored for every  for some .
  4. General Datasets
In this Section, we apply the previous results to datasets of arbitrary size in order to find out when a dataset has an exponential as its best approximation. Before we arrive at this first objective's main result, Theorem 1, we will need several minor results. The path that we will follow is, in a nutshell, the following:
Lemma 3 is a technical Lemma that allows us to show that the maps , ,  are continuous; see Corollary 1.
Lemma 4 is just Chebyshev’s Alternation Theorem, and it suffices to determine which datasets are well suited to approximation by decreasing, convex functions, like exponential decays. We will call these datasets admissible from Definition 2 on.
Then, we determine the vectors one obtains by taking limits of exponentials with exponents converging to  or 0; see Proposition 2.
With all these preparations, we are ready to translate Lemma 2 to a more general statement, keeping the  condition. We give a necessary and sufficient condition for any dataset  to be approximable by exponential decays in terms that are easily generalizable to . This is Proposition 3.
In Proposition 3 and Remark 7, we improve the results in Corollary 1 to obtain Remark 8, where we show that we can handle the best approximations with ease if the variations of k are small enough.
Finally, Proposition 5 reduces the general problem to the  case, thus yielding Theorem 1.
Lemma 3. Let  and  be the best approximation for  and suppose that there are exactly three indices  such that the equalities hold. Then, there exists  such that, for every  the equalities hold with the same indices. Moreover, if  is such that the indices where the norm is attained are not , then there exists  for which the norm is attained in at least four indices.  Proof.  Suppose ; the case  is symmetric.
As 
—see Lemma 1—taking 
 for every 
 we have 
. If we take, further
        
        then 
, so defining 
 we get 
 and a straightforward computation shows that
        
        for every 
k. Given any 
, our hypotheses give
        
As the map 
 is continuous for every 
l, we obtain that
        
        holds for every 
k in a neighbourhood of 
, say 
. Since there are only finitely many indices, we may take 
 as the minimum of the 
 to see that 
 is the best approximation for 
, and finish the proof of the first part.
Moreover, it is quite obvious that the expression for 
 will be 
 with
        
        if and only if, for every 
, one has
        
Please observe that the symmetric inequalities could hold and it would make 
 have the same expression, but this would imply 
. In any case, if 
 is such that there is some 
l for which
        
        then it is clear that there is 
 such that
        
Maybe it is not , but taking  as the smallest real number in  for which there exists such an l, we are done. □
 Corollary 1. The maps  and  are continuous.
 Lemma 4. Let . There exists exactly one line  such that  for some  and , and this line approximates T better than any other line.  Proof.  It is a particular case of Chebyshev’s Alternation Theorem, applied to the polygonal defined by . □
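Lemma 4 is constructive enough to compute directly. The sketch below (our own exhaustive O(n³) implementation with hypothetical data, fine for small datasets) finds the best max-norm line by solving, for each triple of indices, the 3×3 system that forces alternating residuals e, -e, e, and keeping the triple with the largest |e|; by de la Vallée Poussin's argument, each triple's equioscillation error lower-bounds the global error, and the optimal line equioscillates on the maximizing triple.

```python
import itertools
import numpy as np

def chebyshev_line(t, y):
    """Best max-norm line c*t + d for small discrete datasets.
    For each triple i < j < k, solve for the line whose residuals at
    the triple are e, -e, e; the triple with the largest |e| yields
    the globally best line (three-point characterization)."""
    best = (-1.0, 0.0, 0.0)
    for i, j, k in itertools.combinations(range(len(t)), 3):
        A = np.array([[t[i], 1.0, 1.0],
                      [t[j], 1.0, -1.0],
                      [t[k], 1.0, 1.0]])
        c, d, e = np.linalg.solve(A, np.array([y[i], y[j], y[k]]))
        if abs(e) > best[0]:
            best = (abs(e), c, d)
    return best[1], best[2], best[0]

# Hypothetical decreasing, convex data.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([3.0, 1.5, 0.8, 0.5])
c, d, err = chebyshev_line(t, y)
residuals = y - (c * t + d)
```

Here the residuals alternate as +err, -err, ·, +err on the optimal triple, which is the equioscillation pattern that Remark 6 exploits to define admissibility.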
 Remark 6. Thanks to Lemma 4, we can define which vectors T will be our “good vectors”: those for which  and the equalities (8) hold with  and not with . When data fulfil these conditions, we have some idea of decreasing monotonicity and also some kind of convexity, and this is the kind of dataset that we wanted, though we will need to add some further conditions. Anyway, when dealing with datasets that fulfil any couple of symmetric conditions, we just need to have in mind the symmetries. Specifically, they will behave as in Section 3.1.  Definition 2. Let  be the line that best approximates . We will say that  is admissible when  and there exist  such that
- 1.
- 2.
-  for every  and every . 
- 3.
-  for every . 
 Once we have stated the kind of data on which we will focus, say discretely decreasing and convex, we have to determine when they will be approximable. Before that, we will study the behavior of the limits of best approximations.
Lemma 5. Let . For each , consider some exponential  and Then  depends on k but not on  or , and moreover:
- 1.
- When  
- 2.
- When ,  
- 3.
- When . 
 Proof.  We only need to make some elementary computations to show that
        
The computation of the limit at 0 only needs an application of L’Hôpital’s rule, and the other ones are even easier once one substitutes . See [1], Lemma 2.10. □
 Proposition 2. Let . Then, the following hold:
- 1.
- For , . 
- 2.
- For ,  is , the line that best approximates . 
- 3.
- For ,  takes at most two values, and fulfils 
- 4.
- For ,  takes at most two values, and fulfils 
 Proof.  Let . If the best approximation for  is a constant, then it is constant for every , and the constants are obviously the same. Therefore, we may suppose  is not a constant for any . In this case, Lemma 3 implies that  is continuous for every l. As we have just a finite number of indices, this means that  so we are done.
The proof of the last three items is immediate from Lemma 5. □
 Proposition 3. Let  and . Then, the best exponential approximation to  has the form  with  if and only if  is admissible and the following does not happen:
 and the second greatest value of T is attained after .
 Proof.  As the second greatest value of T will appear frequently in this proof, we will denote it as . Analogously, 
If ♠ happens, then the following expression is the limit of best approximations when 
Indeed, as 
 is the pointwise limit of functions fulfilling (
2), it must fulfil (
2) as well. It is clear that this implies that 
 must be as in (
9). It is clear that every strictly decreasing function approximates 
 worse than 
, so we have finished the first part of the proof.
Conversely, if 
 is admissible, then there are exponentials with 
 that approximate 
 better than the line 
. Indeed, we only have to consider the three points of the Definition 2 and take into account Lemma 3. As the function error is quasiconvex, the only option for contradicting the statement is that every exponential is worse than the 
 limit of the approximations 
, and of course this limit is not better than 
 as in (
9) because no vector of the form 
 approximates 
T better than this. Therefore, we may suppose 
 is the best approximation—recall that we are supposing that 
 is admissible. We need to break down several possibilities:
- I:
- If , then we can change the first coordinate of  from  to  without increasing the error, so one best approximation is a constant, and this means that , so  is not admissible, a contradiction. 
- II:
- If , then  is not admissible. 
- III:
- If , then we still have some options: - i:
- If  then we obtain that ♠ holds, no matter the value of . 
- ii:
- If  and the rate of decrease  is greater than , then  fulfils the hypotheses of Lemma 2. This implies that the best approximation to  has , so the best approximation to  has . 
- iii:
- If  and the rates of decrease are equal, then  is not admissible because this implies
                 
- iv:
- If  and , then Lemma 2 ensures that the best approximation is , with  and . □ 
 
 Remark 7. Let  be the line that contains  and . In [1], Lemma 2.10, it is seen that, if , where , then  and - As , . 
- As , . 
- As , . 
Essentially, the same proof suffices to show how  behaves:
- As , . 
- As , . 
- As , . 
 This implies that the map  is strictly increasing, while  is strictly decreasing. So,  increases as  decreases, and, moreover, the map  given by  for  and  is a (decreasing) homeomorphism from  to . Applying the same reasoning to  and to , we obtain this key result:
Proposition 4. Let  and  and consider for every  the only exponential  such that  and  the only line such that . Then, all the following maps are homeomorphisms,  and  are decreasing and  is increasing:
- 1.
-  defined as . 
- 2.
-  defined as . 
- 3.
-  defined as . 
 We can rewrite Proposition 4 as follows:
Remark 8. Let  and . Let, for every ,  be the only exponential that fulfils . Then, when k increases,  and  decrease,  increases, and everything varies continuously.
 Proposition 5. The best exponential approximation (including limits) to  is the best approximation for some quartet 
 Proof.  Let 
 be the best approximation, and suppose that the conclusion does not hold. Then, we may suppose that there are exactly three indices where the norm is attained, say 
 and
        
If  is the limit at  of the best approximations, then it is the best approximation for every quartet that contains  because this means that ♠ holds. Therefore, suppose that the best approximation is , for some  – maybe . Then, for some  the functions , with  approximate this triple better than . Reducing if necessary , Remark 8 implies that every  with  approximates  better than , thus getting a contradiction. □
 Theorem 1. Let  be admissible. Then, the best approximation is an exponential if and only if ♠ does not happen.
 Proof.  The proof of Proposition 3 is enough to see that ♠ avoids the option of  being approximable by a best exponential.
If  is admissible, then the best approximation cannot be the 0-limit of exponentials, so it is either an exponential or the -limit of exponentials. Thus, suppose it is the -limit and let us see that in this case ♠ holds. It is clear that  as in the proof of Proposition 3, so we just need to show that  occurs later than . Let . A moment’s reflection suffices to realize that  for every , so the error for  is exactly . Let  be the last appearance of  and  the first appearance of , and suppose , i.e., that ♠ does not hold. Thanks to Proposition 4, for small , there is k close enough to  that we can find  such that ,  when  and  when . If we take  small enough,  approximates  better than . □
 The value of 
b in (
3) can be easily generalised, so we do not need to worry about it. If we are able to determine 
k and 
, then finding 
b is just a straightforward computation. Namely, the following Lemma solves it:
Lemma 6. For , the best approximation to T in  is attained when With this b, the error is  Proof.  First, we compute the error. Since, obviously,
        
        we just need to take into account that (
10) implies
        
On the one hand, this implies that the error is as in the statement. On the other hand, let 
. Then,
        
        so 
 approximates 
T worse than 
. The same happens with 
 if we take 
, so the best approximation is the one with 
b as in (
10). □
 With this section, we have covered the theoretical aspects of the first objective of this paper. Examples in Section 5 are about Newton's Law of Cooling and directly apply what has been developed here, completing in this way the first objective.