Quantifying the Predictability and Efficiency of the Cointegrated Ethanol and Agricultural Commodities Price Series

Abstract: Ethanol is an energy commodity and a biofuel that has helped to reduce the use of fossil fuels. Nonetheless, the environmental benefits derived from the use of ethanol can come at the expense of agricultural commodity prices, affecting their volatility and efficiency. This problem occurs because most of the raw materials currently used to produce biofuels, such as corn in the US, sugarcane in Brazil, and oilseeds in Europe, are also important global commodities. This work adopts several mathematical tools, namely the Detrended Fluctuation Analysis, the fractal dimension, and the Hurst and Lyapunov exponents. This set of tools measures the market efficiency and the price predictability of ethanol and of some agricultural commodities that revealed price transmission (cointegration) in a previous work. The results show that, in general, ethanol has a shorter predictability horizon than the other commodities. Moreover, a quantitative measure of market performance, the efficiency index, is discussed. We observe that the efficiency of ethanol is similar to that of the other agricultural commodities evaluated.


Introduction
The mitigation of climate change is a concern not only of researchers and governments, but also of a significant part of the world's population. Nevertheless, fossil fuels are used worldwide and play a major role in global warming and climate change [1].
Since the 1970s, Brazil has contributed to minimizing these unwanted effects by increasing the use of biofuels, notably ethanol [2,3]. However, this topic is still under debate, as future cropland expansion for ethanol and other biofuels can indirectly increase deforestation, producing the opposite of the expected mitigation of greenhouse gas emissions [4–6]. The ProAlcool program, the production of flex-fuel (ethanol/gasoline) vehicles, and a large number of financial subsidies for sugarcane producers (sugarcane being the feedstock for Brazilian ethanol) were important policies encouraging the sugar and alcohol sector in Brazil [7,8].
Nowadays, Brazil is preparing for a new round of investments in this biofuel market. The new Renovabio 2030 program is part of this initiative and aims at a mandatory increase of the ethanol blend in gasoline to 30% by 2022 and 40% by 2030 [9], against the current 27%.
Indeed, in the last decade, Brazilian ethanol production was held back by government policies that artificially shielded domestic gasoline prices from the fluctuations of the international market. For this reason, gasoline prices in the domestic market were forced down during this period. To keep ethanol competitive vis-à-vis gasoline, its cost was also kept low, making the sector inefficient or even unfeasible [10].
Several works have studied ethanol price dynamics [9,11–18]. Albarracine et al. [11] investigated the efficiency of the energy (ethanol) and agricultural (sugar) markets, and their findings pointed to a higher market efficiency for the energy market than for the agricultural one. Quintino et al. [16] analyzed the relationship between the spot and futures prices of ethanol in Brazil and found that the futures market is more efficient in price discovery and information transmission. However, the cash market leads to a long-run price discovery process. Other studies also pointed out that the energy market is influenced by agricultural commodity prices [9,19–30].
We must bear in mind not only that the efficiency of a market is related to how information is transmitted to its prices, but also the significant weight of Brazil in ethanol and agricultural production and the price transmission between two or more markets. The influence of these factors on market efficiency can lead to new strategies for traders, hedgers, and investors. For example, the more efficient a market, the smaller the likelihood that information from other markets is reflected significantly in its prices. As ethanol is produced from sugarcane in Brazil, it is expected that it can also be affected by the market volatility present in such agricultural commodities.
In fact, the study of volatility has been widely applied to commodity prices and has led to the adoption of models that can effectively help predict price imbalances [20,24,31–33]. Furthermore, the mitigation of price volatility is a practice that is also seen in country policies [34–36]. For example, in Brazil's 2014 elections the government tried to control the volatility of ethanol and gasoline prices by adopting a disguised freeze of their values. Besides, the efficiency of a market can also give information about its volatility, since efficiency is also indicative of low volatility, i.e., of a small variance of prices [37].
Hereafter, we adopt several tools [38–40], namely the Detrended Fluctuation Analysis (DFA), the fractal dimension ($d_A$), and the Hurst exponent (H), to measure market efficiency through the so-called "efficiency index" (EI), and the Lyapunov exponent (λ) to quantify the price predictability of ethanol and several agricultural commodities.
Such goods revealed price transmission (cointegration) in a previous study [19]. In fact, not only some agricultural commodities [9,28], but also commodities relevant to Brazil's economy (e.g., cotton, live cattle, corn, soybean, Arabica coffee, and Robusta coffee) have a direct linkage with Brazilian ethanol.
The paper is structured as follows. In Section 2, the price time series (TS) are introduced, and the DFA, H, $d_A$, EI, and λ indices are formulated. In Section 3, the results are presented and discussed. Finally, in Section 4, the main conclusions are outlined.

Data and Methodology
David et al. [19] investigated the cointegration and price transmission through a bivariate analysis between ethanol (ETH) and seven commodities that are important to the Brazilian GDP. Those agricultural commodities are sugar (SUG), cotton (COT), live cattle (LCA), Arabica coffee (ARA), Robusta coffee (ROB), corn (COR), and soybean (SOY). The presence of breakpoints in the ethanol price series was identified using the Bai-Perron algorithm [41], and the period and sub-periods were classified as follows: Full-period (Jan/2011 to Dec/2018), Sub-period 1 ($P_1$: Jan/2011 to May/2012), Sub-period 2 ($P_2$: May/2012 to Nov/2013), Sub-period 3 ($P_3$: Nov/2013 to Sept/2015), Sub-period 4 ($P_4$: Sept/2015 to Oct/2017), and Sub-period 5 ($P_5$: Oct/2017 to Dec/2018). Table 1 summarizes the cointegrations found by David et al. [19].
The data were obtained from the Center for Advanced Studies on Applied Economics/University of Sao Paulo (CEPEA/USP), and the CEPEA methodology for the daily pricing of these products can be found on its website www.cepea.esalq.usp.br.

Table 1. Johansen cointegration by means of the Max-Eigen and Trace tests from David et al. [19]. Significance levels of 10% (*), 5% (**) and 1% (***) for rejecting the null-hypothesis of no cointegration. [Table body not recovered; first column header: Pairs.]
We adopt the EI introduced by Kristoufek and Vosvrda [42], which is calculated from H and $d_A$ and described later, to assess the market efficiency of ethanol and the agricultural commodities. The main idea is to observe the distance of the actual market state from an ideal benchmark index for the full-period and for the sub-periods $P_1$, $P_3$, and $P_5$. Additionally, we employ the EI from two different perspectives. First, we consider EI as a single value representing each of the cointegrated TS, as described in Section 2.3. Second, we explore EI using a "Rolling Window Approach", which allows the dynamics of the index to be visualized over time. Moreover, in Section 2.4 we investigate the predictability of the price TS using the Lyapunov exponent.

Detrended Fluctuation Analysis and Hurst Exponent
The DFA is one of the most widely used methods for measuring H, as it allows H to be calculated from the slope of a variability function F(m) plotted versus m on a log-log scale, where m is the size of the time slots into which the TS is divided [43–46].
The H exponent was introduced by H. Hurst [39] and is related to the concepts of Brownian motion (Bm) and fractional Brownian motion (fBm), being often used to quantify the long-range, or long-memory, dependence in the TS [47–50].
We find a variety of computational techniques for determining H, such as the classical rescaled range analysis originally developed in [51,52], Fourier analysis using the FFT algorithm [53,54], Detrended Moving Average (DMA) [55,56], and wavelet decomposition [57,58]. However, the DFA allows calculating H with a simple procedure, avoiding the spurious detection of correlation or self-similarity, and can be applied for processing nonstationary TS [59,60].
The value of the H index [39,61] can be interpreted considering the following properties:
• 0 < H < 1;
• H = 1/2, for a random walk (Bm), where the TS has no long-memory process;
• H > 1/2, for a persistent (long-memory or correlated) process, which leads to the concept of the fBm;
• H < 1/2, for an antipersistent (short-term memory, anticorrelated) process.
Thus, the closer the H value is to 1, the higher the probability for the next change to be positive, if the last change was also positive and vice versa.
The calculation of the H index using the DFA involves several steps. The first step [59,62] consists of the following estimation,

$$P(k) = \sum_{t=1}^{k} \left( P_t - \bar{P} \right), \quad k = 1, \ldots, N, \tag{1}$$

where N is the number of observations in the TS, $P_t$ represents the value (price) observed at the time instant t, and $\bar{P}$ denotes the arithmetic average of the price. The second step [59,62] calculates the quantity F(m), that is, the root-mean-square error (RMSE), by means of

$$F(m) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ P(i) - P_m(i) \right]^2}, \tag{2}$$

where $P_m(i)$ is the ordinary least squares value, fitted within each time slot of size m, that is subtracted from P(i) for removing any trend. The process is repeated for distinct values of m, and the slope of the line relating log(F(m)) with log(m) determines the scaling H exponent.
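The two DFA steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `dfa_hurst`, the choice of non-overlapping windows with a linear (first-order) local fit, and the window sizes in the usage note are all assumptions. Whether the input should be prices or returns is also left to the reader; the usage note feeds uncorrelated increments, for which a slope near 1/2 is expected.

```python
import numpy as np

def dfa_hurst(series, window_sizes):
    """Slope of log F(m) versus log m (hypothetical helper, linear detrending)."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())      # step 1: cumulative sum of mean-centred values
    fluct = []
    for m in window_sizes:
        n_win = len(profile) // m          # non-overlapping windows; the remainder is dropped
        t = np.arange(m)
        sq_resid = 0.0
        for w in range(n_win):
            seg = profile[w * m:(w + 1) * m]
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local least-squares line
            sq_resid += np.sum((seg - trend) ** 2)
        fluct.append(np.sqrt(sq_resid / (n_win * m)))      # step 2: RMS of detrended profile
    slope, _intercept = np.polyfit(np.log(window_sizes), np.log(fluct), 1)
    return slope
```

For example, `dfa_hurst(np.random.default_rng(0).standard_normal(4000), [16, 32, 64, 128, 256])` should return a value close to 0.5, since white-noise increments have no long memory.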
If $k \in \mathbb{N}$, then the $k$th order auto-covariance is defined as

$$\gamma_k = \mathrm{E}\left[ (P_t - \mu)(P_{t+k} - \mu) \right], \tag{3}$$

where μ denotes the mean of the TS, and the $k$th order autocorrelation can be determined as

$$\rho_k = \frac{\gamma_k}{\gamma_0}. \tag{4}$$

Peters [63] explored an important relation between H and the autocorrelation function ρ, given by

$$\rho = 2^{2H-1} - 1. \tag{5}$$

In this work, the value of H is used as a component to calculate the EI index for ethanol and the cointegrated agricultural commodities. If we obtain H > 1/2, then it indicates long-term memory in the TS [47]. However, the index can be affected by short-term memory bias or distributional properties [64]. Therefore, H values deviating from the theoretical value of H = 1/2 do not necessarily indicate the absence of a random walk. To mitigate this problem, other measures are used in addition to H in what follows.
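As a small numerical companion to the autocorrelation and to Peters' relation between H and ρ, the sketch below computes a sample autocorrelation and evaluates the commonly quoted form ρ = 2^(2H−1) − 1. The function names and the 1/N normalization of the sample autocovariance are assumptions made for illustration only.

```python
import numpy as np

def autocorrelation(series, k):
    """Sample k-th order autocorrelation, using a 1/N normalization (an assumption)."""
    x = np.asarray(series, dtype=float)
    d = x - x.mean()
    gamma_k = np.sum(d[:len(x) - k] * d[k:]) / len(x)   # sample autocovariance at lag k
    gamma_0 = np.sum(d * d) / len(x)                    # sample variance (lag 0)
    return gamma_k / gamma_0

def peters_correlation(h):
    """Correlation implied by a Hurst exponent h through rho = 2**(2h - 1) - 1."""
    return 2.0 ** (2.0 * h - 1.0) - 1.0
```

With this relation, H = 1/2 gives ρ = 0 (no memory), H > 1/2 gives ρ > 0 (persistence), and H < 1/2 gives ρ < 0 (antipersistence).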

Fractal Dimension
The local memory of a TS can be measured by the fractal dimension $d_A$, with 1 < $d_A$ ≤ 2. The index $d_A$ is closely connected to H, so that $d_A$ + H = 2. This relationship allows a local behavior (fractal dimension) to be reflected accurately in a global behavior (long-term memory) [42] of a given TS.
Similarly to the H definition, the $d_A$ properties [42,64] can also be summarized as three different intervals:
• $d_A$ = 3/2, for a random walk (Bm), such that the TS has no long-memory process and no local anticorrelations;
• $d_A$ < 3/2, for a persistent (long-memory or correlated) process, which leads to the concept of the fBm;
• $d_A$ > 3/2, for an antipersistent (short-term memory, anticorrelated) process.
In this work, $d_A$ is obtained from two methods, namely the so-called Hall-Wood (HW) and Robust Genton (RG) estimators [40]. The two estimates, $d_A^{HW}$ and $d_A^{RG}$, are then combined, and their mean value $M_D$ is used to calculate the EI index as described in Section 2.3.

Hall-Wood Estimator
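The HW estimator proposed in [65] is a box-counting estimator that admits small scales. It uses the total area of the boxes that cover the curve instead of just their number. Formally, let us have a scale $\varepsilon_l = l/n$, where l = 1, 2, 3, ..., n, and let $x_{i/n}$ denote the TS value at the $i$th point of the grid. The aforementioned area is

$$\hat{A}(l/n) = \frac{l}{n} \sum_{i=1}^{[n/l]} \left| x_{il/n} - x_{(i-1)l/n} \right|, \tag{6}$$

where [n/l] is the integer part of n/l. The HW estimator is given by the expression

$$d_A^{HW} = 2 - \frac{\sum_{l=1}^{L} (s_l - \bar{s}) \log \hat{A}(l/n)}{\sum_{l=1}^{L} (s_l - \bar{s})^2}, \tag{7}$$

where L ≥ 2, $s_l = \log(l/n)$, and $\bar{s} = (1/L) \sum_{l=1}^{L} s_l$. Using L = 2, as suggested by Hall and Wood to avoid bias [65], one obtains

$$d_A^{HW} = 2 - \frac{\log \hat{A}(2/n) - \log \hat{A}(1/n)}{\log 2}. \tag{8}$$

Similarly to H, the fractal dimension $d_A^{HW}$ is applied as a component to calculate the EI index for ethanol and for the cointegrated agricultural commodities.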
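The L = 2 version of the HW estimator can be sketched as follows. The sketch assumes the usual form in which the box area at scale l/n is (l/n) times the sum of absolute increments of the series sampled every l steps, and in which the dimension is 2 minus the log-ratio of the areas at scales 2/n and 1/n divided by log 2. The function names are hypothetical.

```python
import numpy as np

def hall_wood_area(x, l):
    """Box area at scale l/n for a series x_0, ..., x_n on an equally spaced grid."""
    x = np.asarray(x, dtype=float)
    n = len(x) - 1
    coarse = x[::l]                      # x_0, x_l, x_2l, ...: floor(n/l) increments
    return (l / n) * np.sum(np.abs(np.diff(coarse)))

def hall_wood_dimension(x):
    """Fractal dimension estimate from the two smallest scales (L = 2)."""
    a1 = hall_wood_area(x, 1)
    a2 = hall_wood_area(x, 2)
    return 2.0 - (np.log(a2) - np.log(a1)) / np.log(2.0)
```

As a sanity check under these assumptions, a straight line gives a dimension of 1, while a simulated Gaussian random walk gives a value close to 3/2.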

Robust Genton Estimator
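Genton [66] introduced the RG estimator to calculate $d_A$ based on a highly robust estimator of scale. It is well known that classical variogram estimators are not robust against outliers in the data and, therefore, the estimator developed by Genton is adopted. The calculation relies on Genton's robust variogram estimate at lag l/n, denoted here by $\hat{V}(l/n)$. Similarly to the HW, one obtains the RG estimator as

$$d_A^{RG} = 2 - \frac{\sum_{l=1}^{L} (s_l - \bar{s}) \log \hat{V}(l/n)}{2 \sum_{l=1}^{L} (s_l - \bar{s})^2}, \tag{10}$$

where L ≥ 2, $s_l = \log(l/n)$, and $\bar{s} = (1/L) \sum_{l=1}^{L} s_l$. Using L = 2 to mitigate bias, one obtains

$$d_A^{RG} = 2 - \frac{\log \hat{V}(2/n) - \log \hat{V}(1/n)}{2 \log 2}. \tag{11}$$

The fractal dimension $d_A^{RG}$ is also used as a component to calculate the EI index for ethanol and the cointegrated agricultural commodities.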
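To illustrate the variogram route to the fractal dimension, the sketch below applies the L = 2 log-ratio form, assuming the dimension is 2 minus the log-ratio of the variogram at lags 2 and 1 divided by 2 log 2. Note the substitution: it uses the classical method-of-moments variogram (half the mean squared increment), not Genton's robust estimator, so it is not the RG estimator itself and remains sensitive to outliers. The function names are hypothetical.

```python
import numpy as np

def classical_variogram(x, l):
    """Method-of-moments variogram at lag l: half the mean squared increment.

    This is the non-robust classical estimator, swapped in for Genton's
    robust variogram purely for illustration.
    """
    x = np.asarray(x, dtype=float)
    diffs = x[l:] - x[:-l]
    return 0.5 * np.mean(diffs ** 2)

def variogram_dimension(x):
    """Fractal dimension from the variogram at lags 1 and 2 (L = 2)."""
    v1 = classical_variogram(x, 1)
    v2 = classical_variogram(x, 2)
    return 2.0 - (np.log(v2) - np.log(v1)) / (2.0 * np.log(2.0))
```

Replacing `classical_variogram` with a robust variogram estimate would leave `variogram_dimension` unchanged, which is the sense in which the RG estimator follows the same construction.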

Market Efficiency Measure
The EI index is defined [42,64,67] as

$$EI = \sqrt{ \left( \frac{M_H - M_H^*}{R_H} \right)^2 + \left( \frac{M_D - M_D^*}{R_D} \right)^2 }, \tag{12}$$

where $M_H^* = 1/2$ and $M_D^* = 3/2$ are the expected values for the efficient market, and $R_H$ and $R_D$ are the ranges of the measures related to H and $d_A$, respectively. We note that $M_H$ is the measure of H and that the value $M_D$ is obtained from $M_D = (d_A^{HW} + d_A^{RG})/2$, with $d_A^{HW}$ and $d_A^{RG}$ calculated by Equations (8) and (11), respectively. We consider $R_H = 2$ for the Hurst exponent and $R_D = 1$ for the fractal dimension, so that the maximum deviation from the efficient market value is the same for all measures.
Values of EI near zero imply a more efficient market, meaning a smaller distance (deviation) between the measured values and those of an efficient market.
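The EI computation can be sketched as below, assuming the index is the square root of the sum of squared, range-normalized deviations of the Hurst measure and of the mean fractal dimension from their efficient-market values (1/2 and 3/2). The function name and argument names are hypothetical; the default ranges are the ones stated in the text.

```python
import math

def efficiency_index(h, d_hw, d_rg, r_h=2.0, r_d=1.0):
    """Distance of (H, mean fractal dimension) from the efficient-market point.

    h     : Hurst exponent estimate
    d_hw  : Hall-Wood fractal dimension estimate
    d_rg  : Robust Genton fractal dimension estimate
    r_h, r_d : ranges used to normalize each deviation (values taken from the text)
    """
    m_d = 0.5 * (d_hw + d_rg)            # mean of the two fractal dimension estimates
    dev_h = (h - 0.5) / r_h              # deviation from H = 1/2
    dev_d = (m_d - 1.5) / r_d            # deviation from d_A = 3/2
    return math.sqrt(dev_h ** 2 + dev_d ** 2)
```

For instance, `efficiency_index(0.5, 1.5, 1.5)` is exactly zero, and any departure of H or of the mean fractal dimension from the benchmark increases the index.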

Lyapunov Method
The Lyapunov concepts [68,69] are a useful mathematical tool for analyzing non-linear and chaotic systems. The Lyapunov exponent (λ) [70] determines how the presence of chaos affects the predictability of the future [71].
As the H index is related to the fractal dimension ($d_A$) of a TS by the condition $d_A = 2 - H$, a relation between the Hurst and Lyapunov exponents can be estimated from the global dimension ($d_G$). This dimension is used to find the neighboring points in the TS and must be at least $2 d_A$ in order to avoid false neighbors in the calculation [70]. Then, one can write [70,72]

$$d_G \geq 4 - 2H. \tag{13}$$

In this work, we obtain the Lyapunov exponent from a TS by means of the algorithm developed by Wolf et al. [71]:

$$\lambda = \frac{1}{t_M - t_0} \sum_{k=1}^{M} \log_2 \frac{L'(t_k)}{L(t_{k-1})}, \tag{14}$$

where $L(t_{k-1})$ is the distance between two neighboring points of the reconstructed trajectory at time $t_{k-1}$, $L'(t_k)$ is the evolved distance at time $t_k$, M is the total number of replacement steps, and $t_k - t_{k-1} = \Delta$ is the time step. The signs of the Lyapunov exponents provide information about the system dynamics. Indeed, the value of λ is an important index for diagnosing chaotic motion, since a positive value indicates that the system is chaotic. Besides, the Lyapunov exponent can be an indicator of how far into the future forecasts can be made in a TS [73]. Bearing this fact in mind, we applied this technique to calculate the Lyapunov exponent for the price TS.
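The quantities in this subsection can be illustrated with three small helpers. This is only a sketch: it does not implement the neighbor search and replacement procedure of the Wolf algorithm, but only its final averaging step, assuming the usual form in which λ is the sum of base-2 logarithms of the ratio between evolved and initial separations divided by the total elapsed time MΔ. All function names are hypothetical.

```python
import math
import numpy as np

def min_global_dimension(h):
    """Smallest integer embedding dimension satisfying d_G >= 4 - 2H."""
    return math.ceil(4.0 - 2.0 * h)

def wolf_average(l_start, l_evolved, dt):
    """Average log2 growth rate of separations over M replacement steps.

    l_start   : separations L(t_{k-1}) at the start of each step
    l_evolved : evolved separations L'(t_k) at the end of each step
    dt        : constant time step between replacements
    """
    l_start = np.asarray(l_start, dtype=float)
    l_evolved = np.asarray(l_evolved, dtype=float)
    total_time = len(l_start) * dt       # t_M - t_0 = M * dt
    return np.sum(np.log2(l_evolved / l_start)) / total_time

def prediction_horizon(lam):
    """Prediction horizon 1/lambda, meaningful for a positive exponent."""
    return 1.0 / lam
```

If every separation doubles over a unit time step, `wolf_average` returns 1, and the corresponding horizon from `prediction_horizon` is one time step.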

Results and Discussion
In this section, several numerical experiments are conducted to explore the Hurst exponent, the fractal dimension, the market efficiency index, and the Lyapunov exponent. We use H and $d_A$ in order to obtain the EI index. We highlight that the closer EI is to zero, the higher the efficiency of the market. Also, the Lyapunov exponent is used to evaluate the maximum predictability of a price TS. The numerical value 1/λ can be calculated and interpreted as a quantitative measure of the predictability of the future of the TS based on its past.
Table 2 and Figure 1 show that ROB is the most efficient commodity (EI = 0.0672), whereas COR, SUG, ETH, and COT are less efficient when considering the full-period. Moreover, COT also has the lowest value of λ, implying the longest prediction horizon for this period. The similarity of the EI values for ETH and SUG may indicate a market linkage effect between those commodities. On the other hand, ETH presents the shortest prediction horizon (1/λ = 3.2520), which is shorter than four full days.
For $P_1$, ROB shows the lowest value of EI, followed by SUG, ETH, and COT. Also, SUG achieves the longest prediction horizon when compared to the other commodities.
During $P_3$, ARA has the lowest value of EI (0.0224) and also the shortest prediction horizon (1/λ = 2.5803). The values of EI for the other commodities analyzed in this period are 0.1477, 0.1027, and 0.0961 for COR, LCA, and ETH, respectively.
ETH achieved its lowest value of EI during $P_5$ when compared to all periods investigated in this work, indicating a move towards market efficiency for ETH. In the opposite direction, ROB has been increasing its EI value over the evaluated periods.
We also adopt a "Rolling Window Approach" to explore the EI dynamics over time. In this approach, the EI is calculated in a rolling fixed window with a length of 100 observations. Therefore, the window starts at the first 100 observations and rolls forward until the last observation, for a sample of 135 observations. The window size is chosen bearing in mind that it must be long enough to reflect the index dynamics and provide statistical significance [74,75]. Figure 2 depicts the dynamics of EI for all periods and commodities with cointegrations confirmed in [19].
Figure 2a shows the results achieved for the full-period. One can note that the price series of ROB is the one that fluctuates closest to the values pointing to a more efficient market. In turn, COT presents the highest EI values for the full-period. Although the Brazilian cotton market does not present strong evidence of price transmission with international prices [76,77], it is widely known that major players (e.g., the USA) affect the price dynamics of this sector. In consequence, the Brazilian cotton sector still has difficulties in competing with international prices, resulting in importation as the most feasible option for domestic firms in some cases [78].
Figure 2b illustrates the dynamic behavior of EI when $P_1$ is considered. Here, ROB again achieves the lowest values of the EI index.
We verify from Figure 2c that ARA presents lower values of EI during $P_3$. Also, COR and LCA have the highest values of EI and move very closely, or even together, over time. As pointed out in [19], the cointegration between the ETH and COR prices possibly forced LCA to be also cointegrated with ETH. This evidence becomes clearer when EI is analyzed, as the two commodities are also moving together in terms of efficiency.
Figure 2d illustrates the dynamic behavior of EI for ETH and ROB during $P_5$, the only cointegrated pair for this period [19]. Note that the values of EI for ETH increase over time in $P_5$, indicating that it is losing efficiency. This may be evidence that, after the Brazilian presidential process, when gasoline prices began to follow international prices again, ETH prices became more inefficient and volatile because of the discontinuation of the effective policy regulations. Conversely, ROB is decreasing its EI values, pointing to a gain of efficiency.
The pattern of high efficiency shown by ROB during the full-period, $P_1$, and $P_5$, as well as by ARA in $P_3$, is expected for these commodities, since coffee is the most liquid and oldest contract traded on the Brazilian Exchange (B3). Therefore, traders and market agents tend to respond faster to changes in the market information [78,79].
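The rolling-window procedure behind Figure 2 can be sketched as follows. The helper is generic: it applies any statistic to every window of fixed length, assuming the window advances by one observation at a time (the step size is not stated in the text). The function name `rolling_apply` is hypothetical, and `np.mean` in the usage note merely stands in for an EI computation.

```python
import numpy as np

def rolling_apply(series, window, func):
    """Apply func to every contiguous window of the given length, stepping by one."""
    x = np.asarray(series, dtype=float)
    n_windows = len(x) - window + 1      # number of full windows that fit in the series
    return np.array([func(x[i:i + window]) for i in range(n_windows)])
```

Under this one-step assumption, a sample of 135 observations and a window of 100 yield 36 index values: `len(rolling_apply(np.arange(135), 100, np.mean))` is 36.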
In the following, we pay special attention to the Lyapunov exponent for the cointegrated TS and its periods. Figure 3 shows that λ is positive for all commodities and for all periods explored, meaning that the TS have a chaotic behavior and a low predictability. We highlight that the closer λ is to 0, the greater the prediction horizon calculated by 1/λ. We note that the smallest values of λ occur for the surfaces related to the full-period (Figure 3a) and to the subperiod $P_5$ (Figure 3d), indicating a greater predictability for these periods when compared to the subperiods $P_1$ and $P_3$ in Figure 3b,c, respectively. In any case, for all the commodity price series, an accurate prediction is limited to a few days.

Conclusions
The detrended fluctuation analysis, the fractal dimension, and the Hurst and Lyapunov exponents provide quantitative information about the market efficiency and price predictability of ethanol and of some agricultural commodities whose price transmission (cointegration) was reported in a previous work. The results suggest that the measure of market efficiency is similar for most commodities and that ethanol has lower predictability than the others. However, the coffee commodities (ARA and ROB) showed higher efficiency for all periods. This behavior is expected, since coffee is the most liquid and the oldest contract traded on the Brazilian Exchange, and thus market agents tend to respond faster to market information. The low predictability of ethanol can indicate an increase in its volatility in the period ($P_5$) following the energy policies that kept energy prices low in the domestic market. Also, when ETH is cointegrated with COR and LCA during $P_3$, the similarity between the EI dynamics of COR and LCA indicates a strong linkage between these markets, even in terms of efficiency.
The results can provide useful information about predictability and efficiency, contributing to the clarification of price relationships between ethanol and other commodities. This can help market agents to make decisions involving hedging, risk exposure, and investment incentives, as inefficiency and prediction horizon are features that can lead to better, more accurate forecasting strategies.
Author Contributions: These authors contributed equally to this work.

Conflicts of Interest:
The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript.