2. Literature Review
Since the early twentieth century, hyperinflation episodes have provided economists with a natural laboratory for studying the interaction between money, prices, expectations, and exchange rates. Among these episodes, the German hyperinflation of the Weimar Republic in the 1920s occupies a central place. While it was neither the first nor the most extreme case of hyperinflation, it was the most prominent episode to occur after economics had consolidated as a formal academic discipline. As a result, it generated a foundational body of work, including early contributions by Laursen and Pedersen (1964), that shaped subsequent thinking on monetary instability.
A useful organizing framework is provided by Tullio (1995), who classifies the hyperinflation literature into three broad strands. The first focuses on real money balances and the quantity theory of money, with seminal contributions by Cagan (1956) and later extensions by Frenkel (1977, 1979) and Taylor (1991). This line of research emphasizes money demand, inflation expectations, and the explosive dynamics that arise when confidence in the domestic currency collapses. The second strand examines exchange rate determination under hyperinflationary conditions, notably in the work of Frenkel (1976) and Engsted (1996), where monetary fundamentals and purchasing power parity play a central role. The third strand builds on rational expectations theory, with influential contributions by Sargent (1977) and Sargent and Wallace (1973), highlighting the role of expectations, policy credibility, and regime consistency in driving inflationary outcomes. Tullio (1995) bridges these strands by proposing a unified framework in which money supply, prices, and exchange rates evolve jointly.
Despite the extensive literature on hyperinflation, surprisingly little attention has been paid to its implications for exchange rate forecasting, particularly as a direct test of the Meese–Rogoff puzzle. Most studies of hyperinflation focus on theoretical consistency or long-run relationships, while the question of whether monetary models can outperform a random walk in real-time forecasting has remained largely in the background. An important exception is Rapach and Wohar (2002), who show that during extraordinary episodes—such as wars or hyperinflation—fundamentals-based models can outperform the random walk. Their results hint at a broader insight: the Meese–Rogoff puzzle may reflect the typical economic environment rather than an inherent failure of monetary theory.
Recent literature has significantly broadened the conceptual and empirical scope of these debates. The dominant currency paradigm, articulated by Gopinath (2020) and further developed in Gopinath et al. (2021), shifts attention from bilateral exchange rates alone to the currency in which trade is invoiced and financial contracts are denominated. This work shows that when a dominant currency such as the U.S. dollar governs pricing and invoicing, exchange rate pass-through, inflation dynamics, and monetary policy transmission change fundamentally. In the context of hyperinflation and currency substitution, this perspective reframes dollarization as part of a wider pricing and contracting system, helping to explain why foreign currency use can persist even after stabilization.
Alongside this line of inquiry, policy-oriented research has increasingly turned its attention to financial dollarization and currency substitution using richer cross-country evidence. IMF work, particularly Vargas et al. (2023), paints a clear picture of how dollarization tends to persist in economies where policy credibility has been repeatedly undermined, even after inflation is brought under control. These studies also show that the path chosen for stabilization—whether anchored in the exchange rate or in domestic monetary policy—has lasting consequences for currency use and financial behavior. Related contributions, such as Fares (2023), stress that reliance on foreign currency often reflects deeply embedded institutional practices and long-term contracts rather than short-lived reactions to inflation spikes.
This interpretation resonates strongly with recent developments in the exchange rate forecasting literature. Rather than viewing predictability as universally weak, recent studies emphasize that forecasting performance varies with economic conditions. Periods marked by high volatility, structural breaks, or regime shifts tend to be precisely those in which fundamentals become more informative (Rossi, 2013; de Souza Vasconcelos & Júnior, 2023). Empirical contributions by Jackson (2024) and McCarthy (2025) reinforce this view, showing that fundamentals-based models are more likely to outperform the random walk when volatility is elevated and standard stability assumptions no longer hold. Hyperinflation represents an extreme version of this environment. Monetary forces dominate, volatility is pervasive, and regimes change rapidly, making hyperinflation a natural setting in which the conditional predictability highlighted in the recent literature can be examined in its clearest form. At the same time, methodological innovations such as dynamic model averaging and time-varying parameter frameworks reveal that allowing relationships to evolve over time is crucial for capturing real-world exchange rate dynamics.
Seen in this light, modern research invites a rethinking of both the classical hyperinflation literature and the long-standing skepticism surrounding exchange rate models. Hyperinflation episodes—especially the German experience of the 1920s—are increasingly understood not merely as historical curiosities, but as moments when the links between money, prices, expectations, and exchange rates become unusually visible (Engsted, 1996; Rapach & Wohar, 2002). Earlier studies understandably focused on establishing theoretical consistency and long-run relationships through integration and cointegration analysis. More recent work, however, places greater emphasis on forecasting performance, regime dependence, and the role of structural instability (Rossi, 2013; Jackson, 2024). At the same time, the dominant currency paradigm articulated by Gopinath (2020) and extended by Gopinath et al. (2021) highlights how pricing conventions and financial structures shape exchange rate behavior during crises. Complementary evidence on currency substitution and financial dollarization (Vargas et al., 2023; Fares, 2023) further shows that institutional memory and contracting practices can prolong foreign-currency reliance well beyond the inflationary episode itself. Together, these strands of literature motivate a renewed examination of whether monetary models—rooted in the quantity theory of money, purchasing power parity, and rational expectations—can outperform naïve benchmarks in hyperinflationary environments, thereby shedding new light on one of the most enduring puzzles in international finance.
3. Method
This section focuses on two related challenges that sit at the core of the exchange rate forecasting debate, one theoretical and the other methodological. The theoretical issue concerns the well-known difficulty of consistently outperforming the random walk when forecast accuracy is judged by error magnitude. The methodological issue relates to how fragile tests of the Meese–Rogoff puzzle can become when forecasts are generated from dynamic models. We begin with the theoretical question and argue that the inability to beat the random walk should not be viewed as an anomaly. On the contrary, under realistic conditions, it is often the outcome one should expect.
To build intuition, we begin with a simple simulation exercise that focuses on one central feature of the forecasting problem: volatility. We generate six artificial time series that differ only in their degree of volatility, as illustrated in Figure 1. Each series contains 51 observations and starts from the same base value of 100, with an identical vertical scale across panels. This intentionally stylized setup abstracts from economic interpretation and model structure, allowing us to isolate how volatility alone shapes forecasting performance.
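The text does not specify the data-generating process behind Figure 1, so the following is a minimal sketch of one way to produce six such series, assuming cumulative Gaussian shocks whose standard deviation doubles from one panel to the next; the seed and sigma values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)            # fixed seed for reproducibility

T = 51                                     # observations per series, as in the text
BASE = 100.0                               # common starting value
sigmas = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]   # illustrative volatility levels, A through F

series = {}
for label, sigma in zip("ABCDEF", sigmas):
    shocks = rng.normal(loc=0.0, scale=sigma, size=T - 1)
    series[label] = np.concatenate(([BASE], BASE + np.cumsum(shocks)))

for label, x in series.items():
    print(f"series {label}: std of period-to-period changes = {np.diff(x).std():.2f}")
```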
The results convey a message that is intuitive once stated clearly. As volatility increases, the underlying process becomes harder to predict, regardless of how well specified a model may be. Forecast errors grow not because the model suddenly becomes misspecified, but because the data themselves become more erratic. In this sense, forecasting accuracy deteriorates naturally as volatility rises. For a given model, higher volatility translates mechanically into a higher root mean square error (RMSE). The same logic applies to the random walk benchmark, whose forecast error simply reflects the magnitude of changes in the series.
What ultimately matters, however, is not whether forecast errors increase—they inevitably do—but how fast they increase relative to one another. As volatility intensifies, if a model's RMSE rises more rapidly than that of the random walk, the model will fail to outperform the benchmark, even if it captures meaningful economic structure. Under sufficiently volatile conditions, this outcome becomes not just possible, but likely. In such environments, the random walk appears unusually robust, not because it embodies superior economic insight but because it is mechanically well suited to absorbing large, unpredictable movements.
Seen from this perspective, the Meese–Rogoff puzzle becomes less mysterious. The dominance of the random walk does not necessarily signal the irrelevance of economic fundamentals. Instead, it reflects the harsh reality that volatility places severe limits on what forecasting models can achieve. In highly volatile settings—such as periods of crisis or hyperinflation—failing to beat the random walk should therefore be understood as the rule rather than the exception.
Figure 1 illustrates this idea by showing six simulated time series, labeled A through F, that differ only in how volatile they are. Series A moves smoothly and predictably, making it, at least in theory, the easiest case for any forecasting model. At the other extreme, series F is highly erratic, with large swings from one period to the next, and is therefore the hardest to forecast. Arranging the series this way allows the reader to immediately see how increasing volatility alone can change the forecasting environment.
For each simulated series, forecasts are generated using polynomial functions, beginning with low-order specifications. Forecast accuracy is then evaluated using RMSE for both the model-based forecasts and the random walk benchmark, with Theil’s U statistic providing a convenient summary measure. The goal here is not to propose a particular functional form, but to explore how much structure a model must impose in order to compete with the random walk as volatility increases.
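The text leaves the exact forecasting scheme implicit, so the following is a minimal sketch of one natural reading: recursive one-step-ahead polynomial-trend forecasts evaluated against the no-change forecast of the random walk, with Theil's U computed as the ratio of the two RMSEs. The window length k, the seed, and the volatility scale are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
y = 100 + np.cumsum(rng.normal(scale=16.0, size=51))   # a volatile series in the spirit of F

def rmse(actual, forecast):
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

def theil_u(y, order, k=20):
    """Ratio of the model's RMSE to the random walk's RMSE (U > 1: model loses)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    model_fc, rw_fc, actual = [], [], []
    for j in range(k, len(y)):
        coeffs = np.polyfit(t[:j], y[:j], deg=order)   # trend estimated on data up to j-1
        model_fc.append(np.polyval(coeffs, t[j]))      # model forecast for period j
        rw_fc.append(y[j - 1])                         # random walk: no-change forecast
        actual.append(y[j])
    return rmse(actual, model_fc) / rmse(actual, rw_fc)

print(f"U for a quadratic trend: {theil_u(y, order=2):.3f}")
```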
The results, reported in Table 1, tell a clear and consistent story. In every case, Theil's U exceeds one, meaning that the model fails to outperform the random walk for all six series. As volatility increases, the forecast errors of both approaches naturally become larger, which is exactly what one would expect. What stands out, however, is how differently the two respond to rising volatility. As shown in Figure 2, where the horizontal axis measures volatility (standard deviation) and the vertical axis reports forecast accuracy in terms of RMSE, the RMSE of the random walk increases steadily, while the model's RMSE not only remains higher throughout but also grows more quickly as volatility rises. The gap between the two widens as we move from the calm behavior of series A to the turbulence of series F. This pattern highlights an important intuition: in increasingly volatile environments, structured models are penalized more heavily than simple benchmarks, making it remarkably difficult for them to gain an edge over the random walk.
Outperforming the random walk in terms of RMSE is achievable under certain conditions, but this task requires a model with an extremely good fit. Using the most volatile series, F, we have shown that a model represented by a polynomial of order 2 cannot outperform the random walk, producing U = 1.691. The next step is to fit polynomials of orders 1 (linear) to 8 and calculate the goodness of fit (measured by $R^2$), RMSE, and U for each polynomial. The results (Table 2) show that as the goodness of fit improves, forecasting accuracy relative to the random walk also improves, as U values decrease. When polynomials of orders 6, 7, or 8 are used, the model outperforms the random walk.
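Continuing the same sketch (reusing `y` and `theil_u` from above), the order sweep can be reproduced qualitatively; exact numbers such as U = 1.691 depend on the actual simulated data, but the pattern of rising $R^2$ and falling U is the point:

```python
import numpy as np

t = np.arange(len(y))
for order in range(1, 9):
    coeffs = np.polyfit(t, y, deg=order)
    fitted = np.polyval(coeffs, t)
    # In-sample goodness of fit; improves mechanically with the polynomial order.
    r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    print(f"order={order}  R^2={r2:.3f}  U={theil_u(y, order):.3f}")
```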
The results point to a clear and intuitive trade-off. In highly volatile environments, low-order specifications are simply not flexible enough to outperform the random walk, even when they capture broad movements in the data. Higher-order polynomials can eventually improve forecasting accuracy, but only by fitting the data almost perfectly. Any apparent success therefore comes at the cost of overfitting and offers little in terms of economic interpretation, underscoring the mechanical nature of such gains.
This insight naturally motivates the empirical strategy that follows. The simulation does not suggest that increasingly complex models are desirable. Instead, it highlights how difficult it is to outperform the random walk in volatile settings without relying on excessive flexibility. To avoid both mechanical failure under volatility and spurious improvements driven by overfitting, the empirical analysis therefore turns to simple, static bivariate models grounded in well-established monetary theory. In the context of hyperinflation—where monetary forces dominate and economic relationships simplify rather than become more complex—such parsimonious specifications provide a transparent and economically interpretable framework for assessing predictive content.
Empirical exchange rate models have a long and humbling track record. More often than not, they struggle to fit the data well, and this difficulty shows up clearly when their forecasts are judged against the random walk using RMSE. In many cases, the random walk simply remains out of reach. The results presented above naturally point to a deeper methodological question about how the Meese–Rogoff puzzle is being tested in the first place.
A common pattern in the literature is that models claimed to outperform the random walk typically rely on dynamic specifications. At first glance, this may seem like a technical improvement, but it raises an important comparability issue.
Meese and Rogoff (1983) originally evaluated forecasting performance using static models, intentionally stripping away dynamic feedback. Once dynamics are introduced, the nature of the comparison changes. The exercise is no longer about whether economic fundamentals can beat a random walk, but whether one dynamic process can edge out another.
This distinction matters because dynamics almost inevitably introduce persistence. Lagged dependent variables or adjustment terms allow forecasts to closely track past movements, which naturally improves apparent accuracy under RMSE criteria. When this happens, outperforming the random walk becomes less informative, as part of the benchmark’s behavior is already built into the model.
This point should not be read as a rejection of dynamic models. Dynamic specifications can be valuable for understanding economic mechanisms and adjustment processes. The concern here is narrower and methodological: under RMSE-based evaluation, forecasts from dynamic models are not strictly comparable to a pure random walk. Apparent forecasting gains may therefore reflect inherited persistence rather than deeper structural insight, making it essential to distinguish clearly between predictive performance and economic interpretation.
Seen this way, claims of having solved the Meese–Rogoff puzzle should be treated with care. Dynamic models can be useful and even necessary for many purposes, but they blur the line between structure and benchmark. Without a truly comparable setup, apparent improvements in forecasting performance may say more about model mechanics than about the underlying economic relationships they are meant to capture.
We can investigate this proposal by considering two dynamic models, a straight first-difference model and an error correction model. The benchmark used for evaluating forecasting accuracy may be the random walk without drift, $s_{t+1} = s_t + \varepsilon_{t+1}$, the random walk with drift, $s_{t+1} = \mu + s_t + \varepsilon_{t+1}$, or an estimated AR(1) process, $s_{t+1} = \mu_0 + \rho s_t + \varepsilon_{t+1}$. The first difference model is specified as

$$\Delta s_t = \alpha + \beta \Delta x_t + \varepsilon_t \tag{1}$$

Equation (1) can be manipulated to obtain

$$s_t - s_{t-1} = \alpha + \beta \Delta x_t + \varepsilon_t \tag{2}$$

or

$$s_t = s_{t-1} + \alpha + \beta \Delta x_t + \varepsilon_t \tag{3}$$

in which case the equation contains a random walk component.
Alternatively, consider the simple error correction model

$$\Delta s_t = \alpha + \beta \Delta x_t - \gamma (s_{t-1} - \theta x_{t-1}) + \varepsilon_t \tag{4}$$

Equation (4) can be rearranged to get

$$s_t = s_{t-1} + \alpha + \beta \Delta x_t - \gamma (s_{t-1} - \theta x_{t-1}) + \varepsilon_t \tag{5}$$

or, in another form:

$$s_t = \alpha + (1 - \gamma) s_{t-1} + \beta \Delta x_t + \gamma \theta x_{t-1} + \varepsilon_t \tag{6}$$
Once again, Equation (6) quietly carries a random walk at its core. This is not specific to that equation and does not reflect a modeling error. It is simply what happens when dynamics are introduced into a model. The moment lagged dependent variables or similar mechanisms are added, a random walk-type component naturally slips in. Persistence is no longer something the model uncovers from the data; it becomes something the model assumes, almost by construction.
Over time, this built-in persistence tends to take over. The random walk component begins to dominate the behavior of the model, shaping both its in-sample fit and its forecasts. As a result, forecasts generated from dynamic specifications often look impressively close to the actual series. This closeness can easily be mistaken for strong predictive power, but in reality, it reflects a defining feature of random walks: they follow the data closely by design.
Figure 3 makes this intuition concrete. It plots a simulated time series and the corresponding forecasts generated by a simple random walk with no drift, with the horizontal axis representing time and the vertical axis measuring the value of the series. The forecasts closely trail the realized values, creating the appearance that the model is accurately predicting future movements. But the direction of causality is revealing. The forecasts are not leading the series forward; they are being pulled along by what has already happened. The actual values are doing the leading, and the forecasts are merely following.
This same pattern emerges whenever forecasts are generated from a dynamic model. The more heavily the model leans on its own past values, the more it behaves like a random walk, regardless of how sophisticated the surrounding structure may appear. What looks like forecasting success is often just persistence in disguise. Recognizing this helps clarify why dynamic models so often seem to “beat” the random walk—and why such results should be interpreted with care rather than celebration.
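A small simulation makes the point directly. This is a sketch rather than the paper's own code, but it shows how closely a driftless random walk "forecast" tracks any persistent series by construction:

```python
import numpy as np

rng = np.random.default_rng(1)
y = 100 + np.cumsum(rng.normal(scale=5.0, size=100))  # simulated persistent series

# Driftless random walk forecasts: the prediction for t+1 is simply y_t.
forecast = y[:-1]
actual = y[1:]

# The forecast track looks impressive, but it is just the actual series shifted
# back one period: the data lead and the "forecast" follows by construction.
print(f"correlation(actual, forecast) = {np.corrcoef(actual, forecast)[0, 1]:.3f}")
print(f"RMSE = {np.sqrt(np.mean((actual - forecast) ** 2)):.3f}")
```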
The monetary model of exchange rates is built on a simple idea: a bilateral exchange rate reflects differences in economic activity, money supply, and interest rates across countries. When growth, monetary expansion, or returns on assets diverge, exchange rates adjust in response.
Meese and Rogoff (1983) formalize this intuition using the flexible-price version of the monetary model, which assumes rapid price adjustment and allows fundamentals to influence exchange rates directly:

$$s = \beta_0 + \beta_1 (m - m^*) + \beta_2 (y - y^*) + \beta_3 (i - i^*) + u \tag{7}$$

where $y$ is the log of industrial production, $m$ is the log of the money supply, $s$ is the log of the exchange rate, $i$ is the interest rate, and an asterisk denotes the corresponding foreign variable.
In periods of hyperinflation, the usual economic signals that help explain exchange rate movements largely lose their relevance. Differences in growth rates or interest rates across countries matter very little when rapid and uncontrolled money creation dominates everything else. In such circumstances, the “relative” money supply effectively becomes the money supply of the hyperinflating country itself, because developments abroad are simply too small to offset what is happening at home. The monetary model, which is usually fairly rich, collapses into something much simpler—a direct link between money growth and the exchange rate.
Frenkel (1976) makes this point vividly in his study of the German hyperinflation. He notes that domestic monetary forces were so powerful that they overwhelmed influences coming from the rest of the world, allowing the exchange rate to be analyzed almost in isolation from external factors. Building on this insight, Frenkel strips the model down to the elements that truly matter under hyperinflationary conditions. His approach centers on just two fundamental relationships, purchasing power parity and the demand for real money balances, which together capture the core mechanics driving exchange rate behavior in such extreme environments:

$$s_t = p_t - p_t^* \tag{8}$$

$$m_t - p_t = \phi y_t - \alpha E_t (p_{t+1} - p_t) \tag{9}$$

where $E_t$ is the conditional expectations operator. On the assumption that domestic monetary factors completely dominate domestic real income and foreign variables under hyperinflation, Frenkel (1976) advocated focusing on prices, the money supply, and the exchange rate, while ignoring other variables.
Following Frenkel (1976), Engsted (1996) derives an equation for exchange rate determination under hyperinflation by substituting Equation (9) into (8) and omitting $p_t^*$ and $\phi y_t$:

$$s_t = (1 - \lambda) m_t + \lambda E_t s_{t+1} \tag{10}$$

where $\lambda = \alpha / (1 + \alpha)$. Solving this first-order expectational difference equation recursively produces the following present value model:

$$s_t = (1 - \lambda) \sum_{j=0}^{\infty} \lambda^j E_t m_{t+j} \tag{11}$$
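The step from Equation (10) to Equation (11) follows from forward iteration. As a compact illustration, repeatedly substituting for the expected future exchange rate and applying the law of iterated expectations gives

$$
\begin{aligned}
s_t &= (1-\lambda)m_t + \lambda E_t s_{t+1} \\
    &= (1-\lambda)m_t + \lambda E_t\bigl[(1-\lambda)m_{t+1} + \lambda E_{t+1} s_{t+2}\bigr] \\
    &= (1-\lambda)\sum_{j=0}^{J}\lambda^{j} E_t m_{t+j} + \lambda^{J+1} E_t s_{t+J+1},
\end{aligned}
$$

so that letting $J \to \infty$ and imposing the no-bubble condition $\lim_{J\to\infty}\lambda^{J+1}E_t s_{t+J+1} = 0$ (natural here, since $0 < \lambda < 1$) yields the present value model in Equation (11).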
Frenkel (1976) adopts a very practical perspective in his analysis. Instead of building the model around a present-value framework, he focuses on the core relationships summarized in Equations (8) and (9), keeping the analysis closely tied to observable market behavior. A key step in his approach is to use the forward premium as a stand-in for expected inflation, based on the idea that traders' expectations are already reflected in forward exchange rates.
To support this assumption, Frenkel examines the relationship between the spot exchange rate and the forward rate by regressing the former on the latter. The results point to a well-functioning foreign exchange market, where forward rates act as unbiased predictors of future spot rates. This finding lends empirical support to the Unbiased Expectations Hypothesis (UEH). On this basis, Frenkel argues that when markets are efficient, expectations need not be imposed by theory—they can be read directly from market prices.
Engsted (1996) manipulates Equation (10) to produce the following equation:

$$s_t - m_t = \sum_{j=1}^{\infty} \lambda^j E_t \Delta m_{t+j} \tag{12}$$
While the equations used by Frenkel and Engsted are elegant, they cannot be used to test the Meese–Rogoff puzzle because they are dynamic, giving rise to the problems described earlier—comparability and the rise of the random walk component.
The simulation results and the preceding discussion point to a simple but important implication. In highly volatile environments, outperforming the random walk in terms of RMSE is mechanically difficult, and introducing dynamics into a model often means building a random walk into the specification itself. If the aim is to assess whether economic fundamentals have genuine explanatory and predictive content, this calls for a modeling strategy that keeps structure transparent and avoids persistence by construction.
Hyperinflation provides exactly such a setting. When inflation becomes extreme, monetary expansion overwhelms most other economic forces. Foreign variables, real activity, and gradual adjustment mechanisms fade into the background, while the link between money, prices, and the exchange rate becomes unusually direct. In this environment, the monetary approach does not need to be simplified by assumption; it simplifies itself as a consequence of the underlying economic conditions.
For this reason, the models used in this paper take the form of simple log-linear relationships. The starting point is a standard demand-for-money function,

$$M^d = kPY \tag{13}$$

where $M^d$ is the quantity of money demanded, $P$ is the price level, $Y$ is real income, and $k$ is a positive constant. Assuming an exogenous money supply and money market equilibrium yields

$$P = \frac{M}{kY} \tag{14}$$

where $M$ is the money supply. If purchasing power parity holds, substituting Equation (14) into the PPP condition $S = P/P^*$ produces

$$S = \frac{M}{kYP^*} \tag{15}$$

Using logarithmic transformations and expressing the relationships in testable form—while abstracting from factors that are dominated under hyperinflation—leads to the following specifications:

$$p_t = \alpha_0 + \alpha_1 m_t + \varepsilon_t \tag{16}$$

$$s_t = \beta_0 + \beta_1 p_t + \varepsilon_t \tag{17}$$

$$s_t = \gamma_0 + \gamma_1 m_t + \varepsilon_t \tag{18}$$

Lowercase letters denote the natural logarithms of the factors represented by their corresponding uppercase letters. Equation (16) represents the quantity theory of money, Equation (17) represents purchasing power parity (PPP), and Equation (18) represents the monetary model. In the quantity theory equation, there is no foreign variable. In the PPP and monetary model equations, the foreign variables $p^*$ and $m^*$ are omitted because they are dwarfed by the dominant influence of domestic variables under hyperinflationary conditions. Equations (16)–(18), which produce the forecasts, reflect the notion that hyperinflation creates conditions approximating a controlled experiment. As discussed later, these specifications are confirmed both by theoretical interpretation and stylized facts.
4. Results
The empirical evidence comes from weekly data originally reported by Graham (1930), covering the period from December 1921 to November 1923, when Germany was experiencing the most dramatic phase of its hyperinflation. Money supply is measured as the total volume of paper currency in circulation, expressed in millions of marks—a figure that grew at an extraordinary pace during this time. Wholesale prices and the mark–dollar exchange rate are both recorded as index numbers, normalized to equal one in 1913, which makes it easier to see how far the economy had moved away from its pre-inflation benchmark.
Given the extreme monetary conditions, the data should be interpreted with appropriate caution. Hyperinflation inevitably introduces measurement challenges, including reporting delays, rounding, and potential inconsistencies across sources. Nevertheless, Graham’s weekly series remain among the most detailed and systematically constructed records available for this episode and have been widely used in the literature. The normalization to a 1913 base year serves a purely scaling role: it anchors all variables to a common pre-inflation reference point, facilitates log-linear estimation, and does not affect relative movements, forecast evaluation, or RMSE-based comparisons.
Figure 4 plots the logarithms of these three series over the sample period, with each point on the horizontal axis corresponding to a weekly observation. The picture is striking: money supply, prices, and the exchange rate rise almost in lockstep. This close co-movement is not a statistical illusion. In a hyperinflationary environment, rapid monetary expansion feeds directly into currency depreciation through purchasing power parity, while prices rise proportionally in line with the quantity theory of money. Under such extreme conditions, these variables are tightly bound together by economic fundamentals, which is why their correlation is both strong and entirely expected.
Because inflation accelerated sharply in 1923, the analysis divides the sample into two periods: period 1 covers December 1921 through November 1922, and period 2 covers December 1922 through November 1923.
Figure 5 shows scatter plots representing the monetary model of the exchange rate (s on m), the quantity theory of money (p on m), and PPP (s on p). As shown, the variables are strongly related through positive log-linear relationships. This kind of pattern is not observable under normal conditions.
Given this prima facie evidence for simple log-linear relationships, the forecasts are generated from the models recursively. The equations are estimated over part of the sample period, $t = 1, \dots, k$, then a one-period-ahead forecast is generated for the point in time $k+1$. Using a general equation of the form $z_t = a + b x_t + \varepsilon_t$, the point forecast for $k+1$ is calculated as

$$\hat{z}_{k+1} = \hat{a} + \hat{b} x_{k+1}$$

where $\hat{a}$ and $\hat{b}$ are the estimated values of $a$ and $b$, respectively. The process is repeated by estimating the model over the period $t = 1, \dots, k+1$ to generate a forecast for point in time $k+2$, $\hat{z}_{k+2}$, and so on, until we reach $\hat{z}_n$, where $n$ is the total sample size. This process of generating point forecasts involves recursive regression, as recommended by Marcellino (2002) and Marcellino et al. (2003), who make it explicit that their forecasts are generated using a "fully recursive methodology". The preference for recursive over rolling estimation may be justified in terms of forecasting efficiency, which refers to the property that a forecast contains all information available at the time of the forecast (Nordhaus, 1987). Information is lost in rolling estimation because some observations are excluded from the sample to obtain a constant estimation window.
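As a sketch, the recursive scheme can be written in a few lines. The function below assumes the generic bivariate form $z_t = a + b x_t$ from the text and, in line with the Meese–Rogoff convention, conditions each forecast on the realized value of the regressor; the variable names and the initial window k are placeholders.

```python
import numpy as np

def recursive_forecasts(z, x, k):
    """One-step-ahead forecasts from z_t = a + b*x_t by recursive OLS.

    Estimate on observations 1..k and forecast k+1; re-estimate on 1..k+1
    and forecast k+2; continue until the end of the sample.
    """
    z, x = np.asarray(z, float), np.asarray(x, float)
    forecasts = []
    for j in range(k, len(z)):
        X = np.column_stack([np.ones(j), x[:j]])            # constant and regressor
        a_hat, b_hat = np.linalg.lstsq(X, z[:j], rcond=None)[0]
        forecasts.append(a_hat + b_hat * x[j])              # forecast for the next period
    return np.array(forecasts), z[k:]

# e.g., for the monetary model in Equation (18): fc, actual = recursive_forecasts(s, m, k)
```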
The forecasting results are presented in Table 3 for the quantity theory, PPP, and the monetary model over the two periods 1 and 2, and the combined period (1 + 2). In period 1, when hyperinflation was less severe, none of the three equations outperforms the random walk, as U > 1 in all cases. The null hypothesis U = 1 is rejected for the monetary model, implying that the model is worse, in terms of producing a higher RMSE, than the random walk. For the quantity theory and PPP models, the null hypothesis is not rejected, indicating that they perform comparably to the random walk. Notably, the random walk fails to surpass these models, a finding that is typically unattainable under normal economic conditions. These results align with those based on the AGS test of Ashley et al. (1980), which produces an F statistic for the null of equality between two mean square errors. Indeed, the Meese–Rogoff puzzle reveals that models struggle not only to outperform the random walk but even to match its performance.
Forecast accuracy is evaluated primarily using RMSE to remain closely aligned with the original Meese–Rogoff framework. Other accuracy measures or alternative benchmarks, such as a random walk with drift, could certainly shed additional light on the results. However, the weekly nature of hyperinflation data leaves only a limited number of usable observations, which naturally constrains the scope for extensive robustness checks. For this reason, the results are best read as indicative of how forecasting performance changes across regimes, rather than as definitive evidence of broad and unconditional superiority.
In period 2, the models are expected to perform better than in period 1 because of the accelerating inflation rate. The findings confirm this: U < 1 in all cases, and the null hypothesis U = 1 is rejected for all three equations. Furthermore, the AGS test shows that the RMSEs are significantly different, with the model outperforming the random walk in all cases. For the combined period (1 + 2), both the purchasing power parity and quantity theory models surpass the random walk's performance; the null hypothesis is rejected for these models, and the AGS test results are consistent with this finding. Thus, even simple bivariate models can surpass the random walk in out-of-sample prediction, suggesting that the random walk dominance underlying the Meese–Rogoff puzzle may be context-dependent and can be overturned in hyperinflationary regimes.