Empirical Laws and Foreseeing the Future of Technological Progress

António M. Lopes 1,*,†, José A. Tenreiro Machado 2,† and Alexandra M. Galhano 2,† 1 UISPA–LAETA/INEGI, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, Porto 4200-465, Portugal 2 Institute of Engineering, Polytechnic of Porto, Department of Electrical Engineering, Rua Dr. António Bernardino de Almeida, 431, Porto 4249-015, Portugal; jtenreiromachado@gmail.com (J.A.T.M.); amf@isep.ipp.pt (A.M.G.) * Correspondence: aml@fe.up.pt; Tel.: +351-22-041-3486 † These authors contributed equally to this work.


Introduction
In 1965, the physical chemist Gordon E. Moore, co-founder of both Intel and Fairchild Semiconductor, wrote an article for the 35th anniversary issue of Electronics magazine about the evolution of the semiconductor industry [1]. Moore noted that the complexity of minimum cost semiconductor components had doubled every year since 1959, the date of production of the first chip [2]. This exponential increase in the number of components on an integrated circuit later became known as Moore's law (ML), predicting that the number of components that could be placed on a chip would double every year, and that such a trend would continue for the foreseeable future [3][4][5].
Moore revised his first prediction in 1980 [6], stating that the exponential increase would approximately double every two years, rather than every year [6]. Since 1980, ML has undergone successive revisions and reinterpretations that, in fact, ensured its survival. At the beginning of the 1980s, ML meant "the number of transistors on a chip would double every 18 months"; in 1990 it became interpreted as "doubling of microprocessor power every 18 months"; and during the 1990s it was reformulated to state that "computing power at fixed cost would double every 18 months" [2], to cite just a few formulations. This apparent weakness of the empirical ML became its main strength, and has made the law seem accurate until now [3].
The ML disseminated beyond semiconductor and computer technologies [3,7]. In a broader sense, the ML refers to the perceived increase in the rate of technological development throughout history, suggesting faster and more profound changes in the future, possibly accompanied by underlying economic, cultural and social changes [8][9][10][11][12][13].
Given the apparent ubiquity of the ML (here interpreted in the broad sense of "exponential growth") [13][14][15], simple questions can be raised: Does ML describe technological development accurately? Can it be used for reliable forecasting?
The so-called exponential growth should be understood as an approximate empirical model for real data. For more than two decades, several authors foresaw the end of ML, arguing that technological limits were close [5,16,17]. Others defended that ML would survive for many years, envisaging the emergence of a new paradigm that could dramatically enlarge the existing technological bounds. Within such a paradigm, novel technologies would become available, such as quantum devices [18][19][20], biological [21], molecular [22], or heterotic computing [23]. Those technologies would thereafter keep ML alive. Whatever one's opinion, forward-looking or conservative, we should note that forecasting technological evolution many years ahead is difficult. Technological innovation means, by definition, something that is new and, therefore, may be inherently unpredictable [24]. However, even a rough knowledge of technological evolution could be invaluable for helping decision makers delineate adequate policies, seeking sustainability and the improvement of individual and collective living [25,26].
In this paper, we seek to contribute to the discussion of some of the questions raised above. We illustrate our scheme with real data representative of technological progress over time.
In that perspective, we adopt 4 performance indices: (i) the world inflation-adjusted gross domestic product (GDP), measured in 2010 billions of U.S. dollars; (ii) the performance of the most powerful supercomputers (PPS), expressed in tera FLOPS (floating-point operations per second); (iii) the number of transistors per microprocessor (TPM); and (iv) the number of U.S. patents granted (USP). Obviously, other data-series may be candidates for assessing technological evolution. Data-series from economics or finance can be thought of as possible candidates, since there is some relationship between them and scientific and technological progress. However, country economies evolve very slowly [27], while financial series are extremely volatile [28]. Since they reflect a plethora of phenomena not directly related to our main objective, we do not consider them here.
We start with the usual algebraic, or "static", perspective. In a first step, we adopt nonlinear least-squares to determine different candidate models for the real data. In a second step, we interpret the data-series as random variables. We adopt a sliding window to slice the data into overlapping time intervals and evaluate the corresponding entropy. We then develop a "dynamical" perspective and analyze the data by means of the pseudo-state space (PSS) technique. We conjecture about the usefulness of the entropy information and the PSS paths as complementary criteria for assessing the ability of the approximated models from the perspective of forecasting.
In this line of thought, the paper is organized as follows. In Section 2 we analyze the data, and in Section 3 we discuss the results and draw the main conclusions.

Data Analysis and Results
The 4 performance indices studied here are commonly referred to as exhibiting exponential growth. The corresponding data-series have different time-lengths, with a time resolution of 1 year. In some series we may have several samples for the same year, or a variable number of years between consecutive data points. For these cases, we calculate the annual mean value, or interpolate linearly between adjacent points. The data-series corresponding to the indices {GDP, PPS, TPM, USP} will be denoted by y_i (i = 1, ..., 4). The information is available from distinct sources. The World Bank database [29] was used for retrieving the GDP data. The PPS file was obtained from the TOP500 website [30]. The TPM data-series is available at Wikipedia [31]. The USP information was gathered from the U.S. Patent and Trademark Office [32]. Table 1 summarizes the main features of the original data. We study the data-series to determine whether there is any signature of determinism underlying the data. The existence (or absence) of determinism points towards the adoption of deterministic models (or statistical approaches) for representing and analyzing the data [33]. For testing determinism, we use the autocorrelation function (ACF), due to its simplicity, but other methods could be chosen [34,35].
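The determinism test just described can be sketched in a few lines; the series below is a synthetic exponential stand-in for the real indices (the actual data come from the sources cited above), and the sample ACF implementation is a minimal version of the standard estimator.

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(series, dtype=float) - np.mean(series)
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var
                     for k in range(max_lag + 1)])

# Synthetic stand-in for a yearly index with exponential growth
years = np.arange(1970, 2016)
y = 100.0 * np.exp(0.05 * (years - years[0]))

# Log-transform first, as done for the real series, then correlate;
# the maximum lag is ~50% of the time span, as in the correlograms of Figure 1
rho = acf(np.log(y), max_lag=len(y) // 2)

# A deterministic trend yields an ACF that decays gradually from 1
print(rho[:3])
```

A series dominated by noise would instead show an ACF that collapses to near zero after the first few lags.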
We apply a logarithmic transformation ỹ_i = ln(y_i) to the data-series [36] and then calculate the ACF of ỹ_i. Figure 1 depicts the correlograms for {GDP, PPS, TPM, USP}. For each case the maximum lag corresponds to approximately 50% of the total time span, expressed in years. An identical pattern is observed for all data-series, where the ACF gradually drops toward zero as the time lag increases. Such a pattern is compatible with the existence of determinism. We now test the data not only against the standard exponential growth (Exp) model, but also against 5 additional hypotheses, namely the logistic (Log), Morgan-Mercer-Flodin (MMF), rational (Rat), Richards (Ric) and Weibull (Wei) models, given by:

ŷ = a·e^(b·t)  (Exp) (1)
ŷ = a / (1 + b·e^(−c·t))  (Log) (2)
ŷ = (a·b + c·t^d) / (b + t^d)  (MMF) (3)
ŷ = (a + b·t) / (1 + c·t + d·t²)  (Rat) (4)
ŷ = a / [1 + e^(b − c·t)]^(1/d)  (Ric) (5)
ŷ = a − b·e^(−c·t^d)  (Wei) (6)

where t represents time and {a, b, c, d} ∈ R are parameters. Exponential models describe well many natural and artificial phenomena for limited periods of time [37][38][39]. However, since it is not possible for any physical variable to grow indefinitely, we often consider that any growth process has an upper limit, or saturation, level [40,41]. The sigmoidal models exhibit that kind of behavior and are considered here (i.e., models Log, MMF, Ric, Wei). The Rat equation is tested for its simplicity, yet good adaptability to different shapes. We could adopt other fitting functions, eventually with more parameters, that adjust better to some particular series. However, as we shall discuss in the sequel, the main idea is to test some possible "candidate laws" for predicting future events. Therefore, only simple analytical expressions, requiring a limited set of parameters, are considered.
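The nonlinear least-squares fitting step can be sketched with SciPy's `curve_fit`. The series below are synthetic stand-ins, only the Exp and Log models are shown, and all parameter values are illustrative, not the paper's estimates.

```python
import numpy as np
from scipy.optimize import curve_fit

# Two of the six candidate trendlines, in their standard forms
def exp_model(t, a, b):
    return a * np.exp(b * t)

def logistic_model(t, a, b, c):
    return a / (1.0 + b * np.exp(-c * t))

t = np.arange(40.0)                                  # 40 "years" of data
y_exp = 5.0 * np.exp(0.08 * t)                       # exponential-looking series
y_logi = 1000.0 / (1.0 + 50.0 * np.exp(-0.2 * t))    # saturating (sigmoidal) series

# Nonlinear least-squares estimates of the parameters
(a, b), _ = curve_fit(exp_model, t, y_exp, p0=[1.0, 0.05])
(A, B, C), _ = curve_fit(logistic_model, t, y_logi, p0=[800.0, 10.0, 0.1])
print(a, b)   # recovered growth parameters
```

The same call pattern extends to the MMF, Rat, Ric and Wei models; only the model function and the initial guess `p0` change.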
In Subsections 2.1 and 2.2 we adopt nonlinear least-squares and entropy, respectively, as complementary tools for data analysis.
Figure 2 depicts the original values, ỹ_i, and their estimates, ŷ_i, obtained from Models (1)-(6). The plots are extended over a 10-year period into the future, that is, until the year 2025. We observe a good fit of all models to each available data-series. Moreover, when the models are used for extrapolating beyond the year 2015, it is difficult to argue in favor of a given one to the detriment of the others. Table 2 summarizes the results in terms of the normalized root-mean-square deviation (NRMSD) and the coefficient of determination (R²) for ỹ_i versus ŷ_i, within the time span of the original data. Figure 3 depicts the residuals (ỹ_i − ŷ_i)². As can be seen, these are reasonably symmetrical, tending to cluster towards the middle of the plot, and do not reveal clear patterns. For all fitting functions, the GDP and the USP are the series that lead to the best and worst fits, respectively. On the other hand, for all series, the Rat is the function leading to the best fit, closely followed by the MMF.
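The two goodness-of-fit measures can be computed as below. Note that NRMSD admits several normalizations; normalizing by the range of the observed values is an assumption here.

```python
import numpy as np

def nrmsd(y, y_hat):
    """RMS deviation normalized by the range of the observed values."""
    rmsd = np.sqrt(np.mean((y - y_hat) ** 2))
    return rmsd / (np.max(y) - np.min(y))

def r_squared(y, y_hat):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy observed vs fitted values
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_hat = np.array([1.1, 1.9, 3.0, 4.2, 4.9])
print(nrmsd(y, y_hat), r_squared(y, y_hat))
```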

Entropy Analysis
We consider here that each data-series {GDP, PPS, TPM, USP} can be modeled as a random variable, Y = g(X), where g(•) represents a function given by Equations (1)-(6) and X is a random variable with probability density function (pdf) f_x(x). This means that Y results from the transformation of the variable X by means of g(•). The pdf of Y, f_y(y), is determined by g(•) and f_x(x) by means of the so-called transformation technique, as follows [44].
Suppose that X is a continuous random variable with pdf f_x(•), and that X = {x : f_x(x) > 0} denotes the set of all possible values of X. Assuming that (i) the function y = g(x) defines a one-to-one transformation of X onto its domain D, and that (ii) the derivative of x = g⁻¹(y) is continuous and nonzero for y ∈ D, where g⁻¹(y) is the inverse function of g(x) (i.e., g⁻¹(y) is that x for which g(x) = y), then Y = g(X) is a continuous random variable with pdf:

f_y(y) = f_x(g⁻¹(y)) · |d g⁻¹(y)/dy|, y ∈ D.

Once f_y(y) is determined, we can then calculate the corresponding entropy. This scheme avoids problems that occur if we estimate entropy directly from the original data-series, since the reduced number of points available yields inaccurate histograms for estimating the pdf of the indices {GDP, PPS, TPM, USP}.
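As a quick check of the transformation technique, take X uniform on [0, 1] and g(x) = e^(bx) (a stand-in for the Exp model). Then g⁻¹(y) = ln(y)/b, so f_y(y) = 1/(b·y) on D = [1, e^b], which the sketch below verifies numerically; the constants are illustrative.

```python
import numpy as np

b = 0.5
g = lambda x: np.exp(b * x)      # one-to-one, increasing on [0, 1]
f_y = lambda y: 1.0 / (b * y)    # f_x(g_inv(y)) * |d g_inv / dy|, with f_x = 1

# Numerical integration of f_y over D = [g(0), g(1)] should give 1
ys = np.linspace(g(0.0), g(1.0), 100001)
vals = f_y(ys)
area = np.sum((vals[:-1] + vals[1:]) * np.diff(ys)) / 2.0   # trapezoid rule

# Cross-check the mean of Y against Monte Carlo sampling of X
rng = np.random.default_rng(1)
mc_mean = np.mean(g(rng.uniform(0.0, 1.0, 200000)))
analytic_mean = (np.exp(b) - 1.0) / b   # integral of y * f_y(y) over D
print(area, mc_mean, analytic_mean)
```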
We generate a total of 10,000 points for each data-series by means of Models (1)-(6). We then adopt a sliding time window in order to slice {GDP, PPS, TPM, USP} into partially overlapping time intervals. For each window we obtain f_y(y) by means of histograms of relative frequencies with bin size L = 100 and a uniform pdf f_x(x), where x stands for time. We then calculate the resulting Shannon entropy, S, defined by [45][46][47][48]:

S = −∑_k p_k ln p_k,

where p_k denotes the relative frequency of histogram bin k. We adopt a window length W = 5 years with 40% overlap (i.e., 2 years), making a total of {18, 10, 18, 26} windows for the data-series {GDP, PPS, TPM, USP}, respectively. Alternative values for W were tested, but that value revealed a good compromise between the instantaneous and average behavior of the entropy evolution. Figure 4 depicts the results obtained for the Exp, Log, MMF, Rat, Ric and Wei models. In general, we verify a smooth variation of S(t) in most cases, with the exception of the Rat model for some series, such as GDP and TPM.
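A sliding-window entropy computation of this kind can be sketched as follows, using a synthetic Exp trendline; the window parameters match the ones described above (W = 5 years with a 2-year overlap, i.e., a 3-year step), while the time span and growth rate are illustrative.

```python
import numpy as np

def shannon_entropy(samples, bins=100):
    """Shannon entropy (in nats) from a histogram of relative frequencies."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))

# Densely sampled Exp trendline; time plays the role of the uniform input X
t = np.linspace(2000.0, 2015.0, 10000)
y = np.exp(0.1 * (t - 2000.0))

W, step = 5.0, 3.0       # 5-year window, 40% (2-year) overlap
entropies = []
start = t[0]
while start + W <= t[-1] + 1e-9:
    window = y[(t >= start) & (t < start + W)]
    entropies.append(shannon_entropy(window))
    start += step
print(len(entropies), entropies)
```

Plotting `entropies` against the window start times gives the S(t) curves of the kind shown in Figure 4.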

Pseudo-State Space
The PSS is a technique adopted in the context of non-linear dynamics, especially useful when there is a lack of knowledge about the system [49,50]. Generally speaking, the PSS allows a dynamical system to be represented in a higher-dimensional space by taking a small sample of signals representing measurements of the system's time history. The PSS is justified by Takens' embedding theorem [51], which states that, if a time series is one component of an attractor that can be represented by a smooth d-dimensional manifold, then the topological properties of the signal are equivalent to those of the embedding formed by the n-dimensional state-space vectors:

u(t) = [s(t), s(t − τ), ..., s(t − (n − 1)τ)],

where n > 2d + 1, {d, n} ∈ N and τ ∈ R⁺. Parameters τ and n denote the time delay and the embedding dimension, respectively. The vector u(t) is commonly plotted in an n-dimensional graph, forming a trajectory. Usually we choose n = 3, or n = 2, in order to facilitate the interpretation of the graphs. The PSS produced by u(t) is expected to allow conclusions about the system dynamics [50].
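The construction of the delay vectors u(t) can be sketched as below; the toy signal is only there to make the indexing visible.

```python
import numpy as np

def embed(s, n, tau):
    """Delay-embedding: rows are u(t) = [s(t), s(t - tau), ..., s(t - (n-1)*tau)]."""
    s = np.asarray(s)
    m = len(s) - (n - 1) * tau          # number of valid time instants
    return np.column_stack([s[(n - 1 - j) * tau : (n - 1 - j) * tau + m]
                            for j in range(n)])

s = np.arange(10.0)          # toy signal s(t) = t
u = embed(s, n=3, tau=2)
print(u.shape)               # 6 delay vectors in 3 dimensions
print(u[0])                  # first valid vector: [s(4), s(2), s(0)]
```

With n = 3, plotting the three columns of `u` against each other gives the PSS trajectories of the kind shown in Figure 5.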
A key issue in the PSS technique is the choice of τ. Intuitively, choosing τ too small will result in time series s(t) and s(t − τ) close to each other, not providing two independent coordinates. On the other hand, choosing τ too large will lead to series s(t) and s(t − τ) almost independent of each other, providing totally unrelated directions. Most criteria for choosing τ are based on the behavior of the autocorrelation or the mutual information (MI) functions. The MI has the advantage of dealing well with nonlinear relations. Therefore, one possible criterion is to take the value of τ corresponding to the first local minimum of the MI function.
The MI is a measure of how much information one time series provides about another [47,52]. Let X and Y represent two discrete random variables with alphabets X and Y, respectively; then their mutual information I(X, Y) is given by:

I(X, Y) = ∑_{x∈X} ∑_{y∈Y} p(x, y) ln [ p(x, y) / (p(x) p(y)) ],

where p(x) and p(y) represent the marginal pdfs of X and Y, respectively, and p(x, y) denotes the joint pdf.
Given the real data representative of the indices {GDP, PPS, TPM, USP}, we start by calculating the MI between each data-series and its delayed versions, obtaining the first minima at τ = {4, 4, 6, 4} (years). In a second step, we use Models (1)-(6) to generate year-spaced data points within the time span of the original data-series, as well as points for 2τ years before and 2τ years after that period. These extrapolated points are then used for regenerating the data that is lost when we construct the PSS vectors in the 3-dimensional space. Each regenerated year is given by the median of the corresponding points extrapolated by the models.
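A histogram-based MI estimate, and the first-local-minimum rule for choosing τ, can be sketched as follows; it is applied here to a synthetic sine wave rather than the real indices, and the bin count is an arbitrary illustrative choice.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in histogram estimate of I(X, Y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)     # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)     # marginal p(y)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def first_minimum_delay(s, max_tau, bins=16):
    """Smallest tau at which MI(s(t), s(t - tau)) has a local minimum."""
    mi = [mutual_information(s[tau:], s[:-tau], bins)
          for tau in range(1, max_tau + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] < mi[k + 1]:
            return k + 1                    # lags are 1-indexed
    return max_tau

t = np.linspace(0.0, 8.0 * np.pi, 2000)
tau = first_minimum_delay(np.sin(t), max_tau=150)
print(tau)
```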
Figure 5 depicts the PSS paths in log scales for n = 3 and the indices {GDP, PPS, TPM, USP}.For the indices PPS and USP we do not include the Rat model, since it degenerates when extrapolating for certain years.

Discussion of the Results and Conclusions
Applied sciences are fertile in providing data that can be fitted by means of trendlines. In many cases we obtain heuristic "laws", with trendlines of the exponential and power-law types being particularly popular [53][54][55][56][57][58][59]. The extrapolation of some "law" into the future, as in the cases of R. Kurzweil [15] or E. Brynjolfsson [60], is often received with some skepticism by the scientific community, since the future seems unpredictable a priori within the present-day paradigm of cause and effect. Nonetheless, the debate on predictability is outside the scope of the present paper, which intends mainly to explore possible visions of future complexity based on today's available data.
We observe that the proposed trendlines fit the data well within the time period of the data-series. In fact, other expressions could be included in the set under analysis, which is kept to 6 just for the sake of parsimony. Therefore, the purpose of our study is to explore the behavior of such "laws" in the future and to design a scheme based on entropy to assess such evolution.
Bearing these ideas in mind, adopting a time-varying window and an entropy measure to evaluate the evolution of a trendline g(•) can be interpreted in a probabilistic way. Time constitutes an input random variable X, with a uniform pdf, that excites the function g(X). Finally, the entropy measures the output random variable Y. Therefore, the real-world data-series can be interpreted as "noisy" measurements of the unknown trendline "law".
In the present case, with 4 data-series and 6 trendlines, we observe that the simple Exp function can be interpreted as a reasonably good, but not flexible, expression. For the TPM data-series, the Rat function is clearly not adequate for future extrapolations, but it seems to remain a valid option for the rest of the series. Concerning the other trendlines, neither the time nor the entropy analysis leads to the emergence of one "best option".
The 3D PSS, inspired by dynamical systems, seems more interesting for assessing the performance of a given trendline than the 2D entropy-based approach. It is straightforward to point out the Rat, Ric and Wei models for the TPM series as not adequate. Yet, in most cases we do not have an assertive criterion to decide on the "best" trendline. Maybe, as noted by the Danish philosopher Søren Kierkegaard, "Life can only be understood backwards; but it must be lived forwards".

Table 1 .
Features of the original data.

Table 2 .
Values of NRMSD and R² for ỹ_i versus ŷ_i, obtained by means of nonlinear least-squares.