A Generalized Model of Complex Allometry I: Formal Setup, Identification Procedures and Applications to Non-Destructive Estimation of Plant Biomass Units

Abstract: (1) Background: We previously demonstrated that customary regression protocols for curvature in geometrical space all derive from a generalized model of complex allometry that combines scaling parameters expressed as continuous functions of the covariate. Results highlighted the relevance of addressing suitable complexity in enhancing the accuracy of allometric surrogates of plant biomass units. Nevertheless, examination was circumscribed to particular characterizations of the generalized model. Here we address the general identification problem. (2) Methods: We first suggest a log-scales protocol composing a mixture of linear models weighted by exponential powers. Alternatively, adopting an operating regime-based modeling slant, we offer mixture regression or Takagi–Sugeno–Kang arrangements. This last approach allows polyphasic identification in direct scales. A derived index measures the extent to which complexity in arithmetical space drives curvature in geometrical space. (3) Results: Fits on real and simulated data produced proxies of outstanding reproducibility strength regardless of data scales. (4) Conclusions: The presented analytical constructs are expected to grant efficient allometric projection of plant biomass units and to serve the general settings of allometric examination as well. A traditional perspective deems log-transformation and allometry inseparable. Recent views assert that this leads to biased results. The present examination suggests this controversy can be resolved by adequately addressing the complexity of geometrical space protocols.


Introduction
Carbon fixation by plant biomass units promotes a reduction in the concentration of greenhouse gases in the atmosphere, thereby lessening global warming [1][2][3][4][5][6]. Therefore, the assessment of the flux and storage of carbon in plant biomass is of great interest. Equally important is the adaptation of methods aimed at non-destructive estimation. For instance, aboveground tree biomass in forest ecosystems has been estimated through remote sensing protocols [7][8][9][10]. Nevertheless, a number of factors such as sample size, weather, complexity of biophysical settings, study area scale, software, or spatial resolution can induce uncertainty in remote-sensed estimation [11][12][13][14][15]. Allometric methods allow implementation of parallel, cost-effective, non-destructive estimation of plant biomass units [16][17][18][19][20][21][22][23][24]. However, this approach is not problem-free either. Factors like analysis method, sample size and data quality can bear significant influences on precision [25][26][27][28][29][30][31][32][33]. Understanding the way procedural factors drive the precision of allometric projection of plant biomass units is crucial for assuring reliability. This paper centers on the influences of the analysis method and particularly aims at proposing reliable methods for handling complexity. For clarification, it is pertinent to review the notions behind interpretation and identification of the allometric paradigm.
As originally envisioned, the term allometry, also referred to as biological scaling, denotes the relation between the size of a given organismal trait and overall body size. The notion developed from observations by Otto Snell in 1892 and D'Arcy Thompson in 1917. Allometry as a study subject was outlined by Julian Huxley in 1932, in his theory of constant relative growth between two body parts [34], formulated through the scaling equation: $y = \beta x^{\alpha}$ (1) where $x$ and $y$ are quantifiable traits, the parameter $\alpha$ is designated the allometric exponent and $\beta$ is recognized as the normalization constant. This model, also known as the equation of simple allometry, has been widely used in research problems in many fields including biology [25,35,36], biomedical sciences [37][38][39][40][41], economics [42][43][44][45][46], earth and planetary sciences [47][48][49][50][51], and resource management and conservation [52][53][54][55][56]. The interest in this model lies essentially in its practical utility to produce surrogates of a response $y$ that is difficult to measure directly, by using estimates of the parameters $\alpha$ and $\beta$ and easily obtained measurements of a covariate $x$.
Parallel to Equation (1) is the traditional analysis method of allometry (TAMA). This is a widespread protocol that relies on log-transformation in order to convert Equation (1) into a linear model in geometrical space. Then, the fitted line is back-transformed to yield the original two-parameter power function in arithmetical scale. A log-transformation embraces a notion of multiplicative growth. Moreover, in Huxley's rationale the intercept of TAMA's line was of no explicit biological relevance, but the slope was significant enough to mean allometry itself. This interpretation remains today the only valid theoretical perspective for many practitioners of allometry [57][58][59]. However, in spite of TAMA's prevalence, there are views asserting that this scheme produces biased results [60][61][62][63][64]. Also, allometric relationships are expressed as power functions and conform to a non-linear form in the original scale of data. Thus, keeping the analysis in arithmetical scales is in some way more adequate. Accordingly, from this perspective, direct non-linear regression in arithmetical scales (DNLR) becomes a default standard [65][66][67][68][69][70][71][72]. From this slant, the failure of a TAMA fit mainly stems from the unsuitable complexity of Huxley's formula of simple allometry. Amending this circumstance has encouraged moving further away from Huxley's perception of covariation among different traits, in order to conceive allometry as centered on covariation between size and shape [73,74]. Alongside this, it is necessary to consider multiple-parameter complex allometry (MCA). Related formulations can admit all sorts of nonlinear or discontinuous relationships intended to be fitted in arithmetical scales by means of DNLR protocols [75][76][77][78].
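The TAMA steps described above (log-transform both traits, fit a line, back-transform) can be illustrated with a minimal sketch. The function name `tama_fit`, the symbols `alpha` and `beta`, and the synthetic power-law data are assumptions introduced here for illustration only; no retransformation bias correction is applied because the example data are noise-free.

```python
import numpy as np

def tama_fit(x, y):
    """Fit ln(y) = ln(beta) + alpha * ln(x) by ordinary least squares."""
    u, v = np.log(x), np.log(y)
    # np.polyfit returns the slope first, then the intercept, for degree 1.
    alpha, ln_beta = np.polyfit(u, v, 1)
    # Back-transform the intercept to obtain the normalization constant.
    return alpha, np.exp(ln_beta)

# Noise-free synthetic data from a known power law (illustrative values).
x = np.linspace(1.0, 100.0, 50)
y = 0.5 * x ** 1.8
alpha_hat, beta_hat = tama_fit(x, y)
```

With noisy data the back-transformed curve would generally need a correction factor for retransformation bias, which this sketch deliberately leaves out.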
However, opposing MCA-DNLR slants, defenders of a TAMA approach state that, as conceived in the original theoretical standpoint of allometry, a logarithmic transformation is necessary in the analysis [18,59,79-87]. Thus, embracing MCA-DNLR protocols feeds one of the most central discrepancies among schools of allometric examination. Furthermore, from a traditional stance MCA-DNLR schemes sacrifice appreciation of biological theory in order to privilege statistical correctness [32,59,87]. Besides, a DNLR approach could yield unreliable results; for instance, an inadequate consideration of the intrinsic error structure can lead to substantial bias [88]. In addition, the largest values of the covariate can exert undue influence on parameter estimates [18]. From a practical point of view, depending on the complexity of the MCA form to be fitted in direct scales, there could be issues related to initial parameter estimates as well as to convergence of non-linear regression algorithms. Therefore, there are also caveats on the efficiency of allometric projection of plant biomass units derived from MCA-DNLR arrangements. Defenders of traditional allometry thus assert that MCA-DNLR inconveniences could be overcome by embracing suitable complexity in geometrical space.
Huxley marked a break point in the log-log plot of chela mass vs. body mass of fiddler crabs (Uca pugnax). This was supported by an abrupt change in relative growth of the chela approximately when crabs reach sexual maturity [34,89,90]. Admitting a break point for transition to succeeding growth phases readily accommodates a notion of curvature in geometrical space. This paradigm is also referred to as non-loglinear allometry in Huxley's original interpretation [70,91,92]. Extension of Huxley's break point idea allows consideration of polyphasic loglinear allometry (PLA) [91,93-95]. This approach characterizes heterogeneity of the response in geometrical space by dividing the range of the covariate into sectors separated by break points. Each sector is associated with a linear sub model. Therefore, as opposed to examination through MCA-DNLR schemes, endorsing PLA break-point borne curvature offers a way to add complexity while maintaining the essence of traditional allometry. However, curvature as comprehended in the traditional allometric perspective is also controversial. For instance, its manifestation has been associated with distortion produced by the use of a logarithmic transformation itself [64,96]. In contrast, Mascaro et al. [88] emphasize that the occurrence of curvature has nothing to do with the use of logarithmic transformations, since deviations from perfect log-linear allometry can be explained on grounds of methodological factors related to data. In conceiving the aims of the present research we abided by this perspective, and we offer methods that in our view allow efficient identification of MCA patterns through log-transformation methods. Nevertheless, as we shall explain ahead, the present methods could also embrace efficient identification of MCA forms in direct scales.
Following Bervian et al. [77] and Echavarria-Heras et al. [92], we can conceive MCA as a generalization of Huxley's formula of simple allometry, namely, $y = \beta(x)\,x^{\alpha(x)}$ (2) with $x$ and $y$ belonging to domains $X$ and $Y$, one to one, both contained in $\mathbb{R}^{+}$, and where $\alpha(x)$ and $\beta(x)$ are real functions, with $\beta(x)$ having a range in $\mathbb{R}^{+}$. Moreover, as we explain in the methods section, log-transformation of Equation (2) leads to a generalization of a TAMA arrangement that hosts curvature in geometrical space in a direct and intuitive way. Mascaro et al. [88] recommend three ways of addressing this sort of curvilinearity. One adopts a PLA approach by endorsing separation of data in order to contemplate local linear models with the aim of taking into account heterogeneity of effects of the covariate [97][98][99]. A second one is by fitting a polynomial model in geometrical space [100][101][102]. A final one is by fitting a heteroscedastic non-linear regression model in arithmetical scales [43]. Echavarria-Heras et al. [92] demonstrated that each one of the curvature models suggested by Mascaro et al. [88] can be derived as a logical consequence of suitably chosen forms of the scaling functions $\alpha(x)$ and $\beta(x)$ in Equation (2). Moreover, allometric proxies of plant biomass units produced by the corresponding protocols fitted in geometrical space exhibited high consistency with observed values. This suggests that logarithmic transformation methods could be dependable provided the supporting fitting schemes bear suitable complexity. In this sense, the model of Equation (2) could provide the required approach. Nevertheless, procedures addressed by Echavarria-Heras et al. [92] only amount to particular characterizations of $\alpha(x)$ and $\beta(x)$. Moreover, the problem of identifying these functions in a general setup has not yet been undertaken. The present research is an attempt to address this subject. We advance two general identification procedures for the model of Equation (2).
One characterizes $\alpha(x)$ and $\beta(x)$, one to one, by means of independent polynomial forms to be fitted in geometrical space. An alternative approach takes on an operating regime-based modeling slant (ORBM) [103,104]. This also allows independent characterizations of $\alpha(x)$ and $\beta(x)$ from weighted mixtures of linear sub models. It turns out that a PLA perspective can also be hosted by the paradigm of Equation (2) by choosing the involved scaling functions in proper forms. Moreover, the ORBM approach undertaken brings about an interpolation device aimed at identifying whatever MCA form becomes necessary in direct arithmetical scales, regardless of the complexity of the inherent allometric response-covariate linkage in those scales. This construct provides a criterion to test the performance of geometrical space methods resulting from the MCA model of Equation (2). Moreover, the present approaches permit adaptation of an index, denoted here through the symbol $\kappa(x)$, aimed at detecting to what extent the complexity of the MCA form in arithmetical space drives curvature in geometrical space. Performance metrics of fits achieved on both real and simulated data suggest that the presently offered geometrical space protocols could entail highly consistent projection of plant biomass units. Beyond providing empirical convenience, the present examination suggests that adoption of the MCA paradigm of Equation (2) and the offered identification approaches can overcome controversial views pertaining to the suitability of the analysis method in allometry.
This paper is organized as follows. In the materials and methods, we specify datasets. Then, we explain formulae and notation conventions behind the considered MCA identification approaches. This covers both geometrical space and direct scales regression protocols. We explain the derivation of curvature index formulae and the simulation procedures aimed at verifying the dependability of the MCA approach. The results section analyses the capabilities of the proposed MCA arrangements to produce dependable fits based on real and simulated data. A discussion section highlights the advantages of the present constructs for efficient allometric projection of plant biomass units. Several Appendices provide details of derived formulae and important complementary explanations. A supplementary files section provides data and computational codes backing the results.

Data
For the aims of the present research, we relied on an eelgrass data set collected in San Quintin Bay, México, and reported in Echavarria-Heras et al. [92]. The data comprise measurements of length (mm), width (mm) and dry weight (g) of a total of 10,412 individual eelgrass leaves taken from 20 randomly thrown 400 cm² quadrats. The length times width proxy provided estimations of leaf area. A second data set comprising 47 measurements of aboveground tree biomass (ABG) and height (H) was taken from Ramirez-Ramirez et al. [6] by electronic scanning methods. Sampling protocols acquiring data pairs are explained therein. Additionally, in order to test the consistency of the proposed identification methods, we produced simulated data. These comprise replicates of reference values resulting from Equation (2) for particular characterizations of the scaling functions $\alpha(x)$ and $\beta(x)$ in the MCA. Simulated replicates resulted from multiplying reference values by a factor expressed as an exponential function of a random variable $\epsilon$. This was taken as normally distributed or as following another distribution type (cf. Equation (42) through Equation (44)).
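The replicate-generation idea described above (reference values multiplied by the exponential of a random error) can be sketched as follows for the normally distributed case. The function name `simulate_replicates`, the reference power law, the number of replicates and the seed are hypothetical choices made for illustration; they are not the settings of the original Matlab codes.

```python
import numpy as np

def simulate_replicates(y_ref, n_rep, sigma, rng):
    """Multiply each reference value by exp(eps), with eps ~ N(0, sigma^2)."""
    eps = rng.normal(loc=0.0, scale=sigma, size=(len(y_ref), n_rep))
    # Row i holds the n_rep lognormally distributed replicates of y_ref[i].
    return y_ref[:, None] * np.exp(eps)

rng = np.random.default_rng(0)
x = np.linspace(1.0, 100.0, 20)
y_ref = 0.5 * x ** 1.8  # hypothetical reference curve
reps = simulate_replicates(y_ref, n_rep=5, sigma=0.1, rng=rng)
```

Because the error enters multiplicatively, every replicate stays positive, and its logarithm differs from the logged reference value by an additive normal error.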

Multiple-Parameter Complex Allometry (MCA) Identification Protocols
The MCA of Equation (2) reduces to Huxley's formula of simple allometry when, for all values of $x$, $\alpha(x)$ and $\beta(x)$ take on constant values, one to one. Echavarria-Heras et al. [92] demonstrated that for suitable characterizations of the functions $\alpha(x)$ and $\beta(x)$ the curvature protocols suggested by Mascaro et al. [88] can all be derived from the MCA formula of Equation (2). Additionally, since the response $y$ has been assumed to belong to $\mathbb{R}^{+}$, then embracing a perspective of a multiplicative error in allometry, we can consider the general MCA regression model in direct arithmetical scales, $y = \beta(x)\,x^{\alpha(x)}\,e^{\epsilon}$ (3) $\epsilon$ being a residual error term conceived as a $\phi$-distributed random variable having mean $\mu$ and variance set by a function $\sigma^{2}(x)$ of the covariate $x$, that is, $\epsilon \sim \phi(\mu, \sigma^{2}(x))$.

Identification in Log Scales
In order to place the analysis of MCA in geometrical scales we consider the log transformation $u = \ln x$ and $v = \ln y$. This sets domains $U$ and $V$ for $u$ and $v$ respectively. Thus, Equation (3) leads to the regression model: $v = f(u) + \epsilon$ (4) where $f(u) = \ln \beta(e^{u}) + \alpha(e^{u})\,u$ (5). Then, Equation (4) provides the geometrical space MCA regression protocol. The resulting mean response function, denoted through the symbol $E_f(v|u)$, becomes $E_f(v|u) = f(u)$. A retransformation $y = \exp(f(u(x)) + \epsilon)$ of Equation (4) to direct scales yields a mean response function $E_f(y|x)$, namely, $E_f(y|x) = \beta(x)\,x^{\alpha(x)}\,\delta(x)$ (6) where $\delta(x) = E(e^{\epsilon})$ is the required correction factor (CF) for bias of retransformation of the regression error. Notice that the subscript $f$ in Equation (6) is intended to signpost association to the mean response function $f(u)$ fitted in geometrical space. This notation convention will be maintained throughout. Generally, the efficiency of analytical methods in geometrical space centers on the suitability of the retransformation step entailed by Equation (6). For instance, Mascaro et al. [88] asserted that the biased results that Packard [66] blamed on a TAMA fit can be explained by a missing CF. However, even when such a factor is taken into account, the suitability of its form becomes crucial in determining reliable reproducibility of back-transformed forms. What is more, there are restraints to be considered about CF appropriateness. Properly, if $\epsilon$ represents the error in the regression, then CF is expressed as the mean of the exponential random variable $e^{\epsilon}$, that is, CF = $E(e^{\epsilon})$. Furthermore, assuming that $\epsilon$ is normally distributed, CF takes on a lognormal-mean form [105,106]. But if $\epsilon$ fails to be normally distributed, two possibilities arise. If $\epsilon$ has a known distribution, then $E(e^{\epsilon})$ can be obtained and CF can be represented in a closed form. Otherwise, if the distribution of $\epsilon$ is not identified a priori, one commonly taken approach is setting CF as given by Duan's smearing estimator of bias [61,106-108].
Yet, there are caveats to this, since the chosen form can fail to appropriately compensate for the downward bias intrinsic to retransformation of logged data [61,109,110]. Thus, whenever the distribution of $\epsilon$ is unspecified, picking a suitable CF is a subtle matter. In order to offer an appropriate form for $\delta(x)$, Echavarria-Heras et al. [92] suggested an arrangement that generalizes the CF as introduced by Zeng and Tang [52]. Adaptation of $\delta(x)$ first considers an approximation $\delta_n(x)$ given by the $n$-term partial sum of the series representation of the expected value of the retransformed error $e^{\epsilon}$ (Equation (7)). Then, Lin's concordance correlation coefficient (CCC) [111] between observed values and those projected by $E_f(y|x)$ as determined by $\delta_n(x)$ is obtained. If a value of $n$ entails maximum reproducibility of projections, we take $\delta(x) = \delta_n(x)$ for that value. We will keep this criterion to choose $\delta(x)$ in retransformation tasks throughout. The explicit CCC formula is provided by Equation (D2).
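Two of the correction factors mentioned in this subsection, the lognormal-mean form and Duan's smearing estimator, can be compared with a short sketch. The function names are hypothetical, the residuals are simulated rather than taken from any fit reported here, and the sample variance with `ddof=1` is a simplification (a regression would normally use the residual degrees of freedom). The series-based CF of Equation (7) is not implemented.

```python
import numpy as np

def lognormal_cf(residuals):
    """CF = exp(s^2 / 2), appropriate when log-scale errors are N(0, s^2)."""
    s2 = np.var(residuals, ddof=1)
    return float(np.exp(s2 / 2.0))

def duan_smearing_cf(residuals):
    """Duan's smearing estimator: the sample mean of exp(residual)."""
    return float(np.mean(np.exp(residuals)))

# Simulated, mean-centered log-scale residuals (illustrative only).
rng = np.random.default_rng(1)
res = rng.normal(0.0, 0.1, size=5000)
res -= res.mean()
cf_ln, cf_duan = lognormal_cf(res), duan_smearing_cf(res)
```

For normally distributed residuals the two factors should nearly coincide; they can diverge when the error distribution is skewed or heavy-tailed, which is the situation where the choice of CF matters.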

Identification of ( ) Involving Polynomial Forms
The Weierstrass approximation theorem [112] offers an approach that lodges ( ) involving terms ( ) and ( ) that express through polynomials. Certainly, as we explain in Appendix A, for a proper integer we can consider polynomials ( ) for ( ( )) and ( ) for ( ) such that the MCA of Equation (2) can be written in the form: where the function exp ( ( )) stands for involved approximation error. We further assume that the ℎ-degree polynomials ( ) and ( ) define through, where and for = 0, 1, … , are coefficients. According to the Weierstrass approximation theorem for large enough ( ) will display negligible values so we can consider that Equation (8) associates to the regression model: where Then, ( ) can be interpreted as a mixture of linear models ( ) = + weighted by exponential powers . Through, we will use the symbol to symbolize the vector of parameters characterizing ( ). It turns out that by setting = 0 Equation (11) returns the linear regression model of the TAMA protocol. The resulting mean response function is denoted by means of the symbol ( | ) and turns out to be ( | ) = ( ). Moreover, by defining auxiliary variables ( ) = and ( ) = Equation (11) becomes a multiple linear regression model. According to our notation convention, the corresponding mean response function in arithmetical space is denoted by means of ( | ). It becomes: As given by Equation (12) the ( ) function depends on both and ( ), but, as it is explained in Appendix A, it is possible to offer an equivalent form expressed as a (2 + 1) ℎ-degree polynomial of plus a remainder, namely where where Equation (A15) estates how the coefficients relate to those conforming ( ) and ( ).
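As a rough illustration of the polynomial route, suppose the log-scale mean response is a sum of linear terms $(a_k + b_k u)$ weighted by powers $u^k$ for $k = 0, \dots, m$, where $u$ is the logged covariate and $a_k$, $b_k$ are hypothetical coefficient names. Under that assumption the sum collapses to a polynomial of degree $m + 1$ in $u$, which can be fitted by ordinary least squares, and $m = 0$ gives back the straight line of the TAMA protocol. The sketch below fits that collapsed polynomial; it does not separate the individual $a_k$ and $b_k$, and it is not the authors' own code.

```python
import numpy as np

def fit_log_polynomial(x, y, m):
    """Least-squares polynomial of degree m + 1 in u = ln(x) for v = ln(y)."""
    u, v = np.log(x), np.log(y)
    # .convert() maps the fit back to the unscaled variable u, so that
    # .coef lists the coefficients in ascending powers of u.
    return np.polynomial.Polynomial.fit(u, v, deg=m + 1).convert()

# Synthetic data that are curved in log-log space (illustrative values).
x = np.linspace(1.0, 100.0, 200)
u = np.log(x)
y = np.exp(-1.0 + 1.5 * u + 0.1 * u ** 2)

poly = fit_log_polynomial(x, y, m=1)                  # quadratic in u
line = fit_log_polynomial(x, 0.5 * x ** 1.8, m=0)     # TAMA special case
```

Here `poly.coef` recovers the three coefficients used to generate the data, while `line.coef` holds the intercept and slope of a simple allometry fit.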
Moreover, by the Weierstrass approximation theorem for large enough we could expect R ( ) becoming negligible, so the regression Equation (11) can be also written in the form: with, Corresponding mean response function ( | ) in arithmetical space becomes: Identification of ( ) by means of an operating-regime-based modeling approach: In order to place MCA into an ORBM scheme, we set = | ≤ ≤ . Then, we contemplate a collection of disjoint intervals , with = 1, 2, … , , given by This way, the s compose a partition ⋃ for . Now, for = 1, 2, … , , we take weight functions ( ) defined through, e l s e w h e r e .
Then, chosen ( ) functions satisfy the normalizing condition, Now, for = 1, 2, … , we contemplate parameters and , and also assume that ( ) and ( ) setting ( ) in Equation (5) are defined through: This way regression Equation (4) hosts a characterization: where Ω ( ) is the arranged ORBM form of ( ) that becomes, with In a PLA arrangement, variability of the log-transformed response , is interpreted through a collection of linear sub models defined on domains conceived as disjoint subintervals composing covariate range . The local linear models switch on thresholds also called break points. This way a PLA assemblage provides interpolation features for the identification of curvature in geometrical space. But, beyond empirical gains, the parameters composing the local linear models admit an interpretation as allometric exponents according to Huxley's original formulation. Then, Equation (19) through Equation (26), readily bring about a PLA arrangement. Certainly, for = 1, 2, … , − 1 a threshold interprets as a break point for transition from a th allometric phase associated to a linear sub model ( ) to one ( + 1)th for local model ( ). Moreover, Ω ( ) as given by Equation (25) entails a mixture of linear models ( ) weighted by factors ( ). Along with this, Equation (24) becomes a mixture regression model [104,113,114]. For the piecewise linear setting of Equation (25) identification can be achieved by means of the segmented package in R [115]. Alternatively, we can conceive the regression model: where but, this time, we let ( ) vary continuously over the whole domain , with range 0 ≤ ( ) ≤ 1 and also satisfying the normalization condition of Equation (21). Then, Equation (28) entails a mixture of weighted linear models [114,[116][117][118][119]. For instance, for a biphasic characterization ( = 2) of Ω ( ) we may take ( ) as a normal survival function. 
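The biphasic case mentioned above, with a normal survival function as the weight of the first phase, can be sketched as below. The symbols `c` (location of the transition), `s` (its spread) and the two intercept and slope pairs are hypothetical names and values chosen for illustration; the second weight is set to one minus the first so that the weights add up to one at every point.

```python
import math
import numpy as np

def survival_weight(u, c, s):
    """Normal survival function 1 - Phi((u - c) / s), evaluated elementwise."""
    z = (np.asarray(u, dtype=float) - c) / (s * math.sqrt(2.0))
    return 0.5 * (1.0 - np.vectorize(math.erf)(z))

def biphasic_mean(u, c, s, line1, line2):
    """Weighted mixture of two lines; the weights sum to one at every u."""
    w1 = survival_weight(u, c, s)
    w2 = 1.0 - w1
    (a1, b1), (a2, b2) = line1, line2
    return w1 * (a1 + b1 * u) + w2 * (a2 + b2 * u)

# Two phases whose lines cross at u = 5 (illustrative values).
u = np.linspace(0.0, 10.0, 101)
f = biphasic_mean(u, c=5.0, s=0.3, line1=(0.0, 1.0), line2=(-5.0, 2.0))
```

A small `s` makes the mixture approach a sharp break point at `c`, whereas a larger `s` yields a gradual transition between the two allometric phases.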
Consequently, Equation Moreover, a Takagi-Sugeno-Kang (TSK) fuzzy model [120,121] offers a versatile PLA identification procedure that allows consideration of multiple interpolation sub models ( > 2). Associated regression model expresses here by: with Ω ( ) taking the ORBM form: with weights ( ) being continuous functions acquired through fuzzy clustering techniques, and ( ) linear sub models having the form set by Equation (26) and to be identified through recursive least squares techniques. Appendix C explains details on putting forward Ω ( ) as given by Equation (30) This is achieved by adapting Equation (C15) for the present allometric settings. A first step involves acquiring a fuzzy partition of covariate domain . This characterizes the set specified by Equation (C2), composing linguistic terms Φ ( ), … , Φ ( ) that create a fuzzy partition of . Alongside this, we need to specify the set of membership functions Φ ( ), … , μΦ ( ) described by Equation (C4). A membership function μΦ ( ): → [0,1] assigns the degree of pertinence of a covariate value to the fuzzy set associated to the linguistic term Φ ( ). For the present analysis, membership functions are assumed to have a Gaussian form i.e., being and for = 1, 2, … , , parameters to be identified from available data applying subtractive clustering (SC) techniques [122,123]. Contemplation of SC techniques also establishes the number setting the cardinality of , as well as, the number of inference rules specified by Equation (C10). Furthermore, for = 1, 2, … , , the consequent functions ( ) specified by Equation (C11) and here assumed to have a form given by Equation (26) where the parameters and are to be identified from data pairs ( , ) through a recursive least squares (RLS) routine [124,125].
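The structure of a TSK-type mean response (Gaussian memberships, normalized weights, linear consequents) can be illustrated with the sketch below. Two simplifications are assumed and should be read as such: the membership centers and spreads are supplied by hand rather than obtained by subtractive clustering, and the consequent parameters are estimated by a single batch least-squares solve instead of a recursive least squares routine. All function and variable names are hypothetical.

```python
import numpy as np

def gaussian_weights(u, centers, spreads):
    """Normalized Gaussian memberships; each row sums to one."""
    u = np.asarray(u, dtype=float)[:, None]
    centers, spreads = np.asarray(centers, float), np.asarray(spreads, float)
    mu = np.exp(-0.5 * ((u - centers) / spreads) ** 2)
    return mu / mu.sum(axis=1, keepdims=True)

def predict(u, centers, spreads, a, b):
    """Weighted mixture of local lines a_k + b_k * u."""
    w = gaussian_weights(u, centers, spreads)
    return (w * (np.asarray(a) + np.outer(u, b))).sum(axis=1)

def fit_consequents(u, v, centers, spreads):
    """Batch least squares for the intercepts a_k and slopes b_k."""
    w = gaussian_weights(u, centers, spreads)
    design = np.hstack([w, w * np.asarray(u, dtype=float)[:, None]])
    theta, *_ = np.linalg.lstsq(design, v, rcond=None)
    q = w.shape[1]
    return theta[:q], theta[q:]

# Noise-free data generated from two known local lines (illustrative).
u = np.linspace(0.0, 6.0, 120)
centers, spreads = np.array([1.5, 4.5]), np.array([1.0, 1.0])
v = predict(u, centers, spreads, a=[0.0, -3.0], b=[1.0, 2.0])
a_hat, b_hat = fit_consequents(u, v, centers, spreads)
```

Because the weights are fixed once the memberships are chosen, the model is linear in the consequent parameters, which is why a least-squares solve (batch here, recursive in the text) suffices for that step.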
According to Equations (C12) and (C13), we take: It follows that the normalization condition of Equation (21) holds. Corresponding to Equation (29), we have the mean response function ( | ) = Ω ( ) . Then, performing a back transformation → yields the mean response function ( | ) in arithmetical space, namely:

MCA Identification in Direct Arithmetical Scales
Alternatively, Equation (C15) can adapt a Ω ( ) interpolation device for MCA as given by Equation (2) in direct arithmetical space. Resulting regression equation becomes: where, The th membership function Φ ( ) is given by the formula with and for = 1, 2, … . , parameters. Correspondingly, we consider consequents ( ) to be the linear functions:

Derivation of Curvature Index ( )
Let's define functions ( ) and ( ) and constants and such that ( ) and ( ) in Equation (2) where Since logtransforrming both sides of Equation (38) leads to the expression: then, the closer κ( ) gets to the line = 1, the more dominant the linear term + becomes. Therefore, κ( ) interprets as a measure of the curvature implied by the CMA form in geometrical space. Moreover, once the function ( ) exp ( ( )) in Equation (2) has been estimated, given candidate values for the parameters and , the curvature factor κ( ) can be estimated from data through the relationship: The TSK form fitted in geometrical space allows a direct characterization of the for the parameters and . A back transformation → in Equation (29) and then simplifying establishes, and introducing the auxiliary function ℎ ( ) = ( ) − 1, we get the equivalent formulation Then, we may set κ( ) = θ ( ) being θ ( ) as estimated by the ratio ( | ) ⁄

Simulation Studies
In order to asses performance of the MCA by simulation assays, we first arranged covariate values for = 1, 2 … and such that ≤ ≤ . Then, we acquired characterizations of the scaling functions ( ) and ( ), so that Equation (2) determined projected reference values namely, Next, we use the Matlab function: (′ , , ) in order draw a random numbers , = 1, 2, … , , from a normal distribution having mean and variance that is ~( , ). This produced a number , of lognormally distributed replicates ( ) ( ), associating to each reference response value , that is, In order to consider non-lognormally distributed replicates we considered random numbers so Equation (43) produced replicates ( ) ( ) according to: with expressing as a product of exponentially ( = 0.1) and logistically ( = 0, = 0.1) distributed random numbers, produced by corresponding Matlab functions that is,

Assessment of Reproducibility Strength
The assessment of the reproducibility strengths of the proxies considered here will be primarily carried out by analyzing values of Lin's concordance correlation coefficient, symbolized by means of $\rho$ [111]. Agreement will be defined as poor whenever $\rho$ < 0.90, moderate for 0.90 ≤ $\rho$ < 0.95, good for 0.95 ≤ $\rho$ < 0.99, or excellent for $\rho$ ≥ 0.99 [126]. Moreover, the CCC-based assessment will be complemented by model performance metrics, such as the coefficient of determination (CD), the standard error of estimate (SEE) and the mean prediction error (MPE) [127][128][129][130]. Related formulae for these statistics are provided by Equations (D1) through (D7) in Appendix D. The Matlab and R codes involved in both fuzzy inference and conventional statistical tasks are provided in the supplemental files section.
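A compact sketch of the concordance computation and of the agreement scale quoted above follows. It uses the standard moment-based definition of Lin's coefficient with 1/n variances, which is an assumption here since the formula of Equation (D2) is not reproduced in this section; the function names are hypothetical.

```python
import numpy as np

def lin_ccc(obs, pred):
    """Lin's concordance correlation coefficient with 1/n moments."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    cov = np.mean((obs - obs.mean()) * (pred - pred.mean()))
    denom = obs.var() + pred.var() + (obs.mean() - pred.mean()) ** 2
    return 2.0 * cov / denom

def agreement_label(ccc):
    """Map a CCC value onto the agreement scale used in the text."""
    if ccc < 0.90:
        return "poor"
    if ccc < 0.95:
        return "moderate"
    if ccc < 0.99:
        return "good"
    return "excellent"

# A constant shift lowers concordance even though correlation is perfect.
shifted = lin_ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Unlike the Pearson correlation, the coefficient penalizes both location and scale departures from the identity line, which is why it is suited to judging how closely projected values reproduce observed ones.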

Results
In this examination, we considered MCA protocols in the form set by ( ), given by Equation (12). We also address MCA-PLA forms Ω ( ), Ω ( ) and Ω ( ) given by Equations (25), (28) and (30) one to one. For comparison aims, we also acquired the Η ( ) proxy given by Equation (35) fitted in direct arithmetical scales. As already pointed out the case ( ) identifies a TAMA protocol. It is worth recalling the notation convention. For MCA surrogates as listed above corresponding retransformed forms are and ( | ) associating to directly fitted Η ( ). These symbols will be used through in tables and figures.

MCA Identified on Simulated Data
For the aim of performing MCA simulation studies, we adapted a response range as determined by leaf area values in the eelgrass data set reported by Echavarria-Heras et al. [92]. This way, we considered covariate values for = 1, 2 … 500 such that 0 < ≤ where = 10,000.
Then, reference values are produced by Equation (42), with ( ) = p ( ) and ( ) = ( ) being ( ) and ( ) as given by Equations (9) and (10)  Appendix E. This assay included the aims of (1) demonstrating that curvature in geometrical space could not be considered as a consequence of a logtransformation itself (2) explaining the skills of the Ω ( ) and Η(x) formulations to adapt complexity as necessary in interpreting inherent allometric pattern and (3) verifying the dependability of the curvature index κ( ) criterion of Equation (38). Indeed, as portrayed in Figure E5c,d for this simulation assay deviations of ( ) about the line = 1 are practically vanishing for all values of .
In order to explore full MCA, without loss of generality, we circumscribed our study to the case = 2 of Equation (12). In particular, we set reference parameter vector .
Acquired reference data pairs ( , ) are provided in the Com2ref.txt supplementary file. We used = 0 and = 0.1 in Equation (43) in order to generate normally distributed random numbers , = 1, 2, … . , 5 . Then, Equation (43) (1)) to , ( ) pairs produced estimated parameter values = 1.113e − 12 and = 2.765 ( = 0.9729). Figure 1a presents the distribution of ( ) values around the reference curve, and also compared to the fitted Huxley curve. This seems to provide a suitable approximation to the reference one. Nevertheless, we can be aware of deviations attributed to curvature induced by the conceived MCA form. Indeed, deviations from the line = 1 by curvature index ( ), in Figure 1b  We can ascertain that deviations from linearity in geometrical space can be expected for this data.
According to Equation (11), choosing the case = 0 in Equation (12) brings about a TAMA protocol.
This for simulated pairs ( , ( ) ) produced an = 0.9963 fit with resultant estimated parameter vector Figure 2a displays spread of ( ) replicates about fitted mean response ( ) compared with TAMA's counterpart ( ) . We can ascertain remarkable account of curvature by ( ) and noticeable bias by the TAMA protocol on this data. Figure 2b compares retransformed mean functions ( | ) and corresponding TAMA for one ( | ) and we can ascertain biased projections by the latter. y y Figure 2. Fit of MCA on simulated data ( , ( ) ( )) . Regression Equation (46) was fitted to simulated ( , ( ) ) data pairs. Panel (a) shows dispersion of ( ) about the mean response function ( ). This plot also shows the traditional analysis method of allometry's (TAMA) fitted mean response ( ) . We can ascertain notorious bias by this last approach. Panel (b) shows corresponding retransformation results. Compared to significant reproducibility of mean response ( | ) TAMA's counterpart ( | ) entails significant bias. Figure 3a displays the spread of ( ) replicates about mean response ( ) compared with its Ω ( ) counterpart. We can ascertain a dependable account of curvature by both surrogates Figure   3b compares retransformed mean functions ( | ) and ( | ). Figure 3c presents mean response acquired by fitting the protocol Η ( ) in direct arithmetical scales. Panel (d) is the comparison of ( | ) to ( | ), the directly acquired projection function. This is gotten by replacing fitted parameter vector into addressed MCA form in arithmetical space. We can ascertain remarkable reproducibility features by the Ω ( ), with the approach fitted in geometrical scales. Table 1 presents performance metrics for considered proxies.  ( | ) to ( | ) , the directly acquired projection function. This can be obtained by replacing the fitted parameter vector into addressed MCA form in arithmetical space. 
Projections remain reliable even without a CF. By acquiring non-lognormally distributed replicates ( ) ( ) = exp( + ) according to the procedure of Equation (43), we examined the effects of error structure on the performance of the addressed proxies. Figure 4 displays plots corresponding to the retransformed forms, and Table 2 presents the related model performance statistics. Results reveal that, for the present assay, the reliability of the MCA and TSK approaches does not depend on error structure. The failure of the TAMA approach to accommodate suitable complexity explains its poor performance in all included fitting statistics. Although the Huxley fit by DNLR suggests fair reproducibility by and indices, it is outperformed by the MCA alternates in the Akaike information criterion (AIC), SEE and MPE statistics. Moreover, the deviations of the κ(x) index from the line = 1 shown in Figure 1b add a criterion that sustains selection of the MCA model. This confirms our judgement that the consistency of the Huxley-DNLR approach suggested by visual inspection of plots is only apparent. This could explain the assertion that clinging to DNLR could impair the detection of inherent complexity. Therefore, a suitable examination of this data must rely on an MCA paradigm. This example also shows how a TSK slant could offer reliability on both scales of allometric examination.

MCA Identified on Real Data
We now explore the performance of the model of Equation (2) in analyzing real data sets. We considered the MCA protocols set by ( ) and ( ), given by Equations (12) and (17) respectively, and the PLA forms Ω ( ), Ω ( ) and Ω ( ), given by Equations (25), (28) and (30) respectively. For comparison, we also acquired the Η ( ) proxy fitted in direct arithmetical scales. As already pointed out, the case ( ) identifies a TAMA protocol.

MCA Proxies Identified on the Aboveground Tree Biomass (ABG) and H Data
The Ramirez and Ramirez et al. [6] tree aboveground biomass and height (ABG-H) data set is available from Ramiez-Ramirez.txt in the supplementary files section. Figure 5 presents ( , ) = Table 3 presents model comparison metrics for all MCA surrogates fitted in geometrical space. Among the considered methods, an interpolation mode of the Ω ( ) approach yields the best performance. panel (d)). The deviation of the index ( ) from the line = 1 indicates that the curvature detected in geometrical space for this data is inherent to the complexity of the allometric response-covariate relationship in direct scales. Table 4 presents the performance metrics of the retransformed proxies compared to Huxley's formula of simple allometry and characterizations of the directly fitted ( ) ∶ ( = 2, = 7). Again, the retransformed ( ) and the directly fitted ( ) displayed the highest reproducibility strengths. Vertical segments in the plots of panels (b), (c) and (d) correspond to the identified break points. As explained above, the break point for ( ) was also estimated by the empirical interpolation criterion described in Appendix A. The resulting estimates of this threshold through are consistent with those reported by Echavarria-Heras et al. [92]. In all cases, the scattering of values about the fitted MCA mean response functions suggests steady agreement. This illustrates that the manifestation of curvature is explained by complexity inherent to the allometric response-covariate association in direct scales. Table 6 presents the performance metrics of the retransformed proxies compared to Huxley's formula of simple allometry and characterizations of the directly fitted ( ) ∶ ( = 2, = 7). Again, the retransformed ( ) and the directly fitted ( ) displayed the highest reproducibility strengths.

Discussion
The flux and storage of carbon in overall plant biomass influence the global carbon cycle. Traditional approaches to quantifying the related carbon fluxes and stocks have essentially relied on allometric methods [1,19,22,100]. Usually, in this approach, directly obtained measurements of a trait and the scaling model of Equation (1) allow non-destructive estimation of values of a plant biomass unit. However, the acquired projections of the response are markedly sensitive to numerical changes in the parameters composing that model. Moreover, local factors can induce a certain extent of variability in the estimates. The influences of analysis method, error structure, sample size, and overall data quality on the study of biological scaling have been widely acknowledged [25][26][27][131]. Therefore, the propagation of estimation error and its influence on overall precision raise important questions about the dependability of allometric surrogates of plant biomass units.
Particularly relevant is the influence of the analysis method. This fuels a lively controversy that divides allometric practice into two schools. One view holds that the traditional identification of the scaling parameters via log-transformation produces biased results, and thereby recommends direct nonlinear regression analysis. However, defenders of the traditional perspective conceive log-transformation and allometry as inseparable. We have embraced here the view that the controversy can be remedied by adequately addressing the complexity of the inherent allometric relationship on the direct scales. Alongside this, it becomes necessary to choose a proper form of the correction factor for retransformation bias. In order to validate the present view, we adopted the MCA model of Equation (2), which assimilates the necessary complexity through scaling parameters expressed as functions ( ) and ( ) of the covariate. A particular form of this paradigm was addressed by Bervian et al. [77], and its generalization into the form offered here was suggested by Echavarria-Heras et al. [132]. A formal exploration established this construct as a source model from which conventional models aimed at addressing curvature in geometrical space [88] could be derived. Nevertheless, the consideration of MCA there was circumscribed to particular characterizations and left out the general identification problem. This can be conceived as using regression protocols and data pairs ( , ) in order to unfold the forms of ( ) and ( ) as independent functions granting the highest reproducibility. We aimed here to contribute to the subject. This explains the present advancement of geometrical space identification methods resulting from Weierstrass approximation theory, essentially epitomized by the ( ) approach, as well as ORBM alternates represented here by ( ), Ω ( ) and ( ). An ORBM approach also nourished the adaptation of the ( ) arrangement aimed at analysing MCA forms in direct scales.
The results in Appendix E show that a ( ) scheme can automatically adapt its complexity as required to identify a pattern conforming to the formula of simple allometry. The extraordinary approximation capabilities of the TSK fuzzy model [133,134] allow the ( ) form to interpolate allometric patterns in direct scales irrespective of intrinsic complexity. Moreover, the ( ) arrangement makes it possible to detect break points for transition among allometric phases, which is unattainable by MCA-DNLR methods [132]. The results of Appendix E also reveal that a logarithmic transformation itself cannot be blamed for the curvature detected in geometrical space. Simulation assays show that geometrical space methods provide reliable proxies on condition that the CF is chosen in a suitable form. However, the ( ) scheme could entail steady CF-free retransformation to direct scales (Figures 3d and 4d). This suggests that a suitable integration of complexity becomes a key factor in driving the dependability of allometric projection based on protocols fitted on log-scales. Adoption of an ORBM approach also allowed the adaptation of the index κ(x) given by Equation (33) (Figure 12d). This explains the manifestation of curvature, which was corroborated by the dependability of the biphasic forms of the ( ), ( ) and ( ) protocols. Therefore, clinging to Huxley's model and a non-linear regression protocol in direct scales can at most offer an apparent empirical convenience tied to simplicity, but it may leave behind relevant biological information. This concerns the existence of break points for transition between successive growth phases, which are undetected by Huxley's model of simple allometry and are well recognized by the PLA paradigm [91,93,94,95].
Likewise, the break point identified in Zostera marina may be interpreted as a threshold beyond which the plant promotes the generation of a relatively greater amount of leaf tissue, enhancing resistance to the drag forces that induce damage and separation from shoots. As a result, allometric scaling parameters could differ between small and large leaves [30]. Similarly, a break point in the Ramirez-Ramirez [6] data suggests differential allometric rules depending on tree size. Indeed, resource allocation to different tree attributes such as diameter or height could differ during growth as a response to changed environmental and biotic conditions, as well as to variations in resource availability.
From this perspective, a risk of suppression by competitors may induce small trees to allocate more resources to increasing height. Then, beyond a threshold height at which suppression risk is at a minimum, resources may be assigned to horizontal growth parameters such as diameter, crown and root cover [6,135]. Thus, since the aim of allometric examination is to comprehend the biological processes that bring about covariance among traits, analytical methods entailing break point identification should be preferred. This explains the adaptation of the empirical procedure of Appendix A, aimed at extending the ( ) protocol to break point detection. In particular, the contemplated ( ) approach could be a highly biologically meaningful model of allometry, because it can model the break points while keeping the meaning of the allometric exponents as in Huxley's original formulation. Moreover, as exemplified by the addressed study cases, the intersection of firing strength factors provides a criterion for break point identification that could be expected to deliver reliable results in the general settings of allometric examination.

Conclusions
We aimed here to address the general identification problem of the MCA model of Equation (2). This was conceived as using regression protocols and data pairs ( , ) in order to unfold ( ) and ( ) as independent functions granting the highest reproducibility. The task was addressed here in two ways. The first was through Weierstrass approximation theory, which anticipated ( ) and ( ) as determined by Equations (8) through (10) and is epitomized by the ( ) form in geometrical space. The second relied on an ORBM slant. This conceived ( ) and ( ) in the forms given by Equations (22) and (23) respectively, and engendered the ( ), ( ) and ( ) protocols, also intended to be fitted in geometrical space. This ORBM approach is also behind the adaptation of the form ( ), which takes advantage of the outstanding approximation capabilities of the general output of a first-order TSK fuzzy model to grant polyphasic interpolation of the inherent MCA pattern in arithmetical scales. A consistent fit of the ( ) arrangement could provide reliable reproducibility by direct retransformation to arithmetical scales. Moreover, this approach can also entail estimation of the break point inherent to biphasic patterns. Concomitant broken line ( ) or weighted linear segment mixture ( ) regression could also offer reasonable MCA identification in many instances of allometric examination. Nevertheless, it is opportune to emphasize the gains derived from adopting a ( ) approach. This construct bears a flexible computational assembly that allows intuitive and interactive integration of previous knowledge into the analysis [132]. This gives ( ) a relative advantage over the conventional protocols addressed here. Moreover, analysis of model performance metrics shows that the mean response function ( |x) brings about similar reproducibility strength to its ( | ) counterpart.
Therefore, ( ) could grant efficient identification of allometric proxies regardless of inherent complexity. This device also drives direct and intuitive identification of multiple break points for transition among the successive growth phases comprising a PLA pattern. It is also worth highlighting that the extraordinary approximation capabilities of the first-order TSK fuzzy model that engenders ( ) grant reliable identification of the scaling functions ( ) and ( ) by retransforming the fitted form of ( ) to arithmetical scales. We can then consider that the essential aim of this contribution was fulfilled. The TSK arrangement also allows efficient implementation of the κ(x) criterion for curvature assessment. The identification schemes offered can be conveniently implemented on available data by using the codes provided. In summary, the presented MCA paradigm, along with the identification schemes and a suitable CF form, could grant efficient projection of plant biomass units. This may well also be extended to the general settings of allometric examination. Nevertheless, we acknowledge that further substantiation is necessary before the proposed MCA can be adopted as a comprehensive tool for the analysis of allometric data. Meanwhile, the present results endorse the relevance of the remark by Kerkhoff and Enquist [81] on the futility of a divergence between the traditional logarithmic transformation and direct non-linear analysis slants in allometry. The efficiency of the proposed MCA arrangement can help resolve this ongoing controversy.
Funding: This research received no external funding.
Acknowledgments: Two anonymous reviewers provided important comments that greatly improved our final presentation.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A. Identification of the Model of Complex Allometry by Approximation of ( ) and ( ) through Polynomial Forms
The proposed MCA formula takes on the form of the generalized power function of Equation (2) namely: varying in an interval [ , ] ⊂ , and also assuming that both ( ) and ( ) are continuous functions on that domain. Additionally, we undertake that ( ) maps [ , ] into . Complexity of the general settings of the model of Equation (41), (1) poses significant difficulties when using direct nonlinear regression as an identification device. However, we can provide an alternative fitting protocol in geometrical space that overcomes these technical complications. In order to advance this, we set ( ) = exp ( ) with ( ) continuous in [ , ] so we can express Equation (2) in the equivalent form: Now, since ( ) is continuous on a real interval, then, the Weierstrass approximation theorem assures that for any > 0, there exists a th-order polynomial ( ) with depending on and set by: where for 0, 1, … , are coefficients, and such that, | ( ) − ( )| < , for all [ , ]. Moreover for that value of we can also choose a ℎ-order polynomial ( ) given by: such that we also have, | ( ) − ( )| < . Let = max( , ). Then, if = we set = 0, for + 1 ≤ ≤ . Conversely, if = , we set = 0 for + 1 ≤ ≤ . This way we can consider an homogeneous degree for polynomials ( ) and ( ) in Equations (A2) and (A3) that will ensure tolerances | ( ) − ( )| < , and | ( ) − ( )| < . Therefore, Equation (A1) can be written in the form: where ( ) is a remainder that, according to the Weierstrass approximation theorem, can become negligible by a suitable choosing of . Assuming that this is satisfied we can take ( ) as a random variable which allows to consider the regression model: Then, can be interpreted as a multiplicative random error term. The transformation = and = carries Equation (A5) into: with ϵ interpreted as an additive error term. 
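The polynomial approximation above can be sketched numerically. The notation in this sketch is assumed, since the symbols are not recoverable from the surrounding text: x is the covariate, y the response, u = ln x, v = ln y, and the logarithm of the normalization function and the exponent function are approximated by degree-n polynomials in x with hypothetical coefficients a_k and b_k. Under these assumptions the log-scale model reads v = Σ_k (a_k + b_k u) x^k + ε, which is linear in the coefficients, so ordinary least squares is enough for illustration. The function names are illustrative and are not those of the supplementary codes.

```python
import numpy as np

def design_matrix(x, n):
    """Columns x**k (k = 0..n) followed by x**k * ln(x) (k = 0..n)."""
    x = np.asarray(x, dtype=float)
    u = np.log(x)
    powers = np.vstack([x**k for k in range(n + 1)]).T  # shape (N, n + 1)
    return np.hstack([powers, powers * u[:, None]])

def fit_log_scale_mca(x, y, n):
    """Least-squares estimates of the assumed polynomial coefficients.

    a: coefficients approximating ln(beta(x)); b: coefficients approximating alpha(x).
    """
    X = design_matrix(x, n)
    v = np.log(np.asarray(y, dtype=float))
    coef, *_ = np.linalg.lstsq(X, v, rcond=None)
    return coef[: n + 1], coef[n + 1 :]

def predict_log(x, a, b):
    """Fitted mean response in geometrical (log) space."""
    n = len(a) - 1
    return design_matrix(x, n) @ np.concatenate([a, b])

# Noise-free illustration with hypothetical scaling functions:
# ln(beta(x)) = 0.1 + 0.05 x and alpha(x) = 1.2 + 0.1 x.
x = np.linspace(1.0, 5.0, 50)
y = np.exp(0.1 + 0.05 * x) * x ** (1.2 + 0.1 * x)
a_hat, b_hat = fit_log_scale_mca(x, y, n=1)
```

Setting n = 0 reduces the sketch to a straight line in log-log scales, that is, to the traditional log-linear protocol.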
Using Equations (A2) and (A3) to set forms ( ) and ( ), Equation (A6) yields, Therefore, we can express the regression model of Equation (A6) in the form: that allows to obtain estimators of the parameters and defining the ( ) and ( ) that approximate ( ) and ( ) in Equation (A5). Equation (A8) provides a non-linear regression protocol involving and powers of . A characterization of the log transformed allometric response as a polynomial of the linked covariate corresponds to a complex allometry model with a varying exponent in arithmetical space. Here, we demonstrate that such a polynomial representation allows an identification of the full complex allometry described by Equation (2) with ( ) = exp( ( )) and ( ) = ( ). In order to resolve these matters, we first consider a power series expansion of , so we can write,

This result implies
where R ( ) is a remainder. Expanding the sum over the index then multiplying by ( + ) and collecting similar terms leads to But, the following results hold and Then Equations (A11) through (A13) yield: or equivalently, Moreover, by the Weierstrass approximation theorem for large enough we could expect R ( ) becoming negligible so the regression Equation (A8) can be also written in the form,

Appendix B. Identification through an ( ) Scheme
In this appendix we address the maximum likelihood approach for the identification of the mixture mean function ( ). We exemplify procedures for the biphasic characterization. The being ( ) the weight function given by Equation (A21). Due to the complexity of the log likelihood function ( ), we perform its maximization numerically, using the function of R.

Appendix C. The Takagi-Sugeno-Kang (TSK) Fuzzy Model
Ying [133] demonstrated that the general output of a TSK fuzzy model can uniformly approximate any continuous function to arbitrarily high precision. In the present allometric examination settings, a TSK fuzzy model is intended to provide a device for the identification of the mean response ( ) associated with the regression model of Equation (4), which involves the log-transformed response expressed as a nonlinear function of a descriptor , namely: where and take values on domains and U respectively. In order to elaborate on the construction of a TSK fuzzy model, we first need to bring up an abstract structure known as a general fuzzy inference system. For that aim, we introduce a set containing a number of linguistic terms Φ ( ) characterizing the input variable , formally: Each linguistic term Φ ( ) is associated with a membership function Φ ( ) which, by setting a mapping Φ ( ): → [0,1], characterizes Φ ( ) as the fuzzy set: where , , … , stand for the values that takes on. We use the symbol to denote the collection of membership functions describing the variable , that is, Likewise, the pair ( , ) will stand for what is known as a fuzzy partition of the input domain .
Respectively, for = 1, 2, … , we designate linguistic terms Ψ ( ) for the output variable so we can consider a set of linguistic terms, i.e., Similarly, for each linguistic term Ψ ( ), we associate a membership function Ψ ( ), such that the mapping Ψ ( ) ∶ → [0,1] establishes the fuzzy set : where , … , denote the values that acquires. Concurrently, we have the collection of membership functions tied to , and concomitantly, we may also say that the pair ( , ) sets a fuzzy partition for the output domain . Additionally, for = 1, 2, … , , in and we advance a correspondence → Φ ( ) ( ), and → Ψ ( ) ( ), so we can contemplate antecedent conjunctions of the form, and a consequent backing inferential rules : : We may then envisage a general fuzzy inference system F as a construct that includes an application F: → characterized by the sets of fuzzy partitions ( , ) and ( , ), a set of inference rules = ⋃ and a defuzzification operator that associates to the fuzzy set ( ) in Equation (A36) a crisp value in .
In a TSK fuzzy inference system representation, we consider decision rules having an antecedent ( ) of the form given by Equation (C8) but with the consequent ( ) in expression (C9) taking a crisp functional form = ( ). That is, in the TSK fuzzy inference system we may envision rules having the form : : An important concept tied to a TSK fuzzy model is the firing strength ( ) of the antecedent ( ) of a rule , for = 1, 2 , … , . For a first order TSK fuzzy model we take: A normalized firing strength ( ) takes a form: It follows that, The final output of the TSK inference system is the weighted average of all rule outputs computed as: and intended to provide a proxy for the MCA response of Equation (C1). The identification of the structure and the estimation of parameters of the TSK fuzzy model are interrelated processes [136]. A first stage relies on Subtractive Clustering [122,123]. In order to determine regions in the input space with high point densities, initially a point with the highest number of neighbors is selected as the center for a group. Points in this group placed within a prespecified fuzzy radius are removed. Then, the algorithm finds again the point with the largest number of neighbors and so on until all points in are examined. This determines the number of decision rules since each cluster associates to one of them. This stage also produces parameter estimates for the ( ) membership functions characterizing the fuzzy sets of the antecedents of the rules.
This determines estimated forms of the normalized firing strength factors ( ). A second stage of the identification task is achieved by placing these weight factors in Equation (C15) in order to obtain parameter estimates for the consequents of the rules ( ). This is usually achieved using recursive least squares techniques [124,125].
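The two-stage construction described above can be sketched as follows, under several assumptions that are not confirmed by the text: a single input variable, Gaussian membership functions (so the firing strength of a rule is simply its membership value), cluster centers and spreads supplied by hand rather than by subtractive clustering, and batch least squares in place of recursive least squares. All function names are hypothetical. The last function illustrates the break point criterion mentioned in the Discussion, taking the break point of a two-rule model as the input value where the two normalized firing strengths intersect.

```python
import numpy as np

def normalized_firing_strengths(x, centers, spreads):
    """Gaussian firing strengths (assumed form), normalized to sum to one per point."""
    x = np.asarray(x, dtype=float)[:, None]
    c = np.asarray(centers, dtype=float)[None, :]
    s = np.asarray(spreads, dtype=float)[None, :]
    w = np.exp(-0.5 * ((x - c) / s) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def fit_consequents(x, v, centers, spreads):
    """Batch least squares for linear consequents f_i(x) = c_i + d_i * x."""
    x = np.asarray(x, dtype=float)
    phi = normalized_firing_strengths(x, centers, spreads)
    X = np.hstack([phi, phi * x[:, None]])
    theta, *_ = np.linalg.lstsq(X, np.asarray(v, dtype=float), rcond=None)
    q = len(centers)
    return theta[:q], theta[q:]

def tsk_output(x, centers, spreads, c, d):
    """First-order TSK output: firing-strength-weighted average of rule outputs."""
    x = np.asarray(x, dtype=float)
    phi = normalized_firing_strengths(x, centers, spreads)
    rules = np.asarray(c)[None, :] + np.asarray(d)[None, :] * x[:, None]
    return (phi * rules).sum(axis=1)

def breakpoint_two_rules(centers, spreads, n_grid=2001):
    """Grid search between the two centers for equal normalized firing strengths."""
    lo, hi = sorted(centers)
    grid = np.linspace(lo, hi, n_grid)
    phi = normalized_firing_strengths(grid, centers, spreads)
    return grid[np.argmin(np.abs(phi[:, 0] - phi[:, 1]))]

# Illustration with hypothetical values: data generated by a two-rule model.
centers, spreads = [0.0, 2.0], [0.5, 0.5]
c_true, d_true = np.array([1.0, 0.5]), np.array([2.0, 3.0])
x = np.linspace(-1.0, 3.0, 200)
v = tsk_output(x, centers, spreads, c_true, d_true)
c_hat, d_hat = fit_consequents(x, v, centers, spreads)
bp = breakpoint_two_rules(centers, spreads)
```

With equal spreads the intersection falls midway between the two centers; with unequal spreads it shifts toward the center of the narrower membership function.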

Appendix D. Model Performance Metrics
In addition to the AIC and indices, model assessment here is mainly based on the SEE, MPE and mean percent standard error (MPSE) indices, which rely on statistics of squared and absolute deviations of observed from predicted values. According to Parresol [137], the use of the SEE, MPE and MPSE statistics as model performance metrics was first recommended by Schlaegel [138], and they have subsequently been used by Zeng [130]. The related formulae and explanations follow.
Akaike information criterion (AIC): Lin's concordance correlation coefficient: with standing for Pearson's correlation coefficient. The index estimates through, where, Standard error of estimation (SEE): Mean prediction error (MPE): Mean percent standard error (MPSE): The AIC index allows comparing performance of different candidate models that fit a set of data. The model with the lowest AIC value is considered the best among competitors. The AIC index establishes a compromise between the goodness of fit of a model and its complexity, which is expressed through linked log-likelihood and number of parameters as a way to penalize inclusion of unnecessary ones. As it is based on information entropy, an AIC index is often interpreted, as an estimate of the information lost when a model is used to represent the process that generates the data. Lin's concordance correlation coefficient ( ) measures how well one variable (Y) reproduces another (X), that is, it represents a measure of the similarity (or agreement) between the two variables. This index can be estimated, with sample sizes of at least ten pairs (x, y). The R square index ( ) also called determination coefficient interprets through the ratio (SS due to regression/Total SS corrected for the mean) and is mainly intended as a measure of closeness between response values and adjusted linear regression models. This index also measures the proportion of the total variation of the response, around the average, explained by the model. The coefficient of determination takes values between zero and one. When attains its maximum value, one, the response variable is fully explained by the predictive variables of the fitted linear regression model. According to Parresol [137] using the coefficient of determination as a fit index aimed to compare performance of biomass models was first suggested by Schlaegel [138]. 
Nevertheless, for non-linear models a high value is not necessarily associated with high reproducibility strength. The standard error of estimation (SEE) is widely used in statistical texts and widely reported by statistical software. This index provides a global assessment of the goodness of fit of a model to observed data, as it measures the accuracy of the ( ) predictions produced by a fitted regression model. This index takes non-negative values. When the SEE attains its minimum value of zero, the observed values of the response coincide with the fitted mean response function, meaning that the model reproduces the observed values exactly.
The MPE, which is now used to determine the goodness of fit of a model, is a standardized version of the coefficient of variation = ( / ) × 100 expressed as a percentage, as proposed by Schlaegel [138]. The MPSE bears a measure of the average absolute relative error, expressed as a percentage. This model assessment index recommended by Schlaegel [138] was previously suggested by Meyer [139] as a measure of the absolute deviation of the expected and predicted responses, relative to the size of the prediction (| − | ) expressed as a percentage average.
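Since the formulae for these indices are not legible above, the following sketch uses forms that are commonly given for them, which is an assumption rather than a transcription of the authors' expressions: AIC = 2p − 2 ln L, Lin's coefficient as 2 s_xy / (s_x² + s_y² + (x̄ − ȳ)²), SEE with n − p degrees of freedom, MPE as t · (SEE / ȳ) / √n × 100, and MPSE as the mean of |y − ŷ| / ŷ × 100. The default `t_value` of 1.96 and all function names are hypothetical choices for illustration.

```python
import numpy as np

def aic(log_likelihood, n_params):
    """Akaike information criterion from a maximized log-likelihood."""
    return 2.0 * n_params - 2.0 * log_likelihood

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (sample moments with 1/n)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

def see(y, y_hat, n_params):
    """Standard error of estimation with n - p degrees of freedom."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - n_params))

def mpe(y, y_hat, n_params, t_value=1.96):
    """Mean prediction error in percent (assumed form; t_value is a placeholder)."""
    y = np.asarray(y, dtype=float)
    return t_value * (see(y, y_hat, n_params) / y.mean()) / np.sqrt(len(y)) * 100.0

def mpse(y, y_hat):
    """Mean percent standard error: average absolute error relative to prediction."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs((y - y_hat) / y_hat)) * 100.0

# Small illustration with made-up observed and predicted values.
y_obs = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([11.0, 19.0, 31.0, 39.0])
see_val = see(y_obs, y_pred, n_params=2)
mpse_val = mpse(y_obs, y_pred)
```

Lower values of AIC, SEE, MPE and MPSE indicate better performance, whereas the concordance coefficient approaches one as agreement improves.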

Appendix E. Simulation Assay on ( )
In this assay, we considered the case = 0 of the MCA form ( ) as represented by Equation (12). This is associated with constant scaling functions, ( ) = and ( ) = . The resulting model is Huxley's formula of simple allometry given by Equation (1), from which replicates were simulated. The resulting data can be found in the supplementary file Hrep.txt. Figure E1a shows the spread of simulated ( ) ( ) replicates around the reference curve = exp( ). A fit of the model = exp( ) by direct non-linear regression to ( , ( ) ( )) data resulted in estimates of = 1.231 × 10 and = 2.753 ( = 0.9802). Figure E1b shows the spread of ( ) ( ) replicates around the fitted curve ( , ). Figure E2a displays the spread of ( ) ( ) replicates about the estimated mean response ( ) = + . This was then retransformed to arithmetical space to produce the estimated mean function ( | )exp ( + ( )) ( ) (cf. Equation (6)). For this trial, ( ) = exp( 2 ⁄ ), as corresponds to ~(0, ). This assumption was corroborated by applying an Anderson-Darling test (function adtest.m in Matlab). Figure E2b allows comparison of the retransformed ( , ( | )) and directly fitted ( , ) curves. The spread shown suggests reliable agreement between the observed values and the retransformed mean response ( | ). No biased results can be attributed to the use of a log-transformation procedure in this study case. Table E1 presents model performance metrics, which show remarkable agreement between these surrogates. As a result, it can be inferred that the log-transformation step in the TAMA approach does not induce a curvature deformation in geometrical space. In order to assess error structure effects, we considered the case of non-lognormally distributed replicates. We maintained the reference curve = exp( ) as above, but, according to the procedure around Equation (44), we acquired values = (′ ′, ) * (′ ′, , ), that is, expressed as a product of exponentially ( = 0.1) and logistically ( = 0, = 0.1) distributed random numbers. Then, we formed replicates ( ) ( ) = exp( + ).
In this assay, we maintained × = 2500 replicates. Supplemental file Hnonrep.txt includes acquired data. Figure E3a shows the spread of replicates ( ) ( ) about the reference curve.
Fitting Huxley's model to ( , ( ) ( )) pairs by DNLR produced = 1.403 × 10 and = 2.739 ( = 0.977). Figure E3b compares the reference and fitted curves. Again, we can visualize remarkable agreement. Figure E4a shows the spread about the fitted line ( ) = + . Retransforming by Equation (6) produced ( | ) = exp( + ) ( ), with ( ) produced by the criterion around Equation (7), since is not normally distributed. Figure E4b compares ( | ) and the reference curve. We can assert that, in spite of the non-lognormally distributed multiplicative error, a log-scales identification procedure bears reliable results in this case. Table 2 presents model performance metrics. These statistics corroborate the remarkable agreement shown in Figure E4b. We can then assert that, in spite of the regression error failing to be normally distributed, the TAMA device produces a dependable fit. Therefore, whenever Huxley's model of simple allometry is consistent in direct scales, assuming linearity in geometrical space will entail consistent analysis, provided a proper CF form is chosen. As a result, it can be inferred that, regardless of the error structure, the log-transformation step in the TAMA approach does not induce curvature in geometrical space.
Figure caption (partial): ( | ) and power function = fitted in direct arithmetical space by means of nonlinear regression to simulated data ( , ( ) ( )).
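The log-scale fit and retransformation steps of this assay can be illustrated with a minimal sketch. It assumes Huxley's formula in the form y = β x^α with multiplicative error, a correction factor exp(σ²/2) for normally distributed log-scale errors (as stated above), and, as one common non-parametric alternative when normality fails, Duan's smearing estimate; whether the latter coincides with the criterion around Equation (7) is an assumption. The parameter values, the uniform covariate range and all names are hypothetical and are not those used to produce Hrep.txt or Hnonrep.txt.

```python
import numpy as np

def fit_tama(x, y):
    """Straight-line fit in log-log scales; returns intercept, slope, residuals, variance."""
    u, v = np.log(x), np.log(y)
    slope, intercept = np.polyfit(u, v, 1)  # highest-degree coefficient first
    residuals = v - (intercept + slope * u)
    sigma2 = residuals.var(ddof=2)  # two estimated parameters
    return intercept, slope, residuals, sigma2

def cf_lognormal(sigma2):
    """Correction factor for normally distributed log-scale errors."""
    return np.exp(sigma2 / 2.0)

def cf_smearing(residuals):
    """Duan's smearing estimate (assumed alternative for non-normal errors)."""
    return np.mean(np.exp(residuals))

def retransform(x, intercept, slope, cf):
    """Mean response in arithmetical scales, including the correction factor."""
    return cf * np.exp(intercept) * np.asarray(x, dtype=float) ** slope

# Hypothetical reference parameters and 2500 simulated replicates.
rng = np.random.default_rng(0)
BETA, ALPHA, SIGMA = 0.01, 2.5, 0.1
x = rng.uniform(1.0, 10.0, size=2500)
y = BETA * x**ALPHA * np.exp(rng.normal(0.0, SIGMA, size=x.size))

intercept, slope, residuals, sigma2 = fit_tama(x, y)
y_hat = retransform(x, intercept, slope, cf_lognormal(sigma2))
cf_alt = cf_smearing(residuals)
```

When the data do conform to simple allometry, the two correction factors are expected to be close to each other and the retransformed curve should track the reference power function, in line with the conclusion drawn above.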
We analysed the performance of a Η ( | ) proxy fitted directly in arithmetical space by means of the dtsk_model_fit.m routine provided in the supplementary files section. For this assay, the fuzzy clustering parameter was set at a value radius = 2. In both the and simulations, this returned a heterogeneity index of = 2. Figure E5a,b display the resulting fits. Figure E5c,d display the agreeing dynamics of the κ(x) index. This corroborates that no curvature effects could be expected in geometrical space for this data. Table A1 presents performance metrics for proxies fitted on data simulated through a normal error structure. Table E2 pertains to results on data simulated based on a non-normal error structure. The results suggest that ( | ) adapts its complexity as required by the data.