Köln-Timişoara Molecular activity combined models toward interspecies toxicity assessment.

Aiming to provide a unified picture of computed activity and quantitative structure-activity relationships, the so-called Köln (ESIP, ElementSpecificInfluenceParameter) model for activity and the Timişoara (Spectral-SAR) formulation of QSAR were pooled in order to assess the toxicity modeling and inter-toxicity correlation maps for aquatic organisms against paradigmatic organic compounds. The Köln ESIP model for estimating the toxicity of a compound is based on experimental measurements expressing the direct action of chemicals on the organism Hydractinia echinata, so that the structural influence parameters are reflected by the metamorphosis degree itself. As such, the calculation of the structural parameters is necessary for the correct evaluation and interpretation of the evolution of the M(easured) and C(omputed) values. On the other hand, the Timişoara Spectral-SAR analysis offers correlation models and paths for the H.e. species as well as for four other organisms, among which the toxicity may be interchanged by means of the same mechanism of action induced by certain common chemicals.


Introduction
The direct and immediate inclusion of chemical artifacts in the biological cycle is due in the first place to their solubility; the less soluble ones, i.e., those deposited as sediments, undergo various transformations with time, forming new derivatives that may in turn be involved in the same natural biological cycle [1]. However, in all cases the principal area of accumulation is the shallow marine water, where the effects can be detected immediately or followed in time with different investigation methods.
Hydractinia echinata, as an organism living in the European and North-American coastal waters, could be directly affected by the presence of chemical derivatives through interruption of the evolution cycle at the level of the larva to polyp metamorphosis [2].
The Hydractinia echinata metamorphosis stage was first used for monitoring toxicity in the testing of many anticonvulsants, through which it was established that the order of influence is identical to that obtained through treatment of the embryo in vitro [3]. The research continued by establishing various relationships between structure and reactivity for oil and oil products, alkanes, cycloalkanes, and aromatic compounds [4], as well as for a series of hydrocarbon derivatives, aliphatic alcohols, aliphatic amines, aminoalcohols [5], and phenols [6].
The Hydractinia echinata test system has already been shown to be applicable to very different series of derivatives or products, including pharmaceutical products for dentistry, natural extracts, detergents, dyes, etc. The essence of the so-called "Köln model" [5] is that the interactions of simple organic molecules with living cells, known from measured (M) values, are modeled at the level of molecular substructures by the introduced ESIP (ElementSpecificInfluenceParameter) parameters, which then give the computed (C) toxicity of the substances containing those substructures. On the other side, the recently developed Spectral-SAR, the "Timişoara QSAR model", allows for a mechanistic description of the specific molecular actions through combined reactivity-activity paths of interaction [7][8][9][10].
In this context, the present endeavor combines the ESIP and S-SAR models to advance a sort of "absolute" analysis of ecotoxicity, employing the computed activities and their spectral correlation, respectively, for an inter-species analysis on a common set of compounds. As such, having at hand a complex method that provides both the toxicity activity for the organisms (ESIP), without the need to undertake extensive experiments for measuring it, and the mechanistically revealed path of molecular action (Spectral-SAR) may constitute an advancement in ecotoxicological assessment through computational design and reasoning. In this way, in silico methods may eventually reveal the mechanisms of toxicity for a given set of toxicants and environmental hazards, while lowering the experimental costs.

Köln ESIP Model for Biological Activity
We determine the ESIP parameters on the basis of the measured values Mlog(1/MRC50) [mol/L] in order to calculate toxicity values Clog(1/MRC50) [mol/L] for untested derivatives [5]. The molecular structures always contain saturated hydrocarbon or aromatic substructures, so the first ESIP parameter corresponds to saturated carbon, ESIPc-sat, followed by aromatic carbon, ESIPc-ar, and by the organic function (alcohol, amino, etc.), ESIPorganic function. In the case of saturated hydrocarbons, ESIPc-sat has an average value of 0.50 log units, calculated from the measured values M and the numbers of saturated carbons C.
In this way, the toxicity of untested compounds can be calculated under the following assumptions: (i) the toxicity of a compound can be subdivided into that of its components (ESIPs) in such a way that the sum of these components gives the total toxicity value; (ii) these components (ESIPs) are identical in different substances; (iii) the ESIP components have a dynamical value (they depend on the number of determinations or are derived from newly available data) for one organism and one test-system, while varying between test-systems. However, if a deviation between the measured M and the calculated C values is observed, it indicates an overlooked interaction between different parts of the molecule, or it may indicate an activity of the substance that is specific to a certain biochemical pathway. Note that somewhat similar studies were carried out at the inter-species toxicity level with the aid of databases centered on a given species [11], although this limits the possibility of dynamically extending the molecular group toxicity from one organism to another [12,13], as the ESIP method is able to do.
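Assumption (i) above can be illustrated with a minimal Python sketch. Only the saturated-carbon increment of 0.50 log units is taken from the text; the other increments, the fragment labels, and the function names are hypothetical placeholders chosen for illustration, and the purely additive form without an intercept is itself an assumption.

```python
from typing import Dict

# Increments in log units. Only "C_sat" = 0.50 is quoted in the text;
# "C_ar" and "OH" are made-up placeholder values, not fitted ESIPs.
ESIP: Dict[str, float] = {
    "C_sat": 0.50,
    "C_ar": 0.30,   # hypothetical
    "OH": -0.40,    # hypothetical
}


def computed_toxicity(fragments: Dict[str, int],
                      esip: Dict[str, float] = ESIP) -> float:
    """Clog(1/MRC50) as the plain sum of fragment increments (assumption (i))."""
    return sum(esip[name] * count for name, count in fragments.items())


def deviation(measured: float, fragments: Dict[str, int],
              esip: Dict[str, float] = ESIP) -> float:
    """M - C; a large value hints at an interaction the increments overlook."""
    return measured - computed_toxicity(fragments, esip)


# A saturated C6 hydrocarbon: six saturated carbons, no functional group.
hexane_like = {"C_sat": 6}
print(computed_toxicity(hexane_like))   # 3.0
print(deviation(3.2, hexane_like))      # difference between a made-up M and C
```

A nonzero deviation in such a scheme is exactly the signal the text describes: either the increments miss an intramolecular interaction, or the compound acts through a specific biochemical pathway.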

Timişoara Spectral-SAR Model
Since QSAR models aim at correlations between the concerned (congener) molecular structures and measured (or otherwise evaluated) activities, it appears natural to accommodate the structure part of the problem within quantum theory and its formalisms. In fact, a few quantum features are used within the present approach:
o Any molecular structural state (dynamical, since it undergoes interactions with organisms) may be represented by a ket state vector in the abstract Hilbert space, following the bra-ket Dirac formalism [14]; such states may be represented by any reliable molecular index or, in particular in our study, by hydrophobicity (LogP), polarizability (POL), and total optimized energy (E tot), so as to restrict ourselves to the so-called Hansch parameters, usually employed to account for the diffusion, electrostatic and steric effects, respectively, of molecules acting on organisms' cells.
o The (quantum) superposition principle, assuring that the various linear combinations of molecular states map onto the resulting state, here interpreted as the bio-, eco- or toxicological activity, e.g., ..., with |Y_0> meaning the free or unperturbed activity (when all other influences are absent).
o The orthogonalization feature of quantum states, a crucial condition providing that the superimposed molecular states generate a new molecular state (here quantified as the organism activity); analytically, the orthogonalization condition is represented by the bra-ket scalar product of two envisaged states (molecular indices); if it evaluates to zero, i.e., <bra|ket> = 0, then the convoluted states are said to be orthogonal (zero-overlapping) and the associated molecular descriptors are considered independent, and therefore suitable to be taken as eigen-states (of a spectral decomposition) of the resulting activity state, quantified by the degree to which their molecular indices enter the activity correlation.
Further details on the scalar product and related properties are given in Appendix A1, whereas in what follows the Spectral-based SAR correlation method (hereafter called Spectral-SAR) is summarized. Note that since molecular states are usually represented by ket vectors, which are a generalization of customary (classical) vectors, all formalisms are consistently developed accordingly. In this regard, the bra-ket formalism is more than a simple notation: it is a reliable formalism since, for instance, it differentiates between the dual and direct spaces to which the bra and ket vectors are attributed, respectively, with insightful consequences for the space-time evolution of a system, a matter not conveyed by the simple classical vectorial notation. It is not a complication of reality but a close representation of it: the molecular descriptors belong to a given molecular state that has to be included as a component of the quantum (ket) vectors carrying the specific structural information, a feature not fulfilled by simple classical vectors.
Therefore, the adopted vectorial formalism goes beyond simple notation: each time we write a ket vector represented by a structural index, we in fact see a generalized electronic (hyper-molecular) state, defined as the global state collecting the values of one descriptor for all concerned congener molecules. Now, a set of N molecules studied against observed/recorded/measured biological activity is represented by means of their M structural indicators (the states); all the N × M input information may be expressed by the column vectors of Table 1 and correlated according to the generic scheme of Equations (1a)-(1d): ... was added to account for the free activity term.
In order for Equation (1b) to represent a reliable model of the given activities, the assumed hyper-molecular states (indices) should constitute an orthogonal set, a constraint with a consistent quantum mechanical basis, as described above. However, unlike other important studies addressing this problem [15][16][17], the present Spectral-SAR [7] assumes the prediction error vector to be orthogonal to all others, since it is not known a priori, before any correlation is made. Moreover, Equations (1a), (1b), and (1c) imply that the prediction error vector has to be orthogonal to all known descriptors (states) of the predicted activity, thereby assuring the reliability of the present ket states approach. In other terms, conditions (1c) and (1d) agree with Equation (1a) in the sense that the prediction error vector and the predicted activity belong to disjoint (thus orthogonal) Hilbert (sub)spaces; or, even more, one can say that the Hilbert space of the observed activity |Y_OBS> may be decomposed into predicted and error Hilbert sub-spaces of states that are independent of each other. Therefore, within the Timişoara Spectral-SAR procedure the very first step consists in the orthogonalization of the prediction error on the predicted activity and on its predictor states, while the remaining algorithm does not seek to minimize the errors but to produce the ideal correlation between |Y_PRED> and the given descriptors. Next, the Gram-Schmidt orthogonalization scheme is applied through construction of the appropriate set of descriptors by means of the established iteration of Equation (2) [16,18,19], providing the orthogonal correlation of Equation (3). Remarkably, while available studies dedicated to the orthogonality problem usually stop at this stage, the Spectral-SAR uses it to provide the solution for the originally sought correlation of Equation (1b), having the prediction error vector orthogonal to the predicted activity and to all its predictor states of Table 1.
This can be achieved by grouping Equations (2) and (3) so that the system of all descriptors of Table 1 is written in terms of orthogonal descriptors, Equation (4). Now, when the determinant of Equation (5) is expanded on its first column, and the result is rearranged so as to have |Y_PRED> on the left side and the rest of the states/indicators on the right side, the sought QSAR solution for the initial observed-predicted correlation problem of Equation (1a) is obtained as the Spectral-SAR vectorial expansion (which justifies the name "spectral"), without any further need to minimize the prediction error vector, this stage being absorbed in its orthogonal behavior with respect to the predicted activity.
In fact, the Spectral-SAR procedure uses the double conversion idea: one forward, from the given problem of Equations (1a)-(1d) to the orthogonal one of Equation (3) in which the error vector has no manifestation; and a backwards one, from the orthogonal to the real descriptors by employing the system (4) determinant (5) expansion as the QSAR solution.
It is worth stressing that the present QSAR/Spectral-SAR equations are totally delivered from the (analytical) determinant (5) and not computationally restricted to the inverse matrix product as prescribed by the fashioned statistical Pearson approach [20]. Moreover, the Spectral-SAR algorithm is invariant also upon the order of descriptors chosen in orthogonalization procedure, providing equivalent determinants no matter how its lines are re-derived, an improvement that was not previously achieved by other available orthogonalization techniques [15,17].
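The forward and backward conversions described above can be sketched in plain Python. This is not the determinant expansion of Equation (5); it is a projection-based sketch that assumes the same ingredients (Gram-Schmidt orthogonalization of the descriptor columns, then a return to the original descriptors), and the function name `gram_schmidt_fit` and the data are hypothetical. Its residual is orthogonal to every descriptor column by construction, which is the property the text requires of the prediction error vector.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Scalar product <u|v> of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_fit(columns: Sequence[Sequence[float]],
                     y: Sequence[float]) -> Tuple[Vector, Vector]:
    """Fit y on the given columns via Gram-Schmidt orthogonalization.

    columns[0] is expected to be the unit column (free activity term).
    Returns the coefficients on the ORIGINAL columns and the predicted y.
    """
    m = len(columns)
    ortho: List[Vector] = []                 # orthogonal vectors |Omega_k>
    r = [[0.0] * m for _ in range(m)]        # r[i][k] = <Omega_i|X_k>/<Omega_i|Omega_i>
    # Forward step: orthogonalize the descriptors one by one.
    for k, x in enumerate(columns):
        omega = [float(v) for v in x]
        for i, o in enumerate(ortho):
            r[i][k] = dot(o, x) / dot(o, o)
            omega = [a - r[i][k] * b for a, b in zip(omega, o)]
        ortho.append(omega)
    # Projection of the activity on each orthogonal vector.
    w = [dot(o, y) / dot(o, o) for o in ortho]
    # Backward step: express the result in the original descriptors.
    b = [0.0] * m
    for i in range(m - 1, -1, -1):
        b[i] = w[i] - sum(r[i][k] * b[k] for k in range(i + 1, m))
    y_pred = [sum(b[k] * columns[k][n] for k in range(m)) for n in range(len(y))]
    return b, y_pred


# Made-up data: five "molecules", two descriptors plus the unit column.
ones = [1.0] * 5
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y_obs = [4.1, 2.9, 11.2, 9.8, 18.0]

coeffs, y_pred = gram_schmidt_fit([ones, x1, x2], y_obs)
residual = [o - p for o, p in zip(y_obs, y_pred)]
print(coeffs)
print([dot(residual, c) for c in (ones, x1, x2)])  # all close to zero
```

Because the residual is orthogonal to all columns, the coefficients coincide with those of an ordinary multi-linear least-squares fit, which is consistent with the statement that the method reproduces the multi-linear QSAR analysis.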
However, besides the effectiveness of the S-SAR methodology in reproducing the old-fashioned multi-linear QSAR analysis [7,21], one of its advantages is the possibility of introducing the so-called (vectorial) norms (see Appendix A1) associated with either experimental (measured or observed) or predicted (computed) activities, Equation (6). They provide a unique assignment of a number to a specific type of correlation, i.e., they perform a sort of final quantification of the models. Moreover, the activity norm given in Equation (6) opens the possibility of replacing the classical statistical correlation factor of Equation (7) [21] with a new index of correlation, introduced as the so-called algebraic S-SAR correlation factor (or R-algebraic, abbreviated RA), through the ratio of the predicted to observed norms, Equation (8) [22,23]. It has the meaning of the realization probability with which a certain predicted model approaches the observed activity throughout all of the employed molecules (in the hyper-molecular states of activities), see Appendix A2.
With this interpretation the algebraic correlation conceptually departs from the statistical one in that the latter accounts for the degree to which each computed individual molecular activity approaches the mean activity of the N molecules, while the former evaluates the (hyper-molecular) degree of overlap of the predicted and observed activity norms (viewed as the "amplitudes" of the intensity of the molecule-organism interaction). In this respect it seems that the algebraic analysis is better suited to environmental studies in which the global rather than the local effect of a series of toxicants is evaluated on specific species and organisms.
In fact, this new correlation factor definition compares the vectorial lengths of the predicted activity against the measured one, thus being an indicator of the extent with which certain computed property or activity approaches the "length" of the observed quantity.
Moreover, it was already shown that the algebraic correlation factor of Equation (8) systematically furnishes higher and more insightful values than its statistical counterpart [21,24], which advances it as a tool for correlation analysis on a narrow interval of data, where the statistical meaning is naturally lost.
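The two correlation factors can be compared numerically with a short Python sketch. The algebraic factor follows the description in the text (ratio of the predicted to the observed Euclidean norms); the statistical factor is written in a common textbook form, which is an assumption here since Equation (7) itself is not reproduced. The data are made up, with the predicted values taken from a least-squares line through the observed ones.

```python
import math
from typing import Sequence


def norm(v: Sequence[float]) -> float:
    """Euclidean length of an activity vector."""
    return math.sqrt(sum(x * x for x in v))


def algebraic_r(y_obs: Sequence[float], y_pred: Sequence[float]) -> float:
    """RA: ratio of the predicted to the observed norm."""
    return norm(y_pred) / norm(y_obs)


def statistical_r(y_obs: Sequence[float], y_pred: Sequence[float]) -> float:
    """R in the assumed form sqrt(1 - SS_res / SS_tot)."""
    mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean) ** 2 for o in y_obs)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Made-up activities; y_fit is the least-squares line 1.1 + 0.6 * x, x = 0..3.
y_measured = [1.0, 2.0, 2.0, 3.0]
y_fit = [1.1, 1.7, 2.3, 2.9]

ra = algebraic_r(y_measured, y_fit)
r = statistical_r(y_measured, y_fit)
print(round(ra, 4), round(r, 4))
```

For this toy fit RA comes out closer to 1 than R, in line with the systematic behavior reported in the text; the sketch makes no claim about how large the gap is on real toxicity data.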
Even more, in the terms of the "quantum spectral" formalism, one can say that algebraic investigation provides the "excited" states of an activity modeling, while the statistical approach deals with "ground state" or lower states of correlation. Consequently, for completeness, a proper quest of structure-activity models should include both of these stages of molecular SAR modeling.
Going further towards extracting the mechanistic information from the Spectral-SAR norms and correlation factors, we can advance the so-called least path principle, Equation (9), applied to successively connected models with different correlation dimensions: it starts from one dimension, with a single structural indicator correlation, say A_1, and proceeds to the models with the maximum number of correlation factors, say A_M (i.e., containing M indicators, see Table 1) [7][8][9][10]. Since each of these models is now characterized by its predicted activity norm |Y_PRED> along with the algebraic (RA) and/or statistical (R) correlation factors, the elementary paths of Equation (9) are constructed as the Euclidean measure between two consecutive models (endpoints), Equation (10) [7][8][9][10][22][23][24]. It is noteworthy that the formal Equation (9) has to be read as searching for the combination of paths on the left side that provides the minimum value on the right side; it is used as the tool for deciding the hierarchy among all (ergodic) possible endpoint-linked paths, with the important consequence of picturing the mechanistic and causal evolution of the structural influences that trigger the observed effects.
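A minimal Python sketch of the path search follows. It assumes that each model is a point in a (norm, correlation factor) plane and that the length of an elementary path is the Euclidean distance between two consecutive points; this reading of Equation (10) is an assumption, and all numbers and names below are hypothetical. Models are labeled by the descriptors they contain, e.g. (2,) for |2> and (2, 3) for |2,3>.

```python
import math
from itertools import product
from typing import Dict, List, Tuple

Label = Tuple[int, ...]
Point = Tuple[float, float]  # (activity norm, correlation factor)

# Hypothetical endpoints for one species and one correlation frame.
MODELS: Dict[Label, Point] = {
    (1,): (10.0, 0.90), (2,): (11.0, 0.95), (3,): (9.0, 0.85),
    (1, 2): (10.8, 0.97), (1, 3): (10.5, 0.92), (2, 3): (11.5, 0.965),
    (1, 2, 3): (12.0, 0.98),
}


def segment(a: Point, b: Point) -> float:
    """Euclidean distance between two consecutive models."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def path_length(path: Tuple[Label, ...], models: Dict[Label, Point]) -> float:
    """Sum of the elementary segments along a path of models."""
    return sum(segment(models[p], models[q]) for p, q in zip(path, path[1:]))


def all_paths(models: Dict[Label, Point]) -> List[Tuple[Label, ...]]:
    """Every one-, two-, three-descriptor chain (3 x 3 x 1 = 9 paths here)."""
    singles = [k for k in models if len(k) == 1]
    pairs = [k for k in models if len(k) == 2]
    full = [k for k in models if len(k) == 3]
    return list(product(singles, pairs, full))


def shortest_path(models: Dict[Label, Point]) -> Tuple[Label, ...]:
    """The path with minimum total length, as the least path principle asks."""
    return min(all_paths(models), key=lambda p: path_length(p, models))


best = shortest_path(MODELS)
print(best, round(path_length(best, MODELS), 4))
```

With these made-up endpoints the shortest chain starts from the single-descriptor model |2>, which would then be read as the dominant structural influence.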
This methodology was successfully applied in ecotoxicology [7,8,24] and for designing the behavior of the species interactions within a test battery [23], and it promises to furnish an adequate framework also for the present (and future) interspecies analysis.
The data of Table 2 are modeled as QSARs for each species in both Mlog and Clog modes, with the help of the Spectral-SAR determinant (5), while reporting the algebraic norms and correlations computed with Equations (6) and (8), respectively, side by side with the statistical correlation coefficients of Equation (7). The results are listed in Tables 3 and 4 for the employed Mlog and Clog-ESIP data of Table 2, respectively. However, in order to assure the reliability of the computed models, the so-called Topliss-Costello rule was considered, i.e., building models with a ratio of about five activity points per correlating/structural variable [25].
Aiming to provide the mechanistic maps of action for the targeted species, the minimization principle of spectral paths given by Equations (9) and (10) is considered among all possible ways of connecting endpoints from each category of models (i.e., with one, two or three dependency factors). Tables 5 and 6 present all these endpoint paths for the Mlog and Clog activities, computed with Equations (6)-(8) and (10) by processing the data of Tables 3 and 4, respectively.
[Table 2 caption, partly recovered: ... [5], for compounds nos. 13-21 from Ref. [6], new data for the rest; the Hansch molecular parameters, namely hydrophobicity (LogP), polarizability (POL) and the steric optimized total energy (E tot), were computed with the HyperChem environment [26].]
[Table caption, partly recovered: ... (6) and (8), and by Pearson statistical correlation (R) of Equation (7), for all possible mono-, bi-, and all-end-points, respectively. The referential algebraic norms of the considered species were estimated with the aid of Equation (6) from the Mlog input toxicity data of Table 2.]
However, in order to identify the shortest paths in each category of endpoint connections, according to the prescription given by Equation (9), the following rules are applied:
a) the first choice is the overall minimum path in a given column of Tables 5 and 6 (for either statistical or algebraic correlation);
b) if the overall minimum is reached by several equivalent paths (as is the case of the Mlog-algebraic column for H.e. in Table 5, for instance), the minimum path is taken as the one connecting the starting endpoint with the closest endpoint in the sense of norms (for H.e./Mlog, for example, the norm of the |2> state is closest to that of the |2,3> state, as compared with |1,2> and |1,3>; see the Spectral-SAR norm column of Table 3);
c) the overall minimum path sets the dominant hierarchical path in assessing the mechanistic mode of action towards the given/measured activity; it is called the alpha path (α);
d) once the alpha path has been set, the next minimum path is sought such that its starting endpoint differs from the one already involved in the alpha path (that is, if the starting model of the established alpha path for H.e./Mlog corresponds to the |2> state, the next path to be identified will originate on either the |1> or the |3> model/state);
e) the remaining minimum paths are identified by the same rules and are called the beta and gamma paths, β and γ, respectively;
f) at the end of this procedure each mode of action is to be "touched" only once, except for the final endpoint state {|1,2,3>}, which can present degeneracy, i.e., it may be found with the same influence at the end of various paths, herein called degenerate paths (e.g., the states |1,2,3>, |2,1,3>, and |3,1,2> for Hydractinia echinata and Tetrahymena pyriformis at the end of their toxicity paths in Table 5). Such behavior suggests the important idea that degenerate paths, which differ in their start and intermediate states but end with the same ordering of influences, e.g., the state |2,1,3> of Table 5 (with "1" for LogP, "2" for POL, and "3" for E tot, see Tables 3 and 4), provide a weaker contribution to the recorded activity, since two paths have to produce the same (final) effect for it to be activated; this is nevertheless one remarkable mechanistic consequence of the present combined (algebraic or statistical) correlations with the minimization (optimization) principle applied to the spectral path lengths through Equations (6)-(10);
g) the alpha, beta and gamma paths can easily be identified for the algebraic and statistical treatments in Tables 5 and 6, where they are marked accordingly; the degeneracy behavior is readily verified in Table 5, where the alpha path is found as the only (non-degenerate) path out of all possible ones.
Of course, the same rationalization applies to the alpha path of Table 6, which however displays the trivial situation in which no degeneracy is recorded, owing to the restricted set of structural parameters considered for activity modeling, since fewer data are available for the Pimephales promelas (P.p.) and Vibrio fischeri (V.f.) species in Table 2, in accordance with the Topliss-Costello rule specified above.
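The selection rules a) and c)-f) can be sketched as a greedy procedure in Python. The path lengths below are hypothetical, each path is keyed by its starting and intermediate models only (the final |1,2,3> state is shared by all), and the reading of rule f) as "no start and no intermediate model is reused" is an assumption; the tie-breaking rule b) is not implemented.

```python
from typing import Dict, List, Tuple

PathKey = Tuple[str, str]  # (starting model, intermediate model)

# Hypothetical path lengths for one species and one correlation frame.
LENGTHS: Dict[PathKey, float] = {
    ("1", "12"): 2.1, ("1", "13"): 2.0, ("1", "23"): 2.4,
    ("2", "12"): 1.4, ("2", "13"): 2.0, ("2", "23"): 1.0,
    ("3", "12"): 3.3, ("3", "13"): 3.1, ("3", "23"): 3.0,
}


def rank_paths(lengths: Dict[PathKey, float]) -> List[Tuple[str, PathKey, float]]:
    """Pick alpha, beta, gamma: repeatedly take the shortest remaining path
    whose start and intermediate models were not used by an earlier pick."""
    chosen: List[Tuple[str, PathKey, float]] = []
    used_starts, used_mids = set(), set()
    for label in ("alpha", "beta", "gamma"):
        candidates = {
            key: value for key, value in lengths.items()
            if key[0] not in used_starts and key[1] not in used_mids
        }
        if not candidates:
            break
        best = min(candidates, key=candidates.get)
        chosen.append((label, best, candidates[best]))
        used_starts.add(best[0])
        used_mids.add(best[1])
    return chosen


hierarchy = rank_paths(LENGTHS)
for label, (start, mid), length in hierarchy:
    print(f"{label}: |{start}> -> |{mid}> -> |123>  length {length}")
```

The greedy order matters: the gamma path is simply what is left once alpha and beta have claimed their models, so it can be much longer than other unused candidates.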

Species Toxicities |Y>
Now, the interspecies analysis may be unfolded employing the paths of Tables 5 and 6; for achieving that, a preliminary search for minimum paths at the inter-species levels for each Mlog/Clog and algebraic/statistic computational frames should be done first.
Note that for the Daphnia magna (D.m.) species, although specific paths would be superfluous with the uni-parameter models considered in Tables 3 and 4 owing to the Topliss-Costello rule (given the limited data of Table 2), its presence on the inter-species grids of Figures 1-4 may still be considered by means of a pseudo-path construction, based on reconsidering the minimum-searching rules a) and b) above for models with single-parameter dependency: h) models with higher correlation/probability (within either the statistical or the algebraic approach) enter first in the molecular mechanism of toxicity through their considered structural parameter, i.e., LogP, POL and E tot for the |1>, |2> and |3> endpoints, respectively. The inter-species quest is performed in two steps: the computational scheme is first fixed, e.g., the Mlog-algebraic one; then, among all Mlog-algebraic alpha paths for all species of Tables 5 and 6, the minimum is selected, i.e., α P.p. for the actual case. Then, the same procedure is carried out for the remaining beta and gamma paths within the fixed computational frame, and it is repeated for each possible Mlog/Clog-algebraic/statistic combination. The results are summarized in Table 7, which gives the interspecies ordering of models to be considered for a mechanistic Spectral-SAR analysis. As such, all possible inter- and intra-species influences are presented in Figures 1-4, emphasizing the primary (alpha), secondary (beta) and tertiary (gamma) paths of Tables 5 and 6 projected on the Mlog/Clog models for the algebraic/statistic correlations of Tables 3 and 4, respectively.
The inter-species diagrams reveal interesting features regarding both the correlation analysis and the inter-toxicity; as such, whether the treatment is algebraic or statistical, the Mlog and the Clog interspecies ecotoxicity diagrams each display the same endpoint ordering, as revealed by Table 7 and Figures 1 & 3 and 2 & 4, respectively. Beyond this, the algebraic approaches provide better systematic maps of inter-toxicity, judged by the minimum distribution of crossings among individual species' paths (alpha, beta or gamma), this being another realization of the least path principle, here at the level of interspecies paths; for instance, the H.e. paths within the algebraic framework RA Clog of Figure 2 are clearly singled out as having no crossing toxicity with other species that may be immersed in the same ecological area, while the carried toxicity may be transmitted to the V.f. species according to the statistical approach R Clog of Figure 4.
On the other hand, when comparing the measured (observed) results it is apparent that the species H.e. and V.f. are eco-toxically interconnected and somehow independent of the T.p. and P.p. environmental response in the RA Mlog picture of Figure 1. Yet, a different situation is noted for the statistical R Mlog analysis of Figure 3, according to which the H.e. species is highly mixed, from a toxicological point of view, with the species T.p. and P.p., but not with V.f., by means of either the first (alpha), second (beta) or third (gamma) toxicity paths.
Finally, the species D.m. is predicted to interact strongly (crossing at the level of the alpha paths) with the species T.p. in both the algebraic RA Clog and the statistical R Clog frameworks of Figures 2 and 4, owing to the specific influence of the POL and LogP parameters, respectively, identified on the grid region of their path crossings. Such a situation is no longer valid when the Mlog values are modeled, since the algebraic RA Mlog approach predicts a moderate inter-toxicity influence (through alpha-beta crossing paths due to the E tot or steric influence, see Figure 1), in contrast with no recorded interaction within the statistical R Mlog analysis (see Figure 3). Therefore, the molecular mechanistic models of toxicity may be proposed in four variants: based on the algebraic (RA) or statistical (R) correlation of either the measured (Mlog) or the ESIP-computed (Clog) toxicities.
The difference between the algebraic and statistical approaches relies on their inner definitions: while, for a data sample, the statistical framework quantifies the dispersion with respect to the data average (the data mean), the algebraic picture accounts for the dispersion of the extremes (the N-dimensional Euclidean lengths of the data rows); from this conceptual difference, although both are confined to the same realm between 0 and 1 as realization probabilities, the algebraic correlation records values closer to certainty for models classified as having high or even moderate statistical correlation values [7], being thus more suited to least path principle applications, as also shown by the current study.
[Figure caption, partly recovered: ... Tables 5 and 6 connecting the algebraic correlations of Table 3 across the ordered models of Table 7; the species are distinguished by distinct icons, while the alpha, beta and gamma paths are differentiated by the decreasing thickness of the lines joining the same icons; the D.m. pseudo-path (interrupted line on the map) is considered from the highest correlation model towards the lowest one in Table 3.]
In other words, if one is interested in the behavior of the sample data from the side of its "length" (the norm) rather than of its "average dispersion", the algebraic way should be chosen as the main correlation framework, while keeping the statistical counterpart available for comparison. This seems to be the case in ecotoxicological studies, where the intensity (the "length" or the "amplitude") of action for each sample's endpoint may be important [8].
On the other hand, the difference between the measured and computed values for ecotoxicological activities relies on the way the ESIP model is built from the available database, i.e., by collecting the measured molecule-upon-species values and then appropriately redistributing them among the various molecular fragments and groups of actual interest. A practical discussion on how well the actual ESIP data agree with the correlated structural data is addressed next. Table 2 presents the 28 tested combinations of H.e. with other organisms. In the case of derivatives nos. 1-25, the estimation of the calculated (C) values was possible through the use of the ESIP parameters distinctive for molecular substructures and specific to every test-system. The file (structure + ESIP algorithm) of the mentioned derivatives offers the possibility of analyzing different structure-reactivity relations across the organisms considered.

Discussion of ESIP
In the case of H.e., a specifically marine organism, pure water has the least toxicity and the lowest values of the structural parameters. The appearance of the hydrocarbon chain leads to increasing molecular toxicity, simultaneously with increased values of the structural parameters (logP and POL), in the case of alcohols (nos. 2-4, and 6) and also in the case of phenols (nos. 14, 15, and 21). Compound no. 21 has the highest toxicity, probably due to the geometry of the hydrocarbon radical, situated in the para position opposite to the phenolic hydroxyl, and to the proximity of the latter to the aromatic nucleus [27]. It is worth observing that steric impediments also bear on such an increase in the case of 2,6-diisopropylphenol. In contrast, 1,2,3-propanetriol possesses three OH groups, which give it a persistent hydrophilic character, and the molecule has a diminished toxicity in accordance with its logP value.
In the aromatic series (molecules nos. 16, 18, and 20) the structural parameters have closer values, but the toxicities are higher as a result of the possibilities of extended electronic conjugation. The existence of two identical hydroxyl groups (see molecule no. 18) in a highly symmetrical and flat molecule, as well as the absence of steric hindrance, are considered to be the premises of an extended p-π conjugation (the conjugation between the non-bonded p electrons of oxygen and the π electrons of the aromatic centre) according to a push-pull electronic mechanism: one OH group is electron donating and becomes positively charged, and the second one, an electron accepting group, becomes negatively charged. This phenomenon, which is probably alternating and permanent even in the absence of a reaction partner, induces a strong hydrogen bond donor character. The toxicity of 1,2,3-trihydroxybenzene (molecule no. 20) seems unexpected: it is essentially identical to that of 1,2-dihydroxybenzene, although its logP value is lower; this may be due to the push-pull mechanism of polyhydroxylic phenols [6].
In the case of 1,4-dihydroxybenzene, the conjugation is diminished through the inclusion of a t-butyl radical (no. 19) and the increased steric impediment at the phenolic hydroxyl. However, the toxicity decreases significantly, by three orders of magnitude, when one -OH group is replaced with a methyl (no. 14), methoxy (no. 17), chloro (no. 22) or amino (no. 24) moiety, although the logP parameter changes significantly. The fact that the toxicity values of the mentioned derivatives span a narrow domain of about 0.5 logarithm units relies on a certain similarity of stereo-electronic balance in the case of lower-substituted aromatic derivatives, such as 4-toluidine (no. 11) and 1,2-dichlorobenzene (no. 12), in agreement with other (unpublished) series of derivatives.
An example of the influence of functional groups is given by the 1,10-diaminodecane (no. 7) molecule, which has a smaller toxicity than expected from its hydrocarbon chain, although its POL value is large. According to the experimental results [5], the chain with 8-10 C atoms probably represents the hydrocarbon interface where the lipophilicity is manifested. Yet, the diffusion of molecules with a high number of C atoms through the cell membrane is "hindered", and the percentage of crossing molecules is diminished [28].
The presence of 4-methoxyazobenzene (no. 28) in Table 2 illustrates that new reaction mechanisms appear as a result of profound structural changes, emphasizing the ability of the H.e. test-system to analyze very different derivatives. This is the case of the azo function which, by means of enzymatic reduction, leads to the stoichiometric appearance of amines; the toxicity of such mixed combinations (the most frequent case in the environment) represents a fruitful and significant direction of investigation.
The ESIP-Köln model provides, although not in all cases, the possibility to assess to a great extent how well the real, measured (M) value agrees with its theoretically calculated (C) counterpart. In other words, if the calculated value (C) lies above the measured one (M), for instance, the difference can be assigned to the lipophilic character, represented here by logP, along with some electronic (POL) or steric (Etot) influences.
However, the ESIP values have been determined so as to approach the real/measured values through inclusion of the effects related to specific parameters of molecular structure. This is confirmed by the fact that the measured Mlog/MRC 50 and calculated Clog/MRC 50 values are in concordance with their numerical structural parameter counterparts in Table 2. This also indicates that the individual structural parameters or their combinations are specific and that the organism H.e. can be successfully employed as a suitable test-system for further toxicity determinations.
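The comparison between measured (M) and computed (C) values discussed above can be sketched as a simple residual check. The following is only a minimal illustration in Python, not the ESIP procedure itself: the function name `compare_m_c`, the tolerance of 0.5 logarithmic units, and the numeric values are hypothetical placeholders and are not entries of Table 2.

```python
from typing import List, Tuple


def compare_m_c(measured: List[float],
                computed: List[float],
                tol: float = 0.5) -> List[Tuple[float, str]]:
    """Return (residual, verdict) per compound, with residual = C - M.

    `tol` is an assumed agreement threshold in logarithmic units.
    """
    if len(measured) != len(computed):
        raise ValueError("measured and computed must have the same length")
    report = []
    for m, c in zip(measured, computed):
        residual = c - m
        if abs(residual) <= tol:
            verdict = "agree"
        elif residual > 0:
            verdict = "overestimated"   # C lies above M
        else:
            verdict = "underestimated"  # C lies below M
        report.append((residual, verdict))
    return report


# Hypothetical placeholder values, for illustration only.
m_vals = [3.0, 4.0, 5.0]
c_vals = [3.2, 5.0, 4.0]
report = compare_m_c(m_vals, c_vals)
for (residual, verdict), m, c in zip(report, m_vals, c_vals):
    print(f"M={m:.2f}  C={c:.2f}  C-M={residual:+.2f}  {verdict}")
```

Compounds flagged as over- or underestimated would be the ones for which the text suggests looking at logP, POL or Etot for an explanation.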

Conclusions and Outlook
There is already wide recognition of the problem posed by the ever-growing number of available chemicals with untested toxicity, in conjunction with the increased costs and limited time available for testing them before they enter mainstream production or are dispersed into the environment. Therefore, a demand naturally arises for in silico tools that provide computed activities from benchmark measurements and from the toxicities of individual molecular fragments; such studies should provide correlation paths showing how given toxicants may act on various cells or species. In moving towards such complex computational techniques for species and inter-species toxicity assignment, the present work combines the Köln-ESIP and Timişoara-Spectral-SAR models in a unified computational activity-correlation framework.
The Köln model for estimating the toxicity of a compound is based on the experimental measurement expressing the direct action of chemicals on the H.e. organism, so that the structural influence parameters are reflected by the degree of metamorphosis itself. As such, the calculation of the structural parameters is absolutely necessary for the correct evaluation and interpretation of the evolution of the M(easured) and C(omputed) values.
The present work evidences relatively simple rules relating structure and reactivity: the efficacy of aliphatic alcohols increases with the number of C atoms (a phenomenon characterized by the structural parameter logP) and diminishes with the appearance of new alcoholic groups (a phenomenon widely reflected by POL and Etot); the influence of the amino group in aliphatic amines predominates over the relative extent of the hydrocarbon chain [5]; the influence of the first methyl in the phenol case is negligible; the steric influence of the isopropyl radicals on the phenolic active centre in the 2,6-derivative is stronger (through an increased POL value) than in the case of the t-butyl substituent (for which the logP and POL values are diminished), etc.
In principle, the toxicity represents the synergetic effect of three structural influences on receptor binding: hydrophobic, electrostatic and steric molecular control. The difference in efficiency among derivatives with close parameters, such as those with 1,2-, 1,4- and 1,2,3-hydroxy groups, can be interpreted through a particular electronic mechanism [6]; on the other hand, the difference in toxicity between 1,4-dihydroxybenzene (electronic) and 4-(3',5'-dimethyl-3'-heptyl)phenol (steric) is well reflected by the numerical values of the computed parameters.
Finally, the Timişoara Spectral-SAR analysis offers the correlation models and the end-point paths both for the H.e. species and for four other organisms, among which the toxicity may be inter-changed by means of the molecular structural mechanisms of action induced by certain common (or under testing) chemicals.
Besides the fact that the Spectral-SAR algorithm was previously proven superior to the customary statistical approach in solving the paradoxical dichotomies that various statistical indices produce when considered together [24], it advances a reliable method of identifying which structural molecular parameter is more influential across multiple possible paths of activation of a bio- or ecotoxicological response, thus furnishing a useful computational mechanistic molecular method for QSAR studies [29]. When combined with the ESIP algorithm, it provides a complex inter-molecular/interspecies picture of toxicological transfer.
However, the correlation maps depend on the algebraic or statistical way of modeling the action, i.e., on assuming that the chemical-biological interaction in specific ligand-receptor binding is driven either by the intensity norm (the algebraic, vectorial picture) or by the average (the statistical, dispersive picture). At this point the algebraic vs. statistical issue remains open for further investigation through comparative single- and inter-species activities.
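The contrast between the norm-based (algebraic) and the average-based (statistical) pictures can be illustrated with a short numerical sketch. It rests on assumptions: the algebraic factor is taken here as the ratio of the norms of the predicted and observed activity vectors, since its defining equation is not reproduced in this section; the statistical factor is the usual dispersion-about-the-mean form; the function names and the data are hypothetical placeholders, not values from the present study.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def algebraic_correlation(y_obs, y_pred):
    """Assumed norm-based factor: ||y_pred|| / ||y_obs||."""
    return norm(y_pred) / norm(y_obs)


def statistical_correlation(y_obs, y_pred):
    """Dispersion-based factor: sqrt(1 - SS_res / SS_tot)."""
    mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean) ** 2 for o in y_obs)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


def fit_line(x, y):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical descriptor and activity values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [2.1, 2.9, 4.2, 4.8, 6.1]

b0, b1 = fit_line(x, y_obs)
y_pred = [b0 + b1 * xi for xi in x]

r_alg = algebraic_correlation(y_obs, y_pred)
r_stat = statistical_correlation(y_obs, y_pred)
print(f"algebraic: {r_alg:.4f}  statistical: {r_stat:.4f}")
```

For a least-squares fit with an intercept, as used here, the norm-based factor cannot fall below the dispersion-based one, because the squared norm of the observed vector is never smaller than its dispersion about the mean. The two pictures can therefore rank the same models differently.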
Consequently, the distance between two vectors may be written in terms of the norm of their difference. From Equation (A6), and also from the fact that the self-scalar product is positively defined, see Equation (A4), the distributivity and commutativity properties of the scalar product may be employed, for any real parameter t ∈ ℜ, to obtain equivalent expressions of a second-order inequality in t. This last inequality says that the second-order equation on its left-hand side has either no solution or two equal solutions; this condition is fulfilled when its discriminant is less than or equal to zero, which leads to the famous Cauchy-Schwarz inequality. The Cauchy-Schwarz inequality is successfully used in probability theory, variance theory and correlation factors. An application is given in the following as well.
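For reference, the standard argument behind this inequality can be written out as follows. This is a sketch for real vectors; the bra-ket notation for the scalar product and the vector symbols a and b are assumptions of the sketch and need not coincide with the notation or numbering of Equations (A4) and (A6).

```latex
\begin{align*}
d(a,b) &= \lVert a - b \rVert = \sqrt{\langle a - b \mid a - b \rangle}, \\
0 &\le \langle a + t\,b \mid a + t\,b \rangle
   = \langle a \mid a \rangle + 2t\,\langle a \mid b \rangle
     + t^{2}\,\langle b \mid b \rangle, \qquad t \in \mathbb{R}, \\
\Delta &= 4\,\langle a \mid b \rangle^{2}
          - 4\,\langle a \mid a \rangle\,\langle b \mid b \rangle \le 0, \\
\lvert \langle a \mid b \rangle \rvert &\le \lVert a \rVert\,\lVert b \rVert .
\end{align*}
```

The second line is non-negative for every real t, so the quadratic in t cannot have two distinct real roots, which is exactly the condition that its discriminant Δ be non-positive.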

A2. Algebraic Correlation Factor
One starts with the simple connection between the observed, predicted and error vectors of Equation (1a), specialized here to their individual elements: where "pe" stands as the abbreviation for "prediction error".
Then, while squaring relation (A11): and summing for all working N-molecules (of Table 1