Multivariate Analysis Applications in X-ray Diffraction

Multivariate analysis (MA) is becoming a fundamental tool for efficiently processing the large amounts of data collected in X-ray diffraction experiments. Multi-wedge data collections can increase data quality in the case of tiny protein crystals; in situ or operando setups allow changes in powder samples to be followed during repeated fast measurements; pump-and-probe experiments at X-ray free-electron laser (XFEL) sources provide structural characterization of fast photo-excitation processes. In all these cases, MA can facilitate the extraction of relevant information hidden in the data, opening the possibility of automatic data processing even in the absence of a priori structural knowledge. MA methods recently used in the field of X-ray diffraction are reviewed and described here, with hints about their theoretical background and possible applications. The use of MA in the framework of the modulated enhanced diffraction technique is described in detail.


Introduction
Multivariate analysis (MA) consists of the application of a set of mathematical tools to problems involving more than one variable in large datasets, often combining data from different sources to find hidden structures. MA decomposes data into simpler components, makes predictions based on models and recovers signals buried in noise.
The analysis of dependencies among data in several dimensions (or variables) can be traced back to the early work of Gauss on linear regression (LR), which was later generalized to more than one predictor by Yule and Pearson, who reformulated the linear relation between explanatory and response variables in a joint context [1,2].
The need to decompose data into simpler components and to simplify multivariate regression by using a few representative explanatory variables led to the advent of principal component analysis (PCA), introduced by Pearson and, later, by Hotelling [3,4] in the first decades of the 20th century. Since then, PCA has been considered a major step forward in exploratory data analysis, because decomposing the data along their principal axes (analogies with mechanics and physics were noted by the authors) allows the data to be described in a new multidimensional space, where directions are orthogonal to each other (i.e., the new variables are uncorrelated) and each successive direction decreases in importance and, therefore, in explained variance. This paved the way to the concept of dimensionality reduction [5], which is crucially important in many research areas such as chemometrics.
Chemometrics primarily involves the use of statistical tools in analytical chemistry [6] and, for this reason, it is as old as MA. For the same reason, the two disciplines have been interacting since the late 1960s, initially concerning the use of factor analysis (FA) in

High Dimension and Overfitting
MA provides answers for prediction, data structure and parameter estimation. In facing these problems, we can think of data collected from experiments as objects of an unknown manifold in a hyperspace of many dimensions [21]. In the context of X-ray diffraction, experimental data consist of diffraction patterns in the case of single crystals, or diffraction profiles in the case of powder samples. Diffraction patterns, after indexing, consist of a set of reflections, each identified by three integers (the Miller indices) and an intensity value, while diffraction profiles are formed by 2ϑ values as the independent variable and intensity values as the dependent variable. The data dimensionality depends on the number of reflections or on the number of 2ϑ values included in the dataset. Thus, high-resolution data contain more information about the crystal system under investigation, but also have higher dimensionality.
In this powerful and suggestive representation, prediction, parameter estimation and finding latent variables or structure in data (such as classification or clustering) are all different aspects of the same problem: modeling the data, i.e., retrieving the characteristics of such a complex hypersurface embedded in a hyperspace, by using just a sampling of it. Sometimes, a few other properties can be added to aid the construction of the model, such as the smoothness of the manifold (the hypersurface) that represents the physical model underlying the data. Mathematically speaking, this means that the hypersurface is locally homeomorphic to a Euclidean space, which is useful for taking derivatives, finding local minima in optimization and so on.
In MA, a problem is commonly considered balanced when the number of variables involved is significantly smaller than the number of samples. In such situations, the sampling of the manifold is adequate and the statistical methods used to infer its model are robust enough to make predictions, find structure and so on. Unfortunately, in chemometrics it is common to face problems in which the number of dimensions is much higher than the number of samples. Some notable examples involve diffraction/scattering profiles, but other cases can be found in bioinformatics, in the study of gene expression in DNA microarrays [22,23]. Obtaining high-resolution data is considered a good result in crystallography. However, this implies a larger number of variables describing a dataset (more reflections in the case of single-crystal data, or higher 2ϑ values in the case of powder data). Consequently, it makes the application of MA more complicated.
When the dimensionality increases, the volume of the space increases so fast that the available data become sparse on the hypersurface, making it difficult to infer any trend in data. In other words, the sparsity becomes rapidly problematic for any method that requires statistical significance. In principle, to keep the same amount of information (i.e., to support the results) the number of samples should grow exponentially with the dimension of the problem.
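The sparsity described above can be made concrete with a small numerical sketch. It is not taken from the source: it assumes points drawn uniformly at random in a unit hypercube (a hypothetical stand-in for real diffraction data) and uses the hypothetical helper `distance_contrast` to measure how different the nearest and farthest neighbours of a point are. With a fixed number of samples, this contrast is expected to shrink as the number of variables P grows, which is one symptom of the data becoming sparse.

```python
import numpy as np

def distance_contrast(n_samples, n_dims, seed=0):
    """Relative spread (max - min) / min of the Euclidean distances
    between the first point and all the other points."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_samples, n_dims))   # uniform in the unit hypercube
    d = np.linalg.norm(points[1:] - points[0], axis=1)
    return (d.max() - d.min()) / d.min()

# Same number of samples, increasing number of variables P
contrasts = {p: distance_contrast(100, p) for p in (2, 20, 200, 2000)}
for p, c in contrasts.items():
    print(f"P = {p:5d}  distance contrast = {c:.3f}")
```

When the contrast is small, all samples look almost equally far from each other, so methods relying on local neighbourhoods or on densely sampled trends lose statistical support unless the number of samples is increased accordingly.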
A drawback of having such a high number of dimensions is the risk of overfitting. Overfitting is an excessive adjustment of the model to the data. When a model is built, it must include an adequate number of parameters in order to explain the data. However, the fit should not be too precise, so that the model keeps the right amount of generality to explain another set of data extracted from the same experiment or population. This feature is commonly known as the robustness of the model. Adapting a model to the data too tightly introduces spurious parameters that explain the residuals and the natural oscillations of the data commonly attributed to noise. The problems caused by high dimensionality were named by Bellman the "curse of dimensionality" [5], but they are also known under other names (e.g., the Hughes phenomenon in classification). A way to partially mitigate them is to reduce the number of dimensions of the problem, giving up some characteristics of the data structure.
Dimensionality reduction can be performed by following two different strategies: selection of variables or transformation of variables. In the literature, these two strategies are known as feature selection and feature extraction, respectively [24].
Formally, if we model each variate of the problem as a random variable $X_i$, the selection of variables is the process of selecting a subset of relevant variables to be used for model construction. This selection requires an optimality criterion, i.e., it is performed according to an agreed method of judgement. Feature selection methods can be applied to problems where one of the variables, $Y$, has the role of 'response', i.e., it has some degree of dependency on the remaining variables. The optimality criterion is then based on the maximization of a performance figure, computed from a subset of variables and the response. In this way, each variable is judged by its level of relevance or redundancy, compared to the others, in explaining $Y$. An example of such a performance figure is the information gain (IG) [25], which derives from the concept of entropy developed in information theory. IG measures the gain in information about the response variable ($Y$) when a new variable ($X_i$) is introduced. Formally:
$$IG(Y, X_i) = H(Y) - H(Y \mid X_i)$$
where $H(\cdot)$ is the entropy of a random variable. A high value of IG means that the second term, $H(Y \mid X_i)$, is small compared to the first one, $H(Y)$, i.e., that the new variable $X_i$ explains the response $Y$ well, so that the corresponding conditional entropy becomes low.
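As an illustration of the information gain as the difference between the entropy of the response and its conditional entropy, the following minimal sketch computes it for discrete variables. The function names `entropy` and `information_gain` and the toy arrays are assumptions made for this example, not part of the source; continuous variables such as diffraction intensities would first need to be binned.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a discrete sample."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(y, x):
    """IG(Y, X) = H(Y) - H(Y|X) for discrete arrays y and x."""
    y = np.asarray(y)
    x = np.asarray(x)
    h_cond = 0.0
    for value in np.unique(x):
        mask = x == value
        # weight of this value of X times the entropy of Y restricted to it
        h_cond += mask.mean() * entropy(y[mask])
    return entropy(y) - h_cond

y  = np.array([0, 0, 1, 1, 0, 0, 1, 1])   # response
x1 = np.array([0, 0, 1, 1, 0, 0, 1, 1])   # fully determines y
x2 = np.array([0, 1, 0, 1, 0, 1, 0, 1])   # carries no information on y
ig = [information_gain(y, x) for x in (x1, x2)]
print(ig)   # IG is 1 bit for x1 and 0 bits for x2
```

Ranking the variables by decreasing IG and keeping the top ones is the simplest selection rule that follows from this definition.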
The variables with the highest IG values are those considered relevant for the prediction of the response. If the response variable is discrete (i.e., used to classify), another successful method is Relief [26], which ranks the features on the basis of weights derived from the distance of each sample (of a given class) from nearby samples of different classes and from nearby samples of the same class. The highest weights identify the most relevant features, i.e., those able to make the best prediction of the response. Such methods have found application in the analysis of DNA microarray data, where the number of variables (up to tens of thousands) is much higher than the number of samples (a few hundred), to select the genes responsible for the expression of some relevant characteristic, such as the presence of a genetic disease [27]. Another relevant application can be found in proteomics [28], where the number of different proteins under study or retrieved in a particular experimental environment is much smaller than the number of protein features, so that reducing the data by identifying the relevant features becomes essential to discern the most important ones [29].
Feature extraction methods, instead, are based on the idea of transforming the set of variables into another set of reduced size. Using a simple and general mathematical formulation, we have:
$$(Z_1, \ldots, Z_m) = f(X_1, \ldots, X_n), \quad m < n$$
where the variables are transformed, losing their original meaning, to obtain new characteristics that may reveal some hidden structure in the data. Among these methods, PCA is based on the transformation into a space where the variables are all uncorrelated with each other and sorted by decreasing variance. Independent component analysis (ICA), instead, transforms the variables into a space where they are all independent of each other and maximally non-Gaussian, apart from one, which represents the unexplained part of the model, typically noise. Other methods, such as multivariate curve resolution (MCR), solve the problem by applying more complicated conditions, such as the positivity of the values in the new set of variables, so that the physical meaning of the data is preserved. A critical review of dimensionality reduction methods based on feature extraction is given in Section 2.2.
PCA and MCR have an important application in solving mixing problems, i.e., in decomposing powder diffraction profiles of mixtures into pure-phase components. In such a decomposition, the first principal components (or contributions, in MCR terminology) usually include the pure-phase profiles or are highly correlated with them.

Dimensionality Reduction Methods
In Section 2.1, the importance of applying dimensionality reduction to simplify the view of the problem and reveal the model underlying the data was underlined. One of the most common methods for dimensionality reduction is principal component analysis.
PCA is a method to decompose a data matrix by finding new variables that are orthogonal to each other (i.e., uncorrelated), while preserving the maximum variance (see Figure 2). These new uncorrelated variables are named principal components (PCs). PCA is thus an orthogonal linear transformation that maps the data from the current space of variables to a new space of the same dimension (in this sense, no reduction of dimension is applied), such that the greatest variance lies on the first coordinate, the second greatest variance on the second coordinate, and so on. From a mathematical viewpoint, given the dataset $X$, of size $N \times P$ ($N$ being the number of samples, $P$ that of the variates), PCA decomposes it so that
$$T = X W$$
with $T$ (of size $N \times P$) the matrix of the principal components (also called scores), which are the transformed variable values corresponding to each sample, and $W$ (of size $P \times P$) the matrix of the loadings, corresponding to the weights by which each original variable must be multiplied to get the component scores. The matrix $W$ is composed of orthogonal columns that are the eigenvectors obtained from the diagonalization [30] of the sample covariance matrix of $X$:
$$X^{T} X = W \Lambda W^{T} \tag{5}$$
In Equation (5), $\Lambda$ is a diagonal matrix containing the eigenvalues of the sample covariance matrix of $X$, i.e., $X^{T} X$ (for column-centered data, up to a normalization factor). Since a covariance matrix is always positive semi-definite, the eigenvalues are all real and non-negative, and correspond to the explained variance of each principal component.
The main idea behind PCA is that, in such a decomposition, it often occurs that not all the directions are equally important. Rather, the directions preserving most of the explained variance (i.e., energy) of the data are few, often the first 1-3 principal components. Dimensionality reduction is then a lossy process, in which the data are reconstructed by an acceptable approximation that uses just the first few principal components, while the remaining ones are neglected:
$$X \approx T_s W_s^{T}$$
with $s$ the number of retained components (i.e., $T_s$ and $W_s$ contain the first $s$ columns of $T$ and $W$, respectively) and $s \ll P$. Diagonalization of the covariance matrix of the data is the heart of PCA, and it is achieved by resorting to singular value decomposition (SVD), a basic methodology in linear algebra. SVD should not be confused with PCA, the main difference being the meaning given to the results. In SVD, the input matrix is decomposed into the product of a left matrix of singular vectors $U$, a diagonal matrix of singular values $\Sigma$ and a right matrix of singular vectors $V$, reading the decomposition from left to right:
$$X = U \Sigma V^{T}$$
SVD can also decompose rectangular matrices. PCA, instead, uses SVD for the diagonalization of the data covariance matrix $X^{T} X$, which is square and positive semi-definite. Therefore, the left and right matrices coincide with the eigenvector matrix $W$, and the diagonal matrix is square, with real and non-negative entries (the eigenvalues in $\Lambda$, which are the squares of the singular values of $X$). The choice of the most important eigenvalues determines the components to retain, a step that is not part of SVD itself.
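The relation between PCA and SVD discussed above can be sketched with NumPy. The synthetic dataset below (two hidden profiles mixed with random weights, plus noise) and all variable names are assumptions made for this example, not data from the source. The sketch computes loadings and scores from the SVD of the column-centered data, derives the explained variance from the squared singular values, and builds the rank-s approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, s = 20, 200, 2               # few samples, many variables, s retained PCs

# Synthetic data: two hidden profiles mixed with sample-dependent weights + noise
profiles = rng.random((2, P))
weights = rng.random((N, 2))
X = weights @ profiles + 0.01 * rng.standard_normal((N, P))

Xc = X - X.mean(axis=0)            # column-centering, required for covariance
U, sv, Vt = np.linalg.svd(Xc, full_matrices=False)

W = Vt.T                           # loadings; economy SVD returns min(N, P) columns
T = Xc @ W                         # scores
eigvals = sv**2                    # eigenvalues of Xc^T Xc
explained = eigvals / eigvals.sum()

X_s = T[:, :s] @ W[:, :s].T        # lossy reconstruction from the first s PCs
rel_err = np.linalg.norm(Xc - X_s) / np.linalg.norm(Xc)
print(explained[:4], rel_err)
```

Note that, with fewer samples than variables, the economy-size SVD returns only min(N, P) components rather than the full P × P loading matrix; the discarded directions carry zero variance. Inspecting `explained` is the simplest way to decide how many components to retain.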
Factor analysis is a method based on the same concept as PCA: a dataset is explained by a linear combination of hidden factors, which are uncorrelated with each other, apart from a residual error:
$$X = \sum_{k} l_k F_k + \varepsilon$$
FA is a more elaborate version of PCA in which the factors are (usually) supposed to be known in number and, although orthogonal to each other (as in PCA), can be obtained by adopting conditions external to the problem. A common way to extract factors in FA is a depletion method, in which the dataset $X$ is subjected to an iterative extraction of factors that can be analyzed one at a time: $X^{(k)} = X^{(k-1)} - l_k F_k$. In FA, there is the clear intent of finding the physical causes of the model in the linear combination. For this reason, the number of factors is fixed, and the independence of the factors from the residual $\varepsilon$ is also imposed. FA can then be considered a supervised deconvolution of the original dataset, in which a fixed number of independent factors must be found. PCA, instead, explores uncorrelated directions without any intent of fixing the number of the most important ones.
FA has been applied as an alternative to PCA in reducing the number of parameters and various structural descriptors for different molecules in chromatographic datasets [31]. Moreover, the original concepts have been extended by introducing additional degrees of complexity, such as the shifting of factors [32] (factors can have a certain degree of misalignment in the time direction, such as in the time profile for X-ray powder diffraction data) or the shifting and warping (i.e., a time stretching) [33].
Among more general linear decomposition methods, MCR [8] has found success as a family of methods that solve the mixture analysis problem, i.e., the problem of finding the pure-phase contributions and the mixing amounts from a data matrix that includes only the mixed measurements. A typical application for MCR, as well as for PCA, is represented by spectroscopic or X-ray diffraction data. In this context, each row of the data matrix represents a different spectrum or profile, recorded while an external condition changes over time, and the columns are the spectral channels or diffraction/scattering angles.
In MCR analysis, the dataset is described as the sum of contributions from reference components (or profiles), weighted by coefficients that vary through time:
$$X \approx \sum_{i} c_i S_i^{T}$$
with $c_i$ the vector of weights (the profile of change of the $i$-th component through time) and $S_i$ the pure-phase $i$-th reference profile. The approximation sign is used because MCR leaves some degree of uncertainty in the model. In compact form we have:
$$X = C S^{T} + E$$
MCR shares the same mathematical model as PCA, apart from the inclusion of a residual contribution $E$ that represents the part of the data that the model gives up explaining. The algorithm that solves the mixture problem in the MCR approach, however, is quite different from the one used in PCA. While PCA is mainly based on the singular value decomposition (i.e., basically the diagonalization of the sample covariance matrix), MCR is based on the alternating least squares (ALS) algorithm, an iterative method that tries to solve constrained least-squares problems of the form:
$$\min_{C,S} \; \| X - C S^{T} \|_F^2 + \lambda \left( \sum_i \| c_i \|_2^2 + \sum_i \| S_i \|_2^2 \right) \tag{11}$$
This is a least-squares problem with a regularization term weighted by a Lagrange parameter $\lambda$. The regularization term, quite common in optimization problems, is used to drive the solution towards desired characteristics. The $L_2$-norm of the columns of $S$ or $C$, as reported in Equation (11), penalizes solutions with high energy and is the most common choice in ALS. However, other regularizations exist, such as the $L_1$-norm to obtain sparser solutions [34], or constraints imposing the positivity of the elements of $S$ and $C$. Usually, the solution to Equation (11) is provided by iterative methods, where initial guesses of the decomposition matrices $S$ or $C$ are updated iteratively by alternating the solution of the least-squares problem and the application of the constraints. In MCR, the condition of positivity of the elements of both $S$ and $C$ is fundamental to give physical meaning to matrices that represent profile intensities and mixing amounts, respectively.
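The alternating scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited works: the hypothetical function `mcr_als` solves a ridge-regularized least-squares problem for C with S fixed, clips negative values to zero, then does the same for S with C fixed. The synthetic two-phase dataset, the choice of the first and last measured profiles as initial guess, and the values of `lam` and `n_iter` are all assumptions made for the example.

```python
import numpy as np

def mcr_als(X, S_init, lam=1e-6, n_iter=50):
    """Alternating least squares for X ~ C S^T with L2 regularization
    (weight lam) and non-negativity imposed by clipping after each step."""
    S = np.clip(S_init, 0.0, None)
    I = np.eye(S.shape[1])
    for _ in range(n_iter):
        # S fixed: solve (S^T S + lam I) C^T = S^T X^T, then apply the constraint
        C = np.clip(np.linalg.solve(S.T @ S + lam * I, S.T @ X.T).T, 0.0, None)
        # C fixed: solve (C^T C + lam I) S^T = C^T X, then apply the constraint
        S = np.clip(np.linalg.solve(C.T @ C + lam * I, C.T @ X).T, 0.0, None)
    return C, S

# Synthetic mixture: two "pure-phase" profiles whose amounts change with time
two_theta = np.linspace(10.0, 30.0, 300)
def peak(center, width=0.4):
    return np.exp(-0.5 * ((two_theta - center) / width) ** 2)

S_true = np.column_stack([peak(15) + 0.5 * peak(22), peak(18) + 0.7 * peak(26)])
t = np.linspace(0.1, 0.9, 15)
C_true = np.column_stack([t, 1.0 - t])
rng = np.random.default_rng(0)
X = C_true @ S_true.T + 1e-3 * rng.standard_normal((15, 300))

# Initial guess: first and last measured profiles (both are mixtures)
C, S = mcr_als(X, S_init=X[[0, -1]].T)
rel_res = np.linalg.norm(X - C @ S.T) / np.linalg.norm(X)
print(rel_res)
```

A small residual only shows that the data are well reproduced. Non-negativity alone does not guarantee that the recovered columns of S coincide with the true pure-phase profiles, since different non-negative factorizations can fit the same data equally well; this is why the initial guess and additional constraints matter in practice.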
The MCR solution does not provide the direction of maximum variability as PCA does. PCA makes no assumption on the data; the principal components, and particularly the first one, try to capture the main trend of variability through time (or samples). MCR imposes external conditions (such as positivity); it is more powerful, but also more computationally intensive. Moreover, for quite complicated data structures, such as the ones modeling the evolution of crystalline phases through time [35,36], it can be quite difficult to impose constraints through the regularization term, and the iterative search becomes unstable if not properly set [37,38]. Another important difference between MCR and PCA is in the model selection: in MCR, the number of latent variables (i.e., the number of profiles into which the data must be decomposed) must be known and set in advance; in PCA, instead, while there are criteria for selecting the number of significant PCs, such as the Malinowski indicator function (MIF) [39,40] or the average eigenvalue criterion (AEC) [41,42], this number can also be inferred by simply looking at the trend of the eigenvalues, which is typical of an unsupervised approach. The most informative principal components are the ones with the highest eigenvalues.
For simple mathematical models, it will be shown in practical cases taken from X-ray diffraction that PCA is able to optimally identify the components without external constraints. In other cases, when the model is more complicated, we successfully tested a variation of PCA called orthogonal constrained component rotation (OCCR) [43]. In OCCR, a post-processing step is applied after PCA, aimed at finding the directions of the first few principal components that satisfy external constraints given by the model. In this way, the components are no longer required to remain orthogonal. OCCR is then an optimization method in which the selected principal components of the model are left free to explore their subspace until a condition imposed by the data model is optimized. OCCR has been shown to give better results than traditional PCA, even when PCA already produces satisfactory results. A practical example (see Section 3) is the decomposition of the MED dataset into pure-phase profiles, where the PCA scores, proportional to the profiles, may be related to each other by specific equations.
Smoothed principal component analysis (SPCA) is a modification of common PCA suited for complex data matrices coming from single- or multi-technique approaches in which time is a variable and sampling is continuous, such as in situ experiments or kinetic studies. The rationale behind the algorithm proposed by Silverman [44] is that data without noise should be smooth. For this reason, in continuous data, such as profiles or time-resolved data, the eigenvectors that describe the variance of the data should also be smooth [45]. Under these assumptions, a function called the "roughness function" is inserted within the PCA algorithm when searching for the eigenvectors along the directions of maximum variance. The aim of the procedure is to reduce the noise in the data by promoting smooth eigenvectors and penalizing rough, less continuous ones. SPCA has been successfully applied to crystallographic data in solution-mediated kinetic studies of polymorphs, such as L-glutamic acid from the α to the β form by Dharmayat et al. [46] or p-aminobenzoic acid from the α to the β form by Turner and colleagues [47].

Modulated Enhanced Diffraction
The Modulated Enhanced Diffraction (MED) technique was conceived to achieve chemical selectivity in X-ray diffraction. Series of measurements from in situ or operando X-ray diffraction experiments, where the sample is subjected to a varying external stimulus, can be processed by MA to extract information about active atoms, i.e., atoms of the sample responding to the applied stimulus. This approach has the great advantage of not requiring preliminary structural information, so it can be applied to complex systems, such as composite materials, quasi-amorphous samples and multi-component solid-state processes, with the aim of characterizing the (simpler) sub-structure comprising the active atoms. The main steps involved in a MED experiment are sketched in Figure 3. During the collection of X-ray diffraction data, the crystal sample is perturbed by an external stimulus that can be controlled in its shape and duration. The stimulus induces a variation of the structural parameters of the crystallized molecule and/or of the crystal lattice itself. These changes represent the response of the crystal, which is usually unknown. Repeated X-ray diffraction measurements allow diffraction patterns to be collected at different times, thus sampling the effect of the external perturbation. In principle, the set of collected diffraction patterns can be used to detect the diffraction response to the stimulus. Offline data analysis based on MA methods must be applied to implement the deconvolution step, i.e., to process the time-dependent diffraction intensities $I_h$ measured for each reflection $h$ in order to recover the time evolution of the parameters of each atom $j$ of the crystal structure, such as occupancy ($n_j$), atomic scattering factor ($f_j$) and atomic position ($r_j$). It is worth noting that in X-ray diffraction the measured intensities ($I_h$), representing the detected variable, depend on the square of the structure factors $F_h$, which embed the actual response of the system.
Applications of MED are twofold: on the one hand, the sub-structure of the active atoms can be recovered, thus disclosing information at the atomic scale about the part of the system that changes with the stimulus; on the other hand, the kinetics of the changes occurring in the sample can be immediately captured, even without knowing the nature of such changes.
To achieve both these goals, MA tools have been developed and successfully applied to crystallographic case studies. These methods, which are surveyed in the next section, implement the so-called data deconvolution, i.e., they allow a single MED profile to be extracted from the data matrix comprising the set of measurements.

Deconvolution Methods
The first deconvolution method applied to the MED technique was phase-sensitive detection (PSD). PSD is an MA approach that has been applied for decades to spectroscopic data (modulation enhanced spectroscopy, MES) and that can be applied to systems responding linearly to periodic stimuli. PSD projects the system response from the time domain to the phase domain by using a periodic (typically sinusoidal) reference signal, through the following equation:

$$p(x, \varphi) = \frac{2}{n} \sum_{i=1}^{n} y_i(x) \sin\left(\frac{2\pi k}{n}\, i + \varphi\right) \qquad (12)$$

where $\varphi$ is the PSD phase angle, $y_i(x)$ are the n measurements collected at the time i, and k is the order of the demodulation. The variable x can be the angular variable 2ϑ in the case of X-ray powder diffraction measurements or the reflection h in the case of single-crystal diffraction data. Equation (12) represents a demodulation in the frequency space of the system response, collected in the time space. Demodulation at k = 1, i.e., at the fundamental frequency of the reference signal, allows the extraction of kinetic features involving both active and silent atoms; demodulation at k = 2, i.e., at twice the fundamental frequency of the reference signal, singles out the contribution from active atoms only. This was demonstrated in [35] and comes from the unique property of X-ray diffraction that the measured intensities $I_h$ depend on the square of the structure factors $F_h$. A pictorial way to deduce this property is given in Figure 4: silent and active atoms respond in a different way to the external stimulus applied to the crystal. This can be parameterized by assigning different time dependences to their structure factors, $F_S$ and $F_A$, respectively. For the sake of simplicity, we can assume that silent atoms do not respond at all to the stimulus, so that $F_S$ remains constant during the experiment, while active atoms respond elastically, so that $F_A$ has the same time dependence as the applied stimulus (assumed sinusoidal in Figure 4).
Under these hypotheses, the diffraction response can be divided into three terms: the first is constant, the second has the same time dependence as the stimulus, and the third has double the frequency of the stimulus and represents the contribution of active atoms only. PSD performed remarkably well for the first case study considered, where the adsorption of Xe atoms in an MFI zeolite was monitored in situ by X-ray diffraction, using pressure or temperature variations as the external stimulus [48,49]. However, it was immediately clear that this approach had very limited applications: very few systems, and only in particular conditions, respond linearly to periodic stimuli (a mandatory condition for PSD demodulation), and it is easier to apply non-periodic stimuli.
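The behaviour of the k = 1 and k = 2 demodulations can be checked numerically. The following minimal Python sketch simulates a single reflection under the simplifying hypotheses of Figure 4 (constant silent contribution, active contribution following a sinusoidal stimulus) and demodulates it with a discrete PSD sum. The function name `psd_demodulate`, the 2/n normalization and all numerical values are assumptions made for this illustration only; they are not taken from the cited works.

```python
import cmath
import math

def psd_demodulate(measurements, k, phase):
    """Demodulate n equally spaced measurements spanning one stimulus period.

    measurements[i] is the response at time step i (a single x value here),
    k is the demodulation order and phase the PSD phase angle in radians.
    The 2/n normalization is an assumption of this sketch.
    """
    n = len(measurements)
    return (2.0 / n) * sum(
        y * math.sin(2.0 * math.pi * k * i / n + phase)
        for i, y in enumerate(measurements)
    )

# Hypothetical moduli and phases of the silent (S) and active (A) structure factors.
F_S, F_A = 10.0, 2.0
phi_S, phi_A = 0.3, 1.1
n = 64

# Silent atoms stay constant; active atoms follow a sinusoidal stimulus g(t).
intensity = []
for i in range(n):
    g = math.sin(2.0 * math.pi * i / n)
    F_total = cmath.rect(F_S, phi_S) + g * cmath.rect(F_A, phi_A)
    intensity.append(abs(F_total) ** 2)

# k = 1, in phase with the stimulus: interference term (silent and active atoms).
amp_k1 = psd_demodulate(intensity, k=1, phase=0.0)
# k = 2, phase -pi/2 selects the -cos(2wt) component: active atoms only.
amp_k2 = psd_demodulate(intensity, k=2, phase=-math.pi / 2.0)

print(amp_k1, 2.0 * F_S * F_A * math.cos(phi_A - phi_S))
print(amp_k2, F_A ** 2 / 2.0)
```

In this idealized case, the k = 1 amplitude coincides with the interference term between silent and active atoms, while the k = 2 amplitude depends on the active atoms only.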
Crystals 2021, 11, x FOR PEER REVIEW 10 of 21

Figure 4. The diffraction response measured on the X-ray detector can be divided into two contributions, from silent (S) and active (A) atoms. This gives rise to three terms, $F_S^2 + 2 F_A F_S \cos(\varphi_A - \varphi_S) + F_A^2$: the first is constant in time, the second has the same time dependence as the stimulus, and the third varies with double the frequency of the stimulus and represents the contribution of active atoms only.

A more general approach would allow coping with non-periodic stimuli and nonlinear system responses. To this aim, PCA was applied to MED data, with the underlying idea that, in simple cases, the first principal component (PC1) should capture changes due to both active and silent atoms, while the second principal component (PC2) should capture changes due to active atoms only. In fact, if the time dependence of the contribution from active atoms can be separated, i.e., $F_A(t) = F_A \, g(t)$, then the MED demodulation shown in Figure 4 can be written in matrix form:

$$I(t) = F_S^2 + \begin{pmatrix} g(t) & g(t)^2 \end{pmatrix} \begin{pmatrix} 2 F_A F_S \cos(\varphi_A - \varphi_S) \\ F_A^2 \end{pmatrix} \qquad (13)$$

By comparing Equation (13) with Equation (12), it can be inferred that PCA scores should capture the time dependence of the stimulus, while PCA loadings should capture the dependence on the 2ϑ or h variable of the diffraction pattern. In particular, we expect two significant principal components, the first having $T_1 = g(t)$ and $W_1 = 2 F_A F_S \cos(\varphi_A - \varphi_S)$, the second having $T_2 = g(t)^2$ and $W_2 = F_A^2$. Notably, constant terms are excluded by PCA processing, since PCA assumes that the columns of the data matrix have zero mean. An example of deconvolution carried out by PCA, which is more properly referred to as decomposition of the system response, is shown in Figure 5 for in situ X-ray powder diffraction data. The PC1 scores reproduce the time dependence of the applied stimulus, whereas the PC2 scores have a doubled frequency. PC1 loadings have positive and negative peaks, depending on the values of the phase angles $\varphi_A$ and $\varphi_S$, while PC2 loadings have only positive peaks, representing the diffraction pattern from active atoms only.
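The expected behaviour of PC1 and PC2 can be reproduced on simulated data. The sketch below builds a data matrix from a constant term, a term proportional to g(t) and a term proportional to g(t)², and applies PCA through a singular value decomposition of the column-centered matrix. The profiles, the values of cos(φ_A − φ_S) and the strong silent contribution (F_S ten times larger than F_A) are assumptions chosen for illustration. Since PCA forces components to be mutually orthogonal, the agreement with the expected scores is close but in general not exact.

```python
import numpy as np

n, m = 100, 60                       # number of measurements, points along x
t = np.arange(n) / n
g = np.sin(2.0 * np.pi * t)          # stimulus g(t), one full period

# Hypothetical silent and active structure-factor profiles along x (two peaks).
x = np.linspace(0.0, 1.0, m)
peaks = np.exp(-((x - 0.3) / 0.05) ** 2) + np.exp(-((x - 0.7) / 0.05) ** 2)
F_S = 10.0 * peaks
F_A = 1.0 * peaks
cos_dphi = np.where(x < 0.5, 0.8, -0.5)   # assumed cos(phi_A - phi_S) per peak

W1 = 2.0 * F_A * F_S * cos_dphi      # expected PC1 loadings
W2 = F_A ** 2                        # expected PC2 loadings
X = F_S ** 2 + np.outer(g, W1) + np.outer(g ** 2, W2)   # rows: measurements

# PCA through SVD of the column-centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                       # one column per principal component
loadings = Vt                        # one row per principal component
explained = S ** 2 / np.sum(S ** 2)

# PC1 scores should follow the stimulus, PC2 scores its second harmonic.
corr_pc1 = abs(np.corrcoef(scores[:, 0], g)[0, 1])
corr_pc2 = abs(np.corrcoef(scores[:, 1], np.cos(4.0 * np.pi * t))[0, 1])
print(explained[:2].sum(), corr_pc1, corr_pc2)
```

Two components are sufficient to describe the centered data, because the constant term is removed by the centering step.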
In this framework, the PC1 term can be regarded as the analogue of the k = 1 PSD term, and the PC2 term as the analogue of the k = 2 PSD term. This correspondence is strict for systems responding linearly to periodic stimuli, while for more general systems PCA is the only way to perform the decomposition. This new approach was successfully applied to different case studies [50,51], outperforming the PSD method.
Another important advancement in MED development was the introduction of adapted variants of existing MA methods. In fact, PCA decomposition cannot be accomplished as outlined above for complex systems, where several parts (sub-structures) vary with different time trends, each of them captured by specific components. Signatures of failure of the standard PCA approach are a high number of principal components describing non-negligible data variability, and the failure to satisfy two conditions for PC1 and PC2, which descend from the above-mentioned squared dependence of the measured intensity on the structure factors. In the standard PCA case, PC2 loadings, which describe the data variability across reflections (or along the 2ϑ axis in the case of powder samples), are expected to be positive, given their proportionality to the squared structure factors of the active atoms only. Moreover, PC1 and PC2 scores, which describe the data variability across measurements, are expected to follow the relation:

$$T_2 = T_1^2 \qquad (14)$$

as changes for active atoms vary with a frequency that is double that of the stimulus.
An MA approach adapted to MED analysis has been developed, where constraints on PC2 loadings and on PC1/PC2 scores according to Equation (14) are included in the PCA decomposition. This gave rise to a new MA method, called orthogonal constrained component rotation (OCCR), which aims at finding the best rotations of the principal components determined by PCA, driven by figures of merit based on the two MED constraints. The OCCR approach overcomes the main limitation of PCA, i.e., its blind search for components that are mutually orthogonal. It has been successfully applied to characterize water-splitting processes catalyzed by spinel compounds [43] and to locate Xe atoms in the MFI zeolite with unprecedented precision [50].
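The idea of rotating principal components under the two MED constraints can be illustrated with a toy example. The sketch below is not the published OCCR algorithm: the figure of merit, the grid search and the function names `rotate_pair`, `figure_of_merit` and `best_rotation` are hypothetical choices made only to show how a rotation of the first two components can be scored against the positivity of PC2 loadings and the parabolic relation between PC2 and PC1 scores.

```python
import numpy as np

def rotate_pair(scores, loadings, theta):
    """Rotate two components by theta; scores @ loadings.T is unchanged."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return scores @ R, loadings @ R

def figure_of_merit(T, W):
    """Hypothetical merit in [0, 1] combining the two MED constraints."""
    # Constraint 1: PC2 loadings should all have the same sign.
    positivity = abs(W[:, 1].sum()) / np.abs(W[:, 1]).sum()
    # Constraint 2: PC2 scores should follow the square of PC1 scores.
    parabola = abs(np.corrcoef(T[:, 0] ** 2, T[:, 1])[0, 1])
    return positivity * parabola

def best_rotation(scores, loadings, n_angles=361):
    """Grid search over rotation angles in [-pi/2, pi/2]."""
    angles = np.linspace(-np.pi / 2.0, np.pi / 2.0, n_angles)
    merits = [figure_of_merit(*rotate_pair(scores, loadings, a)) for a in angles]
    best = int(np.argmax(merits))
    return angles[best], merits[best]

# Synthetic components (n x 2 scores, m x 2 loadings) satisfying both constraints.
t = np.arange(100) / 100.0
T_true = np.column_stack([np.sin(2 * np.pi * t), np.sin(2 * np.pi * t) ** 2 - 0.5])
x = np.linspace(0.0, 1.0, 40)
p1 = np.exp(-((x - 0.3) / 0.05) ** 2)
p2 = np.exp(-((x - 0.7) / 0.05) ** 2)
W_true = np.column_stack([p1 - p2, p1 + p2])

# The components are deliberately mixed by a known rotation, then recovered.
T_mixed, W_mixed = rotate_pair(T_true, W_true, 0.4)
angle, merit = best_rotation(T_mixed, W_mixed)
print(angle, merit)
```

The rotation leaves the reconstructed data matrix unchanged, so the constraints only select, among equivalent decompositions, the one that is most consistent with the MED model.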
A further step towards the routine application of PCA to MED was made in [51,52], where the output of PCA was derived analytically and the expected contributions to scores and loadings of specific changes in crystallographic parameters were listed. This makes the results of PCA decomposition easier to interpret.

Applications in Powder X-ray Diffraction
X-ray powder diffraction (XRPD) offers the advantage of very fast measurements, as many crystals are in diffracting orientation under the X-ray beam at the same time. A complete diffraction pattern with reasonable statistics can be acquired in seconds, even with laboratory equipment, and such high measurement rates make it possible to perform in situ experiments, where repeated measurements are made on the same sample while varying some external variable, the simplest being time [53]. On the other hand, the simultaneous diffraction from many crystals makes it difficult to retrieve structural information about the average crystal structure by using phasing methods. In this context, MA has great potential, since it allows extracting relevant information from raw data, which can be used to carry out the crystal structure determination process on a subset of atoms, or to characterize the system under study without any prior structural knowledge. Some specific applications of MA to powder X-ray diffraction are reviewed in the following sections. The two main fields of application are the extraction of dynamic data along a temporal gradient (Section 4.1) and qualitative and quantitative analysis in the presence of concentration gradients (Section 4.2).

Kinetic Studies by Single or Multi-Probe Experiments
The advent of MA has also revolutionized kinetic studies based on in situ powder X-ray diffraction measurements. Several works have used MA to efficiently extract the reaction coordinate from raw measurements [54-56]. Moreover, in a recent study of solid-state reactions of organic compounds, the classical approach of fitting the reaction coordinate with several kinetic models has been replaced by a new approach, which embeds the kinetic models within the PCA decomposition [57,58]. The discrimination of the best kinetic model is enhanced by the fact that it produces both the best estimate of the reaction coordinate and the best agreement with the theoretical model. MCR-ALS is another decomposition method that has been employed in the last few years in several kinetic studies of chemical reactions [59-61].
MA is even more powerful when applied to datasets acquired in multi-technique experiments, where X-ray diffraction is complemented by one or more other techniques. In these cases, the MA approach makes the most of the information provided by the different instruments, which probe complementary features. In fact, the complete characterization of an evolving system is nowadays achieved by complex in situ experiments that adopt multi-probe measurements, where X-ray measurements are combined with spectroscopic probes, such as Raman, FT-IR or UV-vis spectroscopy. In this context, covariance maps have been used to identify correlated features present in diffraction and spectroscopic profiles. Examples include investigations of temperature-induced phase changes in spin crossover materials by XRPD/Raman [36], studies of perovskites combining XRPD and pair distribution function (PDF) measurements [62], the characterization of the hybrid composite material known as Maya Blue by XRPD/UV-Vis [63], and the metabolite-crystalline phase correlation in wine leaves by XRPD coupled with mass spectrometry and nuclear magnetic resonance [64].
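A covariance map between two probes can be sketched as the cross-covariance between the channels of the two data blocks, measured at the same time points. The definition below is one common choice and is assumed here for illustration; the function name `covariance_map` and the synthetic "diffraction" and "Raman" channels are hypothetical and do not reproduce the processing used in the cited studies.

```python
import numpy as np

def covariance_map(X, Y):
    """Cross-covariance between two blocks measured at the same n time points.

    X has shape (n, p) (e.g. diffraction profiles) and Y has shape (n, q)
    (e.g. spectra). Entry (j, k) is the covariance of X[:, j] and Y[:, k].
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / (X.shape[0] - 1)

# Hypothetical transformation: fraction f of the sample converted over time.
f = np.linspace(0.0, 1.0, 50)
# Three diffraction channels: parent-phase peak, product-phase peak, flat background.
xrpd = np.column_stack([1.0 - f, f, np.full_like(f, 0.2)])
# Two spectroscopic channels: product band and parent band.
raman = np.column_stack([f, 0.5 * (1.0 - f)])

cmap = covariance_map(xrpd, raman)
print(cmap)
```

Positive entries mark features that grow or decay together in the two probes, negative entries mark features with opposite trends, and entries close to zero mark features that do not change with the transformation.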
To highlight the potential of PCA for unravelling dynamics from in situ X-ray diffraction studies, a recently published case study [52] is briefly described. In that work, the sedimentation of a barium sulfate additive inside an oligomeric mixture was studied during polymerization by repeated XRPD measurements with a scan rate of 1°/min. The kinetic behavior of the sedimentation process was investigated by processing the large number (150) of XRPD patterns with PCA, without using any prior structural knowledge. The trend in the data can hardly be observed by visual inspection of the data matrix (Figure 6a), as the peak intensities do not change with a clear trend and the peak positions drift to lower 2ϑ values during the data collection. PCA processes the whole data matrix in seconds, obtaining a PC that explains 89% of the system's variance (Figure 6b). The loadings associated with this PC resemble the first derivative of an XRPD profile (Figure 6c), suggesting that this PC captured the changes in the data due to the shift of the barite peaks toward lower 2ϑ values. In fact, PCA, by looking for the maximum variance in the data, extracts and highlights the variations of the signal through time. PCA scores, which characterize in the time domain the variations highlighted by the loadings, show a decreasing trend (Figure 6d): the peak shift is most evident at the beginning of the reaction and levels off asymptotically towards the end of the experiment. The scores in Figure 6d can thus be interpreted as the reaction coordinate of the sedimentation process. In ref. [52], the comparison of PCA scores with the zero-error obtained by Rietveld refinement confirmed that the dynamic trend extracted by PCA is consistent with that derived from the traditional approach. Moreover, the short data processing time (a few seconds with MA, a few hours with Rietveld refinement) makes PCA a very useful complementary tool to extract dynamic and kinetic data while executing the experiment, to check data quality and to monitor the results without a priori information.
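The link between a peak shift and derivative-like loadings can be verified on simulated patterns. The sketch below generates 150 noise-free patterns containing a single Gaussian peak that drifts to lower 2ϑ values with an asymptotic trend, and applies PCA via singular value decomposition. Peak width, drift amplitude and time constant are assumptions chosen for illustration and are not related to the data of ref. [52].

```python
import numpy as np

two_theta = np.linspace(24.0, 26.0, 400)
n_patterns = 150
sigma = 0.1                                   # assumed peak width (degrees)

# Assumed peak drift: fast at the beginning, asymptotic at the end.
steps = np.arange(n_patterns)
centers = 25.0 - 0.05 * (1.0 - np.exp(-steps / 30.0))
X = np.exp(-0.5 * ((two_theta[None, :] - centers[:, None]) / sigma) ** 2)

# PCA through SVD of the column-centered data matrix (rows: patterns).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / np.sum(S ** 2)
pc1_scores = U[:, 0] * S[0]
pc1_loadings = Vt[0]

# Compare PC1 loadings with the first derivative of the average pattern,
# and PC1 scores with the known peak positions.
derivative = np.gradient(X.mean(axis=0), two_theta)
corr_loading = abs(np.corrcoef(pc1_loadings, derivative)[0, 1])
corr_scores = abs(np.corrcoef(pc1_scores, centers)[0, 1])
print(explained[0], corr_loading, corr_scores)
```

For shifts that are small compared with the peak width, a single component describes most of the variance and its scores track the peak position, which is why they can be read as a reaction coordinate.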

Qualitative and Quantitative Studies
Having prompt hints about the main trends in data is beneficial in many applications and can facilitate subsequent structural analysis. Qualitative and/or quantitative analysis of sets of X-ray powder diffraction profiles can be achieved by both FA [65,66] and PCA, particularly from the score values of the main components. This approach is common in the -omics sciences and in analytical chemistry. Euclidean distances in PC score plots can also be calculated to obtain a semi-quantitative estimate of the separation between groups or clusters [67,68]. In fact, the arrangement of representative points in the scores plot can be used to estimate the composition of the mixture. An example of application of this approach is given in Figure 7. Figure 7a reports the original X-ray diffraction data collected on the pure phases (A, B, C), the binary mixtures (AB, BC, AC; 50:50 by weight) and the ternary mixture ABC, prepared by properly mixing calcium carbonate (A), acetylsalicylic acid (B) and sodium citrate (C). PCA was carried out and the corresponding PC scores (Figure 7b) and loadings (Figure 7c) are reported. It is evident that the PCA scores are sensitive to the weight fractions of the samples, since they form a triangle with the monophasic datasets (A, B, C) at the vertices, the binary mixtures (AB, BC, AC) in the middle of the edges and the ternary mixture ABC at its center. All the possible mixtures lie within the triangle, in the typical representation of the experimental domain of a ternary mixture. PCA recognizes the single-phase contributions through the PC loadings, which contain the XRPD pattern features, highlighted in Figure 7c. PC1 loadings have positive peaks corresponding to phases B and C, and negative ones corresponding to phase A. PC2 loadings show positive peaks corresponding to phase B and negative ones corresponding to phase C, while phase A has a very moderate negative contribution, being located close to 0 along the PC2 axis in Figure 7b.
PCA scores in Figure 7b can be exploited to perform a semi-quantitative analysis, since the topology of the data points in the scores plot is related to the composition, as described above. The procedure is easy for binary and ternary mixtures, where a 2D plot (Figure 7b) can be exploited to calculate Euclidean distances (Figure 7d) and thus compositions (Table 1). For instance, the amount of phase A within the ABC mixture, identified by the F score in Figure 7d, is given by the ratio AJ/AC. Given any score in the triangle, the corresponding phase amounts can be calculated accordingly. The agreement between the compositions (Table 1) obtained from PC scores (which require no a priori information) and those obtained from multilinear regression, which exploits the knowledge of the profiles of the pure phases, is remarkable and very promising for wide applications in quality control where solid mixtures are involved. Quantification errors can occur if data are not perfectly positioned in the triangle, such as the AC mixture in Figure 7b, which lies slightly outside the polygon. In such a case, the sum of the estimated phase amounts can exceed, or in general differ from, 1 as a result of the unrealistic negative quantification of the third phase B. Deviations from unity of the sum of the estimated phase amounts can be used as an indicator of systematic errors, due for example to the presence of amorphous content in the samples, phases with different X-ray absorption properties or missing phases in the model [69].
Figure 7. Original data (a), PC scores (b), PC loadings (c) and calculation of Euclidean distances from PC scores (d) for X-ray powder diffraction data from ternary mixtures. Data points A, B and C correspond to powder diffraction profiles from monophasic samples, data points AB, AC and BC to binary mixtures, and ABC to a mixture containing equal amounts of the three crystal phases.

Table 1. Results of the quantification of the phases in the polycrystalline mixtures represented in Figure 7 by PCA scores and multilinear regression. Columns: Sample; Geometric Estimation from PC Scores; Regression.

The approach can be extended to more than three phases. With four phases, the analysis can still be carried out graphically, even if a 3D plot is necessary, with increased difficulties in its representation and analysis. With five or more phases, a graphical representation is impossible without simplifications or projections. In these cases, a non-graphical and general matrix-based approach can be used, as proposed by Cornell [70].
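The geometric estimation from a 2D scores plot can be sketched with barycentric coordinates, which express a point inside a triangle as a weighted combination of its vertices. This is a closely related construction rather than the exact distance-ratio procedure of Figure 7d: barycentric coordinates sum to 1 by construction, so a point lying outside the triangle shows up as a negative fraction. The function name `phase_fractions` and the score values of the pure phases are hypothetical.

```python
def phase_fractions(point, vertex_a, vertex_b, vertex_c):
    """Barycentric coordinates of a 2D score point in the triangle A, B, C."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = point, vertex_a, vertex_b, vertex_c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if det == 0:
        raise ValueError("the scores of the pure phases are collinear")
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return wa, wb, 1.0 - wa - wb

# Hypothetical (PC1, PC2) scores of the three pure phases.
A, B, C = (-4.0, 0.5), (2.0, 3.0), (2.5, -3.0)

# A score at the center of the triangle: equal amounts of the three phases.
center = ((A[0] + B[0] + C[0]) / 3.0, (A[1] + B[1] + C[1]) / 3.0)
print(phase_fractions(center, A, B, C))

# A score in the middle of the AB edge: binary 50:50 mixture of A and B.
mid_ab = ((A[0] + B[0]) / 2.0, (A[1] + B[1]) / 2.0)
print(phase_fractions(mid_ab, A, B, C))

# A score slightly outside the AC edge gives a negative estimate for phase B.
mid_ac = ((A[0] + C[0]) / 2.0, (A[1] + C[1]) / 2.0)
outside = (mid_ac[0] + 0.1 * (mid_ac[0] - B[0]), mid_ac[1] + 0.1 * (mid_ac[1] - B[1]))
print(phase_fractions(outside, A, B, C))
```

The same weights can be obtained for more than three phases by solving the corresponding linear system in a higher-dimensional score space.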
Quantitative analysis from raw data (Figure 7a) can also be carried out with a different approach, i.e., using least-squares calculations. Each individual experimental pattern $y(x)$ is fitted by using a linear regression model:

$$y(x) = \sum_{i=1}^{n} a_i \, y_i(x) + b(x)$$

where i = 1, . . . , n runs over all the crystal phases present in the sample, $y_i(x)$ is the experimental profile of the i-th pure crystal phase and $b(x)$ is the background estimated from the experimental pattern. The weight fractions can be derived from the coefficients $a_i$ estimated by the fitting procedure. This approach is followed by the RootProf program, where it can also be used in combination with a preliminary PCA step aimed at filtering the experimental patterns to highlight their common properties [69]. Results obtained by the PCA + least-squares approach have been compared with those obtained by supervised analysis based on PLS for a case study of quaternary carbamazepine-saccharin mixtures monitored by X-ray diffraction and infrared spectroscopy [71].
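A least-squares fit of this kind can be sketched in a few lines. The example below is not the RootProf implementation: the function name `fit_phase_coefficients`, the synthetic pure-phase profiles and the linear background are hypothetical. The final normalization of the coefficients to obtain fractions is a simplifying assumption, valid only if the pure-phase profiles are measured under equivalent conditions and absorption differences are neglected.

```python
import numpy as np

def fit_phase_coefficients(pattern, pure_profiles, background):
    """Least-squares estimate of a_i in y(x) = sum_i a_i y_i(x) + b(x)."""
    A = np.column_stack(pure_profiles)          # one column per pure phase
    coeffs, *_ = np.linalg.lstsq(A, pattern - background, rcond=None)
    return coeffs

x = np.linspace(10.0, 40.0, 600)

def peak(center, width=0.15):
    """Gaussian peak used to build synthetic profiles."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Hypothetical profiles of three pure phases and a linear background.
pure = [peak(15.0) + 0.5 * peak(29.0), peak(20.0) + 0.7 * peak(33.0), peak(25.0)]
background = 0.05 + 0.001 * x
true_a = np.array([0.5, 0.3, 0.2])
pattern = sum(a * y for a, y in zip(true_a, pure)) + background

a_fit = fit_phase_coefficients(pattern, pure, background)
fractions = a_fit / a_fit.sum()                 # assumed normalization
print(a_fit, fractions)
```

With real data, noise and an imperfect background estimate would make the fitted coefficients deviate from the true values, and their sum can be used as a check on the consistency of the model.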
As a final note concerning recent developments in this field, the deep-learning technique based on Convolutional Neural Network (CNN) models has been applied for phase identification in multiphase inorganic compounds [72]. The network has been trained using synthetic XRPD patterns, and the approach has been validated on real experimental XRPD data, showing an accuracy of nearly 100% for phase identification and 86% for phase-fraction quantification. The drawback of this approach is the large computational effort necessary for preliminary operations. In fact, a training set of more than $1.5 \times 10^6$ synthetic XRPD profiles was necessary, even for a limited chemical space of 170 inorganic compounds belonging to the Sr-Li-Al-O quaternary compositional pool used in ref. [72].

Applications in Single-Crystal X-ray Diffraction
Single-crystal diffraction data are much more informative than X-ray powder diffraction data, as highlighted by Conterosito et al. [52], since they contain information about the intensity of each diffraction direction (reflection), which is usually enough to obtain the average crystal structure by phasing methods. By contrast, the measurement takes longer than for powder X-ray diffraction, since the crystal must be rotated during the measurement to bring all reflections into diffraction condition. For this reason, radiation damage can be a serious issue for single-crystal X-ray diffraction, and in fact it prevents in situ studies on radiation-sensitive molecules, as often observed in organic and bio-macromolecular systems. MA can still be useful for single-crystal diffraction data, since it helps in addressing specific aspects that will be highlighted in the following paragraphs. Regarding MED applications, it is worth mentioning that single-crystal X-ray diffraction has an intrinsic advantage over powder X-ray diffraction, related to the effect of lattice distortions during measurements: in a powder diffraction pattern, a change of the crystal cell parameters manifests itself as a shift in peak positions, which is usually combined with changes in peak shape and height due to variations of the average crystal structure. The combined effect of lattice distortions and structural changes makes MA results difficult to interpret. In a single-crystal pattern, the intensity of individual reflections is not affected by crystal lattice distortions, and crystal cell parameters are determined by indexing individual patterns measured at different times, so that they are not convoluted with structural variations.

Multivariate Approaches to Solve the Phase Problem
As noticed by Pannu et al. [73], multivariate distributions emerged in crystallography with the milestone work by Hauptman and Karle [74], dealing with the solution of the phase problem. The phase problem is "multivariate" by definition because the structure factors of the reflections, whose phases are needed to solve a crystal structure, are in general complex numbers depending on the coordinates of all atoms. In other words, the atomic positions are the random variables related to the multivariate normal distribution of structure factors. Following these considerations, statistical methods such as joint probability distribution functions became of paramount importance in crystallography in general [75], leading to the development of direct methods, which have been used for decades, up to the present day, for crystal structure solution [76]. Despite these premises, the methods most widespread in analytical chemistry, such as PCA, have rarely been exploited to solve the phase problem. The most closely related approach is that of maximum likelihood, exploited to carry out structure solution and heavy-atom refinement. In the framework of the isomorphous replacement approach [77], the native and derivative structure factors are all highly correlated: to eliminate this correlation, covariance is minimized [73,78,79]. PCA in a stricter sense was used to monitor protein dynamics in theoretical molecular dynamics [80], in experimental in situ SAXS studies [81] and, recently, in in situ single-crystal diffraction data [51]. Even though far fewer applications of PCA can be found for in situ single-crystal data than for powder diffraction data, a huge potential for application in in situ single-crystal diffraction is also envisaged for the next decades.

Merging of Single-Crystal Datasets
A common problem in protein crystallography is to combine X-ray diffraction datasets taken from different crystals grown in the same conditions. This operation is required since only partial datasets can be taken from single crystals before they are damaged by X-ray irradiation. Thus, a complete dataset (sampling the whole crystal lattice in reciprocal space) can be recovered by merging partial datasets from different crystals. However, this can only be accomplished if the merged crystals have similar properties, i.e., comparable crystal cell dimensions and average crystal structures. To select such crystals, clustering protocols have been implemented to identify isomorphous clusters that may be scaled and merged to form a more complete multi-crystal dataset [82,83]. MA applications in this field have great potential, since protein crystallography experiments at X-ray free-electron laser (XFEL) sources are even more demanding in terms of dataset merging. Here, thousands to millions of partial datasets from very tiny crystals are acquired in a few seconds, and these need to be efficiently merged before further processing can provide structural information [84].

Conclusions and Perspectives
Multivariate methods in general and PCA approach in particular are widely used in analytical chemistry, but less diffused in materials science and rarely used in X-ray diffraction, even if their possible applications have been envisaged already in the 50s of the last century [74], in relation to the phase problem solution. More systematic MA applications in Crystallography started appearing in the literature since about 2000 on and in the last two decades some groups started working with a more systematic approach to explore potentialities and limitations of multivariate methods applications in crystallography, focusing the efforts mainly in powder diffraction data analysis. Among the various approaches, multilinear regression and PCA are the main methods showing huge potentialities in this context.
Two main fields of application are envisaged: e.g., fast on-site kinetic analysis and qualitative/semi-quantitative analysis from in situ X-ray powder diffraction data. In details, MA has the ability to extract the dynamics in time series (i.e., in measurements monitoring the time evolution of a crystal system) or qualitative and quantitative information in concentration gradients (i.e., in samples showing different composition). Typically, the presented methods have the advantages, with respect to traditional methods, of needing no or much less a priori information about the crystal structure and being so efficient and fast to be applied on site during experiment execution. The main drawbacks of MA approaches lie on the fact that they exploit data-driven methods and do not claim to provide physical meaning to the obtained results. Outputs from PCA, MCR or FA, if not properly driven or interpreted may lead to wrong or unreasonable results.
Applications in single-crystal diffraction have started appearing in recent years, and great potential is foreseen, especially in serial crystallography experiments, where the amount of data is too large for traditional approaches. Crystal monitoring and dataset merging are the most promising applications in single-crystal diffraction. Kinetic analysis in single-crystal diffraction is also envisaged as a possible breakthrough field of application, since the evolution of sources and detectors suggests huge growth in the next decade, as seen for in situ powder diffraction in the last decades of the 20th century. MA methods can certainly play a key role here, because the huge amount of data makes traditional approaches unfeasible.
This review is centered on PCA, MCR, FA and PSD, since they are the approaches employed in X-ray diffraction data analysis in the last decade. However, other MA approaches also have considerable potential in the field and are waiting to be tested on X-ray diffraction data. Statistical methods have revolutionized the field in the past, producing disruptive advances in the crystal structure determination of small molecules and biological macromolecules. We expect to be close to a similar turning point regarding applications of artificial intelligence to complex problems in basic crystallography, such as merging thousands of datasets from different crystals or assigning phase values to diffraction amplitudes starting from random values. Neural networks have great potential to tackle these problems and accomplish the non-linear deconvolution of multivariate data, provided that proper training is given.
The spread of MA methods is intimately connected with the availability to the scientific community of software with user-friendly graphical interfaces. Programs like Root-Prof [90] or the MCR-ALS 2.0 toolbox [91] have contributed to this goal, making MA available to scientists carrying out X-ray diffraction studies.
The possibility of exploiting automated processes (highly demanding in terms of computational resources, but very fast and requiring minimal user intervention) is highly attractive both for industrial applications, where X-ray diffraction is used for quality control and monitoring of production processes, and for fundamental research applications, where extremely fast data analysis tools are required to cope with very bright X-ray sources and detectors with fast readouts.