The Rietveld Refinement in the EXPO Software: A Powerful Tool at the End of the Elaborate Crystal Structure Solution Pathway

The Rietveld method is the most reliable and powerful tool for refining crystal structures when powder diffraction data are available. It requires that the structure model to be adjusted is as close as possible to the true structure. The Rietveld method usually represents the final step of the powder solution process, in particular when a new structure is going to be determined and published. EXPO is a software package able to execute all the steps of the solution process in a mostly automatic way, starting from the chemical formula and the experimental diffraction pattern, passing through computational methods for locating the structure model and optimizing it, and ending with the Rietveld refinement. In this contribution, we present the most recent solution strategies in EXPO, both in reciprocal and direct space, aimed at obtaining models suitable to be refined by the Rietveld method. Examples of Rietveld refinements are described, whose results are related to different solution procedures and types of compounds (organic and inorganic).


Introduction
The Rietveld method [1], born from the simple and brilliant idea of refining the crystal structure together with the parameters describing the diffraction profile by directly employing the profile intensities, has been one of the most innovative methods for studying materials from powder diffraction data and is still widely applied. It has given a great impetus to the process of crystal structure solution from powder diffraction data, expanding the fields of application of powder diffraction which, up to the end of the 1970s, was primarily used for qualitative and semiquantitative analysis. Powder diffraction without the Rietveld method would be much less popular.
The method was proposed in 1969 as best suited for neutron powder techniques and, in particular, for refining nuclear and magnetic structures; later, it was extended to X-ray powder diagrams and used for different kinds of analysis: structural, including lattice parameters, atomic positions and occupancies, temperature vibrations (isotropic and anisotropic); quantitative phase; grain size and micro-strain (isotropic and anisotropic); stacking and twin faults. The present paper focuses on the Rietveld method for structure refinement.
Crystals 2018, 8, 203
The two main steps in structural analysis are solving the structure and refining the model [2][3][4]. The single-crystal-like procedures for structure solution and refinement require that the estimates of the structure factor moduli derived from the diffraction experiment and associated to reflections are reliable. Indeed, in the case of a single crystal, this condition is well satisfied and the two steps are usually carried out in a straightforward manner. In particular, the experimental moduli are efficiently used as observations in the nonlinear least-squares calculation for refining the structural parameters.
In the case of powder diffraction, peak overlap, a background that is not always simple and easily described, preferred orientation and limited experimental resolution are concurrent problems that compromise, sometimes considerably, the process of extracting the structure factor moduli from the experimental powder diffraction pattern [5], and lower the reliability of the moduli estimates. In this scenario, using the experimental structure factor moduli as observations in the least-squares minimization for refining structural parameters can be unsuccessful, or at least it needs a powerful weighting scheme of reflections to compensate for the low accuracy of the extracted moduli [6].
The Rietveld method, which uses as observations the profile intensities of the full experimental pattern, is based on the minimization of the following quantity:
$$\sum_i w_i \left( Y_{oi} - Y_{ci} \right)^2 \qquad (1)$$
where the summation is over the number of experimental diffraction profile intensities, $w_i$ is a suitable weight associated to each observed intensity $Y_{oi}$ (usually equal to $1/Y_{oi}$) and $Y_{ci}$ is the corresponding calculated intensity. $Y_{ci}$ is described by analytical functions, which depend on structural and profile parameters, among which we mention: atomic coordinates, occupancies, displacement factors, lattice, background, peak shape, peak width, preferred orientation, sample displacement, sample transparency, and zero-shift error. These parameters are all variables to be refined by least squares. Due to the dependence on numerous variables, the minimization process can fall into false minima or diverge. Such a risk can be reduced and possibly avoided if the quality of the experimental diffraction pattern is good, peak and background are described by suitable functions and, especially, the structure model makes physical and chemical sense. For a good outcome of the Rietveld method, it is required that the structure model to be adjusted is as close as possible to the true one.
According to Equation (1), the progress of the Rietveld minimization can be monitored by figures of merit giving the agreement between the observed and calculated intensities, among which we mention the weighted profile agreement factor:
$$R_{wp} = \left[ \frac{\sum_i w_i \left( Y_{oi} - Y_{ci} \right)^2}{\sum_i w_i Y_{oi}^2} \right]^{1/2} \qquad (2)$$
A small $R_{wp}$ value is an indicator of successful minimization and, if the trap of a false minimum has been avoided, of a successful refinement and a reliable final structure model.
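As an illustration, the minimized quantity (the weighted sum of squared differences between observed and calculated profile intensities) and the weighted profile agreement factor can be computed in a few lines. The following is only a minimal sketch: the function names `rietveld_residual` and `r_wp` and the five-point toy profile are assumptions made for the example and are not part of EXPO.

```python
import math

def rietveld_residual(y_obs, y_calc, weights=None):
    """Weighted sum of squared differences between observed and calculated profile points."""
    if weights is None:
        # Usual choice quoted in the text: w_i = 1 / Y_oi (requires Y_oi > 0).
        weights = [1.0 / yo for yo in y_obs]
    return sum(w * (yo - yc) ** 2 for w, yo, yc in zip(weights, y_obs, y_calc))

def r_wp(y_obs, y_calc, weights=None):
    """Weighted profile agreement factor: sqrt(residual / sum(w_i * Y_oi^2))."""
    if weights is None:
        weights = [1.0 / yo for yo in y_obs]
    denominator = sum(w * yo ** 2 for w, yo in zip(weights, y_obs))
    return math.sqrt(rietveld_residual(y_obs, y_calc, weights) / denominator)

# Hypothetical toy profile: five observed counts and a slightly wrong model.
y_obs = [100.0, 400.0, 900.0, 400.0, 100.0]
y_calc = [110.0, 380.0, 930.0, 420.0, 90.0]
print("residual:", rietveld_residual(y_obs, y_calc))
print("R_wp:", r_wp(y_obs, y_calc))
```

With the weights $1/Y_{oi}$, each point of this toy profile contributes one unit to the residual, which makes the effect of the weighting easy to check by hand.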
Several computing programs have been developed for the Rietveld refinement [7]. Some of them are of general use, some are devoted to specific classes of structures (zeolites, polymers, etc.). Some packages are able to execute more than the Rietveld refinement alone. The software EXPO [8] belongs to this last category, being dedicated to the full pathway of structure solution by powder diffraction data. It needs only the experimental powder pattern (collected by conventional or synchrotron X-ray or neutron diffraction) and the knowledge of the chemical formula of the sample to be investigated. Supported by a high level of automatism, EXPO is able to:
• determine the cell parameters;
• identify the space group;
• solve the structure ab initio in the reciprocal space via direct methods (extract the integrated intensities from the experimental profile for estimating the observed structure factor modulus associated to each reflection, probabilistically evaluate the phases of reflections, calculate the electron density map for deriving the structure model, optimize the model);
• solve the structure in the direct space (search for the best model, compatible with the expected molecular geometry, by global optimization methods);
• refine the structure model by the Rietveld method.
The EXPO graphic interface can be used to follow the evolution of the solution process, to check the results obtained in each step, and to select, if necessary, alternative nonautomatic procedures.
Recovering a structure model that makes physical and chemical sense and is close to the true one is often an arduous task whether we adopt the reciprocal space solution or the direct space option. In the first case, the main difficulty lies in the low accuracy of the extracted structure factor moduli which are actively used in the phasing process (see Section 2.1); in the second case, the difficulty is in the choice of the geometrically compatible model and in the efficiency of the optimization method (see Section 2.2).
EXPO is supported by several powerful methodological and computational procedures, all aiming at providing a structure model worthy of being submitted to the last step of the solution pathway: the Rietveld refinement. In the present paper, we briefly describe the EXPO strategies and show examples of Rietveld refinement by EXPO.

The Crystal Structure Solution Process in EXPO
The solution process, which aims to obtain a complete structure model not far from the true one, is relevant for the good outcome of the Rietveld refinement. After the cell parameters and space group have been recognized, the most widely used approaches for solving structures by powder diffraction data are [9]:
(a) Reciprocal space (RS) methods, in particular direct methods (they are single-crystal-like). This is the standard solution option in EXPO. The RS methods are fast in execution and need minimal information, but they strongly depend on the quality of the structure factor moduli extracted from the experimental pattern and on the experimental resolution. They are usually effectively combined with structure model optimization procedures which move from reciprocal to direct space and vice versa and are based on Fourier map and least-squares calculations;
(b) Direct space (DS) methods, also called global optimization methods. They are trial-and-error approaches which use the profile intensities, do not depend on the structure factor moduli and do not require high experimental resolution, but they need a priori knowledge of the connectivity or coordination of the atoms in the structure under study and are usually time consuming.
In recent years, the complementary features of the RS and DS methods, well supported by the availability of synchrotron light sources and fast computers, the advances in experimental devices, and the development of innovative methods and computing algorithms, have contributed to increasing the number of structures successfully solved by powder data. The software package EXPO [8] is able to execute the solution process by ab initio methods and/or global optimization methods. The model generated by the solution process is then submitted to the Rietveld refinement.
If the model results from the ab initio methods, even if completed and optimized by one or more of the structure model optimization strategies included in EXPO, it is usually approximate (e.g., atom positions are not very close to the true ones, rings are distorted, etc.) and, often, not fully compatible with the conditions required by the Rietveld refinement. In such cases, information on the expected molecular geometry can be useful for restraining bond distances, angles and planes during the refinement process in order to reduce the possibility of falling into false minima.
The model derived from the global optimization methods benefits from the geometrical information on which it is based and is more suited to the Rietveld refinement, under the hypothesis that the assumed molecular geometry is not far from the true one.
A brief description of the two kinds of methods and of the main features of the Rietveld refinement available in the EXPO software follows.

Solution in Reciprocal Space
Structure solution approaches working in reciprocal space, known also as traditional or ab initio or phasing methods, like direct methods (DM) and Patterson methods [10], were first developed and widely applied to solve crystal structures from single-crystal data, and then effectively exported to powder data.
The reciprocal space (RS) methods require minimal a priori information (i.e., the experimental powder diffraction pattern and chemical formula in addition to cell parameters and space group) and less computing time. They need the integrated intensities I of the individual reflections to be extracted from the powder diffraction pattern. Two reference approaches are devoted to this purpose: the method proposed by Pawley [11] and the one by Le Bail and coworkers [12] (the latter is adopted by EXPO). The Pawley method treats the integrated intensities as variables to be refined (in addition to cell, profile and background parameters) in the least-squares procedure that minimizes the residual (1). The limit of the method is that it considers the I values as independent observations, an assumption that can lead to refinement instability and negative intensity values. The Le Bail method is inspired by a Rietveld formula that apportions the total observed intensity of a cluster of overlapping reflections proportionally to the calculated values of the single intensities. This approach proves extremely fast and convergent but tends to the equipartition of the overall observed intensity in the case of strongly overlapping reflections.
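The apportioning idea behind the Le Bail approach can be illustrated with a toy calculation. The sketch below is not the EXPO implementation: it assumes Gaussian peaks of known position and width, no background, and hypothetical helper names (`gaussian`, `le_bail_cycle`). At each profile point the observed intensity is shared among the reflections proportionally to their currently calculated contributions, and the cycle is repeated starting from equal intensities.

```python
import math

def gaussian(x, centre, fwhm):
    """Unit-area Gaussian peak profile (an assumed peak shape for this toy example)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def le_bail_cycle(x, y_obs, centres, fwhm, intensities):
    """One cycle: share each observed point among reflections in proportion
    to their calculated contributions, then integrate over the profile."""
    updated = [0.0] * len(centres)
    step = x[1] - x[0]  # uniform grid assumed
    for xi, yo in zip(x, y_obs):
        contributions = [inten * gaussian(xi, c, fwhm) for inten, c in zip(intensities, centres)]
        total = sum(contributions)
        if total <= 0.0:
            continue  # no reflection contributes at this point
        for k, contrib in enumerate(contributions):
            updated[k] += yo * contrib / total * step
    return updated

x = [0.02 * i for i in range(1001)]  # hypothetical 2-theta grid, 0 to 20 degrees

# Case 1: two partially overlapping reflections with true intensities 300 and 100.
centres = [9.0, 11.0]
y_obs = [300.0 * gaussian(xi, 9.0, 1.0) + 100.0 * gaussian(xi, 11.0, 1.0) for xi in x]
estimate = [1.0, 1.0]
for _ in range(50):
    estimate = le_bail_cycle(x, y_obs, centres, 1.0, estimate)
print("partially overlapping:", estimate)

# Case 2: two exactly coincident reflections; the observed total of 400 is split equally.
y_same = [400.0 * gaussian(xi, 10.0, 1.0) for xi in x]
coincident = le_bail_cycle(x, y_same, [10.0, 10.0], 1.0, [1.0, 1.0])
print("coincident:", coincident)
```

The second case shows the equipartition tendency mentioned above: when two reflections coincide, the profile carries no information for separating them, and the starting (equal) ratio is simply preserved.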
The dependence of the RS methods on the integrated intensities is a drawback, due to the unavoidable errors on the I values caused by the typical powder problems (i.e., overlap of reflections, wrong background description, preferred orientation effects, etc.; see Section 1). Additional critical points of the RS approaches are related to: (i) Data resolution. Experimental diffraction data far from atomic resolution can prevent the success of the structure solution process or lead to a poor electron density map that is difficult to interpret; (ii) Structure complexity. RS methods are usually able to solve successfully crystal structures with a number of atoms in the asymmetric unit ($N_{au}$) up to about 40 [13]. $N_{au} \gg 40$ is a challenging task for RS methods.
Because of the mentioned limits, the RS methods often do not provide a complete structure model with well-located atom positions: some atoms are missing, and some others are far or very far from the true positions. Since the success of the Rietveld method is critically dependent on the completeness and accuracy of the structure model, several advanced procedures have been developed and introduced in EXPO for enhancing the power of DM and the quality of the obtained structure model. We briefly describe the basic principles of direct methods used by EXPO for obtaining the structure model and its main strategies for optimizing the model up to Rietveld method expectations.

• Direct methods basic principles
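In the solution process by RS methods, the crystal structure is determined by calculating the electron density given by the inverse Fourier transform of the structure factors $F_{\mathbf{h}}$:
$$\rho(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} F_{\mathbf{h}} \exp(-2\pi i\, \mathbf{h} \cdot \mathbf{r}) \qquad (3)$$
where $V$ is the unit cell volume. The diffraction experiment provides, for each reflection $\mathbf{h}$, the integrated intensity $I_{\mathbf{h}}$, which is directly proportional to $|F_{\mathbf{h}}|^2$. The structure factor $F_{\mathbf{h}}$ is a complex quantity whose phase $\varphi_{\mathbf{h}}$ is not obtained by the diffraction experiment [10].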
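The role of the phases can be made concrete with a one-dimensional toy example. The sketch below is purely illustrative: it assumes three point atoms with constant scattering power in a cell of unit length, and the names `structure_factor` and `density` are hypothetical. The Fourier synthesis with the correct phases peaks at the heaviest atom, whereas a synthesis that uses the moduli alone (all phases set to zero) peaks at the origin, where no atom is located.

```python
import cmath
import math

def structure_factor(h, atoms):
    """F_h for point atoms given as (fractional coordinate, scattering power) pairs."""
    return sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)

def density(x, factors, cell_length=1.0):
    """One-dimensional Fourier synthesis of the density at fractional coordinate x."""
    total = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items())
    return total.real / cell_length

# Hypothetical structure: one heavier atom at x = 0.20 and two lighter ones.
atoms = [(0.20, 8.0), (0.55, 6.0), (0.80, 6.0)]
h_max = 10  # limited resolution: only reflections with |h| <= 10 are used
factors = {h: structure_factor(h, atoms) for h in range(-h_max, h_max + 1)}

grid = [i / 200 for i in range(200)]
true_map = [density(x, factors) for x in grid]

# Same moduli, but all phases set to zero (the phase information is discarded).
moduli_only = {h: abs(f) for h, f in factors.items()}
zero_phase_map = [density(x, moduli_only) for x in grid]

x_true = grid[true_map.index(max(true_map))]
x_zero = grid[zero_phase_map.index(max(zero_phase_map))]
print("highest peak with correct phases:", x_true)
print("highest peak with zero phases:", x_zero)
```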
Direct methods are phasing methods able to estimate $\varphi_{\mathbf{h}}$. The basic idea of DM is that the information on phases is contained in the experimental structure factor moduli, and can be derived directly (as their name suggests) from them via mathematical relationships.
The two classical assumptions of DM, the positivity and the atomicity of the electron density, lead to algebraic relationships involving structure factors [14,15], from which information on phases can be recovered. Wilson [16,17] opened the door to the probabilistic approach based on an additional third assumption: the atomic positions are random variables uniformly distributed in the unit cell. Since the experimental diffraction intensities are not on an absolute scale and depend on the scattering angle θ, Wilson proposed a method, known as the Wilson plot method, that, starting from the experimental structure factor moduli, is able to determine a scale factor K (so that the experimental $|F_{\mathbf{h}}|$ can be put on an absolute scale) and the average isotropic displacement parameter B. Once K and B have been estimated, it is possible to calculate from $|F_{\mathbf{h}}|$ the normalized structure factor moduli $|E_{\mathbf{h}}|$, which are key magnitudes for DM. They have the great advantage of being independent of θ and correspond to an idealized point atom structure.
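A Wilson-type estimate of K and B amounts to a straight-line fit. The sketch below assumes one common convention, $K \langle |F_{obs}|^2 \rangle = \Sigma \exp(-2 B s^2)$ with $s = \sin\theta/\lambda$ and $\Sigma$ the sum of the squared atomic scattering factors, so that $\ln(\langle |F_{obs}|^2 \rangle / \Sigma)$ plotted against $s^2$ has slope $-2B$ and intercept $-\ln K$. Other conventions for K exist, the statistical weight of each reflection is ignored, and the names `wilson_fit` and `normalized_modulus` as well as the synthetic shell averages are hypothetical.

```python
import math

def wilson_fit(s2_values, mean_f2_over_sigma):
    """Least-squares line through ln(<|F_obs|^2>/Sigma) versus s^2.
    Returns (K, B) under the convention K <|F_obs|^2> = Sigma exp(-2 B s^2)."""
    ys = [math.log(v) for v in mean_f2_over_sigma]
    n = len(s2_values)
    mean_x = sum(s2_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in s2_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s2_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(-intercept), -slope / 2.0

def normalized_modulus(f_obs, s2, sigma, scale, b_iso):
    """|E| = sqrt(K |F_obs|^2 / (Sigma exp(-2 B s^2))), reflection weight taken as 1."""
    return math.sqrt(scale * f_obs ** 2 / (sigma * math.exp(-2.0 * b_iso * s2)))

# Synthetic resolution shells generated with K = 4 and B = 3 (arbitrary test values).
true_k, true_b = 4.0, 3.0
shells = [0.02, 0.05, 0.10, 0.15, 0.20]
averages = [math.exp(-2.0 * true_b * s2) / true_k for s2 in shells]

scale, b_iso = wilson_fit(shells, averages)
print("K =", scale, "B =", b_iso)
```

Because the resulting $|E_{\mathbf{h}}|$ values have the θ dependence removed, a reflection whose $|F_{obs}|^2$ equals the shell average has $|E| = 1$ at any resolution.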
In the case of powder data, non-negligible errors on $I_{\mathbf{h}}$ can result in poor accuracy on $|F_{\mathbf{h}}|$ and, consequently, on $|E_{\mathbf{h}}|$, as well as in an unreliable phasing process and an approximate result from Equation (3).
EXPO is able to automatically carry out all the following main steps of DM:
(1) Normalization: the integrated intensities are normalized by the Wilson method and the $|E_{\mathbf{h}}|$ values are calculated. Statistical analysis on $|E_{\mathbf{h}}|$ is performed in order to detect: (a) the presence or absence of an inversion center; (b) the possible presence and type of pseudotranslational symmetry [18]; (c) the preferred orientation effects [19]. The reflections with the largest $|E_{\mathbf{h}}|$ values, the so-called strong reflections, whose number is $N_{larg}$, are considered because they strongly contribute to the DM phasing process (see point 2). The default number of strong reflections to be phased ($N_{phas}$) is automatically calculated by EXPO by taking into account the number of atoms in the asymmetric unit and the type of symmetry. $N_{phas}$ should be at most $N_{larg}$; if $N_{phas} > N_{larg}$, EXPO sets $N_{phas} = N_{larg}$;
(2) Structure invariants (s.i.) calculation: s.i. are magnitudes that are independent of a change of the origin and depend only on the structure. They are fundamental in the phasing process. For example, a s.i. of order n is the product of n normalized structure factors $E_{\mathbf{h}_1} E_{\mathbf{h}_2} \cdots E_{\mathbf{h}_n}$ with $\mathbf{h}_1 + \mathbf{h}_2 + \dots + \mathbf{h}_n = 0$. In the phasing process, a special role is occupied by triplet invariants (s.i. of order 3), whose phase $\Phi_{\mathbf{h},\mathbf{k}}$ follows a probability distribution $P_{10}(\Phi_{\mathbf{h},\mathbf{k}})$ controlled by a concentration parameter $G_{\mathbf{h},\mathbf{k}}$. $G_{\mathbf{h},\mathbf{k}}$ depends on the moduli of the normalized structure factors involved (it is large for strong reflections) and, in the case of a crystal structure with N non-H equal atoms in the unit cell, is inversely proportional to $\sqrt{N}$: as N increases, $G_{\mathbf{h},\mathbf{k}}$ becomes negligible and, consequently, the probability of DM failure increases. $G_{\mathbf{h},\mathbf{k}}$ can be positive or negative; if positive, $P_{10}(\Phi_{\mathbf{h},\mathbf{k}})$ attains its maximum at $\Phi_{\mathbf{h},\mathbf{k}} \approx 0$ (i.e., positive triplets); if negative, $P_{10}(\Phi_{\mathbf{h},\mathbf{k}})$ is maximum at $\Phi_{\mathbf{h},\mathbf{k}} \approx \pi$ (i.e., negative triplets). The positive triplets with $G_{\mathbf{h},\mathbf{k}} \geq 0.6$ are stored by EXPO and are strongly involved in the phasing process;
(3) Phases estimate: a milestone for DM is the tangent formula, proposed by Karle & Hauptman [21], which derives the phases of the $N_{phas}$ reflections starting from a subset of selected reflections (the so-called starting set) and actively uses the triplets in which the reflections to be phased are embedded. The phases of the starting set can be set via a multisolution method based on magic integers [22] (this is the default choice of EXPO), or, alternatively, by a random approach starting with random phase values. DM provide several possible sets of phases that are ranked according to a suitable mathematical tool, the combined figure of merit (CFOM) [23,24], mainly based on the |E| values. The phase set with the largest CFOM value should correspond to the correct solution. When it does not provide a plausible and chemically interpretable structure solution, EXPO offers the graphic option to conveniently explore all the generated and stored phasing sets (their number is usually 20);
(4) Electron density map calculation: the calculation of Equation (3) is carried out on the phase set with the largest CFOM value or on some or all of the sets generated and stored by DM. The interpretation of $\rho(\mathbf{r})$ in terms of positions and intensities of its peaks supplies the fractional coordinates and the chemical labels of the atoms in the structure.
Because of uncertainties on the experimental structure factor moduli extracted from the powder profile, the entire DM process can be unsatisfactory and the final structure model approximate. Consequently, the completion and/or optimization of the DM structure model is a fundamental requirement for a successful and meaningful application of the Rietveld method.
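The behaviour of the triplet distribution and of the tangent formula described in the steps above can be sketched numerically. The example below makes several assumptions that are not taken from EXPO: the triplet phase is taken as $\Phi_{\mathbf{h},\mathbf{k}} = \varphi_{\mathbf{h}} - \varphi_{\mathbf{k}} - \varphi_{\mathbf{h}-\mathbf{k}}$, its distribution is given the von Mises form $\exp(G \cos\Phi) / [2\pi I_0(G)]$ (the usual shape of Cochran-type formulas; the specific expression of G is not reproduced here), and the function names and numerical values are hypothetical.

```python
import math

def bessel_i0(x, terms=40):
    """Modified Bessel function of order zero, by its power series."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (x / 2.0) ** 2 / (m * m)
        total += term
    return total

def triplet_probability(phi, g):
    """Assumed von Mises form of the triplet phase distribution with concentration g."""
    return math.exp(g * math.cos(phi)) / (2.0 * math.pi * bessel_i0(g))

def tangent_formula(triplets):
    """Estimate phi_h from triplets given as (G, phi_k, phi_(h-k)).
    Returns the phase estimate and the modulus of the weighted sum,
    which grows when the individual indications agree."""
    num = sum(g * math.sin(pk + phk) for g, pk, phk in triplets)
    den = sum(g * math.cos(pk + phk) for g, pk, phk in triplets)
    return math.atan2(num, den), math.hypot(num, den)

# A positive G favours Phi near 0, a negative G favours Phi near pi.
print(triplet_probability(0.0, 2.5) > triplet_probability(math.pi, 2.5))
print(triplet_probability(math.pi, -1.0) > triplet_probability(0.0, -1.0))

# Three hypothetical triplets that all indicate phi_h close to 1.0 rad.
triplets = [(2.0, 0.3, 0.7), (1.5, 1.2, -0.2), (1.0, 0.5, 0.5)]
phase, reliability = tangent_formula(triplets)
print("estimated phase:", phase, "reliability:", reliability)
```

In this toy case the three indications coincide, so the modulus of the weighted sum equals the sum of the G values; contradictory indications would reduce it.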

• Model optimization
EXPO proposes procedures, some applied by default and others on request when the standard approaches fail, all aiming to improve the structure model obtained by DM in such a way that the optimized structure is close to the true solution and appropriate for the Rietveld refinement. The optimization strategies are:
(1) WLSQFR (weighted least-squares Fourier recycling) [6], which consists of a two-step approach alternating suitably weighted least-squares refinement (aiming to minimize the weighted squared difference between the observed and calculated intensities) and Fourier map calculations, which add missing atom positions to the refined model. The weights take into account the low accuracy of the integrated intensity estimates of the overlapping reflections and tend to prevent the domination of the refinement process by the reflections with the largest intensities. This optimization tool is automatically applied in a default run of EXPO in the case of inorganic compounds.
(2) RBM (resolution bias modification). The procedure can work in direct or reciprocal space or in both spaces [25][26][27][28][29] and is a powerful approach able to reduce the errors on the electron density map mainly due to the limited experimental resolution. RBM is able to discard false peaks and recover the missing ones. It has proved particularly effective for organic and metal-organic compounds (for them it is the default choice of EXPO).
(3) COVMAP [30], a procedure aiming to correct the electron density map, regardless of its quality, by exploiting the principle of covariance between two points of the map. It can locate the missing atoms by modifying the electron density map taking into account some basic crystal chemistry rules, in particular the expected bond distances between pairs of atoms.
(4) Shift_and_Fix [31], the most recently developed approach introduced in EXPO, based on the optimization of the models derived from all the stored DM phasing sets, which are automatically and sequentially processed and analyzed. Shift_and_Fix consists of two main steps, cyclically combined: the Shift step, which produces a shifted model, and the Fix step, which carries out a weighted least-squares refinement of the shifted model, followed by Fourier map calculations whose coefficients are functions of the chemical content of the compound under study.
COVMAP and Shift_and_Fix are not default procedures of EXPO and can be fruitfully applied to organic, inorganic and metal-organic crystal structures.
If hydrogen atoms are present in the structure under investigation (the solution process by RS methods is not able to detect them), the EXPO graphical interface provides computational tools for positioning them geometrically (if possible) and/or detecting them by nonstandard Fourier map calculation (i.e., difference Fourier map).

Solution in Direct Space
Because of the unavoidable problems of the powder diffraction pattern (peak overlapping, background, preferred orientation), responsible for ambiguities in the integrated intensities of individual reflections, the reciprocal space methods are not always able to reach a correct solution or at least a solution satisfactory for the Rietveld refinement. An alternative to the RS approaches is offered by the direct space (DS) methods, in which the fit of the calculated versus the experimental powder pattern is performed by moving trial molecular models inside the unit cell.
By avoiding the pattern decomposition into single integrated intensities, the DS approaches overcome the limit of the strong dependence of the RS methods on the quality of the structure factor moduli extracted from the profile and on the experimental resolution. The principal reason for their success is the incorporation of the prior knowledge on the expected molecular geometry of the compound under study. However, this requirement also represents the main limitation of the technique: the full molecular structure must be managed if the correct solution is to be obtained.
In the DS approaches, a starting model is assumed with well-characterized bond lengths and bond angles, typically taken as known and kept at standard values during the solution process (this assumption is usually well founded for organic compounds). The structural variables describing the starting model are thus restricted to the external (position and orientation) plus the internal (torsion angles) degrees of freedom (DOFs), whose values cannot be determined a priori. Several random trial structures are generated from the starting model by varying the DOFs describing the model location and orientation in the unit cell and its internal conformation.
The quality of each trial model is evaluated via a cost function (CF) that compares the powder diffraction pattern calculated from the current structure with the experimental one.The direct space strategy is to find the trial model corresponding to the lowest CF value, which is equivalent to exploring the hypersurface of CF(DOFs) to find the set of DOFs that define the structure and correspond to the global minimum in CF(DOFs).
Several global optimization algorithms have been adopted to attain this objective [32][33][34] and successfully applied for solving structures of organic, inorganic and organometallic materials. Their common limit is the computational time consumption, particularly when the number of DOFs is large. Among the DS approaches, the most widely used is Simulated Annealing (SA) [35][36][37][38][39].
Two alternative DS structure solution algorithms have been implemented in EXPO: (1) a classical Simulated Annealing (default choice); (2) the more innovative GHBB-BC global optimization method [40]. They both require a starting trial structure model compatible with the expected molecular connectivity information, which can be built by using a molecular editor program or retrieved from the literature if a similar molecule has already been published. The hydrogen atoms (their contribution to X-ray scattering is weak) present in the starting trial model can be omitted in the DS process in order to reduce the number of DOFs and the computational time, which increases with the number of atoms. EXPO is able to read the starting model in different formats: MDL Molfile, MOPAC file, Tripos Sybyl file, Crystallographic Information File (CIF), Protein Data Bank (PDB) file, XYZ format, etc. (see the EXPO manual for more details). EXPO itself offers graphical tools for adding building blocks (tetrahedron, octahedron, square plane, cube, trigonal plane, tetragonal antiprism, trigonal prism, icosahedron, isolated atom).
The default solution process by DS methods consists of ten runs, which are automatically performed. During each run, a visual comparison of the observed, calculated and difference profiles, together with the current trial structure model, is plotted. A plot of the CF value as the process evolves can also be monitored.
During each run, EXPO changes the torsion angles (automatically identified by the program), the orientation, and the position of the trial model in the unit cell, while bond lengths, bond angles and ring conformation are not considered as variables. For this reason, the assumed lengths and angles should match as closely as possible the true values of the studied compound if incorrect results are to be avoided. At the end of the automatic procedure, the ten solutions are ranked according to the CF value, saved in CIF files and graphically shown.
Two cost functions can be alternatively selected for driving the process towards the best structure solution:
(1) the weighted profile reliability parameter $R_{wp}$ (see Equation (2)), which represents the default choice;
(2) the agreement factor $R_I$, which compares the experimental integrated intensities $I_{\mathbf{h}}(obs)$ and the intensities $I_{\mathbf{h}}(calc)$ calculated from the model:
$$R_I = \frac{\sum_{\mathbf{h}} \left| I_{\mathbf{h}}(obs) - I_{\mathbf{h}}(calc) \right|}{\sum_{\mathbf{h}} I_{\mathbf{h}}(obs)} \qquad (4)$$
Even if Equation (4) requires the determination of the integrated intensities, the advantage of its use (instead of $R_{wp}$) is the considerably reduced computing time; however, if the reflection overlap is severe, the $I_{\mathbf{h}}(obs)$ values can be unreliable, and the use of $R_{wp}$ is preferred.
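The intensity-based cost function is cheap because it involves one term per reflection rather than one term per profile point. A minimal sketch follows, assuming the commonly used form of the agreement factor (sum of absolute differences divided by the sum of the observed intensities) and calculated intensities already on the same scale as the observed ones; the name `r_intensity` and the numbers are hypothetical.

```python
def r_intensity(i_obs, i_calc):
    """Agreement factor between observed and calculated integrated intensities
    (assumed form: sum |I_obs - I_calc| / sum I_obs, common scale assumed)."""
    return sum(abs(o - c) for o, c in zip(i_obs, i_calc)) / sum(i_obs)

# Hypothetical extracted intensities and the values given by a trial model.
i_obs = [100.0, 50.0, 25.0]
i_calc = [90.0, 55.0, 30.0]
print("R_I:", r_intensity(i_obs, i_calc))
```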
The most relevant aspects of SA and GHBB-BC in EXPO are hereinafter reported.

• Simulated Annealing
Simulated Annealing is an iterative metaheuristic algorithm widely used to address discrete and continuous optimization problems. Its main advantage is that, during the explorative walk on the CF hypersurface via the Monte Carlo method [33,41,42], uphill moves are allowed, enabling the trial model to escape from local minima in search of the global minimum. On the other hand, one shortcoming of SA is the high computational cost, which depends strongly on the chosen annealing schedule that controls the temperature (T) parameter. Indeed, if T is reduced too rapidly, a premature convergence to a local minimum may occur; in contrast, if it is reduced too gradually, the algorithm is very slow to converge.
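The acceptance of uphill moves and the role of the annealing schedule can be illustrated with a generic Metropolis-type loop. This is only a sketch of the general technique, not the schedule implemented in EXPO: the geometric cooling, the step size, the toy cost function with many local minima (standing in for a profile-based CF) and all names are assumptions.

```python
import math
import random

def toy_cost(dofs):
    """Hypothetical cost surface: global minimum 0 at dofs = (1.0, -0.5), plus ripples."""
    target = (1.0, -0.5)
    return sum((d - t) ** 2 + 0.5 * (1.0 - math.cos(5.0 * (d - t))) for d, t in zip(dofs, target))

def simulated_annealing(cost, start, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.95, moves_per_t=100, seed=0):
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_t):
            # Random move of all degrees of freedom.
            trial = [d + rng.uniform(-step, step) for d in current]
            delta = cost(trial) - current_cost
            # Downhill moves are always accepted; uphill moves with probability exp(-delta/T).
            if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = trial, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, best_cost

start = [3.0, 3.0]
best, best_cost = simulated_annealing(toy_cost, start)
print("best DOFs:", best, "cost:", best_cost)
```

Lowering `cooling` makes the run faster but increases the risk of freezing in one of the ripples, which mirrors the trade-off described above.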
At the beginning of the SA process implemented in EXPO, a dialog window reporting the general settings of the procedure (i.e., cost function, experimental resolution, random seed, temperature, etc.) is shown. All the control parameters are automatically set by the program to execute an annealing schedule which represents a compromise between the requirements of maximizing the efficiency of the algorithm and minimizing the total execution time. If necessary, all the parameters can be modified by the user to better deal with the problem under study.
The SA user-friendly graphic interface of EXPO is supported by useful tools for visualizing and checking the various steps of the solution process and, if needed, modifying the default choices.
The dynamical occupancy correction [39] represents another important available option, which is very useful when DS methods are applied to nonmolecular crystals for which some atoms are expected to occupy special positions and different building blocks can share one or more atoms.This option can be activated graphically.
The computational time required for complex structures, in general for compounds with more than 15-20 DOFs, represents a limit for crystal structure solution by DS: a large number of SA moves per run and a large number of runs are required to find the global minimum reliably and to increase the frequency of correct solutions. Fortunately, this type of calculation can be easily distributed among several CPUs: a parallel version of EXPO, based on the Message Passing Interface (MPI) parallelization paradigm, has been developed and is being improved; it is available for parallel machines (ordinary laptops and desktop PCs, supercomputers) running Linux operating systems.

• GHBB-BC Method
The recently proposed GHBB-BC algorithm has been developed for improving the features of the DS search methods as well as saving computational time. Its capabilities have been optimized for the application to compounds with fewer than six torsion angles and not more than two fragments in the asymmetric unit. It results from a proper combination of three DS approaches:
(1) The Big Bang-Big Crunch global optimization method (BB-BC) [43]. It is inspired by one of the cosmological theories of the universe and involves two phases: (i) the Big Bang, corresponding to the disorder caused by energy dissipation, in which a completely random population is generated; (ii) the Big Crunch, corresponding to the order due to gravitational attraction, in which the population shrinks to a single good quality element represented by the centre of mass, for converging to a global optimum point.
(2) The metaheuristic Greedy Randomized Adaptive Search Procedure (GRASP) [44]. It is an iterative approach, particularly effective in finding empirically good quality solutions in a reasonable computational time for most of the real-world combinatorial optimization problems that are computationally difficult and have enormous solution spaces. Each GRASP iteration is made up of two phases: construction and local search. The construction phase progressively builds a set of feasible solutions from scratch; the local search phase investigates their neighbourhood until a local minimum is found. The best overall solution is kept as the final result.
(3) The traditional Simulated Annealing.
The GHBB-BC computational procedure starts from an expected external structure model and performs a number of possible iterations defined according to the number of structure fragments and torsion angles present in the model. The general iterative scheme can be summarized as follows (see [40] for details): the Big Bang phase creates the initial random population, whose elements are evaluated by their corresponding $R_{wp}$ values; cycles of GRASP follow, improving the population according to an effective sample; the Big Crunch phase selects three representative population elements, which are then considered as centres of mass from which a new Big Bang phase can restart; at most three population elements are conveniently chosen and submitted to SA optimization; the global optimum is obtained by picking the best population element, corresponding to the minimum $R_{wp}$ value, after it has been carefully checked; the best model is accepted and possibly used for starting a new iteration.
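The Big Bang-Big Crunch component alone can be sketched in a generic form. The code below is not GHBB-BC as implemented in EXPO: the GRASP and SA stages are omitted, a single centre of mass weighted by the inverse of the cost is used (instead of three representative elements), the spread of each new population shrinks as 1/k with the iteration number k, and the cost function and all names are hypothetical.

```python
import random

def toy_cost(point):
    """Hypothetical non-negative cost with its minimum at (1.0, 1.0)."""
    return sum((v - 1.0) ** 2 for v in point)

def big_bang_big_crunch(cost, bounds, population=30, iterations=40, seed=0):
    rng = random.Random(seed)
    # Big Bang: completely random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(population)]
    best, best_cost, history = None, float("inf"), []
    for k in range(1, iterations + 1):
        costs = [cost(p) for p in pop]
        for p, c in zip(pop, costs):
            if c < best_cost:
                best, best_cost = list(p), c
        history.append(best_cost)
        # Big Crunch: centre of mass, each element weighted by 1/cost.
        weights = [1.0 / (c + 1e-12) for c in costs]
        total = sum(weights)
        centre = [sum(w * p[d] for w, p in zip(weights, pop)) / total
                  for d in range(len(bounds))]
        # New Big Bang around the centre of mass, with a spread shrinking as 1/k.
        pop = [[min(hi, max(lo, centre[d] + rng.gauss(0.0, 1.0) * (hi - lo) / k))
                for d, (lo, hi) in enumerate(bounds)]
               for _ in range(population)]
    return best, best_cost, history

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_cost, history = big_bang_big_crunch(toy_cost, bounds)
print("best point:", best, "cost:", best_cost)
```

The returned `history` records the best cost found up to each iteration, so it can be plotted to monitor convergence in the same spirit as the CF plot mentioned for the DS runs.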
To complete the framework of the most widely used approaches to structure solution from powder data, the hybrid methods [45-48] should be considered. They result from the combination of RS and DS approaches, to take advantage of the best features of the two methods. Of particular interest are two hybrid nonstandard procedures implemented in EXPO and obtained by combining Direct Methods with SA [49,50] and Monte Carlo methods [51,52], respectively.
As a final step of the solution process, the model obtained by the application of one of the DS algorithms in EXPO is submitted to the Rietveld method. The requirement that it be a reasonably good model, so that the Rietveld refinement succeeds and converges to the correct solution, is satisfied only if the prior information about the expected molecular geometry agrees with the true one. This is why the DS methods are very popular for solving organic compounds: for them, it is more difficult to fail in building the prior model. A well formulated global optimization approach is equivalent to a 'global Rietveld refinement' [53].
Compared with the RS approaches, which usually provide models so poor and far from chemically reasonable that their optimization is mandatory, the DS methods can be more profitable both for structure solution and for the Rietveld refinement (provided that the input molecular model is accurate). Critical points are: compounds with more than one chiral center and cis/trans isomerism, a nonplanar ring system, or an unusual combination of elements in functional groups. Experience and chemical intuition are certainly required to build the starting model.

The Rietveld Refinement
The paper by McCusker and coworkers [54] is very informative on the practical aspects of the Rietveld refinement, in particular on the usually involved parameters and strategy for their variation.
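EXPO is able to carry out Rietveld refinements from X-ray or neutron powder diffraction data. Its main features are briefly described. The following parameters can be adjusted during the EXPO Rietveld execution:
(1) Parameters for correcting the systematic line-shift errors due to sample displacement, sample transparency, and zero-shift.
(2) Background parameters. The background is automatically described by empirical functions: the classical polynomial function, the Chebyshev polynomial function, and the cosine Fourier series. It can also be modelled by a mouse-click selection of points interpolated by the best fitting background curve.
(3) Parameters related to integrated intensities: scale factor and preferred orientation. Correction for the preferred orientation can be achieved by the March-Dollase function [55].
(4) Profile parameters: full width at half maximum of the peak shape and peak asymmetry. The available peak shape functions are: pseudo-Voigt, Pearson-VII, and modified Thompson-Cox-Hastings pseudo-Voigt. The correction for the peak asymmetry is applied by using the semiempirical function given in [56]. The Kα1 and Kα2 peak doublet, if present, can be modelled.
(5) Crystal structure parameters: lattice parameters, atomic fractional coordinates, occupation factors, and isotropic displacement parameters. Atomic displacement parameters can be refined individually or in a group of atoms with the same atomic type or the same environment.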
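Two of the functions named among the adjustable parameters, the pseudo-Voigt peak shape and the March-Dollase preferred-orientation correction, have simple closed forms. The sketch below uses their standard textbook definitions; the function names and argument conventions are assumptions, and the exact parameterization adopted inside EXPO may differ.

```python
import math


def pseudo_voigt(x, fwhm, eta):
    """Area-normalised pseudo-Voigt centred at 0.

    x    : distance from the peak position (same units as fwhm)
    fwhm : full width at half maximum, shared by both components
    eta  : Lorentzian fraction, between 0 (Gaussian) and 1 (Lorentzian)
    """
    lorentz = (2.0 / (math.pi * fwhm)) / (1.0 + 4.0 * (x / fwhm) ** 2)
    gauss = ((2.0 / fwhm) * math.sqrt(math.log(2.0) / math.pi)
             * math.exp(-4.0 * math.log(2.0) * (x / fwhm) ** 2))
    return eta * lorentz + (1.0 - eta) * gauss


def march_dollase(alpha_deg, r):
    """March-Dollase intensity correction factor.

    alpha_deg : angle between the reflection vector and the
                preferred-orientation direction, in degrees
    r         : March coefficient (r = 1 means no preferred orientation)
    """
    a = math.radians(alpha_deg)
    return (r ** 2 * math.cos(a) ** 2 + math.sin(a) ** 2 / r) ** -1.5
```

By construction the pseudo-Voigt falls to half of its maximum at x = fwhm/2 for any eta, and the March-Dollase factor equals 1 for every angle when r = 1.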
The nonlinear least-squares minimization is carried out by the damped Gauss-Newton method. A backtracking line-search procedure based on cubic interpolation automatically calculates the damping factor applied to the parameter shifts, in order to ensure a descent direction in each refinement cycle and to prevent divergence [57]. Convergence is reached when the parameter shifts become small with respect to their standard deviations or when the relative gradient of the minimized χ² function given by (1) is less than a tolerance value. The tolerance value and the maximum number of cycles can be modified by the user.
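The idea of damping a Gauss-Newton shift by backtracking can be sketched on a two-parameter toy problem. This is a simplification under stated assumptions, not the EXPO code: the damping factor is found by simple halving instead of the cubic interpolation described above, unit weights are used, convergence is tested on the size of the applied shift only, and the exponential `model` and all function names are hypothetical.

```python
import math


def model(params, x):
    """Toy two-parameter model y = a * exp(-b * x)."""
    a, b = params
    return a * math.exp(-b * x)


def chi2(params, xs, ys):
    """Sum of squared residuals (unit weights assumed)."""
    return sum((y - model(params, x)) ** 2 for x, y in zip(xs, ys))


def full_shift(params, xs, ys):
    """Solve the 2x2 normal equations (J^T J) d = J^T r for the shift d."""
    a, b = params
    s11 = s12 = s22 = g1 = g2 = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(-b * x)
        j1, j2 = e, -a * x * e        # d(model)/da and d(model)/db
        r = y - a * e                 # residual
        s11 += j1 * j1
        s12 += j1 * j2
        s22 += j2 * j2
        g1 += j1 * r
        g2 += j2 * r
    det = s11 * s22 - s12 * s12
    return ((s22 * g1 - s12 * g2) / det, (s11 * g2 - s12 * g1) / det)


def refine(params, xs, ys, max_cycles=50, tol=1e-10):
    """Damped Gauss-Newton: halve the damping until chi2 decreases."""
    params = tuple(params)
    current = chi2(params, xs, ys)
    for _ in range(max_cycles):
        shift = full_shift(params, xs, ys)
        damping = 1.0
        while damping > 1e-8:
            trial = tuple(p + damping * d for p, d in zip(params, shift))
            trial_chi2 = chi2(trial, xs, ys)
            if trial_chi2 < current:
                break                 # acceptable step found
            damping *= 0.5            # backtrack
        else:
            break                     # no step lowers chi2: stop
        params, current = trial, trial_chi2
        if max(abs(damping * d) for d in shift) < tol:
            break                     # applied shift is negligible
    return params, current


xs = [0.25 * i for i in range(13)]
ys = [2.5 * math.exp(-1.3 * x) for x in xs]   # noise-free synthetic data
fitted, final_chi2 = refine((1.0, 0.5), xs, ys)
```

Because the damping factor is accepted only when χ² decreases, every cycle moves downhill even when the full Gauss-Newton shift would overshoot, which is the property the line search is meant to guarantee.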
The Le Bail technique [12] can be adopted to perform a full pattern decomposition prior to the Rietveld refinement in order to determine the starting values of the parameters (background, peak shape, line-shift corrections and unit cell dimensions), which are then subjected to refinement. This strategy is suggested especially if the available structure model is incomplete [58] or when the starting model is too different from the target model.
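The core of a Le Bail decomposition is an iterative partition of each observed profile point among the overlapping reflections, in proportion to their current calculated contributions. The sketch below shows only this intensity-extraction step, assuming fixed Gaussian peak shapes, known peak positions, zero background and noise-free synthetic data; in a real decomposition the profile, background and cell parameters are refined as well. All names and numerical values are illustrative.

```python
import math


def gaussian_profile(grid, centre, fwhm):
    """Gaussian peak sampled on the grid and normalised to unit sum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    raw = [math.exp(-0.5 * ((x - centre) / sigma) ** 2) for x in grid]
    total = sum(raw)
    return [v / total for v in raw]


def le_bail_intensities(y_obs, profiles, n_cycles=100):
    """Iteratively extract integrated intensities of overlapping peaks."""
    intensities = [1.0] * len(profiles)      # arbitrary equal start
    points = range(len(y_obs))
    for _ in range(n_cycles):
        # Calculated pattern from the current intensities.
        y_calc = [sum(i_k * prof[i] for i_k, prof in zip(intensities, profiles))
                  for i in points]
        # Share each observed point among the peaks according to their
        # fractional contribution to the calculated pattern.
        intensities = [
            sum(y_obs[i] * i_k * prof[i] / y_calc[i]
                for i in points if y_calc[i] > 0.0)
            for i_k, prof in zip(intensities, profiles)
        ]
    return intensities


# Two partially overlapping peaks with illustrative intensities 100 and 40.
grid = [9.0 + 0.01 * i for i in range(301)]
profiles = [gaussian_profile(grid, 10.0, 0.3),
            gaussian_profile(grid, 10.4, 0.3)]
true_intensities = [100.0, 40.0]
y_obs = [sum(t * p[i] for t, p in zip(true_intensities, profiles))
         for i in range(len(grid))]
extracted = le_bail_intensities(y_obs, profiles)
```

On this synthetic pattern the extracted intensities approach the values used to build it, even though the iteration starts from equal intensities and uses no structural model.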
The refinement process can be executed by following two alternative approaches: (1) the user can decide the refinement strategy via the graphic interface; (2) an automatic refinement schedule can be applied. In the automatic mode, groups of parameters are refined according to a fixed sequence as established in the Rietveld guidelines [54,58]. In the last step of refinement, all parameters are refined simultaneously.
To reduce problems due to the loss of experimental information and to increase the ratio 'number of observations/number of parameters to be refined', the knowledge of molecular geometry can be introduced and exploited in the refinement in the form of restraints on bond distances, angles and planes. To simplify the setting of restraints, the program extracts the possible restraints from the connectivity of the initial model and provides a list of current and target values. The user can select the restraints to be included in the refinement procedure and, if necessary, modify the target values.
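A distance restraint is commonly treated as an extra observation: each restrained bond adds a weighted squared deviation from its target value to the minimized function. The sketch below lists current and target distances and evaluates such a penalty. It assumes Cartesian coordinates in Å (a real refinement works with fractional coordinates and the cell metric), and the atom labels, target values and standard uncertainties are hypothetical.

```python
import math


def list_restraints(coords, restraints):
    """Return (atom1, atom2, current distance, target distance) tuples."""
    return [(a, b, math.dist(coords[a], coords[b]), target)
            for a, b, target, _sigma in restraints]


def restraint_penalty(coords, restraints):
    """Sum of ((current - target) / sigma)^2 over all distance restraints."""
    total = 0.0
    for a, b, target, sigma in restraints:
        current = math.dist(coords[a], coords[b])
        total += ((current - target) / sigma) ** 2
    return total


# Hypothetical fragment: Cartesian coordinates in angstrom.
coords = {
    "C1": (0.00, 0.00, 0.00),
    "C2": (1.50, 0.00, 0.00),
    "O1": (1.50, 1.43, 0.00),
}
# (atom1, atom2, target distance, standard uncertainty): assumed values.
restraints = [
    ("C1", "C2", 1.54, 0.02),
    ("C2", "O1", 1.43, 0.02),
]
penalty = restraint_penalty(coords, restraints)
```

Unlike a constraint, a restraint does not remove a parameter: it only makes geometries far from the target more costly, with the standard uncertainty controlling how strongly.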
Constraints, defined as exact mathematical relationships between least-squares parameters, can be used to reduce the number of parameters. Symmetry constraints, required to conserve the space-group symmetry rules, are mandatory and are automatically deduced and imposed by EXPO. Constraints on site occupancies and on isotropic displacement parameters may be defined by the user. Hydrogen atoms can be geometrically generated in an automatic way and constrained according to the riding model approximation [59] (this strategy is usually adopted in single-crystal refinement): H atoms are moved synchronously with the atoms to which they are bonded, thereby preserving the bond length and direction; the isotropic displacement parameters of the hydrogens are constrained to be 1.2 times that of the heavy atom to which they are attached. The factor 1.2 can be changed through the graphic interface.
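The riding model described above can be written down in a few lines: the hydrogen position is always the parent position plus a fixed bond vector, and its isotropic displacement parameter is a fixed multiple of the parent's. The data layout, the function name and the numerical values below are assumptions made for illustration only.

```python
def apply_riding_model(atoms, riding, factor=1.2):
    """Regenerate riding H atoms from their parent atoms.

    atoms  : dict name -> {"xyz": (x, y, z), "uiso": float}
    riding : dict H name -> (parent name, fixed bond vector)
    factor : ratio Uiso(H) / Uiso(parent), 1.2 by default
    """
    for h_name, (parent, bond_vector) in riding.items():
        p = atoms[parent]
        atoms[h_name] = {
            # H follows the parent: same shift, same bond length and direction.
            "xyz": tuple(c + v for c, v in zip(p["xyz"], bond_vector)),
            # Displacement parameter tied to the parent's value.
            "uiso": factor * p["uiso"],
        }
    return atoms


# Hypothetical parent atom and bond vector (arbitrary units).
atoms = {"C1": {"xyz": (0.10, 0.20, 0.30), "uiso": 0.050}}
riding = {"H1": ("C1", (0.00, 0.00, 0.08))}
apply_riding_model(atoms, riding)

# After a refinement cycle moves C1, H1 is regenerated and moves with it.
atoms["C1"]["xyz"] = (0.15, 0.20, 0.30)
apply_riding_model(atoms, riding)
```

Since the hydrogen parameters are derived rather than refined, they add no least-squares variables, which is the purpose of the constraint.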
Several criteria of fit are available in EXPO to monitor the progress and evaluate the quality of the Rietveld refinement [54]: the weighted profile R-factor and the unweighted profile R-factor calculated for the full pattern (Rwp, Rp); the corresponding background-subtracted figures (Rwp', Rp'); the statistically expected R value (Rexp); the goodness-of-fit (Rwp/Rexp); the Durbin-Watson d-statistic. R values similar to those used in the case of single-crystal data are also available: RF and the Bragg-intensity R (RBragg), which use the structure factor moduli extracted from the experimental profile. In addition, graphical tools are available for checking: (1) the match between the observed and calculated data by visualization of the observed, calculated, and difference pattern and the cumulative χ² value; (2) the chemical sense of bonding and nonbonding distances, angles and displacement parameters by direct display of the crystal structure.
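The full-pattern agreement factors listed above follow from their standard definitions and can be computed directly from the observed and calculated profiles. The sketch below assumes counting statistics for the weights (w = 1/y_obs), returns the R values as fractions rather than percentages, and does not cover the background-subtracted variants or the RF and RBragg indices; the function name and the returned dictionary are illustrative choices.

```python
import math


def agreement_factors(y_obs, y_calc, n_params):
    """Rp, Rwp, Rexp, goodness-of-fit and Durbin-Watson d for a pattern."""
    weights = [1.0 / y for y in y_obs]          # assumed: sigma^2 = y_obs
    diff = [o - c for o, c in zip(y_obs, y_calc)]
    sum_w_obs2 = sum(w * o * o for w, o in zip(weights, y_obs))

    r_p = sum(abs(d) for d in diff) / sum(y_obs)
    r_wp = math.sqrt(sum(w * d * d for w, d in zip(weights, diff)) / sum_w_obs2)
    r_exp = math.sqrt((len(y_obs) - n_params) / sum_w_obs2)
    gof = r_wp / r_exp

    # Durbin-Watson d: serial correlation of adjacent weighted residuals.
    res = [d * math.sqrt(w) for d, w in zip(diff, weights)]
    d_stat = (sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
              / sum(r * r for r in res))
    return {"Rp": r_p, "Rwp": r_wp, "Rexp": r_exp, "GoF": gof, "DW": d_stat}


# Three-point toy pattern with one refined parameter.
factors = agreement_factors([100.0, 400.0, 100.0], [110.0, 380.0, 100.0], 1)
```

A goodness-of-fit close to 1 means that the residuals are of the size expected from the counting statistics, while a Durbin-Watson d far from 2 signals serial correlation between neighbouring residuals.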

Application
An example of structure solution and Rietveld refinement reported here regards the organic compound [60] (see Figure 1 and Table 1). In this case, the X-ray powder diffraction analysis is fundamental because it gives a full structural elucidation of the products and starting materials of a new synthetic process where other approaches (NMR) could be ambiguous and error-prone. The solution was attained by the DS method, in particular by the Simulated Annealing of EXPO. The trial model for starting SA was created by using the sketching facilities of ACD/ChemSketch [61] and applying the MOPAC program [62] for the geometry optimization. For the structure solution, the angular range 6° < 2θ < 50° of the experimental powder diffraction profile (standard laboratory X-ray data) was used. The number of parameters varied during the minimization process was 10: three coordinates describing the position of the center of mass, three describing the orientation, and four torsion angles describing the conformation. The SA algorithm, applied in a nonstandard way, was run 20 times and the structure model corresponding to the smallest value of the cost function, Rwp = 7.11, was selected. The criterion to accept the solution was also based on the soundness of the crystal packing. Then, the solution derived from the DS procedure was used as an input model for the Rietveld refinement, after the H atoms were placed in calculated positions. The peak shape was modelled by the Pearson VII function. The background was fitted by a 20-coefficient polynomial. The number of refined structural and profile variables was 112: 4 cell parameters, 60 fractional coordinates of the nonhydrogen atoms, 20 isotropic displacement parameters, 7 profile parameters, 20 background coefficients, and 1 zero-shift parameter. Hydrogen atoms were constrained according to the riding model approximation. The automatic refinement executed by EXPO was robust and convergence was quickly achieved, yielding Rwp = 2.787 and χ² = 1.846 (Table 1).
Another example of Rietveld refinement concerns inorganic structures, in particular the study of a set of new rare-earth tricalcium phosphates (TCP) Ca₉RE(PO₄)₇ (RE = La, Pr, Nd, Eu, Gd, Dy, Tm, Yb [64], and Lu [65]). TCP doped with rare earth (RE) elements are widely investigated because of their applications in biological imaging, owing to their strong luminescence properties [66,67]. Despite this, structural investigations and refinements of the mentioned RE β-TCP structures from powder data are scarce in the literature [68,69], which made them a challenging task for EXPO, working on conventional laboratory X-ray diffraction data.
All the steps of the ab initio crystal structure solution process, from indexation to the Rietveld refinement, were performed by EXPO. The structure solution was carried out by a default run of Direct Methods, which confirmed the model suggested in the literature by a single-crystal study [69,70]. Detailed crystallographic information is summarized in Table 2 only for the Ca₉Dy(PO₄)₇ compound, taken henceforth as a representative example of the series. In the subsequent Rietveld refinement, the default Pearson VII function was used for describing the peak shape. A nonstandard refinement strategy was adopted, set up through the graphic interface of EXPO: the RE positions were constrained to those of Ca in shared sites; the displacement parameters of P were set equal to those of O; and the sum of the occupancies of Ca and RE was constrained to 1.0 in shared sites, with no limitation on the final charge at the site. In total, 86 parameters were refined, including the profile ones. The refinement procedure converged to low Rexp, Rp, and Rwp discrepancy indices whose values are indicative of reliable results: those for Ca₉Dy(PO₄)₇ are reported in Table 2 in addition to the crystal structure refinement data. The profile agreement between the observed (blue line) and the calculated pattern (red line) is shown in Figure 2. The difference pattern (violet curve) is also provided.
Four cationic sites are present in the structures and occupied by Ca and RE: three in general positions displaying eightfold coordination (named M1 and M3) and sevenfold coordination (named M2), and one on a special position displaying octahedral coordination, named M5 (Figure 3). Two of the three phosphorus and nine of the ten oxygen atoms are located on general positions, while the other atoms are located on special positions. Atom labelling was fixed according to [71].
The aim of the Rietveld refinement is to carefully investigate the RE distribution within the RE β-TCP samples. Two possible dopant localizations were considered: (1) RE in M1, M2 and M3 sites, but not in M5, according to [69]; (2) RE in M1, M2 and M5 sites, but not in M3. For all the compounds, the refinement of both configurations was tested. Monitoring the Rwp values of the Rietveld refinements confirmed the cation distributions proposed in [69] for RE = La, Pr, Nd, Eu, Gd, and Dy, a behavior due to steric reasons because of the large RE ionic radius. Three exceptions to the described trend are RE = Tm, Yb [64] and Lu [65], which enter site M5 and not M3 owing to their smaller ionic radius in sixfold coordination, which allows them to enter the compact M5O₆ octahedron. The refined site occupancies provided a total charge very close to the ideal value of 21 valence units, with minor anomalies (La and Nd the largest) within the experimental error provided by the EXPO software (see [64,65] for more details).
Considering the charge difference between RE³⁺ and Ca²⁺ and the lack of restraints on charge for mixed cationic sites, the obtained results can be considered a very reliable estimate of the RE/Ca ratio at each site. The Rietveld refinement for Ca₉Dy(PO₄)₇, as well as for all the analogous Ca₉RE(PO₄)₇ phases, was crucial for determining the amount of rare earth within every single calcium site and for better understanding the luminescence properties of such compounds.
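The occupancy constraint on shared sites and the charge check mentioned above amount to simple bookkeeping. In the sketch below, only the constraint occ(Ca) + occ(RE) = 1 and the ideal total of 21 valence units (9 Ca²⁺ plus 1 RE³⁺) come from the text; the number of cations that each site contributes per formula unit (3, 3, 3 and 1) and the RE occupancies are illustrative assumptions, not values taken from the refinements.

```python
IDEAL_CATION_CHARGE = 9 * 2 + 1 * 3   # Ca9RE(PO4)7: nine Ca2+ and one RE3+


def mixed_site(occ_re):
    """Return (occ_Ca, occ_RE, mean charge) for a shared Ca/RE site."""
    if not 0.0 <= occ_re <= 1.0:
        raise ValueError("RE occupancy must lie between 0 and 1")
    occ_ca = 1.0 - occ_re             # constraint: occ(Ca) + occ(RE) = 1
    return occ_ca, occ_re, 2.0 * occ_ca + 3.0 * occ_re


def total_cation_charge(sites):
    """Sum the mean site charges over the cations of one formula unit.

    sites : dict site name -> (cations per formula unit, RE occupancy)
    """
    return sum(count * mixed_site(occ_re)[2]
               for count, occ_re in sites.values())


# Assumed site counts and hypothetical RE occupancies (RE absent from M5).
sites = {
    "M1": (3, 1.0 / 9.0),
    "M2": (3, 1.0 / 9.0),
    "M3": (3, 1.0 / 9.0),
    "M5": (1, 0.0),
}
charge = total_cation_charge(sites)
```

Since the occupancies are refined without any restraint on the charge, the closeness of the computed total to 21 is an independent check on the refined RE/Ca ratios rather than a built-in result.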

Conclusions
The Rietveld refinement is the last necessary step in the structure solution pathway from powder diffraction data. It requires that the structure model to be adjusted is physically and chemically reliable and close to the true one. If these conditions are not fulfilled, there is a risk of obtaining an incorrect refined structure model, even if the minimization process has been successfully carried out and the figures of merit indicate a good fit between the observed and calculated profiles. EXPO is a software package under continuous development, aimed at improving both the quality of the structure model obtained at the end of the structure determination and the refinement process. Several strategies working both in reciprocal and in direct space can be selected with the aim of attaining a structure model suitable for the Rietveld refinement. In this way, the full solution process, from indexation to structure refinement, can be executed by EXPO, making use of default strategies or nonstandard ones.

Availability of EXPO
The EXPO program (latest version: EXPO2013) runs on any PC or workstation with the operating systems Windows, Linux or Mac OS X. For Windows, some DLLs are necessary, which are included in the distribution kit. For Linux, the binary packages for the most widely used distributions are available. The source code is also supplied and can be compiled by a Fortran95 and C++ compiler together with the GTK+2.0 and OpenGL libraries.
EXPO is available at http://wwwba.ic.cnr.it/content/expo-downloads, and the software is free for academic and nonprofit research institutions after registration. The installation instructions and the user manual are accessible via the web; documentation about the program itself and on the usage of the graphical interface is provided in HTML and PDF formats.
(a) The Shift step, which randomly shifts a suitably chosen part of the DM structure model; (b)
