Crystal Structure of 17α-Dihydroequilin, C18H22O2, from Synchrotron Powder Diffraction Data and Density Functional Theory

The crystal structure of 17α-dihydroequilin has been solved and refined using synchrotron X-ray powder diffraction data, and optimized using density functional techniques. 17α-dihydroequilin crystallizes in space group P212121 (#19) with a = 6.76849(1) Å, b = 8.96849(1) Å, c = 23.39031(5) Å, V = 1419.915(3) Å³, and Z = 4. Both hydroxyl groups form hydrogen bonds to each other, resulting in zig-zag chains along the b-axis. The powder diffraction pattern has been submitted to ICDD for inclusion in the Powder Diffraction File™ as the entry 00-066-1608.


Introduction
17α-dihydroequilin is a naturally-occurring steroidal estrogen found in the urine of female horses. Its 3-sulfate ester sodium salt is one of the components of Premarin®, a mixture of conjugated estrogens which is used to treat symptoms of menopause and for the prevention of osteoporosis. Premarin is administered in oral, intravenous, and cream forms. The systematic name (CAS Registry number 651-55-8) is estra-1,3,5(10),7-tetraene-3,17α-diol. A two-dimensional molecular diagram is shown in Figure 1.

The presence of high-quality reference powder patterns in the Powder Diffraction File™ (PDF®) [1] is important for phase identification, particularly by pharmaceutical, forensic, and law enforcement scientists. The crystal structures of a significant fraction of the largest dollar volume pharmaceuticals have not been published, and thus calculated powder patterns are not present in the PDF-4 databases. Sometimes experimental patterns are reported, but they are generally of low quality. This crystal structure is a result of a collaboration among the International Centre for Diffraction Data (ICDD®), Illinois Institute of Technology (IIT), Poly Crystallography Inc., and Argonne National Laboratory to measure high-quality synchrotron powder patterns of commercial pharmaceutical ingredients, include these reference patterns in the PDF, and determine the crystal structures of these active pharmaceutical ingredients (APIs).
Even when the crystal structure of an API is reported, the single crystal structure was often determined at low temperature. Most powder measurements are performed at ambient conditions. Thermal expansion (generally anisotropic) means that the peak positions calculated from a low-temperature single crystal structure often differ significantly from those measured at ambient conditions. These peak shifts can result in the failure of default search/match algorithms to identify a phase, even when it is present in the sample. High-quality reference patterns measured at ambient conditions are thus critical for straightforward identification of APIs using standard powder diffraction practices.

Results and Discussion
The process of solving the crystal structure of 17α-dihydroequilin provides a cautionary tale. Even though good Rietveld residuals (Rwp = 0.0987, χ² = 2.851, see Figure 2) were obtained for C18H20O2, the agreement of the refined and DFT-optimized structures was poor, indicating that something was wrong. Once we realized that the true formula of 17α-dihydroequilin was C18H22O2, the source of the problem became clear. Although the two molecular structures look similar when viewed perpendicular to the ring system (Figure 3), a side view (Figure 4) shows that the extra hydrogen atoms buckle the ring system. Using the correct molecular formula C18H22O2 resulted in lower residuals (Rwp = 0.0902, χ² = 2.221, see Figure 5), but confirmation that this was the correct molecule was provided by the excellent agreement between the Rietveld-refined and DFT-optimized structures (Figure 6). The root-mean-square Cartesian displacement of the non-hydrogen atoms in the two structures is only 0.109 Å, indicating that the Rietveld-refined structure is probably correct [2]. This discussion uses the density functional theory (DFT)-optimized structure. The asymmetric unit (with atom numbering) is illustrated in Figure 7, and the crystal structure is presented in Figure 8.
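For readers less familiar with the residuals quoted above, the conventional definitions of the weighted profile residual and the reduced χ² can be sketched as follows. The symbols are introduced here for illustration only: y_i^obs and y_i^calc are the observed and calculated intensities at step i, w_i is the statistical weight, N is the number of observations, and P is the number of refined variables. The exact form used by GSAS, including how restraint terms enter the sums, is an assumption and is not stated in the text.

```latex
R_{wp} = \sqrt{\frac{\sum_i w_i \left( y_i^{\mathrm{obs}} - y_i^{\mathrm{calc}} \right)^2}
                    {\sum_i w_i \left( y_i^{\mathrm{obs}} \right)^2}},
\qquad
\chi^2 = \frac{\sum_i w_i \left( y_i^{\mathrm{obs}} - y_i^{\mathrm{calc}} \right)^2}{N - P}
```

Both quantities measure only how well the calculated pattern reproduces the data, which is why a chemically incorrect model can still give acceptable values.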
All of the bond distances, bond angles, and torsion angles (bond distances and angles were restrained during the Rietveld refinement) fall within the normal ranges indicated by a Mercury Mogul Geometry check [3]. Quantum chemical geometry optimization (Hartree-Fock/6-31G*/water) using Spartan '14 (Wavefunction, Inc., Irvine, CA, USA) [4] indicated that the observed conformation of 17α-dihydroequilin in the solid state is 9.2 kcal/mol higher in energy than a local minimum energy conformation of an isolated molecule. A molecular mechanics conformational analysis also using Spartan '14 indicated that the global minimum energy conformation (7.4 kcal/mol lower in energy than that observed in the solid state) is more compact, and thus intermolecular interactions are important in determining the solid-state conformation.
Analysis of the contributions to the total crystal energy using the Forcite module of Materials Studio [5] suggests that bond angle, bond distance, and torsion distortion terms are significant in the intramolecular deformation energy, as might be expected for a fused ring system. The intermolecular energy contains significant contributions from van der Waals and electrostatic attractions, which in this force-field-based analysis include hydrogen bonds. The hydrogen bonds are better analyzed using the results of the DFT calculation.
Both hydroxyl groups form fairly strong hydrogen bonds to each other (Table 1). These hydrogen bonds result in zig-zag chains along the b-axis (Figure 8). The volume enclosed by the Hirshfeld surface (Figure 9) [6][7][8][9] is 348.55 Å³, 98.2% of 1/4 the unit cell volume. The molecules are thus not tightly packed. The only significant close contacts (shown in red in Figure 9) involve the hydrogen bonds. Other close contacts are consistent with some small positive Mulliken overlap populations in H···A distances, particularly involving the ring hydrogen atom H25, and the methyl hydrogens H35, H36, and H37.
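The 98.2% figure follows directly from the numbers quoted in the text. A minimal arithmetic check, using only the reported unit cell volume, Z, and Hirshfeld volume (the variable names are illustrative and not from the original work):

```python
# Values quoted in the text, in cubic angstroms; names are illustrative only.
unit_cell_volume = 1419.915
z = 4                      # molecules per unit cell
hirshfeld_volume = 348.55

# Volume available to each molecule is one quarter of the cell (Z = 4).
volume_per_molecule = unit_cell_volume / z
fraction = hirshfeld_volume / volume_per_molecule

print(f"Volume per molecule: {volume_per_molecule:.2f} A^3")
print(f"Hirshfeld fraction:  {100 * fraction:.1f}%")
```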

Table 1. Columns: H-Bond; D-H, Å; H···A, Å; D···A, Å; D-H···A, °; Overlap, e; E, kcal/mol
O18-H39
The Bravais-Friedel-Donnay-Harker [10][11][12] morphology suggests that we might expect a platy morphology for 17α-dihydroequilin, with {002} as the principal faces. A 6th-order spherical harmonic preferred orientation model was included in the Rietveld refinement; however, the texture index was only 1.011, indicating that preferred orientation was not significant in this rotated capillary specimen. The powder pattern of 17α-dihydroequilin is included in the Powder Diffraction File as the entry 00-066-1608.

Materials and Methods
17α-Dihydroequilin was a commercial reagent purchased from US Pharmacopeia (Lot JOL148), and was used as-received. The white powder was packed into a 1.5 mm diameter Kapton capillary, and rotated during the measurement at ~50 cycles·s⁻¹. The powder pattern was measured at 295 K at beam line 11-BM [13,14] of the Advanced Photon Source at Argonne National Laboratory using a wavelength of 0.413342 Å, from 0.5-50° 2θ with a step size of 0.001° and a counting time of 0.1 s/step. The pattern was indexed on a primitive orthorhombic unit cell with a = 6.765 Å, b = 8.965 Å, c = 23.384 Å, V = 1418.2 Å³, and Z = 4, using Jade [15]. The suggested space group was P212121, which was confirmed by successful solution and refinement of the structure. A reduced cell search in the Cambridge Structural Database [16] (increasing the tolerance on the longest dimension to 2.0%) combined with the chemistry "C H O only" yielded 15 hits, but no crystal structure for dihydroequilin.
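Because the indexed cell is orthorhombic (all angles 90°), its volume is simply the product of the three edge lengths. A short check of the indexed values quoted above (variable names are illustrative):

```python
# Indexed orthorhombic cell edges from the text, in angstroms.
a, b, c = 6.765, 8.965, 23.384

# For an orthorhombic cell, V = a * b * c.
volume = a * b * c

print(f"V = {volume:.1f} A^3")
```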
The formula for 17α-dihydroequilin is reported in some sources (such as Chemical Book) as C18H20O2, and in others as C18H22O2. The molecular structure C18H22O2 contains additional hydrogen atoms on C2 and C5 (see Figure 7). For the initial structure solution, a C18H20O2 molecule was built and optimized in Spartan '14 [4]. The resulting "mol2" file was converted into a Fenske-Hall Z-matrix file using OpenBabel [17]. The structure was solved with FOX [18], using a maximum sinθ/λ = 0.25 Å⁻¹. A Rietveld refinement using GSAS [19] yielded good residuals (Rwp = 0.097 and χ² = 2.851 for 87 variables) and an excellent fit (Figure 2). A density functional geometry optimization (fixed experimental cell) was carried out using CRYSTAL09 [20]. The root-mean-square Cartesian displacement of the non-hydrogen atoms was 0.415 Å, indicating that the experimental structure was likely incorrect [2].
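The root-mean-square Cartesian displacement used as the diagnostic here can be sketched as below. This is a minimal illustration, not the procedure of reference [2]: it assumes the two atom lists are already in the same order and the same Cartesian frame (reasonable when the cell is held fixed), and the function name and the three-atom coordinates are hypothetical.

```python
import math


def rms_cartesian_displacement(coords_a, coords_b):
    """RMS distance between matched atoms of two structures.

    Assumes both lists hold (x, y, z) tuples in angstroms, in the same
    atom order and the same Cartesian frame; no overlay is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical non-hydrogen atom positions (not taken from the paper).
refined = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.0)]
optimized = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (1.5, 1.4, 0.1)]

rmsd = rms_cartesian_displacement(refined, optimized)
print(f"RMS Cartesian displacement: {rmsd:.3f} A")
```

In this toy case every atom moves by 0.1 Å, so the result is 0.1 Å. Values of several tenths of an ångström, like the 0.415 Å found for the C18H20O2 model, are what prompted the authors to question the structure.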
The molecular structure of C18H22O2 contains additional hydrogen atoms on C2 and C5 (see Figure 7). The hydrogen atoms H41 and H42 were added using Materials Studio [5] (after appropriate modifications to the bond types). This model was then subjected to geometry optimization (fixing the unit cell parameters to the experimentally-determined values) using CRYSTAL14 [21]. The 6-31G** basis sets for the H, C, and O atoms were those of Gatti et al. [22]. The calculation was run on eight 2.1 GHz Xeon cores (each with 6 GB RAM) of a 304-core Dell Linux cluster at IIT, using 8 k-points and the B3LYP functional, and took ~47 h. The optimized structure was the basis of the final Rietveld refinement.
Rietveld refinement was carried out using GSAS/EXPGUI [19,23]. Only the 1.8-25.0° portion of the pattern was included in the refinement (dmin = 0.955 Å). All non-H bond distances and angles were subjected to restraints, based on a Mercury/Mogul Geometry Check [24,25]. The Mogul average and standard deviation for each quantity were used as the restraint parameters. The restraints contributed 4.2% to the final χ². A common Uiso was refined for the atoms of the steroid ring system, and a second Uiso was refined for the substituent atoms. The hydrogen atoms were included in calculated positions, which were recalculated during the refinement using Materials Studio. The Uiso of each hydrogen atom was constrained to be 1.3× that of the non-hydrogen atom to which it was attached. The peak profiles were described using the profile function #4 [26,27], which includes the Stephens [28] anisotropic strain broadening model. The background was modeled using a three-term shifted Chebyshev polynomial, with a six-term diffuse scattering function to model the X-ray scattering of the Kapton capillary and any amorphous component. The final refinement of 87 variables using 23,264 observations (23,201 data points and 63 restraints) yielded the residuals Rwp = 0.0902, Rp = 0.0732, and χ² = 2.221. The largest peak (1.25 Å from C11) and hole (0.80 Å from C19) in the difference Fourier map were 0.59 and −0.55 e·Å⁻³, respectively. The Rietveld plot is included as Figure 5. The largest errors in the fit are in the shapes of some of the strong low-angle peaks.
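Two of the numbers in this paragraph can be reproduced from the stated experimental settings: the resolution limit follows from Bragg's law, d = λ / (2 sin θ), and the observation count follows from the angular range and step size. The sketch below assumes both endpoints of the 1.8-25.0° range are included in the data; the variable names are illustrative.

```python
import math

wavelength = 0.413342   # X-ray wavelength in angstroms (from the text)
two_theta_min = 1.8     # lower limit of refined range, degrees
two_theta_max = 25.0    # upper limit of refined range, degrees
step = 0.001            # step size, degrees
n_restraints = 63       # restraints counted as observations (from the text)

# Bragg's law at the upper limit: d = lambda / (2 sin(theta)), theta = 2theta / 2.
theta_max = math.radians(two_theta_max / 2.0)
d_min = wavelength / (2.0 * math.sin(theta_max))

# Number of steps in the range, assuming both endpoints are included.
n_points = round((two_theta_max - two_theta_min) / step) + 1
n_observations = n_points + n_restraints

print(f"d_min = {d_min:.3f} A")
print(f"data points = {n_points}, observations = {n_observations}")
```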

Conclusion
The crystal structure of 17α-dihydroequilin has been solved and refined using synchrotron X-ray powder diffraction data, and optimized using density functional techniques. The agreement of the

Figure 2. The Rietveld plot for the refinement of 17α-dihydroequilin, using the incorrect molecular formula C18H20O2. The red crosses represent the observed data points, and the green line is the calculated pattern. The magenta curve is the difference pattern, plotted at the same vertical scale as the other patterns. The vertical scale has been multiplied by a factor of 5 for 2θ > 7.0°, and by a factor of 40 for 2θ > 13.0°. X-ray wavelength of 0.413342 Å.

Figure 3. Comparison of the molecular structures of C18H20O2 and C18H22O2, viewed approximately perpendicular to the ring system.

Figure 4. Comparison of the molecular structures of C18H20O2 (red) and C18H22O2 (green), viewed approximately in the plane of the ring system.

Figure 5. The Rietveld plot for the refinement of 17α-dihydroequilin, using the correct molecular formula C18H22O2. The red crosses represent the observed data points, and the green line is the calculated pattern. The magenta curve is the difference pattern, plotted at the same vertical scale as the other patterns. The vertical scale has been multiplied by a factor of 5 for 2θ > 6.80°, and by a factor of 40 for 2θ > 11.5°. X-ray wavelength of 0.413342 Å.

Figure 7. The asymmetric unit of 17α-dihydroequilin, showing the atom numbering. The atoms are represented by 50% probability spheroids. C is shown in dark green, O in red, and H in light grey.

Figure 8. The crystal structure of 17α-dihydroequilin, viewed down the a-axis. The hydrogen bonds are indicated by dashed lines.

Figure 9. Hirshfeld surface of 17α-dihydroequilin. Intermolecular contacts longer than the sums of the van der Waals radii are colored blue, and contacts shorter than the sums of the radii are colored red. Contacts equal to the sums of radii are white.
