1. Introduction
Non-ideal chromatographic separations received special attention even before spectral detectors (based on polychromators and photodiode arrays) with real-time acquisition features became available. An analysis of unresolved chromatographic peaks using the sequential chromatogram ratio technique was reported by Bahowick et al. in 1994 [1]. A second approach applied peak deconvolution to single-dimension chromatograms obtained from different runs with varying concentration ratios of the overlapping compounds, for a two-way data approach [2]. Correlation-based algorithms applied to peak area data from sequential chromatograms monitored at different wavelengths were successfully used to assess peak identity and homogeneity [3]. Experimental design and heuristic evolving latent projection were also used to assess peak purity in liquid chromatography [4].
Once photodiode array detection became the norm in liquid chromatography, multiple other approaches were proposed for peak homogeneity assessment [5,6,7,8,9,10,11,12,13], with a first review on this topic in 1993 [14], and the subject remains a topic of interest [15]. The singular value ratio was used for resolving peaks in HPLC-DAD [16], band broadening in a chromatographic column was studied by means of peak deconvolution [17], and peak homogeneity was used to identify the best conditions for separating compounds in complex mixtures when extremely low chromatographic resolution occurs [18], also making automated statistical-based peak detection possible [19]. The same principles were successfully applied in standalone UV spectrometric applications [20]. All these approaches regarding peak homogeneity in LC-DAD were naturally extended to mass spectrometric detection [21,22,23,24].
The basic concepts relating to peak homogeneity in liquid chromatography coupled to photodiode array detection have been comprehensively discussed in the work of S.C. Rutan et al. [25]. The similarity between two spectra acquired during peak elution can be measured by the cosine of the angle θ between the n-dimensional vectors describing these spectra. Here, n is the number of wavelengths considered over the spectral acquisition interval, and the absorbance value at each wavelength is the component of the vector along the corresponding dimension.
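Assuming the standard cosine-similarity definition, which matches the numerator and denominator described in the text, Formula (1) would read as follows (the vector symbols a and b and the index i are notation assumed here):

```latex
\cos\theta
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}}\;\sqrt{\sum_{i=1}^{n} b_i^{2}}}
\tag{1}
```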
The numerator is the dot product of the two vectors and the denominator is the product of their lengths (norms). Because the dot product is divided by the vector lengths, the similarity does not depend on the signals’ amplitude: it compares spectral shapes irrespective of their intensity, and thus spectral normalization is not required.
As long as the two vectors are mean-centered before computing Formula (1), the correlation coefficient between the two spectra according to Formula (2) may be successfully used as a measure of their similarity:
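Assuming the usual Pearson definition, with the barred symbols denoting the mean absorbance of each spectrum over the n wavelengths (notation assumed here), Formula (2) would read:

```latex
r = \frac{\sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b})}
         {\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^{2}\;\sum_{i=1}^{n} (b_i - \bar{b})^{2}}}
\tag{2}
```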
Here, a and b are the absorbance values recorded at the ith wavelength. The b series is usually taken as the reference spectrum, for instance, the spectrum recorded at the apex of the peak investigated for spectral homogeneity or the mean spectrum computed over the peak’s elution, with or without background subtraction. The change in the similarity factor, computed as 1000 × r², across the peak may be represented graphically. Additionally, a threshold value can be set against which the spectral purity of the peak is judged. An alternative for threshold calculation is given in Formula (3).
Here, Var denotes the variance of the noise, of spectrum j, and of the reference spectrum. Such features for evaluating peak spectral homogeneity have been naturally integrated in the software controlling data acquisition and processing by different manufacturers, with some variations [26,27,28].
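The similarity factor described above can be sketched in a few lines of Python. This is a minimal illustration under the standard cosine and correlation definitions, not any vendor's implementation; the function names and the five-point absorbance values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two spectra treated as n-dimensional vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def correlation(a, b):
    """Correlation coefficient r: the cosine of the mean-centered spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])

def similarity_factor(spectrum, reference):
    """Similarity factor expressed as 1000 x r^2."""
    return 1000.0 * correlation(spectrum, reference) ** 2

# Hypothetical absorbance values at five wavelengths (illustration only).
apex = [0.10, 0.40, 0.90, 0.50, 0.20]       # reference spectrum at the peak apex
same_shape = [0.5 * x for x in apex]        # same compound at half the concentration
distorted = [0.10, 0.45, 0.80, 0.60, 0.20]  # shape altered by an interferent

print(similarity_factor(same_shape, apex))  # scaling alone leaves the factor at ~1000
print(similarity_factor(distorted, apex))   # a change in shape lowers the factor
```

The first comparison shows why no normalization is needed for this descriptor: a change in amplitude alone does not alter r.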
Peak homogeneity assessment is considered an important selectivity-evaluation tool in LC method development and validation. However, the results of peak purity evaluations produced by the instrument’s software should be viewed with caution when any of the following limitations apply:
Perfect co-elution, when the spectral contribution of any interfering compound is equally distributed along the main analyte’s peak profile;
Major concentration differences between the target and interfering compounds;
The target analyte and/or the interfering compounds do not exhibit characteristic absorption bands over the investigated spectral interval;
The main analyte and any interfering compounds exhibit increased spectral profile similarities;
False negative peak purity assessments for peaks produced by larger amounts of the analyte loaded onto the column, where the detector response is not saturated, but peaks are “cut”.
We investigated an alternative approach for evaluating peak spectral homogeneity by comparing the acquired spectra through linear regression, considering the resulting slope, intercept, and correlation coefficient values. The data were compared with the “classic” approach based on the similarity factor 1000 × r², r being computed according to Formula (2). This alternative approach was evaluated with respect to the influence of the analyte’s amount loaded onto the column, spectral similarity between overlapping peaks, perfect co-elution situations, spectral acquisition parameters and the use of spectral processing features, such as the first and second spectral derivatives.
2. Materials and Methods
2.1. Chemicals, Reagents, and Standards
HPLC gradient grade (purity min. 99.9%) acetonitrile was acquired from Sigma Aldrich (Sigma Aldrich, Saint Louis, MO, USA). HPLC grade water (conductivity max 0.055 µS/cm, TOC max 10 ppb and bacterial count max 10 CFU/mL) was produced by means of a Milli-Q Integral S system. Carbamazepine (USP reference standard, cat. no. 109 3001), diazepam (USP reference standard, cat. no. 118 5008), acetylcysteine (cat. no. 100 9005), and enalapril maleate (cat. no. 123 5300) were obtained from Merck (Merck KGaA, Darmstadt, Germany). Nitrazepam (cat. no. N3889) was obtained from Sigma Aldrich.
2.2. Instrumentation
The chromatographic system was an Agilent 1260 series instrument (Agilent Technologies, Santa Clara, CA, USA) with the following modules: G1311B quaternary pump including a four-channel degasser, G1329 automated liquid sampler with a 100 µL injection loop, G1316C column thermostat, and G1365 photodiode array detector. Both hardware control and data acquisition/processing used Agilent’s Chemstation software LC3D, version 04.03(16).
The chromatographic column was a Kinetex EVO C18, 100 mm × 2.1 mm, 2.6 µm particle size (Phenomenex, Torrance, CA, USA). Chromatographic isocratic runs used HPLC grade water and gradient grade acetonitrile without any modifiers. To simulate perfect co-elution conditions, runs were performed through stainless steel (SS) tubing, 2 m × 0.12 mm. The column or tubing was kept at 25 °C. Injection volumes were 1 µL.
Flow rates of 0.3 mL/min for the column or 1 mL/min for the tubing were used. The mobile phase composition was modified to control elution of the analytes in the way imposed by the experimental protocol.
2.3. Data Processing
The proposed algorithm for evaluating peak spectral homogeneity consists of the following successive steps: (a) spectral acquisition and digitization; (b) normalization of the spectra; (c) comparison of each pair of normalized spectra through linear regression, computing the corresponding slope, intercept, and correlation coefficient values, with any two spectra being compared only once; (d) computation of the mean values and standard deviations for the resulting populations of slopes, intercepts, and correlation coefficients; (e) calculation of the volume of an ellipsoid in 3D Cartesian space, with central coordinates given by the mean values of the slopes, intercepts, and correlation coefficients and axes represented by the equivalent of 2 × standard deviation values for each population; (f) transformation of the resulting volume through PEV = −log10(EV), where EV is the ellipsoid volume. The final value (PEV) correlates with the spectral homogeneity of the peak, with higher PEV values meaning higher spectral homogeneity. All these steps are illustrated in Figure 1, which also includes the calculation steps. Spectra acquired during peak elution were exported from the Chemstation software in CSV (comma-separated values) format and loaded into Microsoft® Excel, where all calculations were performed.
All spectra used for ellipsoid computations were also processed using Formula (2), and the averaged r value, expressed as 1000 × r², was obtained to allow direct comparison between the computed values and the match factor from the Agilent software.
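Steps (b) to (f) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Excel template: spectra are normalized to unit maximum absorbance, the sample standard deviation is used, the ellipsoid semi-axes are taken as 2 × the standard deviation, and the Gaussian bands, elution profiles, and noise level are synthetic. All function names are hypothetical.

```python
import itertools
import math
import random
import statistics

def normalize(spectrum):
    """Scale a spectrum to unit maximum absorbance (assumed normalization)."""
    peak = max(spectrum)
    return [x / peak for x in spectrum]

def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

def pev(spectra):
    """PEV = -log10(ellipsoid volume) over all pairwise regressions."""
    norm = [normalize(s) for s in spectra]
    # Each pair of spectra is compared only once.
    fits = [linear_fit(a, b) for a, b in itertools.combinations(norm, 2)]
    # Standard deviations of the slope, intercept, and r populations.
    sds = [statistics.stdev(values) for values in zip(*fits)]
    volume = 4.0 / 3.0 * math.pi * math.prod(2.0 * s for s in sds)
    return -math.log10(volume)

# Synthetic data: two Gaussian absorption bands sampled every 2 nm.
rng = random.Random(0)
wavelengths = range(200, 402, 2)
band_a = [math.exp(-((w - 240) / 20) ** 2) for w in wavelengths]
band_b = [math.exp(-((w - 300) / 25) ** 2) for w in wavelengths]
profile_a = [0.2, 0.5, 0.8, 1.0, 0.8, 0.5, 0.2]    # elution profile of compound A
profile_b = [0.0, 0.05, 0.2, 0.5, 0.8, 1.0, 0.6]   # partially resolved compound B

def simulate(conc_a, conc_b, noise=1e-4):
    """One noisy spectrum per point of the elution profile."""
    return [[ca * a + cb * b + rng.gauss(0.0, noise) for a, b in zip(band_a, band_b)]
            for ca, cb in zip(conc_a, conc_b)]

pev_pure = pev(simulate(profile_a, [0.0] * len(profile_a)))
pev_mixed = pev(simulate(profile_a, profile_b))
print(pev_pure, pev_mixed)  # the homogeneous peak gives the larger PEV
```

Note that a noise-free set of identical spectra would give zero standard deviations and an undefined logarithm, so real or simulated noise is needed for the descriptor to be finite.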
4. Discussion
4.1. Analyte Amount Influence on Spectral Homogeneity Descriptors
Since a threshold of 990 is generally accepted for the match factor approach, it becomes necessary to determine a corresponding threshold value for PEV. As this match factor threshold corresponds to a standard deviation of 0.005 for the correlation coefficients characterizing the linear regressions between each pair of spectra, it made sense to keep the same value for the standard deviation of the slope, which, like the correlation coefficient, has a theoretical mean of 1 in the case of a perfect fit between spectra. The theoretical mean value for the intercept is 0. For a null value, it may be prudent to double the standard deviation value with respect to the other two parameters. Consequently, the proposed standard deviation values for the correlation coefficients and the slopes are s_r = s_S = 0.005, while the standard deviation for the intercept should be s_I = 0.01. The PEV value resulting from this set of standard deviations is thus 5.08, which can be rounded to 5.0, as illustrated in Figure 2. This shows that PEV is more discriminative than the match factor, which is logical given that the ellipsoid volume depends upon three independent parameters instead of the single parameter used by the match factor.
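The arithmetic behind the 5.08 value can be checked with a short calculation. The sketch assumes that the ellipsoid semi-axes equal 2 × the standard deviation, an interpretation of Section 2.3 that reproduces the quoted value; the variable names are illustrative.

```python
import math

# Proposed standard deviations for the three regression parameters.
s_r = 0.005          # correlation coefficient
s_slope = 0.005      # slope
s_intercept = 0.01   # intercept (doubled, since its theoretical mean is 0)

# Ellipsoid volume with semi-axes assumed to be 2 x standard deviation.
volume = 4.0 / 3.0 * math.pi * (2 * s_r) * (2 * s_slope) * (2 * s_intercept)
pev_threshold = -math.log10(volume)
print(round(pev_threshold, 2))  # 5.08

# Link to the match factor threshold: 1000 * r^2 = 990 gives r close to 0.995,
# i.e. a deviation of about 0.005 from the ideal value of 1.
r_at_990 = math.sqrt(990 / 1000)
print(round(1 - r_at_990, 4))  # 0.005
```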
The PEV algorithm is even more sensitive than the match factor to the amount of analyte producing the spectral data over the chromatographic peak elution. The intrinsic efficiency associated with the chromatographic peak may also induce differences (high efficiencies are related to higher analyte concentrations during the peak elution profile, while a reduced efficiency results in lower concentration values). As a general rule for both algorithms, this amount should remain below the upper limit of quantification (ULOQ) resulting from the calibration of the targeted analyte.
4.2. Spectral Resolution and Number of Compared Spectra Influence on Spectral Homogeneity Descriptors
In Figure 3, one can observe that the “cut-off” absolute amount at which PEV reaches the 5.0 threshold remained unchanged (around 0.75 µg), irrespective of the resolution used during spectral acquisition. At the lowest amount loaded onto the column (0.01 µg), a higher resolution resulted in a lower PEV, although still well above the threshold. This can be explained by the influence of noise. For the match factor approach, the absolute amount reaching the threshold is as previously determined (around 1.5 µg). Noise does not seem to have any influence at the lowest absolute amount of analyte loaded onto the column.
Considering the number of spectra acquired during the peak elution interval (Figure 4), as expected, a higher number of compared spectra resulted in a higher variability for the three independent parameters. The amount of loaded analyte producing PEV values that reach the threshold lies in the same interval as previously determined, with a small decrease for the maximum of 18 compared spectra. The effect of noise is also observable at low amounts of analyte, in direct relation to the number of compared spectra. For 18 compared spectra and 0.01 µg of analyte loaded onto the column, the PEV value is slightly below the fixed threshold. Observing the variations in Figure 4, it may be inferred that for more than 30 spectra considered for comparison, a new threshold should be established. Unlike PEV, the match factor is less sensitive to the number of spectra used for comparison. The maximum amount of analyte loaded onto the column that reaches the threshold was the same as the value previously determined, in fair agreement with the ULOQ found for the test compound during calibration.
Increased spectral resolution increases the number of items contained in the series submitted to the linear regression analysis. The variability in the acquired data is reflected in the resulting characteristics of the linear regressions. The number of spectra acquired during peak elution and compared afterwards increases the population of values described by the mean values of the slopes, intercepts, and correlation coefficients. Consequently, the standard deviations associated with these data populations may increase, leading to the decrease in the PEV values.
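Because each pair of spectra is compared only once, the size of the slope, intercept, and correlation coefficient populations grows quadratically with the number of spectra. A brief sketch of this count follows; the function name is hypothetical, and the values 6 and 12 are included only for illustration.

```python
import math

def pair_count(n_spectra):
    """Number of distinct spectrum pairs when each pair is compared once."""
    return math.comb(n_spectra, 2)

for n in (6, 12, 18, 30):
    print(n, pair_count(n))  # 18 spectra give 153 regressions, 30 give 435
```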
4.3. Discrimination Ability of the Spectral Homogeneity Descriptors in the Case of Perfect Chromatographic Co-Elution
As illustrated in the Supplementary Materials, Figure S9, the PEV and match factor values were all below the accepted thresholds over the whole interval of amounts loaded onto the chromatographic column. PEV values were far below the limit, while the match factors lay within the 10% variation interval below the limit. Again, PEV seems to act in a more discriminative way than the match factor. In both cases, low amounts of nitrazepam in diazepam generate larger shifts below the threshold, while low amounts of diazepam in nitrazepam place the determined values below but closer to the accepted limits. Simulating perfect co-elution conditions shows that the amount loaded onto the SS tube at which the threshold is crossed fits relatively well with the ULOQ values determined during the calibration process. As already observed, for PEV the threshold is crossed at lower concentrations than for the match factor. One can conclude that, for a compound perfectly co-eluting with another one, its proportional contribution to the spectral behavior of the latter makes the lack of homogeneity undetectable. Both algorithms (match factor and PEV) failed to detect spectral inhomogeneities in cases of perfect co-elution because the reciprocal spectral influences of the perfectly co-eluting compounds are proportional during peak elution. Consequently, the resulting overlapping spectra are identical after normalization (in the case of PEV) or produce very good correlation coefficients (in the case of match factors).
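The proportionality argument can be illustrated with an idealized, noise-free simulation. The Gaussian bands, elution profile, and mixing ratios below are hypothetical, a linear detector response is assumed, and normalization to unit maximum is an assumed choice.

```python
import math

wavelengths = range(200, 402, 2)
band_a = [math.exp(-((w - 240) / 20) ** 2) for w in wavelengths]
band_b = [math.exp(-((w - 300) / 25) ** 2) for w in wavelengths]
elution_profile = [0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1]

def normalize(spectrum):
    """Scale a spectrum to unit maximum absorbance (assumed normalization)."""
    peak = max(spectrum)
    return [x / peak for x in spectrum]

def max_shape_difference(spectra):
    """Largest pointwise difference between normalized spectra and the first one."""
    norm = [normalize(s) for s in spectra]
    return max(abs(x - y) for s in norm for x, y in zip(s, norm[0]))

# Perfect co-elution: B is present at a constant 10% of A throughout the peak.
coeluting = [[c * (a + 0.1 * b) for a, b in zip(band_a, band_b)]
             for c in elution_profile]

# Partial overlap: the B/A ratio changes across the peak.
ratios = [0.0, 0.02, 0.05, 0.1, 0.2, 0.4, 0.8]
overlapping = [[c * (a + r * b) for a, b in zip(band_a, band_b)]
               for c, r in zip(elution_profile, ratios)]

diff_coeluting = max_shape_difference(coeluting)      # zero up to rounding error
diff_overlapping = max_shape_difference(overlapping)  # clearly non-zero
print(diff_coeluting, diff_overlapping)
```

With a constant ratio, every spectrum is a scaled copy of the same composite shape, so no shape-based descriptor can distinguish the mixture from a single compound in this idealized case.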
4.4. Spectral Processing
As seen in Figure 6, the 1:1 mixture of analytes injected through the SS tubing (perfect co-elution situation) generates match factors above the 990 threshold, while the PEV values were all situated below the threshold limit. Thus, it was necessary to verify whether the discrimination power of the PEV algorithm is genuine or an artefact resulting from data processing. This was the reason why the individual analyte solutions, as well as the 1:10 and 10:1 column loads of the target compounds, were run through the SS tubing. Individual solutions loaded through the tube produced PEV and match factor values above the threshold. Discrimination according to the amount loaded onto the tube remains valid and in fair accordance with the ULOQ values resulting from calibrations. The 1:1 mixture of the analytes loaded onto the SS tubing produced PEV values below the threshold limit for the whole amount interval, but match factors remained above the threshold, thus validating the amount effect discrimination observation. When loaded onto the column, 10:1 and 1:10 mixtures of the analytes reduced peak homogeneity descriptors below their thresholds, and this reduction seemed more drastic in the case of PEV values.
When evaluating the second derivative approach, one can observe that PEV and match factor values are lower when the mixtures of analytes are passed through the column. Values computed on loading individual solutions and the 1:1 mixture through tubing do not differ substantially over some intervals of the loaded amounts. However, all data resulting from individual solutions and mixtures loaded onto the tubing or column are placed below the corresponding acceptance limits (990 for match factor and 5.0 for PEV). Variations in the computed values with the loaded amount also showed an unusual pattern, where in most cases the maximum values of the considered descriptors were computed for amounts loaded between 0.7 and 1.2 µg. Further studies would be necessary to check if the set threshold limits are still appropriate when using second order derivatives of the UV-VIS spectra as input data.
4.5. Similar Spectra and Lacking Characteristic Absorption Bands
Considering the acetylcysteine and enalapril maleate peak purity evaluations, the match factors are situated above the 990 threshold, indicating spectral homogeneity even though the peak results from the overlap of both compounds. The inability to discriminate between the test compounds is caused by the similarity between the spectra and the lack of characteristic absorption bands.
In the case of PEV, values are generally situated below and close to the threshold. For loaded amounts between 1.5 and 3 µg, the resulting PEV values are slightly above the threshold (5.09 to 5.14). This effect may be associated with the test compounds being in the ULOQ region. The absorbance increase likely masks the subtle differences between the individual spectra. As with the other evaluations of PEV against match factors, and as expected, the PEV algorithm is more selective than the match factor algorithm.
5. Conclusions
A new algorithm designed to characterize spectral homogeneity across eluting peaks in LC/DAD is proposed. The algorithm is based upon the computation of the volume of an imaginary ellipsoid having as axes doubled standard deviations for slopes, intercepts, and correlation coefficient populations resulting from the linear regressions applied to each pair of spectra and expressed as a negative log10. The new spectral homogeneity descriptor was compared to the match factor computed by the Agilent Chemstation software Rev. B 04.03 [16] to determine reciprocal advantages and disadvantages.
Both descriptors discriminate with respect to the amount of analyte loaded onto the column and distributed over the peak elution interval. Roughly, the ULOQ obtained during calibration for the considered compound also represents the upper limit at which spectral peak homogeneity can be correctly confirmed. Above this limit, false negative results occur. The PEV method is more restrictive than the match factor: the PEV algorithm retains the information brought by three variables (slope, intercept, and correlation coefficient) after spectral normalization, while the match factor only considers the correlation coefficient of un-normalized spectral data.
This is not necessarily an advantage, considering that the spectral homogeneity of a peak is often the selectivity characteristic used during the validation of chromatographic methods. HPLC methods designed for the determination of related impurities in active ingredients habitually use test solutions with an increased concentration of the active ingredient (1000 µg/mL or higher). In such a case, analysts should be aware of the risk of having a negative spectral homogeneity result due to the increased concentration of the targeted compound.
It was observed that the resolution used during spectral acquisition does not affect the discrimination power of either the match factor or the PEV descriptor. In the case of PEV, if more than 30 spectra are compared during peak elution, a re-evaluation of the threshold would be necessary. Both descriptors are ineffective in evaluating the spectral homogeneity when perfect co-elution occurs. The PEV method is a better discriminant than the match factor when the first-order derivatives of the UV-VIS spectra are used. When the second-order derivatives of the spectra are considered, neither method is successful in assessing spectral homogeneity.
Additionally, the PEV method is better than the match factor when assessing spectral homogeneity of peaks resulting from co-elution of compounds having no characteristic absorption bands and exhibiting increased spectral similarity. Thus, the PEV algorithm may be considered a highly useful alternative for determining spectral homogeneity of peaks in HPLC-DAD.
We consider that the PEV algorithm can be included in data-acquisition and -processing software, as a useful tool that may bring some advantages to existing match factor algorithms. In this study, we used self-made template data sheets (example in Figure S15) which worked with CSV-format data exported from the Chemstation software. However, those with macro experience or training could potentially incorporate this algorithm within their software.