Estimation of Ascorbic Acid in Intact Acerola (Malpighia emarginata DC.) Fruit by NIRS and Chemometric Analysis

Acerola fruit is one of the richest natural sources of ascorbic acid ever known. As a consequence, acerola fruit and its products are demanded worldwide for the production of health supplements and the development of functional products. However, the analytical determination of ascorbic acid is time-consuming and costly. In this study, we show a non-destructive, reliable, and fast method to measure the ascorbic acid content in intact acerola, using near-infrared spectroscopy (NIRS) associated with multivariate calibration methods. Models using variable selection by means of interval partial least squares (iPLS) and a genetic algorithm (GA) were tested. The best model for ascorbic acid content, based on the prediction performance, was the GA-PLS method with second derivative spectral pretreatment, with a root mean square error of cross-validation equal to 22.9 mg/100 g, root mean square error of prediction equal to 46.3 mg/100 g, ratio of prediction to deviation equal to 8.0, determination coefficient for calibration equal to 0.98 and determination coefficient for prediction equal to 0.96. The current methodology, using NIR spectroscopy and chemometrics, is a promising and rapid tool to determine the ascorbic acid content of intact acerola fruit.


Introduction
Acerola (Malpighia emarginata DC.), also called the Barbados cherry or cherry-of-Antilles, has its origin traced to the Caribbean area, northern South America and Central America. This tropical fruit is known for its high concentration of health-relevant phytochemicals, mainly ascorbic acid [1]. The uniquely high concentration of ascorbic acid makes this fruit valuable for further use in other food products and creates a great potential market for acerola. In fact, significant market growth is expected for acerola extract. The global acerola extract market is expected to be worth 17.5 billion US dollars by 2026, due to rising global demand for natural bioactive-rich fruits and derivatives, such as acerola [2,3].
Horticulturae 2019, 5, 12; doi:10.3390/horticulturae5010012; www.mdpi.com/journal/horticulturae
The determination of ascorbic acid in acerola fruit is usually performed by high-performance liquid chromatography (HPLC), HPLC coupled to photodiode array and tandem mass spectrometry detectors (HPLC-PDA-MS/MS), and electrochemical or titrimetric methods [4,5]. Although these analytical methods offer high selectivity and sensitivity, they have drawbacks, such as the destruction of the sample, the need for specific reagents and bulky instrumentation that impairs in-field monitoring, and the generation of waste. In addition, some of the techniques, such as chromatography, require specialized personnel, which limits their application beyond the laboratory. Therefore, the development of non-destructive, reliable, accurate, fast and robust methods to measure the ascorbic acid content in intact acerola is highly desirable to ensure that higher-quality fruit reaches the market.
Near-infrared spectroscopy (NIRS) is an attractive analytical technique for measuring ascorbic acid [6,7] and other important quality parameters in horticultural products [8-10]. Other applications and some developments have been reviewed by Moreda et al. [11]. Briefly, the success of NIRS in food science and technology is due to minimal or no sample preparation, no waste generation, and low cost compared to conventional methods, while allowing several analytical parameters or components to be determined simultaneously.
The use of appropriate chemometric tools for multivariate calibration is largely responsible for successful applications of the NIRS technique in intact fruits. The NIR spectrum is essentially composed of a large set of overtones and combination bands (O−H, C−H and N−H bonds), which makes it highly convoluted. Multivariate calibration methods, such as partial least squares (PLS) [9], principal component regression (PCR) [12], artificial neural networks (ANN) [13] and least squares-support vector machines (LS-SVM), and variable selection algorithms, such as interval partial least squares (iPLS) and the genetic algorithm (GA) [14], are required to extract the information about quality attributes that remains hidden in the NIR spectrum. Moreover, spectral pre-processing techniques applied before multivariate calibration are required to improve the signal-to-noise ratio of the spectral data and to eliminate noise and undesired variations that can mask the signal of interest. The transformation techniques and their sequence depend on the NIR application itself and include a wide range of techniques, such as smoothing [9] and normalization [15,16]. Finally, the analytical validation of any new multivariate method is very important because it is the first step toward the recognition of the method for official analysis. Validation involves estimating figures of merit (FOM) such as analytical selectivity, sensitivity, precision, accuracy, limit of detection and limit of quantification [17].
This paper presents the use of NIRS as a rapid and non-destructive method to predict ascorbic acid content in intact acerola (Malpighia emarginata DC.) fruit. PLS regression and variable selection algorithms (iPLS and GA) were used for the statistical analysis of the data and the development of prediction models. Several pre-processing methods, including smoothing, multiplicative scatter correction and the second derivative, were applied and compared to obtain the most appropriate model. Finally, the best performing model was submitted to a statistical elliptical joint confidence region (EJCR) test and validated by the calculation of FOMs, including selectivity, sensitivity, analytical sensitivity, precision and accuracy, limit of detection and limit of quantification.

Fruit Material
Acerola fruits were purchased in the local market of Natal-RN-Brazil between December 2014 and January 2015. Acerola fruits with homogeneous size and no evident injuries or signs of disease were selected for this study. For the ascorbic acid and spectral measurements, 3 fruits were used per sample (306 acerola fruits, N = 102 samples). The selected acerola fruits were kept under controlled conditions (24-28 °C, humidity 60-80%) for at least 1 h prior to NIR spectral measurements. Acerola fruits in three different maturation stages were analyzed in this study. The fruits were grouped into three color groups (orange, red and purple). In doing this, our objective was to cover all the possible fruit variations in this study.

Reference Method for Ascorbic Acid and Morphological Properties
The method for the determination of ascorbic acid followed the modified titrimetric procedure described by Oliveira et al. [18], based on the Association of Official Agricultural Chemists (AOAC) standard methodology (AOAC 967.21), with the indicator 2,6-dichlorophenol-indophenol (DCFI). A total of 0.5 g of sample was mixed with 50 mL of metaphosphoric acid and stirred for 3 min. The solution was used to titrate a mixture of 2 mL of DCFI with 18 mL of distilled water. The endpoint of titration was determined when the standard solution and the titrant reached the same color. The results were expressed in milligrams per 100 g of fresh weight (mg/100 g FW). All analyses were performed in triplicate.
The fruit morphological characteristics (weight, diameter, average values) were also evaluated. The fruit weight was measured with an analytical balance, while the average diameter was determined using a digital caliper (INSIZE, 0-200 mm/0-8", model no. 1112-200, Ohio, USA).

Near-Infrared Diffuse Reflectance Measurements
The spectra were collected for all samples in reflectance mode using a multi-purpose analyzer Antaris MX Fourier Transform near-infrared (FT-NIR) spectrometer (Thermo Fisher Scientific Inc., Waltham, MA, USA). The instrument was equipped with a NIR fiber optic probe, interferometer, cooled InGaAs detector, and a wide-band quartz halogen light source (50 W). The NIR spectra were obtained across the range 10,000-4166 cm⁻¹ (1000-2400 nm), and were recorded with a spectral resolution of 8 cm⁻¹ (1.25 × 10⁶ nm) and an average of 32 scans. The time required to achieve a spectral measurement was 30 s. For the spectral measurements, 3 fruits were used per sample (306 acerola fruits, N = 102 samples). The average value from three different spectral measurement locations (two diameter readings and one reading taken bottom to top of the fruit) on each fruit was stored, and the mean spectrum was calculated for each sample.
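The relation between the wavenumber and wavelength limits quoted above, and the two-level averaging of spectra (three locations per fruit, three fruits per sample), can be illustrated with a minimal Python sketch. The function names and the absorbance values are hypothetical and are not taken from the study; the original processing was done in the instrument software and MATLAB.

```python
def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (lambda = 1e7 / nu)."""
    return 1.0e7 / wavenumber_cm1


def mean_spectrum(spectra):
    """Point-by-point mean of spectra sampled on the same wavelength grid."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


# Spectral limits quoted in the text.
print(round(wavenumber_to_nm(10000), 1))  # 1000.0
print(round(wavenumber_to_nm(4166), 1))   # 2400.4

# Hypothetical absorbances: 3 fruits x 3 measurement locations x 4 wavelengths.
sample = [
    [[0.20, 0.35, 0.80, 0.40], [0.22, 0.36, 0.82, 0.41], [0.21, 0.34, 0.81, 0.39]],
    [[0.25, 0.40, 0.85, 0.45], [0.24, 0.39, 0.86, 0.44], [0.26, 0.41, 0.84, 0.46]],
    [[0.18, 0.33, 0.78, 0.38], [0.19, 0.32, 0.79, 0.37], [0.17, 0.34, 0.77, 0.39]],
]

# First average the three locations on each fruit, then the three fruits.
fruit_means = [mean_spectrum(locations) for locations in sample]
sample_spectrum = mean_spectrum(fruit_means)
print(sample_spectrum)
```

The lower wavenumber limit corresponds to about 2400.4 nm, which the text rounds to 2400 nm.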

Chemometric Analysis
The data pre-treatment and construction of chemometric models were implemented in MATLAB version R2014a (MathWorks, Natick, MA, USA). PLS regression models were compared while varying the pre-processing: first and second derivatives with different window widths (3-71 points), multiplicative scatter correction (MSC) and Savitzky-Golay (SG) smoothing (3-71 points). PLS is a multivariate calibration technique based on maximal covariance between the spectral matrix and the response vector; the number of latent variables (LV) is defined as the number of factors that explain the model [19,20]. In order to generate the prediction models for the ascorbic acid content, samples were split into two sets using the Kennard-Stone algorithm [21]: 70% (N = 72) of the samples for calibration and 30% (N = 30) for external validation.
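As an illustration of the calibration/validation split, the following is a minimal Python sketch of the Kennard-Stone selection in its commonly described form: start from the two most distant samples, then repeatedly add the sample whose minimum distance to the already selected set is largest. The use of Euclidean distance on the raw variables and the toy data are assumptions; the study itself used MATLAB, and its exact settings are not reported here.

```python
import math


def kennard_stone(X, n_select):
    """Return (selected, remaining) sample indices by Kennard-Stone selection."""
    n = len(X)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Full Euclidean distance matrix (assumed metric).
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Seed with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    while len(selected) < n_select:
        # Pick the candidate farthest from its nearest selected neighbor.
        candidate = max(
            sorted(remaining),
            key=lambda k: min(dist[k][s] for s in selected),
        )
        selected.append(candidate)
        remaining.remove(candidate)
    return selected, sorted(remaining)


# Toy one-variable data set (hypothetical values).
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
calibration, validation = kennard_stone(X, 3)
print(calibration, validation)  # [0, 3, 4] [1, 2]
```

With a spectral matrix of 102 rows, a call such as `kennard_stone(X, 72)` would give the 72/30 partition described in the text.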
Variable selection algorithms (iPLS and GA) were used to optimize the PLS models by restricting them to the spectral variables carrying the most relevant information. The predicted results for the calibration models developed by PLS, using the spectral regions selected by iPLS and GA, were compared to those found by PLS using the whole region. The calibration model accuracy was described by the coefficient of determination for the calibration data set (R²c) and validation data set (R²p), the root mean square error of cross-validation (RMSECV), and the root mean square error of prediction (RMSEP). In addition, the ratio of prediction to deviation (RPD) was calculated [9,22] by dividing the standard deviation (SD) by the RMSECV or the RMSEP. RPD values below 1.5 indicated that the calibration was not useful. RPD values higher than 2 indicated that quantitative predictions were possible. An outlier test was applied based on the model data, NIR matrices and the experimental ascorbic acid values used for calibration and validation, in consideration of the number of latent variables. Outlier detection was conducted to improve model accuracy by removing samples with extreme values and high influence on the model, and by eliminating un-modeled residuals in the spectra and in the response vector. An EJCR was calculated to evaluate the slope and intercept of the reference regression at a 95% confidence level. Finally, validation of an analytical method demonstrates whether it fulfills its intended purpose. Therefore, some figures of merit, such as analytical selectivity, sensitivity, precision, accuracy, limit of detection and limit of quantification, were determined.
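The performance statistics described above can be sketched in a few lines of Python. The RPD follows the definition given in the text (SD of the reference values divided by the RMSE). The R² formula used here (one minus the ratio of residual to total sum of squares) and the sample SD with n − 1 in the denominator are assumptions, since the text does not state which variants were used; the data are hypothetical.

```python
import math
import statistics


def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / len(y_ref))


def r_squared(y_ref, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = statistics.fmean(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


def rpd(y_ref, y_pred):
    """Ratio of the SD of the reference values to the RMSE."""
    return statistics.stdev(y_ref) / rmse(y_ref, y_pred)


# Hypothetical reference and predicted values (mg/100 g FW).
y_ref = [1200.0, 1500.0, 1800.0, 2100.0]
y_pred = [1230.0, 1470.0, 1830.0, 2070.0]

print(rmse(y_ref, y_pred))       # 30.0
print(r_squared(y_ref, y_pred))
print(rpd(y_ref, y_pred))
```

Applied to cross-validation predictions the first function gives the RMSECV, and applied to the external validation set it gives the RMSEP.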

Experimental Data
The number of samples (N), minimum (Min), maximum (Max), median and standard deviation (SD) for ascorbic acid content, and fruit morphological characteristics (diameter and weight) are presented in Table 1. Figure 1 shows the measured acerola absorbance spectra of all samples in the region between 1000 and 2400 nm.

Ascorbic Acid Model Fitting
Initially, PLS and variable selection algorithms (iPLS and GA) were applied to the original spectra to develop the NIR model and thereby non-destructively predict ascorbic acid in intact acerola fruits. Table 2 shows the results for the PLS, iPLS-PLS and GA-PLS models, varying the pre-processing method, after correcting the systematic behaviors presented by the original spectrum. Table 3 presents the results for outlier elimination and the RMSECV, RMSEP, determination coefficients for calibration (R²c) and prediction (R²p), and RPD values for the GA-PLS models. In order to find the best model, some desirable criteria were considered, such as low RMSEP, low RMSECV, high determination coefficients for the calibration (R²c) and prediction (R²p) data sets, and a small difference between RMSECV and RMSEP. A relatively low number of LV was considered desirable in order to avoid the inclusion of noise in the model. Figure 2 shows the correlation between the measured ascorbic acid concentration in intact acerola and that predicted by the best model based on NIR spectroscopy. As Table 3 shows, the higher RPD value achieved for the GA-PLS model after outlier detection showed that accurate quantification of the ascorbic acid content of acerola could be obtained.

Table 3. Summary of statistics for the calibration and prediction sets for ascorbic acid (mg/100 g FW) in acerola fruit using FT-NIR spectra for different GA-PLS (second derivative) models after outlier detection.

Based on the results for the best GA-PLS model presented in Table 3, the estimated intercept and slope (â and b̂, respectively) were compared with the values 0 and 1 using the EJCR test, in this case with an ordinary least-squares (OLS) fit. Figure 3 shows the ellipse corresponding to the validation samples (Table 3), which contains the theoretical point (a = 0, b = 1), indicating that bias was absent for the optimized GA-PLS model. The presence of relevant bias was also tested with the prediction results for the validation samples by the t-test. The results showed that the bias included in the model was not significant, since the t value obtained (0.0831) was lower than the critical value of 2.14 at a 95% confidence level. Validation of the optimized multivariate calibration model for ascorbic acid determination was confirmed by the FOM presented in Table 4. A reasonably low signal-to-noise ratio of 4.83 indicated that no direct relationship between the reference method and the fitted model was detected.
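The EJCR and bias tests can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: predicted values are regressed on reference values by OLS, the joint region is the standard ellipse n(a − â)² + 2Σx(a − â)(b − b̂) + Σx²(b − b̂)² ≤ 2s²F(α; 2, n − 2), and the bias t value is computed as |bias|·√n divided by the standard deviation of the validation errors. The closed-form F critical value is valid only for two numerator degrees of freedom. All names and data are hypothetical.

```python
import math


def ols_fit(x, y):
    """Return intercept, slope and residual variance of the fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, ssr / (n - 2)


def f_critical_two_numerator_df(df2, alpha=0.05):
    """Upper-alpha critical value of F(2, df2), closed form for 2 numerator df."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


def ejcr_contains_ideal_point(y_ref, y_pred, alpha=0.05):
    """True if the point (intercept 0, slope 1) lies inside the joint ellipse."""
    n = len(y_ref)
    a_hat, b_hat, s2 = ols_fit(y_ref, y_pred)
    da, db = 0.0 - a_hat, 1.0 - b_hat
    q = (n * da ** 2
         + 2.0 * sum(y_ref) * da * db
         + sum(v ** 2 for v in y_ref) * db ** 2)
    return q <= 2.0 * s2 * f_critical_two_numerator_df(n - 2, alpha)


def bias_t_value(y_ref, y_pred):
    """t statistic for the mean validation error (assumed formulation)."""
    n = len(y_ref)
    errors = [r - p for r, p in zip(y_ref, y_pred)]
    bias = sum(errors) / n
    sdv = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    return abs(bias) * math.sqrt(n) / sdv


# Hypothetical validation data (mg/100 g FW).
y_ref = [1200.0, 1400.0, 1600.0, 1800.0, 2000.0, 2200.0]
unbiased = [1210.0, 1390.0, 1610.0, 1790.0, 2010.0, 2190.0]
shifted = [v + 100.0 for v in y_ref]  # constant offset of +100

print(ejcr_contains_ideal_point(y_ref, unbiased))  # True
print(ejcr_contains_ideal_point(y_ref, shifted))   # False
print(bias_t_value(y_ref, unbiased))               # 0.0
```

A t value below the tabulated critical value, as reported in the text (0.0831 versus 2.14), corresponds to the unbiased case in this sketch.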

Discussion
Ascorbic acid is a labile molecule affected by several deleterious conditions such as oxygen, light, temperature and pressure [23]. Acerola is known for its high content of ascorbic acid [24], which is associated with several biological activities such as antioxidant capacity, antiscorbutic activity and iron metabolism. The recent growing interest of the industry in such bioactive molecules has brought attention to the need to develop rapid and precise detection methods [25].
The ascorbic acid content of the fruits used in this study ranged from 1190.65 to 2187.06 mg/100 g FW (Table 1). The observed variation in ascorbic acid was attributed to the differences between fruits, according to maturation stage [1]. In this regard, it has already been shown that ascorbic acid is greatly reduced (by about 50%) from the green to the ripe stage of acerola fruit due to biochemical oxidation reactions related to maturation [24].
Consistent baseline offsets and bias were observed in Figure 1. These are quite common features in NIR spectra acquired by diffuse reflectance techniques. As can be seen in Figure 1, the spectra were dominated by the water spectrum, with three absorption bands of the O−H bonds: overtone bands at 1070 and 1450 nm and a combination band at 1940 nm. The peak at 1190 nm corresponded to the second and third C−H overtone regions, associated with sugar. Moreover, the sugar probably influenced the peak at 1700 nm. The absorption at 1170-1180 nm corresponded to the second overtone of the methylene group, and the 1410 nm peak represented the methylene group combination bands of ascorbic acid, while acid absorption occurred at 1890-1950 nm. Since ascorbic acid is one of the main chemical constituents of acerola fruit, these absorptions could be detected in the NIR spectra.
Due to some undesirable systematic behavior (baseline offsets) observed in the spectra (Figure 1), the original data were transformed by SG smoothing, multiplicative scatter correction (MSC) and SG first- and second-order derivatives.
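To make the baseline correction concrete, the following minimal Python sketch shows MSC in its usual textbook form: each spectrum is regressed on a reference spectrum, and the fitted offset and slope are then removed. Taking the mean spectrum of the set as the reference is an assumption, as the text does not state which reference was used, and the spectra below are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum (assumed reference)."""
    n_points = len(spectra[0])
    reference = [sum(column) / len(spectra) for column in zip(*spectra)]
    ref_mean = sum(reference) / n_points
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        spec_mean = sum(spectrum) / n_points
        # Least-squares fit: spectrum ~ offset + slope * reference.
        slope = sum(
            (r - ref_mean) * (s - spec_mean) for r, s in zip(reference, spectrum)
        ) / sxx
        offset = spec_mean - slope * ref_mean
        corrected.append([(s - offset) / slope for s in spectrum])
    return corrected


# Hypothetical spectra: the same band shape with different offsets and scaling.
base = [0.2, 0.5, 0.9, 0.4]
spectra = [
    base,
    [2.0 * b + 0.10 for b in base],
    [0.5 * b - 0.05 for b in base],
]

for row in msc(spectra):
    print([round(v, 4) for v in row])
```

Because the three toy spectra differ only by an additive offset and a multiplicative factor, the corrected spectra coincide, which is the behavior MSC is intended to produce for pure scatter effects.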
As can be seen in Table 2, the PLS and iPLS models demonstrated low predictive capability, although similar calibration and validation errors were observed. RPD values were low for all models. GA-PLS models presented better predictive capability than the PLS and iPLS models. Reasonable models for ascorbic acid (RMSEP = 46.3 mg/100 g and determination coefficients for calibration (R²c) and prediction (R²p) equal to 0.98 and 0.54, respectively) were obtained using a GA-PLS model with 436 wavelengths and 12 latent variables. However, the RPD = 1.6 is still considered poor. Because the RPD does not include other statistical parameters of the regression model, such as latent variables or calibration error, a more complex index would be desirable for the evaluation of the goodness or robustness of regression models.
The GA-PLS model with second derivative pre-processing was optimized by eliminating samples with extreme leverages or extreme un-modeled residuals for the ascorbic acid concentration or spectral data (FT-NIR). For validation samples, the outlier tests were based on extreme leverages and spectral residuals. The best model was achieved using 14 latent variables, in which 3 and 15 samples were excluded due to extreme un-modeled residuals in the calibration and prediction set, respectively. For this model, the obtained errors were RMSECV = 22.9 mg/100 g FW, RMSEP = 46.3 mg/100 g FW, RPD = 8.0 and R²p = 0.96. The FOM (Table 4) indicated a suitable validation of the fitted model. The RMSECV and RMSEP values revealed good agreement between the estimated multivariate model and the reference method, which highlighted a satisfactory accuracy for the model. Multiple measurements were conducted on five samples (ten replicates per sample), with spectral data recorded on the same day, to assess precision at the level of repeatability. Sensitivity, LD and LQ results were considered adequate, considering the analytical range of each model.
Previous reports in the literature have attempted to develop an effective NIRS-based model for ascorbic acid content in fruit and vegetables. Kramchote et al. [26] applied NIR reflectance spectra in the 310 to 1100 nm range (spectrophotometer Handy Lamda II; Spectra Co., Ltd., Tokyo, Japan) to evaluate the ascorbic acid content of cabbage and obtained a model with low predictive capacity (RPD = 1.26, R²p = 0.35). In a more recent study, Pérez-Marín et al. [23] used NIR spectra for assessing spinach pre-harvest parameters. The fitted models were of limited use for routine testing, due to their restricted prediction capacity (RPD = 1.21 and R²p = 0.33). The use of such poorly predicting models was recommended for simpler classification analysis, such as high/low sorting. Satisfactory prediction accuracy was reported by Alamar et al. [27] using NIRS for modeling the ascorbic acid content of guava pulp. Lower variability in the chemical data allowed the development of models with good predictive ability (RMSEP = 6.137 mg/100 g and R²p = 0.85) with a reduced number of samples (N = 45).

Conclusions
A method based on NIRS and multivariate calibration was developed to estimate the ascorbic acid content of intact acerola fruit. Spectra collected in the wavelength range of 1000-2400 nm were analyzed by PLS, iPLS and GA-PLS methods, combined with different spectral pretreatments. The best combination, based on the prediction performance, was the GA-PLS method with second derivative spectral pretreatment, with RMSECV = 22.9 mg/100 g, RMSEP = 46.3 mg/100 g, RPD = 8.0, R²c = 0.98 and R²p = 0.96. Our best model was compared to the conventional (destructive) method and no significant difference at a 95% confidence level was found. The values for accuracy, precision, and other figures of merit indicated that the NIRS model for ascorbic acid determination in intact acerola is a fast and reliable analytical alternative to titrimetric measurements. These findings can be useful for the rapid determination of ascorbic acid, a key acerola phytochemical that has been highly valued for the production of dietary supplements and the development of new products worldwide.

Figure 2. Measured versus predicted concentration of ascorbic acid (mg/100 g FW) in intact acerola using the genetic algorithm partial least squares (GA-PLS) model after second derivative pretreatment, where (•) calibration set; and (•) prediction set.

Figure 3. Elliptical joint confidence regions (EJCR) for predicted ascorbic acid content in acerola samples.

Table 1. Summary of statistics for ascorbic acid content and morphological characteristics (diameter and weight) of acerola fruit.

Table 2. Summary of statistics for the calibration and prediction sets for ascorbic acid (mg/100 g FW) in acerola fruit using FT-NIR spectra and PLS, PLS-iPLS and GA-PLS.
Abbreviations: RMSECV, root mean square error of cross-validation; RMSEP, root mean square error of prediction; R²p, determination coefficient for prediction; RPD, ratio of prediction to deviation; LV, latent variables.

Table 4. Figures of merit for the best performing GA-PLS model of ascorbic acid in intact acerola.