1. Introduction
Colombia has established itself as a leading global exporter of Hass avocados (Persea americana Mill.), with production concentrated in the Andean departments of Antioquia, Caldas, and Cundinamarca [1]. To meet the quality requirements of the European and US markets, fruit must be harvested at a defined physiological maturity level, typically indicated by a Dry Matter (DM) content exceeding 23%, which ensures proper post-harvest ripening and flavor development [2]. Harvesting too early produces fruit with a rubbery texture and bitter taste, whereas late harvesting shortens shelf life and increases susceptibility to fungal pathogens such as Colletotrichum spp. [1,3].
The supply chain from Andean orchards to destination markets in Europe or Asia involves transit times of 20–35 days under cold storage. Currently, producers rely on destructive oven-drying analysis following AOAC protocols [4] to determine DM content. This process is slow and labor-intensive, and it precludes high-frequency monitoring of individual fruits. Consequently, there is strong demand for non-destructive technologies capable of assessing maturity directly on the tree [5].
1.1. Related Work and Research Gaps
Several non-destructive sensing modalities have been investigated for fruit maturity assessment. Near-Infrared Spectroscopy (NIR) has been widely applied for DM estimation in avocados with accuracy exceeding 95%; however, instrumentation costs remain prohibitive for small-scale farmers [6]. Acoustic resonance methods offer lower hardware costs but are sensitive to fruit geometry rather than internal chemistry [7]. Hyperspectral and multispectral imaging have been used to map surface and subsurface properties of horticultural produce, though both require controlled illumination environments. Wi-Fi Channel State Information (CSI) has been explored for non-contact ripeness estimation, achieving approximately 91% accuracy in avocados [8], but it relies on wireless infrastructure unavailable in remote orchards.
Electrical Bioimpedance Spectroscopy (EIS) offers a complementary approach by measuring the frequency-dependent opposition of biological tissue to alternating current [9]. As the avocado matures, enzymatic activity degrades the cell membranes and modifies electrolyte concentrations, producing measurable shifts in the impedance spectrum [10,11]. Prior studies have demonstrated the feasibility of EIS in tropical fruits, including mangoes and bananas [12]; however, most rely on laboratory LCR meters and two-electrode configurations that introduce high contact impedance artifacts, limiting their field applicability. The four-electrode (tetrapolar) configuration, which separates current injection from voltage sensing, substantially reduces electrode polarization errors as described by Schwan [13] and has been implemented in custom ASIC designs [14], yet its application to avocados in field conditions remains largely unexplored.
In the domain of predictive modeling for fruit quality, machine learning methods have been applied across a range of crops. Mancero-Castillo et al. [15] surveyed modeling approaches for tropical fruit production, noting the scarcity of reported models for avocado and related species. Xiao et al. [16] reviewed the integration of multi-omics data with classical and deep learning methods for food quality classification. Specific to avocados, Support Vector Machines and ensemble methods have been applied to texture and spectral features with promising results, while recurrent architectures such as Long Short-Term Memory (LSTM) networks have shown strong performance on sequential agricultural time-series data [17,18].
Table 1 summarizes representative non-destructive maturity assessment studies, highlighting the research gap addressed by the present work. The reviewed literature reveals three key limitations, each of which this work addresses. First, existing EIS-based fruit studies rely on two-electrode laboratory setups that confound contact and tissue impedance, preventing direct deployment in the field. Second, no prior work has collected longitudinal, multi-timepoint impedance measurements of individual avocados to characterize the dynamic maturation trajectory. Third, temporal deep learning architectures have not been compared against classical chemometric baselines for avocado maturity classification under field conditions in the Colombian Andes.
1.2. Objectives and Contributions
The objective of this study is to develop and validate a low-cost, non-destructive approach for assessing the maturity of Hass avocado fruit using machine learning techniques. Specifically, the main contributions of this work are the following:
- Design and implementation of a low-cost, non-invasive portable bioimpedance spectrometer operating in the 1–10 kHz frequency range.
- Development of a custom Analog Front End (AFE) integrated with a tetrapolar surface probe to minimize skin–electrode contact impedance artifacts.
- Empirical demonstration of the superiority of the 4-electrode configuration over the 2-electrode configuration, in accordance with Schwan’s electrode polarization theory [13].
- Correlation analysis between the spectral characteristics of bioimpedance and destructive dry matter measurements for the specific agroclimatic conditions of the Colombian Andes.
- Comparative evaluation of different machine learning algorithms for predicting avocado maturity from bioimpedance-derived features.
- A field study that tracks 100 labeled fruits at four fortnightly measurement points (different maturity stages) to compare predictions based on time trajectories with those based on single-point measurements.
2. Materials and Methods
This section describes the biological criteria for harvest maturity classification, the proposed instrumentation system, the modeling strategy, the data collection process, and the experimental protocol.
2.1. Maturity Windows and Export Challenges for Colombian Hass Avocado
The maturation physiology of Hass avocado (Persea americana Mill.) presents a unique challenge for the agricultural industry: it is a climacteric fruit that does not soften while attached to the tree due to the inhibitory effect of endogenous hormones. Consequently, two distinct phenological stages must be differentiated: Harvest Maturity, reached while the fruit is still on the tree, and consumption ripeness, reached only after harvest.
For the Colombian export sector, identifying the exact onset of Harvest Maturity is critical. The supply chain from Andean orchards to destination markets in Europe or Asia involves transit times ranging from 20 to 35 days. Therefore, the primary goal is to preserve the fruit’s “Green Life”—the period during which the fruit remains firm and unripe under cold storage.
If harvested too early (before reaching 23% Dry Matter), the fruit fails to ripen properly, developing a rubbery texture and bitter taste. Conversely, late harvesting reduces Green Life, increasing the risk of the fruit ripening inside the shipping container or succumbing to chilling injury and fungal pathogens (e.g., Colletotrichum spp.) [2,20]. The Harvest Window is defined as the critical decision-making period beginning approximately 60 days prior to consumption ripeness.
2.2. Bioimpedance Measurement System and Analog Front End
A portable spectrometer based on the AD5933 impedance converter (Analog Devices, Wilmington, MA, USA) was designed and built. The AD5933 generates a sinusoidal excitation voltage sweeping from 1 to 10 kHz and digitizes the response signal via an internal 12-bit ADC, computing the real and imaginary components of impedance through a DFT-based algorithm on-chip. Frequency resolution, output voltage amplitude, and gain setting resistor were configured via I2C from an Arduino microcontroller. Measured impedance data were logged to an SD card and transferred via USB for post-processing.
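As an illustration of how the on-chip DFT output becomes an impedance value, the following minimal sketch applies the gain-factor relation from the AD5933 datasheet: the gain factor is obtained from a sweep point measured on a known resistor, and the unknown impedance is the reciprocal of the gain factor times the DFT magnitude. The register values and the 10 kΩ calibration resistor are hypothetical placeholders, not values from this study.

```python
import math


def dft_magnitude(real: int, imag: int) -> float:
    """Magnitude of one DFT result (signed real/imaginary register values)."""
    return math.hypot(real, imag)


def gain_factor(real_cal: int, imag_cal: int, r_cal_ohm: float) -> float:
    """Gain factor from a point measured on a known calibration resistor."""
    return (1.0 / r_cal_ohm) / dft_magnitude(real_cal, imag_cal)


def impedance_magnitude(real: int, imag: int, gf: float) -> float:
    """Impedance magnitude (ohms) of the unknown load at one frequency point."""
    return 1.0 / (gf * dft_magnitude(real, imag))


# Hypothetical register readings (illustrative only).
gf = gain_factor(real_cal=3000, imag_cal=4000, r_cal_ohm=10_000.0)
z_unknown = impedance_magnitude(real=1500, imag=2000, gf=gf)
print(f"|Z| = {z_unknown:.1f} ohm")
```

In practice the gain factor varies with frequency, so it would be computed per sweep point rather than once.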
To address the high and variable contact resistance of the exocarp without compromising fruit integrity, a custom Non-Destructive Tetrapolar (4-Electrode) Surface Probe was developed (see Figure 1). The contact interface consisted of four spring-loaded, gold-plated flat electrodes (2 mm diameter) arranged collinearly with a uniform inter-electrode spacing of 10 mm. Since the AD5933 is natively a 2-terminal device, a custom external Analog Front End (AFE) was implemented to support this tetrapolar configuration, adapting designs from recent ASIC-based front-ends [14].
The AFE comprises two stages: (i) a Howland Current Pump based on an LM324 operational amplifier, which injects a stable, frequency-invariant alternating current (10 μA peak) through the outer electrode pair; and (ii) a precision Instrumentation Amplifier (INA128, Texas Instruments, Dallas, TX, USA) with high input impedance (>10 GΩ), which measures the differential voltage across the inner electrode pair without drawing appreciable current. This separation of excitation and sensing paths effectively removes the electrode–tissue contact impedance from the measurement, isolating the true mesocarp impedance.
In Figure 1, the outer electrodes (E1, E4) inject a controlled AC current generated by the Howland Current Pump. The inner electrodes (E2, E3) sense the resulting differential voltage via a high-impedance Instrumentation Amplifier (INA128). The buffered voltage is fed back to the AD5933 VOUT/VIN terminals, enabling impedance calculation while excluding contact resistance artifacts.
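The benefit of separating excitation from sensing can be seen in a simplified lumped model. The sketch below assumes purely resistive contacts and treats the amplifier input as a voltage divider with the two sensing contacts. The 10 μA current and 10 GΩ input impedance come from the text, while the tissue and contact values are hypothetical.

```python
def two_electrode_reading(z_tissue: float, z_contact: float) -> float:
    """Both contacts carry the excitation current, so their drops add in series."""
    return z_tissue + 2 * z_contact


def four_electrode_reading(z_tissue: float, z_contact: float,
                           i_inject: float = 10e-6,
                           z_amp_in: float = 10e9) -> float:
    """Inner electrodes sense the tissue voltage through a high-impedance input."""
    v_tissue = i_inject * z_tissue
    # Divider formed by the two sensing contacts and the amplifier input.
    v_sensed = v_tissue * z_amp_in / (z_amp_in + 2 * z_contact)
    return v_sensed / i_inject


Z_TISSUE = 5_000.0    # hypothetical mesocarp impedance (ohms)
Z_CONTACT = 50_000.0  # hypothetical skin-electrode contact impedance (ohms)

z2 = two_electrode_reading(Z_TISSUE, Z_CONTACT)
z4 = four_electrode_reading(Z_TISSUE, Z_CONTACT)
print(f"2-electrode: {z2:.0f} ohm, 4-electrode: {z4:.2f} ohm")
```

Under these assumptions the two-electrode reading is dominated by the contacts, whereas the four-electrode reading stays within a small fraction of a percent of the tissue value.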
Built on an Arduino platform with a custom external AFE, this portable bioimpedance spectrometer costs substantially less than commercial LCR meters or NIR instruments.
2.3. Statistical and Machine Learning Modeling Strategy
To map the bioimpedance spectral features to the binary maturity classes, a comparative framework was designed that evaluates three levels of algorithmic complexity.
2.3.1. Baseline Chemometrics: PLS-DA
As a baseline, Partial Least Squares Discriminant Analysis (PLS-DA) [21] was implemented. This technique efficiently handles multicollinearity in spectral data by projecting the high-dimensional impedance data into a lower-dimensional space of Latent Variables (LVs).
2.3.2. Non-Linear Static Classifiers: SVM and Random Forest
Given the complex electrochemical nature of biological tissue, the following classifiers were deployed:
- Support Vector Machine (SVM): Utilized with a Radial Basis Function (RBF) kernel [22,23]. SVM was selected because the kernel implicitly maps features into a higher-dimensional space where non-linear class boundaries in the impedance feature space become linearly separable, a property that has proven effective on high-dimensional spectral data in prior classification studies.
- Random Forest (RF): An ensemble of decision trees [24,25]. RF was included for its robustness to overfitting at moderate dataset sizes and its ability to provide interpretable feature importance rankings via Gini impurity, useful for identifying the most discriminative frequency bands.
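The Gini-based importance mentioned above rests on the impurity decrease achieved by each split. The following minimal sketch computes it for a node holding the study's 58 Harvest-Ready and 42 Immature fruits; the perfect split shown is a hypothetical example, not a result from the trained forest.

```python
def gini_impurity(counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini_impurity(left) \
        + (sum(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


root = gini_impurity([58, 42])                      # class counts from the dataset
best = gini_decrease([58, 42], [58, 0], [0, 42])    # hypothetical perfect split
print(f"root impurity = {root:.4f}, max decrease = {best:.4f}")
```

A feature's importance is then the sum of such decreases over all nodes that split on it, averaged across trees.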
The term “static” or “single-point” indicates that these techniques receive only one measurement point per fruit during training, without access to previous points. The LSTM, in contrast, receives a trajectory of points.
2.3.3. Temporal Deep Learning: Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) [17] was selected as the representative sequential architecture because it retains long-range dependencies in time-series data through gating mechanisms (input, forget, and output gates), making it suitable for learning from the multi-timepoint impedance trajectory of individual fruits. Unlike static models, the LSTM receives a sequence of length $T = 4$ (one measurement per bi-weekly interval), enabling it to encode the rate of change and curvature of the impedance decay. The architecture consists of a masking layer, a primary LSTM layer with 64 units, a secondary LSTM layer with 32 units, a dropout layer (rate = 0.2), and a sigmoid output neuron for binary classification [18].
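To make the gating mechanism concrete, the sketch below implements one step of a single-unit LSTM cell with a scalar input and runs it over a four-point sequence. It is a toy illustration of the standard gate equations, not the 64/32-unit network used in the study; the weights and the normalized impedance sequence are hypothetical and untrained.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell with scalar input."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g      # cell state: retained memory plus new input
    h = o * math.tanh(c)        # hidden state exposed to the next layer
    return h, c


def run_sequence(xs, p):
    """Feed a whole sequence through the cell and return the final hidden state."""
    h = c = 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, p)
    return h


# Hypothetical weights and a hypothetical normalized |Z| trajectory (T = 4).
params = {"wi": 0.5, "ui": 0.1, "bi": 0.0, "wf": 0.4, "uf": 0.2, "bf": 1.0,
          "wo": 0.6, "uo": 0.1, "bo": 0.0, "wg": 0.9, "ug": 0.3, "bg": 0.0}
trajectory = [0.95, 0.80, 0.62, 0.41]
final_h = run_sequence(trajectory, params)
print(f"final hidden state = {final_h:.4f}")
```

Because the cell state carries information across steps, the final hidden state depends on the whole trajectory rather than on the last measurement alone.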
2.4. Experimental Context and Data Collection
The study was conducted during the 2024 main harvest season in a commercial orchard located in Cundinamarca, Colombia. A longitudinal, non-destructive tracking design was employed, where the physiological evolution of the same tagged fruits was monitored over the maturation period.
Ten distinct trees randomly distributed across the orchard were selected to account for spatial variability, and 10 fruits were randomly selected and tagged from each tree ($N = 100$ total individuals). Bioimpedance measurements were acquired at 15-day intervals (bi-weekly) over the monitoring period (4 timepoints).
All measurements from a given fruit were always assigned to the same cross-validation fold, preventing any temporal leakage between training and test sets (see Section 2.5.3).
2.4.1. Microclimate Monitoring and Stratified Sampling
Given the dense canopy structure of mature Hass avocado trees, significant microclimatic variations exist between the inner and outer foliage. To characterize these environmental gradients, portable data loggers (DHT22 sensors interfaced with Arduino) were installed at two distinct canopy depths. The air temperature ($T_{\mathrm{air}}$) and relative humidity ($RH$) were recorded to calculate the Vapor Pressure Deficit (VPD) [26]. To ensure that the bioimpedance dataset captured this intra-tree variability, a stratified random sampling method was applied (Figure 2).
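For reference, VPD can be derived from the logged temperature and humidity as sketched below. The text does not state which saturation vapor pressure formula was used, so the Tetens approximation (as adopted in FAO-56) is an assumption here, and the example readings are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(temp_c: float) -> float:
    """Tetens approximation of saturation vapor pressure (kPa), temp in deg C."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vpd_kpa(temp_c: float, rh_percent: float) -> float:
    """Vapor Pressure Deficit: saturation pressure minus actual vapor pressure."""
    return saturation_vapor_pressure_kpa(temp_c) * (1.0 - rh_percent / 100.0)


# Hypothetical midday readings for the two canopy depths.
outer = vpd_kpa(temp_c=25.0, rh_percent=60.0)
inner = vpd_kpa(temp_c=22.0, rh_percent=75.0)
print(f"outer canopy VPD = {outer:.2f} kPa, inner canopy VPD = {inner:.2f} kPa")
```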
2.4.2. Dataset Structure and Preprocessing
The final dataset comprised 400 total bioimpedance measurements, derived from the longitudinal tracking of 100 fruits across four 15-day intervals (4 timepoints × 100 fruits), each linked to its final Dry Matter label. The final destructive analysis yielded a class distribution of 58 fruits labeled Harvest-Ready and 42 fruits labeled Immature. Class 1 fruits were more prevalent in the sun-exposed upper canopy (35 vs. 23), confirming non-uniform ripening rates.
Table 2 presents a representative sample of the dataset illustrating the feature structure and class labels.
To ensure reproducibility, the raw data from the AD5933 were converted into the full complex impedance spectrum. For each frequency point, the input feature vector was constructed using four components: Magnitude ($|Z|$), Phase Angle ($\theta$), Resistance ($R$), and Reactance ($X$). Outliers were screened using a Z-score threshold. Subsequently, Min-Max normalization was applied to rescale all features to a common range. For the LSTM architecture, the dataset was reshaped into a 3D tensor indexed by fruit, timepoint, and feature, with four timepoints per fruit.
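The feature construction described above can be sketched as follows. The function names are illustrative, the [0, 1] target range for Min-Max scaling is an assumption (the conventional choice), and the resistance and reactance values are hypothetical.

```python
import cmath
import math


def impedance_features(resistance: float, reactance: float):
    """Return [|Z|, phase in degrees, R, X] for one frequency point."""
    z = complex(resistance, reactance)
    return [abs(z), math.degrees(cmath.phase(z)), resistance, reactance]


def min_max_scale(column):
    """Rescale a list of values to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical R/X pairs (ohms) at one frequency for three fruits.
raw = [(3000.0, -4000.0), (2500.0, -3000.0), (2000.0, -2200.0)]
features = [impedance_features(r, x) for r, x in raw]
scaled_magnitude = min_max_scale([row[0] for row in features])
print(features[0], scaled_magnitude)
```

Scaling parameters would normally be fitted on the training folds only and then applied to the validation fold, so that no information leaks across the split.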
2.5. Experimental Protocol
2.5.1. Calibration Protocol and Validation
To rigorously correct for the parasitic impedance of the cabling and the AFE phase shift, a classic Open–Short–Load (OSL) compensation strategy was applied.
Open/Short Correction: Measurements were taken with probes separated (Open) and shorted (Short) to characterize leakage admittance and residual series impedance, respectively.
Load Correction (Gain Factor): A bank of precision metal-film resistors ( tolerance) covering the physiological range (2–) was used to compute the System Gain Factor.
Phase Validation (RC Dummy Cell): A dummy cell consisting of a resistor in parallel with a capacitor was tested. The system achieved a phase accuracy of compared to the theoretical Bode plot of the RC network.
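The theoretical reference for the dummy-cell check follows from the impedance of a parallel RC network, $Z = R / (1 + j\omega RC)$, whose phase is $-\arctan(\omega RC)$. The sketch below evaluates it over the instrument's sweep range; the component values are hypothetical, since the actual dummy-cell values are not reproduced here.

```python
import cmath
import math


def parallel_rc_impedance(r_ohm: float, c_farad: float, freq_hz: float) -> complex:
    """Complex impedance of a resistor in parallel with a capacitor."""
    omega = 2 * math.pi * freq_hz
    return r_ohm / (1 + 1j * omega * r_ohm * c_farad)


def theoretical_phase_deg(r_ohm: float, c_farad: float, freq_hz: float) -> float:
    """Phase angle (degrees) predicted for the parallel RC network."""
    return math.degrees(cmath.phase(parallel_rc_impedance(r_ohm, c_farad, freq_hz)))


R_DUMMY = 10_000.0   # hypothetical resistor (ohms)
C_DUMMY = 10e-9      # hypothetical capacitor (farads)

for freq in (1_000, 5_000, 7_200, 10_000):
    z = parallel_rc_impedance(R_DUMMY, C_DUMMY, freq)
    print(f"{freq:>6} Hz  |Z| = {abs(z):8.1f} ohm  "
          f"phase = {theoretical_phase_deg(R_DUMMY, C_DUMMY, freq):7.2f} deg")
```

The measured phase at each sweep point can then be subtracted from these predictions to obtain the phase error.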
2.5.2. Reference Method: Dry Matter Analysis
To establish the ground truth for the predictive models, destructive analysis was performed only at the end of the longitudinal tracking period (Week 4) for all 100 tagged fruits, following the AOAC standard method [4]. A 10 g sample of the mesocarp was weighed ($W_{\mathrm{fresh}}$), dried in an oven at 105 °C for 24 h, and weighed again ($W_{\mathrm{dry}}$). The Dry Matter percentage was calculated as $\mathrm{DM}\,(\%) = (W_{\mathrm{dry}} / W_{\mathrm{fresh}}) \times 100$. Fruits with $\mathrm{DM} > 23\%$ were labeled Harvest-Ready (Class 1); others were labeled Immature (Class 0) [2].
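The labeling rule can be written as a short sketch. The function names are illustrative, the 23% threshold is taken from the Introduction, and the sample weights are hypothetical.

```python
def dry_matter_percent(fresh_g: float, dry_g: float) -> float:
    """Dry Matter (%) from the fresh and oven-dried sample weights in grams."""
    if fresh_g <= 0 or not 0 <= dry_g <= fresh_g:
        raise ValueError("weights must satisfy 0 <= dry_g <= fresh_g and fresh_g > 0")
    return 100.0 * dry_g / fresh_g


def maturity_label(dm_percent: float, threshold: float = 23.0) -> int:
    """1 = Harvest-Ready (DM above the threshold), 0 = Immature."""
    return 1 if dm_percent > threshold else 0


# Hypothetical samples: (fresh weight, dry weight) in grams.
for fresh, dry in [(10.0, 2.5), (10.0, 2.1)]:
    dm = dry_matter_percent(fresh, dry)
    print(f"DM = {dm:.1f}%  ->  class {maturity_label(dm)}")
```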
2.5.3. Validation Strategy: Stratified Group K-Fold Cross-Validation
A Stratified Group K-Fold Cross-Validation scheme ($K = 5$) was implemented [27]. Unlike standard random splitting, this approach grouped samples by Tree ID. In each iteration, data from 8 trees (80%) were used for training, while data from the remaining 2 trees (20%) were held out strictly for validation. Importantly, all four timepoint measurements from any individual fruit always remained within the same fold, either entirely in training or entirely in validation. This grouping explicitly prevents temporal leakage: no information from a fruit’s early timepoints can appear in training while its later timepoints appear in testing.
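The grouping logic can be illustrated with a minimal pure-Python sketch that splits by Tree ID and checks that no fruit appears on both sides of a split. Class stratification is omitted for brevity, and the sample layout (10 trees, 10 fruits, 4 timepoints) mirrors the study design with synthetic identifiers.

```python
from collections import defaultdict


def group_k_fold(groups, k=5):
    """Yield (train_idx, test_idx) pairs so that no group spans both sides."""
    folds = defaultdict(list)
    for pos, group in enumerate(sorted(set(groups))):
        folds[pos % k].append(group)          # round-robin assignment of groups
    for fold in range(k):
        held_out = set(folds[fold])
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test


# Synthetic sample index: (tree, fruit, timepoint).
samples = [(tree, fruit, t)
           for tree in range(10) for fruit in range(10) for t in range(4)]
tree_ids = [s[0] for s in samples]
splits = list(group_k_fold(tree_ids, k=5))

for train, test in splits:
    train_fruits = {samples[i][:2] for i in train}
    test_fruits = {samples[i][:2] for i in test}
    assert train_fruits.isdisjoint(test_fruits)   # no fruit leaks across the split
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

With scikit-learn available, `StratifiedGroupKFold` from `sklearn.model_selection` provides the same grouping together with class stratification.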
2.5.4. Hyperparameter Tuning
Hyperparameter optimization was performed using an exhaustive Grid Search strategy within the inner loops of cross-validation [25]. For the LSTM network, the Adam optimizer and Binary Cross-Entropy loss were employed, with Early Stopping (patience = 10 epochs) to prevent overfitting.
Table 3 details the search spaces and optimal configurations.
Table 4 presents the sensitivity analysis for the LSTM model, quantifying how mean cross-validation accuracy responds to each major hyperparameter.
3. Results and Discussion
3.1. Preliminary Analysis Based on Feature Engineering
This section presents the analysis of variables prior to evaluating avocado maturity prediction performance. Several feature engineering processes [23] are applied to determine the most relevant variables.
3.1.1. Analysis of Agronomic Covariates and Feature Importance
To determine the relative contribution of environmental vs. physiological variables, the study incorporated auxiliary datasets including cumulative precipitation and soil nutrient concentrations (N, K, P) [28]. Feature importance analysis using Random Forest (Figure 3) revealed that bioimpedance features (specifically at 7.2 kHz and 5 kHz) accounted for over 85% of the model’s decision-making weight. Soil nutrients showed a negligible contribution to the specific task of identifying the immediate harvest maturity window.
3.1.2. Impact of Electrode Configuration
The reliability of the data acquisition system was heavily dependent on the probe design. Consistent with Schwan’s theory on electrode polarization [13], the 4-electrode configuration proved to be superior. As shown in Figure 4, the 2-electrode setup exhibited high variance due to skin resistance. The 4-electrode setup effectively isolated the mesocarp impedance, significantly reducing noise.
Figure 5 shows the bioimpedance trajectories across maturation stages, comparing Harvest-Ready and Immature fruits.
3.1.3. Spectral and Temporal Analysis
Feature engineering revealed specific frequency bands sensitive to maturity. Figure 6 illustrates the correlation between impedance and dry matter. Two distinct peaks were observed at 5000 Hz and 7200 Hz, aligning with Equivalent Circuit Modeling (ECM) approaches [11]. Figure 7 highlights the temporal evolution of the signal, showing a consistent decay in impedance magnitude at 7.2 kHz.
3.2. General Analysis and Model Benchmarking
3.2.1. Robustness and Stability Analysis
The application of Group K-Fold Cross-Validation revealed the stability of the evaluated architectures. Figure 8 illustrates the distribution of accuracy scores: boxplots represent the spread of classification accuracy across the 5 folds. The LSTM model (green) achieves the highest median accuracy and lowest variance. While PLS-DA showed significant variance between folds, the LSTM model demonstrated superior stability with a compact distribution, which is consistent with effective generalization.
Table 5 provides per-fold regression metrics for the LSTM model.
In the case of the best model (LSTM), overfitting risk was managed through three mechanisms: (1) Group K-Fold ensures the model is evaluated on fruits from trees unseen during training; (2) dropout regularization (rate 0.2) reduces co-adaptation of LSTM units; and (3) Early Stopping halts training once the validation loss stops improving. Despite these strategies, the dataset is limited to a single orchard, and generalizability to other agro-climatic zones, where soil salinity, rootstock, and altitude differ, cannot be assumed without additional validation studies.
3.2.2. Model Performance Benchmarking
Table 6 details the metrics for all the evaluated models. The LSTM architecture, which exploits the temporal sequence of impedance measurements, achieved the highest performance across all metrics. The superior accuracy of machine learning approaches over PLS-DA confirms the non-linear nature of the impedance–maturity relationship.
It is important to note that the $R^2$ values reported in Table 6 (ranging from 0.38 for PLS-DA to 0.70 for LSTM) should not be interpreted under conventional regression criteria, since the models addressed in this work are binary classifiers, not continuous regressors. In this context, $R^2$ was computed between the predicted class probabilities (continuous output in [0, 1]) and the binary ground-truth labels (0 or 1), which inherently limits the achievable coefficient of determination: a classifier can assign every fruit to the correct class and still yield $R^2 < 1$ whenever its output probabilities are not exactly 0 or 1. The appropriate performance indicators for evaluating binary classification are Accuracy, AUC, F1-Score, Precision, and Recall, all of which are reported in Table 6 and in Figure 9. By these primary metrics, the LSTM achieves strong performance (Accuracy = 92%, AUC = 0.94, F1 = 0.91), and the moderate $R^2$ values are consistent with these results. The $R^2$ column is included as a supplementary descriptor of the probability-output calibration, and its relatively low values reflect the structural mismatch between a regression-oriented metric and a classification task, not a deficiency in model performance.
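The mismatch between a regression metric and a classification task can be reproduced with a small sketch. It uses the study's 58/42 class split but entirely hypothetical probability outputs (0.8 for every Harvest-Ready fruit and 0.2 for every Immature fruit), chosen so that thresholded accuracy is perfect while the probabilities remain soft.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination between labels and continuous predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(int(p >= threshold) == t for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


labels = [1] * 58 + [0] * 42            # class distribution from the dataset
probs = [0.8] * 58 + [0.2] * 42         # hypothetical soft outputs

acc = accuracy(labels, probs)
r2 = r_squared(labels, probs)
print(f"accuracy = {acc:.2f}, R^2 = {r2:.3f}")
```

Every fruit is classified correctly in this example, yet the coefficient of determination stays well below 1 because the squared distance between each probability and its label still counts as residual error.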
3.2.3. Signal Tracking and Model Fidelity
Figure 10 compares raw bioimpedance measurements and LSTM predictions. Environmental measurement noise is attenuated in the LSTM output, with the physiological impedance decay trajectory reconstructed more smoothly. In this way, the LSTM successfully tracks the general downward trend of maturation while ignoring transient disturbances.
3.2.4. Impact of Longitudinal Data Sequence Length
A critical hypothesis of this study is that the history of the fruit’s development contains predictive information unavailable in a single measurement. To quantify the value of the longitudinal dataset size (temporal depth), the performance of the LSTM model (trained on the full sequence of 4 bi-weekly measurements) was compared against single-point (“static”) baseline models trained exclusively on the final snapshot data (Week 4).
Table 7 demonstrates that restricting the dataset to a single time-point ($T = 1$) results in a performance drop. The inclusion of the full temporal sequence ($T = 4$) improves Mean Accuracy by 4.0 percentage points over the best single-point classifier. To assess the statistical significance of this improvement, a paired t-test [29] was conducted on the fold-level accuracy scores of the LSTM versus Random Forest, yielding $p < 0.05$ and confirming that the improvement is statistically significant.
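The paired comparison on fold-level scores can be sketched as follows. The per-fold accuracies are illustrative placeholders and are not the study's fold scores. Since the standard library has no t-distribution, the statistic is compared against the two-sided 5% critical value for 4 degrees of freedom (2.776).

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)            # sample standard deviation
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Illustrative per-fold accuracies (placeholders, one value per fold).
lstm_acc = [0.93, 0.91, 0.92, 0.94, 0.90]
rf_acc = [0.88, 0.88, 0.87, 0.90, 0.87]

t_stat, dof = paired_t_statistic(lstm_acc, rf_acc)
T_CRIT_DF4 = 2.776                            # two-sided, alpha = 0.05, df = 4
print(f"t = {t_stat:.3f}, df = {dof}, significant = {abs(t_stat) > T_CRIT_DF4}")
```

Pairing by fold matters because both models are scored on the same held-out trees, so the test operates on within-fold differences rather than on two independent samples.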
3.3. Comparative Analysis with Other Avocado Maturity Assessment Technologies
This section presents a qualitative comparison of our approach with similar studies. The criteria considered include the cost of implementing the technology, whether it involves destroying or touching the fruit, the accuracy of the maturity measurements obtained, whether it takes into account the temporal aspect of avocado maturity, and whether special field preparation is required for its use.
Table 8 shows the results of this comparison. While Wi-Fi CSI and NIR eliminate contact, they require expensive infrastructure and do not consider the temporal aspect in avocado maturity assessment. Also, Wi-Fi CSI, NIR, and EIS+LSTM require field preparation for data collection. Dry Matter is destructive and does not consider the temporal aspect. The proposed EIS+LSTM approach offers a cost-effective solution for field measurements where infrastructure is limited.
4. Conclusions
This study developed a low-cost, field-deployable bioimpedance spectroscopy system, and implemented four predictive models on the longitudinal impedance data provided by the system for non-destructive classification of Hass avocado harvest maturity in Colombia. The system has a custom tetrapolar AFE, designed around the AD5933 converter, which was validated through Open–Short–Load calibration, successfully isolating mesocarp impedance from surface contact artifacts. The identified 5.0–7.2 kHz frequency band exhibited the strongest correlation with destructive DM measurements, consistent with known electrochemical changes associated with cell membrane degradation during physiological maturation.
From a predictive perspective, among the evaluated classifiers, the LSTM architecture achieved the highest mean accuracy (92.0%, AUC = 0.94), outperforming the linear PLS-DA baseline by 14 percentage points and the best single-point classifier by 4.0%. These results indicate that harvest maturity is more effectively characterized as a dynamic trajectory than as a scalar threshold at a single observation.
These preliminary findings are subject to important limitations. The dataset comprises 100 fruits from a single high-altitude Andean orchard and reflects the specific agro-ecological conditions of that region. The bioimpedance baselines are sensitive to agro-ecological factors, including soil salinity, seasonal rainfall, rootstock variety, and altitude. Generalization to other production regions requires additional calibration studies to adjust the predictive models to the environmental values in each region and avoid degrading their performance. Finally, the AD5933 hardware limits exploration to frequencies below 100 kHz; the exploration of high-frequency phenomena (>100 kHz) could offer additional tissue insight.
Future work will focus on extending the dataset to multiple orchards and implementing Transfer Learning protocols to reduce the cost of deployment in new sites. For example, a pre-trained LSTM core fine-tuned with small calibration datasets from new orchards would minimize the need for extensive destructive testing and retraining. Additionally, the integration of this sensor technology into autonomous IoT nodes or drone-mounted manipulators is planned. This would enable the creation of real-time “Maturity Maps” for large-scale plantations, allowing producers to execute selective harvesting strategies based on precise physiological data rather than calendar estimation.
Author Contributions
Conceptualization, F.J.S. and J.A.; methodology, F.J.S.; software, F.J.S.; validation, J.A. and M.T.-B.; formal analysis, F.J.S.; investigation, F.J.S.; resources, M.T.-B.; writing—original draft preparation, F.J.S.; writing—review and editing, J.A. and M.T.-B.; supervision, J.A. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable for studies involving plants.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are available on request from the corresponding author.
Acknowledgments
The authors acknowledge EAFIT University for providing the laboratory facilities for the Dry Matter analysis.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Xavier, P.; Rodrigues, P.M.; Silva, C.L.M. Shelf-Life Management and Ripening Assessment of ’Hass’ Avocado (Persea americana) Using Deep Learning Approaches. Foods 2024, 13, 1150.
- Carvalho, C.; Velásquez, M.; Van Rooyen, Z. Determination of the minimum dry matter index for the optimum harvest of ’Hass’ avocado fruits in Colombia. Agron. Colomb. 2015, 33, 221–227.
- Aline, U.; Bhattacharya, T.; Faqeerzada, M.A.; Kim, M.S.; Baek, I.; Cho, B.K. Advancement of non-destructive spectral measurements for the quality of major tropical fruits and vegetables: A review. Front. Plant Sci. 2023, 14, 1240361.
- Horwitz, W. Official Methods of Analysis of AOAC International, 17th ed.; The Association of Official Analytical Chemists: Gaithersburg, MD, USA, 2000.
- Abukhalil, T.; Gao, Y.; Ranasinghe, A. HapticFormers: Utilizing Transformers for Avocado Maturity Grading through Vision-based Tactile Assessment. IEEE Robot. Autom. Lett. 2024, 9, 3533–3540.
- Wang, Z.; Huang, X.; Liang, X.; Wu, B.; Tang, W. A Handheld IoT Vis/NIR Spectroscopic System to Assess the Soluble Solids Content of Wine Grapes. Sensors 2024, 24, 4636.
- Wu, W.; Wei, S.; Ma, J.; Liu, M.; Wang, H.; Zhang, R.; Peng, Z. Non-Destructive Detection of Fruit Quality: Technologies, Applications and Prospects. Horticulturae 2025, 10, 615.
- Tan, S.; Yang, J. Object Sensing for Fruit Ripeness Detection Using WiFi Signals. arXiv 2021, arXiv:2106.00860.
- Davur, Y.J.; Kämper, W.; Khoshelham, K.; Trueman, S.J.; Bai, S.H. Estimating the Ripeness of Hass Avocado Fruit Using Deep Learning with Hyperspectral Imaging. Horticulturae 2023, 9, 599.
- Garillos-Manliguez, C.A.; Chiang, J.Y. Multimodal deep learning and visible-light and hyperspectral imaging for fruit maturity estimation. Sensors 2021, 21, 1288.
- Kluza, M.; Karpiel, I.; Duch, K.; Komorowski, D.; Sieciński, S. An Assessment of the Freshness of Fruits and Vegetables Through the Utilization of Bioimpedance Spectroscopy (BIS)—A Preliminary Study. Foods 2025, 14, 947.
- Islam, M.N.; Kim, S.H.; Kim, Y.J. Few shot learning for avocado maturity determination from microwave images. J. Agric. Food Res. 2024, 15, 100977.
- Schwan, H. Electrode polarization impedance and its influence on biological impedance measurements. Ann. N. Y. Acad. Sci. 1968, 148, 191–209.
- Fernando Soane, J.F. Analog Front-End Enables Electrical Impedance Spectroscopy System On-Chip for Biomedical Applications. Physiol. Meas. 2015, 6, 67–78.
- Mancero-Castillo, D.; Garcia, Y.; Aguirre-Munizaga, M.; Ponce de Leon, D.; Portalanza, D.; Avila-Santamaria, J. Dynamic perspectives into tropical fruit production: A review of modeling techniques. Front. Agron. 2024, 6, 1482893.
- Xiao, X.; Fu, L.; Zhang, Z.; Tu, Z.; Shen, N.; Fan, S. Recent Advances in Integrating Machine Learning with Omics Approaches in Food Science and Nutrition Research. J. Agric. Food Chem. 2025, 73, 29998–30025.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Amruta, A.; Mali, D.S.P. Deep Learning-Based Fruit Detection and Ripeness Assessment. Int. Res. J. Adv. Eng. Manag. 2025, 12, 678.
- Ochoa-Ascencio, S.; Hertog, M.; Nicolaï, B. Physicochemical changes in ’Hass’ avocado (Persea americana Mill.) during ripening at different temperatures. J. Food Eng. 2021, 90, 551–564.
- Ramírez-Gil, J.; Morales-Osorio, J.; Peterson, A. Potential geography and productivity of “Hass” avocado crops in Colombia estimated by ecological niche modeling. Sci. Hortic. 2018, 237, 287–295.
- Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
- Morales, L.; Ouedraogo, C.; Aguilar, J.; Chassot, C.; Medjiah, S.; Drira, K. Experimental comparison of the diagnostic capabilities of classification and clustering algorithms for the QoS management in an autonomic IoT platform. Serv. Oriented Comput. Appl. 2019, 13, 199–219.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Quintero, Y.; Ardila, D.; Camargo, E.; Rivas, F.; Aguilar, J. Machine learning models for the prediction of the SEIRD variables for the COVID-19 pandemic based on a deep dependence analysis of variables. Comput. Biol. Med. 2021, 134, 104500.
- Moreno-Ortega, G.; Pliego, C.; Sarmiento, D.; Barceló, A.; Martínez-Ferri, E. Yield and fruit quality of avocado trees under different regimes of water supply in the subtropical coast of Spain. Agric. Water Manag. 2019, 221, 192–201.
- Hoyos, W.; Aguilar, J.; Toro, M. A clinical decision-support system for dengue based on fuzzy cognitive maps. Health Care Manag. Sci. 2022, 25, 666–681.
- Salazar-García, S.; González-Durán, I.; Tapia-Vargas, L. Influence of climate, soil moisture, and flowering phenology on biomass and nutrient composition of ’Hass’ avocado fruit in Michoacan, Mexico. Rev. Chapingo Ser. Hortic. 2011, 17, 183–194.
- Okoye, K.; Hosseini, S. T-test Statistics in R: Independent Samples, Paired Sample, and One Sample T-tests. In R Programming: Statistical Data Analysis in Research; Springer Nature: Singapore, 2024; pp. 159–186.
Figure 1.
Analog Front End (AFE) Schematic for 4-Terminal Adaptation.
Figure 2.
Effect of Canopy Position on Microclimate and Fruit Physiology. ((A) The Vapor Pressure Deficit (VPD) is significantly higher in the outer canopy during midday hours. (B) This environmental disparity results in distinct bioimpedance profiles (dots represent individual observations), with inner-canopy fruits exhibiting higher impedance values (slower maturation). This justifies the use of stratified sampling.)
Figure 3.
Feature Importance Ranking. (Bioimpedance signals (blue) dominate the predictive model. Microclimate variables such as VPD (orange) have moderate relevance, while soil nutrients (gray) show minimal relevance for immediate maturity detection.)
Figure 4.
Distribution of impedance magnitude measurements across three electrode configurations at 5 kHz (The 4-electrode setup (black) shows markedly lower variance than the 2-electrode setup. The dots represent individual observations).
Figure 5.
Nyquist Plot (The arc contracts as the fruit transitions from Immature (blue) to Harvest-Ready (red), indicating loss of cell wall capacitance).
Figure 6.
Spectral Sensitivity Analysis. (Pearson correlation between impedance features and dry matter content across the 1–10 kHz range. Peaks at 5 kHz and 7.2 kHz identify the optimal measurement frequencies for maturity discrimination.)
Figure 7.
Longitudinal tracking of impedance (Impedance shows a downward trend toward harvest. The dots represent individual observations).
Figure 8.
Cross-Validation Performance (Boxplots of classification accuracy across 5 folds).
Figure 9.
Comparative performance of the four models based on K-Fold Cross-Validation results (The LSTM network provides the highest mean accuracy (92%)).
Figure 10.
Time-Series Tracking Performance (Comparison of raw sensor data (blue) versus LSTM model output (orange)).
Table 1.
Summary of Representative Studies on Non-Destructive Approaches to Assessing the Maturity of Avocado and other Tropical Fruits (EIS: Electrical Impedance Spectroscopy; NIR: Near-Infrared; CSI: Channel State Information).
| Reference | Method | Fruit | Temporal | Key Limitation |
|---|---|---|---|---|
| [6] | NIR | Avocado | No | High equipment cost |
| [7] | Acoustic | Multiple | No | Geometry-sensitive |
| [8] | Wi-Fi CSI | Avocado | No | Infrastructure required |
| [12] | EIS (2-electrode) | Mango/Banana | No | Contact artifact |
| [15] | ML (survey) | Tropical | No | No EIS integration |
| This work | EIS (4-electrode) + LSTM | Avocado | Yes | |
Table 2.
Representative Sample of the Bioimpedance Dataset. (Values at 7.2 kHz. |Z|: impedance magnitude (Ω); θ: phase angle (°); R: resistance (Ω); X: reactance (Ω).)
| Fruit ID | Tree | Week | \|Z\| (Ω) | θ (°) | R (Ω) | X (Ω) | Class |
|---|---|---|---|---|---|---|---|
| F-001 | T-01 | 1 | 4820 | −8.3 | 4768 | −694 | – |
| F-001 | T-01 | 2 | 4510 | −9.1 | 4450 | −714 | – |
| F-001 | T-01 | 3 | 4190 | −9.8 | 4128 | −711 | – |
| F-001 | T-01 | 4 | 3870 | −10.4 | 3803 | −698 | 1 (Ready) |
| F-012 | T-02 | 1 | 6230 | −5.2 | 6207 | −564 | – |
| F-012 | T-02 | 2 | 6080 | −5.6 | 6051 | −593 | – |
| F-012 | T-02 | 3 | 5910 | −5.9 | 5877 | −607 | – |
| F-012 | T-02 | 4 | 5780 | −6.1 | 5748 | −614 | 0 (Immature) |
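The resistance and reactance columns follow from the magnitude and phase columns through the standard polar-to-rectangular relations R = |Z| cos θ and X = |Z| sin θ. The sketch below applies them to the four F-001 rows of Table 2; the helper name `polar_to_rect` is illustrative and not taken from the source. The computed values agree with the tabulated R and X to within a few ohms, which is consistent with rounding of the reported magnitude and phase.

```python
import math


def polar_to_rect(magnitude_ohm, phase_deg):
    """Return (R, X) in ohms for an impedance given as magnitude and phase."""
    phase_rad = math.radians(phase_deg)
    resistance = magnitude_ohm * math.cos(phase_rad)
    reactance = magnitude_ohm * math.sin(phase_rad)
    return resistance, reactance


# (|Z| in ohms, phase in degrees) for fruit F-001, weeks 1-4, copied from Table 2
f001_rows = [(4820, -8.3), (4510, -9.1), (4190, -9.8), (3870, -10.4)]

for week, (z, theta) in enumerate(f001_rows, start=1):
    r, x = polar_to_rect(z, theta)
    print(f"week {week}: |Z|={z} ohm, phase={theta} deg -> R={r:.0f} ohm, X={x:.0f} ohm")
```

A negative reactance, as in every row here, corresponds to capacitive behavior.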
Table 3.
Hyperparameter Search Space and Optimal Configuration (The optimal values represent the mode selected across the 5 cross-validation folds).
| Model | Hyperparameter | Grid Search Space | Optimal Value |
|---|---|---|---|
| PLS-DA | Number of Components | | 4 |
| SVM | Kernel | {‘linear’, ‘rbf’, ‘poly’} | ‘rbf’ |
| | C (Regularization) | {0.1, 1, 10, 100, 1000} | 10 |
| | γ (Gamma) | {‘scale’, ‘auto’, 0.01, 0.1} | ‘scale’ |
| Random Forest | N Estimators | | 200 |
| | Max Depth | | 20 |
| | Min Samples Split | | 5 |
| LSTM | Hidden Units (L1) | | 64 |
| | Hidden Units (L2) | | 32 |
| | Dropout Rate | | 0.2 |
| | Batch Size | | 16 |
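As an illustration of how a grid such as the SVM search space in Table 3 expands into candidate configurations, the following minimal sketch enumerates every combination with the standard library. The dict-of-lists layout mirrors the `param_grid` form accepted by scikit-learn's `GridSearchCV`, but that choice is an assumption: the source does not name the library used. `expand_grid` is a hypothetical helper.

```python
from itertools import product

# SVM search space as listed in Table 3 (layout is an assumption, see lead-in)
svm_param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1, 10, 100, 1000],
    "gamma": ["scale", "auto", 0.01, 0.1],
}


def expand_grid(param_grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


candidates = list(expand_grid(svm_param_grid))
print(len(candidates))  # 3 kernels * 5 C values * 4 gamma values = 60

# Configuration reported as optimal in Table 3
optimal = {"kernel": "rbf", "C": 10, "gamma": "scale"}
print(optimal in candidates)
```

With 5-fold cross-validation, each of the 60 candidates would be fitted five times, so an exhaustive search over this grid costs 300 model fits.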
Table 4.
LSTM Hyperparameter Sensitivity (Mean accuracy (±SD) across a 5-fold CV at each tested value, holding other parameters at their optimal values. The values highlighted in bold are the best values for each parameter).
| Hyperparameter | Values Tested | Optimal | Best Acc. | Sensitivity |
|---|---|---|---|---|
| Hidden Units (L1) | 32, **64**, 128 | 64 | | Medium |
| Hidden Units (L2) | 16, **32**, 64 | 32 | | Low |
| Dropout Rate | 0.1, **0.2**, 0.3, 0.5 | 0.2 | | Medium |
| Batch Size | 8, **16**, 32 | 16 | | Low |
| Learning Rate | 0.0001, **0.001**, 0.01 | 0.001 | | High |
Table 5.
LSTM Per-Fold Cross-Validation Metrics (MAE: Mean Absolute Error; RMSE: Root Mean Square Error; R²: coefficient of determination between predicted probability and binary label).
| Fold | Accuracy | AUC | MAE | RMSE | R² |
|---|---|---|---|---|---|
| 1 | 0.93 | 0.95 | 0.09 | 0.18 | 0.72 |
| 2 | 0.91 | 0.93 | 0.11 | 0.21 | 0.68 |
| 3 | 0.92 | 0.94 | 0.10 | 0.19 | 0.70 |
| 4 | 0.93 | 0.95 | 0.09 | 0.18 | 0.73 |
| 5 | 0.91 | 0.93 | 0.11 | 0.20 | 0.69 |
| Mean | 0.92 | 0.94 | 0.10 | 0.19 | 0.70 |
| SD | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 |
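The Mean and SD rows of Table 5 can be reproduced from the five per-fold values. The sketch below assumes the sample standard deviation (n − 1 denominator); the source does not state which estimator was used, although both give the same figures at two decimal places for these data. The key `"R2"` stands for the coefficient-of-determination column.

```python
from statistics import mean, stdev

# Per-fold values copied from Table 5
folds = {
    "Accuracy": [0.93, 0.91, 0.92, 0.93, 0.91],
    "AUC": [0.95, 0.93, 0.94, 0.95, 0.93],
    "MAE": [0.09, 0.11, 0.10, 0.09, 0.11],
    "RMSE": [0.18, 0.21, 0.19, 0.18, 0.20],
    "R2": [0.72, 0.68, 0.70, 0.73, 0.69],
}

# stdev() is the sample standard deviation (assumption, see lead-in)
summary = {name: (mean(values), stdev(values)) for name, values in folds.items()}

for name, (m, s) in summary.items():
    print(f"{name}: mean={m:.2f}, sd={s:.2f}")
```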
Table 6.
Comprehensive Model Benchmarking. (Accuracy reported as Mean ± SD with 95% CI. MAE, RMSE, and R² computed on predicted class probabilities against binary labels. MedAE: Median Absolute Error.)
| Model | Accuracy [95% CI] | Prec. | Rec. | F1 | AUC | MAE | RMSE | R² | MedAE |
|---|---|---|---|---|---|---|---|---|---|
| PLS-DA | 0.78 ± 0.04 [0.73, 0.82] | 0.75 | 0.72 | 0.73 | 0.79 | 0.21 | 0.35 | 0.38 | 0.19 |
| SVM | 0.85 ± 0.03 [0.81, 0.88] | 0.83 | 0.81 | 0.82 | 0.86 | 0.15 | 0.28 | 0.55 | 0.13 |
| Random Forest | 0.88 ± 0.02 [0.85, 0.90] | 0.86 | 0.85 | 0.85 | 0.90 | 0.12 | 0.24 | 0.63 | 0.10 |
| LSTM | 0.92 ± 0.01 [0.90, 0.94] | 0.91 | 0.91 | 0.91 | 0.94 | 0.10 | 0.19 | 0.70 | 0.08 |
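The regression-style columns of Table 6 treat the predicted class probability as a continuous estimate of the 0/1 label. A minimal sketch of how such metrics can be computed is given below. The labels and probabilities are hypothetical values chosen for illustration, not data from the study, and `probability_metrics` is an illustrative helper name.

```python
import math
from statistics import mean, median


def probability_metrics(y_true, y_prob):
    """MAE, RMSE, R^2 and MedAE of predicted probabilities against 0/1 labels."""
    errors = [p - y for y, p in zip(y_true, y_prob)]
    abs_errors = [abs(e) for e in errors]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    y_bar = mean(y_true)
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)  # total sum of squares
    return {
        "MAE": mean(abs_errors),
        "RMSE": math.sqrt(ss_res / len(errors)),
        "R2": 1.0 - ss_res / ss_tot,
        "MedAE": median(abs_errors),
    }


# Hypothetical labels (1 = Ready, 0 = Immature) and predicted probabilities
y_true = [1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.1, 0.3, 0.6, 0.2]

metrics = probability_metrics(y_true, y_prob)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because the labels are binary, these quantities measure how confidently and consistently a model separates the two classes, which complements the thresholded accuracy, precision, and recall.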
Table 7.
Impact of Temporal Depth on Model Accuracy (Comparison between single-point (static) and multi-point (longitudinal) inputs).
| Dataset Scope | Model | Mean Accuracy | AUC | Gain |
|---|---|---|---|---|
| Single-Point (Week 4 only) | Random Forest | | 0.90 | – |
| Longitudinal (Weeks 1–4) | LSTM | | 0.94 | * |
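The difference between the two dataset scopes in Table 7 can be made concrete with the rows of Table 2. The sketch below groups weekly readings by fruit into four-step sequences (the longitudinal input) and keeps only the week-4 reading for the single-point input. It assumes that each time step carries the four tabulated features (|Z|, phase, R, X) and that the label is the class recorded at week 4; the actual feature set fed to the models may differ, and `build_sequences` is a hypothetical helper.

```python
from collections import defaultdict

# Rows copied from Table 2:
# (fruit_id, week, |Z| ohm, phase deg, R ohm, X ohm, class or None)
records = [
    ("F-001", 1, 4820, -8.3, 4768, -694, None),
    ("F-001", 2, 4510, -9.1, 4450, -714, None),
    ("F-001", 3, 4190, -9.8, 4128, -711, None),
    ("F-001", 4, 3870, -10.4, 3803, -698, 1),   # 1 = Ready
    ("F-012", 1, 6230, -5.2, 6207, -564, None),
    ("F-012", 2, 6080, -5.6, 6051, -593, None),
    ("F-012", 3, 5910, -5.9, 5877, -607, None),
    ("F-012", 4, 5780, -6.1, 5748, -614, 0),    # 0 = Immature
]


def build_sequences(rows):
    """Group weekly readings per fruit; the label is the class at the last week."""
    by_fruit = defaultdict(list)
    for fruit_id, week, *features, label in rows:
        by_fruit[fruit_id].append((week, features, label))
    ids, sequences, labels = [], [], []
    for fruit_id, readings in sorted(by_fruit.items()):
        readings.sort(key=lambda reading: reading[0])  # chronological order
        ids.append(fruit_id)
        sequences.append([features for _, features, _ in readings])
        labels.append(readings[-1][2])
    return ids, sequences, labels


ids, longitudinal, labels = build_sequences(records)

# Single-point input: only the final (week 4) reading of each fruit
single_point = [sequence[-1] for sequence in longitudinal]

print(len(longitudinal), len(longitudinal[0]), len(longitudinal[0][0]))
print(single_point)
```

The longitudinal array has shape (fruits, weeks, features), which is the layout a recurrent model consumes, whereas the single-point array drops the time axis and with it the week-to-week trend.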
Table 8.
Comparison of Avocado Maturity Assessment Technologies.
| Method | Type | Accuracy | Cost | Temporal | Field Ready |
|---|---|---|---|---|---|
| Dry Matter (Oven) | Destructive | 100% (Ref) | Low | No | No |
| NIR Spectroscopy | Non-Destructive | >95% | High | No | Yes |
| FruitSense (Wi-Fi) [8] | Non-Contact | ≈91% | Medium | No | Yes |
| Proposed (EIS+LSTM) | Non-Destructive | 92% | Low | Yes | Yes |