MDPI - Publisher of Open Access Journals

25 pages, 47805 KB

Open Access Article

Comparative Evaluation of Nine Machine Learning Models for Target and Background Noise Classification in GM-APD LiDAR Signals Using Monte Carlo Simulations

by Hongchao Ni, Jianfeng Sun, Xin Zhou, Di Liu, Xin Zhang, Jixia Cheng, Wei Lu and Sining Li

Remote Sens. 2025, 17(21), 3597; https://doi.org/10.3390/rs17213597 - 30 Oct 2025

Abstract

This study proposes a complete data-processing framework for Geiger-mode avalanche photodiode (GM-APD) light detection and ranging (LiDAR) echo signals. It investigates the feasibility of classifying target and background noise using machine learning. Four feature processing schemes were first compared, among which the PNT strategy (Principal Component Analysis without tail features) was identified as the most effective and adopted for subsequent analysis. Based on this framework, nine models derived from six baseline algorithms, namely Decision Trees (DTs), Support Vector Machines (SVMs), Backpropagation Neural Networks (NN-BPs), Linear Discriminant Analysis (LDA), Logistic Regression (LR), and k-Nearest Neighbors (KNN), were systematically assessed under Monte Carlo simulations with varying echo signal-to-noise ratio (ESNR) and statistical frame number (SFN) conditions. Model performance was evaluated using eight metrics: accuracy, precision, recall, false positive rate (FPR), false negative rate (FNR), F1-score, Kappa coefficient, and relative change percentage (RCP). Monte Carlo simulations were employed to generate datasets, and Principal Component Analysis (PCA) was applied for feature extraction in the machine learning training process. The results show that LDA achieves the shortest training time (0.38 s at SFN = 20,000), DT maintains stable accuracy (0.7171–0.8247) across different SFNs, and NN-BP models perform optimally under low-SNR conditions. Specifically, NN-BP-3 achieves the highest test accuracy of 0.9213 at SFN = 20,000, while NN-BP-2 records the highest training accuracy of 0.9137. Regarding stability, NN-BP-3 exhibits the smallest RCP value (0.0111), whereas SVM-3 yields the largest (0.1937) at the same frame count. In conclusion, NN-BP-based models demonstrate clear advantages in classifying sky-background noise. Building on this, we design a ResNet based on NN-BP, which achieves further accuracy gains over the best baseline at 400, 2000, and 20,000 frames (12.5%, 9.16%, and 2.79%, respectively), clearly demonstrating the advantage of NN-BP for GM-APD LiDAR signal classification. This research thus establishes a novel framework for GM-APD LiDAR signal classification, provides the first systematic comparison of multiple machine learning models, and highlights the trade-off between accuracy and computational efficiency. The findings confirm the feasibility of applying machine learning to GM-APD data and offer practical guidance for balancing detection performance with real-time requirements in field applications.
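As a rough illustration of the evaluation metrics named in the abstract above, the following sketch derives accuracy, precision, recall, FPR, FNR, F1-score, and Cohen's Kappa from binary confusion-matrix counts. The function name `binary_metrics` and its argument names are hypothetical and not taken from the paper; RCP is left out because the abstract does not define it.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Names are illustrative, not from the paper.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate
    fpr = fp / (fp + tn)               # false positive rate
    fnr = fn / (fn + tp)               # false negative rate
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's Kappa: agreement beyond what the marginals alone would give.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "fnr": fnr,
        "f1": f1,
        "kappa": kappa,
    }
```

The sketch assumes every denominator is non-zero; a production version would need to guard against empty classes.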

14 pages, 916 KB

Open Access Article

Limited Spectroscopy Data and Machine Learning for Detection of Zika Virus Infection in Aedes aegypti Mosquitoes

by Leonardo Reigoto, Rafael Maciel-de-Freitas, Maggy T. Sikulu-Lord, Gabriela A. Garcia, Gabriel Araujo and Amaro Lima

Trop. Med. Infect. Dis. 2025, 10(11), 308; https://doi.org/10.3390/tropicalmed10110308 - 29 Oct 2025

Viewed by 177

Abstract

This study presents a technique for classifying Aedes aegypti mosquitoes infected with the Zika virus under laboratory conditions. Our approach combines near-infrared spectroscopy with machine learning algorithms. The model uses the absorption of light from 350 to 1000 nm. It integrates Linear Discriminant Analysis (LDA) of a windowed version of the signal to exploit non-linearities, along with a Support Vector Machine (SVM) for classification. The proposed methodology can identify the presence of the Zika virus in intact mosquitoes with a balanced accuracy of 96% (row C2HT, average of columns TPR (%) and SPC (%)) when heads/thoraces of mosquitoes are scanned at 4, 7, and 10 days post virus infection. The model was 97.1% accurate (10 DPI, row C2AB, column ACC (%)) on the mosquitoes used to test it, i.e., mosquitoes scanned 10 days post-infection and mosquitoes whose abdomens were scanned. Notable benefits include its cost-effectiveness and the capability for real-time predictions. This work also demonstrates the role played by different spectral wavelengths in predicting an infection in mosquitoes.
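The abstract above reports balanced accuracy as the average of the TPR and SPC columns. A minimal sketch of that calculation follows; the function name and the example counts used to exercise it are hypothetical and not drawn from the paper's tables.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (TPR) and specificity (SPC) for a binary task."""
    tpr = tp / (tp + fn)   # fraction of infected samples correctly flagged
    spc = tn / (tn + fp)   # fraction of uninfected samples correctly cleared
    return (tpr + spc) / 2
```

Unlike plain accuracy, this average is not inflated when one class greatly outnumbers the other.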

(This article belongs to the Special Issue Beyond Borders—Tackling Neglected Tropical Viral Diseases)

31 pages, 20333 KB

Open Access Article

Towards Sustainable Development: Landslide Susceptibility Assessment with Sample Optimization in Guiyang County, China

by Yuzhong Kong, Kangcheng Zhu, Hua Wu, Chong Xu, Ze Meng, Hui Kong, Wen Tan, Xiangyun Kong, Xingwang Chen, Linna Chen and Tong Xu

Sustainability 2025, 17(21), 9575; https://doi.org/10.3390/su17219575 - 28 Oct 2025

Viewed by 135

Abstract

Here we present a high-resolution landslide susceptibility model for Guiyang County, China, developed to support sustainable disaster risk management. Our approach couples optimized positive and negative training samples with an ensemble of machine-learning algorithms to maximize predictive fidelity. We compiled a georeferenced inventory of 146 landslides by integrating historical records with systematic field validation. Sample optimization was central to our methodology: landslide presence points were refined via buffer-based dilution, and four classifiers (SVM, LDA, RF, and ET) were trained with identical covariate sets to ensure comparability. Three strategies for selecting pseudo-absences (buffering, low-slope filtering, and coupling with the IOE) were benchmarked. The Slope-IOE-O model, which synergizes low-gradient screening with entropy-weighted sampling, yielded the highest predictive capacity (AUC = 0.965). SHAP-based interpretability revealed that slope, monthly maximum rainfall, surface roughness, and elevation collectively dominate susceptibility, with pronounced non-linearities and interactions. Slope contribution peaks at 20–30°, monthly maximum rainfall exhibits a critical threshold near 225 mm, and the synergy between high roughness and road density amplifies landslide risk. Spatially, susceptibility follows a pronounced north–south gradient, with high-hazard corridors aligned along northern and southern mountain belts and the urban core of southern Guiyang County. By integrating rigorously curated training data with robust machine-learning workflows, this study provides a transferable framework for proactive landslide risk assessment, offering scientific support for sustainable land-use planning and resilient development in mountainous regions.

18 pages, 1933 KB

Open Access Article

Clinical Application of Machine Learning Models for Early-Stage Chronic Kidney Disease Detection

by Hasnain Iftikhar, Atef F. Hashem, Moiz Qureshi and Paulo Canas Rodrigues

Diagnostics 2025, 15(20), 2610; https://doi.org/10.3390/diagnostics15202610 - 16 Oct 2025

Viewed by 523

Abstract

Background/Objectives: Chronic kidney disease (CKD) is a progressive condition that affects the body’s ability to remove waste and regulate fluid and electrolytes. Early detection is crucial for delaying disease progression and initiating timely interventions. Machine learning (ML) techniques have emerged as powerful tools for automating disease diagnosis and prognosis. This study aims to evaluate the predictive performance of individual and ensemble ML algorithms for the early classification of CKD. Methods: A clinically annotated dataset was utilized to categorize patients into CKD and non-CKD groups. The models investigated included Logistic Regression, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Ridge Classifier, Naïve Bayes, K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Ensemble learning strategies. A systematic preprocessing pipeline was implemented, and model performance was assessed using accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUC). Results: The empirical findings reveal that ML-based classifiers achieved high predictive accuracy in CKD detection. Ensemble learning methods outperformed individual models in terms of robustness and generalization, indicating their potential in clinical decision-making contexts. Conclusions: The study demonstrates the efficacy of ML-based frameworks for early CKD prediction, offering a scalable, interpretable, and accurate clinical decision support approach. The proposed methodology supports timely diagnosis and can assist healthcare professionals in improving patient outcomes.

(This article belongs to the Special Issue Machine-Learning-Based Disease Diagnosis and Prediction)

21 pages, 6242 KB

Open Access Article

Motor Imagery Acquisition Paradigms: In the Search to Improve Classification Accuracy

by David Reyes, Sebastian Sieghartsleitner, Humberto Loaiza and Christoph Guger

Sensors 2025, 25(19), 6204; https://doi.org/10.3390/s25196204 - 7 Oct 2025

Viewed by 498

Abstract

In recent years, advances in medicine have been evident thanks to technological growth and interdisciplinary research, which has allowed the integration of knowledge, for example, of engineering into medical fields. This integration has generated developments and new methods that can be applied in alternative situations, highlighting, for example, aspects related to post-stroke therapies, Multiple Sclerosis (MS), or Spinal Cord Injury (SCI) treatments. One of the methods that has stood out and is gaining more acceptance every day is the Brain–Computer Interface (BCI): through the acquisition and processing of brain electrical activity, researchers, doctors, and scientists manage to transform this activity into control signals. There are several methods for operating a BCI; this work focuses on motor imagery (MI)-based BCI and three types of acquisition paradigms (traditional arrow, picture, and video), seeking to improve the accuracy in the classification of motor imagination tasks for naive subjects, which correspond to an MI task for both the left and the right hand. A pipeline and methodology were implemented using the CAR+CSP algorithm to extract the features and simple, standard, and widely used models such as LDA and SVM for classification. The methodology was tested with data from post-stroke (PS) subjects with BCI experience, obtaining 96.25% accuracy for the best performance, and with the novel paradigm proposed for the naive subjects, 97.5% was obtained. Several statistical tests were carried out in order to find differences between paradigms within the collected data. In conclusion, it was found that the classification accuracy could be improved by using different strategies in the acquisition stage.

(This article belongs to the Section Biomedical Sensors)

24 pages, 1454 KB

Open Access Article

AI-Driven Monitoring for Fish Welfare in Aquaponics: A Predictive Approach

by Jorge Saúl Fandiño Pelayo, Luis Sebastián Mendoza Castellanos, Rocío Cazes Ortega and Luis G. Hernández-Rojas

Sensors 2025, 25(19), 6107; https://doi.org/10.3390/s25196107 - 3 Oct 2025

Viewed by 583

Abstract

This study addresses the growing need for intelligent monitoring in aquaponic systems by developing a predictive system based on artificial intelligence and environmental sensing. The goal is to improve fish welfare through the early detection of adverse water conditions. The system integrates low-cost digital sensors to continuously measure key physicochemical variables (pH, dissolved oxygen, and temperature), using these as inputs for real-time classification of fish health status. Four supervised machine learning models were evaluated: linear discriminant analysis (LDA), support vector machines (SVMs), neural networks (NNs), and random forest (RF). A dataset of 1823 instances was collected over eight months from a red tilapia aquaponic setup. The random forest model yielded the highest classification accuracy (99%), followed by NN (98%) and SVM (97%). LDA achieved 82% accuracy. Performance was validated using 5-fold cross-validation and label permutation tests to confirm model robustness. These results demonstrate that sensor-based predictive models can reliably detect early signs of fish stress or mortality, supporting the implementation of intelligent environmental monitoring and automation strategies in sustainable aquaponic production.
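The validation scheme mentioned above (5-fold cross-validation plus a label permutation test) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the paper's LDA, SVM, NN, and RF models, and all function names are hypothetical. The idea is that a model that has learned real structure should score far better on the true labels than on randomly shuffled ones.

```python
import random


def fit_centroids(X, y):
    """Return {label: mean feature vector}; a stand-in for the real classifiers."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, target in zip(X, y) if target == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the label whose centroid is nearest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def cv_accuracy(X, y, k=5, seed=0):
    """k-fold cross-validated accuracy with a seeded, reproducible split."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(centroids, X[i]) == y[i] for i in fold)
    return correct / len(X)


def permutation_test(X, y, n_permutations=100, k=5, seed=0):
    """Return (observed accuracy, p-value) from shuffling labels n times."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y, k, seed)
    as_good = 0
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)
        if cv_accuracy(X, shuffled, k, seed) >= observed:
            as_good += 1
    # The +1 terms count the unpermuted labelling as one of the permutations.
    return observed, (as_good + 1) / (n_permutations + 1)
```

A small p-value suggests the observed accuracy is unlikely to arise from chance label assignments; the number of permutations bounds how small the p-value can get.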

(This article belongs to the Section Environmental Sensing)

27 pages, 8112 KB

Open Access Article

Detection of Abiotic Stress in Potato and Sweet Potato Plants Using Hyperspectral Imaging and Machine Learning

by Min-Seok Park, Mohammad Akbar Faqeerzada, Sung Hyuk Jang, Hangi Kim, Hoonsoo Lee, Geonwoo Kim, Young-Son Cho, Woon-Ha Hwang, Moon S. Kim, Insuck Baek and Byoung-Kwan Cho

Plants 2025, 14(19), 3049; https://doi.org/10.3390/plants14193049 - 2 Oct 2025

Viewed by 632

Abstract

As climate extremes increasingly threaten global food security, precision tools for early detection of crop stress have become vital, particularly for root crops such as potato (Solanum tuberosum L.) and sweet potato (Ipomoea batatas L. Lam.), which are especially susceptible to environmental stressors throughout their life cycles. In this study, plants were monitored from the initial onset of seasonal stressors, including spring drought, heat, and episodes of excessive rainfall, through to harvest, capturing the full range of physiological and biochemical responses under seasonal, simulated conditions in greenhouses. The spectral data were obtained from regions of interest (ROIs) of each cultivar’s leaves, with over 3000 data points extracted per cultivar; these data were subsequently used for model development. A comprehensive classification framework was established by employing three machine learning models, namely Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and Partial Least Squares-Discriminant Analysis (PLS-DA), to detect stress across various growth stages. Furthermore, severity levels were objectively defined using photoreflectance indices and principal component analysis (PCA) data visualizations, which enabled consistent and reliable classification of stress responses in both individual cultivars and combined datasets. All models achieved high classification accuracy (90–98%) on independent test sets. The application of the Successive Projections Algorithm (SPA) for variable selection significantly reduced the number of wavelengths required for robust stress classification, with SPA-PLS-DA models maintaining high accuracy (90–96%) using only a subset of informative bands. Furthermore, SPA-PLS-DA-based chemical imaging enabled spatial mapping of stress severity within plant tissues, providing early, non-invasive insights into physiological and biochemical status. These findings highlight the potential of integrating hyperspectral imaging and machine learning for precise, real-time crop monitoring, thereby contributing to sustainable agricultural management and reduced yield losses.

(This article belongs to the Section Plant Modeling)

20 pages, 1372 KB

Open Access Article

A Novel Multi-Scale Entropy Approach for EEG-Based Lie Detection with Channel Selection

by Jiawen Li, Guanyuan Feng, Chen Ling, Ximing Ren, Shuang Zhang, Xin Liu, Leijun Wang, Mang I. Vai, Jujian Lv and Rongjun Chen

Entropy 2025, 27(10), 1026; https://doi.org/10.3390/e27101026 - 29 Sep 2025

Viewed by 404

Abstract

Entropy-based analyses have emerged as a powerful tool for quantifying the complexity, regularity, and information content of complex biological signals, such as electroencephalography (EEG). In this regard, EEG-based lie detection offers results that are more objective and less susceptible to manipulation than those of traditional polygraph methods. To this end, this study proposes a novel multi-scale entropy approach that fuses fuzzy entropy (FE), time-shifted multi-scale fuzzy entropy (TSMFE), and hierarchical multi-band fuzzy entropy (HMFE), enabling the multidimensional characterization of EEG signals. The fused feature vector is then applied to lie detection using machine learning classifiers, with a focus on channel selection to investigate distinctive neural signatures across brain regions. Experiments use the public LieWaves benchmark dataset and comprise two parts: a subject-dependent experiment to identify representative channels for lie detection, and a cross-subject experiment to assess the generalizability of the proposed approach. In the subject-dependent experiment, linear discriminant analysis (LDA) achieves accuracies of 82.74% under leave-one-out cross-validation (LOOCV) and 82.00% under 10-fold cross-validation. The cross-subject experiment yields an accuracy of 64.07% using a radial basis function (RBF) kernel support vector machine (SVM) under leave-one-subject-out cross-validation (LOSOCV). Furthermore, regarding the channel selection results, PZ (parietal midline) and T7 (left temporal) are considered the representative channels for lie detection, as they are selected most frequently across subjects. These findings demonstrate that PZ and T7 play vital roles in the cognitive processes associated with lying, offering a solution for designing portable EEG-based lie detection devices with fewer channels, and also provide insights into neural dynamics by analyzing variations in multi-scale entropy.
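To make the cross-subject protocol above concrete, the sketch below generates leave-one-subject-out splits: every trial from one subject is held out for testing while the remaining subjects form the training set. The function name and the example subject labels are hypothetical; the paper's actual pipeline is not shown in the abstract.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids[i] names the subject who produced trial i.
    """
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test
```

Because no trial from the held-out subject is seen during training, this protocol typically gives lower but more realistic accuracy than subject-dependent validation, which is in line with the gap reported above.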

(This article belongs to the Special Issue Entropy Analysis of Electrophysiological Signals)

5 pages, 1163 KB

Open Access Abstract

Raman Spectroscopy Diagnosis of Melanoma

by Gianmarco Lazzini, Daniela Massi, Davide Moroni, Ovidio Salvetti, Paolo Viacava, Marco Laurino and Mario D’Acunto

Proceedings 2025, 129(1), 10; https://doi.org/10.3390/proceedings2025129010 - 12 Sep 2025

Viewed by 344

Abstract

Cutaneous melanoma is an aggressive form of skin cancer and a leading cause of cancer-related mortality. In this context, Raman Spectroscopy (RS) could represent a fast and effective method for melanoma-related diagnosis. We therefore introduced a new method based on RS to distinguish Compound Naevi (CN) from Primary Cutaneous Melanoma (PCM) in ex vivo solid biopsies. To this aim, we integrated Confocal Raman Micro-Spectroscopy (CRM) with four Machine Learning (ML) algorithms: Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Support Vector Machine (SVM), and Random Forest Classifier (RFC). We focused on comparing traditional pre-processing operations with the Continuous Wavelet Transform (CWT). In particular, CWT led to the maximum classification accuracy (∼89.0%), which highlights the method as promising for future implementation in devices for everyday use.

(This article belongs to the Proceedings of The 18th International Workshop on Advanced Infrared Technology and Applications)

26 pages, 1607 KB

Open Access Feature Paper Article

Analyzing Performance of Data Preprocessing Techniques on CPUs vs. GPUs with and Without the MapReduce Environment

by Sikha S. Bagui, Colin Eller, Rianna Armour, Shivani Singh, Subhash C. Bagui and Dustin Mink

Electronics 2025, 14(18), 3597; https://doi.org/10.3390/electronics14183597 - 10 Sep 2025

Viewed by 663

Abstract

Data preprocessing is usually necessary before running most machine learning classifiers. This work compares three different preprocessing techniques: minimal preprocessing, Principal Components Analysis (PCA), and Linear Discriminant Analysis (LDA). The efficiency of these three preprocessing techniques is measured using the Support Vector Machine (SVM) classifier. Efficiency is measured in terms of statistical metrics such as accuracy, precision, recall, the F-1 measure, and AUROC. The preprocessing times and the classifier run times are also compared using the three differently preprocessed datasets. Finally, a comparison of performance timings on CPUs vs. GPUs with and without the MapReduce environment is performed. Two newly created Zeek Connection Log datasets, UWF-ZeekData22 and UWF-ZeekDataFall22, collected using the Security Onion 2 network security monitor and labeled using the MITRE ATT&CK framework, are used for this work. Results from this work show that binomial LDA, on average, performs the best in terms of statistical measures as well as timings using GPUs or MapReduce GPUs.

(This article belongs to the Special Issue Hardware Acceleration for Machine Learning)

15 pages, 3711 KB

Open Access Article

Improved Shell Color Index for Chicken Eggs with Blue-green Shells Based on Machine Learning Analysis

by Huanhuan Wang, Yinghui Wei, Lei Zhang, Ying Ge, Hang Liu and Xuedong Zhang

Foods 2025, 14(17), 3027; https://doi.org/10.3390/foods14173027 - 29 Aug 2025

Viewed by 765

Abstract

Shell color is a commercially valuable trait in eggs, and blue-green eggshells typically exhibit multiple color subtypes. To explore the relationship between the CIELab system and visual color classification and develop simplified discrimination indices, 2274 blue-green eggs across seven batches were selected. The L*, a*, and b* values of each egg were measured, and average visual classification (AveObs) was calculated from four numeric categories (Light = 1, Blue = 2, Green = 3, Olive = 4) separately assigned by four observers. After batch correction using ComBat, four algorithms (linear discriminant analysis (LDA), random forest (RF), support vector machine (SVM), and neural network (NNET)) were compared. Correction substantially reduced the coefficients of variation of the L*, a*, and b* values. Correlations emerged: L* and b* (−0.722), a* and b* (0.451), and L* and a* (−0.088), while correlations of the L*, a*, and b* values with AveObs were −0.713, 0.218, and 0.771, respectively. The LDA model achieved superior comprehensive performance across all data scenarios, with the highest accuracy and efficiency as compared to the SVM, NNET, and RF models. Among the LDA functions, LD1 explained 78.53% of the variance, with L*, a*, and b* coefficients of −0.134, 0.063, and 0.349, respectively (ratio ≈ 1:0.47:2.60). Simplified formulas based on the L*, a*, and b* values were constructed and compared to the existing indices C* (= $\sqrt{(a^*)^2 + (b^*)^2}$) and SCI (= L* − a* − b*). The correlation between L* − 2b* and AveObs was −0.803, similar to those for C* (0.797) and SCI (−0.782), while the correlation between L* − 4C* and AveObs was −0.810, significantly higher than that for SCI (p < 0.05). In conclusion, the LDA model demonstrated optimal performance in predicting color classification, and L* − 4C* is an ideal index for grading of blue-green eggs.
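The shell color indices compared in the abstract above are simple functions of the CIELab coordinates, and the sketch below writes them out. The function names are hypothetical, and the formulas are transcribed from the abstract only; any scaling or batch correction applied in the paper is not reproduced here.

```python
import math


def chroma(a_star, b_star):
    """C* = sqrt(a*^2 + b*^2)."""
    return math.hypot(a_star, b_star)


def shell_color_index(l_star, a_star, b_star):
    """SCI = L* - a* - b*."""
    return l_star - a_star - b_star


def l_minus_2b(l_star, b_star):
    """Simplified index L* - 2b*."""
    return l_star - 2 * b_star


def l_minus_4c(l_star, a_star, b_star):
    """Proposed grading index L* - 4C*."""
    return l_star - 4 * chroma(a_star, b_star)
```

The negative correlations of SCI, L* − 2b*, and L* − 4C* with AveObs mean that lower index values correspond to higher visual categories (toward Green and Olive), while C* moves in the opposite direction.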

(This article belongs to the Section Food Analytical Methods)

31 pages, 1856 KB

Open Access Article

Optimizing Chatbots to Improve Customer Experience and Satisfaction: Research on Personalization, Empathy, and Feedback Analysis

by Shimon Uzan, David Freud and Amir Elalouf

Appl. Sci. 2025, 15(17), 9439; https://doi.org/10.3390/app15179439 - 28 Aug 2025

Viewed by 2447

Abstract

This study addresses the ongoing challenge of optimizing chatbot interactions to significantly enhance customer experience and satisfaction through personalized, empathetic responses. Using advanced NLP tools and strong statistical methodologies, we developed and evaluated a multi-layered analytical framework to accurately identify user intents, assess customer feedback, and generate emotionally intelligent interactions. With over 270,000 customer chatbot interaction records in our dataset, we employed spaCy-based NER and clustering algorithms (HDBSCAN and K-Means) to categorize customer queries precisely. Text classification was performed using random forest, logistic regression, and SVM, achieving near-perfect accuracy. Sentiment analysis was conducted using VADER, Naive Bayes, and TextBlob, complemented by semantic analysis via LDA. Statistical tests, including Chi-square, Kruskal–Wallis, Dunn’s test, ANOVA, and logistic regression, confirmed the significant impact of tailored, empathetic response strategies on customer satisfaction. Correlation analysis indicated that traditional measures like sentiment polarity and text length insufficiently capture customer satisfaction nuances. The results underscore the critical role of context-specific adjustments and emotional responsiveness, paving the way for future research into chatbot personalization and customer-centric system optimization.

22 pages, 9631 KB

Open Access Article

Automatic Recognition of Commercial Tree Species from the Amazon Flora Using Bark Images and Transfer Learning

by Natally Celestino Gama, Luiz Eduardo Soares Oliveira, Samuel de Pádua Chaves e Carvalho, Alexandre Behling, Pedro Luiz de Paula Filho, Márcia Orie de Sousa Hamada, Eduardo da Silva Leal and Deivison Venicio Souza

Forests 2025, 16(9), 1374; https://doi.org/10.3390/f16091374 - 27 Aug 2025

Viewed by 1143

Abstract

The application of artificial intelligence (AI) techniques has improved the accuracy of forest species identification, particularly in timber inventories conducted under Sustainable Forest Management (SFM). This study developed and evaluated machine learning models to recognize 16 Amazonian timber species using digital images of tree bark. Data were collected from three SFM units located in Nova Maringá, Feliz Natal, and Cotriguaçu, in the state of Mato Grosso, Brazil. High-resolution images were processed into sub-images (256 × 256 pixels), and two feature extraction methods were tested: Local Binary Patterns (LBP) and pre-trained Convolutional Neural Networks (ResNet50, VGG16, InceptionV3, MobileNetV2). Four classifiers were used: Support Vector Machine (SVM), Artificial Neural Networks (ANN), Random Forest (RF), and Linear Discriminant Analysis (LDA). The best result (95% accuracy) was achieved using ResNet50 with SVM, confirming the effectiveness of transfer learning for species recognition based on bark texture. These findings highlight the potential of AI-based tools to enhance accuracy in forest inventories and support decision-making in tropical forest management.
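The sub-image step described above can be pictured as tiling each photograph into 256 × 256 patches. The sketch below computes the top-left corners of such patches. It assumes non-overlapping tiles and discards any partial tile at the right and bottom edges; the abstract does not state whether the authors did either, and the function name is hypothetical.

```python
def tile_origins(width, height, tile=256):
    """Return (x, y) top-left corners of non-overlapping tile-by-tile patches.

    Partial tiles at the right and bottom edges are dropped (an assumption).
    """
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]
```

Each returned origin can then be used to crop a patch from the image array before feature extraction with LBP or a pre-trained CNN.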

(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

28 pages, 40313 KB

Open Access Article

Colorectal Cancer Detection Through Sweat Volatilome Using an Electronic Nose System and GC-MS Analysis

by Cristhian Manuel Durán Acevedo, Jeniffer Katerine Carrillo Gómez, Gustavo Adolfo Bautista Gómez, José Luis Carrero Carrero and Rogelio Flores Ramírez

Cancers 2025, 17(17), 2742; https://doi.org/10.3390/cancers17172742 - 23 Aug 2025

Viewed by 3799

Abstract

Background: Colorectal cancer (CRC) remains one of the leading causes of cancer-related mortality worldwide, emphasizing the urgent need for early, non-invasive, and accessible diagnostic tools. This study aimed to evaluate the effectiveness of a microelectromechanical systems (MEMS)-based electronic nose (E-nose) in combination with gas chromatography–mass spectrometry (GC-MS) for CRC detection through sweat volatile organic compounds (VOCs). Methods: A total of 136 sweat samples were collected from 68 volunteer participants. Samples were processed using solid-phase microextraction (SPME) and analyzed by GC-MS, while a custom-designed E-nose system comprising 14 gas sensors captured real-time VOC profiles. Data were analyzed using multivariate statistical techniques, including PCA and PLS-DA, and classified with machine learning algorithms (LDA, LR, SVM, k-NN). Results: GC-MS analysis revealed statistically significant differences between CRC patients and healthy controls (COs). Cross-validation showed that the highest classification accuracy for GC-MS data was 81% with the k-NN classifier, whereas E-nose data achieved up to 97% accuracy using the LDA classifier. Conclusions: Sweat volatilome analysis, supported by advanced data processing and complementary use of E-nose technology and GC-MS, demonstrates strong potential as a reliable, non-invasive approach for early CRC detection.

(This article belongs to the Section Methods and Technologies Development)

16 pages, 1719 KB

Open AccessArticle

Geographical Origin Classification of Oolong Tea Using an Electronic Nose: Application of Machine Learning and Gray Relational Analysis

by Sushant Kaushal, Priya Rana, Chao-Chin Chung and Ho-Hsien Chen

Chemosensors 2025, 13(8), 295; https://doi.org/10.3390/chemosensors13080295 - 8 Aug 2025

Viewed by 770

Abstract

Taiwan accounts for 90% of the total oolong tea production and enjoys a good global reputation for its quality. In recent years, oolong tea from neighboring countries has been imported into Taiwan and sold as Taiwanese oolong at high prices. This study aimed to rapidly classify oolong tea from four geographical origins (Taiwan, Vietnam, China, and Indonesia) using an electronic nose (E-nose) combined with machine learning. Color measurements were also conducted to support the classification. The E-nose was used to analyze the aroma profiles of the tea samples. To classify the samples, five machine learning models—linear discriminant analysis (LDA), support vector machine (SVM), K-nearest neighbor (KNN), artificial neural network (ANN), and random forest (RF)—were developed using 70% of the dataset for training and tested on the remaining 30%. Gray relational analysis (GRA) was applied to measure the relationship between sensor responses and reference tea origins. Multivariate analysis of variance (MANOVA) indicated a statistically significant effect of tea origin on color parameters, as confirmed by both Pillai’s trace and Wilks’ Lambda (Λ) tests (p < 0.001). Among the tested models, LDA and ANN achieved the highest overall classification accuracy (98.33%), with ANN performing best in the discrimination of Taiwanese oolong tea, achieving 98.89% accuracy. GRA yielded higher gray relational grade (GRG) values for Taiwanese tea samples than for other origins and identified sensors S4, S6, and S14 as the dominant contributors. In conclusion, the E-nose combined with machine learning provides a rapid, non-destructive, and effective approach for geographical origin classification of oolong tea.

(This article belongs to the Special Issue Applications of Electronic Nose (E-Nose) and Electronic Tongue (E-Tongue) in Food Quality)
