MDPI - Publisher of Open Access Journals

41 pages, 2824 KiB

Open Access Review

Assessing Milk Authenticity Using Protein and Peptide Biomarkers: A Decade of Progress in Species Differentiation and Fraud Detection

by Achilleas Karamoutsios, Pelagia Lekka, Chrysoula Chrysa Voidarou, Marilena Dasenaki, Nikolaos S. Thomaidis, Ioannis Skoufos and Athina Tzora

Foods 2025, 14(15), 2588; https://doi.org/10.3390/foods14152588 - 23 Jul 2025

Abstract

Milk is a nutritionally rich food and a frequent target of economically motivated adulteration, particularly through substitution with lower-cost milk types. Over the past decade, significant progress has been made in the authentication of milk using advanced proteomic and chemometric approaches, with a focus on the discovery and application of protein and peptide biomarkers for species differentiation and fraud detection. Recent innovations in both top-down and bottom-up proteomics have markedly improved the sensitivity and specificity of detecting key molecular targets, including caseins and whey proteins. Peptide-based methods are especially valuable in processed dairy products due to their thermal stability and resilience to harsh treatment, although their species specificity may be limited when sequences are conserved across related species. Robust chemometric approaches are increasingly integrated with proteomic pipelines to handle high-dimensional datasets and enhance classification performance. Multivariate techniques, such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA), are frequently employed to extract discriminatory features and model adulteration scenarios. Despite these advances, key challenges persist, including the lack of standardized protocols, variability in sample preparation, and the need for broader validation across breeds, geographies, and production systems. Future progress will depend on the convergence of high-resolution proteomics with multi-omics integration, structured data fusion, and machine learning frameworks, enabling scalable, specific, and robust solutions for milk authentication in increasingly complex food systems. Full article

(This article belongs to the Special Issue AI-Powered Advances in Data Handling for Enhanced Food Analysis: From Chemometrics to Machine Learning)

21 pages, 2742 KiB

Open Access Article

Origin Traceability of Chinese Mitten Crab (Eriocheir sinensis) Using Multi-Stable Isotopes and Explainable Machine Learning

by Danhe Wang, Chunxia Yao, Yangyang Lu, Di Huang, Yameng Li, Xugan Wu, Weiguo Song and Qinxiong Rao

Foods 2025, 14(14), 2458; https://doi.org/10.3390/foods14142458 - 13 Jul 2025

Viewed by 255

Abstract

The Chinese mitten crab (Eriocheir sinensis) industry is currently facing the challenges of origin fraud, as well as a lack of precision and interpretability of existing traceability methods. Here, we propose a high-precision origin traceability method based on a combination of stable isotope analysis and interpretable machine learning. We sampled Chinese mitten crabs from six origins representing diverse aquatic environments and farming practices, and analyzed their δ¹³C, δ¹⁵N, δ²H, and δ¹⁸O stable isotope compositions in different sexes and tissues (hepatopancreas, muscle, and gonad). By comparing the classification performance of Random Forest, XGBoost, and Logistic Regression models, we found that the Random Forest model outperformed the others, achieving high accuracy (91.3%) in distinguishing samples from different origins. Interpretation of the optimal Random Forest model, using SHAP (SHapley Additive exPlanations) analysis, identified δ²H in male muscle, δ¹⁵N in female hepatopancreas, and δ¹³C in female hepatopancreas as the most influential features for discriminating geographic origin. This analysis highlighted the crucial role of environmental factors, such as water source, diet, and trophic level, in origin discrimination and demonstrated that isotopic characteristics of different tissues provide unique discriminatory information. This study offers a novel paradigm for stable isotope traceability based on explainable machine learning, significantly enhancing the identification capability and reliability of Chinese mitten crab origin traceability, and holds significant implications for food safety assurance. Full article
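The SHAP interpretation mentioned in this abstract rests on Shapley values, which split a model's output among its input features by averaging each feature's marginal contribution over all feature orderings. The following is a minimal sketch of that idea in plain Python. The toy scoring function, its coefficients, and the feature names are hypothetical illustrations, not the authors' Random Forest model or the `shap` library.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: totals[f] / len(orderings) for f in features}


def toy_score(present):
    """Hypothetical 'model' output given the set of available features."""
    score = 0.0
    if "d2H_muscle" in present:
        score += 3.0
    if "d15N_hepato" in present:
        score += 2.0
    if "d2H_muscle" in present and "d13C_hepato" in present:
        score += 1.0  # interaction term, shared equally by the two features
    return score


FEATURES = ["d2H_muscle", "d15N_hepato", "d13C_hepato"]
phi = shapley_values(toy_score, FEATURES)
# The attributions sum to toy_score(all features) - toy_score(no features).
```

Exact enumeration is only feasible for a handful of features; practical SHAP implementations rely on approximations or model-specific algorithms.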

(This article belongs to the Section Food Analytical Methods)

24 pages, 1871 KiB

Open Access Article

Data Analyses and Chemometric Modeling for Rapid Quality Assessment of Enriched Honey

by Jasenka Gajdoš Kljusurić, Vesna Knights, Berat Durmishi, Smajl Rizani, Vezirka Jankuloska, Valentina Velkovski, Ana Jurinjak Tušek, Maja Benković, Davor Valinger and Tamara Jurina

Chemosensors 2025, 13(7), 246; https://doi.org/10.3390/chemosensors13070246 - 9 Jul 2025

Viewed by 264

Abstract

The quality and authenticity of honey are of crucial importance for food safety and consumer confidence. Given the increasing interest in enriched honey and potential fraud, rapid and non-destructive analytical methods for quality assessment, such as Near-Infrared Spectroscopy (NIRS), are needed. Therefore, the aim of this work was to investigate the applicability of NIR spectroscopy coupled with chemometric methods to assess the quality change in honey from three different countries, after addition of five different aromatic plants (lavender, rosemary, oregano, sage, and white pine oil) in three different concentrations (0.5%, 0.8% and 1%). Measurements of basic physicochemical properties, color, antioxidant activity, and NIR spectra were performed for all samples (pure honey and honey with added aromatic plants). Chemometric models, such as Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression, were applied to analyze spectral data, correlate spectra with physicochemical properties, color and antioxidant activity measurements, and develop classification and prediction models. Spectral changes in the NIR region, as expected, showed the ability to distinguish samples depending on the type and concentration of added aromatic plants. Chemometric models enabled efficient discrimination between pure and enriched honey samples, as well as assessment of the influence of different additives on antioxidant activity and color. The results highlight the potential of NIRS as a rapid, non-destructive and environmentally friendly method for quality monitoring and detection of specific additives in honey, offering technical support for quality control and food safety regulation. Full article

(This article belongs to the Special Issue Chemometrics for Food, Environmental and Biological Analysis)

13 pages, 1670 KiB

Open Access Article

Rapid Classification of Cow, Goat, and Sheep Milk Using ATR-FTIR and Multivariate Analysis

by Lamprini Dimitriou, Michalis Koureas, Christos Pappas, Athanasios Manouras, Dimitrios Kantas and Eleni Malissiova

Sci 2025, 7(3), 87; https://doi.org/10.3390/sci7030087 - 1 Jul 2025

Viewed by 304

Abstract

Sheep and goat milk authenticity is of great importance, especially for countries like Greece, where these products are connected to the country’s rural economy and cultural heritage. The aim of the study is to evaluate the effectiveness of Fourier Transform Infrared Attenuated Total Reflectance (ATR-FTIR) spectroscopy in combination with chemometric techniques for the classification of cow, sheep, and goat milk and consequently support fraud identification. A total of 178 cow, sheep and goat milk samples were collected from livestock farms in Thessaly, Greece. Sheep and goat milk samples were confirmed as authentic by applying a validated Enzyme Linked Immunosorbent Assay (ELISA), while all samples were analyzed using ATR-FTIR spectroscopy in both raw and freeze-dried form. Freeze-dried samples exhibited clearer spectral characteristics, particularly enhancing the signals from triglycerides, proteins, and carbohydrates. Partial Least Squares Discriminant Analysis (PLS-DA) delivered robust discrimination. By using the spectral range between 600 and 1800 cm⁻¹, 100% correct classification of all milk types was achieved. These findings highlight the potential of FTIR spectroscopy as a fast, non-destructive, and cost-effective tool for milk identification and species differentiation. This method is particularly suitable for industrial and regulatory applications, offering high efficiency. Full article

27 pages, 3410 KiB

Open Access Article

Assessing the Authenticity and Quality of Paprika (Capsicum annuum) and Cinnamon (Cinnamomum spp.) in the Slovenian Market: A Multi-Analytical and Chemometric Approach

by Sabina Primožič, Cathrine Terro, Lidija Strojnik, Nataša Šegatin, Nataša Poklar Ulrih and Nives Ogrinc

Foods 2025, 14(13), 2323; https://doi.org/10.3390/foods14132323 - 30 Jun 2025

Viewed by 435

Abstract

The authentication of high-value spices such as paprika and cinnamon is critical due to increasing food fraud. This study explored the potential of a multi-analytical approach, combined with chemometric tools, to differentiate 45 paprika and 46 cinnamon samples from the Slovenian market based on their geographic origin, production methods, and possible adulteration. The applied techniques included stable isotope ratio analysis (δ¹³C, δ¹⁵N, δ³⁴S), multi-elemental profiling, FTIR, and antioxidant compound analysis. Distinct isotopic and elemental markers (e.g., δ¹³C, δ³⁴S, Rb, Cs, V, Fe, Al) contributed to classification by geographic origin, with preliminary classification accuracies of 90% for paprika (Hungary, Serbia, Spain) and 89% for cinnamon (Sri Lanka, Madagascar, Indonesia). Organic paprika samples showed higher values of δ¹⁵N, δ³⁴S, and Zn, whereas conventional ones had more Na, Al, V, and Cr. For cinnamon, a 95% discrimination accuracy was achieved between production practice using δ³⁴S and Ba, as well as As, Rb, Na, δ¹³C, S, Mg, Fe, V, Al, and Cu. FTIR differentiated Ceylon from cassia cinnamon and suggested possible paprika adulteration, as indicated by spectral features consistent with oleoresin removal or azo dye addition, although further verification is required. Antioxidant profiling supported quality assessment, although the high antioxidant activity in cassia cinnamon may reflect non-phenolic contributors. Overall, the results demonstrate the promising potential of the applied analytical techniques to support spice authentication. However, further studies on larger, more balanced datasets are essential to validate and generalize these findings. Full article

(This article belongs to the Special Issue Chemometrics in Food Chemistry and Analysis: Novel Detection Methods to Assess Food Quality and Safety)

17 pages, 1610 KiB

Open Access Article

Enhancing Coffee Quality and Traceability: Chemometric Modeling for Post-Harvest Processing Classification Using Near-Infrared Spectroscopy

by Mariana Santos-Rivera, Lakshmanan Viswanathan and Faris Sheibani

Spectrosc. J. 2025, 3(2), 20; https://doi.org/10.3390/spectroscj3020020 - 19 Jun 2025

Viewed by 432

Abstract

Post-harvest processing (PHP) is a key determinant of coffee quality, flavor profile, and market classification, yet verifying PHP claims remains a significant challenge in the specialty coffee industry. This study introduces near-infrared spectroscopy (NIRS) coupled with chemometrics as a rapid, non-destructive approach to classify green coffee beans based on PHP. For the first time, seven distinct PHP categories—Alchemy, Anaerobic Processing (Deep Fermentation), Dry-Hulled, Honey, Natural, Washed, and Wet-Hulled—were discriminated using NIRS, encompassing 20 different processing protocols under varying environmental and fermentation conditions. The NIR spectra (350–2500 nm) of 524 green Arabica coffee samples were analyzed using PCA-LDA models (750–2450 nm), achieving classification accuracies up to 100% for underrepresented categories and strong performance (91–95%) for dominant PHP groups in an independent test set. These results demonstrate that NIRS can detect subtle chemical signatures associated with diverse PHP techniques, offering a scalable tool for quality assurance, fraud prevention, and traceability in global coffee supply chains. While limited sample sizes for some PHP categories may influence model generalization, this study lays the foundation for future work involving broader datasets and integration with digital traceability systems. The approach has direct implications for producers, traders, and certifying bodies seeking reliable, real-time PHP verification. Full article

(This article belongs to the Special Issue Feature Papers in Spectroscopy Journal)

17 pages, 7087 KiB

Open Access Article

Telecom Fraud Recognition Based on Large Language Model Neuron Selection

by Lanlan Jiang, Cheng Zhang, Xingguo Qin, Ya Zhou, Guanglun Huang, Hui Li and Jun Li

Mathematics 2025, 13(11), 1784; https://doi.org/10.3390/math13111784 - 27 May 2025

Viewed by 541

Abstract

In the realm of natural language processing (NLP), text classification constitutes a task of paramount significance for large language models (LLMs). Nevertheless, extant methodologies predominantly depend on the output generated by the final layer of LLMs, thereby neglecting the wealth of information encapsulated within neurons residing in intermediate layers. To surmount this shortcoming, we introduce LENS (Linear Exploration and Neuron Selection), an innovative technique designed to identify and sparsely integrate salient neurons from intermediate layers via a process of linear exploration. Subsequently, these neurons are transmitted to downstream modules dedicated to text classification. This strategy effectively mitigates noise originating from non-pertinent neurons, thereby enhancing both the accuracy and computational efficiency of the model. The detection of telecommunication fraud text represents a formidable challenge within NLP, primarily attributed to its increasingly covert nature and the inherent limitations of current detection algorithms. In an effort to tackle the challenges of data scarcity and suboptimal classification accuracy, we have developed the LENS-RMHR (Linear Exploration and Neuron Selection with RoBERTa, Multi-head Mechanism, and Residual Connections) model, which extends the LENS framework. By incorporating RoBERTa, a multi-head attention mechanism, and residual connections, the LENS-RMHR model augments the feature representation capabilities and improves training efficiency. Utilizing the CCL2023 telecommunications fraud dataset as a foundation, we have constructed an expanded dataset encompassing eight distinct categories that encapsulate a diverse array of fraud types. Furthermore, a dual-loss function has been employed to bolster the model’s performance in multi-class classification scenarios. 
Experimental results reveal that LENS-RMHR demonstrates superior performance across multiple benchmark datasets, underscoring its extensive potential for application in the domains of text classification and telecommunications fraud detection.

(This article belongs to the Section E1: Mathematics and Computer Science)

42 pages, 4633 KiB

Open Access Article

Resolution-Aware Deep Learning with Feature Space Optimization for Reliable Identity Verification in Electronic Know Your Customer Processes

by Mahasak Ketcham, Pongsarun Boonyopakorn and Thittaporn Ganokratanaa

Mathematics 2025, 13(11), 1726; https://doi.org/10.3390/math13111726 - 23 May 2025

Viewed by 630

Abstract

In modern digital transactions involving government agencies, financial institutions, and commercial enterprises, reliable identity verification is essential to ensure security and trust. Traditional methods, such as submitting photocopies of ID cards, are increasingly susceptible to identity theft and fraud. To address these challenges, this study proposes a novel and robust identity verification framework that integrates super-resolution preprocessing, a convolutional neural network (CNN), and Monte Carlo dropout-based Bayesian uncertainty estimation for enhanced facial recognition in electronic know your customer (e-KYC) processes. The key contribution of this research lies in its ability to handle low-resolution and degraded facial images simulating real-world conditions where image quality is inconsistent while providing confidence-aware predictions to support transparent and risk-aware decision making. The proposed model is trained on facial images resized to 24 × 24 pixels, with a super-resolution module enhancing feature clarity prior to classification. By incorporating Monte Carlo dropout, the system estimates predictive uncertainty, addressing critical limitations of conventional black-box deep learning models. Experimental evaluations confirmed the effectiveness of the framework, achieving a classification accuracy of 99.7%, precision of 99.2%, recall of 99.3%, and an AUC score of 99.5% under standard testing conditions. The model also demonstrated strong robustness against noise and image blur, maintaining reliable performance even under challenging input conditions. In addition, the proposed system is designed to comply with international digital identity standards, including the Identity Assurance Level (IAL) and Authenticator Assurance Level (AAL), ensuring practical applicability in regulated environments. 
Overall, this research contributes a scalable, secure, and interpretable solution that advances the application of deep learning and uncertainty modeling in real-world e-KYC systems.
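Monte Carlo dropout, as referred to in this abstract, keeps dropout active at inference time and repeats the forward pass many times; the spread of the resulting predictions serves as an uncertainty estimate. The sketch below illustrates the mechanism on a tiny logistic model in plain Python. The weights, the input vector, the dropout rate, and the number of passes are hypothetical, and the abstract's actual framework uses a CNN with super-resolution preprocessing rather than this toy model.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(x, weights, bias, drop_p, rng):
    """One forward pass with dropout left ON: each input unit is kept with probability 1 - drop_p."""
    keep = 1.0 - drop_p
    z = bias
    for xi, wi in zip(x, weights):
        if rng.random() < keep:
            z += (xi / keep) * wi  # inverted-dropout scaling keeps the expected input unchanged
    return sigmoid(z)


def mc_dropout_predict(x, weights, bias, drop_p=0.5, passes=200, seed=0):
    """Return (mean prediction, standard deviation) over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, weights, bias, drop_p, rng) for _ in range(passes)]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, math.sqrt(var)


# Hypothetical model parameters and input, for illustration only.
WEIGHTS = [0.8, -0.4, 1.2]
BIAS = -0.1
X = [1.0, 2.0, 0.5]

mean_p, std_p = mc_dropout_predict(X, WEIGHTS, BIAS)
```

A large standard deviation flags an input on which the stochastic passes disagree, which is the kind of case a risk-aware e-KYC pipeline might route to manual review.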

(This article belongs to the Special Issue Advanced Studies in Mathematical Optimization and Machine Learning)

13 pages, 1695 KiB

Open Access Article

Deepfake Voice Detection: An Approach Using End-to-End Transformer with Acoustic Feature Fusion by Cross-Attention

by Liang Yu Gong and Xue Jun Li

Electronics 2025, 14(10), 2040; https://doi.org/10.3390/electronics14102040 - 16 May 2025

Viewed by 763

Abstract

Deepfake technology uses artificial intelligence to create highly realistic but fake audio, video, or images, often making it difficult to distinguish from real content. Due to its potential use for misinformation, fraud, and identity theft, deepfake technology has gained a bad reputation in the digital world. Recently, many works have reported on the detection of deepfake videos/images. However, few studies have concentrated on developing robust deepfake voice detection systems. Among most existing studies in this field, a deepfake voice detection system commonly requires a large amount of training data and a robust backbone to detect real and logistic attack audio. For acoustic feature extractions, Mel-frequency Filter Bank (MFB)-based approaches are more suitable for extracting speech signals than applying the raw spectrum as input. Recurrent Neural Networks (RNNs) have been successfully applied to Natural Language Processing (NLP), but these backbones suffer from gradient vanishing or explosion while processing long-term sequences. In addition, the cross-dataset evaluation of most deepfake voice recognition systems has weak performance, leading to a system robustness issue. To address these issues, we propose an acoustic feature-fusion method to combine Mel-spectrum and pitch representation based on cross-attention mechanisms. Then, we combine a Transformer encoder with a convolutional neural network block to extract global and local features as a front end. Finally, we connect the back end with one linear layer for classification. We summarized several deepfake voice detectors’ performances on the silence-segment processed ASVspoof 2019 dataset. Our proposed method can achieve an Equal Error Rate (EER) of 26.41%, while most of the existing methods result in EER higher than 30%. 
We also tested our proposed method on the ASVspoof 2021 dataset, and found that it can achieve an EER as low as 28.52%, while the EER values for existing methods are all higher than 28.9%.
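The Equal Error Rate reported here is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). A minimal sketch of one way to approximate it from detector scores follows. The score lists are hypothetical, and the threshold sweep is a simple approximation rather than the official ASVspoof evaluation tooling.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate EER: sweep thresholds and return the point where FAR and FRR are closest."""
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # A sample is accepted as bona fide when its score is >= t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical detector scores (higher means "more likely genuine").
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(bonafide, spoof)  # 0.25: one spoof accepted, one genuine rejected at t = 0.6
```

A lower EER indicates better separation between genuine and spoofed audio; a detector that separates the two score sets perfectly would reach an EER of 0.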

(This article belongs to the Section Artificial Intelligence)

19 pages, 4702 KiB

Open Access Article

A Deep Learning Approach to Classify AI-Generated and Human-Written Texts

by Ayla Kayabas, Ahmet Ercan Topcu, Yehia Ibrahim Alzoubi and Mehmet Yıldız

Appl. Sci. 2025, 15(10), 5541; https://doi.org/10.3390/app15105541 - 15 May 2025

Cited by 1 | Viewed by 853

Abstract

The rapid advancement of artificial intelligence (AI) has introduced new challenges, particularly in the generation of AI-written content that closely resembles human-authored text. This poses a significant risk for misinformation, digital fraud, and academic dishonesty. While large language models (LLM) have demonstrated impressive capabilities across various languages, there remains a critical gap in evaluating and detecting AI-generated content in under-resourced languages such as Turkish. To address this, our study investigates the effectiveness of long short-term memory (LSTM) networks—a computationally efficient and interpretable architecture—for distinguishing AI-generated Turkish texts produced by ChatGPT from human-written content. LSTM was selected due to its lower hardware requirements and its proven strength in sequential text classification, especially under limited computational resources. Four experiments were conducted, varying hyperparameters such as dropout rate, number of epochs, embedding size, and patch size. The model trained over 20 epochs achieved the best results, with a classification accuracy of 97.28% and an F1 score of 0.97 for both classes. The confusion matrix confirmed high precision, with only 19 misclassified instances out of 698. These findings highlight the potential of LSTM-based approaches for AI-generated text detection in the Turkish language context. This study not only contributes a practical method for Turkish NLP applications but also underlines the necessity of tailored AI detection tools for low-resource languages. Future work will focus on expanding the dataset, incorporating other architectures, and applying the model across different domains to enhance generalizability and robustness. Full article
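The reported accuracy can be cross-checked against the confusion-matrix figures given in the abstract. The short sketch below assumes that the 698 instances are the full evaluation set on which the 97.28% accuracy was computed; the function name is illustrative.

```python
def accuracy_from_errors(total, misclassified):
    """Fraction of correctly classified instances."""
    return (total - misclassified) / total


acc = accuracy_from_errors(698, 19)   # 679 correct out of 698
acc_percent = round(acc * 100, 2)     # 97.28
```

Under that assumption, the misclassification count and the stated accuracy are consistent with each other.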

21 pages, 1981 KiB

Open Access Article

Enhanced Financial Fraud Detection Using an Adaptive Voted Perceptron Model with Optimized Learning and Error Reduction

by Muhammad Binsawad

Electronics 2025, 14(9), 1875; https://doi.org/10.3390/electronics14091875 - 5 May 2025

Viewed by 1470

Abstract

Financial fraud detection is an important field in financial technology, and strong and effective machine learning (ML) models are needed to detect fraudulent transactions with high accuracy and reliability. Conventional fraud detection models, like probabilistic, instance-based, and tree-based models, tend to have high error rates, class imbalance problems, and poor adaptability to changing fraud patterns. These issues call for sophisticated methods that improve predictive accuracy while being computationally efficient. To overcome these limitations, this research introduces the Voted Perceptron (VP) model, which utilizes an iterative learning process to dynamically adapt decision boundaries based on misclassified examples. In contrast to traditional models with static decision rules, the VP model constantly updates its weight parameters, thus providing better fraud detection abilities. The evaluation compares VP with state-of-the-art machine learning models, such as Average One Dependency Estimator (A1DE), K-nearest Neighbor (KNN), Naïve Bayes (NB), Random Tree (RT), and Functional Tree (FT), by using important performance metrics, like Mean Absolute Error (MAE), Root Mean Square Error (RMSE), True Positive Rate (TPR), recall, and accuracy. Experimental results show that VP outperforms its rivals significantly, yielding better fraud detection performance with low error rates and high recall. Furthermore, an ablation study confirms the influence of essential VP model elements on general classification performance. These results demonstrate VP to be an extremely effective model for detecting financial fraud, with enhanced flexibility towards evolving fraud patterns, and confirm the necessity for intelligent fraud detection mechanisms within financial organizations. Full article
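The mistake-driven updating described in this abstract can be illustrated with the classical voted perceptron, which archives every intermediate weight vector together with the number of examples it classified correctly and lets those vectors vote at prediction time. The sketch below shows that classical algorithm only; the adaptive variant, its optimizations, and the data used in the article are not reproduced, and the toy samples are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sign(z):
    return 1 if z > 0 else -1


def train_voted_perceptron(samples, labels, epochs=10):
    """Return a list of (weight_vector, survival_count) pairs. Labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    c = 0
    history = []
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if y * dot(w, x) <= 0:
                # Mistake: archive the current vector with its vote weight, then update.
                history.append((w, c))
                w = [wi + y * xi for wi, xi in zip(w, x)]
                c = 1
            else:
                # Correct: the current vector survives one more example.
                c += 1
    history.append((w, c))
    return history


def predict_voted(history, x):
    """Weighted majority vote of all archived weight vectors."""
    vote = sum(c * sign(dot(w, x)) for w, c in history)
    return sign(vote)


# Hypothetical toy data: first component is a constant bias feature.
SAMPLES = [[1, 2.0, 1.0], [1, 1.5, 2.0], [1, -1.0, -1.5], [1, -2.0, -0.5]]
LABELS = [1, 1, -1, -1]   # +1 = legitimate, -1 = fraudulent (illustrative labels)

model = train_voted_perceptron(SAMPLES, LABELS)
predictions = [predict_voted(model, x) for x in SAMPLES]
```

Weight vectors that survive many examples without error receive more votes, which makes the final decision less sensitive to the last few updates than a plain perceptron.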

20 pages, 5044 KiB

Open Access Article

¹H-NMR Spectroscopy and Chemometric Fingerprinting for the Authentication of Organic Extra Virgin Olive Oils

by Silvana M. Azcarate, Maria P. Segura-Borrego, Rocío Ríos-Reina and Raquel M. Callejón

Chemosensors 2025, 13(5), 162; https://doi.org/10.3390/chemosensors13050162 - 1 May 2025

Cited by 1 | Viewed by 690

Abstract

The authentication of organic extra virgin olive oils (OEVOOs) is crucial for quality control and fraud prevention. This study applies proton-nuclear magnetic resonance (¹H-NMR) spectroscopy combined with chemometric analysis as a non-destructive, untargeted approach to differentiate EVOOs based on cultivation method (organic vs. conventional) and variety (Hojiblanca vs. Picual). Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) demonstrated well-defined sample differentiation, while the variable importance in projection (VIP) selection and Tukey’s test identified key spectral regions responsible for classification. The results showed that sterols and lipid-related compounds played a major role in distinguishing organic from conventional oils, whereas fatty acids and phenolic compounds were more relevant for cultivar differentiation. These findings align with known metabolic differences, where Picual oils generally exhibit higher polyphenol content, and a distinct fatty acid composition compared to Hojiblanca. The agreement between chemometric classification models and statistical tests supports the potential of ¹H-NMR for OEVOO authentication. This method provides a comprehensive and reproducible metabolic fingerprint, enabling differentiation based on both agronomic practices and genetic factors. These findings suggest that ¹H-NMR spectroscopy, coupled with multivariate analysis, could be a valuable tool for quality control and fraud detection in the olive oil industry. Full article

(This article belongs to the Special Issue Chemometrics for Food, Environmental and Biological Analysis)

32 pages, 3621 KiB

Open Access Article

Methodological Validation of Machine Learning Models for Non-Technical Loss Detection in Electric Power Systems: A Case Study in an Ecuadorian Electricity Distributor

by Carlos Arias-Marín, Antonio Barragán-Escandón, Marco Toledo-Orozco and Xavier Serrano-Guerrero

Appl. Sci. 2025, 15(7), 3912; https://doi.org/10.3390/app15073912 - 2 Apr 2025

Viewed by 718

Abstract

Detecting fraudulent behavior in electricity consumption is a significant challenge for electric utility companies because information is scarce and it is difficult both to construct consumption patterns and to distinguish regular from fraudulent consumers. This study proposes a data analytics methodology that processes this information to generate lists of metering systems suspected of fraud. The database provided by the electrical distribution company contains 266,298 records, of which 15,013 have observations of possible fraud. One of the challenges lies in managing the different variables in the training data and choosing appropriate evaluation metrics. To address this, a balanced database of 27,374 records was used, divided evenly between fraud and non-fraud cases. The features used to identify and construct patterns of non-technical losses were crucial, although additional techniques could be applied to determine the most relevant variables. Several popular classification models were then trained. Hyperparameters were optimized by grid search, and the models were validated by cross-validation. The ensemble methods Categorical Boosting (CGB), Light Gradient Boosting Machine (LGB), and Extreme Gradient Boosting (EGB) proved the most suitable for identifying losses, achieving high performance at reasonable computational cost. Performance was compared using accuracy (Acc) and the F1 score, which combines two metrics, detection rate and precision, into a single value. Although CGB achieved the best accuracy (Acc = 0.897) and F1 score (0.894), it was slower than LGB, so LGB is considered the ideal classifier for the data provided by the electrical distribution company. This study highlights the importance of the techniques used for fraud detection in electricity metering systems, although the results may vary depending on the characteristics of the training, the number of variables, and the available hardware resources.
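The two metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the F1 score is the usual harmonic mean of precision and detection rate (recall); the function name and the confusion counts are hypothetical and are not taken from the study.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) from binary confusion counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives (fraud = positive class).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    if precision + detection_rate == 0:
        return accuracy, 0.0
    # F1 is the harmonic mean of precision and detection rate.
    f1 = 2 * precision * detection_rate / (precision + detection_rate)
    return accuracy, f1


# Illustrative counts only, not the study's data.
acc, f1 = accuracy_and_f1(tp=80, fp=20, fn=10, tn=90)
print(f"Acc = {acc:.3f}, F1 = {f1:.3f}")
```

On a balanced dataset such as the one described, accuracy and F1 tend to be close, which is consistent with the similar Acc and F1 values reported for CGB.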

14 pages, 1346 KiB

Open AccessTechnical Note

Fluorescence Spectroscopy and a Convolutional Neural Network for High-Accuracy Japanese Green Tea Origin Identification

by Rikuto Akiyama, Kana Suzuki, Yvan Llave and Takashi Matsumoto

AgriEngineering 2025, 7(4), 95; https://doi.org/10.3390/agriengineering7040095 - 1 Apr 2025

Viewed by 648

Abstract

This study aims to develop a system that combines fluorescence spectroscopy with machine learning, using a convolutional neural network (CNN), to identify the origins of various Japanese green teas (Sayama tea, Kakegawa tea, Yame tea, and Chiran tea). Although food origin labeling is important for ensuring consumer quality and safety, accurate identification remains a priority for the food industry due to the emergence of problems with false origin labeling. In this study, image data of the fluorescent fingerprints of green teas were collected using fluorescence spectroscopy and analyzed using a CNN model implemented in Python (ver. 3.13.2), TensorFlow (ver. 2.18.0), and Keras (ver. 3.9). The fluorescence of each sample was measured in the range of 250 to 550 nm, highlighting the differences in chemical composition that reflect each region. Using these data, a CNN suitable for image recognition identified the origins of the teas with an average accuracy of 92.83% over 10 trials. For Chiran tea and Yame tea, precision and recall rates of over 95% were achieved, showing clear differences from other regions. In contrast, the classification of Kakegawa and Sayama teas proved challenging due to their similar fluorescence patterns in the 300–350 nm spectral range, corresponding to catechins and polyphenolic compounds. These similarities are presumed to reflect the comparable growing conditions and processing methods characteristic of the two regions. This study shows the potential of this system in food origin identification, suggesting applications in preventing origin fraud and in quality control. Future research will aim to extend the system to other regions and foods, enhance data preprocessing to improve accuracy, and develop a versatile identification system.
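Per-origin precision and recall of the kind reported above are typically read off a confusion matrix. The following Python sketch shows that calculation under stated assumptions: the matrix values are invented for illustration (they are not the study's results), rows are taken to be true origins and columns predicted origins, and `per_class_metrics` is a hypothetical helper name.

```python
LABELS = ["Sayama", "Kakegawa", "Yame", "Chiran"]

# Hypothetical confusion matrix: rows = true origin, columns = predicted origin.
CONFUSION = [
    [42, 7, 1, 0],   # true Sayama
    [6, 43, 0, 1],   # true Kakegawa
    [1, 0, 48, 1],   # true Yame
    [0, 1, 1, 48],   # true Chiran
]


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} for a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        actual_total = sum(matrix[i])                    # row sum
        predicted_total = sum(row[i] for row in matrix)  # column sum
        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / actual_total if actual_total else 0.0
        metrics[label] = (precision, recall)
    return metrics


for label, (precision, recall) in per_class_metrics(CONFUSION, LABELS).items():
    print(f"{label}: precision = {precision:.3f}, recall = {recall:.3f}")
```

In this made-up matrix most of the errors are Sayama and Kakegawa samples confused with each other, which is the kind of pattern the abstract describes for those two regions.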

(This article belongs to the Special Issue The Future of Artificial Intelligence in Agriculture)

26 pages, 6245 KiB

Open AccessArticle

Secure and Transparent Banking: Explainable AI-Driven Federated Learning Model for Financial Fraud Detection

by Saif Khalifa Aljunaid, Saif Jasim Almheiri, Hussain Dawood and Muhammad Adnan Khan

J. Risk Financial Manag. 2025, 18(4), 179; https://doi.org/10.3390/jrfm18040179 - 27 Mar 2025

Cited by 2 | Viewed by 4452

Abstract

The increasing sophistication of fraud has rendered rule-based fraud detection obsolete, exposing banks to greater financial risk, reputational damage, and regulatory penalties. Financial stability, customer trust, and compliance are increasingly threatened as centralized Artificial Intelligence (AI) models fail to adapt, leading to inefficiencies, false positives, and undetected fraud. These limitations necessitate advanced AI solutions that allow banks to adapt properly to emerging fraud patterns. While AI enhances fraud detection, its black-box nature limits transparency, making it difficult for analysts to trust, validate, and refine decisions and posing challenges for compliance, fraud explanation, and adversarial defense. Effective fraud detection requires models that balance high accuracy and adaptability to emerging fraud patterns. Federated Learning (FL) enables distributed training for fraud detection while preserving data privacy and ensuring legal compliance. However, traditional FL approaches operate as black-box systems, limiting analysts' ability to trust, verify, or improve the decisions made by AI in fraud detection. Explainable AI (XAI) enhances fraud analysis by improving interpretability, fostering trust, refining classifications, and ensuring compliance. The integration of XAI and FL forms a privacy-preserving and explainable model that enhances security and decision-making. This research proposes an Explainable FL (XFL) model for financial fraud detection, addressing both FL's security and XAI's interpretability. With the help of Shapley Additive Explanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), analysts can explain and improve fraud classification while maintaining privacy, accuracy, and compliance. The proposed model is trained on a financial fraud detection dataset. The results highlight the efficiency of detection and the successful elimination of false positives, and they improve on existing models: the proposed model attained 99.95% accuracy and a miss rate of 0.05%, paving the way for a more effective and comprehensive AI-based system to detect potential fraud in banking.
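The distributed training idea behind FL can be sketched in a few lines of Python. The abstract does not state which aggregation rule the authors use, so this example assumes federated averaging (FedAvg), a common choice in which a server combines client model parameters weighted by each client's local sample count. The function name, parameter vectors, and sample counts are all hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Combine client parameter vectors into one global vector.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    Only parameters are shared; raw transaction data never leaves a client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same model shape")
        for j, value in enumerate(weights):
            # Each client contributes in proportion to its data volume.
            averaged[j] += value * size / total
    return averaged


# Two hypothetical banks with 100 and 300 local samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(global_weights)
```

In a full system this step would be repeated over many rounds, with explanation tools such as SHAP or LIME applied to the resulting model; those parts are outside the scope of this sketch.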

(This article belongs to the Special Issue Corporate Financial Crises and Fraud Detection)

Search Results (168)