MDPI - Publisher of Open Access Journals

24 pages, 4913 KiB

Open Access Article

Region-Wise Recognition and Classification of Arabic Dialects and Vocabulary: A Deep Learning Approach

by Fawaz S. Al–Anzi and Bibin Shalini Sundaram Thankaleela

Appl. Sci. 2025, 15(12), 6516; https://doi.org/10.3390/app15126516 - 10 Jun 2025

Viewed by 632

This article presents a unique approach to Arabic dialect identification using a pre-trained speech classification model. The system categorizes Arabic audio clips into their respective dialects by employing 1D and 2D convolutional neural network technologies built from diverse dialects from the Arab region using deep learning models. Its objective is to enhance traditional linguistic handling and speech technology by accurately classifying Arabic audio clips into their corresponding dialects. The techniques involved include record gathering, preprocessing, feature extraction, prototypical architecture, and assessment metrics. The algorithm distinguishes various Arabic dialects, such as A (Arab nation authorized dialectal), EGY (Egyptian Arabic), GLF (Gulf Arabic), LAV and LF (Levantine Arabic, spoken in Syria, Lebanon, and Jordan), MSA (Modern Standard Arabic), NOR (North African Arabic), and SA (Saudi Arabic). Experimental results demonstrate the efficiency of the proposed approach in accurately determining diverse Arabic dialects, achieving a testing accuracy of 94.28% and a validation accuracy of 95.55%, surpassing traditional machine learning models such as Random Forest and SVM and advanced erudition models such as CNN and CNN2D.

(This article belongs to the Special Issue Speech Recognition and Natural Language Processing)

12 pages, 921 KiB

Open Access Article

Comparison of ECG Between Gameplay and Seated Rest: Machine Learning-Based Classification

by Emi Yuda, Hiroyuki Edamatsu, Yutaka Yoshida and Takahiro Ueno

Appl. Sci. 2025, 15(10), 5783; https://doi.org/10.3390/app15105783 - 21 May 2025

Viewed by 396

Abstract

The influence of gameplay on autonomic nervous system activity was investigated by comparing electrocardiogram (ECG) data during seated rest and gameplay. A total of 13 participants (6 in the gameplay group and 7 in the control group) were analyzed. RR interval time series (2 Hz) and heart-rate variability (HRV) indices, including mean RR, SDRR, VLF, LF, HF, LF/HF, and HF peak frequency, were extracted from ECG signals over 5 min and 10 min segments. HRV indices were calculated using fast Fourier transform (FFT). The classification was performed using Logistic Regression (LGR), Random Forest (RF), XGBoost (XGB), One-Class SVM (OCS), Isolation Forest (ILF), and Local Outlier Factor (LOF). A balanced dataset of 5 min and 10 min segments was evaluated using k-fold cross-validation (k = 3, 4, 5). Performance metrics, including recall, F-score, and PR-AUC, were computed for each classifier. Grid search was applied to optimize parameters for LGR, RF, and XGB, while default settings were used for the other classifiers. Among all models, OCS with k = 3 achieved the highest classification accuracy for both 5 min and 10 min data. These findings suggest that machine learning-based classification can effectively distinguish ECG patterns between gameplay and rest.

(This article belongs to the Special Issue Application of Artificial Intelligence in Bioinformatics)
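
The k-fold evaluation mentioned in this abstract (k = 3, 4, 5) can be illustrated with a minimal index-splitting sketch. This is not the authors' code: the function name `k_fold_indices` and the sample count are hypothetical, and the sketch performs plain contiguous splitting without shuffling or class balancing.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for plain k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Hypothetical dataset of 12 segments, split with the k values from the abstract.
    for k in (3, 4, 5):
        fold_sizes = [len(test) for _, test in k_fold_indices(12, k)]
        print(k, fold_sizes)
```

Each sample lands in exactly one test fold, so every segment is used for testing once per value of k.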

25 pages, 9354 KiB

Open Access Article

Identification of Maize Kernel Varieties Using LF-NMR Combined with Image Data: An Explainable Approach Based on Machine Learning

by Chunguang Bi, Xinhua Bi, Jinjing Liu, He Chen, Mohan Wang, Helong Yu and Shaozhong Song

Plants 2025, 14(1), 37; https://doi.org/10.3390/plants14010037 - 26 Dec 2024

Cited by 1 | Viewed by 1197

Abstract

The precise identification of maize kernel varieties is essential for germplasm resource management, genetic diversity conservation, and the optimization of agricultural production. To address the need for rapid and non-destructive variety identification, this study developed a novel interpretable machine learning approach that integrates low-field nuclear magnetic resonance (LF-NMR) with morphological image features through an optimized support vector machine (SVM) framework. First, LF-NMR signals were obtained from eleven maize kernel varieties, and ten key features were extracted from the transverse relaxation decay curves. Meanwhile, five image morphological features were selected using the recursive feature elimination (RFE) algorithm. Before modeling, principal component analysis (PCA) was used to determine the distribution features of the internal components for each maize variety. Subsequently, LF-NMR features and image morphological data were integrated to construct a classification model and the SVM hyperparameters were optimized using an improved differential evolution algorithm, achieving a final classification accuracy of 96.36%, which demonstrated strong robustness and precision. The model’s interpretability was further enhanced using Shapley values, which revealed the contributions of key features such as Max Signal and Signal at Max Curvature to classification decisions. This study provides an innovative technical solution for the efficient identification of maize varieties, supports the refined management of germplasm resources, and lays a foundation for genetic improvement and agricultural applications.

(This article belongs to the Section Plant Modeling)

22 pages, 8158 KiB

Open Access Article

Extracting Mare-like Cryptomare Deposits in Cryptomare Regions Based on CE-2 MRM Data Using SVM Method

by Tianqi Tang, Zhiguo Meng, Yi Lian, Zhaoran Wei, Xuegang Dong, Yongzhi Wang, Mingchang Wang, Zhanchuan Cai, Xiaoping Zhang, Alexander Gusev and Yuanzhi Zhang

Remote Sens. 2023, 15(8), 2010; https://doi.org/10.3390/rs15082010 - 11 Apr 2023

Cited by 1 | Viewed by 2125

Abstract

A new kind of surface material is found and defined in the Balmer–Kapteyn (B-K) cryptomare region, Mare-like cryptomare deposits (MCD), representing highland debris mixed by mare deposits with a certain fraction. This postulates the presence of surface materials in the cryptomare regions. In this study, to objectively verify the existence of the MCD in the cryptomare regions, based on the Chang’E-2 microwave radiometer (MRM) data, the support vector machine (SVM) method was adopted, where the K-means algorithm was used to optimize the training samples and the random forest algorithm was used to select the proper band features. Finally, the extracted MCD is identified with the datasets including Lunar Reconnaissance Orbiter Wide Angle Camera, Diviner, and Clementine UV–VIS. The main findings are as follows: (1) Compared to the range outlined via the TB counter, the range of the MCD is objectively extracted using the SVM method in the B-K cryptomare region, which is reasonably indicated by the FeO abundance, TiO₂ abundance, and rock abundance distributions. (2) The MCDs were extracted in the Dewar, Lomonosov–Fleming (L-F), and Schiller–Schickard (S-S) regions, indicating that the MCDs are widely distributed in the cryptomaria. (3) The presence of MCDs is concentrated in a limited region, accounting for 64.9%, 52.3%, 76.4%, and 64%, respectively, in the range of Dewar, L-F, S-S, and B-K regions identified using the optical data. The occurrence of the MCD gives a new understanding of the surface evolution in the cryptomare regions.

(This article belongs to the Special Issue Advances in Exploring the Moon, Mars, and Asteroids Using Spacecraft Remote Sensing and Other Toolkits)

12 pages, 1758 KiB

Open Access Article

Discrimination and Prediction of Lonicerae japonicae Flos and Lonicerae Flos and Their Related Prescriptions by Attenuated Total Reflectance Fourier Transform Infrared Spectroscopy Combined with Multivariate Statistical Analysis

by Yang-Qiannan Tang, Li Li, Tian-Feng Lin, Li-Mei Lin, Ya-Mei Li and Bo-Hou Xia

Molecules 2022, 27(14), 4640; https://doi.org/10.3390/molecules27144640 - 20 Jul 2022

Cited by 4 | Viewed by 2208

Abstract

LJF and LF are commonly used in Chinese patent drugs. In the Chinese Pharmacopoeia, LJF and LF once belonged to the same source. However, since 2005, the two species have been listed separately. Therefore, they are often misused, and medicinal materials are indiscriminately put in their related prescriptions in China. In this work, firstly, we established a model for discriminating LJF and LF using ATR-FTIR combined with multivariate statistical analysis. The spectra data were further preprocessed and combined with spectral filter transformations and normalization methods. These pretreated data were used to establish pattern recognition models with PLS-DA, RF, and SVM. Results demonstrated that the RF model was the optimal model, and the overall classification accuracy for LJF and LF samples reached 98.86%. Then, the established model was applied in the discrimination of their related prescriptions. Interestingly, the results show good accuracy and applicability. The RF model for discriminating the related prescriptions containing LJF or LF had an accuracy of 100%. Our results suggest that this method is a rapid and effective tool for the successful discrimination of LJF and LF and their related prescriptions.

(This article belongs to the Special Issue Chemometrics in Analytical Chemistry)

21 pages, 5713 KiB

Open Access Article

The Superiority of Data-Driven Techniques for Estimation of Daily Pan Evaporation

by Manish Kumar, Anuradha Kumari, Deepak Kumar, Nadhir Al-Ansari, Rawshan Ali, Raushan Kumar, Ambrish Kumar, Ahmed Elbeltagi and Alban Kuriqi

Atmosphere 2021, 12(6), 701; https://doi.org/10.3390/atmos12060701 - 30 May 2021

Cited by 35 | Viewed by 4075

Abstract

In the present study, estimating pan evaporation (E_pan) was evaluated based on different input parameters: maximum and minimum temperatures, relative humidity, wind speed, and bright sunshine hours. The techniques used for estimating E_pan were the artificial neural network (ANN), wavelet-based ANN (WANN), radial function-based support vector machine (SVM-RF), linear function-based SVM (SVM-LF), and multi-linear regression (MLR) models. The proposed models were trained and tested in three different scenarios (Scenario 1, Scenario 2, and Scenario 3) utilizing different percentages of data points. Scenario 1 includes 60%: 40%, Scenario 2 includes 70%: 30%, and Scenario 3 includes 80%: 20% accounting for the training and testing dataset, respectively. Various statistical tools, such as Pearson’s correlation coefficient (PCC), root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), and Willmott Index (WI), were used to evaluate the performance of the models. Graphical representations, such as a line diagram, scatter plot, and the Taylor diagram, were also used to evaluate the proposed model’s performance. The model results showed that the SVM-RF model’s performance is superior to other proposed models in all three scenarios. The most accurate values of PCC, RMSE, NSE, and WI were found to be 0.607, 1.349, 0.183, and 0.749, respectively, for the SVM-RF model during Scenario 1 (60%: 40% training: testing) among all scenarios. This showed that with an increase in the sample set for training, the testing data would show a less accurate modeled result. Thus, the evolved models produce comparatively better outcomes and foster decision-making for water managers and planners.

(This article belongs to the Special Issue Drought Risk Management in Reflect Changing of Meteorological Conditions)
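
The four goodness-of-fit statistics named in this abstract (PCC, RMSE, NSE, WI) can be sketched with their standard textbook definitions. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and the Willmott index is taken here as the original index of agreement, which the abstract does not specify.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean square error between observed and predicted series."""
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def willmott_index(obs, pred):
    """Willmott index of agreement (original form, assumed here)."""
    mean_obs = _mean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(obs, pred))
    return 1.0 - residual / potential


def pcc(obs, pred):
    """Pearson's correlation coefficient."""
    mean_obs, mean_pred = _mean(obs), _mean(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    norm_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    norm_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in pred))
    return cov / (norm_obs * norm_pred)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # made-up values for illustration
    predicted = [1.5, 2.5, 3.5, 4.5]
    print(rmse(observed, predicted), nse(observed, predicted),
          willmott_index(observed, predicted), pcc(observed, predicted))
```

A constant bias, as in the made-up example, leaves PCC at 1 while lowering NSE and WI, which is one reason such studies report several statistics together.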

16 pages, 1695 KiB

Open Access Article

Evaluation of ECG Features for the Classification of Post-Stroke Survivors with a Diagnostic Approach

by Kalaivani Rathakrishnan, Seung-Nam Min and Se Jin Park

Appl. Sci. 2021, 11(1), 192; https://doi.org/10.3390/app11010192 - 28 Dec 2020

Cited by 8 | Viewed by 5768

Abstract

Stroke is considered a major cause of death and neurological disorders commonly associated with elderly people. Electrocardiogram (ECG) signals are used as a powerful tool in diagnosing stroke, and the analysis of ECG signals has become the focus of stroke research. ECG changes and autonomic dysfunction are reportedly seen in patients with stroke. This study aimed to analyze the ECG features and develop a classification model with highly ranked ECG features as input variables based on machine-learning techniques for diagnosing stroke disease. The study included 52 stroke patients (mean age 72.7 years, 63% male) and 80 control subjects (mean age 75.5 years, 39% male) for a total of 132 elderly subjects. Resting ECG signals in the lying down position are measured using the BIOPAC MP150 system. The ECG signals are denoised using the discrete wavelet transform (DWT) method, and the features such as heart rate variability (HRV), indices of time and spectral domains and statistical and impulsive metrics, in addition to fiducial features, are extracted and analyzed. Our results showed that the values of the HRV variables were lower in the stroke group, revealing autonomic dysfunction in stroke patients. A statistically significant difference was observed in low-frequency (LF)/high-frequency (HF), time interval measured after the S wave to the beginning of the T wave (ST) and time interval measured from the beginning of the Q wave to the end of the T wave (QT) (p < 0.05) between the groups. Our study also highlighted some of the risk factors of stroke, such as age, male sex and dyslipidemia (p < 0.05), that are statistically significant. The k-nearest neighbors (KNN) model showed higher classification performance (accuracy 96.6%, precision 94.3%, recall 99.1% and F1-score 96.6%) than the random forest, support vector machine (SVM), Naïve Bayes and logistic regression models. Thus, our study reported some of the notable ECG changes in the study participants and also indicated that ECG could aid in diagnosing stroke disease.

(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
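
The four scores reported in this abstract (accuracy, precision, recall, F1-score) all derive from the same binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0 or tp + fp == 0 or tp + fn == 0:
        raise ValueError("counts do not allow all four metrics to be computed")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Because F1 is the harmonic mean of precision and recall, a high recall paired with a lower precision, as in the abstract, pulls F1 toward the lower of the two.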

21 pages, 4706 KiB

Open Access Article

Estimation of Daily Stage–Discharge Relationship by Using Data-Driven Techniques of a Perennial River, India

by Manish Kumar, Anuradha Kumari, Daniel Prakash Kushwaha, Pravendra Kumar, Anurag Malik, Rawshan Ali and Alban Kuriqi

Sustainability 2020, 12(19), 7877; https://doi.org/10.3390/su12197877 - 23 Sep 2020

Cited by 34 | Viewed by 4558

Abstract

Modeling the stage-discharge relationship in river flow is crucial in controlling floods, planning sustainable development, managing water resources and economic development, and sustaining the ecosystem. In the present study, two data-driven techniques, namely wavelet-based artificial neural networks (WANN) and a support vector machine with linear and radial basis kernel functions (SVM-LF and SVM-RF), were employed for daily discharge (Q) estimation. The hydrological data of daily stage (H) and discharge (Q) from June to October for 10 years (2004–2013) at the Govindpur station, situated in the Burhabalang river basin, Orissa, were considered for analysis. For model construction, an optimum number of inputs (lags) was extracted using the partial autocorrelation function (PACF) at a 5% level of significance. The outcomes of the WANN, SVM-LF, and SVM-RF models were appraised over the observed value of Q based on performance indicators, viz., root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), Pearson’s correlation coefficient (PCC), and Willmott index (WI), and through visual inspection (time variation, scatter plot, and Taylor diagram). Results of the evaluation showed that the SVM-RF model (RMSE = 104.426 m³/s, NSE = 0.925, PCC = 0.964, WI = 0.979) outperformed the WANN and SVM-LF models with the combination of three inputs, i.e., current stage, one-day antecedent stage, and discharge, during the testing period. In addition, the SVM-RF model was found to be more reliable and robust than the other models and to have important implications for water resources management at the study site.

(This article belongs to the Special Issue Machine Learning with Metaheuristic Algorithms for Sustainable Water Resources Management)
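
The best-performing input combination in this abstract (current stage, one-day antecedent stage, and one-day antecedent discharge) corresponds to a simple lagged-feature layout. The sketch below only shows how such samples could be assembled from two daily series; the function name and the numbers are hypothetical, and the PACF-based lag selection and the models themselves are not reproduced.

```python
def build_lagged_samples(stage, discharge):
    """Pair (H_t, H_{t-1}, Q_{t-1}) feature tuples with the target Q_t."""
    if len(stage) != len(discharge):
        raise ValueError("stage and discharge series must have equal length")
    samples = []
    # Start at t = 1 because the first day has no antecedent values.
    for t in range(1, len(stage)):
        features = (stage[t], stage[t - 1], discharge[t - 1])
        samples.append((features, discharge[t]))
    return samples


if __name__ == "__main__":
    # Made-up daily stage (m) and discharge (m^3/s) values for illustration.
    daily_stage = [1.0, 1.2, 1.5]
    daily_discharge = [10.0, 14.0, 21.0]
    for features, target in build_lagged_samples(daily_stage, daily_discharge):
        print(features, "->", target)
```

A series of n days yields n - 1 samples, since the first day serves only as an antecedent.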

24 pages, 2354 KiB

Open Access Article

A Novel Coordinated Motion Fusion-Based Walking-Aid Robot System

by Wenxia Xu, Jian Huang and Lei Cheng

Sensors 2018, 18(9), 2761; https://doi.org/10.3390/s18092761 - 22 Aug 2018

Cited by 22 | Viewed by 5486

Abstract

Human locomotion is a coordinated motion between the upper and lower limbs, which should be considered in terms of both the user’s normal walking state and abnormal walking state for a walking-aid robot system. Therefore, a novel coordinated motion fusion-based walking-aid robot system was proposed. To develop the accurate human motion intention (HMI) of such robots when the user is in normal walking state, force-sensing resistor (FSR) sensors and a laser range finder (LRF) are used to detect the two HMIs expressed by the user’s upper and lower limbs. Then, a fuzzy logic control (FLC)-Kalman filter (KF)-based coordinated motion fusion algorithm is proposed to synthesize these two segmental HMIs to obtain an accurate HMI. A support vector machine (SVM)-based fall detection algorithm is used to detect whether the user is going to fall and to distinguish the user’s falling mode when he/she is in an abnormal walking state. The experimental results verify the effectiveness of the proposed algorithms.

(This article belongs to the Section Intelligent Sensors)

15 pages, 2695 KiB

Open Access Article

Determination of Optimal Heart Rate Variability Features Based on SVM-Recursive Feature Elimination for Cumulative Stress Monitoring Using ECG Sensor

by Dajeong Park, Miran Lee, Sunghee E. Park, Joon-Kyung Seong and Inchan Youn

Sensors 2018, 18(7), 2387; https://doi.org/10.3390/s18072387 - 23 Jul 2018

Cited by 27 | Viewed by 5788

Abstract

Routine stress monitoring in daily life can predict potentially serious health impacts. Effective stress monitoring in medical and healthcare fields is dependent upon accurate determination of stress-related features. In this study, we determined the optimal stress-related features for effective monitoring of cumulative stress. We first investigated the effects of short- and long-term stress on various heart rate variability (HRV) features using a rodent model. Subsequently, we determined an optimal HRV feature set using support vector machine-recursive feature elimination (SVM-RFE). Experimental results indicate that the HRV time domain features generally decrease under long-term stress, and the HRV frequency domain features have substantially significant differences under short-term stress. Further, an SVM classifier with a radial basis function kernel proved most accurate (93.11%) when using an optimal HRV feature set comprising the mean of R-R intervals (mRR), the standard deviation of R-R intervals (SDRR), and the coefficient of variance of R-R intervals (CVRR) as time domain features, and the normalized low frequency (nLF) and the normalized high frequency (nHF) as frequency domain features. Our findings indicate that the optimal HRV features identified in this study can effectively and efficiently detect stress. This knowledge facilitates development of in-facility and mobile healthcare system designs to support stress monitoring in daily life.

(This article belongs to the Section Biosensors)
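
The five features in the optimal set named in this abstract (mRR, SDRR, CVRR, nLF, nHF) have common textbook definitions, sketched below. This is a hedged illustration rather than the authors' pipeline: the function names are hypothetical, CVRR is returned as a plain ratio (some studies report it as a percentage), and nLF and nHF are assumed to be normalized by LF + HF, with the band powers supplied by a spectral estimate that is not shown.

```python
import math


def time_domain_features(rr_ms):
    """Compute mRR, SDRR and CVRR from a list of R-R intervals in milliseconds."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("at least two R-R intervals are required")
    mrr = sum(rr_ms) / n
    # Sample standard deviation (n - 1 in the denominator).
    sdrr = math.sqrt(sum((x - mrr) ** 2 for x in rr_ms) / (n - 1))
    cvrr = sdrr / mrr
    return {"mRR": mrr, "SDRR": sdrr, "CVRR": cvrr}


def normalized_band_powers(lf_power, hf_power):
    """Return (nLF, nHF), each band power divided by LF + HF (assumed convention)."""
    total = lf_power + hf_power
    if total <= 0:
        raise ValueError("LF + HF power must be positive")
    return lf_power / total, hf_power / total


if __name__ == "__main__":
    rr_intervals = [800.0, 820.0, 780.0, 800.0]   # made-up values
    print(time_domain_features(rr_intervals))
    print(normalized_band_powers(lf_power=300.0, hf_power=100.0))
```

Under this convention nLF and nHF sum to 1, so a classifier receives the same information from either one; studies that use both usually do so for reporting symmetry.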

15 pages, 3961 KiB

Open Access Article

Entropy SVM–Based Recognition of Transient Surges in HVDC Transmissions

by Guomin Luo, Changyuan Yao, Yinglin Liu, Yingjie Tan and Jinghan He

Entropy 2018, 20(6), 421; https://doi.org/10.3390/e20060421 - 31 May 2018

Cited by 9 | Viewed by 4471

Abstract

Protection based on transient information is the primary protection of high voltage direct current (HVDC) transmission systems. As a major part of protection function, accurate identification of transient surges is quite crucial to ensure the performance and accuracy of protection algorithms. Recognition of transient surges in an HVDC system faces two challenges: signal distortion and small number of samples. Entropy, which is stable in representing frequency distribution features, and support vector machine (SVM), which is good at dealing with samples with limited numbers, are adopted and combined in this paper to solve the transient recognition problems. Three commonly detected transient surges, namely single-pole-to-ground fault (GF), lightning fault (LF), and lightning disturbance (LD), are simulated in various scenarios and recognized with the proposed method. The proposed method is proved to be effective in both feature extraction and type classification and shows great potential in protection applications.

(This article belongs to the Special Issue Wavelets, Fractals and Information Theory III)
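
The idea of using entropy to summarize a frequency distribution can be sketched with plain Shannon entropy over per-band signal energies. The abstract does not state which entropy measure or band decomposition the paper uses, so both are assumptions here, and the function name and energy values are hypothetical.

```python
import math


def shannon_entropy(band_energies):
    """Shannon entropy (natural log) of a normalized energy distribution."""
    total = sum(band_energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    # Zero-energy bands contribute nothing and are skipped to avoid log(0).
    probabilities = [e / total for e in band_energies if e > 0]
    return -sum(p * math.log(p) for p in probabilities)


if __name__ == "__main__":
    # Made-up band energies: one concentrated spectrum, one spread-out spectrum.
    print(shannon_entropy([9.0, 0.5, 0.3, 0.2]))
    print(shannon_entropy([2.5, 2.5, 2.5, 2.5]))
```

Energy concentrated in one band gives a value near zero, while energy spread evenly across bands gives the maximum, which is what makes the scalar usable as a compact feature for a classifier such as an SVM.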

16 pages, 1517 KiB

Open Access Article

A Hybrid Approach to Detect Driver Drowsiness Utilizing Physiological Signals to Improve System Performance and Wearability

by Muhammad Awais, Nasreen Badruddin and Micheal Drieberg

Sensors 2017, 17(9), 1991; https://doi.org/10.3390/s17091991 - 31 Aug 2017

Cited by 233 | Viewed by 11987

Abstract

Driver drowsiness is a major cause of fatal accidents, injury, and property damage, and has become an area of substantial research attention in recent years. The present study proposes a method to detect drowsiness in drivers which integrates features of electrocardiography (ECG) and electroencephalography (EEG) to improve detection performance. The study measures differences between the alert and drowsy states from physiological data collected from 22 healthy subjects in a driving simulator-based study. A monotonous driving environment is used to induce drowsiness in the participants. Various time and frequency domain features were extracted from EEG including time domain statistical descriptors, complexity measures and power spectral measures. Features extracted from the ECG signal included heart rate (HR) and heart rate variability (HRV), including low frequency (LF), high frequency (HF) and LF/HF ratio. Furthermore, a subjective sleepiness scale is also assessed to study its relationship with drowsiness. We used paired t-tests to select only statistically significant features (p < 0.05) that can differentiate between the alert and drowsy states effectively. Significant features of both modalities (EEG and ECG) are then combined to investigate the improvement in performance using support vector machine (SVM) classifier. The other main contribution of this paper is the study on channel reduction and its impact on detection performance. The proposed method demonstrated that combining EEG and ECG has improved the system’s performance in discriminating between alert and drowsy states, instead of using them alone. Our channel reduction analysis revealed that an acceptable level of accuracy (80%) could be achieved by combining just two electrodes (one EEG and one ECG), indicating the feasibility of a system with improved wearability compared with existing systems involving many electrodes. Overall, our results demonstrate that the proposed method can be a viable solution for a practical driver drowsiness system that is both accurate and comfortable to wear.

(This article belongs to the Special Issue Sensors for Transportation)

15 pages, 3648 KiB

Open Access Article

Spectroscopic Diagnosis of Arsenic Contamination in Agricultural Soils

by Tiezhu Shi, Huizeng Liu, Yiyun Chen, Teng Fei, Junjie Wang and Guofeng Wu

Sensors 2017, 17(5), 1036; https://doi.org/10.3390/s17051036 - 4 May 2017

Cited by 26 | Viewed by 5164

Abstract

This study investigated the abilities of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed by using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principal component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural network (ANN), radial basis function- and linear function- based support vector machine (RBF- and LF-SVM) were employed for establishing diagnosis models. The model accuracies were evaluated and compared by using overall accuracies (OAs). The statistical significance of the difference between models was evaluated by using McNemar’s test (Z value). The results showed that the OAs varied with the different combinations of pre-processing, feature selection, and classification methods. Feature selection methods could improve the modeling efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF- (OA = 89%) and LF-SVM (OA = 87%) had no statistical difference in diagnosis accuracies (Z < 1.96, p < 0.05). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. The appropriate combination of multivariate methods was important to improve diagnosis accuracies.

(This article belongs to the Special Issue Sensors in Agriculture)
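
The McNemar Z value used in this abstract to compare classifiers can be sketched from the two models' disagreements on the same test samples. The form below is the common uncorrected statistic; whether the paper applies a continuity correction is not stated, and the function name and labels are hypothetical.

```python
import math


def mcnemar_z(y_true, pred_a, pred_b):
    """Uncorrected McNemar Z statistic for two classifiers on the same samples."""
    # Samples that model A classifies correctly and model B misclassifies.
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b)
                 if a == t and b != t)
    # Samples that model B classifies correctly and model A misclassifies.
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b)
                 if a != t and b == t)
    if a_only + b_only == 0:
        return 0.0   # the models never disagree in correctness
    return (a_only - b_only) / math.sqrt(a_only + b_only)


if __name__ == "__main__":
    # Made-up labels: 1 = contaminated, 0 = uncontaminated.
    truth = [1, 1, 0, 0, 1, 0]
    model_a = [1, 1, 0, 0, 0, 1]
    model_b = [1, 0, 1, 0, 0, 0]
    z = mcnemar_z(truth, model_a, model_b)
    print(z, "significant at 5%" if abs(z) > 1.96 else "not significant at 5%")
```

Only samples on which exactly one model is correct enter the statistic, so two models with similar overall accuracy can still differ significantly if their errors fall on different samples.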

14 pages, 838 KiB

Open Access Article

Support Vector Machine Classification of Drunk Driving Behaviour

by Huiqin Chen and Lei Chen

Int. J. Environ. Res. Public Health 2017, 14(1), 108; https://doi.org/10.3390/ijerph14010108 - 23 Jan 2017

Cited by 40 | Viewed by 7217

Abstract

Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN), the root mean square value of the difference of the adjacent R–R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

(This article belongs to the Special Issue Alcohol and Health)

Search Results (14)
