
Search Results (27)

Search Parameters:
Keywords = weighted k-nearest neighbour

18 pages, 2108 KiB  
Article
Machine Learning Forecasting of Commercial Buildings’ Energy Consumption Using Euclidian Distance Matrices
by Connor Scott and Alhussein Albarbar
Energies 2025, 18(15), 4160; https://doi.org/10.3390/en18154160 - 5 Aug 2025
Abstract
Governments worldwide have set ambitious targets for decarbonising energy grids, driving the need for increased renewable energy generation and improved energy efficiency. One key strategy for achieving this involves enhanced energy management in buildings, often using machine learning-based forecasting methods. However, such methods typically rely on extensive historical data collected via costly sensor installations—resources that many buildings lack. This study introduces a novel forecasting approach that eliminates the need for large-scale historical datasets or expensive sensors. By integrating custom-built models with existing energy data, the method applies calculated weighting through a distance matrix and accuracy coefficients to generate reliable forecasts. It uses readily available building attributes—such as floor area and functional type—to position a new building within the matrix of existing data. A Euclidean distance matrix, akin to a K-nearest neighbour algorithm, determines the appropriate neural network(s) to utilise. These findings are benchmarked against a consolidated, more sophisticated neural network and a long short-term memory neural network. The dataset has hourly granularity over a 24 h horizon. The model consists of five bespoke neural networks and compares favourably with the other models: it trains in 610 s, uses 500 kB of storage, achieves an R2 of 0.9, and attains an average forecasting accuracy of 85.12% in predicting the energy consumption of the five buildings studied. This approach not only contributes to the specific goal of a fully decarbonised energy grid by 2050 but also establishes a robust and efficient methodology for maintaining standards with existing benchmarks while providing more control over the method.
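The attribute-based model selection described in this abstract can be sketched as a Euclidean distance lookup in the spirit of k-nearest neighbours: scale the building attributes, compute distances to the reference buildings, and pick the closest one's network. The attribute choices, values, and min-max scaling below are illustrative assumptions, not details from the paper:

```python
import numpy as np

# Reference buildings described by readily available attributes
# (here: floor area in m^2 and a numeric encoding of functional type).
# These values are hypothetical, for illustration only.
reference = np.array([
    [1200.0, 0],   # office
    [800.0,  1],   # retail
    [2500.0, 2],   # warehouse
    [1500.0, 0],   # office
    [600.0,  3],   # cafe
])

def nearest_reference(attributes, refs):
    """Return reference-building indices sorted by Euclidean distance.

    Attributes are min-max scaled first so the large floor-area values
    do not dominate the functional-type dimension.
    """
    stacked = np.vstack([refs, attributes])
    lo, hi = stacked.min(axis=0), stacked.max(axis=0)
    scaled = (stacked - lo) / np.where(hi - lo == 0, 1, hi - lo)
    # Distance from the new building (last row) to every reference row
    dists = np.linalg.norm(scaled[:-1] - scaled[-1], axis=1)
    return np.argsort(dists)

# A new 1100 m^2 office is matched to the most similar reference
# building, whose trained network would then be used for forecasting.
order = nearest_reference(np.array([1100.0, 0]), reference)
```

The paper weights several nearby models by accuracy coefficients rather than taking a single neighbour; the sorted index list above is the starting point for either variant.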

19 pages, 1433 KiB  
Article
Cost-Optimised Machine Learning Model Comparison for Predictive Maintenance
by Yating Yang and Muhammad Zahid Iqbal
Electronics 2025, 14(12), 2497; https://doi.org/10.3390/electronics14122497 - 19 Jun 2025
Viewed by 669
Abstract
Predictive maintenance is essential for reducing industrial downtime and costs, yet real-world datasets frequently encounter class imbalance and require cost-sensitive evaluation due to costly misclassification errors. This study utilises the SCANIA Component X dataset to advance predictive maintenance through machine learning, employing seven supervised algorithms (Support Vector Machine, Random Forest, Decision Tree, K-Nearest Neighbours, Multi-Layer Perceptron, XGBoost, and LightGBM) trained on time-series features extracted via a sliding window approach. A bespoke cost-sensitive metric, aligned with SCANIA's misclassification cost matrix, assesses model performance. Three imbalance mitigation strategies (downsampling, downsampling with SMOTETomek, and manual class weighting) were explored, with downsampling proving most effective. Random Forest and Support Vector Machine models achieved high accuracy and low misclassification costs, whilst a voting ensemble further enhanced cost efficiency. This research emphasises the critical role of cost-aware evaluation and imbalance handling, proposing an ensemble-based framework to improve predictive maintenance in industrial applications.
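The cost-sensitive metric described here can be illustrated with a toy misclassification cost matrix: each prediction is charged the cost of its (true, predicted) cell, so a model is ranked by total cost rather than accuracy. The cost values below are invented for illustration and are not SCANIA's actual figures:

```python
import numpy as np

# Hypothetical cost matrix: rows = true class, columns = predicted class.
# A missed component failure (false negative) is far costlier than a
# false alarm; correct predictions cost nothing.
COSTS = np.array([
    [0.0,   10.0],   # true healthy: correct / false alarm
    [500.0,  0.0],   # true failing: missed failure / correct
])

def total_cost(y_true, y_pred, costs=COSTS):
    """Sum the cost of every prediction under the cost matrix."""
    return float(sum(costs[t, p] for t, p in zip(y_true, y_pred)))

# One missed failure (500) outweighs two false alarms (10 each),
# which is why accuracy alone would mis-rank these models.
cost = total_cost([1, 0, 0], [0, 1, 1])
```

Under such a matrix, a classifier tuned to minimise total cost will trade many cheap false alarms for fewer expensive missed failures.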

23 pages, 8533 KiB  
Article
Integrating Hyperspectral, Thermal, and Ground Data with Machine Learning Algorithms Enhances the Prediction of Grapevine Yield and Berry Composition
by Shaikh Yassir Yousouf Jewan, Deepak Gautam, Debbie Sparkes, Ajit Singh, Lawal Billa, Alessia Cogato, Erik Murchie and Vinay Pagay
Remote Sens. 2024, 16(23), 4539; https://doi.org/10.3390/rs16234539 - 4 Dec 2024
Viewed by 1671
Abstract
Accurately predicting grapevine yield and quality is critical for optimising vineyard management and ensuring economic viability. Numerous studies have reported the complexity in modelling grapevine yield and quality due to variability in the canopy structure, challenges in incorporating soil and microclimatic factors, and management practices throughout the growing season. The use of multimodal data and machine learning (ML) algorithms could overcome these challenges. Our study aimed to assess the potential of multimodal data (hyperspectral vegetation indices (VIs), thermal indices, and canopy state variables) and ML algorithms to predict grapevine yield components and berry composition parameters. The study was conducted during the 2019/20 and 2020/21 grapevine growing seasons in two South Australian vineyards. Hyperspectral and thermal data of the canopy were collected at several growth stages. Simultaneously, grapevine canopy state variables, including the fractional intercepted photosynthetically active radiation (fiPAR), stem water potential (Ψstem), leaf chlorophyll content (LCC), and leaf gas exchange, were collected. Yield components were recorded at harvest. Berry composition parameters, such as total soluble solids (TSSs), titratable acidity (TA), pH, and the maturation index (IMAD), were measured at harvest. A total of 24 hyperspectral VIs and 3 thermal indices were derived from the proximal hyperspectral and thermal data. These data, together with the canopy state variable data, were then used as inputs for the modelling. Both linear and non-linear regression models, such as ridge (RR), Bayesian ridge (BRR), random forest (RF), gradient boosting (GB), K-Nearest Neighbour (KNN), and decision trees (DTs), were employed to model grape yield components and berry composition parameters. The results indicated that the GB model consistently outperformed the other models. The GB model had the best performance for the total number of clusters per vine (R2 = 0.77; RMSE = 0.56), average cluster weight (R2 = 0.93; RMSE = 0.00), average berry weight (R2 = 0.95; RMSE = 0.00), cluster weight (R2 = 0.95; RMSE = 0.13), and average berries per bunch (R2 = 0.93; RMSE = 0.83). For the yield, the RF model performed the best (R2 = 0.97; RMSE = 0.55). The GB model performed the best for the TSSs (R2 = 0.83; RMSE = 0.34), pH (R2 = 0.93; RMSE = 0.02), and IMAD (R2 = 0.88; RMSE = 0.19). However, the RF model performed best for the TA (R2 = 0.83; RMSE = 0.33). Our results also revealed the top 10 predictor variables for grapevine yield components and quality parameters, namely, the canopy temperature depression, LCC, fiPAR, normalised difference infrared index, Ψstem, stomatal conductance (gs), net photosynthesis (Pn), modified triangular vegetation index, modified red-edge simple ratio, and ANTgitelson index. These predictors significantly influence the grapevine growth, berry quality, and yield. The identification of these predictors of the grapevine yield and fruit composition can assist growers in improving vineyard management decisions and ultimately increase profitability.
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

19 pages, 5594 KiB  
Article
An Automated Geographical Information System-Based Spatial Machine Learning Method for Leak Detection in Water Distribution Networks (WDNs) Using Monitoring Sensors
by Doha Elshazly, Rahul Gawai, Tarig Ali, Md Maruf Mortula, Serter Atabay and Lujain Khalil
Appl. Sci. 2024, 14(13), 5853; https://doi.org/10.3390/app14135853 - 4 Jul 2024
Cited by 1 | Viewed by 2372
Abstract
Pipe leakage in water distribution networks (WDNs) has been an emerging concern for water utilities worldwide due to its public health and economic significance. Not only does it cause significant water losses, but it also deteriorates the quality of the treated water in WDNs. Hence, a prompt response is required to avoid or minimize the eventual consequences. This raises the necessity of exploring the possible approaches for detecting and locating leaks in WDNs promptly. Currently, various leak detection methods exist, but they are not accurate and reliable in detecting leaks. This paper presents a novel GIS-based spatial machine learning technique that utilizes currently installed pressure, flow, and water quality monitoring sensors in WDNs, specifically employing the Geographically Weighted Regression (GWR) and Local Outlier Factor (LOF) models, based on a WDN dataset provided by our partner utility authority. In addition to its ability as a regression model for predicting a dependent variable based on input variables, GWR was selected to help identify locations on the WDN where coefficients deviate the most from the overall coefficients. To corroborate the GWR results, the Local Outlier Factor (LOF) is used as an unsupervised machine learning model to predict leak locations based on spatial local density, where locality is given by k-nearest neighbours. The sample WDN dataset provided by our utility partner was split into 70:30 for training and testing of the GWR model. The GWR model was able to predict leaks (detection and location) with a coefficient of determination (R2) of 0.909. The LOF model was able to predict the leaks with a matching of 80% with the GWR results. Then, a customized GIS interface was developed to automate the detection process in real-time as the sensor's readings were recorded and spatial machine learning was used to process the readings. The results obtained demonstrate the ability of the proposed method to robustly detect and locate leaks in WDNs.
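The density-based idea behind LOF, flagging points whose local density is much lower than that of their k nearest neighbours, can be sketched in plain NumPy. This is a simplified, self-contained version for intuition, not the authors' implementation or scikit-learn's:

```python
import numpy as np

def lof_scores(X, k=3):
    """Local Outlier Factor: scores well above 1 mark points whose
    local reachability density is much lower than their neighbours'."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    # k nearest neighbours of each point (column 0 is the point itself)
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    k_dist = D[np.arange(n), nn[:, -1]]          # distance to k-th neighbour
    lrd = np.empty(n)                            # local reachability density
    for i in range(n):
        reach = np.maximum(k_dist[nn[i]], D[i, nn[i]])
        lrd[i] = 1.0 / reach.mean()
    # LOF = average neighbour density relative to own density
    return np.array([lrd[nn[i]].mean() / lrd[i] for i in range(n)])

# A tight cluster of hypothetical sensor readings plus one far-away
# reading: the isolated point receives a LOF score well above 1.
readings = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                     [0.5, 0.5], [1, 0.5], [10, 10]], dtype=float)
scores = lof_scores(readings, k=3)
```

In the leak-detection setting, each row would hold spatially referenced sensor features, and high-scoring locations become candidate leak sites to cross-check against the GWR coefficients.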
(This article belongs to the Special Issue Advances in Civil Structural Damage Detection and Health Monitoring)

20 pages, 3173 KiB  
Article
Comparison of Machine Learning Algorithms for Heartbeat Detection Based on Accelerometric Signals Produced by a Smart Bed
by Minh Long Hoang, Guido Matrella and Paolo Ciampolini
Sensors 2024, 24(6), 1900; https://doi.org/10.3390/s24061900 - 15 Mar 2024
Cited by 6 | Viewed by 2784
Abstract
This work aims to compare the performance of Machine Learning (ML) and Deep Learning (DL) algorithms in detecting users' heartbeats on a smart bed. Targeting non-intrusive, continuous heart monitoring during sleep time, the smart bed is equipped with a 3D solid-state accelerometer. Acceleration signals are processed through an STM 32-bit microcontroller board and transmitted to a PC for recording. A photoplethysmographic sensor is simultaneously checked for ground truth reference. A dataset has been built, by acquiring measures in a real-world set-up: 10 participants were involved, resulting in 120 min of acceleration traces which were utilized to train and evaluate various Artificial Intelligence (AI) algorithms. The experimental analysis utilizes K-fold cross-validation to ensure robust model testing across different subsets of the dataset. Various ML and DL algorithms are compared, each being trained and tested using the collected data. The Random Forest algorithm exhibited the highest accuracy among all compared models. While it requires longer training time compared to some ML models such as Naïve Bayes, Linear Discriminant Analysis, and K-Nearest Neighbour Classification, it remains substantially faster than Support Vector Machine and Deep Learning models. The Random Forest model demonstrated robust performance metrics, including recall, precision, F1-scores, macro average, weighted average, and overall accuracy well above 90%. The study highlights the better performance of the Random Forest algorithm for the specific use case, achieving superior accuracy and performance metrics in detecting user heartbeats in comparison to other ML and DL models tested. The drawback of longer training times is not too relevant in the long-term monitoring target scenario, so the Random Forest model stands out as a viable solution for real-time ballistocardiographic heartbeat detection, showcasing potential for healthcare and wellness monitoring applications.
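The K-fold protocol used here for robust model testing can be sketched as an index generator: shuffle once, split into k folds, and let each fold serve as the test set exactly once. The fold count and seed below are arbitrary choices for illustration:

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Samples are shuffled once, so every sample appears in exactly one
    test fold and in the training set of the other k-1 folds.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Each classifier under comparison would be fit on `train` and scored
# on `test` for every split, then its k scores averaged.
splits = list(kfold_indices(10, k=5))
```

Averaging the per-fold metrics is what allows a fair accuracy and training-time comparison across the ML and DL models.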
(This article belongs to the Section Intelligent Sensors)

20 pages, 2300 KiB  
Article
Low Complexity Non-Linear Spectral Features and Wear State Models for Remaining Useful Life Estimation of Bearings
by Eoghan T. Chelmiah, Violeta I. McLoone and Darren F. Kavanagh
Energies 2023, 16(14), 5312; https://doi.org/10.3390/en16145312 - 11 Jul 2023
Cited by 1 | Viewed by 1501
Abstract
Improving the reliability and performance of electric and rotating machines is crucial to many industrial applications. This will lead to improved robustness, efficiency, and eco-sustainability, as well as mitigate significant health and safety concerns regarding sudden catastrophic failure modes. Bearing degradation is the most significant cause of machine failure and has been reported to cause up to 75% of low-voltage machine failures. This paper introduces a low complexity machine learning (ML) approach to estimate the remaining useful life (RUL) of rolling element bearings using real vibration signals. This work explores different ML recipes using novel feature engineering coupled with various k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) kernel and weighting functions in order to optimise this RUL approach. Original non-linear wear state models and feature sets are investigated; the latter are derived from the Short-time Fourier Transform (STFT) and Hilbert Marginal Spectrum (HMS). These feature sets incorporate one-third octave band filtering for low complexity multivariate feature subspace compression. Our proposed ML algorithm stage has employed two robust supervised ML approaches: weighted k-NN and SVM. Real vibration data were drawn from the Pronostia platform to test and validate this prognostic monitoring approach. The results clearly demonstrate the effectiveness of this approach, with classification accuracy results of up to 82.8% achieved. This work contributes to the field by introducing a robust and computationally inexpensive method for accurate monitoring of machine health using low-cost vibration-based sensing.
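The weighted k-NN stage named in this abstract comes down to a distance-weighted vote: each of the k nearest training samples votes for its wear-state class with weight inversely proportional to its distance. The following is a generic sketch of that idea, not the paper's tuned kernel and weighting recipe:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=3, eps=1e-9):
    """Distance-weighted k-NN vote: closer training samples get
    proportionally larger say in the predicted class label."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]                 # k nearest training samples
    votes = {}
    for i in idx:
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (d[i] + eps)
    return max(votes, key=votes.get)

# Hypothetical 1-D feature (e.g. a compressed spectral feature) with
# two wear-state classes; the query near class 1 is labelled 1.
X_train = np.array([[0.0], [0.1], [5.0], [5.1], [0.2]])
y_train = [0, 0, 1, 1, 0]
pred = weighted_knn_predict(X_train, y_train, np.array([5.05]), k=3)
```

With plain (unweighted) k-NN a distant third neighbour counts as much as the two nearest; the 1/distance weighting is what keeps it from flipping the vote.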

19 pages, 1250 KiB  
Article
Supervised Learning by Evolutionary Computation Tuning: An Application to Blockchain-Based Pharmaceutical Supply Chain Cost Model
by Hossein Havaeji, Thien-My Dao and Tony Wong
Mathematics 2023, 11(9), 2021; https://doi.org/10.3390/math11092021 - 24 Apr 2023
Cited by 8 | Viewed by 2233
Abstract
A pharmaceutical supply chain (PSC) is a system of processes, operations, and organisations for drug delivery. This paper provides a new PSC mathematical cost model, which includes Blockchain technology (BT), that can improve the safety, performance, and transparency of medical information sharing in a healthcare system. We aim to estimate the costs of the BT-based PSC model, select algorithms with minimum prediction errors, and determine the cost components of the model. After the data generation, we applied four Supervised Learning algorithms (k-nearest neighbour, decision tree, support vector machine, and naive Bayes) combined with two Evolutionary Computation algorithms (ant colony optimization and the firefly algorithm). We also used the Feature Weighting approach to assign appropriate weights to all cost model components, revealing their importance. Four performance metrics were used to evaluate the cost model, and the total ranking score (TRS) was used to determine the most reliable predictive algorithms. Our findings show that the ACO-NB and FA-NB algorithms perform better than the other six algorithms in estimating the costs of the model with lower errors, whereas ACO-DT and FA-DT show the worst performance. The findings also indicate that the shortage cost, holding cost, and expired medication cost more strongly influence the cost model than other cost components.

19 pages, 626 KiB  
Article
Investigating Feature Selection Techniques to Enhance the Performance of EEG-Based Motor Imagery Tasks Classification
by Md. Humaun Kabir, Shabbir Mahmood, Abdullah Al Shiam, Abu Saleh Musa Miah, Jungpil Shin and Md. Khademul Islam Molla
Mathematics 2023, 11(8), 1921; https://doi.org/10.3390/math11081921 - 19 Apr 2023
Cited by 29 | Viewed by 4132
Abstract
Analyzing electroencephalography (EEG) signals with machine learning approaches has become an attractive research domain for linking the brain to the outside world to establish communication in the name of the Brain-Computer Interface (BCI). Many researchers have been working on developing successful motor imagery (MI)-based BCI systems. However, they still face challenges in producing better performance with them because of the irrelevant features and high computational complexity. Selecting discriminative and relevant features to overcome the existing issues is crucial. In our proposed work, different feature selection algorithms have been studied to reduce the dimension of the multiband feature space to improve MI task classification performance. In the procedure, we first decomposed the MI-based EEG signal into four narrowband signals. Then a common spatial pattern (CSP) approach was employed for each narrowband to extract and combine effective features, producing a high-dimensional feature vector. Three feature selection approaches, named correlation-based feature selection (CFS), minimum redundancy and maximum relevance (mRMR), and multi-subspace randomization and collaboration-based unsupervised feature selection (SRCFS), were used in this study to select the relevant and effective features for improving classification accuracy. Among them, the SRCFS feature selection approach demonstrated outstanding performance for MI classification compared to the other schemes. SRCFS builds multiple k-nearest neighbour graphs to learn feature weights from the Laplacian score, then discards irrelevant features according to those weights, reducing the feature dimension. Finally, the selected features are fed into support vector machines (SVM), linear discriminant analysis (LDA), and multi-layer perceptron (MLP) classifiers. The proposed model is evaluated with two benchmark datasets, namely BCI Competition III dataset IVA and dataset IIIB, which are publicly available and mainly used to recognize MI tasks. The LDA classifier with the SRCFS feature selection algorithm exhibits the best performance, demonstrating the superiority of the proposed approach over other state-of-the-art BCI-based MI task classification systems.

17 pages, 3382 KiB  
Article
Detecting Cortical Thickness Changes in Epileptogenic Lesions Using Machine Learning
by Sumayya Azzony, Kawthar Moria and Jamaan Alghamdi
Brain Sci. 2023, 13(3), 487; https://doi.org/10.3390/brainsci13030487 - 14 Mar 2023
Cited by 3 | Viewed by 4762
Abstract
Epilepsy is a neurological disorder characterized by abnormal brain activity. Epileptic patients suffer from unpredictable seizures, which may cause a loss of awareness. Seizures are considered drug-resistant if medication fails to control them. This leads practitioners to calculate the cortical thickness, measuring the distance between the brain's white and grey matter surfaces at various locations, in order to plan a surgical intervention. In this study, we introduce a machine learning approach to classify measurements extracted from T1-weighted magnetic resonance imaging. Data were collected from the epilepsy unit at King Abdulaziz University Hospital. We performed two trials to classify the extracted measurements from T1-weighted MRI for drug-resistant epilepsy and healthy control subjects. The preprocessing sequence on T1-weighted MRI images was performed using C++ through BrainSuite's pipeline. The first trial was performed on seven different combinations of four commonly selected measurements. The best performance was achieved in Exp6 and Exp7, with 80.00% accuracy, 83.00% recall, and 83.88% precision. It is noticeable that the grey matter volume and white matter volume measurements are more significant than the cortical thickness measurement. The second trial applied four different machine learning classifiers after applying 10-fold cross-validation and principal component analysis on all extracted measurements, as in the first trial, based on the previous works mentioned. The K-nearest neighbours model outperformed the other machine learning classifiers with 97.11% accuracy, 75.00% recall, and 75.00% precision.
(This article belongs to the Special Issue Intelligent Neural Systems for Solving Real Problems)

38 pages, 21837 KiB  
Article
A Machine Learning Approach to Prediction of the Compressive Strength of Segregated Lightweight Aggregate Concretes Using Ultrasonic Pulse Velocity
by Violeta Migallón, Héctor Penadés, José Penadés and Antonio José Tenza-Abril
Appl. Sci. 2023, 13(3), 1953; https://doi.org/10.3390/app13031953 - 2 Feb 2023
Cited by 12 | Viewed by 3934
Abstract
Lightweight aggregate concrete (LWAC) is an increasingly important material for modern construction. However, although it has several advantages compared with conventional concrete, it is susceptible to segregation due to the low density of the incorporated aggregate. The phenomenon of segregation can adversely affect the mechanical properties of LWAC, reducing its compressive strength and its durability. In this work, several machine learning techniques are used to study the influence of the segregation of LWAC on its compressive strength, including the K-nearest neighbours (KNN) algorithm, regression tree-based algorithms such as random forest (RF) and gradient boosting regressors (GBRs), artificial neural networks (ANNs) and support vector regression (SVR). In addition, a weighted average ensemble (WAE) method is proposed that combines RF, SVR and extreme GBR (or XGBoost). A dataset that was recently used for predicting the compressive strength of LWAC is employed in this experimental study. Two different types of lightweight aggregate (LWA), including expanded clay as a coarse aggregate and natural fine limestone aggregate, were mixed to produce LWAC. To quantify the segregation in LWAC, the ultrasonic pulse velocity method was adopted. Numerical experiments were carried out to analyse the behaviour of the obtained models, and a performance improvement was shown compared with the machine learning models reported in previous works. The best performance was obtained with GBR, XGBoost and the proposed weighted ensemble method. In addition, a good choice of weights in the WAE method allowed our approach to outperform all of the other models.
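Once the base regressors are fitted, the weighted average ensemble (WAE) step reduces to a normalised weighted sum of their predictions. The per-model predictions and weights below are made-up numbers for illustration, not results from the study:

```python
import numpy as np

def weighted_ensemble(predictions, weights):
    """Combine per-model prediction rows with one weight per model;
    weights are normalised so they sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(predictions, dtype=float)

# Hypothetical compressive-strength predictions (MPa) from three
# fitted regressors for two test specimens.
preds = [
    [32.0, 28.5],   # random forest
    [30.5, 27.0],   # support vector regression
    [31.0, 29.0],   # XGBoost
]
combined = weighted_ensemble(preds, weights=[0.4, 0.2, 0.4])
```

The paper's point about "a good choice of weights" corresponds to tuning the `weights` vector (e.g. on a validation split) instead of averaging the models equally.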
(This article belongs to the Special Issue Diagnostics and Monitoring of Steel and Concrete Structures)

12 pages, 753 KiB  
Article
Texture Analysis in Uterine Cervix Carcinoma: Primary Tumour and Lymph Node Assessment
by Paul-Andrei Ștefan, Adrian Coțe, Csaba Csutak, Roxana-Adelina Lupean, Andrei Lebovici, Carmen Mihaela Mihu, Lavinia Manuela Lenghel, Marius Emil Pușcas, Andrei Roman and Diana Feier
Diagnostics 2023, 13(3), 442; https://doi.org/10.3390/diagnostics13030442 - 26 Jan 2023
Cited by 4 | Viewed by 2316
Abstract
The conventional magnetic resonance imaging (MRI) evaluation and staging of cervical cancer encounters several pitfalls, partially due to subjective evaluations of medical images. Fifty-six patients with histologically proven cervical malignancies (squamous cell carcinomas, n = 42; adenocarcinomas, n = 14) who underwent pre-treatment MRI examinations were retrospectively included. The lymph node status (non-metastatic lymph nodes, n = 39; metastatic lymph nodes, n = 17) was assessed using pathological and imaging findings. The texture analysis of primary tumours and lymph nodes was performed on T2-weighted images. Texture parameters with the highest ability to discriminate between the two histological types of primary tumours and metastatic and non-metastatic lymph nodes were selected based on Fisher coefficients (cut-off value > 3). The parameters' discriminative ability was tested using a k-nearest neighbour (KNN) classifier, and by comparing their absolute values through a univariate and receiver operating characteristic analysis. Results: The KNN classified metastatic and non-metastatic lymph nodes with 93.75% accuracy. Ten entropy variations were able to identify metastatic lymph nodes (sensitivity: 79.17–88%; specificity: 93.48–97.83%). No parameters exceeded the cut-off value when differentiating between histopathological entities. In conclusion, texture analysis can offer a superior non-invasive characterization of lymph node status, which can improve the staging accuracy of cervical cancers.
(This article belongs to the Special Issue Imaging of Gynecological Disease 2.0)

19 pages, 2704 KiB  
Article
Machine Learning Methods for Diabetes Prevalence Classification in Saudi Arabia
by Entissar S. Almutairi and Maysam F. Abbod
Modelling 2023, 4(1), 37-55; https://doi.org/10.3390/modelling4010004 - 25 Jan 2023
Cited by 17 | Viewed by 4595
Abstract
Machine learning algorithms have been widely used in public health for predicting or diagnosing epidemiological chronic diseases, such as diabetes mellitus, which is classified as an epidemic due to its high rates of global prevalence. Machine learning techniques are useful for the processes of description, prediction, and evaluation of various diseases, including diabetes. This study investigates the ability of different classification methods to classify diabetes prevalence rates and the predicted trends in the disease according to associated behavioural risk factors (smoking, obesity, and inactivity) in Saudi Arabia. Classification models for diabetes prevalence were developed using different machine learning algorithms, including linear discriminant (LD), support vector machine (SVM), K-nearest neighbour (KNN), and neural network pattern recognition (NPR). Four kernel functions of SVM and two types of KNN algorithms were used, namely linear SVM, Gaussian SVM, quadratic SVM, cubic SVM, fine KNN, and weighted KNN. The performance evaluation in terms of the accuracy of each developed model was determined, and the developed classifiers were compared using the Classification Learner App in MATLAB, according to prediction speed and training time. The experimental results on the predictive performance analysis of the classification models showed that weighted KNN performed well in the prediction of diabetes prevalence rate, with the highest average accuracy of 94.5% and less training time than the other classification methods, for both men and women datasets.

24 pages, 10016 KiB  
Article
A Novel Approach Based on Machine Learning and Public Engagement to Predict Water-Scarcity Risk in Urban Areas
by Sadeq Khaleefah Hanoon, Ahmad Fikri Abdullah, Helmi Z. M. Shafri and Aimrun Wayayok
ISPRS Int. J. Geo-Inf. 2022, 11(12), 606; https://doi.org/10.3390/ijgi11120606 - 4 Dec 2022
Cited by 11 | Viewed by 4375
Abstract
Climate change, population growth and urban sprawl have put a strain on water supplies across the world, making it difficult to meet water demand, especially in city regions where more than half of the world’s population now reside. Due to the complex urban [...] Read more.
Climate change, population growth and urban sprawl have put a strain on water supplies across the world, making it difficult to meet water demand, especially in city regions where more than half of the world’s population now resides. Owing to the complexity of the urban fabric, conventional techniques should be developed to diagnose water shortage risk (WSR) by engaging crowdsourcing. This study aims to develop a novel approach based on public participation (PP) with a geographic information system (GIS) coupled with machine learning (ML) in the urban water domain. The approach was used to detect WSR in two ways: directly, using ML models, and via the weighted linear combination (WLC) function in GIS. Five ML algorithms, namely support vector machine (SVM), multilayer perceptron, K-nearest neighbour, random forest and naïve Bayes, were incorporated for this purpose. The Shapley additive explanation model was added to analyse the results. The Water Evaluation and Planning system was also used to predict unmet water demand as a relevant criterion, which was aggregated with the other criteria. The five algorithms indicated that diagnosing WSR using PP achieved good-to-perfect accuracy. In addition, the prediction process achieved high accuracy in both proposed techniques. However, the weights of relevant criteria extracted by SVM achieved higher accuracy than those of the other four models. Furthermore, averaging the weights of the five models in the WLC technique increased the prediction accuracy of WSR. Although some uncertainty was associated with the results, the novel approach interpreted them clearly, supporting decision makers in the proactive exploration of urban WSR and in choosing appropriate alternatives at the right time. Full article
(This article belongs to the Special Issue Urban Geospatial Analytics Based on Crowdsourced Data)
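The weighted linear combination step can be pictured as a matrix-vector product: each zone's risk score is the weighted sum of its normalised criterion values. The criteria names, values, and weights below are illustrative placeholders; in the paper the weights are derived from the ML models (e.g. SVM) rather than set by hand.

```python
import numpy as np

# Rows: urban zones; columns: normalised (0..1) risk criteria.
# Column meanings are hypothetical: unmet demand, population density,
# infrastructure age. Real criteria come from the PP/GIS survey.
criteria = np.array([
    [0.8, 0.6, 0.4],   # zone A
    [0.3, 0.9, 0.7],   # zone B
    [0.5, 0.2, 0.9],   # zone C
])
weights = np.array([0.5, 0.3, 0.2])   # illustrative; must sum to 1

risk = criteria @ weights             # WLC score per zone
ranking = np.argsort(risk)[::-1]      # indices of zones, highest risk first
print(risk)       # per-zone WSR scores
print(ranking)
```

Averaging the weight vectors produced by the five ML models, as the study does, simply replaces `weights` with the element-wise mean of the five learned vectors before the product.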

27 pages, 2479 KiB  
Article
Introductory Engineering Mathematics Students’ Weighted Score Predictions Utilising a Novel Multivariate Adaptive Regression Spline Model
by Abul Abrar Masrur Ahmed, Ravinesh C. Deo, Sujan Ghimire, Nathan J. Downs, Aruna Devi, Prabal D. Barua and Zaher M. Yaseen
Sustainability 2022, 14(17), 11070; https://doi.org/10.3390/su141711070 - 5 Sep 2022
Cited by 6 | Viewed by 4397
Abstract
Introductory Engineering Mathematics (a skill builder for engineers) involves developing problem-solving attributes throughout the teaching period. Therefore, predicting students’ final course grades from continuous assessment marks is a useful toolkit for degree program educators. Predictive models are practical tools for evaluating the effectiveness of teaching, assessing students’ progression, and implementing interventions for the best learning outcomes. This study develops a novel multivariate adaptive regression spline (MARS) model to predict the weighted score (WS), i.e., the course grade. To construct the proposed MARS model, Introductory Engineering Mathematics performance data over five years from the University of Southern Queensland, Australia, were used to design predictive models using input predictors of online quizzes, written assignments, and examination scores. About 60% of the randomised predictor grade data were used to train the model (with 25% of the training set used for validation) and 40% to test it. Based on the cross-correlation of inputs vs. the WS, 12 distinct combinations with single (M1–M5) and multiple (M6–M12) features were created to assess the influence of each on the WS, with results benchmarked against a decision tree regression (DTR), kernel ridge regression (KRR), and a k-nearest neighbour (KNN) model. The influence of each predictor on the WS clearly showed that online quizzes provide the least contribution; however, the MARS model improved dramatically once written assignments and examination scores were included. The research demonstrates the merits of the proposed MARS model in uncovering relationships among continuous learning variables, giving educators a distinct advantage in developing early interventions and moderating their teaching by predicting students’ performance ahead of the final course outcome. The findings and their future application have significant practical implications for teaching and learning interventions aimed at improving graduate outcomes in undergraduate engineering cohorts. Full article
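A MARS model is a weighted sum of hinge basis functions, max(0, x − knot) and its mirror max(0, knot − x), which lets the fit bend at data-driven knots. The toy sketch below shows the basis mechanics on an examination-score predictor; the coefficients and knot are invented for illustration, not fitted to the study's data.

```python
import numpy as np

def hinge(x, knot):
    """MARS hinge basis function: max(0, x - knot), elementwise."""
    return np.maximum(0.0, x - knot)

# Hypothetical fitted form: intercept + a mirrored pair of hinges with a
# knot at an exam score of 30 (all values illustrative).
exam = np.array([20.0, 45.0, 60.0, 80.0])
ws_hat = (10.0
          + 0.9 * hinge(exam, 30.0)    # active above the knot
          + 0.4 * hinge(30.0, exam))   # mirrored hinge, active below it
print(ws_hat)
```

MARS builds such terms greedily (forward pass) and then prunes them by generalised cross-validation, which is what lets it expose which predictors, such as the online quizzes here, contribute little.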

22 pages, 6323 KiB  
Article
Research on a Wi-Fi RSSI Calibration Algorithm Based on WOA-BPNN for Indoor Positioning
by Min Yu, Shuyin Yao, Xuan Wu and Liang Chen
Appl. Sci. 2022, 12(14), 7151; https://doi.org/10.3390/app12147151 - 15 Jul 2022
Cited by 9 | Viewed by 2569
Abstract
Owing to the heterogeneity of software and hardware in different types of mobile terminals, the received signal strength indication (RSSI) from the same Wi-Fi access point (AP) varies in indoor environments, which can affect the positioning accuracy of fingerprint methods. To solve this problem, and considering the nonlinear characteristics of Wi-Fi signal strength propagation and attenuation, we propose a whale optimisation algorithm-back-propagation neural network (WOA-BPNN) model for indoor Wi-Fi RSSI calibration. Firstly, as the selection of the initial parameters of the BPNN model has a considerable impact on the positioning accuracy of the calibration algorithm, we use the WOA to avoid blindly selecting the parameters of the BPNN model. We then propose an improved nonlinear convergence factor to balance the search ability of the WOA, which also helps to optimise the calibration algorithm, and we vary the structure of the BPNN model to compare its influence on the calibration effect of the WOA-BPNN algorithm. Secondly, in view of the low positioning accuracy of indoor fingerprint positioning algorithms, we propose a region-adaptive weighted K-nearest neighbour positioning algorithm based on hierarchical clustering. Finally, we combine the two proposed algorithms and compare the results with those of other calibration algorithms, namely linear regression (LR), support vector regression (SVR), BPNN, and genetic algorithm-BPNN (GA-BPNN). The test results show that across different mobile terminals, the proposed WOA-BPNN calibration algorithm can improve positioning accuracy (one-sigma error) by 41%, 42%, 44% and 36% on average. The indoor field tests suggest that the proposed methods can effectively reduce the indoor positioning error caused by software and hardware heterogeneity across different mobile terminals. Full article
(This article belongs to the Topic Multi-Sensor Integrated Navigation Systems)
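The weighted-KNN core of the positioning step (before the paper's region-adaptive, hierarchical-clustering refinement) can be sketched as follows: find the K reference points whose stored RSSI fingerprints are closest to the measured vector, then estimate the position as their inverse-distance weighted centroid. The AP count, dBm values, and coordinates below are made up for the sketch.

```python
import numpy as np

def wknn_position(fingerprints, coords, rssi, k=3, eps=1e-6):
    """Plain weighted-KNN fingerprint positioning: the estimate is the
    inverse-distance weighted mean of the coordinates of the K reference
    points whose stored RSSI vectors are nearest (Euclidean) to `rssi`."""
    d = np.linalg.norm(fingerprints - rssi, axis=1)  # signal-space distances
    idx = np.argsort(d)[:k]                          # K nearest reference points
    w = 1.0 / (d[idx] + eps)                         # closer -> larger weight
    w /= w.sum()
    return w @ coords[idx]                           # weighted position estimate

# Illustrative database: 4-AP fingerprints at 3 reference points (metres).
fp = np.array([[-40.0, -60.0, -70.0, -55.0],
               [-50.0, -50.0, -65.0, -60.0],
               [-65.0, -45.0, -55.0, -70.0]])
xy = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 5.0]])

measured = np.array([-45.0, -55.0, -68.0, -57.0])
print(wknn_position(fp, xy, measured, k=2))  # lies between the 2 nearest RPs
```

The region-adaptive variant in the paper first clusters the fingerprint map and restricts the neighbour search to the matched region, which is a change to how `idx` is selected rather than to the weighting itself.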
