Search Results (29,945)

Search Parameters:
Keywords = machining performance

17 pages, 3636 KiB  
Article
Analyzing Forest Leisure and Recreation Consumption Patterns Using Deep and Machine Learning
by Jeongjae Kim, Jinhae Chae and Seonghak Kim
Forests 2025, 16(7), 1180; https://doi.org/10.3390/f16071180 (registering DOI) - 17 Jul 2025
Abstract
Globally, forest leisure and recreation (FLR) activities are widely recognized not only for their environmental and social benefits but also for their economic contributions. To better understand these economic contributions, it is vital to examine how the regional economic levels of customers vary when consuming FLR. This study aimed to empirically examine whether the regional economic level of residents (i.e., gross regional domestic product; GRDP) is classifiable using FLR expenditure data, and to interpret which variables contribute to its classification. We acquired anonymized credit card transaction data on residents of two regions with different GRDP levels. The data were preprocessed by identifying FLR-related industries and extracting key spending features for classification analysis. Five classification models (deep neural network (DNN), random forest, extreme gradient boosting, support vector machine, and logistic regression) were applied. Among the models, the DNN model presented the best performance (overall accuracy = 0.73; area under the curve (AUC) = 0.82). SHAP analysis showed that the “FLR industry” variable was most influential in differentiating GRDP levels across all the models. These findings demonstrate that FLR consumption patterns may vary and are interpretable by economic levels, providing an empirical framework for designing regional economic policies.
(This article belongs to the Special Issue Forest Economics and Policy Analysis)

19 pages, 1661 KiB  
Article
Evaluation of the Field Performance and Economic Feasibility of Mechanized Onion Production in the Republic of Korea
by Jae-Seo Hwang and Wan-Soo Kim
Agronomy 2025, 15(7), 1721; https://doi.org/10.3390/agronomy15071721 (registering DOI) - 17 Jul 2025
Abstract
Onion cultivation in the Republic of Korea is increasingly threatened by labor shortages and an aging rural population, underscoring the growing importance of mechanization. This study evaluated the combined and individual performances and economic feasibility of mechanized transplanting, stem cutting, harvesting, and collecting operations using work efficiency; the missing plant, stem cutting, damage, and dropout rates; and foreign matter content as indicators. Mechanized operations achieved up to 358-fold higher work efficiencies than manual labor operations. However, in terms of marketability, performance was inferior due to missing plants, improperly cut stems, damaged bulbs, dropped onions, and foreign matter contamination. The economic analysis indicated that the use of individual machines is advantageous for farms larger than 10.2 ha for transplanting, 1.14 ha for stem cutting, 0 ha for harvesting (i.e., profitable regardless of farm size), and 6.95 ha for collecting. For fully mechanized operations, using machines for all four processes, the break-even area was found to be 3.63 ha, with a payback period of 2.1 years. These findings are expected to serve as a foundational reference for onion growers considering the adoption of mechanization.

18 pages, 11737 KiB  
Article
MoHiPr-TB: A Monthly Gridded Multi-Source Merged Precipitation Dataset for the Tarim Basin Based on Machine Learning
by Ping Chen, Junqiang Yao, Jing Chen, Mengying Yao, Liyun Ma, Weiyi Mao and Bo Sun
Remote Sens. 2025, 17(14), 2483; https://doi.org/10.3390/rs17142483 (registering DOI) - 17 Jul 2025
Abstract
A reliable precipitation dataset with high spatial resolution is essential for climate research in the Tarim Basin. This study evaluated the performances of four models, namely a random forest (RF), a long short-term memory network (LSTM), a support vector machine (SVM), and a feedforward neural network (FNN). FNN, which was found to be superior to the other models, was used to integrate eight precipitation datasets spanning from 1990 to 2022 across the Tarim Basin, resulting in a new monthly high-resolution (0.1°) precipitation dataset named MoHiPr-TB. This dataset was subsequently bias-corrected by the China Land Data Assimilation System version 2.0 (CLDAS2.0). Validation results indicate that the corrected MoHiPr-TB not only accurately reflects the spatial distribution of precipitation but also effectively simulates its intensity and interannual and seasonal variations. Moreover, MoHiPr-TB is capable of detecting the precipitation–elevation relationship in the Pamir Plateau, where precipitation initially increases and then decreases with elevation, as well as the synchronous variation of precipitation and elevation in the Tianshan region. Collectively, this study delivers a high-accuracy precipitation dataset for the Tarim Basin, which is anticipated to have extensive applications in meteorological, hydrological, and ecological research.
(This article belongs to the Section Earth Observation Data)

20 pages, 3636 KiB  
Article
The Prediction of Civil Building Energy Consumption Using a Hybrid Model Combining Wavelet Transform with SVR and ELM: A Case Study of Jiangsu Province
by Xiangxu Chen, Jinjin Mu, Zihan Shang and Xinnan Gao
Mathematics 2025, 13(14), 2293; https://doi.org/10.3390/math13142293 (registering DOI) - 17 Jul 2025
Abstract
As a pivotal economic province in China, Jiangsu’s efforts in civil building energy conservation are critical to achieving the national “dual carbon” goals. This paper proposes a hybrid model that integrates wavelet transform, support vector regression (SVR), and extreme learning machine (ELM) to predict the civil building energy consumption of Jiangsu Province. Based on data from statistical yearbooks, the historical energy consumption of civil buildings is calculated. Through a grey relational analysis (GRA), the key factors influencing the civil building energy consumption are identified. The wavelet transform technique is then applied to decompose the energy consumption data into a trend component and a fluctuation component. The SVR model predicts the trend component, while the ELM model captures the fluctuation patterns. The final prediction results are generated by combining these two predictions. The results demonstrate that the hybrid model achieves superior performance with a Mean Absolute Percentage Error (MAPE) of merely 1.37%, outperforming both individual prediction methods and alternative hybrid approaches. Furthermore, we develop three prospective scenarios to analyze civil building energy consumption trends from 2023 to 2030. The analysis reveals that the observed patterns align with the Environmental Kuznets Curve (EKC). These findings provide valuable insights for provincial governments in future policy-making and energy planning.

18 pages, 591 KiB  
Article
Active Learning for Medical Article Classification with Bag of Words and Bag of Concepts Embeddings
by Radosław Pytlak, Paweł Cichosz, Bartłomiej Fajdek and Bogdan Jastrzębski
Appl. Sci. 2025, 15(14), 7955; https://doi.org/10.3390/app15147955 (registering DOI) - 17 Jul 2025
Abstract
Systems supporting systematic literature reviews often use machine learning algorithms to create classification models to assess the relevance of articles to study topics. The proper choice of text representation for such algorithms may have a significant impact on their predictive performance. This article presents an in-depth investigation of the utility of the bag of concepts representation for this purpose, which can be considered an enhanced form of the ubiquitous bag of words representation, with features corresponding to ontology concepts rather than words. Its utility is evaluated in the active learning setting, in which a sequence of classification models is created, with training data iteratively expanded by adding articles selected for human screening. Different versions of the bag of concepts are compared with bag of words, as well as with combined representations, including both word-based and concept-based features. The evaluation uses the support vector machine, naive Bayes, and random forest algorithms and is performed on datasets from 15 systematic medical literature review studies. The results show that concept-based features may have additional predictive value in comparison to standard word-based features and that the combined bag of concepts and bag of words representation is the most useful overall.

22 pages, 1837 KiB  
Article
Anthropometric Measurements for Predicting Low Appendicular Lean Mass Index for the Diagnosis of Sarcopenia: A Machine Learning Model
by Ana M. González-Martin, Edgar Samid Limón-Villegas, Zyanya Reyes-Castillo, Francisco Esparza-Ros, Luis Alexis Hernández-Palma, Minerva Saraí Santillán-Rivera, Carlos Abraham Herrera-Amante, César Octavio Ramos-García and Nicoletta Righini
J. Funct. Morphol. Kinesiol. 2025, 10(3), 276; https://doi.org/10.3390/jfmk10030276 (registering DOI) - 17 Jul 2025
Abstract
Background: Sarcopenia is a progressive muscle disease that compromises mobility and quality of life in older adults. Although dual-energy X-ray absorptiometry (DXA) is the standard for assessing Appendicular Lean Mass Index (ALMI), it is costly and often inaccessible. This study aims to develop machine learning models using anthropometric measurements to predict low ALMI for the diagnosis of sarcopenia. Methods: A cross-sectional study was conducted on 183 Mexican adults (67.2% women and 32.8% men, ≥60 years old). ALMI was measured using DXA, and anthropometric data were collected following the International Society for the Advancement of Kinanthropometry (ISAK) protocols. Predictive models were developed using Logistic Regression (LR), Decision Trees (DTs), Random Forests (RFs), Artificial Neural Networks (ANNs), and LASSO regression. The dataset was split into training (70%) and testing (30%) sets. Model performance was evaluated using classification performance metrics and the area under the ROC curve (AUC). Results: ALMI showed strong correlations with BMI, corrected calf girth, and arm relaxed girth. Among models, DT achieved the best performance in females (AUC = 0.84), and ANN achieved the highest AUC in males (0.92). Regarding the prediction of low ALMI, specificity values were highest in DT for females (100%), while RF performed best in males (92%). The key predictive variables varied depending on sex, with BMI and calf girth being the most relevant for females and arm girth for males. Conclusions: Anthropometry combined with machine learning provides an accurate, low-cost approach for identifying low ALMI in older adults. This method could facilitate sarcopenia screening in clinical settings with limited access to advanced diagnostic tools.

11 pages, 1250 KiB  
Article
Optimizing Multivariable Logistic Regression for Identifying Perioperative Risk Factors for Deep Brain Stimulator Explantation: A Pilot Study
by Peyton J. Murin, Anagha S. Prabhune and Yuri Chaves Martins
Clin. Pract. 2025, 15(7), 132; https://doi.org/10.3390/clinpract15070132 (registering DOI) - 17 Jul 2025
Abstract
Background/Objectives: Deep brain stimulation (DBS) is an effective surgical treatment for Parkinson’s Disease (PD) and other movement disorders. Despite its benefits, DBS explantation occurs in 5.6% of cases, with costs exceeding USD 22,000 per implant. Traditional statistical methods have struggled to identify reliable risk factors for explantation. We hypothesized that supervised machine learning would more effectively capture complex interactions among perioperative factors, enabling the identification of novel risk factors. Methods: The Medical Informatics Operating Room Vitals and Events Repository was queried for patients with DBS, adequate clinical data, and at least two years of follow-up (n = 38). Fisher’s exact test assessed demographic and medical history variables. Data were analyzed using Anaconda Version 2.3.1 with pandas, numpy, sklearn, sklearn-extra, matplotlib.pyplot, and seaborn. Recursive feature elimination with cross-validation (RFECV) was used to optimize factor selection. A multivariate logistic regression model was trained and evaluated using precision, recall, F1-score, and area under the curve (AUC). Results: Fisher’s exact test identified chronic pain (p = 0.0108) and tobacco use (p = 0.0026) as risk factors. RFECV selected 24 optimal features. The logistic regression model demonstrated strong performance (precision: 0.89, recall: 0.86, F1-score: 0.86, AUC: 1.0). Significant risk factors included tobacco use (OR: 3.64; CI: 3.60–3.68), primary PD (OR: 2.01; CI: 1.99–2.02), ASA score (OR: 1.91; CI: 1.90–1.92), chronic pain (OR: 1.82; CI: 1.80–1.85), and diabetes (OR: 1.63; CI: 1.62–1.65). Conclusions: Our study suggests that supervised machine learning can identify risk factors for early DBS explantation. Larger studies are needed to validate our findings.

13 pages, 2051 KiB  
Article
Near-Infrared Spectroscopy and Machine Learning for Fast Quality Prediction of Bottle Gourd
by Xiao Guo, Hongyu Huang, Haiyan Wang, Chang Cai, Ying Wang, Xiaohua Wu, Jian Wang, Baogen Wang, Biao Zhu and Yun Xiang
Foods 2025, 14(14), 2503; https://doi.org/10.3390/foods14142503 (registering DOI) - 17 Jul 2025
Abstract
Protein and amino acid content are crucial quality parameters in bottle gourd, and traditional methods for measuring them are complicated, time-consuming, and costly. In this study, we employed near-infrared spectroscopy (NIRS) along with machine learning and neural network-based methods to model and predict the protein and free amino acid (FAA) content of bottle gourd. Specifically, the protein and FAA contents were measured through conventional methods. A near-infrared analyzer was then used to obtain the spectral data, which were processed using multiplicative scatter correction (MSC) and standard normal variate (SNV). The processed spectral data were further subjected to feature importance selection to identify the feature bands that had the highest correlation with protein and FAAs, respectively. The models for protein and FAA estimation were developed using support vector regression (SVR), ridge regression (RR), random forest regression (RFR), and fully connected neural networks (FCNNs). Among them, ridge regression achieved the best performance, with determination coefficients (R²) of 0.96 and 0.77 on the protein and FAA test sets, respectively, and root mean square error (RMSE) values of 0.23 and 0.5, respectively. Based on this, we developed a precise and rapid prediction model for the important quality indices of bottle gourd.
(This article belongs to the Section Food Analytical Methods)

27 pages, 3217 KiB  
Article
Identification of Writing Strategies in Educational Assessments with an Unsupervised Learning Measurement Framework
by Cheng Tang, Jiawei Xiong and George Engelhard
Educ. Sci. 2025, 15(7), 912; https://doi.org/10.3390/educsci15070912 (registering DOI) - 17 Jul 2025
Abstract
This study proposes a framework that leverages natural language processing and unsupervised machine learning techniques to measure, identify, and classify examinees’ writing strategies. The framework integrates three categories of writing strategies (text complexity, evidence use, and argument structure) to identify the characteristics of examinees’ writing. Additionally, a measurement model is used to calibrate examinees’ writing proficiency. An empirical example is presented to demonstrate the performance of the framework. The data comprise 430 Grade 8 examinees’ responses to English Language Arts (ELA) assessments in the United States. Using K-means clustering, distinct patterns were identified in each category. The one-parameter logistic measurement model was applied to estimate examinees’ writing proficiency. Analyses revealed significant effects of text complexity and evidence use on writing proficiency, while argument structure was not significant. This study has implications for writing instruction and assessment design that highlight the point that effective writing is not simply a matter of isolated skill acquisition, but rather the coordinated implementation of complementary strategies, a finding that supports cognitive developmental theories of writing.
(This article belongs to the Section Education and Psychology)

20 pages, 2612 KiB  
Article
Development and Evaluation of a Nanoparticle-Based Immunoassay for Rotavirus Detection: A Suitable Alternative to ELISA and PCR in Low-Income Setting
by Margaret Oluwatoyin Japhet, Adeogo Timilehin Bankole, Temiloluwa Ifeoluwa Omotade, Oyelola Eyinade Adeoye, Oladiran Famurewa and Simeon K. Adesina
Methods Protoc. 2025, 8(4), 81; https://doi.org/10.3390/mps8040081 (registering DOI) - 17 Jul 2025
Abstract
Every year, diarrhoea is responsible for >1 million deaths in children aged 0 to 5 years, with rotavirus as the leading cause. The regions most affected lack routine rotavirus diagnosis due to high cost, lack of necessary equipment, and a shortage of trained personnel for enzyme-linked immunosorbent assay (ELISA) and molecular methods. We report the development and evaluation of a cheap, nanoparticle-based immunoassay for routine machine-free rotavirus diagnosis. In this work, optimal conditions for oxidation of cotton swabs and aldehyde production for kit development were confirmed by Fourier-Transform Infrared Spectroscopy (FTIR). Lactoferrin (LF), needed to bind the virus to the cotton swab, was immobilised on activated cotton swabs, followed by the capture of commercial rotavirus antigen on LF-immobilised swabs. The swabs were then dipped in coloured nanobeads covalently coupled to rotavirus-group-specific monoclonal antibody for visual rotavirus detection. Subsequently, rotavirus detection by nanoassay, commercial ELISA, and quantitative reverse transcription PCR was compared using the same set of 186 stool samples and subjected to statistical analyses. The optimal oxidation condition was observed using 48 mg/mL NaIO₄ in 0.1 M sodium acetate buffer at 35 °C for 9 h. Rotavirus detection was confirmed visually by blue colour retention on swabs after several washings. Sensitivity, specificity, positive predictive value, and negative predictive value of ELISA in rotavirus detection were 60%, 84%, 53% and 88%, respectively, while our immunoassay showed performance at 88%, 94%, 82% and 96%. This immunoassay will support effective rotavirus public health interventions in low- and middle-income countries with high morbidity and mortality.
(This article belongs to the Section Biochemical and Chemical Analysis & Synthesis)

14 pages, 2941 KiB  
Article
Experimental and Numerical Investigation of the Mechanical Properties of ABS Parts Fabricated via Fused Deposition Modeling
by Yanqin Li, Peihua Zhu and Dehai Zhang
Polymers 2025, 17(14), 1957; https://doi.org/10.3390/polym17141957 (registering DOI) - 17 Jul 2025
Abstract
This study investigates the mechanical properties of ABS parts fabricated via fused deposition modeling (FDM) through integrated experimental and numerical approaches. ABS resin was used as the experimental material, and tensile tests were conducted using a universal testing machine. Finite element analysis (FEA) was performed via ANSYS 2021 to simulate stress deformation behavior, with key parameters including a gauge length of 10 mm (pre-stretching) and printing temperature gradients. The results show that the specimen exhibited a maximum tensile force of 7.3 kN, upper yield force of 3.7 kN, and lower yield force of 3.2 kN, demonstrating high strength and toughness. The non-proportional elongation reached 0.06 (6%), and the quantified enhancement multiple of AM relative to traditional manufacturing was 1.1, falling within the reasonable range for glass fiber-reinforced or specially formulated ABS. FEA results validated the experimental data, showing that the material underwent 15 mm of plastic deformation before fracture, consistent with ABS’s ductile characteristics.

16 pages, 1251 KiB  
Article
Enhanced Detection of Intrusion Detection System in Cloud Networks Using Time-Aware and Deep Learning Techniques
by Nima Terawi, Huthaifa I. Ashqar, Omar Darwish, Anas Alsobeh, Plamen Zahariev and Yahya Tashtoush
Computers 2025, 14(7), 282; https://doi.org/10.3390/computers14070282 (registering DOI) - 17 Jul 2025
Abstract
This study introduces an enhanced Intrusion Detection System (IDS) framework for Denial-of-Service (DoS) attacks, utilizing network traffic inter-arrival time (IAT) analysis. By examining the timing between packets and other statistical features, we detected patterns of malicious activity, allowing early and effective DoS threat mitigation. We generated real DoS traffic, including normal, Internet Control Message Protocol (ICMP), Smurf attack, and Transmission Control Protocol (TCP) classes, and developed nine predictive algorithms, combining traditional machine learning and advanced deep learning techniques with optimization methods, including the synthetic minority oversampling technique (SMOTE) and grid search (GS). Our findings reveal that while traditional machine learning achieved moderate accuracy, it struggled with imbalanced datasets. In contrast, Deep Neural Network (DNN) models showed significant improvements with optimization, with DNN combined with GS (DNN-GS) reaching 89% accuracy. Recurrent Neural Networks (RNNs) combined with SMOTE and GS (RNN-SMOTE-GS) emerged as the best-performing model, with a precision of 97%. This demonstrates the effectiveness of combining SMOTE and GS and highlights the critical role of advanced optimization techniques in enhancing the detection capabilities of IDS models for the accurate classification of various types of network traffic and attacks.

17 pages, 497 KiB  
Article
Generative Data Modelling for Diverse Populations in Africa: Insights from South Africa
by Sally Sonia Simmons, John Elvis Hagan and Thomas Schack
Information 2025, 16(7), 612; https://doi.org/10.3390/info16070612 (registering DOI) - 17 Jul 2025
Abstract
Studies on the demography and health of racially diverse African populations are scarce, particularly due to lingering data challenges. Generative data modelling has emerged as a valuable solution to this burden. The study, therefore, examined the efficacy of Conditional Tabular GAN (CTGAN), CopulaGAN, and Tabular Variational Autoencoder (TVAE) for generating synthetic but realistic demographic and health data. This study employed the World Health Organisation Study on Global Ageing and Adult Health (SAGE) Wave 1 South African survey data (n = 4227). Information missing from SAGE Wave 1, including demographic (e.g., race, age) and health (e.g., hypertension, blood pressure) indicators, was imputed using Generative Adversarial Imputation Nets (GAIN). CopulaGAN, CTGAN, and TVAE, sourced from the sdv 1.24.1 Python library, generated 104,227 synthetic records based on the SAGE data constituents. The outcomes were assessed with similarity and machine learning (XGBoost) augmentation metrics (sourced from the sdmetrics 0.21.0 Python library), including column shapes and overall and precision ratio scores. Generally, the GAIN imputations resulted in data with properties that were comparable to the original and with no missing information. CTGAN’s (89.20%) overall quality of performance was above that of TVAE (86.50%) and CopulaGAN (88.45%). These findings underscore the usefulness of generative data modelling in addressing data quality challenges in diverse populations to enhance actionable health research and policy implementation.

24 pages, 1991 KiB  
Article
A Multi-Feature Semantic Fusion Machine Learning Architecture for Detecting Encrypted Malicious Traffic
by Shiyu Tang, Fei Du, Zulong Diao and Wenjun Fan
J. Cybersecur. Priv. 2025, 5(3), 47; https://doi.org/10.3390/jcp5030047 (registering DOI) - 17 Jul 2025
Abstract
With the increasing sophistication of network attacks, machine learning (ML)-based methods have showcased promising performance in attack detection. However, ML-based methods often suffer from high false rates when tackling encrypted malicious traffic. To break through these bottlenecks, we propose EFTransformer, an encrypted flow transformer framework which inherits semantic perception and multi-scale feature fusion, can robustly and efficiently detect encrypted malicious traffic, and makes up for the shortcomings of ML in terms of modeling ability and feature adequacy. EFTransformer introduces a channel-level extraction mechanism based on quintuples and a noise-aware clustering strategy to enhance the recognition ability of traffic patterns; adopts a dual-channel embedding method, using Word2Vec and FastText to capture global semantics and subword-level changes; and uses a Transformer-based classifier and attention pooling module to achieve dynamic feature-weighted fusion, thereby improving the robustness and accuracy of malicious traffic detection. Our systematic experiments on the ISCX2012 dataset demonstrate that EFTransformer achieves the best detection performance, with an accuracy of up to 95.26%, a false positive rate (FPR) of 6.19%, and a false negative rate (FNR) of only 5.85%. These results show that EFTransformer achieves high detection performance against encrypted malicious traffic.
(This article belongs to the Section Security Engineering & Applications)

17 pages, 10396 KiB  
Article
Feature Selection Based on Three-Dimensional Correlation Graphs
by Adam Dudáš and Aneta Szoliková
AppliedMath 2025, 5(3), 91; https://doi.org/10.3390/appliedmath5030091 (registering DOI) - 17 Jul 2025
Abstract
The process of feature selection is a critical component of any decision-making system incorporating machine or deep learning models applied to multidimensional data. Feature selection on input data can be performed using a variety of techniques, such as correlation-based methods, wrapper-based methods, or embedded methods. However, many conventionally used approaches do not support backwards interpretability of the selected features, making their application in real-world scenarios impractical and difficult to implement. This work addresses that limitation by proposing a novel correlation-based strategy for feature selection in regression tasks, based on a three-dimensional visualization of correlation analysis results, referred to as three-dimensional correlation graphs. The main objective of this study is the design, implementation, and experimental evaluation of this graphical model through a case study using a multidimensional dataset with 28 attributes. The experiments assess the clarity of the visualizations and their impact on regression model performance, demonstrating that the approach reduces dimensionality while maintaining or improving predictive accuracy, enhances interpretability by uncovering hidden relationships, and achieves better or comparable results to conventional feature selection methods.
Show Figures

Figure 1

Back to TopTop