Search Results (88)

Search Parameters:
Keywords = Box-Cox Transformation

22 pages, 4060 KB  
Article
High-Performance Concrete Strength Regression Based on Machine Learning with Feature Contribution Visualization
by Lei Zhen, Chang Qu, Man-Lai Tang and Junping Yin
Mathematics 2025, 13(24), 3965; https://doi.org/10.3390/math13243965 - 12 Dec 2025
Viewed by 258
Abstract
Concrete compressive strength is a fundamental indicator of the mechanical properties of High-Performance Concrete (HPC) with multiple components. Traditionally, it is measured through laboratory tests, which are time-consuming and resource-intensive. Therefore, this study develops a machine learning-based regression framework to predict compressive strength, aiming to reduce experimental costs and resource usage. Under three different data preprocessing strategies—raw data, standard score, and Box–Cox transformation—a selected set of high-performance ensemble models demonstrates excellent predictive capacity, with both the coefficient of determination (R²) and explained variance score (EVS) exceeding 90% across all datasets, indicating high accuracy in compressive strength prediction. In particular, stacking ensemble (R² = 0.920, EVS = 0.920), XGBoost regression (R² = 0.920, EVS = 0.920), and HistGradientBoosting regression (R² = 0.913, EVS = 0.914) based on Box–Cox transformation data show strong generalization capability and stability. Additionally, tree-based and boosting methods demonstrate high effectiveness in capturing complex feature interactions. Furthermore, this study presents an analytical workflow that enhances feature interpretability through visualization techniques—including Partial Dependence Plots (PDP), Individual Conditional Expectation (ICE), and SHapley Additive exPlanations (SHAP). These methods clarify the contribution of each feature and quantify the direction and magnitude of its impact on predictions. Overall, this approach supports automated concrete quality control, optimized mixture proportioning, and more sustainable construction practices. Full article
(This article belongs to the Special Issue Advanced Computational Mechanics)
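A minimal sketch of the pipeline this abstract describes, a Box–Cox transform of a skewed strength target followed by a boosted regression scored with R²; the synthetic mixture features, model choice, and hyperparameters are illustrative stand-ins, not the authors' setup:

```python
import numpy as np
from scipy.stats import boxcox
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for an HPC mixture table: 5 mix features,
# strictly positive, right-skewed strength target
X = rng.uniform(0.1, 1.0, size=(600, 5))
strength = np.exp(1.5 * X[:, 0] + X[:, 1] * X[:, 2]) + rng.gamma(2.0, 0.1, 600)

# Box-Cox requires positive values; with lmbda=None it returns the
# transformed target plus the lambda estimated by maximum likelihood
y_bc, lam = boxcox(strength)

X_tr, X_te, y_tr, y_te = train_test_split(X, y_bc, test_size=0.25, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
r2 = r2_score(y_te, model.predict(X_te))
print(f"lambda = {lam:.3f}, test R^2 = {r2:.3f}")
```

The same fitted `lam` must be reused to back-transform predictions to the original strength scale.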

13 pages, 1045 KB  
Article
Development of a Nomogram for Predicting Lymphovascular Invasion at Initial Transurethral Resection of Bladder Tumors
by Takatoshi Somoto, Takanobu Utsumi, Rino Ikeda, Naoki Ishitsuka, Takahide Noro, Yuta Suzuki, Shota Iijima, Yuka Sugizaki, Ryo Oka, Takumi Endo, Naoto Kamiya, Nobuyuki Hiruta and Hiroyoshi Suzuki
Appl. Sci. 2025, 15(24), 12979; https://doi.org/10.3390/app152412979 - 9 Dec 2025
Viewed by 200
Abstract
Lymphovascular invasion (LVI) is a potent yet underutilized prognostic marker in bladder cancer, particularly in non–muscle-invasive disease (NMIBC). We aimed to develop and internally validate a predictive nomogram to estimate the probability of LVI at initial transurethral resection of bladder tumors (TURBT), utilizing preoperative clinical parameters. In this retrospective cohort study, 413 patients with histologically confirmed urothelial carcinoma who underwent initial TURBT were included. LVI was identified histologically in 9.2% of cases. Univariate and multivariate logistic regression, in conjunction with the least absolute shrinkage and selection operator modeling, revealed eight significant predictors: papillary architecture, Box–Cox–transformed tumor size, urinary cytology classification, age ≥ 75 years, pedunculated morphology, gender, hydronephrosis, and tumor multiplicity. The resulting nomogram demonstrated excellent discriminative performance, with an AUC of 0.888 in the training cohort and 0.827 in the validation cohort, and exhibited good calibration based on weighted plots. This model facilitates individualized prediction of LVI using routinely available clinical data. Early detection of LVI may inform risk-adapted management strategies, including repeat resection, or intensified surveillance in patients with bladder cancer. The model complements existing predictive frameworks and can contribute to more personalized and effective bladder cancer care. Full article
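A minimal sketch of L1-penalized (LASSO-style) logistic regression with one Box–Cox-transformed skewed predictor, in the spirit of the variable selection described above; the synthetic "tumor size" variable, coefficients, and penalty strength are invented, not the cohort's:

```python
import numpy as np
from scipy.stats import boxcox
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 500

# Hypothetical predictors: a skewed "tumor size" plus 4 noise covariates
size = rng.lognormal(1.0, 0.6, n)
noise = rng.normal(size=(n, 4))
size_bc, _ = boxcox(size)              # normalize the skewed predictor

logit = 2.0 * size_bc - 2.5            # outcome depends only on transformed size
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([size_bc, noise])
# The L1 penalty shrinks uninformative coefficients toward exactly zero
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
print("coefficients:", np.round(clf.coef_[0], 3), "AUC:", round(auc, 3))
```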

35 pages, 2126 KB  
Review
Techniques and Developments in Stochastic Streamflow Synthesis—A Comprehensive Review
by Shirin Studnicka and Umed S. Panu
Encyclopedia 2025, 5(4), 198; https://doi.org/10.3390/encyclopedia5040198 - 21 Nov 2025
Viewed by 460
Abstract
Stochastic streamflow synthesis has long been the cornerstone of water resource planning, enabling the generation of extended hydrological sequences that reflect natural variability beyond the limitations of observed records. This paper presents a comprehensive review of the theoretical foundations, methodological advancements, and evolving trends in synthetic streamflow generation. Historical progression is explored through three distinct eras: the pre-modern formulation era (pre-1960), the era dominated by autoregressive models (1960–2000), and the recent period marked by the rise of data-driven AI/ML approaches. Various modelling paradigms, parametric versus non-parametric, traditional versus AI-based, and single- versus multi-scale approaches, are critically assessed and compared with a focus on their applicability across temporal resolutions and hydrological regimes. This study also categorizes evaluation criteria into four dimensions: preservation of stochastic characteristics, distributional consistency, error-based metrics, and operational performance. In addition, the use and impact of transformation techniques (e.g., log or Box-Cox) employed to normalize streamflow distributions for improved model fidelity are examined. A bibliometric analysis of over 200 studies highlights the global research footprint, showing that the United States leads with 70 studies, followed by Canada with 15, reflecting the growing international engagement in the field. The analysis also identifies the most active journals publishing streamflow synthesis research: Water Resources Research (50 publications, since 1967), Journal of Hydrology (25 publications, since 1963), and Journal of the American Water Resources Association (9 publications, since 1974). This review not only synthesizes past and current practices but also outlines key challenges and future research directions to advance stochastic hydrology in an era of climatic uncertainty and data complexity. Full article
(This article belongs to the Section Earth Sciences)

15 pages, 1251 KB  
Article
Application of a Box-Cox Transformed LSTAR-GARCH Model for Point and Interval Forecasting of Monthly Rainfall in Hainan, China
by Xiaoxuan Zhang, Yu Liu and Jun Li
Water 2025, 17(22), 3274; https://doi.org/10.3390/w17223274 - 16 Nov 2025
Viewed by 463
Abstract
To improve the accuracy of monthly rainfall forecasting and reasonably quantify its uncertainty, this study developed a hybrid LSTAR-GARCH model incorporating a Box–Cox transformation. Using monthly rainfall data from 1999 to 2019 from four meteorological stations in Hainan Province (Haikou, Dongfang, Danzhou, and Qiongzhong), the non-stationarity and nonlinearity of the series were first verified using KPSS and BDS tests, and the Box–Cox transformation was applied to reduce skewness. A Logistic Smooth Transition Autoregressive (LSTAR) model was then established to capture nonlinear dynamics, followed by a GARCH(1,1) model to address heteroskedasticity in the residuals. The results indicate that: (1) The LSTAR model effectively captured the nonlinear characteristics of monthly rainfall, with Nash-Sutcliffe efficiency (NSE) values ranging from 0.565 to 0.802, though some bias remained in predicting extreme values; (2) While the GARCH component did not improve point forecast accuracy, it significantly enhanced interval forecasting performance. At the 95% confidence level, the average interval width (RIW) of the LSTAR-GARCH model was reduced to 0.065–0.130, substantially narrower than that of the LSTAR-ARCH model (RIW: 4.548–8.240), while maintaining high coverage rates (CR) between 93.8% and 97.9%; (3) The LSTAR-GARCH model effectively characterizes both the nonlinear mean process and time-varying volatility in rainfall series, proving to be an efficient and reliable tool for interval rainfall forecasting, particularly in tropical monsoon regions with high rainfall variability. This study provides a scientific basis for regional water resource management and climate change adaptation. Full article
(This article belongs to the Section Water and Climate Change)
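The interval-forecast metrics cited above, coverage rate (CR) and interval width, can be computed in a few lines; normalizing the width by the observed range is an assumption here, since the abstract does not spell out the paper's exact RIW definition:

```python
import numpy as np

def interval_scores(obs, lower, upper):
    """Coverage rate and relative interval width for interval forecasts.

    CR  = fraction of observations falling inside [lower, upper].
    RIW = mean interval width divided by the observed range
          (one plausible normalization; definitions vary by paper).
    """
    obs, lower, upper = map(np.asarray, (obs, lower, upper))
    cr = np.mean((obs >= lower) & (obs <= upper))
    riw = np.mean(upper - lower) / (obs.max() - obs.min())
    return cr, riw

obs   = np.array([10., 12., 15.,  9., 20.])   # toy monthly rainfall values
lower = np.array([ 8., 11., 13., 10., 18.])
upper = np.array([12., 14., 16., 12., 22.])
cr, riw = interval_scores(obs, lower, upper)
print(cr, round(riw, 3))
```

A good interval model keeps CR near the nominal level (e.g. 95%) while making RIW as small as possible, which is exactly the trade-off the abstract reports.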

25 pages, 5257 KB  
Article
A Reduced Stochastic Data-Driven Approach to Modelling and Generating Vertical Ground Reaction Forces During Running
by Guillermo Fernández, José María García-Terán, Álvaro Iglesias-Pordomingo, César Peláez-Rodríguez, Antolin Lorenzana and Alvaro Magdaleno
Modelling 2025, 6(4), 144; https://doi.org/10.3390/modelling6040144 - 6 Nov 2025
Viewed by 443
Abstract
This work presents a time-domain approach for characterizing the Ground Reaction Forces (GRFs) exerted by a pedestrian during running. It is focused on the vertical component, but the methodology is adaptable to other components or activities. The approach is developed from a statistical perspective. It relies on experimentally measured force-time series obtained from a healthy male pedestrian at eight step frequencies ranging from 130 to 200 steps/min. These data are subsequently used to build a stochastic data-driven model. The model is composed of multivariate normal distributions which represent the step patterns of each foot independently, capturing potential disparities between them. Additional univariate normal distributions represent the step scaling and the aerial phase, the latter with both feet off the ground. A dimensionality reduction procedure is also implemented to retain the essential geometric features of the steps using a sufficient set of random variables. This approach accounts for the intrinsic variability of running gait by assuming normality in the variables, validated through state-of-the-art statistical tests (Henze-Zirkler and Shapiro-Wilk) and the Box-Cox transformation. It enables the generation of virtual GRFs using pseudo-random numbers from the normal distributions. Results demonstrate strong agreement between virtual and experimental data. The virtual time signals reproduce the stochastic behavior, and their frequency content is also captured with deviations below 4.5%, most of them below 2%. This confirms that the method effectively models the inherent stochastic nature of running human gait. Full article
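A univariate sketch of the normality check plus Box–Cox step described above (SciPy provides Shapiro–Wilk but not the Henze–Zirkler multivariate test, so only a marginal test is shown; the skewed sample is synthetic):

```python
import numpy as np
from scipy.stats import boxcox, shapiro

rng = np.random.default_rng(3)
# Skewed stand-in for a positive gait variable (e.g. a step-scaling amplitude)
x = rng.lognormal(mean=0.0, sigma=0.8, size=300)

p_raw = shapiro(x).pvalue      # strongly non-normal -> tiny p-value
x_bc, lam = boxcox(x)          # MLE lambda; ~0 amounts to a log transform here
p_bc = shapiro(x_bc).pvalue
print(f"p before: {p_raw:.2e}, after Box-Cox: {p_bc:.3f}, lambda: {lam:.2f}")
```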

13 pages, 815 KB  
Article
A Bayesian Geostatistical Approach to Analyzing Groundwater Depth in Mining Areas
by Maria Chrysanthi, Andrew Pavlides and Emmanouil A. Varouchakis
Geosciences 2025, 15(11), 410; https://doi.org/10.3390/geosciences15110410 - 25 Oct 2025
Viewed by 491
Abstract
This study addresses the spatial variability of groundwater levels within a mining basin in Greece. The objective is to develop an accurate spatial model of groundwater levels in the area to support an integrated groundwater management plan. Hydraulic heads were measured in 72 observation wells, which are irregularly distributed, primarily in mining zones. Multiple geostatistical approaches are evaluated to identify an optimal model based on cross-validation metrics. We introduce a novel trend model that includes the surface elevation gradient, as well as the proximity of wells to the riverbed, utilizing a modified Box–Cox transformation to normalize residuals. The results indicate that Regression Kriging with a non-differentiable Matérn variogram outperforms Ordinary Kriging in cross-validation accuracy. The study provides maps of the piezometric head and kriging variance within a Bayesian framework, being among the first to quantify and incorporate river-distance effects within regression kriging for groundwater. Full article
(This article belongs to the Section Hydrogeology)

15 pages, 1977 KB  
Article
Robustness of the Trinormal ROC Surface Model: Formal Assessment via Goodness-of-Fit Testing
by Christos Nakas
Stats 2025, 8(4), 101; https://doi.org/10.3390/stats8040101 - 17 Oct 2025
Viewed by 735
Abstract
Receiver operating characteristic (ROC) surfaces provide a natural extension of ROC curves to three-class diagnostic problems. A key summary index is the volume under the surface (VUS), representing the probability that a randomly chosen observation from each of the three ordered groups is correctly classified. A parametric estimation of VUS typically assumes trinormality of the class distributions. However, a formal method for the verification of this composite assumption has not appeared in the literature. Our approach generalizes the two-class AUC-based GOF test of Zou et al. to the three-class setting by exploiting the parallel structure between empirical and trinormal VUS estimators. We propose a global goodness-of-fit (GOF) test for trinormal ROC models based on the difference between empirical and trinormal parametric estimates of the VUS. To improve stability, a probit transformation is applied and a bootstrap procedure is used to estimate the variance of the difference. The resulting test provides a formal diagnostic for assessing the adequacy of trinormal ROC modeling. Simulation studies illustrate the robustness of the assumption via the empirical size and power of the test under various distributional settings, including skewed and multimodal alternatives. The method’s application to COVID-19 antibody level data demonstrates its practical utility. Our findings suggest that the proposed GOF test is simple to implement, computationally feasible for moderate sample sizes, and a useful complement to existing ROC surface methodology. Full article
(This article belongs to the Section Biostatistics)
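The empirical VUS estimator referenced above is simply the fraction of correctly ordered triples drawn one from each group, which can be computed directly:

```python
import numpy as np

def empirical_vus(x, y, z):
    """Fraction of (x_i, y_j, z_k) triples correctly ordered x < y < z.

    This is the empirical volume under the ROC surface for three ordered
    classes; 1/6 is the chance level and 1.0 is perfect separation.
    """
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    # Broadcast the two pairwise comparisons, then combine over all triples
    xy = x[:, None, None] < y[None, :, None]
    yz = y[None, :, None] < z[None, None, :]
    return np.mean(xy & yz)

print(empirical_vus([1, 4], [2, 5], [3, 6]))   # 0.5
print(empirical_vus([1, 2], [3, 4], [5, 6]))   # 1.0
```

The proposed GOF test compares this empirical estimate against the trinormal parametric VUS computed from the group means and variances.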

23 pages, 1850 KB  
Article
Forecasting of GDP Growth in the South Caucasian Countries Using Hybrid Ensemble Models
by Gaetano Perone and Manuel A. Zambrano-Monserrate
Econometrics 2025, 13(3), 35; https://doi.org/10.3390/econometrics13030035 - 10 Sep 2025
Viewed by 1672
Abstract
This study aimed to forecast the gross domestic product (GDP) of the South Caucasian nations (Armenia, Azerbaijan, and Georgia) by scrutinizing the accuracy of various econometric methodologies. This topic is noteworthy considering the significant economic development exhibited by these countries in the context of recovery post COVID-19. The seasonal autoregressive integrated moving average (SARIMA), exponential smoothing state space (ETS) model, neural network autoregressive (NNAR) model, and trigonometric exponential smoothing state space model with Box–Cox transformation, ARMA errors, and trend and seasonal components (TBATS), together with their feasible hybrid combinations, were employed. The empirical investigation utilized quarterly GDP data at market prices from 1Q-2010 to 2Q-2024. According to the results, the hybrid models significantly outperformed the corresponding single models, handling the linear and nonlinear components of the GDP time series more effectively. Rolling-window cross-validation showed that hybrid ETS-NNAR-TBATS for Armenia, hybrid ETS-NNAR-SARIMA for Azerbaijan, and hybrid ETS-SARIMA for Georgia were the best-performing models. The forecasts also suggest that Georgia is likely to record the strongest GDP growth over the projection horizon, followed by Armenia and Azerbaijan. These findings confirm that hybrid models constitute a reliable technique for forecasting GDP in the South Caucasian countries. This region is not only economically dynamic but also strategically important, with direct implications for policy and regional planning. Full article
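The rolling-window cross-validation used above to rank models can be sketched generically; the quarterly toy series and naive seasonal forecaster below are placeholders, not the paper's SARIMA/ETS/NNAR/TBATS hybrids:

```python
import numpy as np

def rolling_origin_rmse(series, horizon, min_train, forecaster):
    """Expand the training window one step at a time and score
    h-step-ahead forecasts against the held-out observations."""
    errors = []
    for origin in range(min_train, len(series) - horizon + 1):
        train = series[:origin]
        actual = series[origin:origin + horizon]
        errors.append(forecaster(train, horizon) - actual)
    return np.sqrt(np.mean(np.square(errors)))

def seasonal_naive(train, horizon, period=4):
    """Repeat the last full seasonal cycle (quarterly data assumed)."""
    return np.resize(train[-period:], horizon)

rng = np.random.default_rng(1)
t = np.arange(60)   # 15 years of quarterly observations
gdp = 100 + 0.5 * t + 5 * np.sin(2 * np.pi * t / 4) + rng.normal(0, 0.5, 60)

rmse = rolling_origin_rmse(gdp, horizon=4, min_train=40, forecaster=seasonal_naive)
print(round(rmse, 2))
```

Swapping `seasonal_naive` for each candidate model and comparing the resulting RMSEs is the selection procedure the abstract describes.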

22 pages, 1710 KB  
Article
Machine Learning Techniques Improving the Box–Cox Transformation in Breast Cancer Prediction
by Sultan S. Alshamrani
Electronics 2025, 14(16), 3173; https://doi.org/10.3390/electronics14163173 - 9 Aug 2025
Cited by 1 | Viewed by 1394
Abstract
Breast cancer remains a major global health problem, characterized by high incidence and mortality rates. Developing accurate prediction models is essential to improving early detection and treatment outcomes. Machine learning (ML) has become a valuable resource in breast cancer prediction; however, the complexities inherent in medical data, including biases and imbalances, can hinder the effectiveness of these models. This paper explores combining the Box–Cox transformation with ML models to normalize data distributions and stabilize variance, thereby enhancing prediction accuracy. Two datasets were analyzed: a synthetic gamma-distributed dataset that simulates skewed real-world data and the Surveillance, Epidemiology, and End Results (SEER) breast cancer dataset, which displays imbalanced real-world data. Four distinct experimental scenarios were conducted on the ML models with a synthetic dataset, the SEER dataset with the Box–Cox transformation, a SEER dataset with the logarithmic transformation, and with Synthetic Minority Over-sampling Technique (SMOTE) augmentation to evaluate the impact of the Box–Cox transformation through different lambda values. The results show that the Box–Cox transformation significantly improves the performance of Artificial Intelligence (AI) models, particularly the stacking model, achieving the highest accuracy with 94.53% and 94.74% of the F1 score. This study demonstrates the importance of feature transformation in healthcare analytics, offering a scalable framework for improving breast cancer prediction and potentially applicable to other medical datasets with similar challenges. Full article
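The gamma-distributed synthetic scenario can be reproduced in miniature: generate skewed data, then compare the skewness left by a log transform against a maximum-likelihood Box–Cox fit (the distribution parameters are illustrative, not the paper's):

```python
import numpy as np
from scipy.stats import boxcox, skew

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=3.0, size=2000)   # right-skewed synthetic feature

x_log = np.log(x)
x_bc, lam = boxcox(x)   # lambda chosen by maximum likelihood

print(f"skew raw: {skew(x):.2f}, log: {skew(x_log):.2f}, "
      f"Box-Cox: {skew(x_bc):.2f} (lambda={lam:.2f})")
```

For gamma-like data the fitted lambda typically lands between 0 and 1, so a fixed log transform (lambda = 0) overshoots and leaves the data left-skewed, which is one reason the tunable Box–Cox family can outperform it.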

16 pages, 855 KB  
Article
Evaluating Time Series Models for Monthly Rainfall Forecasting in Arid Regions: Insights from Tamanghasset (1953–2021), Southern Algeria
by Ballah Abderrahmane, Morad Chahid, Mourad Aqnouy, Adam M. Milewski and Benaabidate Lahcen
Geosciences 2025, 15(7), 273; https://doi.org/10.3390/geosciences15070273 - 20 Jul 2025
Cited by 1 | Viewed by 1491
Abstract
Accurate precipitation forecasting remains a critical challenge due to the nonlinear and multifactorial nature of rainfall dynamics. This is particularly important in arid regions like Tamanghasset, where precipitation is the primary driver of agricultural viability and water resource management. This study evaluates the performance of several time series models for monthly rainfall prediction, including the autoregressive integrated moving average (ARIMA), Exponential Smoothing State Space Model (ETS), Seasonal and Trend decomposition using Loess with ETS (STL-ETS), Trigonometric Box–Cox transform with ARMA errors, Trend and Seasonal components (TBATS), and neural network autoregressive (NNAR) models. Historical monthly precipitation data from 1953 to 2020 were used to train and test the models, with lagged observations serving as input features. Among the approaches considered, the NNAR model exhibited superior performance, as indicated by uncorrelated residuals and enhanced forecast accuracy. This suggests that NNAR effectively captures the nonlinear temporal patterns inherent in the precipitation series. Based on the best-performing model, rainfall was projected for the year 2021, providing actionable insights for regional hydrological and agricultural planning. The results highlight the relevance of neural network-based time series models for climate forecasting in data-scarce, climate-sensitive regions. Full article
(This article belongs to the Section Climate and Environment)

15 pages, 1019 KB  
Article
Diagnostic Stratification of Prostate Cancer Through Blood-Based Biochemical and Inflammatory Markers
by Donatella Coradduzza, Leonardo Sibono, Alessandro Tedde, Sonia Marra, Maria Rosaria De Miglio, Angelo Zinellu, Serenella Medici, Arduino A. Mangoni, Massimiliano Grosso, Massimo Madonia and Ciriaco Carru
Diagnostics 2025, 15(11), 1385; https://doi.org/10.3390/diagnostics15111385 - 30 May 2025
Viewed by 1522
Abstract
Background: Prostate cancer (PCa) remains one of the most prevalent malignancies in men, with diagnostic challenges arising from the limited specificity of current biomarkers, like PSA. Improved stratification tools are essential to reduce overdiagnosis and guide personalized patient management. Objective: This study aimed to identify and validate clinical and hematological biomarkers capable of differentiating PCa from benign prostatic hyperplasia (BPH) and precancerous lesions (PL) using univariate and multivariate statistical methods. Methods: In a cohort of 514 patients with suspected PCa, we performed a univariate analysis (Kruskal–Wallis and ANOVA) with preprocessing via adaptive Box–Cox transformation and missing value imputation through probabilistic principal component analysis (PPCA). LASSO regression was used for variable selection and classification. An ROC curve analysis assessed diagnostic performance. Results: Five variables—age, PSA, Index %, hemoglobin (HGB), and the International Index of Erectile Function (IIEF)—were consistently significant across univariate and multivariate analyses. The LASSO regression achieved a classification accuracy of 70% and an AUC of 0.74. Biplot and post-hoc analyses confirmed partial separation between PCa and benign conditions. Conclusions: The integration of multivariate modeling with reconstructed clinical data enabled the identification of blood-based biomarkers with strong diagnostic potential. These routinely available, cost-effective indicators may support early PCa diagnosis and patient stratification, reducing unnecessary invasive procedures. Full article
(This article belongs to the Special Issue Biochemical Testing Applications in Clinical Diagnosis)

19 pages, 12800 KB  
Article
Pareto Front Transformation in the Decision-Making Process for Spectral and Energy Efficiency Trade-Off in Massive MIMO Systems
by Eni Haxhiraj, Desar Shahu and Elson Agastra
Sensors 2025, 25(5), 1451; https://doi.org/10.3390/s25051451 - 27 Feb 2025
Cited by 2 | Viewed by 1460
Abstract
This paper presents a method of choosing a single solution in the Pareto Optimal Front of the multi-objective problem of the spectral and energy efficiency trade-off in Massive MIMO (Multiple Input, Multiple Output) systems. It proposes the transformation of the group of non-dominated alternatives using the Box–Cox transformation with values of λ < 1 so that the graph with a complex shape is transformed into a concave graph. The Box–Cox transformation solves the selection bias shown by the decision-making algorithms in the non-concave part of the Pareto Front. After the transformation, four different MCDM (Multi-Criteria Decision-Making) algorithms were implemented and compared: SAW (Simple Additive Weighting), TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), PROMETHEE (Preference Ranking Organization Method for Enrichment Evaluations) and VIKOR (Vlse Kriterijumska Optimizacija Kompromisno Resenje). The simulations showed that the best value of the λ parameter is 0, and the MCDM algorithms which explore the Pareto Front completely for different values of weights of the objectives are VIKOR as well as SAW and TOPSIS when they include the Max–Min normalization technique. Full article
(This article belongs to the Special Issue Energy-Efficient Communication Networks and Systems: 2nd Edition)
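The reported optimum λ = 0 is the limiting case of the Box–Cox family in which the transform reduces to the natural logarithm; a quick check with illustrative trade-off values (not from the paper):

```python
import numpy as np
from scipy.stats import boxcox

# A geometric set of objective values standing in for one Pareto-front axis
eff = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# With lmbda fixed at 0, scipy's Box-Cox returns log(x): the geometric
# progression becomes equally spaced, i.e. the curve is straightened
transformed = boxcox(eff, lmbda=0)
print(np.allclose(transformed, np.log(eff)))          # True
print(np.allclose(np.diff(transformed), np.log(2)))   # True (equal spacing)
```

Concavifying the front this way is what removes the selection bias of MCDM scoring rules in the non-concave region.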

38 pages, 13675 KB  
Article
Advanced Hybrid Models for Air Pollution Forecasting: Combining SARIMA and BiLSTM Architectures
by Sabina-Cristiana Necula, Ileana Hauer, Doina Fotache and Luminița Hurbean
Electronics 2025, 14(3), 549; https://doi.org/10.3390/electronics14030549 - 29 Jan 2025
Cited by 5 | Viewed by 3229
Abstract
This study explores a hybrid forecasting framework for air pollutant concentrations (PM10, PM2.5, and NO2) that integrates Seasonal Autoregressive Integrated Moving Average (SARIMA) models with Bidirectional Long Short-Term Memory (BiLSTM) networks. By leveraging SARIMA’s strength in linear and seasonal trend modeling and addressing nonlinear dependencies using BiLSTM, the framework incorporates Box-Cox transformations and Fourier terms to enhance variance stabilization and seasonal representation. Additionally, attention mechanisms are employed to prioritize temporal features, refining forecast accuracy. Using five years of daily pollutant data from Romania’s National Air Quality Monitoring Network, the models were rigorously evaluated across short-term (1-day), medium-term (7-day), and long-term (30-day) horizons. Metrics such as RMSE, MAE, and MAPE revealed the hybrid models’ superior performance in capturing complex pollutant dynamics, particularly for PM2.5 and PM10. The SARIMA combined with BiLSTM, Fourier, and Attention configuration demonstrated consistent improvements in predictive accuracy and interpretability, with attention mechanisms proving effective for extreme values and long-term dependencies. This study highlights the benefits of combining statistical preprocessing with advanced neural architectures, offering a robust and scalable solution for air quality forecasting. The findings provide valuable insights for environmental policymakers and urban planners, emphasizing the potential of hybrid models for improving air quality management and decision-making in dynamic urban environments. Full article
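Fourier terms of the kind used above to encode seasonality are easy to generate; the annual period and harmonic count below are assumptions for daily pollutant data, not the paper's exact configuration:

```python
import numpy as np

def fourier_terms(n_obs, period, n_harmonics):
    """Sine/cosine regressors encoding seasonality of a given period;
    columns come in (sin_k, cos_k) pairs for harmonics k = 1..K."""
    t = np.arange(n_obs)
    cols = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# e.g. five years of daily readings, annual cycle, 3 harmonics
F = fourier_terms(n_obs=5 * 365, period=365.25, n_harmonics=3)
print(F.shape)   # (1825, 6)
```

These columns are appended as exogenous regressors so the SARIMA/BiLSTM components only need to model what the smooth seasonal basis cannot.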

42 pages, 7150 KB  
Article
LightweightUNet: Multimodal Deep Learning with GAN-Augmented Imaging Data for Efficient Breast Cancer Detection
by Hari Mohan Rai, Joon Yoo, Saurabh Agarwal and Neha Agarwal
Bioengineering 2025, 12(1), 73; https://doi.org/10.3390/bioengineering12010073 - 15 Jan 2025
Cited by 6 | Viewed by 4390
Abstract
Breast cancer ranks as the second most prevalent cancer globally and is the most frequently diagnosed cancer among women; therefore, early, automated, and precise detection is essential. Most AI-based techniques for breast cancer detection are complex and have high computational costs. Hence, to overcome this challenge, we have presented the innovative LightweightUNet hybrid deep learning (DL) classifier for the accurate classification of breast cancer. The proposed model boasts a low computational cost due to its smaller number of layers in its architecture, and its adaptive nature stems from its use of depth-wise separable convolution. We have employed a multimodal approach to validate the model’s performance, using 13,000 images from two distinct modalities: mammogram imaging (MGI) and ultrasound imaging (USI). We collected the multimodal imaging datasets from seven different sources, including the benchmark datasets DDSM, MIAS, INbreast, BrEaST, BUSI, Thammasat, and HMSS. Since the datasets are from various sources, we have resized them to the uniform size of 256 × 256 pixels and normalized them using the Box-Cox transformation technique. Since the USI dataset is smaller, we have applied the StyleGAN3 model to generate 10,000 synthetic ultrasound images. In this work, we have performed two separate experiments: the first on a real dataset without augmentation and the second on a real + GAN-augmented dataset using our proposed method. During the experiments, we used a 5-fold cross-validation method, and our proposed model obtained good results on the real dataset (87.16% precision, 86.87% recall, 86.84% F1-score, and 86.87% accuracy) without adding any extra data. Similarly, the second experiment provides better performance on the real + GAN-augmented dataset (96.36% precision, 96.35% recall, 96.35% F1-score, and 96.35% accuracy). This multimodal approach, which utilizes LightweightUNet, enhances the performance by 9.20% in precision, 9.48% in recall, 9.51% in F1-score, and a 9.48% increase in accuracy on the combined dataset. The proposed LightweightUNet model performs well thanks to its network design, GAN-based image augmentation, and multimodal training method. These results indicate strong potential for use in clinical settings. Full article
(This article belongs to the Special Issue Application of Deep Learning in Medical Diagnosis)
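The low computational cost attributed above to depth-wise separable convolution comes from factoring one k × k convolution into a depth-wise stage and a 1 × 1 point-wise stage; the channel sizes below are illustrative, not LightweightUNet's actual layer widths:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel + 1x1 pointwise mixing."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 64, 128)                  # 73728
sep = depthwise_separable_params(3, 64, 128)   # 8768
print(std, sep, f"{std / sep:.1f}x fewer weights")
```

The roughly 8x reduction in weights (and a similar reduction in multiply-accumulates) is what makes such architectures lightweight.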

14 pages, 2324 KB  
Article
Application of Statistical Methods for the Characterization of Radon Distribution in Indoor Environments: A Case Study in Lima, Peru
by Rafael Liza, Félix Díaz, Patrizia Pereyra, Daniel Palacios, Nhell Cerna, Luis Curo and Max Riva
Eng 2025, 6(1), 14; https://doi.org/10.3390/eng6010014 - 14 Jan 2025
Cited by 2 | Viewed by 1690
Abstract
This study evaluates the effectiveness of advanced statistical and geospatial methods for analyzing radon concentration distributions in indoor environments, using the district of San Martín de Porres, Lima, Peru, as a case study. Radon levels were monitored using LR-115 nuclear track detectors over three distinct measurement periods between 2015 and 2016, with 86 households participating. Detectors were randomly placed in various rooms within each household. Normality tests (Shapiro–Wilk, Anderson–Darling, and Kolmogorov–Smirnov) were applied to assess the fit of radon concentrations to a log-normal distribution. Additionally, analysis of variance (ANOVA) was used to evaluate the influence of environmental and structural factors on radon variability. Non-normally distributed data were normalized using a Box–Cox transformation to improve statistical assumptions, enabling subsequent geostatistical analyses. Geospatial interpolation methods, specifically Inverse Distance Weighting (IDW) and Kriging, were employed to map radon concentrations. The results revealed significant temporal variability in radon concentrations, with geometric means of 146.4 Bq·m3, 162.3 Bq·m3, and 150.8 Bq·m3, respectively, across the three periods. Up to 9.5% of the monitored households recorded radon levels exceeding the safety threshold of 200 Bq·m3. Among the interpolation methods, Kriging provided a more accurate spatial representation of radon concentration variability compared to IDW, allowing for the precise identification of high-risk areas. This study provides a framework for using advanced statistical and geospatial techniques in environmental risk assessment. Full article
(This article belongs to the Section Chemical, Civil and Environmental Engineering)
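Of the two interpolators compared above, IDW is simple enough to sketch in a few lines (the site coordinates and radon readings are invented; Kriging, which the study found more accurate, additionally requires fitting a variogram and is omitted):

```python
import numpy as np

def idw(xy_obs, values, xy_query, power=2.0, eps=1e-12):
    """Inverse Distance Weighting: each prediction is a weighted mean of
    the observations, with weights 1 / distance**power."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_obs[None, :, :], axis=-1)
    w = 1.0 / (d + eps) ** power
    return (w @ values) / w.sum(axis=1)

# Toy indoor radon readings (Bq/m^3) at four monitoring sites
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
radon = np.array([120.0, 180.0, 140.0, 200.0])

query = np.array([[0.5, 0.5], [0.0, 0.0]])   # grid cell centre; on a site
est = idw(sites, radon, query)
print(np.round(est, 1))
```

At the central point all sites get equal weight (the plain mean, 160); at a monitored location the prediction collapses onto that site's reading, which is the exact-interpolation property IDW shares with Kriging.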
