Article

A Two-Stage Feature Reduction (FIRRE) Framework for Improving Artificial Neural Network Predictions in Civil Engineering Applications

1 School of Data Science, The Chinese University of Hong Kong, Shenzhen 518172, China
2 College of Civil Engineering, Fuzhou University, Fuzhou 350116, China
3 Department of Computer Science & Engineering, University of Minnesota, Twin Cities, Minneapolis, MN 55455, USA
4 Department of Civil, Environmental, and Geo-Engineering, University of Minnesota, Twin Cities, Minneapolis, MN 55455, USA
* Author to whom correspondence should be addressed.
Infrastructures 2026, 11(1), 29; https://doi.org/10.3390/infrastructures11010029
Submission received: 24 November 2025 / Revised: 3 January 2026 / Accepted: 12 January 2026 / Published: 16 January 2026

Abstract

Artificial neural networks (ANNs) are widely used in engineering prediction, but excessive input dimensionality can reduce both accuracy and efficiency. This study proposes a two-stage feature-reduction framework, Feature Importance Ranking and Redundancy Elimination (FIRRE), to optimize ANN inputs by removing weakly informative and redundant variables. In Stage 1, four complementary ranking methods, namely Pearson correlation, recursive feature elimination, random forest importance, and F-test scoring, are combined into an ensemble importance score. In Stage 2, highly collinear features (ρ > 0.95) are pruned while retaining the more informative variable in each pair. FIRRE is evaluated on 32 civil engineering datasets spanning materials, structural, and environmental applications, and benchmarked against Principal Component Analysis, variance-threshold filtering, random feature selection, and K-means clustering. Across the benchmark suite, FIRRE consistently achieves competitive or improved predictive performance while reducing input dimensionality by 40% on average and decreasing computation time by 10–60%. A dynamic modulus case study further demonstrates its practical value, improving R2 from 0.926 to 0.966 while reducing inputs from 25 to 7. Overall, FIRRE provides a practical, robust framework for simplifying ANN inputs and improving efficiency in civil engineering prediction tasks.

1. Introduction

Artificial Neural Networks (ANNs) have been widely used in engineering to predict material properties and structural responses. The involvement of numerous components, such as material constituents or structural elements, coupled with varying environmental and testing conditions, makes such applications inherently complex. To achieve accurate predictions, researchers often include as many input features as possible to comprehensively characterize the system. However, high-dimensional input data increase computational cost and training time and typically require larger datasets to achieve satisfactory performance. Moreover, excessive or redundant input variables can lead to overfitting, thereby reducing prediction accuracy. A common practice is to carefully select input variables and identify the most important and influential features [1]. Previous studies have shown that using too many input parameters can make a predictive model unnecessarily complex, whereas using too few may omit critical information [2]. This trade-off underscores the importance of dimensionality reduction in ANN models for engineering applications.
Beyond predictive accuracy, reducing input parameters is motivated by the growing emphasis on sustainable and resource-efficient modeling in civil engineering. Reducing input dimensionality can simplify model structure and lower computational demand during training and inference, which in turn reduces runtime and energy use for data-driven prediction workflows. Such computational efficiency aligns with the principles of Green AI by prioritizing performance per unit of compute and enabling faster, lower-cost design iterations for engineering materials and systems, including low-carbon concretes and durable pavement structures [3,4].
A common strategy for reducing input dimensionality is feature selection, which identifies the most informative subset of variables. Recent engineering studies have applied feature selection to enhance ANN model performance. Zhu and Wang [5] adopted an improved Relief algorithm to automatically identify the most relevant bridge attributes for a deep-learning model, enabling accurate bridge condition forecasts up to four years ahead. Liu et al. [6] used random forest feature importance and LASSO regression to select key factors influencing process-induced deformation of composite structures. Their ANN-based framework achieved a 98% reduction in computational time with less than 5% loss in accuracy, showing that under limited computing resources, focusing on the top-ranked variables yields nearly the same accuracy as using the full set. Khalaf et al. [7] integrated a genetic algorithm for input selection in predicting shear connector strength. Out of ten candidate variables, the optimal ANN model required only seven inputs while maintaining high accuracy (R2 = 0.96). Other feature selection techniques frequently used in engineering include correlation filtering [8], sensitivity analysis [9,10], rank-correlation analysis [11], and metaheuristic approaches such as particle swarm optimization and firefly algorithms [12,13]. Bezerra et al. recently compared conventional feature selection methods and found that, although methods like PCA and random forest often enhance performance, they may perform differently depending on downstream models and task characteristics, especially when high-variance components obscure task-relevant patterns, underscoring that feature selection is not universally beneficial without careful method choice [14]. Similarly, Heidary et al. showed that dimensionality reduction can improve computational efficiency in classification tasks but that aggressive reduction may degrade generalization on unseen data if relevant structures are discarded [15]. Collectively, these studies demonstrate that selecting a relevant subset of inputs enhances model robustness and interpretability while reducing overfitting risks associated with high-dimensional data, but also that the choice and tuning of selection techniques critically influence outcomes.
In addition to selecting a subset of existing features, another approach is to transform the input space into a lower-dimensional representation. Principal Component Analysis (PCA) is widely used in engineering machine learning studies as a dimensionality reduction technique that generates new, uncorrelated features capturing the largest variance in the data. Wan et al. [16] examined the effect of dimensionality reduction on predicting concrete compressive strength using eight original input variables. Performance was compared among models using all features, six PCA-derived features, and six manually chosen features. The results showed that the ANN with PCA-derived features improved test accuracy from an R2 of 0.913 to 0.934 by mitigating noise and multicollinearity. Although prediction accuracy decreased slightly with fewer inputs, training became noticeably faster. Similarly, Sun et al. [17] applied a PCA-ANN model to predict frozen soil strength and found that using only four to five principal components, which captured 90–95% of the data variance, produced predictions nearly identical to those using the full feature set. However, since PCA is an unsupervised reduction method, it does not always guarantee improved model performance; it primarily removes linear correlations and noise. For instance, in the same study [16], an extreme gradient boosting model performed worse with PCA-reduced features than with the original inputs, likely because the discarded components contained nonlinear information useful for prediction.
While existing feature selection and transformation techniques have shown promise in improving ANN performance, they also exhibit several limitations. Most feature selection methods rely heavily on model-specific importance metrics (e.g., weight sensitivity analysis in ANNs or permutation-based feature importance) or optimization heuristics (e.g., genetic algorithms and particle swarm optimization). These approaches may overlook feature redundancy and inter-feature correlations, because such metrics typically evaluate features individually or optimize a global objective without explicitly accounting for dependencies among features. Conversely, transformation-based approaches such as PCA eliminate collinearity but disregard nonlinear dependencies and lack interpretability, as the transformed components no longer represent physical variables. Moreover, these methods are often applied in isolation (i.e., feature selection and transformation are used separately rather than in combination), which can result in inconsistent improvements across datasets and problem types.
Recent advances in embedded and ensemble-based feature selection have provided additional tools for handling high-dimensional inputs. For instance, SHapley Additive exPlanations (SHAP)-based selection methods leverage game-theoretic attribution to quantify each feature’s contribution to model output, offering strong interpretability and the ability to capture nonlinear interactions [18]. However, SHAP values are computed on a feature-by-feature basis and may still retain redundant predictors that exhibit high mutual correlation, which can negatively influence ANN training stability. Similarly, XGBoost-based feature selection has gained popularity due to its capacity to model nonlinear relationships and rank features through split-based gain metrics [19]. Yet, the tree-boosting mechanism tends to favor features that perform well in early splits, and its importance scores do not explicitly account for inter-feature dependencies or multicollinearity. As recent studies have noted, the lack of redundancy-aware mechanisms in these modern methods can lead to inconsistent selection behavior across datasets with varied feature interactions [20]. These limitations highlight the continued need for feature reduction frameworks that integrate multiple selection criteria while explicitly addressing feature redundancy.
To address these shortcomings, this study develops a robust two-stage input reduction framework that combines statistical and model-based feature selection with autocorrelation-based redundancy elimination. The objective is to evaluate this framework across 32 publicly available datasets; assess its stability, sensitivity, and efficiency; and further demonstrate its effectiveness through a case study on ANN-based prediction performance.

2. Data and Engineering Context

2.1. Datasets

Thirty-two publicly available datasets across diverse engineering domains were collected. As summarized in Table 1, these datasets include studies related to material strength, foundation settlement, structural displacement, and hydrological prediction, among others. Most datasets are relatively modest in scale, with a median size of approximately 1100 samples, while the distribution is skewed by a small number of very large datasets exceeding 200,000 samples. Specifically, 11 datasets contain fewer than 500 records, 13 fall in the 500 to 10,000 range, and 8 exceed 10,000, with the largest reaching 236,505 samples. When sample size is normalized by the number of inputs, four datasets have fewer than 20 samples per input variable, suggesting a higher risk of overfitting without dimensionality reduction or regularization. These findings indicate that while most datasets are well suited for relatively simple ANN models, the smaller or higher-dimensional cases may require dimension reduction strategies to ensure reliable prediction.

2.2. Curation and Preprocessing

Prior to feature selection, each dataset was carefully curated and subjected to a series of preprocessing procedures to ensure data quality and analytical reliability. All variables were converted into numerical formats, and any non-numeric or invalid entries were treated as missing values [44]. Records containing missing data were subsequently removed to eliminate incomplete observations. Furthermore, data instances exhibiting autocorrelation were excluded to maintain logical consistency within the dataset.
Following these steps, normalization was applied to standardize the scale of the variables and improve the robustness of the neural network training process [45]. Input features were standardized to have zero mean and unit variance, while the output variable was rescaled to a range between 0 and 1. This normalization procedure ensured balanced feature contributions during optimization and enhanced the overall accuracy and stability of model training.
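As an illustration, the preprocessing described above can be sketched as follows; the use of scikit-learn scalers is an assumed implementation choice, since the text specifies only the transformations themselves.

```python
# A minimal preprocessing sketch, assuming scikit-learn scalers; the paper
# specifies only the transformations, not the library used.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

def preprocess(X_raw, y_raw):
    X = np.asarray(X_raw, dtype=float)
    y = np.asarray(y_raw, dtype=float).reshape(-1, 1)
    # Drop records with missing (NaN) entries in either the inputs or the output
    mask = ~(np.isnan(X).any(axis=1) | np.isnan(y).ravel())
    X, y = X[mask], y[mask]
    X = StandardScaler().fit_transform(X)  # inputs: zero mean, unit variance
    y = MinMaxScaler().fit_transform(y)    # output: rescaled to [0, 1]
    return X, y.ravel()
```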

2.3. Artificial Neural Network Architecture

To ensure consistency and comparability across datasets, all experiments in this study were conducted using a unified artificial neural network (ANN) architecture. The network adopts a feed-forward fully connected design comprising an input layer, two hidden layers, and a single-node output layer for regression prediction. The input layer size corresponds to the number of retained input variables after dimensionality reduction. The first hidden layer contains 64 neurons, followed by a Rectified Linear Unit (ReLU) activation and batch normalization to stabilize training and mitigate internal covariate shift. The second hidden layer consists of 32 neurons with ReLU activation, providing additional nonlinear representation capability. The final output layer contains one neuron producing the predicted response variable.
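For concreteness, a minimal sketch of this architecture is given below. Only the layer sizes, activations, and the placement of batch normalization after the first ReLU follow the description above; the choice of PyTorch and all training details are assumptions.

```python
# A minimal PyTorch sketch of the unified ANN described above; the framework
# itself is an assumption, since the paper does not name its implementation.
import torch.nn as nn

def build_ann(n_inputs: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Linear(n_inputs, 64),  # first hidden layer (64 neurons)
        nn.ReLU(),
        nn.BatchNorm1d(64),       # batch normalization to stabilize training
        nn.Linear(64, 32),        # second hidden layer (32 neurons)
        nn.ReLU(),
        nn.Linear(32, 1),         # single-node regression output
    )
```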
This architecture represents a moderate-complexity network that balances learning capacity and generalization ability. It is sufficiently expressive to model nonlinear relationships in medium- and large-scale datasets, while its relatively compact structure helps reduce the risk of overfitting in smaller datasets. By employing the same ANN configuration for all datasets, a standardized and fair evaluation framework is established, allowing the effects of dimensionality reduction strategies to be assessed independently of architectural variability.

3. Feature Importance Ranking and Redundancy Elimination (FIRRE) Methods

To improve the optimization efficiency and interpretability of high-dimensional input features in ANN, a two-stage feature reduction framework, termed Feature Importance Ranking and Redundancy Elimination (FIRRE), is proposed. As presented in Figure 1, FIRRE operates in two sequential steps, namely feature ranking based on importance scores, and redundancy removal using pairwise autocorrelation analysis.
In the first stage, four feature selection methods are employed to quantify the importance of the original input features. These methods are selected to evaluate feature relevance to the target variable from both statistical and model-based perspectives. The four methods are as follows:
(a) Pearson Correlation Analysis. This method evaluates the linear association between each input feature and the target variable and is commonly used for preliminary variable screening. The Pearson correlation coefficient is calculated using Equation (1), where x denotes the feature, y represents the target variable, and n is the number of samples in the dataset.
$$\rho_{x,y} = \frac{\mathrm{Cov}(x,y)}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} \quad (1)$$
(b) Recursive Feature Elimination (RFE). RFE evaluates feature importance by iteratively training a model and removing the least important features to retain an optimal subset. In this study, a Support Vector Machine is adopted as the base estimator. At each iteration, the features with the lowest scores are excluded, and the process is repeated until only the top 60% of the features are retained.
(c) Random Forest. This approach ranks and selects features based on importance scores obtained from a trained random forest model. The Mean Decrease in Impurity (MDI) criterion is employed to quantify the average reduction in node impurity resulting from splits on a given feature across all trees in the ensemble, as presented in Equation (2). In this equation, ΔImpurity(s) denotes the decrease in impurity from a split s, and T represents the total number of trees in the ensemble.
$$\mathrm{Importance}(x_j) = \frac{1}{T} \sum_{t=1}^{T} \sum_{s \in t} \mathbb{I}\left[s \text{ splits on } x_j\right] \, \Delta\mathrm{Impurity}(s) \quad (2)$$
(d) SelectKBest. This method evaluates features individually using statistical hypothesis testing. As shown in Equation (3), the F-test for regression is used as the scoring function to quantify the linear dependency between each feature and the continuous target variable, where R2 denotes the coefficient of determination of the univariate regression and n represents the number of samples. The top 60% of features with the highest F-scores are then retained for subsequent analysis.
$$F = \frac{R^2 / 1}{(1 - R^2)/(n - 2)} \quad (3)$$
The four feature ranking methods were selected to provide complementary assessments of input importance. Pearson correlation captures linear statistical dependence between individual inputs and the target variable, while the F-test evaluates group-wise variance-based relevance [46]. In contrast, RFE-SVM and Random Forest provide model-based rankings that account for nonlinear relationships and feature interactions [47]. Combining these methods improves robustness by reducing the bias associated with any single criterion.
The scores obtained from the four feature selection methods are first normalized and combined using a weighted average to produce an integrated importance score. Features accounting for the top 60% of cumulative importance are retained for the next stage. Such ensemble scoring strategies are widely applied in feature selection research and have been shown to outperform single-method approaches [48,49,50].
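A minimal sketch of this Stage 1 ensemble scoring is shown below. Equal weights across the four normalized scores and the specific scikit-learn estimators are assumptions, not reported settings.

```python
# A minimal Stage 1 sketch; equal weighting of the four normalized scores
# and the scikit-learn estimators are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, f_regression
from sklearn.svm import SVR

def stage1_importance(X, y):
    """Return one ensemble importance score per feature (higher = more important)."""
    n = X.shape[1]

    # (a) Absolute Pearson correlation between each feature and the target
    pearson = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n)])

    # (b) RFE with a linear SVM: convert elimination ranks (1 = best) to scores
    rfe = RFE(SVR(kernel="linear"), n_features_to_select=1).fit(X, y)
    rfe_score = (n - rfe.ranking_) / (n - 1)

    # (c) Random forest Mean Decrease in Impurity
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

    # (d) Univariate F-test scores
    f_score, _ = f_regression(X, y)

    def minmax(s):  # normalize each score vector to [0, 1]
        s = np.asarray(s, dtype=float)
        return (s - s.min()) / (s.max() - s.min() + 1e-12)

    scores = [pearson, rfe_score, rf.feature_importances_, f_score]
    return np.mean([minmax(s) for s in scores], axis=0)  # equal-weight average
```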
In the first stage, the primary objective is to evaluate the relationship between each feature and the target variable. In the second stage, the focus shifts to examining inter-feature relationships to mitigate high collinearity, which can cause gradient instability and model overfitting. To identify and remove redundant features, a pairwise Pearson correlation matrix is constructed using the features retained from the first stage. A correlation threshold of ρ > 0.95 is applied to detect strongly correlated feature pairs. For each correlated pair, the feature with the lower importance score is discarded, ensuring that the more informative variable is preserved. This process adheres to the low-collinearity principle commonly recommended in variable selection for regression and neural-network models. By eliminating highly correlated variables, this procedure not only reduces redundancy but also compacts the feature space, stabilizes model training, and enhances interpretability. It effectively mitigates the distortive effects of multicollinearity, which are particularly critical for models sensitive to input scaling and weight updates.
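The Stage 2 pruning rule can be sketched as follows. The ρ > 0.95 threshold comes from the text; the order in which correlated pairs are examined is an assumption.

```python
# A minimal Stage 2 sketch: drop the less important member of every feature
# pair with |Pearson correlation| above the threshold (pair ordering assumed).
import numpy as np

def stage2_prune(X, importance, threshold=0.95):
    """Return indices of features retained after redundancy elimination."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = set(range(X.shape[1]))
    pairs = [(i, j) for i in range(X.shape[1]) for j in range(i + 1, X.shape[1])]
    for i, j in sorted(pairs, key=lambda p: -corr[p]):  # strongest pairs first
        if corr[i, j] > threshold and i in keep and j in keep:
            keep.discard(i if importance[i] < importance[j] else j)
    return sorted(keep)
```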
The proposed FIRRE framework integrates multi-source feature importance evaluation with a structured correlation-based pruning mechanism. This hybrid approach, grounded in both theoretical rationale and empirical evidence, yields a more compact, stable, and interpretable feature set for downstream neural-network modeling. Comparable methodologies have been widely adopted in high-dimensional data analysis.
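Putting the two stages together, a hypothetical end-to-end call might look like the following; X and y are randomly generated placeholders, and the 60% cumulative-importance cut follows the description above.

```python
# Hypothetical end-to-end FIRRE reduction using the two sketches above;
# X and y are placeholder data, not one of the study's datasets.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 25))  # placeholder inputs
y = rng.normal(size=200)        # placeholder target

scores = stage1_importance(X, y)
order = np.argsort(scores)[::-1]
cum = np.cumsum(scores[order]) / scores.sum()
stage1_idx = order[: np.searchsorted(cum, 0.60) + 1]  # top 60% cumulative importance
stage2_idx = stage2_prune(X[:, stage1_idx], scores[stage1_idx], threshold=0.95)
X_reduced = X[:, stage1_idx][:, stage2_idx]           # final FIRRE-reduced inputs
```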

4. Cross-Dataset Comparative Evaluation

4.1. Comparison with Existing Methods

This section presents a cross-dataset comparative evaluation of FIRRE against widely used dimensionality reduction and feature selection approaches, namely Random Feature Selection (RFS), Principal Component Analysis (PCA), Variance-Threshold Filtering (VTF), and K-means clustering. For each of the 32 datasets, each method was applied within an identical preprocessing and training pipeline to reduce the input dimensions. The predictive performance of the ANN was assessed on held-out data using the coefficient of determination (R2) and root mean square error (RMSE); higher R2 and lower RMSE indicate better performance. The aggregated outcomes across datasets, together with per-dataset results, are presented in Figure 2 and Figure 3.
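The held-out scoring used throughout this evaluation can be sketched as follows; the 80/20 split ratio and the fixed seed are assumptions, since only the use of held-out data is stated.

```python
# A minimal held-out evaluation sketch; the 80/20 split is an assumption.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

def holdout_scores(fit_predict, X, y, seed=0):
    """fit_predict(X_train, y_train, X_test) -> predictions on X_test."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    y_pred = fit_predict(X_tr, y_tr, X_te)
    rmse = float(np.sqrt(mean_squared_error(y_te, y_pred)))
    return r2_score(y_te, y_pred), rmse
```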
In terms of R2, FIRRE ranked 1st in 18 out of 32 datasets and ranked in the top 2 in 25 out of 32 datasets, demonstrating superior or highly competitive predictive accuracy across the majority of datasets. Regarding RMSE, FIRRE ranked 1st in 20 out of 32 datasets and ranked in the top 2 in 26 out of 32 datasets, indicating consistently low prediction errors and strong generalization performance. These rankings demonstrate FIRRE's robust performance, with the method achieving first or second place in approximately 80% of all datasets for both evaluation metrics. Notably, FIRRE maintained competitive performance even in datasets where it did not rank first, rarely falling below third place, which highlights its stability and reliability across diverse data characteristics and dimensionality challenges.

4.2. Comparison with Novel Methods

This section presents a cross-dataset comparative evaluation of FIRRE against advanced machine learning-based feature selection approaches, namely the Least Absolute Shrinkage and Selection Operator (LASSO) [51], Extreme Gradient Boosting feature importance (XGB) [19], the Boruta algorithm [52], and SHapley Additive exPlanations (SHAP) [18]. For each of the 32 datasets, each method was applied within an identical preprocessing and training pipeline to reduce the input dimensions. The predictive performance of the ANN was assessed on held-out data using R2 and RMSE. The aggregated outcomes across datasets, together with per-dataset results, are presented in Figure 4 and Figure 5.
The comparative analysis reveals that FIRRE demonstrates highly competitive performance against these advanced machine-learning-based feature selection methods. In Figure 4, FIRRE maintains consistently high R2 values across most datasets, exhibiting particular stability in challenging scenarios where competing methods show performance degradation. In datasets 2, 7, 13, 20, and 23, where several methods experience marked drops in predictive accuracy (R2 below 0.7 or 0.6), FIRRE sustains robust performance above these thresholds. While LASSO, XGB, Boruta, and SHAP occasionally achieve comparable or superior results on datasets with moderate complexity, FIRRE demonstrates greater consistency across the diverse data characteristics in the benchmark suite.
Regarding RMSE, FIRRE achieves particularly pronounced advantages in high-dimensional datasets (29–32), where errors reach the 10⁻¹ to 10⁻⁴ scale, compared with 10⁰ to 10¹ or higher for other methods. This performance gap suggests that FIRRE's two-stage refinement mechanism is especially effective for complex problems where feature interactions and redundancies pose greater challenges. Quantitatively, FIRRE ranked 1st in R2 for 15 out of 32 datasets and in the top 2 for 23 out of 32 datasets (72%). For RMSE, FIRRE ranked 1st in 17 datasets and in the top 2 for 24 datasets (75%). Even when not ranking first, FIRRE typically remains within the top three methods, revealing its reliability and competitiveness against state-of-the-art feature selection approaches.

4.3. Stability Analysis

To further assess the robustness of the proposed FIRRE framework, a stability analysis was conducted to evaluate its consistency across different datasets compared with eight competing methods, namely RFS, PCA, VTF, K-means, LASSO, XGB, Boruta, and SHAP. While previous sections demonstrated that FIRRE outperforms other methods in most cases, this analysis focuses on evaluating stability. Although FIRRE is not always the best-performing approach for every single dataset, it exhibits markedly superior stability. In contrast, several of the competing methods occasionally achieve higher accuracy on certain datasets but perform poorly on others, indicating strong dataset dependency and weak generalization capability. This instability severely limits their practical applicability, whereas FIRRE maintains consistently good performance across all datasets.
To quantitatively evaluate this stability, the relative errors of R2 and RMSE between the dimension-reduced and original results were computed for five representative datasets, using Equations (4) and (5).
$$\mathrm{Relative\ Error}_{R^2} = \begin{cases} \dfrac{R^2_{raw} - R^2_{dim}}{R^2_{raw}}, & R^2_{dim} < R^2_{raw} \\[6pt] 0, & \text{otherwise} \end{cases} \quad (4)$$
$$\mathrm{Relative\ Error}_{RMSE} = \begin{cases} \dfrac{RMSE_{dim} - RMSE_{raw}}{RMSE_{raw}}, & RMSE_{dim} \geq RMSE_{raw} \\[6pt] 0, & \text{otherwise} \end{cases} \quad (5)$$
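A direct transcription of these two piecewise definitions is sketched below; the branch assignment (zero whenever reduction improves the metric) is inferred from the parallel structure of Equation (5).

```python
# Relative errors per Equations (4) and (5): zero whenever dimensionality
# reduction improves the metric (branch assignment inferred, see text).
def rel_err_r2(r2_dim, r2_raw):
    return (r2_raw - r2_dim) / r2_raw if r2_dim < r2_raw else 0.0

def rel_err_rmse(rmse_dim, rmse_raw):
    return (rmse_dim - rmse_raw) / rmse_raw if rmse_dim >= rmse_raw else 0.0
```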
As shown in Figure 6, two boxplots were drawn to illustrate the distributions of relative errors in R2 and RMSE for all methods, with corresponding variances calculated for quantitative comparison. FIRRE achieved the lowest median relative error and the smallest variance for both metrics, indicating minimal and consistent accuracy loss after feature reduction. For R2, FIRRE demonstrated a variance of σ2 = 0.018, substantially lower than almost all competing methods. Among the baseline methods, RFS (σ2 = 0.079) and K-means (σ2 = 0.086) exhibited the highest variances, representing the poorest stability. Notably, the advanced machine learning methods showed mixed results: while LASSO achieved competitive stability (σ2 = 0.017), XGB (σ2 = 0.023), Boruta (σ2 = 0.016), and SHAP (σ2 = 0.020) demonstrated modest improvements over traditional methods but still exhibited comparable or greater variability than FIRRE. For RMSE, FIRRE again demonstrated superior consistency with σ2 = 8272.336, considerably lower than RFS (σ2 = 52,764.258), K-means (σ2 = 57,622.798), and even the advanced methods such as XGB (σ2 = 12,812.966) and Boruta (σ2 = 30,185.304). LASSO (σ2 = 34,070.728) and SHAP (σ2 = 8744.576) showed relatively better stability among competing methods but remained less consistent than FIRRE.
These findings confirm that FIRRE provides highly stable and reliable dimensionality reduction performance across diverse datasets. Its minimal fluctuation in predictive accuracy demonstrates strong generalization and robustness. While some advanced machine learning methods (particularly LASSO and SHAP) approach FIRRE’s stability in certain metrics, FIRRE maintains consistently superior performance across both R2 and RMSE evaluations. This balanced stability makes FIRRE a more dependable choice for applications in engineering practices where consistent performance across varied data characteristics is essential.

4.4. Stage-Wise Analysis of FIRRE

To evaluate the contribution of each reduction stage within the proposed FIRRE framework, a stage-wise analysis was conducted across all 32 datasets. Since FIRRE comprises only two sequential stages, this analysis serves as a focused ablation study examining the incremental contribution of each stage. Three configurations were compared: (1) RFS, which serves as a baseline by randomly selecting features to match the same reduced dimensionality as FIRRE; (2) Stage 1 only, applying importance screening to remove weakly correlated features; and (3) the full FIRRE framework, combining both stages. Figure 7 presents the comparison through mean values, standard error bars, and individual dataset distributions for R2 and RMSE.
Overall, both Stage 1 and FIRRE yield positive relative gains in R2 and negative relative changes in RMSE, demonstrating systematic improvement over the random baseline. Stage 1 importance screening removes variables that have weak correlations with the target, leading to a clear median R2 improvement of 48.1% and a corresponding RMSE reduction of 21.3%. These improvements indicate that eliminating non-informative inputs suppresses noise propagation within the ANN and enhances generalization stability.
The full FIRRE framework further improves predictive performance by pruning redundant features with high pairwise correlation. Although the gains beyond Stage 1 are smaller in magnitude, the average R2 improvement reaches almost 50% relative to RFS, and RMSE shows an additional 20% decrease on average. The reduction in performance variance also highlights FIRRE’s enhanced robustness and consistency across datasets.
The scattered dots beside each bar reveal that, while a few datasets exhibit only moderate gains, the overwhelming majority cluster above the RFS baseline, confirming that both stages contribute positively in most cases. Taken together, these findings verify that Stage 1 provides the primary accuracy enhancement through relevance screening, whereas Stage 2 delivers additional refinement and stability through redundancy elimination. The two-stage procedure thus ensures reliable accuracy gains relative to random selection while achieving substantial dimensionality reduction and improved efficiency.

5. Sensitivity and Efficiency Analysis

In this section, the results of applying FIRRE to the 32 datasets are presented and analyzed, as summarized in Figure 8.
As presented in Figure 8, the FIRRE method led to a clear reduction in input dimensionality across the 32 datasets while maintaining prediction accuracy. The number of input variables decreased substantially, with a median reduction of approximately 35% and an average reduction of around 40%. More than half of the datasets were reduced to five or fewer inputs, indicating that FIRRE effectively identified and removed redundant or weakly correlated features without compromising representational capacity. With respect to prediction performance, R2 remained largely stable after applying the FIRRE method. Across all datasets, the mean change in R2 was only −0.008, and the median change was −0.001, revealing minimal overall deviation. Moreover, 14 datasets exhibited improved R2, while 18 exhibited slight decreases, most within ±0.2. The largest observed improvement was an increase of 0.1, whereas the largest decline was −0.11. These results confirm that FIRRE achieves a substantial simplification of model inputs with negligible impact on prediction accuracy, which enhances both the interpretability and computational efficiency of ANN models in engineering applications.

5.1. Sensitivity Analysis

In Figure 9, a bubble plot provides a comprehensive visualization of the relationship between input parameter reduction, predictive accuracy, and dataset characteristics after applying FIRRE. Along the horizontal axis, input reduction ranges from 20% to 70%, showing that most datasets underwent moderate dimensionality reduction. The vertical distribution of ΔR2 ranges from −0.12 to 0.10, with more than 30% of the datasets concentrated around zero and around 25% of the datasets above zero. This indicates that most datasets maintained nearly identical, and in some cases better, predictive accuracy after feature reduction. Only 18% of the datasets exhibit a notable decrease in predictive accuracy, with ΔR2 declines exceeding 0.05, reflecting moderate decline. The color gradient, representing ΔRMSE, shows mostly yellow hues near the center, corresponding to minimal changes in prediction error. A few darker colored bubbles with higher input reduction ratios indicate cases where prediction error decreases notably, implying that more aggressive reduction occasionally enhanced model generalization. The purple circle corresponds to Dataset 23 (N = 12,000) and shows a large negative ΔRMSE, indicating a substantial reduction in RMSE after applying FIRRE. Although ΔR2 is slightly negative, RMSE decreases, suggesting that reducing the inputs from 7 to 3 may remove some predictive information captured by R2 while still improving average prediction accuracy as measured by RMSE.
Regarding the effect of sample size on the application of FIRRE, Figure 10 illustrates the relationship between dataset size, initial input dimensionality, and the change in predictive performance (ΔR2) after applying FIRRE. Overall, most datasets cluster around ΔR2 = 0, reaffirming that FIRRE generally maintains model accuracy regardless of dataset scale or input complexity. Moreover, the marker shapes reveal useful trends across input dimensionality groups. Datasets with fewer than 8 inputs predominantly exhibit ΔR2 values close to zero or slightly negative, suggesting that when input dimensionality is already low, further reduction provides limited benefit and may occasionally result in minor information loss. Datasets with moderate inputs show small variations on both sides of the zero line, indicating a balanced effect where FIRRE selectively improves or slightly reduces accuracy depending on the correlation structure among variables. In contrast, datasets with more than 15 inputs show more frequent positive ΔR2 values, including the highest gains observed. This trend suggests that FIRRE is particularly effective for higher-dimensional problems, where redundant features are more prevalent and their removal improves model generalization.
Across all dataset sizes, from small laboratory datasets to large scale numerical simulations, the results confirm that FIRRE performs consistently, with no clear dependence of ΔR2 on sample size. Instead, the primary factor influencing performance gain is the initial input dimensionality, where the higher the number of original features, the more likely FIRRE enhances predictive accuracy. Overall, both Figure 9 and Figure 10 illustrate that FIRRE consistently reduces input dimensionality without significantly sacrificing prediction accuracy, and in some instances, improved model efficiency and stability by eliminating redundant or noisy features.

5.2. Efficiency Analysis

5.2.1. Predictive Efficiency

To further evaluate the impact of feature reduction on model performance, two predictive efficiency indicators were proposed. These two indicators quantify the change in prediction accuracy or RMSE relative to the proportion of inputs removed. Specifically, the efficiency of feature reduction on R2, ER2, and the efficiency of feature reduction on RMSE, ERMSE, were computed using Equations (6) and (7), where ER2 > 0 indicates an improvement in model accuracy per fraction of input reduction, and ERMSE < 0 represents a reduction in the RMSE per fraction of input reduction. These normalized indicators provide a fair basis for comparing datasets with different input dimensions and reduction ratios, allowing a more consistent interpretation of FIRRE’s predictive performance. The results are presented in Figure 11 and Figure 12.
$$E_{R^2} = \frac{\Delta R^2}{\mathrm{Input\ parameters\ removed}} \times 100\% \quad (6)$$
$$E_{RMSE} = \frac{\Delta RMSE}{\mathrm{Input\ parameters\ removed}} \times 100\% \quad (7)$$
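These two indicators are simple ratios and can be transcribed directly; here frac_removed denotes the fraction of input parameters removed, an assumed normalization of the denominator.

```python
# Direct transcription of Equations (6) and (7); frac_removed is the fraction
# of input parameters removed (assumed form of the denominator).
def e_r2(delta_r2, frac_removed):
    return delta_r2 / frac_removed * 100.0    # > 0: accuracy gained per unit reduction

def e_rmse(delta_rmse, frac_removed):
    return delta_rmse / frac_removed * 100.0  # < 0: error reduced per unit reduction
```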
As presented in Figure 11, ER2 exhibits considerable variation across datasets but demonstrates a balanced distribution around zero for the majority of cases. Positive ER2 values are observed primarily in datasets with high-dimensional inputs: dataset 2 (27 inputs) attains an ER2 of 0.09, while datasets 21 and 22 (27 inputs each) attain 0.27 and 0.25, respectively. This indicates that FIRRE successfully identifies and removes redundant or weakly correlated inputs, thereby improving generalization. In contrast, slightly negative ER2 values occur mostly in datasets with low-dimensional inputs, where further reduction can remove a small portion of informative variables and lead to marginal declines in R2. Overall, most datasets exhibit modest efficiency magnitudes, demonstrating that FIRRE achieves a balanced performance between simplification and predictive accuracy.
As presented in Figure 12, ERMSE reveals a similar pattern. Approximately 16 out of 32 datasets show negative ERMSE values, suggesting that FIRRE not only maintains but actively improves prediction accuracy by mitigating overfitting and reducing noise sensitivity. These improvements are most evident in datasets with many inputs, where redundancy is high. Conversely, a few smaller datasets show positive ERMSE values, indicating slightly higher errors after reduction; however, these increases remain limited in magnitude. Overall, the efficiency analysis demonstrates that FIRRE delivers consistent and stable predictive performance across diverse datasets, simultaneously achieving input simplification and enhanced model generalization.

5.2.2. Computational Efficiency

To further evaluate the computational benefits of input reduction, a time improvement indicator was used to quantify the change in computation time before and after applying FIRRE. This indicator is calculated using Equation (8), where positive values indicate a reduction in total computation time, reflecting improved computational efficiency after feature reduction. Computation time includes the complete training and validation process for each dataset and is plotted on a logarithmic scale to accommodate the wide variation in dataset sizes and model complexities.
$$\mathrm{Time\ improvement} = \frac{t_{before} - t_{after}}{t_{after}} \times 100\% \quad (8)$$
As shown in Figure 13, FIRRE consistently reduces computation time across most datasets, demonstrating its effectiveness in enhancing computational efficiency. Approximately 85% of the datasets exhibit a noticeable reduction in training time, indicating that feature reduction generally accelerates model convergence and improves computational performance. The blue bars are generally lower than the gray bars, particularly in datasets with a large number of inputs or samples, such as datasets 3, 13, 21, and 29 to 32. The corresponding dotted line indicates that most datasets achieve a time improvement between 10% and 60%. However, a few smaller datasets show minimal or slightly negative improvement, likely due to nonlinear overheads from deeper architectures, batch size or input–output inefficiencies, and hardware-level mismatches. These cases indicate that when system or selection costs approach the scale of training time, the overall gains may diminish slightly [53,54]. It should be noted that hardware-level mismatches refer to minor variations in system resource allocation, background processes, or memory management that can slightly affect computation time. Changes in input features, such as different numbers of inputs, may interact with these hardware-level factors, leading to small variations in measured training times.
The logarithmic scale highlights that FIRRE’s impact becomes more pronounced as computation time increases. For computationally intensive datasets, feature reduction results in substantial time savings, primarily due to the smaller input layer size and fewer network connections during training. Conversely, datasets with limited input or already efficient training configurations show less noticeable improvement. Overall, the results demonstrate that FIRRE not only simplifies model inputs but also enhances computational efficiency, particularly for medium- and large-scale datasets where training time is a limiting factor.

5.3. Summary

The effect of the FIRRE framework can be characterized using six complementary indicators that jointly describe its predictive and computational performance. Besides the percentage of input reduction (%) and the percentage of computation time reduction (%), ER2 (%), ERMSE (%), ΔR2, and ΔRMSE are also incorporated, with the latter two normalized into relative values using Equations (9) and (10). Together, these six indicators capture FIRRE's effects on accuracy, efficiency, and model simplification in a consistent, quantitative manner.
$$\mathrm{Relative}\ R^2 = \frac{R^2_{after} - R^2_{before}}{R^2_{before}} \times 100\% \quad (9)$$
$$\mathrm{Relative\ RMSE} = \frac{RMSE_{after} - RMSE_{before}}{RMSE_{before}} \times 100\% \quad (10)$$
Figure 14 presents radar plots summarizing the median performance and interquartile range (IQR) across six normalized indicators (0 to 1 scale) for three dataset groups categorized by initial input dimensionality, namely fewer than 8, between 9 and 15, and more than 15 input variables. The indicators include changes in predictive accuracy (ΔR2, ΔRMSE) and efficiency metrics (ER2, ERMSE), alongside input and time reduction percentages. For ΔR2 and ER2, higher values indicate performance improvement, while for ΔRMSE and ERMSE, lower (more negative) values indicate better performance, as they represent error reduction. However, in the normalization process, datasets with extremely large baseline RMSE values can produce disproportionately large ΔRMSE and ERMSE magnitudes, which dominate the normalization range and cause other datasets to cluster near the maximum normalized value.
High-dimensional datasets exhibit the largest normalized gains in ΔR2 and input reduction, confirming that FIRRE achieves substantial dimensionality compression while maintaining accuracy when feature redundancy is high. Medium-dimensional datasets demonstrate balanced performance across all indicators with median values consistently above 0.5 and narrow IQR bands, reflecting optimal trade-offs between simplification and accuracy. Low-dimensional datasets show more modest improvements, as fewer redundant features are available for removal, though FIRRE maintains minimal accuracy loss.
Overall, these results confirm that FIRRE adapts effectively to varying dimensionality, delivering greatest benefits in high-dimensional problems while maintaining stable performance across all dataset complexities.

6. Case Study: Prediction of Dynamic Modulus of Asphalt Mixtures

To further demonstrate the practical applicability and interpretability of the proposed FIRRE framework in engineering contexts, this section presents a case study using a representative dataset. Whereas the preceding sections focused on cross-dataset evaluation, sensitivity analysis, and computational efficiency, the current case study illustrates FIRRE’s performance when applied to a specific engineering problem. Through a step-by-step implementation, the reduced input set, prediction accuracy, and computational efficiency are examined in detail, emphasizing how FIRRE enhances model generalization and simplifies input representation without compromising predictive accuracy.

6.1. Dataset Description and ANN Architecture

As a case study, the proposed FIRRE method is applied to an ANN dataset for dynamic modulus prediction of asphalt mixtures [55]. The original ANN was a Genetic Algorithm-modified ANN (GA-ANN), consisting of 25 input parameters, one hidden layer with 16 neurons, and 1 output parameter. Hyperparameters for the architecture are summarized in Table 2. The architecture of the ANN was evaluated, and all hyperparameters were fine-tuned and reported in the previous study [55].
The input and output parameters are summarized in Table 3. Because dynamic modulus is a material characteristic, the original study classified the input parameters into the properties of various components and testing conditions. It was reported that the input parameters had not been screened for correlation or redundancy, as the 25 input parameters helped characterize the material composition of aggregates, base binder, and aged binder, as well as the material testing method and aging procedures [55].

6.2. Application of FIRRE Reduction Method

The FIRRE method was applied to remove redundant or weakly informative variables. In Stage 1, the importance ranking module removed inputs with weak correlations to the target output, reducing the feature set from 25 to 16 variables. Using the same GA-ANN configuration, model performance improved from R2 = 0.926 and RMSE = 2259 to R2 = 0.947 and RMSE = 1915, confirming that removal of irrelevant variables enhanced predictive performance. In Stage 2, a pairwise Pearson-correlation analysis was performed on the remaining 16 variables to identify and remove redundant inputs. Figure 15 presents the corresponding correlation maps. In Figure 15a, the initial correlation matrix for all 25 inputs reveals strong multicollinearity among variables describing coarse aggregate gradation and RAP binder properties. In Figure 15b, the reduced matrix after Stage 1 still presents moderate cross-correlations between binder-related and aging-related features. The final correlation structure after Stage 2 redundancy pruning, depicted in Figure 15c, shows only weakly correlated variables. The sparse pattern confirms the elimination of multicollinearity and ensures that each retained feature provides distinct information. Subsequently retraining the ANN with the 7-input configuration yielded the highest predictive accuracy, with R2 = 0.966 and RMSE = 1492. This demonstrates that removing highly correlated variables stabilizes the learning process and improves model efficiency without sacrificing accuracy. The prediction results are presented in Figure 16.

6.3. Interpretation on the Reduced Input Parameters

Inspection of the reduced input composition clarifies the physical meaning of FIRRE's selection. After Stage 1, 9 features were eliminated, including large-sieve aggregate contents (13.2 mm, 9.5 mm, 4.75 mm), crumb rubber mesh, aged binder content in 0–3 and 3–5 mm RAP, aged binder penetration grade and softening point, and the accelerated aging factor. Among these, three were related to coarse aggregate gradation, one described a crumb rubber characteristic, four described RAP binder properties, and one represented the laboratory aging condition. As noted in the study [10], mixture gradations were controlled to be consistent among groups; thus, the three largest sieve sizes (13.2 mm, 9.5 mm, and 4.75 mm) were excluded due to low correlation with the target output. Since only a single crumb rubber type (30 mesh) and a single accelerated aging factor (53.1) were used, their exclusion was expected. Notably, aside from RAP content (0–3 mm and 3–5 mm), all other RAP-related binder properties were removed, suggesting weak correlation between these RAP binder properties and the target modulus.
After Stage 2, the input parameters were further reduced from 16 to 7, removing nine additional variables: aggregates retained on the 2.36 mm, 1.18 mm, 0.6 mm, 0.15 mm, and 0.075 mm sieves, filler content, RAP content 0–3 mm, RAP content 3–5 mm, and base binder softening point. Six of these were aggregate gradation parameters. Following FIRRE, only the aggregate retained on the 0.3 mm sieve remained, indicating that this parameter had the strongest correlation with the target modulus while showing minimal redundancy with other inputs. In this case, aggregate retained on the 0.3 mm sieve might be the critical factor affecting the dynamic modulus of asphalt mixtures. Interestingly, both RAP content variables were removed at this stage, possibly because their influence, along with that of RAP on gradation, was captured by the 0.3 mm sieve retention parameter when designing the aggregate gradation. The base binder softening point was also excluded, likely due to redundancy with penetration grade, as both describe binder rheological properties. Thus, only penetration grade was retained among the final seven inputs.
Overall, the case study confirms that FIRRE can automatically identify physically meaningful and statistically independent variables, enhancing both predictive reliability and interpretability of ANN models for asphalt materials.

7. Conclusions

This study proposed and validated a two-stage feature reduction framework, Feature Importance Ranking and Redundancy Elimination (FIRRE), designed to improve the performance, interpretability, and computational efficiency of artificial neural networks in engineering prediction tasks. By combining supervised feature-importance screening with correlation-based redundancy pruning, FIRRE consistently reduced input dimensionality while maintaining or enhancing model accuracy across 32 datasets and a detailed asphalt mixture case study. The results confirm that FIRRE effectively balances predictive performance with model simplicity and robustness, particularly in high-dimensional problems where redundancy and noise are prevalent. The main findings are summarized as follows:
  • FIRRE is a two-stage, repeatable workflow that integrates established feature-ranking and correlation-based pruning methods to reduce ANN input dimensionality in civil engineering prediction tasks.
  • Across 32 civil engineering datasets, FIRRE consistently achieved competitive or improved prediction performance compared with common reduction baselines and recent feature reduction approaches, supporting its cross-domain applicability.
  • FIRRE typically reduced inputs by about 40% on average and cut computation time by roughly 10–60%, with the largest benefits in higher-dimensional, redundancy-prone datasets.
  • In the dynamic modulus example, FIRRE reduced inputs from 25 to 7 while improving R2 from 0.926 to 0.966, and the retained variables remained interpretable in engineering terms.
  • Future studies are recommended to extend FIRRE to additional domain-specific case studies, examine stability under different data splits and alternative redundancy criteria, and explore integration with other models and interpretability assessments.

Author Contributions

Conceptualization, Z.Z. and Y.G.; methodology, Y.G. and Z.Z.; validation, L.X., X.C. and Z.Z.; formal analysis, Z.Z., Y.G., X.C. and L.X.; investigation, Z.Z. and Y.G.; data curation, Y.G.; writing—original draft preparation, Z.Z. and Y.G.; writing—review and editing, Z.Z. and Y.G.; visualization, Z.Z.; supervision, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Acknowledgments

The authors sincerely thank Ruipeng Zhang for his help and support in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Abouhalima, M.; Das Neves, L.; Taveira-Pinto, F.; Rosa-Santos, P. Machine Learning in Coastal Engineering: Applications, Challenges, and Perspectives. J. Mar. Sci. Eng. 2024, 12, 638. [Google Scholar] [CrossRef]
  2. Huang, F.; Xiong, H.; Chen, S.; Lv, Z.; Huang, J.; Chang, Z.; Catani, F. Slope Stability Prediction Based on a Long Short-Term Memory Neural Network: Comparisons with Convolutional Neural Networks, Support Vector Machines and Random Forest Models. Int. J. Coal. Sci. Technol. 2023, 10, 18. [Google Scholar] [CrossRef]
  3. Zhang, S.; Chen, W.; Xu, J.; Xie, T. Use of Interpretable Machine Learning Approaches for Quantificationally Understanding the Performance of Steel Fiber-Reinforced Recycled Aggregate Concrete: From the Perspective of Compressive Strength and Splitting Tensile Strength. Eng. Appl. Artif. Intell. 2024, 137, 109170. [Google Scholar] [CrossRef]
  4. Reguero, Á.D.; Martínez-Fernández, S.; Verdecchia, R. Energy-Efficient Neural Network Training through Runtime Layer Freezing, Model Quantization, and Early Stopping. Comput. Stand. Interfaces 2025, 92, 103906. [Google Scholar] [CrossRef]
  5. Zhu, J.; Wang, Y. Feature Selection and Deep Learning for Deterioration Prediction of the Bridges. J. Perform. Constr. Facil. 2021, 35, 04021078. [Google Scholar] [CrossRef]
  6. Liu, Q.; Wang, X.; Guan, Z.; Li, Z. Rapid Prediction and Parameter Evaluation of Process-Induced Deformation in L-Shape Structures Based on Feature Selection and Artificial Neural Networks. J. Compos. Sci. 2024, 8, 455. [Google Scholar] [CrossRef]
  7. Khalaf, J.A.; Majeed, A.A.; Aldlemy, M.S.; Ali, Z.H.; Al Zand, A.W.; Adarsh, S.; Bouaissi, A.; Hameed, M.M.; Yaseen, Z.M. Hybridized Deep Learning Model for Perfobond Rib Shear Strength Connector Prediction. Complexity 2021, 2021, 6611885. [Google Scholar] [CrossRef]
  8. Xie, Y.; Xiao, J.; Huang, K.; Thiyagalingam, J.; Zhao, Y. Correlation Filter Selection for Visual Tracking Using Reinforcement Learning. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 192–204. [Google Scholar] [CrossRef]
  9. Naeij, M.; Soroush, A.; Javanmardi, Y. Numerical Investigation of the Effects of Embedment on the Reverse Fault-Foundation Interaction. Comput. Geotech. 2019, 113, 103098. [Google Scholar] [CrossRef]
  10. Zhao, Z.; Wang, J.; Hou, X.; Xiang, Q.; Xiao, F. Viscosity Prediction of Rubberized Asphalt–Rejuvenated Recycled Asphalt Pavement Binders Using Artificial Neural Network Approach. J. Mater. Civ. Eng. 2021, 33, 04021071. [Google Scholar] [CrossRef]
  11. Radhakrishnan, P.; Vignesh, B. A Note on Rank Correlation and Semi-Supervised Machine Learning Based Measure. In 2017 Innovations in Power and Advanced Computing Technologies (i-PACT); IEEE: Vellore, India, 2017; pp. 1–8. [Google Scholar]
12. Qolomany, B.; Maabreh, M.; Al-Fuqaha, A.; Gupta, A.; Benhaddou, D. Parameters Optimization of Deep Learning Models Using Particle Swarm Optimization. In Proceedings of the 2017 13th International Wireless Communications and Mobile Computing Conference (IWCMC), Valencia, Spain, 2017; pp. 1285–1290.
13. Zare, M.; Ghasemi, M.; Zahedi, A.; Golalipour, K.; Mohammadi, S.K.; Mirjalili, S.; Abualigah, L. A Global Best-Guided Firefly Algorithm for Engineering Problems. J. Bionic Eng. 2023, 20, 2359–2388.
14. Bezerra, F.E.; Oliveira Neto, G.C.D.; Cervi, G.M.; Francesconi Mazetto, R.; Faria, A.M.D.; Vido, M.; Lima, G.A.; Araújo, S.A.D.; Sampaio, M.; Amorim, M. Impacts of Feature Selection on Predicting Machine Failures by Machine Learning Algorithms. Appl. Sci. 2024, 14, 3337.
15. Atluri, V.; Heidary, K.; Bland, J. Performance Evaluation of Machine Learning Algorithms in Reduced Dimensional Spaces. JCS 2024, 6, 69–87.
16. Wan, Z.; Xu, Y.; Šavija, B. On the Use of Machine Learning Models for Prediction of Compressive Strength of Concrete: Influence of Dimensionality Reduction on the Model Performance. Materials 2021, 14, 713.
17. Sun, Y.; Zhou, S.; Meng, S.; Wang, M.; Mu, H. Principal Component Analysis–Artificial Neural Network-Based Model for Predicting the Static Strength of Seasonally Frozen Soils. Sci. Rep. 2023, 13, 16085.
18. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
19. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
20. Bolón-Canedo, V.; Alonso-Betanzos, A. Ensembles for Feature Selection: A Review and Future Trends. Inf. Fusion 2019, 52, 1–12.
21. He, W.; Zhao, Z.; Zhang, Z.; Xiao, F. Modification and Application of the Arrhenius Equation Based on Activated Energy Methods from Various Asphalt Binders. J. Mater. Civ. Eng. 2024, 36, 04024123.
22. Abambres, M.; Rajana, K.; Tsavdaridis, K.D.; Ribeiro, T.P. Neural Network-Based Formula for the Buckling Load Prediction of I-Section Cellular Steel Beams. Computers 2018, 8, 2.
23. Bomers, A.; Van Der Meulen, B.; Schielen, R.M.J.; Hulscher, S.J.M.H. Historic Flood Reconstruction with the Use of an Artificial Neural Network. Water Resour. Res. 2019, 55, 9673–9688.
24. Gong, K.; Aytas, T.; Zhang, S.Y.; Olivetti, E.A. Data-Driven Prediction of Quartz Dissolution Rates at Near-Neutral and Alkaline Environments. Front. Mater. 2022, 9, 924834.
25. Bui, D.-K.; Nguyen, T.; Chou, J.-S.; Nguyen-Xuan, H.; Ngo, T.D. A Modified Firefly Algorithm–Artificial Neural Network Expert System for Predicting Compressive and Tensile Strength of High-Performance Concrete. Constr. Build. Mater. 2018, 180, 320–333.
26. Aliyu, A.M.; Choudhury, R.; Sohani, B.; Atanbori, J.; Ribeiro, J.X.F.; Ahmed, S.K.B.; Mishra, R. An Artificial Neural Network Model for the Prediction of Entrained Droplet Fraction in Annular Gas–Liquid Two-Phase Flow in Vertical Pipes. Int. J. Multiph. Flow 2023, 164, 104452.
27. Yeh, I.C. Modeling of Strength of High-Performance Concrete Using Artificial Neural Networks. Cem. Concr. Res. 1998, 28, 1797–1808.
28. Li, X.; Bellerby, R.G.J.; Ge, J.; Wallhead, P.; Liu, J.; Yang, A. Retrieving Monthly and Interannual Total-Scale pH (pHT) on the East China Sea Shelf Using an Artificial Neural Network: ANN-pHT-V1. Geosci. Model Dev. 2020, 13, 5103–5117.
29. Acharjee, P.K. Application of Artificial Neural Network (ANN) in Development of Prediction Models for Pavement Performance and Material Properties. Master's Thesis, The University of Texas at Tyler, Tyler, TX, USA, 2023.
30. Mba, L.; Meukam, P.; Kemajou, A. Application of Artificial Neural Network for Predicting Hourly Indoor Air Temperature and Relative Humidity in Modern Building in Humid Region. Energy Build. 2016, 121, 32–42.
31. Wang, W.-L.; Song, G.; Primeau, F.; Saltzman, E.S.; Bell, T.G.; Moore, J.K. Global Ocean Dimethyl Sulfide Climatology Estimated from Observations and an Artificial Neural Network. Biogeosciences 2020, 17, 5335–5354.
32. Wang, J.; Wang, X. Dataset and Machine Learning Models for Seismic Response Predictions of Small-to-Medium Continuous Girder Bridges [Data Set]; Zenodo: Geneva, Switzerland, 2024.
33. Li, X.; Bellerby, R.G.J.; Wallhead, P.; Ge, J.; Liu, J.; Liu, J.; Yang, A. A Neural Network-Based Analysis of the Seasonal Variability of Surface Total Alkalinity on the East China Sea Shelf. Front. Mar. Sci. 2020, 7, 219.
34. Makoond, N.; Pelà, L.; Molins, C. Robust Estimation of Axial Loads Sustained by Tie-Rods in Historical Structures Using Artificial Neural Networks. Struct. Health Monit. 2023, 22, 2496–2515.
35. Boucetta, L.N.; Amrane, Y.; Chouder, A.; Arezki, S.; Kichou, S. Enhanced Forecasting Accuracy of a Grid-Connected Photovoltaic Power Plant: A Novel Approach Using Hybrid Variational Mode Decomposition and a CNN-LSTM Model. Energies 2024, 17, 1781.
36. Pasquier, G.; Pull, C.D.; Ott, S.R.; Leadbeater, E. A Neural Correlate of Learning Fails to Predict Foraging Efficiency in the Bumble Bee Bombus terrestris. Anim. Behav. 2025, 219, 123012.
37. Liu, C. Theater Questionnair Data and BP Neural Network Prediction Network [Data Set]; Zenodo: Geneva, Switzerland, 2020.
38. Tsague, C.B.N.; Seutche, J.C.N.; Djeusu, L.N.; Chara-Dakou, V.S. Application of Neural Networks to Predict Indoor Air Temperature in a Building with Artificial Ventilation: Impact of Early Stopping. Int. J. Inf. Learn. Technol. 2024.
39. Zheng, R. The Raw Data for the Research "Comparing Neural Network Models Based on Macro Perspective Economic and Environmental Indicators with ARIMA Model in Predicting Construction Cost Index in UK" [Data Set]; Zenodo: Geneva, Switzerland, 2024.
40. Wang, J.; Wang, M.; Wang, X.; Ye, A. Quantifying Post-Earthquake Residual Vertical Load-Carrying Capacity (VLCC) of RC Bridge Bents: Parametric Study and Development of Interpretable Machine Learning Models. Soil Dyn. Earthq. Eng. 2025, 199, 109662.
41. Żurawski, M.; Grabska, K.; Kulawik, A.; Zalewski, R. Neural Model of the Adaptive Tuned Particle Impact Damper. Eng. Appl. Artif. Intell. 2024, 162, 112334.
42. Alidoost, F.; Han, Q. STEMMUS SCOPE Emulator Train Test Example Data of 2014 [Data Set]; Zenodo: Geneva, Switzerland, 2024.
43. Kraus, K.; Kočandrle, P.; Šika, Z. Identification and Modeling of 3-DoF Mechanism with Nonlinear Friction Effects by Combination of LuGre and Neural Network Modeling. Mech. Syst. Signal Process. 2025, 239, 113218.
44. Alshdaifat, E.; Alshdaifat, D.; Alsarhan, A.; Hussein, F.; El-Salhi, S.M.F.S. The Effect of Preprocessing Techniques, Applied to Numeric Features, on Classification Algorithms' Performance. Data 2021, 6, 11.
45. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009.
46. Mohammadi, F.; Irani, A.J. A Review of Feature Selection Methods for Disease Risk Prediction and Healthcare. In Proceedings of the 2024 11th International Symposium on Telecommunications (IST), Tehran, Iran, 2024; pp. 713–719.
47. Biernacki, A. Evaluating Filter, Wrapper, and Embedded Feature Selection Approaches for Encrypted Video Traffic Classification. Electronics 2025, 14, 3587.
48. Wang, J.; Xu, J.; Zhao, C.; Peng, Y.; Wang, H. An Ensemble Feature Selection Method for High-Dimensional Data Based on Sort Aggregation. Syst. Sci. Control Eng. 2019, 7, 32–39.
49. Göcs, L.; Johanyák, Z.C. Feature Selection with Weighted Ensemble Ranking for Improved Classification Performance on the CSE-CIC-IDS2018 Dataset. Computers 2023, 12, 147.
50. Ma, Z. Ensemble Feature Selection Using Neighbourhood Rough Set–Based Multicriterion Fusion. J. Appl. Math. 2024, 2024, 5534285.
51. Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
52. Kursa, M.B.; Jankowski, A.; Rudnicki, W.R. Boruta–A System for Feature Selection. Fundam. Inform. 2010, 101, 271–285.
53. Kwon, W.; Yu, G.-I.; Jeong, E.; Chun, B.-G. Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning. In Proceedings of the 34th Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020.
54. Anthony, Q.; Hatef, J.; Narayanan, D.; Biderman, S.; Bekman, S.; Yin, J.; Shafi, A.; Subramoni, H.; Panda, D. The Case for Co-Designing Model Architectures with Hardware. In Proceedings of the 53rd International Conference on Parallel Processing, Gotland, Sweden, 12–15 August 2024; pp. 84–96.
55. Zhao, Z. Viscoelasticity Evolution of Rubberized Asphalt Rejuvenated RAP Mixtures Under Moisture-Thermal-Radiation Coupling Effect. Ph.D. Thesis, Tongji University, Shanghai, China, 2023.
Figure 1. Two-stage FIRRE scheme.
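To make the Figure 1 scheme concrete, the following minimal Python sketch illustrates Stage 1: the four rankers (Pearson correlation, recursive feature elimination, random forest importance, and F-test scoring) are min-max normalized and averaged into a single ensemble importance score. The choice of scikit-learn estimators and the function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, f_regression
from sklearn.linear_model import LinearRegression

def ensemble_importance(X, y):
    """Stage 1 sketch: combine four feature rankings into one ensemble score."""
    n_features = X.shape[1]
    # (1) absolute Pearson correlation of each feature with the target
    pearson = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
    # (2) recursive feature elimination; ranking_ == 1 marks the best feature
    rfe = RFE(LinearRegression(), n_features_to_select=1).fit(X, y)
    rfe_score = (n_features - rfe.ranking_ + 1).astype(float)
    # (3) random forest impurity-based importance
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
    # (4) univariate F-test score
    f_score, _ = f_regression(X, y)
    # min-max normalize each ranking, then average into the ensemble score
    scores = [pearson, rfe_score, rf.feature_importances_, f_score]
    normed = [(s - s.min()) / (s.max() - s.min() + 1e-12) for s in scores]
    return np.mean(normed, axis=0)  # higher = more important
```

Averaging normalized scores, rather than using any single ranker, is what gives the ensemble its robustness to method-specific biases.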
Figure 2. Cross-dataset comparative evaluation of FIRRE vs. RFS, PCA, VTF, and K-means on 32 engineering datasets in terms of R2.
Figure 3. Cross-dataset comparative evaluation of FIRRE vs. RFS, PCA, VTF, and K-means on 32 engineering datasets in terms of RMSE.
Figure 4. Cross-dataset comparative evaluation of FIRRE vs. LASSO, XGB, Boruta, and SHAP on 32 engineering datasets in terms of R2.
Figure 5. Cross-dataset comparative evaluation of FIRRE vs. LASSO, XGB, Boruta, and SHAP on 32 engineering datasets in terms of RMSE.
Figure 6. Boxplot comparison of method deviations: (a) deviation distributions relative to the R2 baseline for the FIRRE, RFS, PCA, VTF, K-means, LASSO, XGB, Boruta, and SHAP methods; (b) deviation distributions relative to the RMSE baseline for the same methods. Variance (σ2) values are given in the legends for each method.
Figure 7. Stage-wise performance comparison of (a) R2 and (b) RMSE relative to the RFS baseline across 32 datasets. Bars denote mean improvements with standard-error caps, and dots represent individual dataset values. FIRRE achieves higher R2 and lower RMSE than both RFS and Stage 1, indicating cumulative benefits of the two-stage reduction strategy.
Figure 8. Number of input parameters and R2 of ANN after applying FIRRE.
Figure 9. Bubble plot of input reduction (%) versus ΔR2 across 32 datasets.
Figure 10. Effect of sample size on predictive accuracy change after applying FIRRE.
Figure 11. Efficiency of feature reduction with respect to predictive performance (R2).
Figure 12. Efficiency of feature reduction with respect to prediction error (RMSE).
Figure 13. Computational efficiency before and after FIRRE.
Figure 14. Radar plots summarizing FIRRE performance across datasets grouped by initial input dimensionality: (a) datasets with fewer than 8 inputs, (b) datasets with more than 15 inputs, and (c) datasets with 9 to 15 inputs. Each axis represents a normalized performance indicator, including the relative improvement in the coefficient of determination (ΔR2), the relative reduction in root-mean-square error (ΔRMSE), the efficiency indices (ER2 and ERMSE), and the percentage reductions in computational time and input dimensionality. Solid lines denote group median values, and shaded regions correspond to the interquartile range (IQR, Q1–Q3) across all datasets within each group, reflecting the variability of FIRRE performance. Arrows pointing up indicate improvement and positive effects; arrows pointing down indicate negative effects.
Figure 15. Correlation matrices of input features before and after FIRRE reduction. (a) Pairwise Pearson correlation coefficients among all 25 original input variables, showing the initial degree of multicollinearity; (b) matrix for the 16 variables retained after Stage-1 importance screening; and (c) matrix for the 7 variables preserved after Stage-2 redundancy pruning.
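As a companion to Figure 15, the sketch below expresses Stage-2 redundancy pruning: for every feature pair with |ρ| > 0.95, the member with the lower Stage-1 ensemble score is dropped. The pandas-based helper and its naming are assumptions for illustration only, not the authors' code.

```python
import pandas as pd

def prune_redundant(X: pd.DataFrame, importance: pd.Series, thresh: float = 0.95):
    """Stage 2 sketch: drop the weaker feature of each highly collinear pair."""
    corr = X.corr().abs()  # pairwise |Pearson correlation|, as in Figure 15
    keep = list(X.columns)
    for i, a in enumerate(X.columns):
        for b in X.columns[i + 1:]:
            if a in keep and b in keep and corr.loc[a, b] > thresh:
                # retain the feature with the higher Stage-1 ensemble score
                keep.remove(a if importance[a] < importance[b] else b)
    return X[keep]
```

Because pairs are resolved in place, a cluster of mutually collinear variables collapses to its single most informative member, which is the behavior visible in panel (c) of Figure 15.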
Figure 16. Predictions of the GA-ANN model before and after applying FIRRE: (a) with the original 25 input parameters; (b) with 16 input parameters after Stage 1 feature ranking; (c) with 7 input parameters after the full FIRRE procedure.
Table 1. Datasets used for cross-dataset comparative evaluation.

| No. | Output/Prediction | Total Data | Inputs | Reference |
|-----|-------------------|------------|--------|-----------|
| 1 | Viscosity of rubberized asphalt | 90 | 8 | Zhao et al. [10] |
| 2 | Air void of asphalt mixtures | 1030 | 27 | He et al. [21] |
| 3 | Buckling load of steel beams | 3645 | 8 | Abambres et al. [22] |
| 4 | Maximum flood discharge at Lobith | 160 | 7 | Bomers et al. [23] |
| 5 | Quartz dissolution rate | 613 | 5 | Gong et al. [24] |
| 6 | Compressive and tensile strength of high-performance concrete | 1133 | 8 | Bui et al. [25] |
| 7 | Entrained droplet fraction in annular gas–liquid two-phase flow in vertical pipes | 1367 | 8 | Aliyu et al. [26] |
| 8 | Compressive strength of concrete | 1030 | 8 | Yeh [27] |
| 9 | Monthly and interannual pHT on the East China Sea shelf | 1854 | 12 | Li et al. [28] |
| 10 | International Roughness Index of pavement | 86 | 7 | Acharjee [29] |
| 11 | Model coefficient Cf in soil–water characteristic curve model | 8000 | 7 | |
| 12 | Hourly indoor air temperature | 12,000 | 7 | Mba et al. [30] |
| 13 | Dimethyl sulfide climatology of ocean | 82,996 | 9 | Wang et al. [31] |
| 14 | Peak column drifts of small-to-medium continuous girder bridges | 720 | 12 | Wang and Wang [32] |
| 15 | Peak bearing deformation of small-to-medium continuous girder bridges | 720 | 12 | |
| 16 | Alkalinity on East China Sea shelf in summer | 270 | 10 | Li et al. [33] |
| 17 | Alkalinity on East China Sea shelf in winter | 270 | 10 | |
| 18 | Axial loads in tie-rods | 8730 | 6 | Makoond et al. [34] |
| 19 | Short-term photovoltaic solar energy forecast | 2975 | 5 | Boucetta et al. [35] |
| 20 | Foraging efficiency in the bumble bee | 1339 | 17 | Pasquier et al. [36] |
| 21 | Small theater questionnaire data | 416 | 27 | Liu [37] |
| 22 | Medium theater questionnaire data | 478 | 27 | |
| 23 | Indoor air temperature with ventilation | 12,000 | 7 | Tsague et al. [38] |
| 24 | Construction cost index parameter 1 | 163 | 7 | Zheng [39] |
| 25 | Construction cost index parameter 2 | 163 | 7 | |
| 26 | Post-earthquake residual vertical load-carrying capacity of single-column bridge | 430 | 7 | Wang et al. [40] |
| 27 | Post-earthquake residual vertical load-carrying capacity of double-column bridge | 430 | 8 | |
| 28 | Optimal damper height | 14,820 | 8 | Żurawski et al. [41] |
| 29 | Latent heat flux of land–atmosphere | 136,588 | 23 | Alidoost and Han [42] |
| 30 | Parameter 1 of 3-DoF mechanism | 236,505 | 60 | Kraus et al. [43] |
| 31 | Parameter 2 of 3-DoF mechanism | 236,505 | 60 | |
| 32 | Parameter 3 of 3-DoF mechanism | 236,505 | 60 | |
Table 2. Hyperparameters for GA-ANN.

| Model | Hyperparameter | Value |
|-------|----------------|-------|
| Genetic Algorithm | Population size | 20 |
| | Crossover probability | 0.6 |
| | Mutation probability | 0.2 |
| | Selection method | Roulette |
| | Fitness function | Absolute error between prediction and measurement |
| ANN | Learning rate | 0.01 |
| | Momentum factor | 0.9 |
| | Iterations | 10,000 |
| | Loss function value (MAE) | <10⁻⁵ |
| | Activation function (input–hidden) | tansig |
| | Activation function (hidden–output) | purelin |
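For reference, the Table 2 settings can be collected into a single configuration object. The dictionary below simply transcribes the table; the key names are ours, and tansig/purelin are MATLAB transfer-function names corresponding approximately to hyperbolic-tangent and identity activations.

```python
# Transcription of Table 2; key names are illustrative, values are from the paper.
GA_ANN_CONFIG = {
    "ga": {
        "population_size": 20,
        "crossover_probability": 0.6,
        "mutation_probability": 0.2,
        "selection": "roulette",
        "fitness": "absolute error between prediction and measurement",
    },
    "ann": {
        "learning_rate": 0.01,
        "momentum": 0.9,
        "max_iterations": 10_000,
        "target_mae": 1e-5,             # training stops once loss falls below this
        "activation_hidden": "tansig",  # ~ tanh
        "activation_output": "purelin", # ~ identity
    },
}
```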
Table 3. Summary of input and output parameters.

| No. | Input and Output | Min. | Max. | Mean | STD |
|-----|------------------|------|------|------|-----|
| 1 | Agg. retained on 13.2 mm sieve (wt%) | 4.9 | 5.1 | 5.0 | 0.1 |
| 2 | Agg. retained on 9.5 mm sieve (wt%) | 15.9 | 16.1 | 16.0 | 0.1 |
| 3 | Agg. retained on 4.75 mm sieve (wt%) | 23.9 | 24.1 | 24.0 | 0.1 |
| 4 | Agg. retained on 2.36 mm sieve (wt%) | 7.4 | 22.0 | 12.8 | 6.8 |
| 5 | Agg. retained on 1.18 mm sieve (wt%) | 0.4 | 7.0 | 2.8 | 3.1 |
| 6 | Agg. retained on 0.6 mm sieve (wt%) | 3.2 | 6.0 | 4.2 | 1.3 |
| 7 | Agg. retained on 0.3 mm sieve (wt%) | 4.6 | 7.0 | 5.7 | 1.0 |
| 8 | Agg. retained on 0.15 mm sieve (wt%) | 3.0 | 3.8 | 3.5 | 0.4 |
| 9 | Agg. retained on 0.075 mm sieve (wt%) | 1.3 | 4.8 | 2.6 | 1.6 |
| 10 | Filler content (wt%) | 3.3 | 5.2 | 4.1 | 0.9 |
| 11 | RAP content 0~3 mm (wt%) | 0.0 | 12.5 | 7.8 | 5.8 |
| 12 | RAP content 3~5 mm (wt%) | 0.0 | 17.9 | 11.4 | 8.5 |
| 13 | Base binder content (%) | 2.5 | 4.8 | 3.4 | 1.0 |
| 14 | Penetration grade (0.1 mm) | 73.0 | 94.0 | 84.0 | 10.5 |
| 15 | Softening point (°C) | 46.4 | 47.3 | 46.8 | 0.4 |
| 16 | Crumb rubber content (%) | 0.0 | 15.0 | 7.6 | 7.5 |
| 17 | Crumb rubber mesh | 29.9 | 30.1 | 30.0 | 0.1 |
| 18 | Aged binder content in RAP 0~3 mm (%) | 7.2 | 9.0 | 8.2 | 0.9 |
| 19 | Aged binder content in RAP 3~5 mm (%) | 4.8 | 5.8 | 5.3 | 0.5 |
| 20 | Aged binder penetration grade (0.1 mm) | 12.2 | 26.0 | 19.5 | 6.9 |
| 21 | Aged binder softening point (°C) | 70.8 | 76.4 | 73.5 | 2.8 |
| 22 | Testing temperature (°C) | 5.0 | 50.0 | 27.9 | 16.7 |
| 23 | Testing frequency (Hz) | 0.1 | 25.0 | 6.9 | 8.7 |
| 24 | Accelerated factor | 53.1 | 53.1 | 53.1 | 0.0 |
| 25 | Accelerated aging time (day) | 0.0 | 21.0 | 10.6 | 7.8 |
| Output | Dynamic modulus (MPa) | 133.5 | 32,172.0 | 10,290.4 | 8149.3 |
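Putting the pieces together on the Table 3 case study, a hypothetical end-to-end call using the two helper sketches above might read as follows. The file name, output column name, and the Stage-1 cutoff of 16 features (taken from Figure 16) are placeholders rather than the authors' actual pipeline.

```python
import pandas as pd

# Hypothetical CSV holding the 25 Table 3 inputs plus the dynamic modulus output.
df = pd.read_csv("dynamic_modulus.csv")
X, y = df.drop(columns=["dynamic_modulus"]), df["dynamic_modulus"]

# Stage 1: ensemble importance ranking, then keep the top-ranked features.
score = pd.Series(ensemble_importance(X.to_numpy(), y.to_numpy()), index=X.columns)
X_stage1 = X[score.sort_values(ascending=False).index[:16]]

# Stage 2: prune collinear pairs (|rho| > 0.95), keeping the stronger feature.
X_firre = prune_redundant(X_stage1, score)  # 7 features remain in the case study
```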