Search Results (88)

Search Parameters:
Keywords = multicollinearity problem

19 pages, 1899 KB  
Article
Peripheral Blood Cells and Clinical Profiles as Biomarkers for Pain Detection in Palliative Care Patients
by Hugo Ribeiro, Raquel Alves, Joana Jorge, Bárbara Oliveiros, Tânia Gaspar, Inês Rodrigues, João Rocha Neves, Joana Brandão Silva, António Pereira Neves, Ana Bela Sarmento-Ribeiro, Marília Dourado, Ana Cristina Gonçalves and José Paulo Andrade
Biomedicines 2026, 14(1), 176; https://doi.org/10.3390/biomedicines14010176 - 14 Jan 2026
Viewed by 569
Abstract
Background/Objectives: Patients in need of specialized palliative care are clinically highly complex, with pain being the most prevalent problem. Furthermore, in these patients, a self-reported characterization of pain can be difficult to obtain. This cross-sectional, exploratory study investigates the use of clinical parameters and peripheral blood biomarkers for potentially identifying and characterizing pain (assessed using Pain Assessment in Advanced Dementia (PAINAD) and Numeric Scale (NS)) in patients under palliative care, including a population with dementia where pain is often underdiagnosed. Methods: Fifty-three patients with non-oncological diseases were analyzed in a cross-sectional study using medical and nursing records. Among previous biomarkers related to monocytes and platelets assessed by flow cytometry, we selected the most significant ones for pain characterization in a multivariate logistic regression analysis, alongside patient-specific characteristics such as renal function, nutritional status, and age. Results: Our exploratory findings suggest strong relationships between chronic pain and advanced age, reduced glomerular filtration rate (GFR), and malnutrition within this cohort. Furthermore, the percentage of lymphocytes, total and classical monocytes, the relative expression in monocytes of CD206, CD163, the CD163/CD206 ratio, and the relative expression in platelets of CD59 emerged as potential predictors of pain. Statistical analyses highlighted the challenges of multicollinearity among variables such as age, GFR, and nutritional status. A classification model further suggested that all patients over 65 years in our specific sample reported pain. Conclusions: This pilot study provides preliminary support for prior evidence linking chronic pain to aging, nutritional deficits, and renal impairment, and highlights potential novel peripheral blood biomarkers for pain assessment.
This work emphasizes the promise of clinical and molecular biomarkers to improve pain detection and management, contributing to personalized and effective palliative care strategies. Full article
(This article belongs to the Special Issue Biomarkers in Pain: 2nd Edition)
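The multicollinearity the authors report among age, GFR, and nutritional status is typically diagnosed with variance inflation factors (VIFs). As a hedged illustration, not the paper's analysis: the cohort below is simulated, and the age-GFR relationship and all parameters are invented for the sketch.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: VIF_j = 1 / (1 - R_j^2), where R_j^2
    comes from regressing column j on the remaining columns (plus intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        tss = (y - y.mean()) @ (y - y.mean())
        out.append(tss / (resid @ resid))  # equals 1 / (1 - R_j^2)
    return np.array(out)

# hypothetical cohort: GFR declines with age, BMI roughly independent
rng = np.random.default_rng(0)
age = rng.normal(70, 10, 200)
gfr = 120 - 0.8 * age + rng.normal(0, 2, 200)
bmi = rng.normal(24, 3, 200)
v = vif(np.column_stack([age, gfr, bmi]))
# v[0] and v[1] are large (the collinear pair); v[2] stays near 1
```

A common rule of thumb flags VIF values above 5 or 10 as problematic, which is exactly the situation the age/GFR pair produces here.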

14 pages, 399 KB  
Article
LAFS: A Fast, Differentiable Approach to Feature Selection Using Learnable Attention
by Hıncal Topçuoğlu, Atıf Evren, Elif Tuna and Erhan Ustaoğlu
Entropy 2026, 28(1), 20; https://doi.org/10.3390/e28010020 - 24 Dec 2025
Viewed by 569
Abstract
Feature selection is a critical preprocessing step for mitigating the curse of dimensionality in machine learning. Existing methods present a difficult trade-off: filter methods are fast but often suboptimal as they evaluate features in isolation, while wrapper methods are powerful but computationally prohibitive due to their iterative nature. In this paper, we propose LAFS (Learnable Attention for Feature Selection), a novel, end-to-end differentiable framework that achieves the performance of wrapper methods at the speed of simpler models. LAFS employs a neural attention mechanism to learn a context-aware importance score for all features simultaneously in a single forward pass. To encourage the selection of a sparse and non-redundant feature subset, we introduce a novel hybrid loss function that combines the standard classification objective with an information-theoretic entropic regularizer on the attention weights. We validate our approach on real-world high-dimensional benchmark datasets. Our experiments demonstrate that LAFS successfully identifies complex feature interactions and handles multicollinearity. In general comparison, LAFS achieves very close and accurate results to state-of-the-art RFE-LGBM and embedded FSA methods. Our work establishes a new point on the accuracy-efficiency frontier, demonstrating that attention-based architectures provide a compatible solution to the feature selection problem. Full article
(This article belongs to the Special Issue Information-Theoretic Methods in Data Analytics, 2nd Edition)
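The single-pass idea, a softmax attention gate over features trained jointly with the classifier under an entropy penalty, can be caricatured in plain numpy. This is a hedged sketch of the general mechanism, not the LAFS implementation: the model here is a single gated linear layer, and all data and hyperparameters are invented.

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n, p = 500, 6
X = rng.normal(size=(n, p))
# only the first two features drive the binary label
y = (sigmoid(2.0 * X[:, 0] - 2.0 * X[:, 1]) > rng.random(n)).astype(float)

s = np.zeros(p)        # attention logits (learned)
w = np.zeros(p)        # linear weights (learned)
lr, lam = 0.5, 0.01    # step size; entropy-regularizer strength

for _ in range(2000):
    a = softmax(s)                     # attention over features, sums to 1
    pr = sigmoid(X @ (a * w))          # gated forward pass
    g = (pr - y) / n                   # d(cross-entropy)/d(logit), per sample
    grad_w = (X.T @ g) * a
    # data gradient plus gradient of lam * entropy(a), which rewards sparsity
    grad_a = (X.T @ g) * w - lam * (np.log(a + 1e-12) + 1.0)
    grad_s = a * (grad_a - a @ grad_a)  # chain rule through the softmax
    w -= lr * grad_w
    s -= lr * grad_s

a = softmax(s)
# attention mass concentrates on the two informative features
```

The entropy term is added with a sign that penalizes diffuse attention, so minimizing the total loss pushes the weights toward a sparse subset, the same intuition as the paper's information-theoretic regularizer.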

31 pages, 511 KB  
Article
Shrinkage Approaches for Ridge-Type Estimators Under Multicollinearity
by Marwan Al-Momani, Bahadır Yüzbaşı, Mohammad Saleh Bataineh, Rihab Abdallah and Athifa Moideenkutty
Mathematics 2025, 13(22), 3733; https://doi.org/10.3390/math13223733 - 20 Nov 2025
Viewed by 480
Abstract
Multicollinearity is a common issue in regression analyses that occurs when some predictor variables are highly correlated, leading to unstable least squares estimates of model parameters. Various estimation strategies have been proposed to address this problem. In this study, we enhanced a ridge-type estimator by incorporating pretest and shrinkage techniques. We conducted an analytical comparison to evaluate the performance of the proposed estimators in terms of their bias, quadratic risk, and numerical performance using both simulated and real data. Additionally, we assessed several penalization methods and three machine learning algorithms to facilitate a comprehensive comparison. Our results demonstrate that the proposed estimators outperformed the standard ridge-type estimator with respect to the mean squared error of the simulated data and the mean squared prediction error of two real data applications. Full article
(This article belongs to the Special Issue Advances in Statistical Methods with Applications)

15 pages, 695 KB  
Article
Novel Data-Driven Shrinkage Ridge Parameters for Handling Multicollinearity in Regression Models with Environmental and Chemical Data Applications
by Muteb Faraj Alharthi
Axioms 2025, 14(11), 812; https://doi.org/10.3390/axioms14110812 - 31 Oct 2025
Cited by 2 | Viewed by 541
Abstract
Multicollinearity among predictor variables is a common challenge in modeling chemical and environmental datasets in physical sciences, often leading to unstable and unreliable parameter estimates when fitting regression models. The ridge regression method is an effective solution to this problem: it introduces a penalty term (k) that shrinks parameters to mitigate multicollinearity and balance bias and variance. In this study, we propose three novel shrinkage parameters for ridge regression and use them to develop three ridge-type estimators, referred to as SPS1, SPS2, and SPS3, which are designed to enhance parameter estimation based on sample size (n), number of predictors (p), and standard error (σ). These shrinkage estimators aim to improve the accuracy of regression models in the presence of multicollinearity. To evaluate the performance of the SPS estimators, we conduct comprehensive Monte Carlo simulations comparing them to ordinary least squares (OLS) and other existing estimators based on the mean squared error (MSE) criterion. The simulation results demonstrate that the SPS estimators outperform OLS and other methods. Additionally, we apply these three shrinkage estimators to two real-world environmental and chemical datasets, showing their ability to address multicollinearity as compared to OLS and other estimators. The proposed SPS estimators offer more stable and accurate regression results, contributing to improved decision-making in environmental modeling, pollution analysis, and other scientific research involving correlated variables. Full article
(This article belongs to the Section Mathematical Analysis)
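The ridge mechanics these estimators build on, adding the penalty k to the diagonal of X'X, can be shown directly. This illustrates plain ridge regression only, not the paper's SPS1-SPS3 parameters; the data and the value of k are invented.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator: (X'X + k I)^-1 X'y; k = 0 reduces to OLS."""
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)     # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.5, size=n)  # true coefficients (1, 1)

b_ols = ridge(X, y, 0.0)  # k = 0 gives OLS, typically unstable here
b_rdg = ridge(X, y, 1.0)  # shrunken, close to the true (1, 1)
```

Because the near-collinear pair makes the smallest eigenvalue of X'X tiny, OLS variance explodes along that direction, while the k on the diagonal caps it at the price of a small bias, the bias-variance trade-off the abstract refers to.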

24 pages, 8294 KB  
Article
Computing Two Heuristic Shrinkage Penalized Deep Neural Network Approach
by Mostafa Behzadi, Saharuddin Bin Mohamad, Mahdi Roozbeh, Rossita Mohamad Yunus and Nor Aishah Hamzah
Math. Comput. Appl. 2025, 30(4), 86; https://doi.org/10.3390/mca30040086 - 7 Aug 2025
Cited by 1 | Viewed by 952
Abstract
Linear models are not always able to sufficiently capture the structure of a dataset. Sometimes, combining predictors in a non-parametric method, such as deep neural networks (DNNs), would yield a more flexible modeling of the response variables in the predictions. Furthermore, the standard statistical classification or regression approaches are inefficient when dealing with more complexity, such as a high-dimensional problem, which usually suffers from multicollinearity. For confronting these cases, penalized non-parametric methods are very useful. This paper proposes two heuristic approaches and implements new shrinkage penalized cost functions in the DNN, based on the elastic-net penalty function concept. In other words, new shrinkage-penalized DNN methods, such as DNNelastic-net and DNNridge&bridge, are established as strong rivals to DNNLasso and DNNridge. If there is any dataset grouping information in each layer of the DNN, it may be transferred using the derived penalized function of elastic-net; other penalized DNNs cannot provide this functionality. As the tabulated outcomes show, the developed DNNs not only yield slight increases in classification results but also nullify some nodes while simultaneously applying shrinkage within the structure of each layer. A simulated dataset was generated with binary response variables, and the classic and heuristic shrinkage penalized DNN models were generated and tested. For comparison purposes, the DNN models were also compared to the classification tree using GUIDE and applied to a real microbiome dataset. Full article

27 pages, 4506 KB  
Article
Interpretable Machine Learning Framework for Corporate Financialization Prediction: A SHAP-Based Analysis of High-Dimensional Data
by Yanhe Wang, Wei Wei, Zhuodong Liu, Jiahe Liu, Yinzhen Lv and Xiangyu Li
Mathematics 2025, 13(15), 2526; https://doi.org/10.3390/math13152526 - 6 Aug 2025
Cited by 8 | Viewed by 3150
Abstract
High-dimensional prediction problems with complex non-linear feature interactions present significant algorithmic challenges in machine learning, particularly when dealing with imbalanced datasets and multicollinearity issues. This study proposes an innovative Shapley Additive Explanations (SHAP)-enhanced machine learning framework that integrates SHAP with advanced ensemble methods for interpretable financialization prediction. The methodology simultaneously addresses high-dimensional feature selection using 40 independent variables (19 CSR-related and 21 financialization-related), multicollinearity issues, and model interpretability requirements. Using a comprehensive dataset of 25,642 observations from 3776 Chinese A-share companies (2011–2022), we implement nine optimized machine learning algorithms with hyperparameter tuning via the Hippopotamus Optimization algorithm and five-fold cross-validation. XGBoost demonstrates superior performance with 99.34% explained variance, achieving an RMSE of 0.082 and R2 of 0.299. SHAP analysis reveals non-linear U-shaped relationships between key predictors and financialization outcomes, with critical thresholds at approximately 10 for CSR_SocR, 1.5 for CSR_S, and 5 for CSR_CV. SOE status, EPU, ownership concentration, firm size, and housing prices emerge as the most influential predictors. Notable shifts in factor importance occur during the COVID-19 pandemic period (2020–2022). This work contributes a scalable, interpretable machine learning architecture for high-dimensional financial prediction problems, with applications in risk assessment, portfolio optimization, and regulatory monitoring systems. Full article
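SHAP attributions are Shapley values of a value function derived from the model. As a toy illustration of the underlying game-theoretic definition only, not the paper's pipeline (which applies SHAP tooling to ensemble models over 40 variables), the exact Shapley formula can be enumerated for a tiny invented linear model with a zero baseline:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley(v, p):
    """Exact Shapley values for value function v over p features:
    phi_i = sum over S not containing i of |S|!(p-|S|-1)!/p! * (v(S+{i}) - v(S))."""
    phi = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for r in range(p):
            for S in combinations(others, r):
                wgt = factorial(len(S)) * factorial(p - len(S) - 1) / factorial(p)
                phi[i] += wgt * (v(set(S) | {i}) - v(set(S)))
    return phi

# hypothetical linear model; absent features contribute their baseline of 0
coef = np.array([2.0, -1.0, 0.5])
x = np.array([1.0, 3.0, -2.0])
def v(S):
    return sum(coef[j] * x[j] for j in S)

phi = shapley(v, 3)
# for an additive model with zero baseline, phi_i = coef_i * x_i
```

The brute-force sum is exponential in the number of features, which is why practical SHAP implementations rely on model-specific or sampling approximations for problems of the dimensionality studied here.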

22 pages, 6463 KB  
Article
State of Charge Prediction for Electric Vehicles Based on Integrated Model Architecture
by Min Wei, Yuhang Liu, Haojie Wang, Siquan Yuan and Jie Hu
Mathematics 2025, 13(13), 2197; https://doi.org/10.3390/math13132197 - 4 Jul 2025
Cited by 2 | Viewed by 864
Abstract
To enhance the accuracy of SOC prediction in EVs, which often suffers from significant discrepancies between displayed and actual driving ranges, this study proposes a data-driven model guided by an energy consumption framework. The approach addresses the problem of inaccurate remaining range prediction, improving drivers’ travel planning and vehicle efficiency. A PCA-GA-K-Means-based driving cycle clustering method is introduced, followed by driving style feature extraction using a GMM to capture behavioral differences. A coupled library of twelve typical driving cycle style combinations is constructed to handle complex correlations among driving style, operating conditions, and range. To mitigate multicollinearity and nonlinear feature redundancies, a Pearson-DII-based feature extraction method is proposed. A stacking ensemble model, integrating Random Forest, CatBoost, XGBoost, and SVR as base models with ElasticNet as the meta model, is developed for robust prediction. Validated with real-world vehicle data across −21 °C to 39 °C and four driving cycles, the model significantly improves SOC prediction accuracy, offering a reliable solution for EV range estimation and enhancing user trust in EV technology. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)

17 pages, 557 KB  
Article
Identification and Estimation in Linear Models with Endogeneity Through Time-Varying Volatility
by Shih-Tang Hwu
Mathematics 2025, 13(11), 1849; https://doi.org/10.3390/math13111849 - 2 Jun 2025
Viewed by 916
Abstract
This paper proposes a novel control function approach to identify and estimate linear models with endogenous variables in the absence of valid instrumental variables. The identification strategy exploits time-varying volatility to address the multicollinearity problem that arises in conventional control function methods when instruments are weak. We establish the identification conditions and show that the proposed method is T-consistent and asymptotically normal. We apply the proposed approach to estimate the elasticity of intertemporal substitution, a key parameter in macroeconomics. Using quarterly data on aggregate stock returns across eleven countries, we find that the data exhibit substantial time variation in volatility, supporting the identifying assumptions. The proposed method yields confidence intervals that are broadly consistent with the general findings in the literature and are substantially narrower than those obtained using weak-instrument robust methods. Full article

10 pages, 1175 KB  
Data Descriptor
A Dataset for Examining the Problem of the Use of Accounting Semi-Identity-Based Models in Econometrics
by Francisco Javier Sánchez-Vidal
Data 2025, 10(5), 62; https://doi.org/10.3390/data10050062 - 28 Apr 2025
Viewed by 975
Abstract
The problem of using accounting semi-identity-based (ASI) models in Econometrics can be severe in certain circumstances, and estimations from OLS regressions in such models may not accurately reflect causal relationships. This dataset was generated through Monte Carlo simulations, which allowed for the precise control of a causal relationship. The problem of an ASI cannot be directly demonstrated in real samples, as researchers lack insight into the specific factors driving each company’s investment policy. Consequently, it is impossible to distinguish whether regression results in such datasets stem from actual causality or are merely a byproduct of arithmetic distortions introduced by the ASI. The strategy of addressing this issue through simulations allows researchers to determine the true value of any estimator with certainty. The selected model for testing the influence of the ASI problem is the investment-cash flow sensitivity model (Fazzari, Hubbard and Petersen (FHP hereinafter) (1988)), which seeks to establish a relationship between a company’s investments and its cash flows and which is an ASI as well. The dataset included randomly generated independent variables (cash flows and Tobin’s Q) to analyze how they influence the dependent variable (investment). The Monte Carlo methodology in Stata enabled repeated sampling to assess how ASIs affect regression models, highlighting their impact on variable relationships and the unreliability of estimated coefficients. The purpose of this paper is twofold: its first goal is to provide a deeper explanation of the syntax in the related article, offering more insights into the ASI problem. The openly available dataset supports replication and further research on ASIs’ effects in economic models and can be adapted for other ASI-based analyses, as the reusability examples demonstrate.
Second, our aim is to encourage research supported by Monte Carlo simulations, as they enable the modeling of a comprehensive ecosystem of economic relationships between variables. This allows researchers to address a variety of issues, such as partial correlations, heteroskedasticity, multicollinearity, autocorrelation, endogeneity, and more, while testing their impact on the true value of coefficients. Full article
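The Monte Carlo strategy the paper advocates, fixing a true coefficient, resampling, and inspecting the distribution of the estimator, looks like this in outline. This is a generic numpy sketch, not the paper's Stata syntax or the FHP specification; the coefficient 0.3, the variable names, and the sample sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
true_beta = 0.3
n, reps = 200, 1000

estimates = []
for _ in range(reps):
    cash_flow = rng.normal(size=n)  # stand-in regressor, redrawn each replication
    invest = true_beta * cash_flow + rng.normal(scale=1.0, size=n)
    Z = np.column_stack([np.ones(n), cash_flow])
    beta, *_ = np.linalg.lstsq(Z, invest, rcond=None)
    estimates.append(beta[1])
est = np.array(estimates)
# with no identity-induced distortion, the sampling distribution
# of the OLS slope centers on true_beta
```

In the paper's setting, the interest lies in deliberately building the dependent variable from an accounting semi-identity and observing how the recovered slope then departs from the coefficient that was planted.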

34 pages, 5191 KB  
Article
Factor Investment or Feature Selection Analysis?
by Jifang Mai, Shaohua Zhang, Haiqing Zhao and Lijun Pan
Mathematics 2025, 13(1), 9; https://doi.org/10.3390/math13010009 - 24 Dec 2024
Viewed by 2509
Abstract
This study has made significant findings in A-share market data processing and portfolio management. Firstly, by adopting the Lasso method and CPCA framework, we effectively addressed the problem of multicollinearity among feature indicators, with the Lasso method demonstrating superior performance in handling this issue, thus providing a new method for financial data processing. Secondly, Deep Feedforward Neural Networks (DFN) exhibited exceptional performance in portfolio management, significantly outperforming other evaluated machine learning methods, and achieving high levels of out-of-sample performance and Sharpe ratios. Additionally, we consistently identified price changes, earnings per share, net assets per share, and excess returns as key factors influencing predictive signals. Finally, this study combined the Lasso method with DFN, providing a new perspective and methodological support for asset pricing measurement in the financial field. Full article
(This article belongs to the Special Issue Advanced Statistical Applications in Financial Econometrics)

38 pages, 5178 KB  
Article
Assessing Urban Land Parcel Dynamics Driven by Bus Rapid Transit (BRT) as an Exclusive Transit Route
by Rana Tahir Mehmood, Muhammad Zaly Shah, Mehdi Moeinaddini, Muhammad Mashhood Arif, Ramine Chuhdary and Mufeeza Tahira
Urban Sci. 2024, 8(4), 227; https://doi.org/10.3390/urbansci8040227 - 25 Nov 2024
Cited by 1 | Viewed by 2447
Abstract
The addition of transit routes transforms urban development by disrupting the existing equilibrium that land parcels have achieved over time and promotes revitalization. It is based on the relationships between land parcel variables and transit route characteristics, including feeder routes and road infrastructure. Traditional parametric methods for explaining this relationship have problems with multicollinearity and generalizability while non-parametric methods are not used with the multiple variables of both transit route and land parcel changes over time. This study applies the C5.0 decision tree algorithm, a non-parametric model that creates a decision tree with leaf nodes that predict the relationship. Using the BRT Lahore case study, the time series data of parcel variables in the 2 km circle of five transit stations before BRT 2010 and after BRT 2018, as well as transit route characteristics including feeder routes and road infrastructure, were collected and analyzed. The model identified eight important predictors and explained the relationship in the form of a flowchart. Property condition emerged as the strongest predictor, followed by property value, parking, population density, land use, building height, access routes, and distance from transit stations, in that order. The results show that well-developed transport infrastructure, parking spaces, and feeder routes enable sustainable urban transformation. Full article

21 pages, 8936 KB  
Article
A Proposal for a New Python Library Implementing Stepwise Procedure
by Luiz Paulo Fávero, Helder Prado Santos, Patrícia Belfiore, Alexandre Duarte, Igor Pinheiro de Araújo Costa, Adilson Vilarinho Terra, Miguel Ângelo Lellis Moreira, Wilson Tarantin Junior and Marcos dos Santos
Algorithms 2024, 17(11), 502; https://doi.org/10.3390/a17110502 - 4 Nov 2024
Cited by 4 | Viewed by 1772
Abstract
Carefully selecting variables in problems with large volumes of data is extremely important, as it reduces the complexity of the model, improves the interpretation of the results, and increases computational efficiency, ensuring more accurate and relevant analyses. This paper presents a comprehensive approach to selecting variables in multiple regression models using the stepwise procedure. As the main contribution of this study, we present the stepwise function implemented in Python to improve the effectiveness of statistical analyses, allowing the intuitive and efficient selection of statistically significant variables. The application of the function is exemplified in a real case study of real estate pricing, validating its effectiveness in improving the fit of regression models. In addition, we presented a methodological framework for treating joint problems in data analysis, such as heteroskedasticity, multicollinearity, and non-normality of residuals. This framework offers a robust computational implementation to mitigate such issues. This study aims to advance the understanding and application of statistical methods in Python, providing valuable tools for researchers, students, and professionals from various areas. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (2nd Edition))
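Stepwise selection is, at its core, a greedy add/drop loop over candidate regressors. As a hedged sketch of the forward variant only, not the library's implementation (which selects on statistical significance rather than the AIC criterion used here), with invented data:

```python
import numpy as np

def aic(X, y):
    """AIC of an OLS fit with intercept, under a Gaussian likelihood."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X]) if X.shape[1] else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = np.sum((y - Z @ beta) ** 2)
    return n * np.log(rss / n) + 2 * Z.shape[1]

def forward_stepwise(X, y):
    """Greedily add the predictor that most improves AIC; stop when none helps."""
    remaining, chosen = list(range(X.shape[1])), []
    best = aic(X[:, []], y)           # intercept-only baseline
    while remaining:
        score, j = min((aic(X[:, chosen + [j]], y), j) for j in remaining)
        if score >= best:
            break
        best = score
        chosen.append(j)
        remaining.remove(j)
    return chosen

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))
y = 3 * X[:, 0] - 2 * X[:, 4] + rng.normal(size=300)
sel = forward_stepwise(X, y)
# sel includes the two truly informative columns, 0 and 4
```

A full stepwise procedure alternates this forward pass with backward elimination of variables that no longer earn their place; the greedy structure is the same in both directions.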

18 pages, 1047 KB  
Article
Modified Liu Parameters for Scaling Options of the Multiple Regression Model with Multicollinearity Problem
by Autcha Araveeporn
Mathematics 2024, 12(19), 3139; https://doi.org/10.3390/math12193139 - 7 Oct 2024
Cited by 2 | Viewed by 2056
Abstract
The multiple regression model statistical technique is employed to analyze the relationship between the dependent variable and several independent variables. The multicollinearity problem is one of the issues affecting the multiple regression model, occurring in regard to the relationship among independent variables. Ordinary least squares is the standard method for estimating the parameters of the regression model, but multicollinearity makes the estimator unstable. Liu regression has been proposed to obtain the Liu estimator based on the Liu parameter and overcome multicollinearity. In this paper, we propose a modified Liu parameter to estimate the biasing parameter in scaling options, comparing the ordinary least squares estimator with two modified Liu parameters and six standard Liu parameters. The performance of the modified Liu parameter is considered, generating independent variables from the multivariate normal distribution with a Toeplitz correlation pattern to induce multicollinearity, where the dependent variable is obtained from the independent variables multiplied by the regression coefficients plus an error from the normal distribution. The mean absolute percentage error is computed as an evaluation criterion of the estimation. For application, a real dataset of Hepatitis C patients was used, in order to investigate the benefit of the modified Liu parameter. Through simulation and real dataset analysis, the results indicate that the modified Liu parameter outperformed the other Liu parameters and the ordinary least squares estimator. The modified Liu parameter can thus be recommended for parameter estimation when the independent variables exhibit the multicollinearity problem. Full article
(This article belongs to the Special Issue Application of Regression Models, Analysis and Bayesian Statistics)
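The Liu estimator underlying these variants is (X'X + I)^-1 (X'y + d·β_OLS), with a biasing parameter d in [0, 1] that interpolates between a shrunken solution and OLS (d = 1 recovers OLS exactly). A minimal numpy sketch of the standard estimator only, not the paper's modified parameters; the correlated data below are invented:

```python
import numpy as np

def liu(X, y, d):
    """Liu estimator: (X'X + I)^-1 (X'y + d * beta_OLS), 0 <= d <= 1."""
    XtX = X.T @ X
    b_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + np.eye(X.shape[1]), X.T @ y + d * b_ols)

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # strongly correlated pair
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + rng.normal(scale=0.5, size=n)

b_ols = liu(X, y, 1.0)   # d = 1 reproduces OLS exactly
b_liu = liu(X, y, 0.5)   # shrunken, more stable under collinearity
```

In the eigenbasis of X'X each OLS component is scaled by (λ + d)/(λ + 1), so directions with tiny eigenvalues, precisely the ones multicollinearity destabilizes, are damped the most.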

17 pages, 428 KB  
Article
Mitigating Multicollinearity in Regression: A Study on Improved Ridge Estimators
by Nadeem Akhtar, Muteb Faraj Alharthi and Muhammad Shakir Khan
Mathematics 2024, 12(19), 3027; https://doi.org/10.3390/math12193027 - 27 Sep 2024
Cited by 16 | Viewed by 5098
Abstract
Multicollinearity, a critical issue in regression analysis that can severely compromise the stability and accuracy of parameter estimates, arises when two or more variables exhibit correlation with each other. This paper solves this problem by introducing six new, improved two-parameter ridge estimators (ITPRE): NATPR1, NATPR2, NATPR3, NATPR4, NATPR5, and NATPR6. These ITPRE are designed to remove multicollinearity and improve the accuracy of estimates. A comprehensive Monte Carlo simulation analysis using the mean squared error (MSE) criterion demonstrates that all proposed estimators effectively mitigate the effects of multicollinearity. Among these, the NATPR2 estimator consistently achieves the lowest estimated MSE, outperforming existing ridge estimators in the literature. Application of these estimators to a real-world dataset further validates their effectiveness in addressing multicollinearity, underscoring their robustness and practical relevance in improving the reliability of regression models. Full article
(This article belongs to the Special Issue Application of Regression Models, Analysis and Bayesian Statistics)

17 pages, 1013 KB  
Article
Comparative Study on Housing Defect Repair Cost through Linear Regression Model
by Junmo Park and Deokseok Seo
Eng 2024, 5(3), 2328-2344; https://doi.org/10.3390/eng5030121 - 20 Sep 2024
Cited by 1 | Viewed by 1691
Abstract
Despite stiff competition in the construction industry, housing quality remains a problem. From the consumer’s perspective, these quality problems are called defects. Homeowners experience inconvenience and suffering due to home defects, and developers and builders also experience severe damage in time, costs, and reputation due to defect repairs. In Korea, lawsuits are increasing due to the rise in housing defects, and the cost of repairing defects determined by lawsuits is of great concern. Litigation is a burden to consumers and producers, requiring a hefty court fee, as well as attorneys and specialist firms, and takes some years. Suppose it is possible to predict the repair costs based on the outcome of a lawsuit and present it as objective supporting data. In that case, it can be of great help in bringing a settlement between consumers and producers. According to previous studies on housing repair costs, linear regression models were mainly used. Accordingly, in this study, a linear regression model was adopted as a method to predict housing repair costs. We analyzed the defect repair costs in 100 cases in which lawsuits were filed and the verdict was finalized for housing complexes in Korea. Previous studies investigated using the following independent variables: elapsed period, litigation period, claim amount, home warranty deposit, total floor area, households, and main building’s quantity, construction cost, region, and highest floor. Among these, the floor area, elapsed period, and litigation period were determined to be valid independent variables. In addition, the construction period was discovered as a valid independent variable. The present research model, which combines these independent variables, was compared with previous research models. The results showed that the earlier research model was found to have a multicollinearity issue among some independent variables. 
Also, the coefficients of some independent variables were not statistically significant. This research model did not have a multicollinearity problem; all independent variables’ coefficients were statistically significant, and the coefficient of determination was higher than other linear research models. Our proposed regression model, which accounts for the interaction of each independent variable, is a significant step forward in our research. This model, using the number of households multiplied by the construction period, the construction period multiplied by the litigation period, and the litigation period multiplied by the litigation period as independent variables, has been rigorously tested and found to have no multicollinearity issue. The coefficients of all independent variables are statistically significant, further bolstering the model’s reliability. Additionally, the explanatory power of this model is comparable to the previous model, suggesting its potential to be used in conjunction with the existing model. Therefore, the linear regression model predicting the repair cost of housing defects following litigation in this study was considered the best. Utilizing the model proposed in this study is expected to play a major role in reconciling disputes between consumers and producers over housing defects. Full article
(This article belongs to the Section Chemical, Civil and Environmental Engineering)
