Open Access Article
Advanced Feature Engineering and Machine Learning Techniques for High Accurate Price Prediction of Heterogeneous Pre-Own Cars
by Imran Fayyaz, G. G. Md. Nawaz Ali * and Samantha S. Khairunnesa
Department of Computer Science and Information Systems, Bradley University, Peoria, IL 61625, USA
* Author to whom correspondence should be addressed.
Vehicles 2025, 7(3), 94; https://doi.org/10.3390/vehicles7030094 (registering DOI)
Submission received: 11 July 2025 / Revised: 24 August 2025 / Accepted: 3 September 2025 / Published: 6 September 2025
Abstract
The rapid growth of the automobile industry has intensified the demand for accurate price prediction models in the used car market. Buyers often struggle to determine fair market value because price depends on many interacting factors, such as mileage, brand, model, transmission type, accident history, and overall condition. This study presents a comparative analysis of machine learning models for used car price prediction, with a strong emphasis on the impact of feature engineering. We first evaluate multiple models, including Linear Regression, Decision Trees, Random Forest, Support Vector Regression (SVR), XGBoost, Stacking Regressor, and Keras-based neural networks, on raw, unprocessed data. We then apply a comprehensive feature engineering pipeline that includes categorical encoding, outlier removal, data standardization, and extraction of hidden features (e.g., vehicle age, horsepower). The results demonstrate that advanced preprocessing significantly improves predictive performance across all models. For instance, the Stacking Regressor's R² score increased from 0.14 to 0.8899 after feature engineering. Ensemble methods, such as CatBoost and XGBoost, also showed strong gains. This research not only benchmarks models for this task but also serves as a practical tutorial for fellow researchers, illustrating how engineered features enhance performance in structured ML pipelines. The proposed workflow offers a reproducible template for building high-accuracy pricing tools in the automotive domain, fostering transparency and informed decision-making.
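The four preprocessing steps named in the abstract (hidden-feature extraction, outlier removal, standardization, and categorical encoding) can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline: the field names (`brand`, `model_year`, `engine`, `mileage`, `price`), the sample listings, the reference year of 2025, the regular expression for horsepower, and the 1.5 × IQR outlier rule applied to price are all assumptions chosen for illustration.

```python
import re
import statistics

# Hypothetical raw listings; field names and values are illustrative only.
RAW = [
    {"brand": "Ford", "model_year": 2018, "engine": "300.0HP 3.5L V6", "mileage": 42000, "price": 27500},
    {"brand": "Toyota", "model_year": 2015, "engine": "178.0HP 2.5L I4", "mileage": 88000, "price": 18900},
    {"brand": "BMW", "model_year": 2016, "engine": "240.0HP 2.0L I4", "mileage": 67000, "price": 22000},
    {"brand": "Ford", "model_year": 2013, "engine": "160.0HP 2.0L I4", "mileage": 110000, "price": 15500},
    {"brand": "BMW", "model_year": 2019, "engine": "335.0HP 3.0L I6", "mileage": 35000, "price": 31000},
    {"brand": "Toyota", "model_year": 2017, "engine": "203.0HP 2.5L I4", "mileage": 54000, "price": 24500},
    {"brand": "Ford", "model_year": 2020, "engine": "450.0HP 5.0L V8", "mileage": 9000, "price": 950000},
    {"brand": "Toyota", "model_year": 2014, "engine": "2.0L I4", "mileage": 95000, "price": 20000},
]

REFERENCE_YEAR = 2025  # assumption: the year used to compute vehicle age


def extract_horsepower(engine_text):
    """Return the horsepower embedded in a free-text engine field, or None."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*HP", engine_text, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def one_hot(values):
    """Return the sorted category list and one indicator row per value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def engineer(rows):
    """Apply the four steps and return (feature names, feature rows, targets)."""
    # 1. Hidden features: vehicle age and horsepower parsed from text.
    enriched = []
    for row in rows:
        hp = extract_horsepower(row["engine"])
        if hp is None:
            continue  # drop rows whose horsepower cannot be recovered
        enriched.append({**row, "age": REFERENCE_YEAR - row["model_year"], "horsepower": hp})

    # 2. Outlier removal: keep rows whose price lies inside the IQR fences.
    low, high = iqr_bounds([r["price"] for r in enriched])
    kept = [r for r in enriched if low <= r["price"] <= high]

    # 3. Standardization of the numeric columns.
    numeric = ["age", "horsepower", "mileage"]
    scaled = {name: standardize([r[name] for r in kept]) for name in numeric}

    # 4. Categorical encoding of the brand column.
    categories, encoded = one_hot([r["brand"] for r in kept])

    names = numeric + [f"brand_{c}" for c in categories]
    features = [[scaled[n][i] for n in numeric] + encoded[i] for i in range(len(kept))]
    targets = [r["price"] for r in kept]
    return names, features, targets
```

Calling `engineer(RAW)` returns a feature matrix that any of the regressors listed in the abstract could consume. In a real workflow, the scaling statistics and category list would be fitted on the training split only and then reused on the test split, to avoid leaking information.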