Article

Advanced Feature Engineering and Machine Learning Techniques for High Accurate Price Prediction of Heterogeneous Pre-Own Cars

by Imran Fayyaz, G. G. Md. Nawaz Ali * and Samantha S. Khairunnesa
Department of Computer Science and Information Systems, Bradley University, Peoria, IL 61625, USA
* Author to whom correspondence should be addressed.
Vehicles 2025, 7(3), 94; https://doi.org/10.3390/vehicles7030094 (registering DOI)
Submission received: 11 July 2025 / Revised: 24 August 2025 / Accepted: 3 September 2025 / Published: 6 September 2025

Abstract

The rapid growth of the automobile industry has intensified the demand for accurate price prediction models in the used car market. Buyers often struggle to determine fair market value because price depends on many interacting factors, such as mileage, brand, model, transmission type, accident history, and overall condition. This study presents a comparative analysis of machine learning models for used car price prediction, with a strong emphasis on the impact of feature engineering. We begin by evaluating multiple models, including Linear Regression, Decision Trees, Random Forest, Support Vector Regression (SVR), XGBoost, Stacking Regressor, and Keras-based neural networks, on raw, unprocessed data. We then apply a comprehensive feature engineering pipeline that includes categorical encoding, outlier removal, data standardization, and extraction of hidden features (e.g., vehicle age, horsepower). The results demonstrate that advanced preprocessing significantly improves predictive performance across all models. For instance, the Stacking Regressor's R² score increased from 0.14 to 0.8899 after feature engineering. Ensemble methods, such as CatBoost and XGBoost, also showed strong gains. This research not only benchmarks models for this task but also serves as a practical tutorial for fellow researchers, illustrating how engineered features enhance performance in structured ML pipelines. The proposed workflow offers a reproducible template for building high-accuracy pricing tools in the automotive domain, fostering transparency and informed decision-making.
Keywords: feature engineering; machine learning; regressor; price prediction; car price prediction; regression; continuous value prediction
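
The four preprocessing steps named in the abstract (hidden-feature extraction, outlier removal, categorical encoding, standardization) can be illustrated with a small dependency-free sketch. This is not the authors' pipeline: the column names, the engine-text format, the reference year, the IQR rule for outliers, and the toy rows are all assumptions made for illustration only.

```python
import re
import statistics

REFERENCE_YEAR = 2025  # assumption: year against which vehicle age is computed


def extract_horsepower(engine_text):
    """Parse a horsepower value from free text such as '300.0HP 3.7L V6' (assumed format)."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*hp", engine_text, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None


def add_hidden_features(row):
    """Return a copy of the row with derived 'vehicle_age' and 'horsepower' fields."""
    out = dict(row)
    out["vehicle_age"] = REFERENCE_YEAR - row["model_year"]
    out["horsepower"] = extract_horsepower(row["engine"])
    return out


def remove_price_outliers(rows, k=1.5):
    """Keep rows whose price lies within the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles([r["price"] for r in rows], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r["price"] <= high]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        out = {key: value for key, value in r.items() if key != column}
        for category in categories:
            out[f"{column}_{category}"] = 1 if r[column] == category else 0
        encoded.append(out)
    return encoded


def standardize(rows, column):
    """Rescale a numeric column to zero mean and unit (population) variance."""
    values = [r[column] for r in rows]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [{**r, column: (r[column] - mean) / std if std else 0.0} for r in rows]


def engineer(rows):
    """Apply the four steps in sequence; the ordering here is an assumption."""
    rows = [add_hidden_features(r) for r in rows]
    # Rows without a parsable horsepower value are dropped in this sketch.
    rows = [r for r in rows if r["horsepower"] is not None]
    rows = remove_price_outliers(rows)
    rows = one_hot(rows, "transmission")
    for column in ("mileage", "vehicle_age", "horsepower"):
        rows = standardize(rows, column)
    return rows


# Toy rows invented for illustration; they are not data from the study.
raw_rows = [
    {"model_year": 2018, "mileage": 60000, "engine": "300.0HP 3.7L V6", "transmission": "Automatic", "price": 15000},
    {"model_year": 2019, "mileage": 45000, "engine": "180.0HP 2.0L I4", "transmission": "Manual", "price": 18000},
    {"model_year": 2020, "mileage": 30000, "engine": "250.0HP 2.5L I4", "transmission": "Automatic", "price": 21000},
    {"model_year": 2021, "mileage": 20000, "engine": "280.0HP 3.0L V6", "transmission": "Automatic", "price": 24000},
    {"model_year": 2022, "mileage": 10000, "engine": "200.0HP 2.0L I4", "transmission": "Manual", "price": 27000},
    {"model_year": 2023, "mileage": 500, "engine": "700.0HP 6.2L V8", "transmission": "Automatic", "price": 500000},
]

features = engineer(raw_rows)
```

In this sketch the price target is left unscaled and only the predictor columns are standardized; the resulting `features` rows could then be passed to any of the regressors the abstract lists.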
