Driving Style Recognition for Commercial Vehicles Based on Multi-Scale Convolution and Channel Attention

Nie, Xingfu; Lin, Xiaojun; Li, Zun; Ji, Bo

doi:10.3390/app16041925

¹ School of Mechanical Engineering, Northwestern Polytechnical University, Xi’an 710072, China

² Shaanxi Fast Automobile Transmission Engineering Research Institute, Xi’an 710119, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(4), 1925; https://doi.org/10.3390/app16041925

Submission received: 9 January 2026 / Revised: 10 February 2026 / Accepted: 11 February 2026 / Published: 14 February 2026

Abstract

Driving style recognition plays a crucial role in improving the operational safety, fuel efficiency, and intelligent control of commercial vehicles. Under real-world driving conditions, Controller Area Network (CAN) bus data from commercial vehicles simultaneously contain rapid transient variations induced by pedal and braking operations, as well as long-term behavioral trends reflecting driving habits, exhibiting pronounced multi-temporal characteristics. In addition, such data are typically affected by high noise levels, high dimensionality, and highly variable operating conditions, which makes it difficult for methods relying on single-scale features or handcrafted rules to maintain robust and stable performance in complex scenarios. To address these challenges, this paper proposes a driving style classification network, termed the Multi-Scale Convolution and Efficient Channel Attention Network (MSCA-Net). By employing parallel convolutional branches with different temporal receptive fields, the proposed network is able to capture fast driver responses, local temporal dependencies, and long-term behavioral evolution, enabling unified modeling of cross-scale temporal patterns in driving behavior. Meanwhile, the Efficient Channel Attention mechanism adaptively emphasizes CAN signal channels that are highly relevant to driving style discrimination, thereby enhancing the discriminative capability and robustness of the learned feature representations. Experiments conducted on real-world multi-dimensional CAN time-series data collected from commercial vehicles demonstrate that the proposed MSCA-Net achieves improved classification performance in driving style recognition. Furthermore, the potential application of the recognized driving styles in adaptive Automated Manual Transmission shift strategy adjustment is discussed, providing a feasible engineering pathway toward behavior-aware intelligent control of commercial vehicle powertrains.

Keywords: driving style recognition; commercial vehicles; CAN bus data; convolutional neural network; channel attention

1. Introduction

In recent years, the rapid development of intelligent transportation systems and vehicular networking technologies has substantially enhanced the capabilities for collecting, storing, and processing vehicle operational data, providing richer and more realistic data support for driving behavior analysis [1,2,3]. Driving style, as an important descriptor of drivers’ operational habits and behavioral patterns, not only has a direct impact on fuel economy, driving safety, and powertrain wear, but also serves as a critical input for optimizing control strategies in driver assistance systems, commercial vehicle energy management, and operational decision-making [4,5,6]. Simulation-based analyses further confirm that aggressive driving can increase average fuel consumption by approximately 30% compared with calm driving under urban conditions [7]. Studies focusing on heavy vehicles also report significantly higher fuel consumption under aggressive driving patterns (e.g., 49.8 L/100 km versus 40.6 L/100 km) [8]. Additional real-world investigations indicate that driving style and traffic conditions have a significant influence on vehicle emissions and fuel consumption dynamics [9]. Existing studies generally agree that driving style is closely associated with road safety risk. Aggressive driving is typically characterized by a higher frequency of safety-critical behaviors, such as rapid acceleration, harsh braking, and increased speed fluctuations, which are widely regarded as important proxy indicators of elevated collision risk [10]. For commercial vehicles equipped with automated manual transmissions (AMTs), differences in throttle and braking operations across driving styles can significantly affect shift frequency and shift smoothness, thereby influencing drivetrain performance and driving comfort [11].

In the field of transportation and human factors engineering, previous studies have systematically investigated driving behavior and its influencing factors. For example, research on ride-hailing drivers has shown that external factors, such as working hours and road conditions, can significantly increase the probability of aggressive driving behaviors [12]. Moreover, empirical studies have reported potential associations between abnormal driving risks and drivers’ sleep-related disorders [13]. These findings indicate that driving styles are affected by multiple interacting factors, providing important background and theoretical support for subsequent data-driven driving style recognition.

Compared with passenger cars, heavy-duty commercial vehicles are characterized by large vehicle mass, complex operating conditions, and long driving durations, making the influence of driver behavior differences on overall energy consumption, powertrain durability, and operational efficiency more pronounced [14]. Therefore, investigating driving style recognition for commercial vehicles under real-world operating conditions is of significant practical importance for improving operational efficiency and reducing operating costs [15,16,17].

Existing driving style recognition approaches can generally be categorized into traditional methods based on handcrafted features and data-driven intelligent methods [18]. Traditional approaches typically rely on statistical features extracted from signals such as vehicle speed, acceleration, throttle position, and braking behavior, and employ clustering analysis, fuzzy logic inference, or rule-based decision strategies to distinguish different driving styles [19,20,21]. These studies have, to some extent, revealed behavioral differences among drivers by modeling multiple driving style categories using clustering techniques or fuzzy control frameworks [22,23,24,25,26,27]. However, such methods are highly dependent on feature selection, threshold determination, and rule design, and they often struggle to effectively capture the temporal dynamics of driving behavior. As a result, their robustness and generalization capability remain limited in complex operating conditions and diverse driving scenarios [28].

With the advancement of deep learning techniques, models such as Convolutional Neural Networks (CNNs) and Graph Convolutional Networks (GCNs) have been increasingly introduced into the field of driving behavior recognition to automatically learn latent feature representations from high-dimensional vehicle operational data [29,30,31]. One-dimensional CNNs have demonstrated strong capability in extracting discriminative features from time-series signals such as vehicle operating data, while graph-based approaches are effective in modeling the interdependencies among different vehicle state variables. Existing studies have applied multi-scale convolutional networks, recurrent neural networks, or hybrid architectures to driving behavior and driving style recognition tasks, and have validated their effectiveness on simulated data or public benchmark datasets [32,33,34,35]. However, most of these studies focus primarily on passenger vehicle scenarios and rely heavily on simulation platforms or driving simulators, with limited attention paid to adaptability and stability under the complex real-world operating conditions of commercial vehicles [36].

In recent years, with the advancement of perception technologies, artificial intelligence approaches based on visual or multimodal information have been increasingly applied to driver behavior recognition and traffic scene understanding [37,38]. Such methods typically rely on sensing devices, such as cameras, to extract high-dimensional semantic features from drivers’ operational behaviors or the surrounding traffic environment, demonstrating promising recognition performance in specific scenarios [39,40]. However, these approaches often depend on additional sensing hardware and complex data acquisition systems, which introduce practical constraints in terms of cost and deployment for large-scale commercial vehicle fleets [41]. Therefore, this study focuses on time-series vehicle operation data directly available from the CAN bus, which better satisfies the requirements of scalability and engineering deployability in real-world commercial vehicle applications.

In practical commercial vehicle applications, operational signals collected from the Controller Area Network (CAN) typically exhibit characteristics such as high noise levels, high dimensionality, significant redundancy, and frequent variations in operating conditions [42,43]. Directly feeding raw CAN time-series signals into deep learning models may introduce redundant or noisy information, leading to unstable training processes and degraded generalization performance. Moreover, commercial vehicle applications impose stringent requirements on computational efficiency and real-time performance, which poses additional challenges in balancing model complexity and feature representation capability during network design.

To address the aforementioned challenges, this paper utilizes multi-dimensional operational data collected from the CAN of real-world commercial vehicles and first establishes a systematic data preprocessing pipeline, including signal denoising, normalization, sliding-window segmentation, and label annotation, to construct structured multi-channel time-series samples. Building upon this, a driving style recognition network integrating multi-scale convolutions and an Efficient Channel Attention mechanism, termed MSCA-Net, is proposed. By employing parallel convolutional branches with different temporal receptive fields, the proposed network is capable of simultaneously capturing instantaneous driving operations, local temporal correlations, and long-term behavioral trends. In addition, the Efficient Channel Attention mechanism adaptively reweights different CAN signal channels, enhancing the representation of discriminative behavioral features while suppressing redundant or less informative signals.

Experimental results obtained on real-world commercial vehicle CAN datasets under diverse operating conditions demonstrate that the proposed MSCA-Net can achieve stable and accurate recognition of conservative, moderate, and aggressive driving styles. Compared with traditional methods based on handcrafted features, the proposed approach exhibits superior robustness and generalization performance in complex and noisy environments. The findings of this paper provide reliable methodological support and practical insights for behavior-aware intelligent control of commercial vehicles, particularly for the adaptive optimization of Automated Manual Transmission shift strategies.

2. Methodology for Driving Style Recognition Based on MSCA-Net

This section describes the methodology for driving style recognition based on the proposed MSCA-Net. It first presents the problem formulation and system overview, followed by feature construction and temporal representation, and finally details the architecture of the proposed network.

2.1. Problem Definition and System Overview

This section presents the overall framework of driving style recognition based on real-world commercial vehicle CAN data, which consists of four main stages: data acquisition, data preprocessing, sample construction and partitioning, and model training and inference. The overall framework is illustrated in Figure 1.

Vehicle operating data are collected in real time via the CAN bus, covering key parameters such as accelerator pedal position, brake pedal position, longitudinal acceleration, and vehicle speed. The raw data are first synchronized in time and filtered to remove outliers. Subsequently, normalization or standardization is applied to eliminate scale differences across variables. A sliding-window technique is then employed to segment the time-series signals into fixed-length sample sequences, each containing a predefined number of consecutive time steps, which form the model input dataset. Finally, the constructed dataset is partitioned into training, validation, and test subsets.

During the training stage, the constructed multi-channel fixed-length time-series samples are fed into the proposed MSCA-Net architecture for feature learning. Cross-entropy is employed as the loss function, and model parameters are optimized using the Adam algorithm. Dropout is incorporated to mitigate overfitting. The model is iteratively updated through mini-batch training, while the validation set is used to monitor performance changes. An early-stopping strategy is adopted to select the optimal model configuration as the final classifier. The overall training workflow is illustrated in Figure 2.
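The early-stopping rule mentioned above can be sketched in a few lines. This is only an illustration of the general technique: the paper does not report a patience value or an improvement threshold, so `patience` and `min_delta` below are hypothetical parameters, and the validation losses are made-up numbers.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation losses.

    patience and min_delta are hypothetical; the paper does not state them.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember this epoch as the checkpoint to keep.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` consecutive epochs: stop training here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Made-up validation losses, for illustration only.
losses = [1.0, 0.8, 0.6, 0.65, 0.7, 0.66, 0.61]
best, stopped = early_stopping_epoch(losses, patience=3)
print(f"keep weights from epoch {best}, training stopped at epoch {stopped}")
```

In a real training loop the model weights would be saved whenever `best_epoch` is updated, so that the selected classifier corresponds to the lowest validation loss rather than to the last epoch.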

The trained model can be deployed on real vehicle operation data to enable real-time driving style recognition. Given a sliding-window segment of the current CAN signals as input, the model outputs the probability distribution corresponding to conservative, moderate, and aggressive driving styles.

2.2. Feature Construction and Temporal Representation

To ensure the quality and reliability of real-world CAN-bus data, a systematic preprocessing pipeline was applied prior to feature construction. Raw signals were first subjected to integrity checking and anomaly removal based on physical constraints and temporal continuity to eliminate invalid or corrupted samples. Short-duration missing segments caused by packet loss were reconstructed using linear interpolation. To suppress high-frequency noise induced by sensor inaccuracies and environmental interference, moving-average and median filtering were applied to dynamic signals. Subsequently, a sliding-window strategy was adopted to segment continuous CAN-bus signals into fixed-length multivariate time-series samples, enabling consistent temporal representation for model input.
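The preprocessing steps described above (gap interpolation, median filtering, and sliding-window segmentation) can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the maximum gap length, filter size, window length, and stride are hypothetical, since the paper does not report them, and missing samples are assumed to be marked with `None`.

```python
from statistics import median


def interpolate_short_gaps(signal, max_gap=3):
    """Linearly fill runs of None that are at most max_gap long and have valid neighbours."""
    out = list(signal)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is None:
            start = i
            while i < n and out[i] is None:
                i += 1
            end = i  # index of the first valid sample after the gap (or n)
            gap = end - start
            if start > 0 and end < n and gap <= max_gap:
                left, right = out[start - 1], out[end]
                for j in range(start, end):
                    frac = (j - start + 1) / (gap + 1)
                    out[j] = left + frac * (right - left)
        else:
            i += 1
    return out


def median_filter(signal, size=3):
    """Median filter with windows truncated at the signal edges."""
    half = size // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def sliding_windows(rows, window, stride):
    """Cut a sequence of per-time-step rows into fixed-length, possibly overlapping samples."""
    return [rows[s:s + window] for s in range(0, len(rows) - window + 1, stride)]


speed = [0.0, None, None, 3.0, 4.0, 40.0, 6.0, 7.0]   # toy signal with a gap and a spike
clean = median_filter(interpolate_short_gaps(speed), size=3)
samples = sliding_windows(clean, window=4, stride=2)
print(len(samples), "windows of length", len(samples[0]))
```

Interpolation is applied before filtering here because the median filter cannot handle missing values; gaps longer than `max_gap` are left untouched so that they can be discarded rather than invented.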

To eliminate scale discrepancies among heterogeneous physical quantities and facilitate stable network training, key input variables, including vehicle speed, longitudinal acceleration, accelerator pedal position, and brake pedal position, were linearly normalized to either the [0, 1] or [-1, 1] range.
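As a small illustration of the linear normalization, the sketch below rescales one signal to an arbitrary target interval. The function name and the sample values are hypothetical; the paper does not state which variables use which range, nor how the minimum and maximum are obtained.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant signal carries no scale information; map it to the lower bound.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


pedal_percent = [0, 50, 100]                       # toy accelerator pedal positions
print(min_max_scale(pedal_percent))                # scaled to [0, 1]
print(min_max_scale(pedal_percent, -1.0, 1.0))     # scaled to [-1, 1]
```

In practice the minimum and maximum would normally be estimated on the training data only and then reused for the validation and test data, so that no information from later periods influences the scaling.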

Driving style feature selection aims to characterize the differences among driving styles using a limited yet representative set of features through data screening and statistical transformation, thereby revealing their underlying behavioral patterns. Considering feature availability, engineering practicality, and the discriminability of different driving styles in longitudinal operations, four core variables that directly reflect drivers’ power demand and driving stability were selected in this study: vehicle speed, longitudinal acceleration, accelerator pedal position, and brake pedal position. These variables can be directly obtained from the CAN bus and exhibit clear correspondence with drivers’ acceleration, deceleration, and speed regulation behaviors, enabling a comprehensive characterization of longitudinal control characteristics. The constructed feature categories are summarized in Table 1.

Although the extracted multidimensional features effectively describe driving styles, the high dimensionality and nonlinear correlations among features may increase computational complexity and lead to redundancy. To address this issue, Kernel Principal Component Analysis (KPCA) was employed to reduce feature dimensionality while preserving discriminative information. This step yields a compact and informative feature representation, which serves as the final input for subsequent driving style recognition.

2.3. Architecture of the Proposed MSCA-Net

To achieve accurate driving style recognition, this paper designs a driving style recognition network that integrates multi-scale convolutions and Efficient Channel Attention (ECA), termed MSCA-Net, specifically targeting the pronounced multi-time-scale characteristics of commercial vehicle driving behaviors under real-world operating conditions. In practical driving scenarios, driver operations involve not only rapid short-term fluctuations in pedal-related signals such as accelerator and brake inputs, but also long-term behavioral patterns, including speed variation trends, sustained acceleration or deceleration behaviors, and gear-shifting preferences over extended periods. These driving behaviors exhibit substantial temporal heterogeneity across different time scales. Relying solely on single-scale feature modeling may result in the loss of either local dynamic information or long-term behavioral trends, thereby limiting the accuracy and robustness of driving style recognition.

To address the above challenges, MSCA-Net directly takes multi-dimensional time-series data collected from the vehicle CAN bus as input and automatically learns driving behavior features in an end-to-end manner, thereby avoiding the limitations of traditional approaches that rely heavily on handcrafted features derived from expert experience. The overall architecture of the proposed model is illustrated in Figure 3, and the recognition pipeline mainly consists of three stages: data input, feature extraction, and classification. By employing a hierarchical convolutional structure, driving behaviors are progressively modeled across multiple time scales, enabling the network to simultaneously capture instantaneous operational characteristics and long-term behavioral patterns, which provides effective support for stable driving style recognition.

During the feature extraction stage, to improve training stability and convergence efficiency, the input signals are first standardized, and the continuous time-series data are reorganized into three-dimensional tensors that meet the network input requirements. Building upon this, a multi-scale parallel convolution module is introduced, where convolution kernels of sizes 1 × 1, 1 × 3, and 1 × 5 are employed to model driving behaviors from different receptive fields. Specifically, the 1 × 1 convolution is mainly used to extract global variation trends and reduce channel redundancy, helping to highlight critical driving states while suppressing noise interference. The 1 × 3 convolution focuses on capturing local dependencies between adjacent time steps, effectively characterizing short-term coupling relationships among throttle variations, braking operations, and vehicle speed responses. The 1 × 5 convolution targets longer temporal spans of continuous driving operations and is designed to describe long-term sequential features, such as sustained acceleration or deceleration behaviors and gear-shifting decisions that reflect driving style tendencies.

By fusing the outputs of multi-scale convolutions along the channel dimension and incorporating normalization and nonlinear activation operations, the proposed model achieves joint modeling of multi-level temporal patterns, including global trends, local dynamics, and long-term dependencies. This multi-scale feature extraction mechanism effectively compensates for the limitations of single-scale convolution in representing complex driving behaviors under real-world operating conditions, allowing the network to better adapt to the high noise levels and frequent operating condition variations inherent in heavy-duty commercial vehicle CAN data. Moreover, it provides high-quality feature representations for the subsequent introduction of attention mechanisms to enhance critical channels.
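The idea of parallel branches with different temporal receptive fields can be illustrated with a deliberately simplified sketch. It uses a single input channel and one fixed averaging kernel per branch (sizes 1, 3, and 5); these weights are hypothetical stand-ins, whereas in the trained network every branch has many learned filters that also mix the input channels, and the normalization step is omitted here.

```python
def conv1d_same(x, kernel):
    """1D convolution (cross-correlation) with zero padding so the output length equals len(x)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w * padded[t + j] for j, w in enumerate(kernel)) for t in range(len(x))]


def relu(x):
    return [max(0.0, v) for v in x]


def multi_scale_block(x, kernels):
    """Run every kernel on the same input and stack the results along a channel axis."""
    return [relu(conv1d_same(x, kernel)) for kernel in kernels]


# Hypothetical fixed kernels of sizes 1, 3 and 5 (learned in the real model).
kernels = [[1.0], [1 / 3] * 3, [1 / 5] * 5]
pedal_pulse = [0.0, 0.0, 1.0, 0.0, 0.0]            # a single short pedal tap
features = multi_scale_block(pedal_pulse, kernels)
for kernel, channel in zip(kernels, features):
    print(len(kernel), [round(v, 2) for v in channel])
```

The size-1 branch reproduces the tap exactly, while the size-3 and size-5 branches spread it over progressively more time steps, which is how wider kernels respond to longer-lasting behavior. Because all branches keep the temporal length, their outputs can be concatenated along the channel dimension.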

The feature extraction layer constitutes the core of the model and is composed of two stacked one-dimensional convolutional modules. Each module includes a convolution layer, a ReLU activation function, and a max-pooling layer. The first convolution layer employs sixteen kernels of size three with padding to preserve the temporal length, followed by a pooling layer with a stride of two. This configuration enables the extraction of short-term local patterns, such as instantaneous acceleration and deceleration behaviors. The second convolution layer uses thirty-two kernels of size three to capture more complex temporal feature combinations, including sustained steering actions and long-duration fluctuations in acceleration. The pooling operations reduce feature dimensionality and introduce translation invariance, thereby improving model robustness and mitigating overfitting. After the two convolution–pooling stages, the input signals are transformed into high-level semantic feature maps that provide discriminative representations for subsequent classification.
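The effect of the two convolution and pooling stages on the tensor shape can be checked with simple arithmetic. The sketch below rests on several assumptions that the paper does not state: a window length of 100 time steps, a pooling kernel of size 2, and padding of 1 in the second convolution as well as the first.

```python
def conv_out_len(length, kernel=3, padding=1, stride=1):
    """Output length of a 1D convolution."""
    return (length + 2 * padding - kernel) // stride + 1


def pool_out_len(length, kernel=2, stride=2):
    """Output length of a 1D max-pooling layer (pool kernel size 2 is an assumption)."""
    return (length - kernel) // stride + 1


def feature_shape(window_len):
    """Return (channels, length, flattened size) after two conv + pool stages."""
    length = conv_out_len(window_len)   # 16 kernels of size 3, padding keeps the length
    length = pool_out_len(length)       # stride 2 halves the length
    length = conv_out_len(length)       # 32 kernels of size 3 (padding of 1 assumed)
    length = pool_out_len(length)
    return 32, length, 32 * length


print(feature_shape(100))  # hypothetical window of 100 time steps
```

Under these assumptions the temporal length shrinks by a factor of four, and the flattened size is the input dimension that the first fully connected layer of the classifier would need to accept.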

Due to the varying significance of different signal channels in driving-style characterization, relying exclusively on convolutional layers for feature fusion may result in the attenuation of critical information. To address this limitation, an Efficient Channel Attention (ECA) module is integrated into the network architecture to strengthen its capacity to emphasize salient channels. The ECA module captures channel dependencies through a lightweight local convolution operation, effectively highlighting the channels most relevant to driving-style discrimination while introducing minimal computational overhead. This design enhances the efficiency of feature extraction and contributes to improved classification accuracy.

The input feature map X is processed by the convolutional layer to produce the output feature map Z, which can be expressed mathematically as:

$$ Z = X \cdot \sigma\left(\mathrm{Conv}\left(\mathrm{GAP}(X)\right)\right) \tag{1} $$

where GAP denotes the global average pooling operation, σ represents the activation function, and “·” indicates element-wise multiplication. The term Conv refers to an adaptive convolution with kernel size k, where k is determined adaptively based on a mapping from the channel dimension C. This process can be formulated as:

$$ k = \psi(C) = \left| \frac{\log_{2}(C)}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}} \tag{2} $$

In this formulation, C denotes the channel dimension of the given feature map, and $|t|_{\mathrm{odd}}$ represents the nearest odd integer to t. In this paper, the parameters γ and b are both set to 1.
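Equations (1) and (2) can be traced numerically with the following sketch. Two points are assumptions rather than statements from the paper: the "nearest odd" operation is implemented with one common convention (truncate to an integer, then move even values up to the next odd number), and σ is taken to be the sigmoid function. The convolution weights passed to `eca_forward` are hypothetical; in the network they are learned.

```python
import math


def eca_kernel_size(channels, gamma=1, b=1):
    """Adaptive kernel size from Equation (2); even results are bumped to the next odd value."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def eca_forward(x, weights):
    """Apply Equation (1) to x, a list of C channels that each hold T values.

    weights is an odd-length 1D kernel that slides across the channel axis.
    """
    pooled = [sum(ch) / len(ch) for ch in x]                 # GAP: one value per channel
    pad = len(weights) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad              # zero padding keeps C outputs
    scores = [sum(w * padded[i + j] for j, w in enumerate(weights)) for i in range(len(x))]
    gates = [sigmoid(s) for s in scores]                     # one attention weight per channel
    return [[g * v for v in ch] for g, ch in zip(gates, x)]  # rescale every channel


print(eca_kernel_size(16), eca_kernel_size(32))              # with gamma = b = 1
features = [[1.0, 3.0], [0.0, 0.0]]                          # toy map: 2 channels, 2 time steps
print(eca_forward(features, weights=[0.0, 1.0, 0.0]))
```

With γ = b = 1, a 16-channel map gives t = 5, which is already odd. A 32-channel map gives t = 6, which is equidistant from 5 and 7, so the result there depends on the tie-breaking convention chosen above.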

After feature extraction, the high-dimensional feature maps are flattened and passed into the classification module, whose primary function is to map the convolution-derived representations to the final driving-style categories. First, a flattening operation converts the multi-dimensional feature maps into a one-dimensional vector to meet the input requirements of the fully connected layers. The flattened vector is then fed into a fully connected layer with 64 neurons, enabling dimensionality reduction and nonlinear transformation to further extract highly discriminative high-level features. To enhance the model’s generalization capability, a Dropout layer with a rate of 0.5 is applied after this stage to effectively mitigate overfitting. Finally, the processed features are passed to the output layer, whose dimension corresponds to the number of driving-style categories, generating unnormalized logits and completing the classification decision.
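Since the output layer produces unnormalized logits, a softmax is needed to obtain the probability distribution over driving styles mentioned in Section 2.1. The sketch below shows that final step. The class order in `STYLES` and the example logits are hypothetical.

```python
import math

# Hypothetical class order; the paper does not specify the index of each style.
STYLES = ("conservative", "moderate", "aggressive")


def softmax(logits):
    """Convert unnormalized logits into probabilities that sum to one."""
    peak = max(logits)                          # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_style(logits):
    """Return the most probable style and the full probability distribution."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return STYLES[best], probs


style, probs = predict_style([0.2, 2.5, -1.0])  # made-up logits for one window
print(style, [round(p, 3) for p in probs])
```

During training, cross-entropy loss implementations in common deep learning frameworks typically apply this normalization internally, which is why the network itself can stop at the logits.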

3. Dataset and Experimental Setup

This section describes the dataset and experimental setup employed to evaluate the proposed driving style recognition method. It presents the data acquisition process, dataset construction, training settings, and evaluation metrics.

3.1. Data Acquisition from Commercial Vehicles

To enable an in-depth analysis of driving-style characteristics, obtaining real-world operational data that accurately captures behavioral differences is essential. In this paper, key dynamic signals during vehicle operation were collected with high precision and continuity via the standard in-vehicle signal acquisition interface. Data were accessed through the CAN bus and recorded using the professional acquisition tool CANape, ensuring reliable and high-fidelity measurements.

A heavy-duty truck equipped with a dedicated data acquisition system was selected as the test platform. Long-term road tests were conducted under naturalistic driving conditions, and the data acquisition setup is illustrated in Figure 4. The CAN communication module within the acquisition system continuously captured key signals, including vehicle speed, longitudinal acceleration, lateral acceleration, yaw rate, accelerator pedal position, and brake pedal position. The sampling frequency of the collected data meets the requirements for subsequent temporal feature extraction.

To enhance the representativeness of the samples and improve the model’s generalization capability, multiple drivers were scheduled to operate the same test vehicle in different time periods under identical vehicle parameter configurations. By controlling the consistency of the vehicle platform, the influence of vehicle-related differences on driving-behavior characteristics was minimized. In addition, no fixed driving route was prescribed during the tests. Drivers were allowed to select their routes freely according to real traffic conditions and personal driving habits, ensuring that the collected data adequately capture authentic and naturalistic driving behaviors.

3.2. Dataset Construction

To train the supervised learning model, each sliced sample was assigned a driving-style label. Driving behavior features were first extracted from commercial vehicle CAN bus data, and an unsupervised clustering analysis was conducted using the K-means algorithm to partition the samples into three clusters. Subsequently, based on the statistical distribution differences of key driving-related variables, including accelerator pedal input, brake pedal operation, and longitudinal acceleration, together with domain expertise in commercial vehicle operation, the three clusters were labeled as conservative, moderate, and aggressive driving styles, respectively, thus establishing the labels for supervised learning.
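The cluster-then-label procedure can be illustrated with a small K-means sketch. Everything specific in it is an assumption: the two features (mean accelerator pedal position and mean absolute longitudinal acceleration), the toy values, the deterministic initialization, and especially the rule that names clusters by ranking their centroids, which is only a crude stand-in for the statistical analysis and domain expertise used in the paper.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=3, iters=50):
    """Plain K-means with a deterministic start: k points spread through the sorted data."""
    ordered = sorted(points)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_assign == assign:
            break                               # assignments stopped changing: converged
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:                         # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign, centroids


def name_clusters(centroids):
    """Hypothetical naming rule: rank clusters by the sum of their centroid coordinates."""
    order = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]))
    names = ("conservative", "moderate", "aggressive")
    return {cluster: names[rank] for rank, cluster in enumerate(order)}


# Toy per-window features: (mean pedal position in %, mean |acceleration| in m/s^2).
windows = [(10, 0.2), (12, 0.3), (11, 0.25), (30, 0.6), (32, 0.7),
           (31, 0.65), (60, 1.5), (62, 1.6), (61, 1.4)]
assign, centroids = kmeans(windows)
names = name_clusters(centroids)
print([names[a] for a in assign])
```

The features are left unscaled in this toy example, so the pedal position dominates the distance; with real data the features would usually be normalized before clustering.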

The dataset was divided into training, validation, and test sets in chronological order with a ratio of 7:2:1. Data from earlier time periods were used for model training, while data from later periods were reserved for validation and performance evaluation. This time-based splitting strategy preserves the temporal dependency of driving behaviors and effectively prevents future information leakage into the training process, thereby ensuring a realistic and rigorous assessment of model performance. The distribution of samples across driving-style categories is summarized in Table 2. The training and validation sets were used for network optimization and parameter tuning, while the test set was reserved exclusively for evaluating the final model performance.
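A chronological 7:2:1 split can be sketched as below, assuming the samples are already sorted by time. The function name and the integer placeholders standing in for windowed samples are hypothetical.

```python
def chronological_split(samples, ratios=(7, 2, 1)):
    """Split time-ordered samples into train/validation/test without shuffling."""
    total = sum(ratios)
    n = len(samples)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


samples = list(range(100))          # placeholder for 100 time-ordered windows
train, val, test = chronological_split(samples)
print(len(train), len(val), len(test))
```

Because nothing is shuffled, every training sample precedes every validation sample, which in turn precedes every test sample. If the sliding windows overlap, windows adjacent to a split boundary can still share time steps, so a small gap at each boundary may be worth considering.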

3.3. Parameter Settings and Training Details

In the experimental setup, a multi-step learning rate scheduler was employed in PyTorch to dynamically adjust the learning rate during training. The Adam optimizer was used, and the model was trained for 60 epochs with an initial learning rate of 0.01. The learning rate was reduced by a factor of 0.1 at the 40th epoch. Cross-entropy loss was adopted for all experiments. Detailed hardware and software configurations of the experimental platform are summarized in Table 3.
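The learning-rate schedule described above amounts to a step function of the epoch index, which the following sketch reproduces without any framework dependency. It assumes zero-based epoch counting with the drop applied from epoch index 40 onward; whether the paper's "40th epoch" is zero-based or one-based is not stated.

```python
def multistep_lr(epoch, base_lr=0.01, milestones=(40,), gamma=0.1):
    """Learning rate at a given epoch: multiply by gamma once per milestone already reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


for epoch in (0, 39, 40, 59):
    print(epoch, multistep_lr(epoch))
```

Over the 60 training epochs this gives a rate of 0.01 for the first 40 epochs and roughly 0.001 for the remaining 20.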

3.4. Evaluation Metrics

To comprehensively evaluate the performance of the proposed model, several widely used evaluation metrics are adopted. The class-wise accuracy is defined as follows:

$$ \mathrm{Acc}_{c} = \frac{TP_{c}}{N_{c}} \tag{3} $$

where $TP_{c}$ denotes the number of true positive samples for class c, and $N_{c}$ denotes the total number of samples belonging to that class.

Overall accuracy (OA) is defined as the ratio of correctly classified samples to the total number of samples, representing the overall classification performance of the model. It is given by:

$$ \mathrm{OA} = \frac{\sum_{i=1}^{C} TP_{i}}{N} \tag{4} $$

where $TP_{i}$ denotes the number of correctly classified samples of class i, C denotes the number of classes, and N denotes the total number of samples.

Average accuracy (AA) is defined as the mean of the per-class accuracies over all classes, treating each class equally regardless of its sample size. It is computed as:

$$ \mathrm{AA} = \frac{1}{C} \sum_{i=1}^{C} \frac{TP_{i}}{N_{i}} \tag{5} $$

where $N_{i}$ denotes the number of samples in class i.

To further evaluate the class-wise performance, precision, recall, and F1-score are adopted. Precision measures the proportion of correctly predicted positive samples among all predicted positives, whereas recall represents the proportion of correctly predicted positives among all ground-truth positives. They are defined as follows:

$$ \mathrm{Precision}_{i} = \frac{TP_{i}}{TP_{i} + FP_{i}} \tag{6} $$

$$ \mathrm{Recall}_{i} = \frac{TP_{i}}{TP_{i} + FN_{i}} \tag{7} $$

where $FP_{i}$ and $FN_{i}$ denote the numbers of false positives and false negatives for class i, respectively.

The F1-score is the harmonic mean of precision and recall, which provides a balanced measure of classification performance:

$$ \mathrm{F1}_{i} = \frac{2 \cdot \mathrm{Precision}_{i} \cdot \mathrm{Recall}_{i}}{\mathrm{Precision}_{i} + \mathrm{Recall}_{i}} \tag{8} $$

To address class imbalance, Macro-F1 is adopted to provide a fair evaluation by averaging the F1-scores of all classes and assigning equal importance to each class regardless of its sample size:

$$ \mathrm{Macro\text{-}F1} = \frac{1}{C} \sum_{i=1}^{C} \mathrm{F1}_{i} \tag{9} $$
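All of the metrics in Equations (3) to (9) can be computed from a single confusion matrix, as the sketch below shows. The matrix values are invented for illustration and are unrelated to the paper's results; rows are assumed to hold the true classes and columns the predicted classes.

```python
def classification_metrics(cm):
    """Compute Equations (3)-(9) from cm, where cm[i][j] counts true class i predicted as j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)                                   # N
    tp = [cm[i][i] for i in range(n_classes)]                             # TP_i
    support = [sum(cm[i]) for i in range(n_classes)]                      # N_i = TP_i + FN_i
    predicted = [sum(row[i] for row in cm) for i in range(n_classes)]     # TP_i + FP_i

    acc = [tp[i] / support[i] if support[i] else 0.0 for i in range(n_classes)]
    precision = [tp[i] / predicted[i] if predicted[i] else 0.0 for i in range(n_classes)]
    recall = list(acc)                                                    # same ratio as Eq. (3)
    f1 = [2 * p * r / (p + r) if (p + r) else 0.0 for p, r in zip(precision, recall)]

    return {
        "class_accuracy": acc,
        "overall_accuracy": sum(tp) / total,
        "average_accuracy": sum(acc) / n_classes,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "macro_f1": sum(f1) / n_classes,
    }


# Invented 3-class confusion matrix (conservative, moderate, aggressive).
cm = [[8, 2, 0],
      [1, 18, 1],
      [0, 2, 8]]
for name, value in classification_metrics(cm).items():
    print(name, value)
```

Since $N_{i} = TP_{i} + FN_{i}$, the class-wise accuracy of Equation (3) coincides with the recall of Equation (7), and the average accuracy of Equation (5) is therefore the macro-averaged recall.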

4. Results and Discussion

This section presents the experimental results and discusses the performance of the proposed method. Recognition performance, comparisons with baseline models, and ablation analyses are reported to validate the effectiveness of MSCA-Net.

4.1. Driving Style Recognition Performance

To evaluate the generalization performance of the proposed recognition model, the 8981 test samples listed in Table 2 were treated as unseen data and randomly fed into the convolutional neural network. The predicted driving style categories were then compared with the corresponding ground-truth labels. The confusion matrix illustrating the model’s classification performance across the three driving-style categories on the test set is shown in Figure 5.

As illustrated, the model demonstrates strong overall performance, with high values along the diagonal of the confusion matrix, indicating robust discriminative capability. The recognition accuracy for the moderate driving style reaches 99.89%, suggesting that the model effectively captures the dominant behavioral characteristics present in the majority class. The accuracy for the aggressive driving style is 97.41%, slightly lower due to misclassifications primarily into the moderate category, reflecting the partial similarity and relatively blurred boundary between these two styles. The conservative driving style achieves an accuracy of 96.61%, with most misclassifications also occurring toward the moderate category, indicating that certain conservative behaviors share overlapping characteristics with moderate driving patterns.

From the misclassification patterns, most errors occur between the aggressive and moderate styles and between the conservative and moderate styles, with 2.59% of aggressive samples and 3.39% of conservative samples being misclassified, primarily as moderate. In contrast, the conservative and aggressive classes are almost never confused with each other. This indicates that the behavioral characteristics of the moderate and aggressive styles overlap to some extent (for example, similar ranges of accelerator pedal variation or acceleration fluctuation), thereby increasing the difficulty of discrimination for the model. Overall, the model demonstrates stable and accurate recognition across all three driving styles, with diagonal accuracies exceeding 96.5%. In particular, the recognition accuracy for the moderate style reaches 99.89%. These confusion-matrix results further validate the effectiveness of the proposed MSCA-Net model for driving-style classification.

The convergence behavior of the proposed model during training is shown in Figure 6. The training loss decreases steadily with increasing iterations, with a pronounced drop observed within the first ten epochs, indicating that the model rapidly captures the primary discriminative features at the early stage of training. Thereafter, the rate of decline gradually slows, and the loss stabilizes after approximately 35 epochs, remaining at a low level without noticeable oscillation or rebound. This behavior suggests effective parameter convergence and a stable training process without optimization difficulties or loss divergence.

In addition, the training loss curve exhibits good smoothness without pronounced fluctuations, indicating that the chosen optimizer and learning-rate schedule are appropriate and that the model achieves efficient learning and stable convergence. The final loss remains at a low level, providing a solid foundation for subsequent classification performance and demonstrating that the model possesses strong fitting capability.

4.2. Comparison with Baseline Models

To further validate the effectiveness of the proposed model in commercial vehicle driving style recognition, this section compares the classification performance of different types of neural networks. Given that driving behavior data exhibit clear temporal characteristics and multidimensional perceptual variables, three representative network architectures are selected as comparison baselines: a one-dimensional convolutional neural network (1D-CNN), a two-dimensional convolutional neural network (2D-CNN), and a multilayer perceptron (MLP). The 1D-CNN is capable of directly extracting local features along the temporal dimension and is well suited for one-dimensional sequential data. The 2D-CNN is typically used for processing two-dimensional structured data and is introduced here to explore its capability in combining features through two-dimensional convolution. The MLP, as a traditional fully connected network, provides a performance reference in the absence of convolutional mechanisms.

Before presenting the quantitative results, the architectures of all compared models are briefly summarized for clarity and reproducibility. The baseline networks (2D-CNN, MLP, and 1D-CNN) are lightweight structures composed of two convolutional (or fully connected) layers followed by two fully connected layers, with parameter sizes ranging from 0.001 M to 0.010 M. In contrast, the proposed MSCA-Net adopts a medium-depth architecture consisting of six convolutional layers and two fully connected layers, resulting in approximately 0.015 M trainable parameters. The multi-scale convolution branches include three convolutional layers, and an additional convolutional layer is employed to align features across different scales, while the remaining layers follow the same backbone design as the baseline models. Despite the increased depth, MSCA-Net remains within the same order of magnitude in terms of parameter size, ensuring a fair and meaningful performance comparison.

By comparing the classification accuracy and confusion matrix performance of the three baseline networks with that of the proposed MSCA-Net under the same dataset and consistent training strategy, the adaptability and limitations of each architecture in driving style recognition can be systematically evaluated. This comparison provides a foundation for model selection and practical engineering applications. Table 4 presents the classification accuracy for the three driving styles along with the overall evaluation metrics.

From an overall performance perspective, MSCA-Net achieves the best results across all evaluation metrics. The average accuracy (AA) and overall accuracy (OA) reach 97.97% and 99.12%, respectively. In addition, it outperforms the strongest baseline, 1D-CNN, in terms of F1-score (99.11% versus 96.92%) and Macro-F1 (98.70% versus 95.06%), indicating that the proposed method is capable of maintaining balanced recognition performance across different driving styles under class-imbalanced conditions. Among the baselines, 1D-CNN ranks highest in overall performance, whereas MLP and 2D-CNN exhibit relatively inferior results, with 2D-CNN showing noticeable performance degradation in AA, OA, and Macro-F1.
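As a consistency check, the aggregate metrics for MSCA-Net in Table 4 can be recomputed from its per-class precision and recall together with the test-set sizes in Table 2. The sketch below assumes that AA is the unweighted mean of per-class recall, OA is the support-weighted mean of recall, F1-score is the support-weighted mean of per-class F1, and Macro-F1 is the unweighted mean of per-class F1. These definitions are assumptions rather than statements from this section, and the function names are illustrative.

```python
# Per-class results for MSCA-Net from Table 4 (precision %, recall %)
# and test-set sizes from Table 2.
results = {
    "moderate":     {"precision": 98.91,  "recall": 99.89, "support": 6556},
    "aggressive":   {"precision": 99.44,  "recall": 97.41, "support": 1273},
    "conservative": {"precision": 100.00, "recall": 96.61, "support": 1152},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2.0 * precision * recall / (precision + recall)


def aggregate_metrics(per_class):
    """Return AA, OA, weighted F1 and Macro-F1 under the assumed definitions."""
    total = sum(c["support"] for c in per_class.values())
    recalls = [c["recall"] for c in per_class.values()]
    f1s = [f1(c["precision"], c["recall"]) for c in per_class.values()]
    weights = [c["support"] / total for c in per_class.values()]
    return {
        "AA": sum(recalls) / len(recalls),                   # unweighted mean recall
        "OA": sum(w * r for w, r in zip(weights, recalls)),  # support-weighted recall
        "F1": sum(w * f for w, f in zip(weights, f1s)),      # support-weighted F1
        "Macro-F1": sum(f1s) / len(f1s),                     # unweighted mean F1
    }


metrics = aggregate_metrics(results)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed definitions, the recomputed values agree with the MSCA-Net column of Table 4 to two decimal places. This also illustrates why OA exceeds AA here: OA is dominated by the large moderate class.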

From a class-wise perspective, all models achieve high recall for the conservative class (96.52% to 100.00%), suggesting that this category exhibits strong intra-class consistency. However, 2D-CNN shows low precision for this class (65.57%), indicating that many samples from other classes are misclassified as conservative. For the moderate and aggressive classes, the performance differences among models become more pronounced. Specifically, 2D-CNN suffers from low recall (91.04% and 81.93%, respectively), while MLP still exhibits an imbalance between precision and recall (99.77%/94.11% for the moderate class and 85.53%/92.85% for the aggressive class). In contrast, MSCA-Net achieves both high and well-balanced precision and recall for these two classes, effectively mitigating the bias toward majority classes.

Further analysis based on F1-score and Macro-F1 confirms that MSCA-Net maintains stable performance across both majority and minority classes. This improvement can be attributed to the multi-scale convolutional structure, which effectively captures driving behavior patterns at different temporal scales, as well as the ECA channel attention mechanism that enhances discriminative feature representations. Consequently, the proposed model demonstrates strong robustness and generalization capability under complex operating conditions and class-imbalanced scenarios.

4.3. Ablation Study

To further investigate the contribution of individual components, an ablation study is conducted to evaluate the effects of dimensionality reduction and network architecture on overall recognition performance.

4.3.1. Effect of KPCA-Based Dimensionality Reduction

To evaluate the impact of the KPCA-based dimensionality reduction technique on the proposed driving style recognition framework, an ablation study was performed by comparing model performance under two settings, i.e., with and without the KPCA preprocessing step. The network architecture, training strategy, and hyperparameter configurations were strictly kept consistent to ensure fairness and reliability of the comparison.

Table 5 presents the quantitative results under the two experimental settings. Incorporating KPCA consistently improves recognition performance across all driving-style categories. Specifically, the average accuracy (AA) increases from 96.03% to 97.97%, while the overall accuracy (OA) improves from 98.27% to 99.12%. These results demonstrate that the KPCA-based preprocessing effectively enhances the overall classification performance of the model. When KPCA is applied, the original multivariate CAN time-series signals are projected into a low-dimensional feature space to reduce feature redundancy and suppress noise while preserving the primary nonlinear characteristics associated with driving behaviors. In contrast, without KPCA, the raw time-series signals are directly fed into the network, requiring the model to rely solely on the subsequent deep architecture to learn discriminative feature representations.

In this work, KPCA is used only as an auxiliary preprocessing step rather than a substitute for deep feature extraction. Under both experimental settings, the subsequent multi-scale convolutional modules and channel attention mechanisms remain unchanged and are responsible for learning hierarchical temporal feature representations. The experimental results indicate that combining nonlinear dimensionality reduction with deep time-series modeling contributes to improved training stability and enhanced generalization performance on real-world commercial vehicle CAN data.

4.3.2. Effect of Network Architecture

To verify the rationality of the key architectural designs in the proposed MSCA-Net and to quantify their actual contributions to the driving style recognition task, this section conducts ablation experiments to perform a module-level decomposition and analysis of the network. Considering the pronounced differences in driving behaviors in terms of temporal scales and feature importance, the experiments separately evaluate the individual effects of the multi-scale convolutional structure and the ECA channel attention module, as well as their combined impact on overall recognition performance. Specifically, four model configurations are investigated: a baseline model containing neither the multi-scale convolution module nor the ECA module, a model incorporating only the multi-scale convolution module (Network A), a model incorporating only the ECA attention module (Network B), and the complete MSCA-Net integrating both components. By comparing the recognition results under different configurations, the functional roles of each module in modeling driving behaviors can be systematically analyzed.

The experimental results shown in Table 6 indicate that the multi-scale convolutional structure plays a critical role in improving overall recognition accuracy. When the multi-scale convolution module is introduced into the baseline model (Network A), the overall accuracy (OA) increases from 96.57% to 97.32%. This improvement suggests that convolution kernels operating at different temporal scales can effectively capture the discrepancies between short-term operational fluctuations and long-term driving trends. For example, instantaneous variations in accelerator pedal input and the global evolution of vehicle speed typically exhibit distinct temporal characteristics, which are difficult to model simultaneously using single-scale convolution. By employing parallel receptive fields, the multi-scale convolution mechanism enhances the model's ability to represent complex temporal features, which is reflected in the higher OA. In contrast, the average accuracy (AA) remains essentially unchanged (96.59% for the baseline versus 96.57% for Network A), suggesting that this module primarily strengthens the discrimination of dominant driving behavior patterns.
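The idea of parallel receptive fields can be illustrated with a minimal pure-Python sketch. The kernel sizes (3, 5, 7), the fixed averaging weights, and the toy pedal trace are illustrative assumptions, not the trained MSCA-Net parameters; in the actual network the kernel weights are learned.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding so the output keeps the input length.

    The kernel length is expected to be odd.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[t + j] for j, k in enumerate(kernel))
        for t in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(3, 5, 7)):
    """Run parallel branches with different receptive fields.

    Each branch uses a fixed averaging kernel as a hypothetical stand-in
    for learned weights. Returns one feature sequence per branch.
    """
    branches = []
    for size in kernel_sizes:
        kernel = [1.0 / size] * size
        branches.append(conv1d_same(signal, kernel))
    return branches


# Toy accelerator-pedal trace: a short spike on top of a flat baseline.
pedal = [0.2, 0.2, 0.2, 0.9, 0.2, 0.2, 0.2, 0.2, 0.2]
features = multi_scale_features(pedal)
for size, branch in zip((3, 5, 7), features):
    print(size, [round(v, 3) for v in branch])
```

In this toy case the shortest kernel preserves more of the spike amplitude at the spike position (about 0.433) than the longest kernel (0.3), which instead follows the slower trend. Stacking the branch outputs gives later layers access to both views of the same signal.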

In comparison, when only the ECA channel attention module is incorporated (Network B), the improvement in OA is relatively modest, whereas the AA increases noticeably to 97.39%. This observation suggests that the ECA module does not primarily enhance performance by introducing additional temporal structures, but rather by adaptively modeling the importance distribution across different feature channels, thereby alleviating the imbalance in feature contributions among driving style categories. In the context of driving style recognition, different styles often rely on distinct combinations of key vehicle signals. The ECA mechanism is able to emphasize channels that are strongly correlated with specific driving styles, which improves the recognition performance for minority or easily confused classes and leads to a more balanced classification outcome.
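The channel reweighting performed by an ECA-style module can be sketched in pure Python as follows. The three-tap kernel values and the toy feature channels are hypothetical and chosen only for illustration; in a trained network the kernel is learned and the channels are intermediate feature maps rather than raw signals.

```python
import math


def eca_reweight(features, kernel):
    """ECA-style reweighting of a [channels][time] feature array.

    features: list of per-channel sequences.
    kernel:   odd-length weights of the 1D convolution applied across
              channels (hypothetical values here; learned in practice).
    Returns the reweighted features and the per-channel attention weights.
    """
    # 1) Global average pooling over the time axis, one value per channel.
    pooled = [sum(ch) / len(ch) for ch in features]

    # 2) Local cross-channel interaction: 1D convolution over neighbouring
    #    channels with zero padding (no dimensionality reduction).
    half = len(kernel) // 2
    padded = [0.0] * half + pooled + [0.0] * half
    scores = [
        sum(k * padded[c + j] for j, k in enumerate(kernel))
        for c in range(len(pooled))
    ]

    # 3) Sigmoid gate, then rescale every channel by its weight.
    weights = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    scaled = [[w * v for v in ch] for w, ch in zip(weights, features)]
    return scaled, weights


# Three toy feature channels with different activation levels.
feats = [
    [0.1, 0.1, 0.1, 0.1],
    [0.8, 0.9, 1.0, 0.9],
    [0.0, 0.0, 0.1, 0.0],
]
scaled, weights = eca_reweight(feats, kernel=[0.5, 2.0, 0.5])
print([round(w, 3) for w in weights])
```

The strongly activated middle channel receives the largest weight, so its contribution is preserved while weaker channels are attenuated. The only added parameters are the few kernel weights, which is consistent with the lightweight design emphasized for MSCA-Net.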

When both the multi-scale convolutional structure and the ECA attention module are jointly integrated, the model achieves the best overall performance. The complete MSCA-Net attains AA and OA values of 97.97% and 99.12%, respectively, outperforming all single-module configurations by a clear margin. These results indicate a strong complementarity between the multi-scale convolution and ECA modules: the former provides hierarchical representations of driving behaviors across different temporal scales, while the latter performs effective feature selection and enhancement across channels. Their synergistic interaction enables the model to simultaneously account for temporal heterogeneity and feature discriminability in driving behaviors, thereby significantly improving its capability to model complex driving style patterns.

5. Application Study: Driving Style-Aware AMT Shift Strategy

To evaluate the applicability of driving-style recognition in vehicle control and to demonstrate the engineering value of the proposed behavioral perception model, this study investigates its integration into Automated Manual Transmission (AMT) control strategies. Conventional AMT shift control typically relies on fixed calibration maps designed for average driving behavior, which are insufficient to accommodate driver-specific differences in throttle usage, power-demand preference, and driving aggressiveness. As a result, shift decisions often fail to achieve an optimal balance among performance, fuel economy, and shift smoothness in real-world applications [44].

By employing the MSCA-Net-based driving-style recognition model, driving-style labels can be obtained in real time and used as a low-cost, easily deployable behavioral input for AMT control. Based on these recognition results, driver-adaptive adjustments can be introduced into the conventional shift logic, such as tuning shift thresholds, throttle-mapping characteristics, and hysteresis strategies according to individual driving tendencies. This coupling enables the transmission system to better align shift behavior with driver intentions, improving both power response and shift quality, and facilitating a transition from fixed calibration to style-aware control.

Accordingly, a behavior-perception-driven AMT intelligent control framework is established, illustrating the practical potential of driving-style recognition for adaptive transmission control in commercial vehicles.

5.1. Motivation for AMT Adaptation Based on Driving Style

In conventional commercial vehicle control, AMT shift strategies typically rely on fixed calibration maps, where gear shift decisions are determined based on upshift and downshift RPM thresholds, accelerator pedal position limits, and vehicle acceleration conditions. However, significant differences exist among drivers in terms of operational rhythm, throttle control amplitude, and acceleration/deceleration preferences, making it challenging for a fixed shift strategy to achieve optimal performance and fuel efficiency across diverse driving behaviors [45,46].

Based on the driver behavior recognition model developed in this paper, drivers can be categorized into three styles: conservative, normal (the moderate class in Section 4), and aggressive. The typical shift requirements corresponding to each driving style are summarized in Table 7.

These differences indicate that dynamically adjusting shift strategy parameters according to the driver’s style can enhance vehicle performance, reduce energy consumption, and improve driving smoothness.

5.2. Framework of Driving Style–Aware AMT Shift Strategy

An overall framework for the proposed driver style-driven adaptive AMT shift strategy is presented in Figure 7. The system consists of three main modules, forming a closed-loop control process from driver behavior perception to adaptive adjustment of shift parameters.

First, the CAN signal acquisition and preprocessing module collects multivariate vehicle dynamics and driver operation data in real time. The raw signals are cleaned, normalized, and segmented using a sliding-window approach to construct structured multi-channel temporal sequences, which serve as the input to the driving style recognition model.
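The normalization and sliding-window segmentation step can be sketched as follows. The min-max scaling, the window length, the stride, and the toy signals are assumptions made for illustration; this section does not specify the values used in the actual pipeline.

```python
def min_max_normalize(channel):
    """Scale one signal channel to [0, 1]; a constant channel maps to zeros."""
    lo, hi = min(channel), max(channel)
    if hi == lo:
        return [0.0 for _ in channel]
    return [(v - lo) / (hi - lo) for v in channel]


def sliding_windows(channels, window, stride):
    """Cut a [channels][time] recording into [channels][window] samples."""
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    return [
        [ch[start:start + window] for ch in channels]
        for start in range(0, length - window + 1, stride)
    ]


# Toy recording with two channels (vehicle speed in km/h, pedal position in %).
speed = [30, 32, 35, 39, 44, 48, 50, 51, 51, 50]
pedal = [10, 20, 35, 50, 60, 40, 20, 10, 5, 5]
normalized = [min_max_normalize(speed), min_max_normalize(pedal)]
samples = sliding_windows(normalized, window=4, stride=2)
print(len(samples), "windows of shape", len(samples[0]), "x", len(samples[0][0]))
```

Each window is a multi-channel sequence that could be passed to the recognition model. In an online setting the normalization statistics would more plausibly be fixed from the training data rather than recomputed per recording as in this toy example.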

Second, the real-time driving style recognition module, based on the proposed MSCA-Net, automatically extracts temporal features and outputs driving style classifications (conservative, normal, or aggressive) along with confidence levels. The model is suitable for online deployment and provides stable behavioral prior information for AMT control.

Finally, the adaptive shift strategy adjustment module modifies key shift parameters—such as upshift and downshift RPM thresholds, pedal sensitivity, and downshift response characteristics—according to the identified driving style. By incorporating driving style information directly into shift decision-making, the AMT system can adapt its behavior to different driver patterns, balancing fuel economy and dynamic performance.

Overall, the proposed framework enables style-aware and adaptive AMT shift control based on real-time driver behavior, demonstrating practical applicability in commercial vehicle transmission systems.

5.3. Style-to-Shift Parameter Mapping

Under typical operating conditions, automatic transmission shift rules are predefined with fixed timing and patterns. However, substantial inter-driver variability leads to diverse driving habits, making static shift strategies insufficient to accommodate individual preferences. Consequently, incorporating driver behavior recognition into the shift control logic is necessary to enable adaptive adjustment of shift decisions according to driver-specific characteristics.

The driver style-based shift correction process is illustrated in Figure 8. Multivariate operational signals, such as accelerator and brake pedal positions, vehicle speed, and acceleration, are first collected and preprocessed. The resulting time-series data are then input into the driver style recognition model to identify the current driving style—conservative, normal, or aggressive—in real time. Based on the recognition outcome, a corresponding shift strategy is selected: economy-oriented logic for conservative drivers, performance-oriented logic for aggressive drivers, and a balanced strategy for normal drivers. By dynamically adjusting shift points through style-dependent offset coefficients, the AMT shift parameters are updated online, enabling adaptive shift decisions that improve both power responsiveness and fuel efficiency.

To characterize the trade-off between power performance and fuel economy, the transfer characteristics of both aspects were analyzed, and two weighting coefficients, denoted as a and b, were introduced. Based on the identified driving style, these coefficients are employed to adjust the weighted shift points and derive the corrected shift rules:

$$v_{c} = a \cdot v_{d} + b \cdot v_{e} \tag{10}$$

where $v_{d}$ and $v_{e}$ denote the vehicle speeds corresponding to performance-oriented and economy-oriented shifting strategies, respectively, while $a$ and $b$ represent the performance and economy weighting coefficients. For conservative drivers, fuel economy is emphasized by setting $a = 0$ and $b = 1$; for aggressive drivers, performance is prioritized with $a = 1$ and $b = 0$; and for normal drivers, a balanced strategy is adopted with $a = b = 0.5$. This weighted formulation enables flexible adjustment of shift logic according to driver-specific preferences.
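A minimal sketch of Equation (10) is given below, using the weighting coefficients stated above for each style. The function and variable names are illustrative, and the two shift speeds are hypothetical values for a single gear and pedal position, not calibration data from this study.

```python
# Weighting coefficients (a, b) for each recognized style, as given in the text.
STYLE_WEIGHTS = {
    "conservative": (0.0, 1.0),  # economy-oriented
    "normal": (0.5, 0.5),        # balanced
    "aggressive": (1.0, 0.0),    # performance-oriented
}


def corrected_shift_speed(style, v_performance, v_economy):
    """Apply Equation (10): v_c = a * v_d + b * v_e."""
    try:
        a, b = STYLE_WEIGHTS[style]
    except KeyError:
        raise ValueError(f"unknown driving style: {style!r}") from None
    return a * v_performance + b * v_economy


# Hypothetical upshift speeds (km/h) for one gear at one pedal position.
v_d, v_e = 48.0, 40.0
for style in STYLE_WEIGHTS:
    print(style, corrected_shift_speed(style, v_d, v_e))
```

With these example inputs, the corrected upshift speed is 40.0 km/h for a conservative driver, 44.0 km/h for a normal driver, and 48.0 km/h for an aggressive driver, so the shift point moves between the economy and performance schedules according to the recognized style.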

This section develops an adaptive AMT shifting control framework based on driving style recognition to demonstrate the potential application of the proposed recognition method in commercial vehicle transmission control. By introducing a mapping relationship between the recognized driving styles and shifting parameters, the framework provides a feasible approach for the online adjustment of shift strategies according to the driver’s operational intentions.

6. Conclusions

This paper focuses on the challenges associated with real-world heavy-duty commercial vehicle CAN bus multivariate time-series data, such as high noise levels, strong temporal dependencies, and complex operating conditions. The limitations of conventional driving style recognition methods in terms of performance and robustness under practical engineering scenarios are first analyzed. To address these issues, a driving style recognition model termed MSCA-Net, which integrates multi-scale convolutional structures with an efficient channel attention mechanism, is proposed. The model employs parallel multi-scale convolutions to capture both global temporal evolution trends and local dynamic variations of driving behaviors across different receptive fields, while the efficient channel attention mechanism adaptively reweights key CAN signal channels. As a result, MSCA-Net achieves unified modeling of temporal-scale heterogeneity and feature importance, avoiding reliance on handcrafted feature design. Experimental results based on real-world heavy-duty commercial vehicle CAN data collected under multiple operating conditions demonstrate that MSCA-Net outperforms comparative methods in terms of overall accuracy, average accuracy, and class discrimination capability, and exhibits strong robustness in the presence of noise and complex driving conditions. On this basis, the potential application of driving style recognition in adaptive AMT shift control is preliminarily explored, providing a reference for the development of intelligent and adaptive control strategies for commercial vehicle transmissions.

Author Contributions

Conceptualization, X.N., X.L. and Z.L.; methodology, X.N.; software, Z.L. and B.J.; validation, X.N. and X.L.; resources, X.N. and Z.L.; data curation, Z.L.; writing—original draft preparation, X.N.; writing—review and editing, X.N. and B.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy restrictions.

Conflicts of Interest

Authors Xingfu Nie, Zun Li, and Bo Ji are employed by Shaanxi Fast Automobile Transmission Engineering Research Institute, which is directly invested in and managed by Shaanxi Fast Auto Drive Group Co., Ltd. The company had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Seraj, M. The Classification of Short- and Long-Term Driving Behavior for an Advanced Driver Assistance System by Analyzing Bidirectional Driving Features. arXiv 2023, arXiv:2302.14743.
2. Guo, Y.; Liu, P.; Yuan, Q.; Liu, P.; Xu, J.; Zhang, H. A Review of Road Traffic Safety for Connected and Automated Vehicles. J. Traffic Transp. Eng. 2023, 23, 19–38.
3. Al-refai, G.; Al-refai, M.; Alzu’bi, A. Driving style and traffic prediction with artificial neural networks using on-board diagnostics and smartphone sensors. Appl. Sci. 2024, 14, 5008.
4. Li, G.; Li, S.E.; Cheng, B.; Green, P. Estimation of Driving Style in Naturalistic Highway Traffic Using Maneuver Transition Probabilities. Transp. Res. Part C Emerg. Technol. 2017, 74, 113–125.
5. Li, K.; Dai, Y.; Li, S.; Bian, M. Development Status and Trends of Intelligent Connected Vehicle (ICV) Technologies. J. Automot. Saf. Energy 2017, 8, 1–14.
6. Peng, J.; Tang, H.; Wang, C.; Gu, X.; Peng, H. Intelligent Vehicle Lane-Changing Intention Identification Method with Driving Style Recognition. In Proceedings of the 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD); IEEE: Tianjin, China, 2024; pp. 3036–3041.
7. Szumska, E.M.; Jurecki, R. The effect of aggressive driving on vehicle parameters. Energies 2020, 13, 6675.
8. Bayi Boumal, F.L.; Aboud, A.; Mwanabute, S.; Nguietchuan, J.; Banquando, J. Influence of Driver Behavior on Fuel Efficiency in Intercity Buses: A Simulation-Based Study of the Yaoundé-Douala Corridor in Cameroon. Acad. J. Sci. Technol. 2025, 15, 27–36.
9. Shahariar, G.H.; Bodisco, T.A.; Zare, A.; Sajjad, M.; Jahirul, M.I.; Van, T.C.; Bartlett, H.; Ristovski, Z.; Brown, R.J. Impact of driving style and traffic condition on emissions and fuel consumption during real-world transient operation. Fuel 2022, 319, 123874.
10. Sagberg, F.; Selpi; Bianchi Piccinini, G.F.; Engström, J. A review of research on driving styles and road safety. Hum. Factors 2015, 57, 1248–1275.
11. Xia, G.; Gao, J.; Tang, X.; Wang, S.; Sun, B. Control strategy for shift schedule correction based on driving habits for vehicles with automatic transmission. Int. J. Automot. Technol. 2020, 21, 407–418.
12. Lee, E.H.; Yun, H.; Cho, S.H.; Lee, E. Aggressive driving in ride-hailing: Work hours and road conditions. J. Transp. Saf. Secur. 2025, 1–24.
13. Wang, P.; Chen, Z.; Liu, W.T.; Majumdar, A.; Tsai, C.Y. Association between the risk of aberrant driving behavior and sleep disorder indices: A pilot study involving urban taxi drivers. J. Transp. Health 2025, 40, 101942.
14. Qin, D.; Zhan, S.; Zeng, Y.; Su, L. Energy Management Strategy for Hybrid Electric Vehicles Based on Driving Style Recognition. J. Mech. Eng. 2016, 52, 162–169.
15. Chu, D.; Deng, Z.; He, Y.; Wu, C.; Sun, C.; Lu, Z. Curve Speed Model for Driver Assistance Based on Driving Style Classification. IET Intell. Transp. Syst. 2017, 11, 501–510.
16. Sun, G.; Rong, J.; Chang, X.; Liu, S.; Gao, Y. Driving Behavior Risk Assessment Method Based on Driving Behavior Pattern Transition. Automob. Technol. 2021, 11, 22–29.
17. Hao, J.; Yu, Z.; Zhao, Z.; Zhan, X.; Shen, P. Driving Style Recognition for Hybrid Electric Vehicles. Automot. Eng. 2017, 39, 1444–1450.
18. Zhao, D.; Zhong, Y.; Fu, Z.; Hou, J.; Zhao, M. A Review of Driving Behavior Recognition Methods Based on Vehicle Multisensor Information. J. Adv. Transp. 2022, 2022, 7287511.
19. Wang, K.; Yang, Y.; Wang, S.; Yang, Z.; Zhang, J. Driving Style Clustering and Recognition. J. Hubei Univ. Automot. Technol. 2021, 35, 1–6.
20. de Zepeda, M.V.N.; Meng, F.; Su, J.; Zeng, X.-J.; Wang, Q. Dynamic Clustering Analysis for Driving Style Identification. Eng. Appl. Artif. Intell. 2021, 97, 104096.
21. Cordero, J.; Aguilar, J.; Aguilar, K.; Chávez, D.; Puerto, E. Recognition of the Driving Style in Vehicle Drivers. Sensors 2020, 20, 2597.
22. Li, J.; Zhao, Z.; Shen, P.; Guo, Q. Driving Style Recognition Based on K-Means Clustering. Automob. Technol. 2018, 12, 8–12.
23. Li, L.; Yang, J.; Liu, S.; Zhang, X.; Guo, W.; Li, P. Classification and Recognition of Driving Styles for Domestic Drivers. J. Chongqing Univ. Technol. (Nat. Sci.) 2019, 33, 33–40.
24. Qi, G.; Du, Y.; Wu, J.; Xu, M. Leveraging Longitudinal Driving Behaviour Data with Data Mining Techniques for Driving Style Analysis. IET Intell. Transp. Syst. 2015, 9, 792–801.
25. Han, W.; Wang, W.; Li, X.; Xi, J. Statistical-Based Driving Style Recognition Using Bayesian Probability with Kernel Density Estimation. IET Intell. Transp. Syst. 2019, 13, 22–30.
26. Lee, G.S. Machine Learning-Based Driving Style Classification Using Real-World Data. In Proceedings of the 4th International Conference on Artificial Intelligence, Robotics, and Communication (ICAIRC); IEEE: Xiamen, China, 2024; pp. 53–57.
27. Gao, J.; Li, T.; Li, Z.; Xi, J.; Liu, P. Driving Style Recognition and Shift Control Strategy Considering Traffic Conflicts. Mod. Manuf. Eng. 2023, 09, 69–76.
28. Xue, Q.; Wang, K.; Lu, J.J.; Liu, Y. Rapid Driving Style Recognition in Car-Following Using Machine Learning and Vehicle Trajectory Data. J. Adv. Transp. 2019, 2019, 9085238.
29. Higgs, B.; Abbas, M. Segmentation and Clustering of Car-Following Behavior: Recognition of Driving Patterns. IEEE Trans. Intell. Transp. Syst. 2014, 16, 81–90.
30. Yan, S.; Teng, Y.; Smith, J.S.; Zhang, B. Driver Behavior Recognition Based on Deep Convolutional Neural Networks. In Proceedings of the ICNC-FSKD; IEEE: Changsha, China, 2016; pp. 636–641.
31. Wei, Y.; Zeng, X.; Chen, X.; Zhang, H.; Yang, Z.; Li, Z. HTSA-LSTM: Leveraging Driving Habits for Enhanced Long-Term Urban Traffic Trajectory Prediction. Appl. Sci. 2025, 15, 2922.
32. He, X.; Xu, L.; Zhang, Z. Driving Behaviour Characterisation Using Phase-Space Reconstruction and Pre-Trained CNN. IET Intell. Transp. Syst. 2019, 13, 1173–1180.
33. Cheng, Z.; Duan, Y.; Yang, M.; Feng, Z.; Wang, H.; Zhu, X.; Bao, L. Recognition of Dangerous Driving and Driving Styles in Weaving Areas Based on Hybrid Neural Networks. J. Automot. Saf. Energy 2025, 16, 688–697.
34. Zhang, H.; Nan, Z.; Yang, T.; Liu, Y.; Zheng, N. A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV); IEEE: Las Vegas, NV, USA, 2020; pp. 284–289.
35. Yu, J.; Sun, Z.; Yu, C. Unsupervised Learning of Fine-Grained and Explainable Driving Style Representations from Car-Following Trajectories. Appl. Sci. 2025, 15, 10041.
36. Liang, K.; Zhao, Z.; Li, W.; Zhou, J.; Yan, D. Comprehensive Identification of Driving Style Based on Vehicle Driving Cycle Recognition. IEEE Trans. Veh. Technol. 2022, 72, 312–326.
37. Dong, Y.; Hu, Z.; Uchimura, K.; Murayama, N. Driver inattention monitoring system for intelligent vehicles: A review. IEEE Trans. Intell. Transp. Syst. 2010, 12, 596–614.
38. Ohn-Bar, E.; Trivedi, M.M. Looking at humans in the age of self-driving and highly automated vehicles. IEEE Trans. Intell. Veh. 2016, 1, 90–104.
39. Tan, D.; Tian, W.; Wang, C.; Chen, L.; Xiong, L. Driver distraction behavior recognition for autonomous driving: Approaches, datasets and challenges. IEEE Trans. Intell. Veh. 2024, 9, 8000–8026.
40. Lee, K.W.; Yoon, H.S.; Song, J.M.; Park, K.R. Convolutional neural network-based classification of driver’s emotion during aggressive and smooth driving using multi-modal camera sensors. Sensors 2018, 18, 957.
41. Haydari, A.; Yılmaz, Y. Deep reinforcement learning for intelligent transportation systems: A survey. IEEE Trans. Intell. Transp. Syst. 2020, 23, 11–32.
42. Hallac, D.; Bhooshan, S.; Chen, M.; Abida, K.; Sosic, R.; Leskovec, J. Drive2vec: Multiscale State-Space Embedding of Vehicular Sensor Data. In Proceedings of the IEEE ITSC; IEEE: Maui, HI, USA, 2018; pp. 3233–3238.
43. Hanselmann, M.; Strauss, T.; Dormann, K.; Ulmer, H. CANet: An Unsupervised Intrusion Detection System for High-Dimensional CAN Bus Data. IEEE Access 2020, 8, 58194–58205.
44. He, Y.; Sui, S.; Wang, Q.; Jin, Y.; Zhang, L.; Wang, J. Super-High-Speed AMT Shifting Strategy and Energy Consumption Optimization for Electric Vehicles. Energy 2025, 322, 135489.
45. Xia, G.; Zhang, H.; Tang, X.; Wu, S.; Zhao, L.; Hu, J. Identification of Drivers’ Driving Habits and Shift Schedule Correction for Vehicles with Automatic Transmission. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2022, 236, 805–824.
46. Lin, X.; Li, Y.; Xia, B. An Online Driver Behavior Adaptive Shift Strategy for Two-Speed AMT Electric Vehicles Based on Dynamic Corrected Factors. Sustain. Energy Technol. Assessments 2021, 48, 101598.

Figure 1. Workflow of Driver Style Recognition.

Figure 2. Model Training Pipeline.

Figure 3. The Proposed MSCA-Net for Driver Style Recognition.

Figure 4. Data Acquisition Scheme.

Figure 5. Confusion Matrix of Driver Style Recognition Results.

Figure 6. Iterative Evolution of the Loss Value.

Figure 7. Framework of the Adaptive Shift Strategy Based on Driver Style Recognition.

Figure 8. Flowchart of the Adaptive Shift Strategy.

Table 1. Feature Categories and Descriptions.

Feature	Description
Speed	Speed control behavior and driving stability.
Acceleration	Acceleration and deceleration intensity and smoothness.
Throttle	Driver power demand and throttle response.
Braking	Braking intensity and braking behavior patterns.

Table 2. Sample Distribution Across Driving-Style Categories.

Class	Training Set	Validation Set	Test Set	Total
Moderate	8194	1639	6556	16,389
Aggressive	1592	318	1273	3183
Conservative	1440	288	1152	2880
Total	11,226	2245	8981	22,452
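
The split proportions and class balance implied by Table 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and only the counts come from the table.

```python
# Sample counts copied from Table 2: (training, validation, test) per class.
counts = {
    "Moderate": (8194, 1639, 6556),
    "Aggressive": (1592, 318, 1273),
    "Conservative": (1440, 288, 1152),
}

# Column totals give the size of each split.
split_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(split_totals)

# Fraction of all samples that falls in each split.
split_fractions = [n / grand_total for n in split_totals]

# Share of each driving style in the full dataset.
class_shares = {name: sum(row) / grand_total for name, row in counts.items()}

print("split totals:", split_totals)
print("split fractions:", [round(f, 3) for f in split_fractions])
print("class shares:", {k: round(v, 3) for k, v in class_shares.items()})
```

With these counts the training, validation, and test sets hold roughly 50%, 10%, and 40% of the samples, and the Moderate class accounts for about 73% of the data, so the class imbalance is worth keeping in mind when reading the per-class results.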

Table 3. Experimental Platform Configuration.

Component	Specification
CPU	Intel Core i5-13500H
Memory	16 GB
Programming Environment	Python 3.8
Deep Learning Framework	PyTorch 1.8.1

Table 4. Comparison of Classification Performance Across Different Models.

Class/Metric	2D-CNN (Precision/Recall, %)	MLP (Precision/Recall, %)	1D-CNN (Precision/Recall, %)	MSCA-Net (Precision/Recall, %)
Moderate	95.67/91.04	99.77/94.11	99.11/97.20	98.91/99.89
Aggressive	99.71/81.93	85.53/92.85	98.74/92.22	99.44/97.41
Conservative	65.57/96.52	81.34/99.91	84.58/100.00	100.00/96.61
AA (%)	89.84	95.62	96.48	97.97
OA (%)	90.46	94.67	96.86	99.12
F1-score (%)	90.88	94.83	96.92	99.11
Macro-F1 (%)	87.11	91.86	95.06	98.70
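
As a worked check on how the Macro-F1 row relates to the per-class entries, the sketch below recomputes it for MSCA-Net from the precision and recall values in Table 4. It assumes Macro-F1 is the unweighted mean of the per-class F1 scores (the usual definition); the function and variable names are illustrative.

```python
# Per-class (precision, recall) in percent for MSCA-Net, copied from Table 4.
msca_net = {
    "Moderate": (98.91, 99.89),
    "Aggressive": (99.44, 97.41),
    "Conservative": (100.00, 96.61),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Per-class F1, then the unweighted (macro) average over the three classes.
per_class_f1 = {name: f1(p, r) for name, (p, r) in msca_net.items()}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"Macro-F1: {macro_f1:.2f}")
```

Under this assumption the recomputed value agrees with the 98.70% reported for MSCA-Net to two decimals. The table entries are themselves rounded, so small last-digit differences can appear when the same check is run for the other models.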

Table 5. Performance Comparison With and Without KPCA Preprocessing.

Class	Without KPCA	With KPCA
Moderate	99.10	99.89
Aggressive	94.80	97.41
Conservative	94.20	96.61
AA (%)	96.03	97.97
OA (%)	98.27	99.12
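
The AA row in Table 5 can be reproduced from the three per-class rows. The sketch below assumes AA is the unweighted mean of the per-class values, and that those values are per-class accuracies in percent (the "With KPCA" column matches the MSCA-Net recall values in Table 4). The helper name `average_accuracy` is illustrative.

```python
# Per-class values (percent) copied from Table 5.
without_kpca = {"Moderate": 99.10, "Aggressive": 94.80, "Conservative": 94.20}
with_kpca = {"Moderate": 99.89, "Aggressive": 97.41, "Conservative": 96.61}


def average_accuracy(per_class: dict) -> float:
    """Unweighted mean of the per-class values, in percent."""
    return sum(per_class.values()) / len(per_class)


aa_without = average_accuracy(without_kpca)
aa_with = average_accuracy(with_kpca)

print(f"AA without KPCA: {aa_without:.2f}")
print(f"AA with KPCA:    {aa_with:.2f}")
```

Because AA weights every class equally, it is more sensitive than OA to the smaller Aggressive and Conservative classes, which is where most of the gain from KPCA shows up in the table.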

Table 6. Results of the Ablation Study.

Model	Multi-Scale Conv	ECA	AA (%)	OA (%)
Baseline	×	×	96.59	96.57
Net A	×	✔	96.57	97.32
Net B	✔	×	97.39	96.92
MSCA-Net	✔	✔	97.97	99.12

Table 7. Adjustment Principles of AMT Shift Strategies Under Different Driver Styles.

Driving Style	Behavioral Characteristics	Shift Strategy Adjustment Principles
Conservative	Small throttle openings	Lower upshift RPM
	Low acceleration variability	Conservative downshift logic
	Smooth speed profiles	Fuel-efficiency-oriented engine speed
Normal	Moderate throttle operation	Standard calibrated shift strategy
	Medium acceleration and speed variations	Balanced performance and fuel economy
Aggressive	Large throttle openings	Higher upshift RPM
	High acceleration variability	More sensitive and earlier downshifts
	Strong dynamic demand	High engine speed preference
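
The qualitative principles in Table 7 could be encoded as a simple lookup from recognized style to shift-parameter adjustments. The following is a minimal sketch under stated assumptions: the class `ShiftAdjustment`, the function `adjusted_upshift_rpm`, and all numeric offsets are hypothetical placeholders, not calibrated values from the paper. Only the direction of each adjustment follows the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftAdjustment:
    upshift_rpm_offset: int  # hypothetical offset added to the calibrated upshift speed
    downshift_behavior: str  # qualitative downshift principle taken from Table 7


# Directions follow Table 7; the numeric offsets are placeholders only.
STYLE_TO_ADJUSTMENT = {
    "Conservative": ShiftAdjustment(-150, "conservative downshift logic"),
    "Normal": ShiftAdjustment(0, "standard calibrated strategy"),
    "Aggressive": ShiftAdjustment(200, "more sensitive and earlier downshifts"),
}


def adjusted_upshift_rpm(base_rpm: int, style: str) -> int:
    """Shift the calibrated upshift speed according to the recognized style."""
    return base_rpm + STYLE_TO_ADJUSTMENT[style].upshift_rpm_offset


# Example with a hypothetical calibrated upshift speed of 1500 rpm.
for style in STYLE_TO_ADJUSTMENT:
    print(style, adjusted_upshift_rpm(1500, style))
```

A table-driven mapping like this keeps the standard calibration untouched for the Normal style and makes the style-dependent offsets easy to review or recalibrate separately.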

