Article

Optimizing Multi-View CNN for CAD Mechanical Model Classification: An Evaluation of Pruning and Quantization Techniques

Polytechnic School of Pernambuco (POLI), University of Pernambuco (UPE), Recife 50720-001, Brazil
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(5), 1013; https://doi.org/10.3390/electronics14051013
Submission received: 21 January 2025 / Revised: 22 February 2025 / Accepted: 25 February 2025 / Published: 3 March 2025

Abstract

In the realm of product design and development, efficient retrieval and reuse of 3D CAD models are vital for optimizing workflows and minimizing redundant efforts. Manual labeling of CAD models, while traditional, is labor-intensive and prone to inconsistency, highlighting the need for automated classification systems. Multi-view convolutional neural networks (MVCNNs) offer an automated solution by leveraging 2D projections to represent 3D objects, balancing high classification accuracy with computational efficiency. Despite their effectiveness, the computational demands of MVCNNs pose challenges in large-scale CAD applications. This study investigates optimization strategies, namely pruning and quantization, applied to an MVCNN for the classification of 3D CAD mechanical models. Using different pruning and quantization strategies, we evaluate trade-offs between classification accuracy, execution time, and memory usage. In our evaluation, 8-bit quantization reduced the memory used by the model from 83.78 MB to 21.01 MB, with accuracy only slightly decreasing from 93.83% to 93.59%. When applying 25% structured pruning, the model’s memory usage was reduced to 47.16 MB, execution time decreased from 133 to 97 s, and accuracy decreased to 92.14%. A combined approach of 25% pruning and 8-bit quantization achieved even better resource efficiency, with memory usage at 11.86 MB, execution time at 99 s, and accuracy at 92.06%. This combination of pruning and quantization leads to efficient MVCNN model optimization, balancing resource usage and classification performance, which is especially relevant in large-scale applications.

1. Introduction

Throughout the stages of product development, 3D model designers devote substantial time to locating relevant design information and resources to expedite the creative process. A significant portion of this design work could be streamlined by reusing or modifying existing computer-aided design (CAD) models, a commonly employed technique for saving time and costs. Efficient retrieval and reuse of CAD models are essential within CAD model management systems to optimize design workflows and minimize redundant design efforts. However, as CAD repositories continue to expand in scale and diversity, the organization of these massive collections becomes increasingly complex, making design reuse challenging [1].
Traditionally, the categorization and retrieval of 3D CAD models are achieved through a manual labeling process, requiring designers to label models with various tags and metadata. This method, while straightforward, is both time-consuming and error-prone, often leading to inconsistencies in labeling across different projects or teams [2]. Furthermore, CAD models generated across different development activities may possess unique parameters and feature definitions, which complicates classification under a unified labeling system and demands extensive data harmonization [3]. The inherent complexity of 3D CAD model structures, coupled with variations in origin, makes it difficult to establish universal rules for categorization. These challenges highlight the need for an automated classification system that can standardize the retrieval and organization of CAD models across different engineering domains.
In the context of 3D model representation, various CAD file formats exist to store geometric data. Among these formats, stereolithography (STL) has become widely adopted due to its simplicity and compatibility with both CAD systems and 3D printing workflows. STL files represent 3D surfaces as a collection of triangular faces, where each face is described by its three vertices and a normal vector indicating its orientation [4]. While STL files do not maintain advanced CAD features such as parametric relationships or construction history, their straightforward representation of surface geometry makes them particularly suitable for tasks like model classification and 3D printing preparation [4]. The format’s widespread use in industry and its ability to approximate complex geometries through triangular meshes make it the format of choice for our study.
In the era of Industry 4.0, the integration of technologies such as machine learning (ML) into industrial processes has transformed the way many businesses operate [5,6,7]. Deep learning (DL), a subset of ML, has found applications across various domains, impacting traditional practices by enhancing efficiency, accuracy, and scalability. In manufacturing, DL models are being deployed to optimize supply chains, predict equipment maintenance needs, and automate quality control processes, making production lines more adaptive and intelligent [8].
Recent advancements in deep learning, particularly in the use of convolutional neural networks (CNNs) for image-based tasks, offer promising solutions for the automated classification of CAD models [9]. Various deep learning approaches have been developed to tackle 3D model classification, each leveraging different data representations. Voxel-based models, such as NormalNet [10], convert 3D objects into grid-like structures, enabling machine learning models to operate directly on volumetric data. Mesh-based models, on the other hand, focus on representing an object’s surface geometry, capturing intricate structural details by using interconnected polygonal faces. Point cloud-based models, such as point cloud convolutional neural networks (PCNNs) [11], work by processing the raw 3D coordinates of points that represent the object’s surface, allowing for a sparse but detailed representation of the shape. While these approaches provide unique advantages in specific contexts, they often require greater training and computational resources compared to multi-view convolutional neural networks (MVCNNs) [12].
MVCNNs work by representing 3D objects with a series of 2D views taken from different angles. By leveraging these multiple perspectives, MVCNNs capture an object’s spatial information more effectively than single-view models, while avoiding the high computational burden of processing complex 3D data directly. This multi-view approach enables MVCNNs to maintain the high performance of 3D representations while operating within the computationally efficient framework of 2D CNNs [13]. For these reasons, the image-based MVCNN approach demonstrates advantages in classification accuracy compared to voxel-based, mesh-based, and point cloud-based methods [12,13,14].
However, despite being more computationally efficient than some methods that process 3D data directly, MVCNNs are still more demanding in terms of computational resources than traditional 2D CNNs due to the need to process multiple projections for each object, particularly in large-scale CAD repositories [15].
To address these challenges, techniques such as pruning and quantization can be employed to optimize deep learning models. Both are used to compress machine learning models. Quantization saves bits in the representation of weights by reducing their precision, from 32 bits to 8 bits, for example. This decreases the memory needed to store the network, at the potential cost of reduced model accuracy [16]. Pruning also compresses the model: it selectively removes less significant weights from the neural network, reducing the overall number of parameters and the number of operations needed to run the model. This improves execution time and lowers the memory used by the network, but can reduce classification accuracy [17].
By optimizing an MVCNN model through pruning and quantization, we aim to create an efficient model for CAD classification, suitable for large databases in the industry 4.0 space, using a mechanical parts dataset for classification.
The main contributions of this paper are summarized as follows:
  • We analyze the effects of varying pruning ratios on the MVCNN model’s performance, evaluating trade-offs between classification accuracy, execution time, and memory demanded by the model. Furthermore, we examine the impact of applying quantization to the original MVCNN model, assessing its effect on the performance metrics stated above.
  • In addition, we analyze the simultaneous application of both pruning and quantization and its impact on those performance indicators.
To the best of our knowledge, no other work in the literature has applied a combination of pruning and quantization techniques to an MVCNN for the classification of 3D CAD mechanical models.

2. Background

In the field of 3D object classification, several models have demonstrated state-of-the-art accuracy on generic datasets such as ModelNet40, which consists of common object categories. Among these models, one can mention architectures like RotationNet [15] and View-GCN [18], which leverage multi-view representations and advanced graph-based approaches.
Several datasets and models have been proposed for mechanical component classification, each with its strengths and limitations. The Mechanical Components Benchmark (MCB) dataset [19], for instance, offers a large number of mechanical components, but many of its classes are similar in nature, especially the cylindrical models [20].
The model studied by Li et al. [20] also obtained good results regarding classification accuracy but used a subset of the MCB database.
The work by Kuzmin et al. [21] addresses quantization and unstructured pruning and shows that, in most scenarios, quantization outperforms unstructured pruning. Additionally, it shows that a combination of quantization and lower-magnitude unstructured pruning provides better results in terms of accuracy. However, that work does not explore structured pruning. In contrast, the work by Tian et al. [22] explores structured pruning and other optimization techniques to remove low-utility components while preserving task-relevant features, achieving efficiency without sacrificing accuracy. However, that work does not explore quantization.

3. Materials and Methods

Mechanical CAD model repositories are important in advancing research on 3D classification, retrieval, and recognition tasks. These datasets provide standardized benchmarks that enable the evaluation and comparison of deep learning models applied to mechanical components [19]. Table 1 presents information on different mechanical CAD 3D model repositories: number of models, number of classes, and normalized entropy.
In order to evaluate class imbalance across databases, we employ normalized entropy, whose values range between 0 and 1. A normalized entropy of 1 corresponds to maximum uniformity, in the sense that an element is equally likely to belong to each class. A normalized entropy of 0 reflects a scenario where a single class contains all the elements and the remaining classes are empty. Normalized entropy has been used in scenarios such as image encryption [23].
The entropy H of a dataset is given by the following:
$$H = \sum_{k=1}^{K} P_k \log_2 \frac{1}{P_k},$$

in which K is the number of classes and $P_k$ is the probability of an element belonging to class k, calculated as $P_k = n_k / N$ (with $n_k$ denoting the number of elements in class k, and N denoting the total number of elements in the dataset).
The normalized entropy $H_{\mathrm{norm}}$ is given by the following:

$$H_{\mathrm{norm}} = \frac{\sum_{k=1}^{K} P_k \log_2 \frac{1}{P_k}}{\log_2 K},$$

in which $\log_2 K$ is the maximum possible entropy, occurring when all classes are equally probable.
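For concreteness, a minimal Python sketch of this computation follows; it assumes class counts are given as a plain list, and the function name is ours, not from the paper:

```python
import numpy as np

def normalized_entropy(class_counts):
    """Normalized entropy of a class distribution: 1 means perfectly
    uniform classes, 0 means a single class holds every element."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum()                  # P_k = n_k / N
    nonzero = p[p > 0]                         # treat 0 * log2(0) as 0
    h = -(nonzero * np.log2(nonzero)).sum()    # entropy H
    return h / np.log2(len(counts))            # divide by log2(K)

# Toy check: a perfectly uniform 4-class dataset gives 1.0
print(normalized_entropy([25, 25, 25, 25]))    # -> 1.0
```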
Table 1. Comparison of mechanical CAD model repositories for 3D classification.
| Repository | Number of Models | Number of Classes | Normalized Entropy |
|---|---|---|---|
| Mechanical Components Benchmark (MCB) [19] | 58,696 | 68 | 0.814 |
| CADNET [24] | 3317 | 43 | 0.984 |
| Engineering Shape Benchmark (ESB) [25] | 801 | 45 | 0.937 |
| MCB-B [19] | 18,038 | 25 | 0.848 |
The Mechanical Components Benchmark (MCB) stands out as the largest dataset, with 58,696 models across 68 classes; however, its normalized entropy of 0.814 suggests a higher class imbalance compared to the other datasets. The largest class comprises 7058 models while the smallest contains only 47. The MCB-B subset, though smaller with 18,038 models across 25 classes, exhibits a lower class imbalance, with a normalized entropy of 0.848.
According to Table 1, the dataset that has the greatest normalized entropy is CADNET, suggesting a more balanced distribution when compared to the other datasets. The Engineering Shape Benchmark (ESB) has a normalized entropy of 0.937, but its small size, 801 models in 45 classes, leads to challenges for training deep learning models.
Class imbalance can affect the convergence of learning algorithms. A theoretical analysis by Francazi et al. [26] demonstrated that data imbalance negatively impacts learning dynamics, causing sub-optimal convergence trajectories for both minority and majority classes during training.
As an attempt to decrease the effects of class imbalance, various strategies can be employed, such as oversampling the minority class or undersampling the majority class, as well as algorithmic approaches such as cost-sensitive learning, which assigns higher misclassification costs to the minority class. Additionally, ensemble methods and data augmentation techniques can be utilized to improve model performance in the presence of class imbalance [27]. In this work, class imbalance is addressed using a class weighting strategy to adjust the loss function and penalize misclassifications of underrepresented classes.

3.1. Data Preparation

The CADNET repository was selected for our study. Table 2 shows the details of the CADNET repository. While the dataset exhibits class imbalance, with some classes containing more samples than others, this imbalance is less severe compared to other repositories. As an attempt to decrease the chance of the MVCNN model becoming biased toward more frequent classes, we apply the same class weighting technique used in the original CADNET paper [24]. Mathematically, the class weight $w_c$ for a class c is defined as follows:

$$w_c = \frac{N}{C \cdot n_c},$$

in which N denotes the number of samples, C denotes the number of classes, and $n_c$ denotes the number of samples in class c.
The weight for each class is computed as the ratio of the number of samples to the product of the number of classes and the number of samples in the specific class. This weighting strategy ensures that the loss function penalizes misclassifications of underrepresented classes more heavily, thereby mitigating the effects of class imbalance during training [28].
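As an illustration, this weighting rule can be implemented in a few lines and passed to Keras through the class_weight argument of model.fit; the sketch below shows the idea and is not the authors' exact code:

```python
import numpy as np

def class_weights(labels):
    """w_c = N / (C * n_c), returned in the dict format that
    Keras' model.fit(class_weight=...) expects."""
    classes, counts = np.unique(labels, return_counts=True)
    n, c = len(labels), len(classes)
    return {int(k): n / (c * n_k) for k, n_k in zip(classes, counts)}

labels = np.array([0, 0, 0, 0, 1, 1, 2, 2])   # toy, imbalanced labels
print(class_weights(labels))
# {0: 0.667, 1: 1.333, 2: 1.333} -> rarer classes weigh more
# model.fit(x_train, y_train, class_weight=class_weights(labels), ...)
```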
To prepare the data for the MVCNN architecture, each 3D CAD model was rendered into 20 2D images. This was done using a Python (version 3.10.12) script that positions each viewpoint at a vertex of a regular dodecahedron. Different 3D objects of the database are shown in Figure 1. The code for that purpose is available at https://github.com/bharadwaj-manda/CADNET-Dataset, accessed on 5 September 2024.
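The rendering script itself is available at the repository above; as a sketch of the geometric idea only (not the actual CADNET script), the 20 viewpoints can be derived from the vertices of a regular dodecahedron using the standard golden-ratio construction, with camera and rendering details omitted:

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2                      # golden ratio

# 8 cube vertices (+/-1, +/-1, +/-1) ...
verts = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
# ... plus 12 vertices from cyclic permutations of (0, +/-1/phi, +/-phi)
for s1 in (-1, 1):
    for s2 in (-1, 1):
        verts += [(0, s1 / phi, s2 * phi),
                  (s1 / phi, s2 * phi, 0),
                  (s1 * phi, 0, s2 / phi)]

directions = np.array(verts)                    # shape (20, 3)
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
# Each unit vector is a camera direction pointing at the model's centroid,
# yielding the 20 2D views fed to the MVCNN.
print(directions.shape)                         # (20, 3)
```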

3.2. Model Architecture

This study adopts the MVCNN architecture proposed by Manda et al. [24], whose structure is depicted in Figure 2.
The network’s input is a grayscale image with dimensions of 256 × 256. The first convolutional layer contains 32 filters, each performing a 7 × 7 convolution on the input image. The resulting feature maps are then passed through multiple hidden convolutional layers before reaching the final output layer. The ReLU activation function is used throughout the network.
The hidden layers are composed of several residual blocks. In a standard convolutional network, the input to each layer is simply the output of the previous one. However, with residual connections, the input to a layer is the sum of the output from the previous layer and the value from the residual connection. The structure of the hidden layers is organized as follows.
The hidden layers are divided into five groups, each highlighted in a different shade of gray in Figure 2. Every group contains three residual blocks, and each residual block consists of two convolutional layers, resulting in six hidden layers per group. Residual connections link consecutive residual blocks within a group. At the beginning of every residual block, batch normalization is applied.
In Figure 2, solid lines are used to indicate that the number of filters remains constant, while dashed lines indicate where the number of filters is doubled. The number of filters and their sizes are annotated across the layers. All convolutions use a stride of 1. For layers marked with “*”, two additional operations precede batch normalization: a 2 × 2 max-pooling operation and a 1 × 1 convolution with a stride of 2. All hidden layers use ReLU activations.
Batch normalization is also applied to the final layer in the last group. The resulting feature maps are processed by a pooling layer that performs 4 × 4 average pooling. The pooled output is flattened into a 1D vector, which is then passed through two fully connected layers, each with 512 nodes. Both layers use ReLU activations and include dropout regularization with a probability of 0.25 to improve generalization and reduce overfitting [29].
The network contains a total of 33 layers: 30 convolutional layers (5 groups × 6 layers), 1 average pooling layer, and 2 fully connected layers. The output of the final hidden layer is fed into a fully connected layer whose number of nodes equals the number of classes. For CADNET, this corresponds to 43 nodes or a 1D vector of size 43. The vector’s elements represent class probabilities, and the class with the highest probability determines the predicted label for the input image.
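A simplified Keras sketch of one residual block, as described above, may help fix ideas; the 3 × 3 kernel size and the filters value here are illustrative placeholders, not the exact values annotated in Figure 2:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    """Two convolutional layers with batch normalization applied at the
    start of the block and a residual (identity) connection."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Activation("relu")(layers.Add()([shortcut, y]))

inputs = tf.keras.Input(shape=(256, 256, 1))     # grayscale 256 x 256 view
x = layers.Conv2D(32, 7, padding="same", activation="relu")(inputs)
x = residual_block(x, 32)                        # first of three per group
model = tf.keras.Model(inputs, x)
```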

3.3. Experimental Setup

All experiments were conducted on Google Colab, using an NVIDIA Tesla T4 GPU with 16 GB of VRAM. Key software and tools included TensorFlow and Keras, used for model development and training and for implementing and evaluating quantization and pruning.
The training process used the same hyperparameters as Manda et al. [24]: a batch size of 20, a learning rate of 0.001, categorical cross-entropy as the loss function, and the Adam optimizer. After training, the model achieved a test accuracy of 93.83%.
The network was trained on the CADNET dataset with an 80/20 train–test split, resulting in 2654 3D objects for training and 663 3D objects for testing. After training the network, quantization and pruning techniques were applied to the trained model. The quantization applied does not require retraining, while each pruned network was retrained using the same hyperparameters as the original model.

3.4. Optimization Techniques

Quantization is a model optimization technique that reduces the precision of weights, biases, or activations. This process minimizes memory usage and computational demands, making it useful for deploying neural networks in resource-constrained environments like edge devices, mobile applications, or embedded systems [30]. Quantization introduces a trade-off: it decreases the memory occupied by the model but can also decrease the model’s accuracy, especially in cases where small variations in weights significantly impact the predictions.
The quantization approach was implemented using TensorFlow and utilizes the technique that is addressed in Section 3 of the paper by Jacob et al. [31].
The dynamic range (Min and Max) of the weights in each tensor was calculated. A scaling factor (Scale) was determined to map the original floating-point values onto quantized levels. Each weight was then rounded to the nearest value on the scale, effectively representing the weights with lower precision. Bias terms were excluded from quantization because maintaining their precise representation is crucial for preserving the model’s classification accuracy without the need for retraining; if altered, the network would require retraining [31]. The calculations for the Scale and the quantized weights Q(w) are as follows [31]:
$$\mathrm{Scale} = \frac{\mathrm{Max} - \mathrm{Min}}{2^{\mathrm{Bits}} - 1},$$

in which Max and Min denote, respectively, the maximum and minimum values of the weights in the layer being quantized, and Bits is the number of bits used for quantization.

$$Q(w) = \mathrm{Round}\!\left(\frac{w - \mathrm{Min}}{\mathrm{Scale}}\right) \cdot \mathrm{Scale} + \mathrm{Min},$$

in which $Q(w)$ is the quantized weight and w is the original weight.
In this implementation, the parameters were originally stored in 32 bits, and quantization was applied to the weights of the convolutional layers and the fully connected layers. The bias terms remained unquantized, retaining their original full-precision values, in an effort to minimize changes in the model’s learned behavior [31]. Three quantization levels were chosen: 16 bits, 8 bits, and 4 bits. This method differs from other quantization techniques in that it applies quantization directly to a previously trained model, without any further retraining or fine-tuning [31].
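A minimal NumPy sketch of this post-training quantization, applied to a single weight tensor, is given below; the function name and the degenerate-range guard are ours:

```python
import numpy as np

def quantize(w, bits=8):
    """Uniform post-training quantization of one weight tensor, following
    Scale = (Max - Min) / (2^Bits - 1) and
    Q(w) = Round((w - Min) / Scale) * Scale + Min."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (2 ** bits - 1)
    if scale == 0.0:                   # all weights identical: nothing to do
        return w.copy()
    return (np.round((w - w_min) / scale) * scale + w_min).astype(w.dtype)

kernel = np.random.randn(7, 7, 1, 32).astype(np.float32)  # toy conv weights
q8 = quantize(kernel, bits=8)
print(np.unique(q8).size <= 2 ** 8)    # True: at most 256 distinct values
```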
Pruning is a model compression technique that reduces the number of non-zero weights in a neural network. It enhances computational efficiency by reducing the number of operations needed to execute the model, leading to savings in execution time and in memory occupied by the model, enabling the deployment in resource-constrained environments such as mobile devices or embedded systems [32].
Pruning can be classified into structured and unstructured approaches: structured pruning removes entire components of the network, leading to savings in model size and inference time; unstructured pruning eliminates individual weights based on their magnitude but without altering the overall network structure [32].
The method used in this work was structured pruning, which focuses on the removal of individual filters or neurons in convolutional and dense layers while preserving biases and critical layers. The process was conducted on a layer-by-layer basis [33].
Only Conv2D and Dense layers were pruned, as these contain the majority of the trainable parameters. Layers such as BatchNormalization and output layers were left unchanged from the original model to preserve its prediction accuracy [34].
For each target layer, the L1-norm of the weights was computed to assess the importance of each filter (in Conv2D layers) or neuron (in Dense layers) in influencing the model’s classification. Filters or neurons with the smallest L1-norm values were considered less important and were pruned [34]. The pruning ratio was used to determine the proportion of filters or neurons to be removed. Biases were excluded from pruning since they play a critical role in the model’s functionality.
The equations and methodology for calculating importance and selecting the top filters and neurons are explained in Section 3.1 of the work by Li et al. [34]. For dense layers, the importance of the n-th neuron is calculated as follows:
$$I_n = \sum_{i=1}^{I} |W_{i,n}|,$$

in which $I_n$ represents the importance of the n-th neuron, $W_{i,n}$ denotes the weight of the connection between the i-th input and the n-th neuron, and I is the number of input connections to the neuron. For convolutional layers, the importance of the f-th filter is calculated as follows:

$$I_f = \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{c=1}^{C} |W_{h,w,c,f}|,$$

in which $I_f$ represents the importance of the f-th filter, $W_{h,w,c,f}$ denotes the weight at position $(h, w, c, f)$ in the filter, H and W are the height and width of the filter, C is the number of input channels, and f is the index of the filter.
Filters or neurons are ranked based on their calculated importance: those with the highest $I_f$ and $I_n$ values, respectively, are deemed most important and retained. The filters or neurons with the lowest importance are pruned, the layer is adjusted to its new shape, and the process is repeated on a layer-by-layer basis until the final layer is pruned [34].
By sequentially pruning and reorganizing each layer, the network is gradually compressed, resulting in a pruned architecture by the end of the process.
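The following sketch shows the L1-norm ranking for one Conv2D kernel; the shapes and names are ours, and the rewiring of downstream layers required by the full procedure is omitted:

```python
import numpy as np

def filters_to_keep(kernel, pruning_ratio):
    """Rank Conv2D filters by L1-norm importance I_f and return the indices
    of the filters to retain. `kernel` has shape (H, W, C_in, F); for a
    Dense weight matrix of shape (I, N) the sum would run over axis 0."""
    importance = np.abs(kernel).sum(axis=(0, 1, 2))       # I_f per filter
    n_keep = kernel.shape[-1] - int(kernel.shape[-1] * pruning_ratio)
    return np.sort(np.argsort(importance)[-n_keep:])      # keep the top I_f

kernel = np.random.randn(3, 3, 16, 32)
keep = filters_to_keep(kernel, pruning_ratio=0.25)
pruned = kernel[..., keep]       # next layer's input channels must shrink too
print(pruned.shape)              # (3, 3, 16, 24)
```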
The pruning ratios chosen for this study were based on the findings by Li et al. [34], where pruning was applied to a CNN model, and the accuracy remained minimally affected up to a pruning rate of 30%. Based on this observation, a maximum pruning ratio of 25% was chosen as a conservative approach to minimize any potential impact on model accuracy, especially considering the combination of pruning and quantization. Additionally, we explored other pruning ratios, decreasing by 5% at each level, to evaluate the effect of different pruning rates on the model’s accuracy.
Due to the restructured architecture, each pruned model is fine-tuned with the same training parameters as the original model [34]. This process reduces the overall complexity and computational cost of the model while potentially impacting its ability to make accurate predictions.
Regarding the quantization levels, the work by Banner et al. [35] highlights 8-bit quantization as an effective balance, delivering strong results with minimal accuracy loss. Based on this, 8-bit quantization was selected as the baseline, while more aggressive (4-bit) and less aggressive (16-bit) quantization was also explored for comparison.
The main metrics used to evaluate the model after the application of quantization and pruning techniques were the following: execution time for classification of the complete test set, computer memory occupied by the model, and test set accuracy. To measure execution time, the original model and each quantized and pruned implementation were run three times each, and the average execution time was computed. Additionally, every test set execution was conducted in the same Colab execution environment, so that Python library dependencies and other computing factors would not affect the results obtained [36].
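A sketch of this timing protocol follows, assuming a Keras model `model` and test images `x_test` already exist; the variable and function names are ours:

```python
import time
import numpy as np

def mean_test_time(model, x_test, runs=3):
    """Classify the full test set `runs` times and return the
    average wall-clock execution time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        model.predict(x_test, verbose=0)
        times.append(time.perf_counter() - start)
    return float(np.mean(times))

# Example: mean_test_time(pruned_quantized_model, x_test)
```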

4. Results

The original MVCNN network shown in Figure 2 achieved a test accuracy of 93.83%.
The results obtained for the application of quantization and pruning techniques are summarized in Table 3.
The results provide an evaluation of the trade-offs between accuracy, execution time, and memory occupied by an MVCNN network under quantization and structured pruning. In terms of accuracy, the original model achieves the highest value at 93.83%, serving as the baseline for comparison.
Quantization effectively reduces the memory use of the model. The original model has 21,963,307 parameters. In the simulations performed, 21,939,371 parameters were reduced from 32-bit precision to the chosen quantization level (16, 8, or 4 bits). For 16-bit and 8-bit quantization, memory savings were obtained (50% for 16 bits and 75% for 8 bits) while the test accuracy decreased by less than 0.3 percentage points in both cases. For 4-bit quantization, memory usage was reduced by 87.41%, but the test accuracy was severely impacted, scoring only 23.11%. The degradation in accuracy observed with 4-bit quantization can be attributed to the significant reduction in the number of representable states. While a 32-bit floating-point representation provides approximately 4.29 billion possible states, and 8-bit quantization reduces this to 256 states, a 4-bit representation allows for only 16 states, which can be insufficient for properly capturing the complexity of deep neural network weights and activations. Lower-bit quantization also introduces more quantization noise, which can propagate through the layers of the network, especially in deep architectures and when the model is not retrained [37].
Execution time is practically unaffected by quantization, remaining in the range of 132 to 134 s for all quantization levels, which is expected as the number of calculations per execution of the model remains the same.
Structured pruning reduces the number of active weights in the model, but its impact on accuracy varies with the extent of pruning. With a pruning ratio of 5%, the technique decreased the model’s accuracy by 0.38 percentage points, while reducing the memory occupied by the model from 83.78 MB to 75.45 MB and the execution time from 133 s to 127 s. With a pruning ratio of 25%, the original model’s accuracy decreased by 1.69 percentage points, while the memory occupied by the model fell to 47.16 MB and the execution time to 97 s.
Based on the data in Table 3, the combination of pruning and quantization appears to offer savings in terms of execution time and memory usage without substantially affecting classification accuracy. The results show that applying both quantization and pruning to the original MVCNN model increases the memory savings achieved through pruning alone, while also reducing execution time. For instance, a 25% pruned model combined with 16-bit quantization reduces the memory occupied to just 23.63 MB, compared to 47.16 MB for pruning alone and 41.94 MB for 16-bit quantization alone. Despite this compression, the test accuracy of the combined technique was 92.13%.
The combination of 25% pruning and 8-bit quantization achieves a balance between memory efficiency and performance, with a test accuracy of 92.06%, an execution time of 99 s for the test set, and a memory use of just 11.86 MB. This finding indicates that structured pruning, when combined with quantization, can reduce the execution time by 25.56% (from 133 s to 99 s) and the memory occupied by the network by 85.84% (from 83.78 MB to 11.86 MB), while decreasing the classification accuracy on the test set by 1.89% in relative terms (from 93.83% to 92.06%).
These findings highlight the effectiveness of quantization and structured pruning for reducing memory usage and execution time at the cost of accuracy losses. The results indicate that while accuracy decreased by a small margin, the advantages of memory efficiency and processing time make this approach viable for real-world deployments where resource optimization is critical.
A visual comparison of the model’s performance in terms of accuracy, execution time, and memory use can be seen in Figure 3.
Observing the results in Figure 3, there is a noticeable trend in the pruning techniques applied: as the pruning ratio increases, the savings in execution time and memory used by the model increase, while the classification accuracy decreases. Quantization saves memory while only minimally affecting execution time and accuracy, as Figure 3 demonstrates. Both 8-bit and 16-bit quantization achieve significant memory savings with only a small impact on execution time and accuracy.
These techniques enhance the efficiency of deep learning models regarding execution time and occupied memory, making them more suitable for deployment in resource-constrained environments such as mobile devices and embedded systems.

5. Conclusions

This study explored the application of pruning and quantization techniques to optimize a multi-view convolutional neural network (MVCNN) designed for 3D CAD model classification. The proposed optimizations focused on reducing the memory used by the machine learning model and its execution time while maintaining good test accuracy levels. The results demonstrated that 8-bit quantization significantly reduced memory usage from 83.78 MB to 21.01 MB, while lowering the test accuracy by only 0.24 percentage points, from 93.83% to 93.59%. Similarly, structured pruning at varying levels showed that the model could tolerate pruning with small accuracy loss.
Notably, the combination of 25% pruning and 8-bit quantization achieved a balanced result, with 92.06% accuracy, an execution time of 99 s for the classification of the test set, and a dramatic reduction in the memory occupied by the model from 83.78 MB to just 11.86 MB. This optimization is particularly significant for deployment on resource-constrained devices, such as mobile phones, embedded systems, and IoT platforms, where computational efficiency, energy savings, and reduced latency are crucial.
Future studies could explore other pruning and quantization methods, or additional metrics, such as inference latency and energy consumption. Additionally, combinations of other optimization or compression techniques beyond pruning and quantization could be investigated.

Author Contributions

Conceptualization: V.P., V.S., and F.M.; methodology: V.P. and F.M.; software: V.P.; validation: V.P., V.S., and F.M.; formal analysis: V.P. and F.M.; investigation: V.P., V.S., and F.M.; resources: V.S. and F.M.; writing—original draft preparation: V.P.; writing—review and editing: V.P., V.S., and F.M.; visualization: V.P.; supervision: V.S. and F.M.; project administration: F.M. All authors have read and agreed to the published version of the manuscript.

Funding

The Article Processing Charge was funded by the University of Pernambuco (UPE).

Data Availability Statement

The code created for the pruning and quantization used in this study is available at https://github.com/Victor-Pint/StructPruning_Quantization.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CAD: computer-aided design
ML: machine learning
DL: deep learning
CNN: convolutional neural network
PCNN: point cloud convolutional neural network
MVCNN: multi-view convolutional neural network
STL: stereolithography
MCB: Mechanical Components Benchmark
ESB: Engineering Shape Benchmark

References

  1. Mandelli, L.; Berretti, S. CAD 3D Model classification by graph neural networks: A new approach based on STEP format. arXiv 2022, arXiv:2210.16815. [Google Scholar] [CrossRef]
  2. Bonino, B.; Giannini, F.; Monti, M.; Raffaeli, R. Shape and context-based recognition of standard mechanical parts in CAD models. Comput.-Aided Des. 2023, 155, 103438. [Google Scholar] [CrossRef]
  3. Fang, H.C.; Ong, S.K.; Nee, A.Y.C. Product remanufacturability assessment based on design information. Procedia CIRP 2014, 15, 195–200. [Google Scholar] [CrossRef]
  4. Iancu, C.; Iancu, D.; Stăncioiu, A. From CAD model to 3D print via “STL” file format. Fiability Durability/Fiabilitate Durabilitate 2010, 1, 73–80. [Google Scholar]
  5. Kim, H.; Yeo, C.; Lee, I.D.; Mun, D. Deep-learning-based retrieval of piping component catalogs for plant 3D CAD model reconstruction. Comput. Ind. 2020, 123, 103320. [Google Scholar] [CrossRef]
  6. Adesso, M.F.; Hegewald, R.; Wolpert, N.; Schömer, E.; Maier, B.; Epple, B.A. Automatic classification and disassembly of fasteners in industrial 3D CAD-Scenarios. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 9874–9880. [Google Scholar] [CrossRef]
  7. Ip, C.Y.; Regli, W.C. A 3D object classifier for discriminating manufacturing processes. Comput. Graph. 2006, 30, 903–916. [Google Scholar] [CrossRef]
  8. Hernavs, J.; Ficko, M.; Klančnik, L.; Rudolf, R.; Klančnik, S. Deep learning in industry 4.0—Brief overview. J. Prod. Eng. 2018, 21, 1–5. [Google Scholar] [CrossRef]
  9. Heidari, N.; Iosifidis, A. Geometric deep learning for computer-aided design: A survey. arXiv 2024, arXiv:2402.17695. [Google Scholar] [CrossRef]
  10. Wang, C.; Cheng, M.; Sohel, F.; Bennamoun, M.; Li, J. NormalNet: A voxel-based CNN for 3D object classification and retrieval. Neurocomputing 2019, 323, 139–147. [Google Scholar] [CrossRef]
  11. Atzmon, M.; Maron, H.; Lipman, Y. Point convolutional neural networks by extension operators. arXiv 2018, arXiv:1803.10091. [Google Scholar] [CrossRef]
  12. Qi, S.; Ning, X.; Yang, G.; Zhang, L.; Long, P.; Cai, W.; Li, W. Review of multi-view 3D object recognition methods based on deep learning. Displays 2021, 69, 102053. [Google Scholar] [CrossRef]
  13. Gezawa, A.S.; Zhang, Y.; Wang, Q.; Lei, Y. A review on deep learning approaches for 3D data representations in retrieval and classifications. IEEE Access 2020, 8, 57566–57593. [Google Scholar] [CrossRef]
  14. Qi, C.R.; Su, H.; Nießner, M.; Dai, A.; Yan, M.; Guibas, L.J. Volumetric and multi-view CNNs for object classification on 3d data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 5648–5656. [Google Scholar] [CrossRef]
  15. Kanezaki, A.; Matsushita, Y.; Nishida, Y. Rotationnet: Joint object categorization and pose estimation using multiviews from unsupervised viewpoints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5010–5019. [Google Scholar] [CrossRef]
  16. Wu, J.; Leng, C.; Wang, Y.; Hu, Q.; Cheng, J. Quantized convolutional neural networks for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 4820–4828. [Google Scholar] [CrossRef]
  17. Li, G.; Wang, J.; Shen, H.W.; Chen, K.; Shan, G.; Lu, Z. CNNpruner: Pruning convolutional neural networks with visual analytics. IEEE Trans. Vis. Comput. Graph. 2020, 27, 1364–1373. [Google Scholar] [CrossRef]
  18. Wei, X.; Yu, R.; Sun, J. View-GCN: View-based graph convolutional network for 3D shape analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1850–1859. [Google Scholar] [CrossRef]
  19. Kim, S.; Chi, H.G.; Hu, X.; Huang, Q.; Ramani, K. A large-scale annotated mechanical components benchmark for classification and retrieval tasks with deep neural networks. In Computer Vision—ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; Part XVIII; pp. 175–191. [Google Scholar] [CrossRef]
  20. Li, S.; Corney, J. Multi-view expressive graph neural networks for 3D CAD model classification. Comput. Ind. 2023, 151, 103993. [Google Scholar] [CrossRef]
  21. Kuzmin, A.; Nagel, M.; Van Baalen, M.; Behboodi, A.; Blankevoort, T. Pruning vs quantization: Which is better? In Advances in Neural Information Processing Systems, Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Curran Associates Inc.: Red Hook, NY, USA, 2023; Volume 36, pp. 62414–62427. [Google Scholar]
  22. Tian, Q.; Arbel, T.; Clark, J.J. Grow-push-prune: Aligning deep discriminants for effective structural network compression. Comput. Vis. Image Underst. 2023, 231, 103682. [Google Scholar] [CrossRef]
  23. Lima, V.S.; Ferreira, F.A.; Madeiro, F.; Lima, J.B. Light field image encryption based on steerable cosine number transform. Signal Process. 2023, 202, 108781. [Google Scholar] [CrossRef]
  24. Manda, B.; Bhaskare, P.; Muthuganapathy, R. A convolutional neural network approach to the classification of engineering models. IEEE Access 2021, 9, 22711–22723. [Google Scholar] [CrossRef]
  25. Iyer, N.; Jayanti, S.; Ramani, K. An engineering shape benchmark for 3D models. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Long Beach, CA, USA, 24–28 September 2005; ASME: New York, NY, USA, 2005; Volume 47403, pp. 501–509. [Google Scholar] [CrossRef]
  26. Francazi, E.; Baity-Jesi, M.; Lucchi, A. A theoretical analysis of the learning dynamics under class imbalance. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 10285–10322. Available online: https://dl.acm.org/doi/proceedings/10.5555/3618408 (accessed on 12 February 2025).
  27. Chen, W.; Yang, K.; Yu, Z.; Shi, Y.; Chen, C.L. A survey on imbalanced learning: Latest research, applications and future directions. Artif. Intell. Rev. 2024, 57, 137. [Google Scholar] [CrossRef]
  28. Johnson, J.M.; Khoshgoftaar, T.M. Survey on deep learning with class imbalance. J. Big Data 2019, 6, 27. [Google Scholar] [CrossRef]
  29. Santos, C.F.G.D.; Papa, J.P. Avoiding overfitting: A survey on regularization methods for convolutional neural networks. ACM Comput. Surv. (CSUR) 2022, 54, 213. [Google Scholar] [CrossRef]
  30. Kwasniewska, A.; Szankin, M.; Ozga, M.; Wolfe, J.; Das, A.; Zajac, A.; Ruminski, J.; Rad, P. Deep learning optimization for edge devices: Analysis of training quantization parameters. In Proceedings of the IECON 2019—45th Annual Conference of the IEEE Industrial Electronics Society, Lisbon, Portugal, 14–17 October 2019; Volume 1, pp. 96–101. [Google Scholar] [CrossRef]
  31. Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713. [Google Scholar] [CrossRef]
  32. Liang, T.; Glossner, J.; Wang, L.; Shi, S.; Zhang, X. Pruning and quantization for deep neural network acceleration: A survey. Neurocomputing 2021, 461, 370–403. [Google Scholar] [CrossRef]
  33. Chen, S.; Wang, W.; Pan, S.J. Deep neural network quantization via layer-wise optimization using limited training data. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 3329–3336. [Google Scholar] [CrossRef]
  34. Li, H.; Kadav, A.; Durdanovic, I.; Samet, H.; Graf, H.P. Pruning filters for efficient convnets. arXiv 2016, arXiv:1608.08710. [Google Scholar] [CrossRef]
  35. Banner, R.; Hubara, I.; Hoffer, E.; Soudry, D. Scalable methods for 8-bit training of neural networks. In Advances in Neural Information Processing Systems, Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; Curran Associates Inc.: Red Hook, NY, USA, 2018; Volume 31. [Google Scholar]
  36. Chen, C. Hardware-Software Co-Exploration and Optimization for Next-Generation Learning Machines. Ph.D. Thesis, Nanyang Technological University, Singapore, 2024. Available online: https://dr.ntu.edu.sg/bitstream/10356/178423/2/PhD_Thesis_ChenChunyun-Final.pdf (accessed on 18 December 2024).
  37. Wei, L.; Ma, Z.; Yang, C.; Yao, Q. Advances in the neural network quantization: A comprehensive review. Appl. Sci. 2024, 14, 7445. [Google Scholar] [CrossRef]
Figure 1. Sample 3D objects of the CADNET dataset in the STL format, viewed in 3D modeling software.
Figure 2. MVCNN architecture used in the study. The structure is discussed in the materials and methods section. The hidden layers are divided into five groups, each highlighted in a different shade of gray. Solid line arrows are used to indicate that the number of filters remains constant, while dashed lines indicate where the number of filters is doubled. Layers marked with * have two additional operations, a 2 × 2 max-pooling operation and a 1 × 1 convolution with a stride of 2.
Figure 3. Scatter plot illustrating the trade-off between test accuracy, execution time, and memory used across different models.
Table 2. Class distribution of the dataset CADNET.
| Category Name | Number of 3D Objects | Category Name | Number of 3D Objects |
|---|---|---|---|
| 90_degree_elbows | 100 | Gear_like_Parts | 97 |
| BackDoors | 57 | Handles | 119 |
| Bearing_Blocks | 50 | Intersecting_Pipes | 50 |
| Bearing_Like_Parts | 50 | L_Blocks | 107 |
| Bolt_Like_Parts | 111 | Long_Machine_Elements | 77 |
| Bracket_like_Parts | 27 | Long_Pins | 104 |
| Clips | 54 | Machined_Blocks | 59 |
| Contact_Switches | 60 | Machined_Plates | 99 |
| Container_Like_Parts | 60 | Motor_Bodies | 58 |
| Contoured_Surfaces | 55 | Non-90_degree_elbows | 108 |
| Curved_Housings | 51 | Nuts | 125 |
| Cylindrical_Parts | 94 | Oil_Pans | 58 |
| Discs | 163 | Posts | 109 |
| Flange_Like_Parts | 109 | Prismatic_Stock | 86 |
| Pulley_Like_Parts | 61 | Rectangular_Housings | 70 |
| Rocker_Arms | 60 | Round_Change_At_End | 51 |
| Screws | 111 | Simple_Pipes | 66 |
| Slender_Links | 60 | Slender_Thin_Plates | 62 |
| Small_Machined_Blocks | 62 | Spoked_Wheels | 57 |
| Springs | 55 | Thick_Plates | 82 |
| Thin_Plates | 83 | T-shaped_parts | 65 |
| U-shaped_parts | 75 | | |
| Total number of 3D objects | 3317 | | |
Table 3. Accuracy, execution time, and memory occupied by the model for quantization and pruning techniques.
| Technique Used | Test Accuracy (%) | Execution Time (s) | Memory Occupied by the Model (MB) |
|---|---|---|---|
| Original Model | 93.83 | 133 | 83.78 |
| 16-Bit Quantization | 93.81 | 133 | 41.94 |
| 8-Bit Quantization | 93.59 | 132 | 21.01 |
| 4-Bit Quantization | 23.11 | 134 | 10.55 |
| 5% Structured Pruning | 93.45 | 127 | 75.45 |
| 10% Structured Pruning | 92.85 | 123 | 67.62 |
| 15% Structured Pruning | 92.81 | 117 | 60.40 |
| 20% Structured Pruning | 92.29 | 110 | 53.43 |
| 25% Structured Pruning | 92.14 | 97 | 47.16 |
| 5% Structured Pruning + 16-Bit Quantization | 93.45 | 127 | 37.77 |
| 10% Structured Pruning + 16-Bit Quantization | 92.85 | 123 | 33.86 |
| 15% Structured Pruning + 16-Bit Quantization | 92.80 | 117 | 30.25 |
| 20% Structured Pruning + 16-Bit Quantization | 92.28 | 110 | 26.76 |
| 25% Structured Pruning + 16-Bit Quantization | 92.13 | 98 | 23.63 |
| 5% Structured Pruning + 8-Bit Quantization | 93.35 | 128 | 18.93 |
| 10% Structured Pruning + 8-Bit Quantization | 92.80 | 124 | 16.97 |
| 15% Structured Pruning + 8-Bit Quantization | 92.69 | 118 | 15.17 |
| 20% Structured Pruning + 8-Bit Quantization | 92.03 | 110 | 13.42 |
| 25% Structured Pruning + 8-Bit Quantization | 92.06 | 99 | 11.86 |
