Blister Defect Detection Based on Convolutional Neural Network for Polymer Lithium-Ion Battery

To ensure the quality and reliability of polymer lithium-ion batteries (PLBs), automatic blister defect detection is developed to replace manual inspection in the production of PLB cell sheets. A convolutional neural network (CNN) based method is proposed to detect blisters from cell sheet images. An improved dense block architecture and a training method based on learning rate optimization are discussed. In experiments comparing classification performance and confusion matrices, the proposed method was superior to other machine learning based methods. The proposed CNN method achieved the best defect detection performance among the compared methods, and it was fast enough for real-time industrial application.


Introduction
The application of lithium-ion batteries has changed consumer electronic products by greatly reducing the weight and volume of mobile phones, notebooks, and other portable products. Lithium-ion batteries have been widely studied, including their modeling, optimal design, and so on [1][2][3][4][5][6]. At present, the most commonly used lithium-ion battery is the polymer lithium-ion battery (PLB). The cathode materials of PLB are commonly lithium cobalt oxide, lithium manganese oxide, ternary materials or lithium iron phosphate. The anode usually uses carbon materials, such as artificial graphite, natural graphite, mesophase carbon microspheres, petroleum coke, carbon fiber, pyrolytic resin carbon and so on. The C-rate measures the charge and discharge current relative to the nominal capacity of the battery. At present, most PLBs use a polymer gel electrolyte instead of a liquid electrolyte, which allows a PLB to be thin and to take an arbitrary area and shape. These characteristics improve the capacity of PLB and make it small, thin, and lightweight. PLB has been widely used in portable electronic equipment, and it is gradually being applied to more fields. As PLB is used in more and more electronic products, its quality becomes critical to the quality and reliability of those products.
Recently, an automatic PLB production line has been developed at Dongsheng Energy Corporation, Weihai, China. The PLB has a voltage of 48 V, a charging current of 2-3 A, and a nominal capacity of 16 Ah. The anode of the PLB is a conductive high molecular polymer, the cathode is graphite, and a colloidal polymer electrolyte is used. During the production process of the PLB, several cathode pieces, anode pieces, and separator pieces are combined to produce a cell sheet. Combining more cell sheets provides greater battery capacity. Thus, the quality inspection of cell sheets in automated production lines is essential to the quality of the final battery product, and cell sheet defects need to be detected. Blister is a common defect in the grid net of the cell sheet. There are two main causes of blister: the coating components are not mixed for an appropriate time to form a homogeneous slurry, or the feed rate is not appropriate [7]. Blister can damage the chemical properties of the battery and cause micro-short circuits, and it seriously affects the safety, service life and other quality characteristics of the PLB. Manual detection of blister in cell sheets is inefficient and laborious, so automated blister detection methods need to be developed.
Some automated defect detection methods for lithium-ion batteries have been developed. For example, X-ray is used for electrode coating detection [8], computed tomography is employed to inspect defects and structural deformations [9], laser and thermography methods are used to inspect battery electrodes [10], and thermography is used to detect defects [11,12]. Reconstruction of the object surface is an important technique for these detection methods [13]. Combining computer vision with senseless detection is also effective [14]. Currently, X-ray, computed tomography, and thermography methods are mainly used for inner defect detection of lithium-ion batteries and are not used for detecting blister, which appears on the surface of the PLB. Laser and vision inspection can be used for blister detection. This article uses vision inspection technology. Compared with laser inspection, vision inspection has two main advantages. First, the hardware cost of the device is low. Second, it is fast: laser detection requires scanning the PLB surface line by line, while visual inspection can complete image capture at once.
Visual inspection is a fast, convenient and economical method for detecting surface defects. Compared with manual inspection, vision inspection does not suffer from fatigue, emotion, boredom from repetitive work, or other factors that reduce detection efficiency. Owing to its real-time and low-cost features, it is widely used in automated production lines and other fields [15][16][17][18]. Visual inspection has recently been considered for lithium-ion battery production, employing industrial cameras and image processing techniques [15][16][17]. These studies focus on applying traditional image processing methods to the inspection of flaws, scratches, and defects on the battery separator or electrode surface. Structured light is important for visual inspection, and a novel classifier subset selection for stacked generalization is reported in [19]. In these studies, feature extraction of defects or flaws is the key to successful detection.
Defect detection can be regarded as a classification problem for battery components. The battery components are classified as qualified or unqualified according to whether they contain defects. Since machine learning methods have made great progress in classification problems and have produced many successful applications, they have also been used for battery defect detection. Some defect detection applications based on common machine learning algorithms have been developed. The neural network method is an early machine learning method. It has been applied to Li-ion batteries for defect diagnosis and evaluation of battery module state [20][21][22][23]. Support vector machine (SVM) is able to solve nonlinear classification problems when the number of samples in the training dataset is small. SVM has been applied to the classification of post-weld defects in batteries [24]. An improved tensor based SVM method has been used for bubble detection in cell sheets [25,26]. In these machine learning methods, the selected features determine the accuracy of the classification [27,28]. These features are selected by hand, and how to select them is a difficult problem.
With the recent development of machine learning, deep learning technology has shown impressive results in image classification applications [29][30][31][32][33][34][35][36]. As a machine learning technology, deep learning is loosely modeled on the human brain. It can complete feature extraction automatically, and the extracted features can be employed for image classification with superior performance. As a widely used deep learning model, the convolutional neural network (CNN) performs well in a variety of visual recognition tasks, especially in image classification. CNN can automatically extract typical and representative features from the input image. CNN uses a hierarchical structure to gradually build the required high-level features from low-level features, and then it uses these high-level features to complete image classification and other tasks. CNN has been used for defect detection in industry. For example, CNN is employed to detect whether solders, chips or circuit boards have defects [37][38][39][40][41][42][43][44][45].
In this paper, a novel blister detection method based on CNN is proposed employing images of PLB sheets. The contribution of this paper includes two aspects. First, an improved CNN architecture with an optimization based learning strategy is proposed. Trainable weight parameters are added to each skip connection to improve the dense block, and optimization of the learning rate is used to improve the efficiency of the training process. Experimental results indicate that the proposed CNN method is superior to other machine learning based methods for blister detection. Second, this paper shows that deep learning based methods have potential for defect detection applications in PLB production.
The rest of this paper is organized as follows. Blister defect detection for PLB and the proposed CNN method are described in Section 2. The experiments and performance evaluations are discussed in Section 3. Finally, this paper is concluded in Section 4.
The main abbreviations used in this paper are listed in Table 1. The main symbols used in this paper are listed in Table 2.

Data Capture
Visual inspection is employed to detect blisters in the sheet net of the PLB. The PLB sheet is handled by a manipulator, and images of both sides are captured. The field image acquisition in the automated production line is shown in Figure 1. Damage and scratches on the PLB sheet are easy to detect using traditional image processing methods. Because of their inconsistent background color, shape and size, blister defects cannot be detected well by conventional image processing methods. The image of the PLB sheet is therefore divided into multiple patch images, and each patch image is inspected. Some images of blister defects are shown in Figure 2.

Detection Scheme Based on CNN
The blister defect detection problem for PLB is to judge from images whether a PLB sheet has blisters. It can be considered an image classification problem. CNN methods have achieved great success in image classification applications. In this paper, an optimization based CNN method is proposed to detect blister defects in PLB sheets.
As an important deep learning model, CNN uses a stack of multiple layers. Each layer in the stack can be considered an input-to-output conversion that achieves selective extraction of image feature representations. CNN learns the mapping between a large number of input samples and their outputs with a combination of input layer, convolution layer, ReLU layer, pooling layer and fully connected layer. The input layer completes image preprocessing. The convolution layer perceives local feature information of the image. This local information is combined at higher levels to obtain global information. The convolution layer also greatly reduces the amount of computation through parameter sharing, and it extracts different features by employing multiple kernels. The ReLU layer performs a nonlinear mapping of the output of the convolution layer. The pooling layer is placed between successive convolution layers to reduce overfitting and to compress the amount of data and parameters. The fully connected layer achieves the final classification using the high-level features extracted by the previous layers. The major feature of CNN is the shared convolution kernel, which works well for high-dimensional data processing. Another feature is that it is not necessary to select features manually, because they are learned during training.
In this paper, an efficient CNN based detection method is provided for blister detection in PLB sheets. The experimental results indicate that the proposed method is superior to other machine learning based methods.
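The roles of the convolution, ReLU and pooling layers can be illustrated on a toy input. The following is a minimal sketch in plain Python, not the network used in this paper; the 5 × 5 patch, the 2 × 2 edge kernel and the function names are assumptions chosen only for illustration.

```python
# Minimal sketch of one convolution -> ReLU -> max-pooling stage (illustrative only).

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one shared kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise nonlinear mapping: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling, which compresses the feature map."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [max(feature_map[r + i][c + j] for i in range(size) for j in range(size)) for c in cols]
        for r in rows
    ]


# Hypothetical 5x5 patch with a bright vertical stripe, and a vertical-edge kernel.
patch = [[0, 0, 1, 0, 0] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]

features = max_pool(relu(conv2d(patch, edge_kernel)))
print(features)  # the left half responds to the dark-to-bright edge, the right half does not
```

The same kernel is reused at every image position, which is the parameter sharing mentioned above; a real CNN learns many such kernels instead of using a hand-written one.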

Improved Architecture for CNN
In the deep learning approach, deeper networks are used to accomplish complex tasks. The learning process of the neural network adopts backpropagation: the error calculated by the loss function guides the update and optimization of the weights of the deep network through the backpropagation of the gradient. The deep neural network is composed of many nonlinear layers, and each nonlinear layer can be regarded as a nonlinear function. Therefore, the entire deep network can be regarded as a composite nonlinear function. The purpose of neural network learning is to make this nonlinear function perform a good mapping between input and output. The learning process therefore searches for deep network weights at which the loss function takes a minimum value. The gradient descent method is used to solve this minimization problem. Its idea is to take the negative direction of the current gradient as the search direction and to adjust the weights so that the loss function becomes smaller and smaller and approaches a local minimum.
In backpropagation, the gradient is updated layer by layer. The gradient update can be seen as multiplying the output of the nonlinear function of the upper layer by a factor. If the factor is less than 1, the gradient decays exponentially as the number of layers increases, becoming smaller and smaller until it gradually disappears. This is called the vanishing gradient problem; it causes poor learning and training results. To solve the vanishing gradient problem, short paths are often created from early layers to later layers in the CNN architecture.
DenseNet is an efficient CNN architecture for image classification. In DenseNet, all layers are connected directly to ensure maximum information transmission and to solve the vanishing gradient problem. DenseNet uses dense blocks to create short paths from early layers to later layers. A dense block uses skip connections not only to connect adjacent layers, but also to achieve cross-layer connections. The gradients obtained from each layer are derived from the gradient concatenation of the preceding layers. Because the gradient transfers directly between layers, the effect of the vanishing gradient is reduced. This kind of architecture also strengthens the transmission and usage of features.
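The exponential decay described above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the same per-layer factor at every layer, which is a simplification; the factor 0.5 and the depths are arbitrary illustrative values.

```python
def gradient_scale(factor, num_layers):
    """Scale applied to a gradient after it is passed back through num_layers layers,
    assuming each layer multiplies it by the same factor (a simplifying assumption)."""
    scale = 1.0
    for _ in range(num_layers):
        scale *= factor
    return scale


for depth in (5, 20, 50):
    print(depth, gradient_scale(0.5, depth))
```

With a factor of 0.5, the gradient reaching a layer 50 levels back is below 1e-15 of its original size, which is why short paths from early to later layers help training.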
Denote the input and output of the $d$th layer as $x_d$ and $y_d$, respectively. Then
$$y_d = F_d([x_0, x_1, \ldots, x_{d-1}], W_d) \tag{1}$$
where $F_d$ is a nonlinear transformation function, the symbol $[\,]$ indicates the concatenation operation, and $W_d$ denotes the parameters of $F_d$ in the $d$th layer. In the dense block of DenseNet, the features of previous layers are passed through skip connections and joined by the concatenation operation. That is to say, the features of different layers are treated in an equivalent manner in this architecture. However, this is not the case in actual classification systems. Not all features of the previous layers play a key role in image classification; only some key features are important. Inspired by this, a novel weight-based architecture is proposed to improve the network performance of the dense block. The improved dense block architecture is shown in Figure 3, and its detailed architecture is shown in Figure 4. In this architecture, the output is
$$y_d = F_d([k_{d,0} x_0, k_{d,1} x_1, \ldots, k_{d,d-1} x_{d-1}], W_d) \tag{2}$$
where $k_{d,0}, k_{d,1}, \ldots, k_{d,d-1}$ are the parameters that determine the weights of $x_0, x_1, \ldots, x_{d-1}$ to be concatenated together. These parameters are trained during the CNN training process. The whole CNN architecture proposed in this paper is illustrated in Figure 5.
In the above architecture, the weight parameters are trained together with the rest of the CNN. Each weight parameter has a practical meaning: it indicates how important the corresponding feature map is. The greater the weight value, the more important the role of the corresponding feature map in the classification task; that is, the corresponding features contain more useful information for classification. With the trainable weight parameters introduced in the proposed architecture, the important features can be found quickly and represented efficiently for image classification.
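The difference between plain and weighted concatenation can be illustrated with flat feature vectors. This is a minimal sketch, not the layer used in the paper: real feature maps are multi-channel tensors concatenated along the channel axis, and the weights are learned by backpropagation, whereas the weight values below are hypothetical constants.

```python
def weighted_concat(features, weights):
    """Concatenate feature vectors x_0..x_{d-1}, each scaled by its weight k_{d,i}."""
    if len(features) != len(weights):
        raise ValueError("one weight per preceding feature vector is required")
    merged = []
    for x, k in zip(features, weights):
        merged.extend(k * v for v in x)
    return merged


x0 = [1.0, 2.0]
x1 = [3.0]
x2 = [4.0, 5.0]

# All weights equal to 1 reproduces the plain concatenation of a standard dense block.
plain = weighted_concat([x0, x1, x2], [1.0, 1.0, 1.0])

# Hypothetical learned weights: x1 is emphasized and x2 is suppressed.
weighted = weighted_concat([x0, x1, x2], [0.5, 2.0, 0.0])

print(plain)     # [1.0, 2.0, 3.0, 4.0, 5.0]
print(weighted)  # [0.5, 1.0, 6.0, 0.0, 0.0]
```

A weight near zero effectively removes a preceding feature map from the input of the layer, which matches the interpretation of the weights as importance indicators.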

Training Method Based on Optimization of Learning Rate
The learning rate is an important hyperparameter in CNN training. How to adjust the learning rate is one of the key elements of training a good CNN model. When the learning rate is too large, the learning process becomes unstable, whereas a small learning rate leads to an extremely long training time. By setting the learning rate properly, it is possible to improve the training speed and reduce the training time while keeping the training stable.
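The effect of the learning rate can be seen on a one-dimensional toy problem. The following is a minimal sketch assuming the loss $C(w) = w^2$, which is not the CNN loss; the three learning rates are arbitrary illustrative values.

```python
def descend(learning_rate, steps=50, w0=1.0):
    """Plain gradient descent on the toy loss C(w) = w**2, whose gradient is 2*w."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2.0 * w
    return w


too_small = descend(0.001)  # barely moves away from the starting point w0 = 1
moderate = descend(0.3)     # converges close to the minimum at w = 0
too_large = descend(1.1)    # overshoots on every step and diverges

print(too_small, moderate, too_large)
```

After 50 steps the small rate has reduced $w$ by only about 10%, the moderate rate has essentially reached the minimum, and the large rate has grown $|w|$ by several orders of magnitude.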
When the number of samples is large, computing gradient descent over the entire sample set is slow and inefficient. The samples are therefore usually divided into mini-batches to increase the speed. Let $x$ be the input of the CNN in mini-batch processing, $w$ be the network parameter, and $l$ be the learning rate. The output is
$$y = g(x, w). \tag{3}$$
The loss $C$ is obtained by comparing the output $y$ with its label using the loss function. The gradient is obtained as $\nabla w = \partial C / \partial w$, and $w$ is updated as
$$w_{t+1} = w_t - l \, \nabla w_t \tag{4}$$
where $t$ is the current iteration number.
In the above mini-batch-based learning, after the parameter $w$ is updated for the current mini-batch, processing moves directly on to the next mini-batch. However, the effect of the parameter update is not verified in this process. At the same time, the learning rate is usually selected manually based on experience, so it is likely that the loss of the current mini-batch is not reduced effectively.
In this paper, a mechanism for optimizing the learning rate is provided. For each mini-batch, the optimal learning rate is found before the update formula (Equation (4)) is applied. Thus, the update for the current mini-batch reduces the loss function value. In other words, the original mini-batch method does not guarantee that each update reduces the loss, whereas the mechanism provided in this paper makes each mini-batch update move in a direction and by an amount that does, which improves the efficiency of training and reduces the training time.
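The mini-batch update can be illustrated on a one-parameter model. This is a minimal sketch under stated assumptions: the model $y = w x$, the mean squared error loss, the data and the hyperparameter values are all hypothetical and stand in for the CNN and its loss.

```python
def minibatch_sgd(samples, w=0.0, learning_rate=0.05, batch_size=2, epochs=100):
    """Fit y = w * x by mini-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # dC/dw for C = mean((w*x - y)**2), evaluated on this mini-batch only
            grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
            # parameter update: new w = old w - learning rate * gradient
            w = w - learning_rate * grad
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated with w = 2
w_fit = minibatch_sgd(data)
print(w_fit)
```

Note that the learning rate here is fixed by hand for all mini-batches, which is exactly the practice that a per-mini-batch learning rate search is meant to replace.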
The flower pollination algorithm (FPA) is a recent metaheuristic swarm intelligence optimization algorithm. FPA performs optimization by simulating the pollination process of flowering plants in nature. The pollination process includes two modes, self-pollination and cross-pollination, which represent local search and global search, respectively. Cross-pollination occurs between the pollens of different plants. Pollinators can fly for a long time and transmit pollen over a long distance. In contrast, self-pollination is fertilization by pollen from the same flower or from different flowers of the same plant, usually without pollinators. In existing engineering applications, FPA has shown the ability to search adaptively in spaces with multiple local optima. FPA can avoid premature convergence, and thus it performs well.
FPA is built on the following four rules.
Rule 1. Biotic cross-pollination is a process of global pollination, in which pollinators carry the pollen.
Rule 2. Abiotic self-pollination is a process of local pollination.
Rule 3. Flower constancy. Plants and pollinators form a partnership to maximize reproduction.
Rule 4. A switch probability controls the conversion between global pollination and local pollination.
FPA has achieved good results in multi-objective optimization and other applications [50][51][52][53]. It is robust and has only a few parameters. Therefore, FPA is employed to find the optimal learning rate in this paper.
FPA simulates two kinds of pollination: cross-pollination and self-pollination. Each flower in FPA is regarded as a solution of the objective function. A flower reproduces by either cross-pollination or self-pollination. The choice is governed by the switch probability: the probability of choosing cross-pollination is $P_C$, and the probability of choosing self-pollination is $1 - P_C$. The cross-pollination operation imitates the long-distance pollination of different flowers by bees and butterflies. The flight of pollinators is regarded as a Levy flight, so global pollination is modeled using a Levy distribution. Similarly, self-pollination models short-distance pollination in nature.
The optimal learning rate for CNN training is found in this paper employing FPA, and the main steps are summarized below.
Step 1. Initialize parameters. The initial parameters include the maximum iteration number $N$, the total pollen number $m$, and the switch probability $P_C$. The learning rate is the pollen in the FPA model. It conforms to the standard distribution and takes values in the range $[l_{min}, l_{max}]$. In total, $m$ learning rates are created, denoted $l_1, l_2, \ldots, l_m$.
Step 2. FPA operation. A probability $P$ is chosen randomly. When $P \le P_C$, the current learning rate $l_i$ is updated as below to simulate cross-pollination
$$l_i \leftarrow l_i + \gamma L \, (l_{best} - l_i) \tag{5}$$
where $l_{best}$ is the current global optimal learning rate. $\gamma$ is the scaling factor; its value is suggested to be in the range of (0, 10) in previous studies [49]. It was found that the best result was obtained in this application when it was set to 0.1, so this value was used in this study. $L$ is drawn from a Levy distribution as
$$L \sim \frac{\lambda \, \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \cdot \frac{1}{s^{1+\lambda}} \tag{6}$$
where $\Gamma(\lambda)$ is the standard gamma function, $s$ is the step size, and $\lambda$ was set to 1.5 in this study as recommended [49].
When $P > P_C$, the current learning rate $l_i$ is updated as below to simulate self-pollination
$$l_i \leftarrow l_i + \varepsilon \, (l_u - l_v) \tag{7}$$
where $\varepsilon$ is drawn from the uniform distribution on [0, 1], and $l_u$ and $l_v$ are two randomly selected pollens, which represent learning rates, with $1 \le u, v \le m$. The implementation flow chart of FPA is illustrated in Figure 6. Previous studies have suggested that the range of $P_C$ is [0.1, 0.9], and the recommended value is 0.8 [54]. In the FPA implementation of this study, $P_C$ was set to 0.8.
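Steps 1 and 2 can be sketched as a small search routine. This is a hedged illustration rather than the implementation used in this study: it assumes the usual FPA update forms (a Levy-scaled move toward the current best for cross-pollination, and a uniformly scaled difference of two random pollens for self-pollination), samples Levy steps with Mantegna's algorithm, initializes pollens uniformly, keeps only improving moves, and replaces the CNN mini-batch loss with a hypothetical one-parameter toy loss. All function names are illustrative.

```python
import math
import random


def levy_step(lam=1.5, rng=random):
    """Draw one Levy-distributed step using Mantegna's algorithm (an assumed sampler)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)


def fpa_learning_rate(loss_after_update, l_min, l_max, m=20, n_iter=100,
                      p_c=0.8, gamma=0.1, lam=1.5, seed=0):
    """Search [l_min, l_max] for the learning rate with the lowest post-update loss."""
    rng = random.Random(seed)

    def clip(l):
        return min(max(l, l_min), l_max)

    # Step 1: create m candidate learning rates (uniform initialization is an assumption).
    pollen = [rng.uniform(l_min, l_max) for _ in range(m)]
    losses = [loss_after_update(l) for l in pollen]
    best = min(range(m), key=losses.__getitem__)
    l_best, loss_best = pollen[best], losses[best]

    # Step 2: repeat cross-pollination or self-pollination for every pollen.
    for _ in range(n_iter):
        for i in range(m):
            if rng.random() <= p_c:  # cross-pollination: global search toward l_best
                candidate = pollen[i] + gamma * levy_step(lam, rng) * (l_best - pollen[i])
            else:                    # self-pollination: local search
                u, v = rng.randrange(m), rng.randrange(m)
                candidate = pollen[i] + rng.random() * (pollen[u] - pollen[v])
            candidate = clip(candidate)
            cand_loss = loss_after_update(candidate)
            if cand_loss < losses[i]:  # greedy acceptance (an assumption of this sketch)
                pollen[i], losses[i] = candidate, cand_loss
                if cand_loss < loss_best:
                    l_best, loss_best = candidate, cand_loss
    return l_best, loss_best


def toy_loss_after_update(l, w=0.0, target=3.0):
    """Loss (w - target)**2 after one gradient step of size l from w (toy stand-in)."""
    grad = 2.0 * (w - target)
    w_new = w - l * grad
    return (w_new - target) ** 2


l_opt, loss_opt = fpa_learning_rate(toy_loss_after_update, l_min=0.001, l_max=1.0)
print(l_opt, loss_opt)
```

For this toy loss, a single step with $l = 0.5$ lands exactly on the minimum, so the search should return a learning rate close to 0.5. In CNN training the objective would instead be the mini-batch loss evaluated after a trial update, which makes each evaluation considerably more expensive.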

Dataset and Training
Images of both sides of PLB sheets were captured on the automatic line. The size of the PLB image obtained from the camera was 2448 × 2048. The size of the input image in the CNN method was 219 × 219. We cut the electrode area in each PLB image into 160 patch images with a size of 219 × 219. Because each PLB has two sides, 320 patch images were taken of each PLB. Professional engineers selected 600 qualified PLBs and 600 blister PLBs to obtain images. After the patches were cut out, typical patches were selected as sample images. Finally, a blister sample image dataset was created including 11,600 qualified images and 10,460 blister images. Then, 1800 qualified images and 1400 blister images were selected randomly for testing, and the other images were employed for training.
It was obvious that the dataset was not large enough. To overcome the overfitting problem caused by training a CNN on a small dataset, transfer learning was employed. A set of weights pre-trained on ImageNet was transferred to the network proposed in this paper. After the network weights were transferred, the proposed network was trained employing the blister image dataset. To enhance CNN image classification performance, batch normalization, dropout and early stopping were used, as in other image classification tasks.
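The sizes of the training and test sets follow from the counts above. The short calculation below only restates that arithmetic; the variable names are illustrative.

```python
# Counts taken from the dataset description above.
patches_per_side = 160
patches_per_plb = 2 * patches_per_side        # two sides per PLB

qualified_total, blister_total = 11_600, 10_460
qualified_test, blister_test = 1_800, 1_400

total = qualified_total + blister_total                 # all sample images
test_total = qualified_test + blister_test              # images held out for testing
train_qualified = qualified_total - qualified_test      # qualified images for training
train_blister = blister_total - blister_test            # blister images for training
train_total = train_qualified + train_blister
test_fraction = test_total / total

print(patches_per_plb, total, train_total, test_total, round(test_fraction, 3))
```

The split therefore uses 18,860 images for training and 3200 for testing, so roughly 14.5% of the 22,060 samples are held out.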
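Early stopping is commonly implemented with a patience counter on the validation loss. The paper does not state its stopping rule, so the following is only a sketch of one common variant; the patience value and the validation losses are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more than
    min_delta for `patience` consecutive epochs; otherwise it runs to the end.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1


history = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.62]  # made-up validation losses
stop_at = early_stopping_epoch(history)
print(stop_at)  # 6: the loss last improved at epoch 3, then stalled for three epochs
```

In practice the weights from the best epoch (epoch 3 in this example) would be restored, so that the later, slightly overfitted epochs are discarded.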

Results and Discussions
To evaluate the proposed blister detection method based on CNN, several other machine learning based methods were employed for comparison: neural network (NN) [22], support vector machine (SVM) [24], support Tucker machine (STM) [25], and CNN methods with the DenseNet model [46], ResNet model [47], VGG16 model [43], and Fast R-CNN model [45]. The comparisons included classification performance evaluation and confusion matrices.

Classification Performance Evaluation
According to whether the classification results are correct, TP, TN, FP, and FN can be determined. TP means true positive, TN means true negative, FP means false positive, and FN means false negative.
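Recall, precision, accuracy, specificity and F1-score were employed as classification performance indicators to evaluate different methods. They are defined as follows [55].
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{9}$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{11}$$
$$\mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$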
Recall measures the proportion of actual positives that are correctly identified as such. Specificity represents the proportion of actual negatives that are correctly identified as such. Accuracy is defined as the proportion of all samples that are classified correctly. Precision is the ratio of samples correctly classified as positive to all samples classified as positive. F1-score is the harmonic mean of precision and recall (sensitivity). The greater these indicators, the better the classification performance.
The five indicators of the different blister classification methods are listed in Table 3. The method proposed in this paper had the greatest value for all performance indicators, meaning that the proposed CNN based method was superior to the other classification methods for blister recognition in PLB sheets. The performance comparison is also shown in Figure 7, which indicates that the proposed method had the best classification performance. The improved CNN architecture with the optimization based training method was efficient for blister detection when trainable weight parameters were added to the skip connections in the dense block.
To observe the effectiveness and efficiency of the proposed FPA optimization, an ablation study was performed. The FPA optimization based method was compared with the method without FPA optimization in the experiment. The results of the comparison are shown in Table 4, and the data are also illustrated in Figure 8. The data in Table 4 and Figure 8 show that the performance was clearly improved when the proposed FPA optimization was applied. Because FPA optimization could find the optimum learning rate, the training process was more efficient and the classification results were improved.
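The five indicators can be computed directly from the four counts. The following minimal sketch uses the standard definitions; the counts in the example are made up and are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute recall, precision, accuracy, specificity and F1-score from the counts."""
    recall = tp / (tp + fn)                      # share of actual positives found
    precision = tp / (tp + fp)                   # share of predicted positives that are right
    specificity = tn / (tn + fp)                 # share of actual negatives found
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all samples classified correctly
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "precision": precision,
        "accuracy": accuracy,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The function assumes that every denominator is nonzero; a classifier that never predicts the positive class would need a guard for the precision and F1-score terms.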

Confusion Matrix
The confusion matrix is often used to visualize the performance of supervised learning based classification. Each matrix row represents the samples in a predicted class, while each matrix column represents the samples in an actual class [55]. Confusion matrices of the experiments in this paper are illustrated in Figure 9. The confusion matrix of the proposed method had the maximum values on the main diagonal and the minimum values on the secondary diagonal, showing that the proposed method had the best classification performance. This is consistent with the analysis of the performance data presented in Section 3.1. It also indicates that the proposed method was the most effective for blister detection.
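A two-class confusion matrix with the row and column convention described above can be built as follows. This is a minimal sketch; the class names and the labels are hypothetical examples.

```python
def confusion_matrix(actual, predicted, classes=("blister", "qualified")):
    """Count samples per (predicted, actual) pair: row = predicted, column = actual."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


# Hypothetical labels for six patch images.
actual = ["blister", "blister", "qualified", "qualified", "qualified", "blister"]
predicted = ["blister", "qualified", "qualified", "qualified", "blister", "blister"]

cm = confusion_matrix(actual, predicted)
print(cm)  # [[2, 1], [1, 2]]
```

The main diagonal holds the correctly classified samples and the secondary diagonal holds the errors. Some libraries use the opposite orientation, with rows for actual classes, so the convention should be checked before matrices are compared.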

Real-Time
All tests were performed on a computer with 32 GB RAM, an Intel Xeon E5-2620 CPU, and an NVIDIA GeForce GTX 1080 Ti GPU. Each sample test took no more than 0.3 s. The total processing time for each PLB sheet, including capturing images of two sides and transportation on the production line, was less than 10 s. Parallel processing can improve the processing speed, and parallel pipeline processing was adopted in the actual production line. Image acquisition and detection for the two sides of the PLB were performed simultaneously, which shortened the time to half of the original and raised the efficiency to twice that of manual processing. The proposed CNN based defect detection method was fast enough for real-time industrial application.

Figure 3. Improved dense block with trainable parameters for concatenation.

Figure 5. Proposed CNN architecture for blister detection.

Figure 8. Classification performance comparison on FPA optimization.

Figure 9. Confusion matrices of different methods.

Table 3. Classification performance of different methods for blister detection.

Figure 7. Classification performance comparison of different methods.

Table 4. Classification performance comparison on the proposed FPA optimization.