A Highly Robust Automatic Reading Algorithm for Pointer Meters Based on Text Detection

Automatic reading of pointer meters is of great significance for the efficient measurement of industrial meters. However, existing algorithms lack accuracy and robustness to illumination and shooting angle when detecting various pointer meters. Hence, a novel algorithm for the adaptive detection of different pointer meters is presented. First, deep learning is introduced to detect and recognize the scale value text on the meter dial. Then, the image is rectified and the meter center is determined based on the text coordinates. Next, the circular arc scale region is transformed into a linear scale region by polar transform, and the horizontal positions of the pointer and scale lines are obtained by a secondary search in the expanded image. Finally, the distance method is used to read the scale region where the pointer is located. Test results show that the proposed algorithm has higher accuracy and robustness in detecting different types of meters.


Introduction
Pointer meters are widely used in the petrochemical, electric power and other industries because of their simple structure, convenient use and low cost [1]. Because most of these meters have no digital communication interface, manual reading is usually adopted, but manual inspection is costly and inefficient, and cannot meet the real-time, intelligent monitoring requirements of industry [2]. Automatic reading of pointer meters [3][4][5][6][7] can save factories a great deal of labor and time, and therefore has great practical value [8][9][10].
In the past few years, many researchers have put forward automatic recognition methods for pointer meters based on computer vision. Alegria et al. [11] first applied computer vision to meter reading, and developed an automatic calibration system for meter dials that can read the meter automatically. They first used binarization and thinning operations, then subtracted the two preprocessed images to extract the pointer, and used the Hough transform to fit the pointer line, thus obtaining and calibrating the readings. Belan et al. [12] proposed a segmentation-free method and realized the calibration of digital and analog measuring instruments. They used radial projection and the Bresenham line algorithm to locate the pointer, thus obtaining the readings of pointer meters and calibrating them. Zheng et al. [13] proposed a robust automatic recognition algorithm: MSRCR with color recovery was used in preprocessing to reduce the influence of brightness, projection transformation was applied to obtain the front view of the image, and the Hough transform was then used to recognize the pointer and obtain the reading. This algorithm improved the robustness of the meter recognition system to brightness and shooting angle. Gao et al. [14] put forward an adaptive algorithm for verifying automobile dashboards. They first used cascaded HOG/SVM and HOG/MSVM to locate and recognize the digital text of the scale values; contour analysis was then applied.
This paper is also superior to algorithms proposed in recent papers in some aspects. Liu [18] used the traditional angle method for reading, which requires manual input of the angles of the zero scale line and the full scale line in the image and is therefore inefficient. The algorithm proposed in this paper uses the secondary region search method to directly obtain the horizontal coordinates of the main scale lines and the pointer, so the reading is obtained fully automatically.
Zhang [19] used the slope of the edge line of the instrument to correct the instrument image, which is only applicable to instrument panels with straight edges. The algorithm in this paper corrects the image based on the scale value text coordinates; since scale value text is a common feature of different types of instruments, the algorithm has a wider range of application. Cai [20] used an end-to-end neural network model to read the meter directly, so when a new meter appears, a large amount of training data must be prepared again and the network retrained; that algorithm cannot be used flexibly on different instruments. The deep learning introduced in this paper detects the scale value text, which is a common feature of instruments, so once training is completed, the algorithm can be applied to different instruments. He [21] used Mask-RCNN to obtain the pointer region; for new instruments with differently shaped pointers, a large dataset must again be prepared and the network retrained, so that algorithm also cannot be applied flexibly to different instruments. This paper uses the projection of pixel values to locate the horizontal coordinate of the pointer, which is applicable to pointers of different shapes.
The main contributions of the algorithm proposed in this paper are as follows:
• Deep learning was applied to the detection of scale value text in the meters, which realizes text coordinate positioning with high precision and robustness, and text recognition with high accuracy. Moreover, compared with the distance method of reading from the zero scale to the full scale, using the recognized scale values in the distance method of reading yields a smaller error.
• A novel meter center positioning method was proposed, which locates the meter center according to the positions of the scale value texts. The image of scale value text provides more features than that of the scale lines, so it can adapt to more complex environments when used to fit the meter center.
• The detection of scale value text was applied to meter rectification. Since scale value text is a common feature of almost all meters, this design greatly improves the adaptive ability of the algorithm.
• Based on the positions of the scale value texts, a secondary region search method was proposed to extract the pointer and scale lines. This method effectively solves the problem of pointer shadow, and also eliminates the influence of other objects on the dial on pointer and scale line extraction. The detailed algorithm flowchart is shown in Figure 2.
Sensors 2020, 20, x FOR PEER REVIEW
• Although different meter dials may have different shapes and structures, their scale value texts are usually Arabic numerals. Hence, in this paper, meter rectification, meter center positioning, and pointer and scale line extraction are all based on the detection of scale value text, which greatly improves the adaptive ability of the algorithm.
The rest of this paper is arranged as follows: Part 2 introduces the text detection and recognition algorithm based on deep learning, and image rectification based on the text bounding boxes; Part 3 introduces the method of extracting the pointer and scale lines and the meter reading based on the secondary region search method; Part 4 presents the test results; and Part 5 gives the conclusions.

Digital Detection and Recognition of Scale Value
In this paper, the algorithms for meter rectification, meter center positioning, pointer recognition and scale line recognition are all based on the detection of meter scale values, whereas traditional meter detection methods are based on SVM classification [22], which is insufficient for complex industrial site environments. To improve robustness, a neural network is introduced, which makes text detection and recognition more robust and adaptable, and also lays a good foundation for the subsequent meter image rectification and reading.
In this paper, the FOTS [23] network is used to detect and recognize the scale value text on the meter. FOTS is an end-to-end text recognition model, whose network structure is shown in Figure 3. First, a shared convolutional network extracts shared features from the image, and these features are used to determine the positions of text regions in the detection branch.

In FOTS, the shared convolutional network has a U-shaped structure, as shown in Figure 4. It uses ResNet50 [24] for encoding, and then obtains the shared features by decoding through repeated up-sampling, concatenation, and two-layer convolution. In the detection branch, for each point on the feature map, the network first predicts whether the point belongs to a text region, and then predicts the distances from the point to the four boundaries of the text region and the rotation angle of the text box. A threshold is then applied to filter the points, and non-maximum suppression is applied to the generated prediction boxes, finally yielding multiple text regions. The RoIRotate module uses bilinear interpolation to convert a text feature map of indefinite length and arbitrary angle into an axis-aligned feature map of fixed height. In the recognition branch, a convolutional network that contracts only in height first further encodes the input feature map, and a bidirectional LSTM then decodes the features to generate the final predicted string. The structure of the recognition network is shown in Table 1.
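The score-thresholding and non-maximum suppression step can be sketched as follows. This is a minimal illustration with axis-aligned boxes and made-up data; FOTS itself predicts rotated boxes, so its NMS additionally accounts for the rotation angle.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over axis-aligned boxes.

    boxes: (N, 4) array-like of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns indices of the boxes kept, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = scores.argsort()[::-1]          # process highest-score boxes first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of box i with every remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]  # drop boxes overlapping box i
    return keep
```

In practice, score thresholding is applied first so that only confident text points generate boxes, and NMS then merges the many near-duplicate predictions into one box per scale value.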

Image Rectification
The image taken by the camera perpendicular to the meter dial is the ideal image. However, in the collection process, it cannot be guaranteed that the camera is always perpendicular to the meter dial, so the image is rectified by projection transformation to reduce subsequent reading errors. Here, we assume that the distance between the meter pointer and the meter plane is much smaller than the distance between the camera imaging surface and the target object. Based on this assumption, we use the same parameters to perform projection transformation on the pointer and meter plane images.
The rules for projection transformation are shown in Equations (1)-(3):

[x, y, w] = [u, v, w′] · T        (1)
X = x / w        (2)
Y = y / w        (3)

wherein (U, V) are the coordinates of a point in the original image; (X, Y) are the coordinates of the point in the transformed visual plane; (u, v, w′) and (x, y, w) are the homogeneous-coordinate representations of (U, V) and (X, Y), respectively, with w′ taken as 1. T is the transformation matrix from the original visual plane to the new visual plane, and it can be uniquely determined from the coordinates of four pairs of corresponding points in the two planes. The positions of the scale value texts in the ideal image are easy to obtain, i.e., (X, Y) in Equation (2) is known, and the center coordinates of the text bounding boxes in the image to be read are taken as (U, V), so that the meter image can be rectified.
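Solving for T amounts to solving for its eight free entries (fixing the last entry to 1) from four point correspondences. A minimal numpy sketch under that convention is given below; in practice OpenCV's cv2.getPerspectiveTransform and cv2.warpPerspective perform the same computation on whole images.

```python
import numpy as np

def projection_matrix(src_pts, dst_pts):
    """Solve the 3x3 projective transform T mapping (U, V) -> (X, Y)
    from four point correspondences, fixing T[2, 2] = 1."""
    A, b = [], []
    for (U, V), (X, Y) in zip(src_pts, dst_pts):
        # X = (aU + bV + c) / (gU + hV + 1), linearized:
        A.append([U, V, 1, 0, 0, 0, -U * X, -V * X])
        b.append(X)
        # Y = (dU + eV + f) / (gU + hV + 1), linearized:
        A.append([0, 0, 0, U, V, 1, -U * Y, -V * Y])
        b.append(Y)
    p = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(p, 1.0).reshape(3, 3)

def rectify_point(T, U, V):
    """Apply T to (U, V) in homogeneous coordinates and divide by w."""
    x, y, w = T @ np.array([U, V, 1.0])
    return x / w, y / w
```

Here the four source points would be the text-box centers detected in the image to be read, and the four destination points their known positions in the ideal image.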

Pointer and Scale Extraction
After scale value text positioning and image rectification, the meter is read in the polar coordinate space, which involves three steps: (1) determine the polar transform center and expand the polar coordinates; (2) extract the pointer and scale lines; (3) obtain the reading according to the spatial relation between the pointer and scale lines. In this part, an adaptive method for determining the polar transform center and a method for extracting the pointer and scale lines based on a secondary region search are introduced.

Polar Transform
Pointer meters with higher precision always have a denser scale distribution, which makes it harder to separate individual scale marks in the curved region and to apply the angle method for reading. Consequently, it is necessary to carry out a polar transform of the meter image, converting the curved scale into a linear scale whose relative positions are easy to calculate. The essence of the polar transform is to transform the image from the Cartesian coordinate system to a polar coordinate system centered at a certain point in the image, and the correctness of the transformation largely depends on the accuracy of extracting this center. Hence, a robust center extraction method is proposed, i.e., using the text bounding boxes to extract the center.
The scale values of the meter are distributed on an arc whose center is the rotation center of the pointer. Therefore, the center of the polar transform can be determined by fitting the arc with the coordinates of the scale value texts as data points. After the accurate coordinates of the scale value text boxes are obtained, the center coordinates are computed by the least-squares fitting method [25]. After that, the instrument image is converted into the polar coordinate system according to Equations (4) and (5).
ρ = √((x_o − C_x)² + (y_o − C_y)²)        (4)
θ = arctan((y_o − C_y) / (x_o − C_x))        (5)

wherein (x_o, y_o) are the abscissa and ordinate in the original coordinate system; ρ and θ are the polar radius and polar angle in the polar coordinate system; and (C_x, C_y) is the pole of the polar coordinate system. After the polar radius and polar angle of each pixel are worked out, they are taken as the abscissa and ordinate and the image is expanded in the rectangular coordinate system. Figure 5 shows the process of obtaining the center and expanding the polar coordinates based on the coordinates of the scale value texts.
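The center fitting and Equations (4)-(5) can be sketched as follows. The algebraic (Kasa) circle fit is one common form of the least-squares fitting cited above; the degree-based angle convention is an assumption, and whole-image expansion can be done with, e.g., OpenCV's cv2.warpPolar.

```python
import numpy as np

def fit_center(xs, ys):
    """Least-squares (Kasa) circle fit: minimizes x^2 + y^2 + Dx + Ey + F
    over the text-box coordinates and returns the center (Cx, Cy)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    b = -(xs ** 2 + ys ** 2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    return -D / 2.0, -E / 2.0        # circle center

def to_polar(x, y, Cx, Cy):
    """Equations (4)-(5): polar radius and angle of (x, y) about (Cx, Cy),
    with the angle in degrees in [0, 360)."""
    rho = np.hypot(x - Cx, y - Cy)
    theta = np.degrees(np.arctan2(y - Cy, x - Cx)) % 360.0
    return rho, theta
```

The data points fed to fit_center would be the centers (or vertices) of the detected scale value text boxes, which all lie approximately on the dial's scale arc.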

Pointer and Scale Extraction
After the polar transform, a point (x_o, y_o) in the original coordinate system is mapped to (W/360 × θ, H − ρ), where W and H are the width and height of the image after the polar transform, and θ and ρ are the polar angle and polar radius obtained from Equations (4) and (5). By transforming the coordinates of all vertices of the scale value text boxes in the original image, the point set A{P_1, P_2, ..., P_m} of all text-box vertices in the new image is obtained, as shown in Figure 6a. From Equations (6)-(9), the region R1 containing all scale value texts is obtained:

X_R1 = min(x_1, x_2, ..., x_m)        (6)
Y_R1 = min(y_1, y_2, ..., y_m)        (7)
W_R1 = max(x_1, x_2, ..., x_m) − min(x_1, x_2, ..., x_m)        (8)
H_R1 = max(y_1, y_2, ..., y_m) − min(y_1, y_2, ..., y_m)        (9)

wherein X_R1, Y_R1, W_R1, H_R1 are the abscissa and ordinate of the upper-left vertex of the region and its width and height, respectively; x_1, x_2, ..., x_m and y_1, y_2, ..., y_m are the abscissas and ordinates of the points in set A. Figure 6b shows the R1 region.
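Equations (6)-(9) reduce to taking the axis-aligned bounding box of the transformed vertex set A; a minimal sketch:

```python
import numpy as np

def text_region_R1(points):
    """Equations (6)-(9): bounding region of the transformed text-box
    vertices A{P1..Pm}; returns (X_R1, Y_R1, W_R1, H_R1)."""
    pts = np.asarray(points, float)        # shape (m, 2): [x, y] per vertex
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return x_min, y_min, x_max - x_min, y_max - y_min
```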
The R1 region is then expanded so that h′ = 2h, where h is the height of R1 and h′ is the height of R1 after expansion. The expanded region is the primary search region ROI1, as shown in Figure 6c.
In the primary search region, the image is first binarized by Otsu threshold segmentation, then the white pixels in each column are projected onto the X-axis, and the horizontal position with the least accumulated number of pixels is found, as shown in Figure 7. The lower image is the projection, and X_pointer is the horizontal position with the least number of pixels, i.e., the horizontal position where the pointer is located. Although pointer shapes vary with meter type, the projected pixel count is always smallest at the horizontal position of the pointer, so the horizontal coordinate of the pointer obtained by projection has high accuracy and robustness.
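The column-projection search can be sketched as follows, assuming the ROI has already been binarized (e.g. with OpenCV's Otsu thresholding, where the dark pointer becomes black and the dial background white):

```python
import numpy as np

def pointer_column(binary_roi):
    """Locate the pointer in the primary search region ROI1: project the
    white pixels of each column onto the X-axis and take the column with
    the smallest accumulated count (the dark pointer occludes the most
    background pixels). binary_roi: 2D array, 0 = black, 255 = white."""
    column_counts = (binary_roi > 0).sum(axis=0)   # white pixels per column
    return int(np.argmin(column_counts))           # X_pointer
```

The same projection, restricted to the secondary search region, yields the horizontal position X_scale of each main scale line.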
The next step is the extraction of the scale lines. Compared with the pointer, the scale line has less distinctive features and is more easily affected by other objects on the meter dial, so the search range is further narrowed within the primary search region. Based on the positions of the scale value texts, the secondary search region is obtained as follows: the region containing a scale value text bounding box is obtained from the vertices of that box, and an identical region is formed directly above it, as shown in Figure 8a, giving the secondary search region ROI2. Another vertical projection is conducted in the secondary search region: the white pixels in each column are projected onto the X-axis, and the horizontal position with the least accumulated number of pixels is found, as shown in Figure 8b. X_scale is the horizontal position with the least number of pixels, i.e., the horizontal position of the main scale line.
The reading is calculated in the primary search region using the distance method in the horizontal direction. In the pixel coordinate system of the primary search region, the horizontal coordinate X_pointer of the pointer and the horizontal coordinates X_l-scale and X_r-scale of the scale lines corresponding to the scale value texts on both sides of the pointer are worked out, as shown in Figure 9, and the reading is performed according to Equation (10):

V = V_l + (X_pointer − X_l-scale) / (X_r-scale − X_l-scale) × (V_r − V_l)        (10)

wherein V is the final reading; V_r is the scale value corresponding to the scale line on the right side of the pointer, and V_l is the scale value corresponding to the left side.
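Equation (10) is a linear interpolation of the pointer position between the two neighbouring main scale lines in the expanded (linear) image; a sketch with illustrative values:

```python
def distance_method_reading(x_pointer, x_l, x_r, v_l, v_r):
    """Equation (10): interpolate the reading from the pointer's horizontal
    position between the left (x_l, v_l) and right (x_r, v_r) scale lines."""
    return v_l + (x_pointer - x_l) / (x_r - x_l) * (v_r - v_l)

# Example: pointer halfway between the scale lines labelled 4 and 6
reading = distance_method_reading(150, 100, 200, 4.0, 6.0)  # -> 5.0
```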

Experiments
The proposed algorithm is evaluated and compared with previous algorithms in this part. The results of scale value text detection, image rectification, and extraction of the pointer and scale lines are described in turn. The proposed algorithm is built on the TensorFlow platform and the OpenCV library, and is tested on a host with a 3.6 GHz Intel Core i7 processor and 32 GB of memory. A test platform with real instruments is established, which includes a power supply, meters, a multimeter, a light source and a camera, as shown in Figure 10. Based on this platform, two datasets are established. One is used to evaluate the performance of the algorithm and to compare it with other algorithms; its images cover three meter types and combine instrument images under different conditions (uniform lighting, strong light exposure, shadows and different shooting angles), with 50 images per meter per condition. Table 2 shows this dataset. The other dataset is mainly used to verify the effect of shooting angle on the readings; it contains images taken at angles of −60°, −45°, −30°, −15°, 0°, 15°, 30°, 45° and 60° for each meter, as shown in Figure 11. For every instrument image in the datasets, the true value measured by the multimeter was recorded at collection time. The datasets for this article are currently not public.

Figure 11. Setting of shooting angle.

Scale Value Text Detection and Image Rectification
The public dataset SynthText [26] is used to pre-train the end-to-end text detection and recognition network, and then the labeled meter dataset is used to fine-tune the model. Training uses mini-batch gradient descent with a batch size of 64 for 100 epochs. The initial learning rate is 0.001 and decays exponentially with a decay rate of 0.95. Data augmentation is adopted during training, including cropping, rotation, hue changes and Gaussian noise.
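The exponential learning-rate schedule described above can be sketched as a plain function; note that the decay interval `decay_steps` is an assumption for illustration, since the paper only states the initial rate and the decay rate:

```python
def exp_decay_lr(step, initial_lr=0.001, decay_rate=0.95, decay_steps=1000):
    """Exponentially decayed learning rate as described in the training setup.

    initial_lr and decay_rate follow the paper; decay_steps (how many training
    steps per decay interval) is an assumed value for illustration.
    """
    return initial_lr * decay_rate ** (step / decay_steps)
```

For example, after one full decay interval the rate drops from 0.001 to 0.001 × 0.95 = 0.00095.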
To verify the feasibility and robustness of the image rectification algorithm, it is applied to meter images taken at different shooting angles. Figure 12e–h shows the transformation results of Figure 12a–d. The test results demonstrate that, as long as the meter dial carries scale value text, the parameters of the projection transformation matrix can be obtained from the text and the rectified image can be produced.
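A minimal sketch of the rectification step: estimate the 3 × 3 projection (homography) matrix from four point correspondences. Here `src` stands for text centers detected in the tilted image and `dst` for their positions in an ideal front-view dial; this pairing is an assumption for illustration (in the paper it comes from the recognized scale-value text), and OpenCV's `cv2.getPerspectiveTransform`/`cv2.warpPerspective` would perform the same computation:

```python
import numpy as np

def projection_matrix(src, dst):
    """Solve the 8 unknowns of a homography (h33 fixed to 1) by direct
    linear transformation from four point correspondences."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_projection(M, pt):
    """Map a point through the homography (homogeneous divide included)."""
    x, y, w = M @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

With the matrix in hand, every pixel of the tilted image can be remapped to the front view.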


Extraction of Pointer and Scale Line
This section first describes the results of the image polar transform. The accuracy of the polar transform directly affects the accuracy of extracting the pointer and scale lines, and determining the meter center is the key to the polar transform. A circularity-based center extraction method was proposed by Ma [15], but it is only applicable to centers with circular features. A center determination method based on a double Hough transform was presented by Sheng [17]. That algorithm is tested on the test set of this paper and compared with the proposed method based on scale value text coordinates. Table 3 shows the results of recognizing the meter center with the double Hough transform voting algorithm and with the proposed algorithm, respectively. The manual measured values and recognition values are center coordinates, and the error is the distance between them; the images are 2000 × 2000 pixels. As the table shows, the double Hough transform voting algorithm yields large errors on the 6th, 7th and 9th meter images; analysis shows that it cannot accurately fit the center under strong light and shadow, because the scale line features are not obvious in those conditions and too few fitting points are available. In this paper, text is used to fit the center instead of scale lines. Since the feature quantity of text is far greater than that of scale lines, the center can be accurately located under either strong light or shadow. Figure 13 shows these differences.
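One way to realize center fitting from text coordinates is a least-squares circle fit over the detected text-box centers, since the scale-value texts lie roughly on a circle around the dial center. The algebraic (Kåsa) fit below is a sketch under that assumption, not necessarily the paper's exact fitting procedure:

```python
import numpy as np

def fit_circle_center(points):
    """Kasa algebraic circle fit: minimize sum((x-cx)^2 + (y-cy)^2 - r^2)^2
    as a linear least-squares problem; returns the fitted center (cx, cy)."""
    pts = np.asarray(points, float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(pts))])
    b = x ** 2 + y ** 2
    (cx, cy, _), *_ = np.linalg.lstsq(A, b, rcond=None)
    return cx, cy
```

Because every detected text box contributes a point, the fit stays well conditioned even when individual scale lines are washed out by strong light or shadow.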

Figure 13. Detection results of meter image center.

Polar transform is performed after the center is obtained. The curved scale region is transformed into a linear one, and then the pointer and scale lines are extracted from the transformed image.
Instead of using the method of fitting scale line and pointer line to extract them, projection is carried out in the two narrowed ROI regions, and the horizontal position of pointer and scale line is determined to get the reading, as shown in Figure 14.
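The projection step inside a narrowed ROI can be sketched as follows, assuming the ROI of the unwrapped (linear) scale image has already been binarized with foreground pixels set to 1; the height threshold is an illustrative parameter, not a value from the paper:

```python
import numpy as np

def horizontal_positions(roi, min_height):
    """Vertical projection of a binarized ROI strip: sum each column, keep
    columns whose foreground height reaches min_height, and return the center
    of each contiguous run of such columns (pointer or scale-line positions)."""
    profile = roi.sum(axis=0)                 # foreground height per column
    cols = np.flatnonzero(profile >= min_height)
    if cols.size == 0:
        return []
    # group adjacent columns into runs; each run is one pointer/scale line
    splits = np.flatnonzero(np.diff(cols) > 1) + 1
    return [int(run.mean()) for run in np.split(cols, splits)]
```

Running this once on the scale-line ROI and once on the pointer ROI gives the horizontal coordinates needed for the reading.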
The result of automatic reading is compared with the real value to verify the accuracy of the meter reading algorithm. The reading measured by the multimeter is taken as the real value, and the output of the automatic reading algorithm is taken as the test value. The reference error is calculated according to Equation (11):

γ = (x − x0) / xm × 100% (11)

wherein x is the reading value of the algorithm, x0 is the real value, and xm is the meter full-scale value. The proposed algorithm and the algorithms of [14,15] are used to read different meters. Table 4 shows part of the reading results obtained with the proposed algorithm and the other algorithms. It can be seen that the proposed algorithm has higher accuracy than the other two algorithms and is more robust to the environment and the shooting angle. The reading errors for images taken at different shooting angles are larger than those under the other conditions: even after the image is rectified, the position of the pointer still differs from its position in a true front view. Nevertheless, because this paper rectifies the instrument image, its accuracy is greatly improved compared with the other reading algorithms. Table 5 shows the average relative error of the proposed algorithm and the other two algorithms [14,15] on three different types of meters. The proposed algorithm achieves better accuracy on different meters and has good adaptive capability.
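Equation (11) can be written as a small helper, with variable names following the definitions above:

```python
def reference_error(x, x0, xm):
    """Reference (fiducial) error in percent.

    x:  reading value produced by the algorithm
    x0: real value measured by the multimeter
    xm: meter full-scale value
    """
    return (x - x0) / xm * 100.0
```

Normalizing by the full-scale value xm (rather than by x0) is what makes this a reference error: a fixed absolute deviation counts the same anywhere on the dial.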

Analysis of the Error
This section analyzes the error caused by circle-center fitting; the result is shown in Figure 15. Figure 15a shows the accurate rotation center and the center coordinates obtained by the algorithm; the blue point is the accurate rotation center, obtained as the intersection of pointers at different rotation angles. The distance between the green coordinate and the blue coordinate is 15 pixels, and the distance between the red coordinate and the blue coordinate is 95 pixels. Figure 15b is the result of the polar transform centered on the correct rotation center; Figure 15c is the result centered on the point with the 15-pixel error; Figure 15d is the result centered on the point with the 95-pixel error. Figure 15f–h show the primary search region ROI1 obtained from Figure 15b–d, together with the horizontal coordinates of the pointer and the main scale line obtained by the secondary search algorithm. The readings calculated according to Equation (10) are 196.29 V, 196.82 V and 0 V, respectively. The results show that a 15-pixel center fitting error neither has a large impact on the polar-transformed image nor introduces a large reading error, whereas the 95-pixel error makes the reading fail. We tested the images in the second data set and calculated the error.
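The distance method used for the readings above (Equation (10)) amounts to linear interpolation of the pointer's horizontal position between the two neighboring main scale lines in the unwrapped image; a sketch with assumed variable names:

```python
def distance_reading(p_pointer, p_left, p_right, v_left, v_right):
    """Distance-method reading in the unwrapped scale image.

    p_pointer:        horizontal coordinate of the pointer
    p_left, p_right:  horizontal coordinates of the neighboring main scale lines
    v_left, v_right:  scale values of those two lines
    """
    ratio = (p_pointer - p_left) / (p_right - p_left)
    return v_left + ratio * (v_right - v_left)
```

A pointer halfway between the 190 V and 200 V scale lines, for instance, yields a reading of 195 V.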
The relationship between shooting angle and error is shown in Figure 16. The greater the shooting angle, the greater the reading error, because the pointer and the dial are not on the same plane. When the shooting angle is within 30 degrees, the reading error is within 1%; when the angle exceeds 30 degrees, the error exceeds 1%.

Conclusions
According to the spatial distribution pattern of the scale value text and the scale region of pointer meters, an automatic reading algorithm of pointer meters based on text detection is proposed, which has high robustness and adaptability. First, deep learning is applied to detect and recognize the scale value text in the meter dial. Then, the image is rectified and the meter center is determined based on the coordinates of the scale value text. Next, the curved scale region is transformed into a linear scale region by the polar transform, and a secondary region search based on the position of the scale text obtains the horizontal positions of the pointer and the scale lines. Finally, the distance method is used to read the scale region where the pointer is located. Only an ideal image of a given meter type needs to be supplied for image rectification, and the algorithm can then realize automatic reading of that type of meter.
