Fast and Flexible Movable Vision Measurement for the Surface of a Large-Sized Object

The movable vision measurement method presented here for the three-dimensional (3D) surface of a large-sized object offers system simplicity, low cost, and high accuracy. To address the shortcomings of existing movable vision measurement methods, this paper introduces a method better suited to large-sized products on industrial sites. A raster binocular vision sensor and a wide-field camera are combined to form a 3D scanning sensor. During measurement, several planar targets are placed around the object to be measured. With the planar targets as intermediaries, the local 3D data measured by the scanning sensor are integrated into the global coordinate system. The effectiveness of the proposed method is verified through physical experiments.


Introduction
In the manufacture and assembly of large-sized objects, using sensor feedback to guide processing and assembly through rapid online measurement of large-sized surface morphologies can significantly improve machining efficiency and the quality of the resulting parts. With advances in computer technology, image processing, and pattern recognition, vision measurement technology has developed rapidly. Vision measurement systems have gradually become the most important means of three-dimensional (3D) surface topography measurement for large-sized objects [1][2][3][4][5]. Currently, the main characteristic of 3D surface topography measurement for large-sized objects is that the measuring position is essentially fixed, because on the industrial site each batch of large-sized products does not change significantly in shape. At the same time, fast-paced production and limited measuring space require that the measurement system be highly accurate, fast, and simple and flexible in structure. However, most existing vision measurement systems cannot meet the needs of fast-paced on-site production. Research on fast and high-precision 3D shape measurement of large-sized objects is therefore important in the industrial field.
At present, 3D shape measurement basically uses three methods: structured light vision measurement, Fourier transform profilometry, and phase measurement profilometry. Structured light vision measurement includes the multi-line structured light method and the coded structured light method. Because light strip matching is difficult, the multi-line structured light method [6,7] is usually applied to object geometric measurement. The coded structured light method [8][9][10] is an effective means of obtaining dense 3D point clouds of an object's surface morphology; it operates on a simple principle and has a high degree of automation, so it is the most commonly used of the 3D shape measurement methods. The biggest advantage of Fourier transform profilometry [11][12][13] is that it can measure the 3D surface topography of an object from a single image, which makes it suitable for dynamic 3D measurement. Its disadvantages are its long operation time and low degree of automation, so it is not suitable for industrial measurement. Phase measurement profilometry [14][15][16] has high accuracy and is currently the most frequently used 3D shape measurement method; however, the algorithm is complex and the phase-unwrapping problem remains.
Single-vision sensors cannot measure the overall 3D surface topography of a large-sized object because of occlusions. The usual approach is to divide the area to be measured into a number of sub-regions and to integrate the 3D data of all sub-regions into the global coordinate system to obtain the 3D morphology of the entire surface. Depending on how the data are unified, surface 3D morphology vision measurement of large-sized objects falls into two categories: movable single-vision sensor measurement and fixed multiple-vision sensor measurement.
The movable single-vision sensor method measures the 3D morphology of the entire surface of a large-sized object with a single movable vision sensor. It uses simple equipment and is low in cost. The method either affixes markers to the object or uses a planar target to integrate the sub-region 3D data into the global coordinate system. A typical method using adhesive markers is the ATOS movable 3D optical measurement system developed by GOM. However, many objects to be measured (e.g., soft objects, liquids, or high-precision mechanical components) cannot be labeled. Moreover, attaching markers takes a long time and the markers are easily deformed. With the planar target method [17], by contrast, errors caused in successive single movable measurements easily accumulate. In addition, the planar target must be repositioned in front of the measured object during the measurement, so the measuring time is long and the operation is complex.
Fixed multiple-vision sensor measurement [18][19][20] employs more on-site vision sensors and requires a global calibration of the multiple vision sensors. Based on the global calibration results, the data obtained by each vision sensor are united into the global coordinate system. Typical measurement systems of this kind include the auto-body geometry detection system of the American company Perceptron and the online full-profile measurement system of the Italian company MERMEC. The principle of the method is simple, but the systems are complex and on-site calibration is difficult. After the measurement system is moved or the measured objects are changed, the system must be reconfigured and recalibrated. At present, the method is often used for geometric size measurement of mass-produced large-sized products in the industrial field, but it is not suitable for large-sized, complex 3D surface reconstruction.
Based on the above analysis, the movable single-vision sensor measurement method appears more suitable than fixed multiple-vision sensor measurement for 3D surface topography measurement of large-sized objects. To achieve rapid measurement of the 3D surface topography of large-sized objects, particularly mass-produced ones in the industrial field, the method proposed herein combines a raster binocular vision sensor with a wide-field camera to form a 3D scanning sensor. Multiple planar targets arranged around the measured object serve as intermediaries, and the local 3D data obtained from the 3D scanning sensor are integrated into the global coordinate system. The remainder of the paper is organized as follows: Section 2 describes the structure and the mathematical model of the 3D scanning sensor, Section 3 provides a detailed description of the basic principles of the algorithm, and Section 4 verifies the effectiveness of the proposed algorithm through experiments.

System Measurement Principle
The structural schematic of the measurement system is shown in Figure 1. The coordinate systems of planar targets 1 and 2 are O_t1-x_t1y_t1z_t1 and O_t2-x_t2y_t2z_t2, respectively. O_o-x_oy_oz_o is the 3D scanning sensor coordinate system. The coordinate system of planar target 1, O_t1-x_t1y_t1z_t1, is selected as the global coordinate system O_G-x_Gy_Gz_G. The 3D scanning sensor is placed in front of the measured object so that the wide-field camera can "see" the planar targets. T_C,t1 and T_C,t2 are the transformation matrices from the 3D scanning sensor coordinate system O_o-x_oy_oz_o to the coordinate systems of planar targets 1 and 2, respectively. T_t2,t1 is the transformation matrix from planar target 2 to planar target 1.
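Given these definitions, the relative pose between two targets follows by chaining: T_t2,t1 = T_C,t1 · T_C,t2^-1. A minimal NumPy sketch of this composition (the 4 × 4 homogeneous matrices here are illustrative placeholders, not calibration results):

```python
import numpy as np

def relative_target_pose(T_C_t1, T_C_t2):
    """Given 4x4 homogeneous transforms from the scanning-sensor frame
    to planar targets 1 and 2, return T_t2,t1, the transform taking
    target-2 coordinates to target-1 coordinates."""
    return T_C_t1 @ np.linalg.inv(T_C_t2)
```

A point expressed in target 2's frame can be mapped first into the sensor frame (via the inverse of T_C,t2) and then into target 1's frame (via T_C,t1); `relative_target_pose` performs both steps in one matrix.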

Figure 1. The structural schematic of the measurement system.
The proposed system includes a 3D scanning sensor, multiple planar targets, a high-speed image acquisition system, a computer, measurement software, and the corresponding mechanical structure. The basic principle of the measurement system is as follows: first, multiple planar targets are arranged around the measured object. Second, the raster binocular stereo vision sensor of the 3D scanning sensor measures the local 3D surface of the object, while the wide-field camera of the 3D scanning sensor measures the planar targets. Finally, the planar targets function as mediators to integrate all local 3D data measured by the 3D scanning sensor into the global coordinate system O_G-x_Gy_Gz_G.

3D Scanning Sensor
As shown in Figure 2, the 3D scanning sensor includes a raster binocular stereo vision sensor and a wide-field camera. The raster binocular stereo vision sensor consists of two cameras and a projector. The wide-field camera combines a high-resolution camera with a four-surface plane mirror, and can be regarded as four virtual cameras that achieve multi-angle measurement. Compared with the curved mirrors in current panoramic cameras, the model of the wide-field camera with a four-surface mirror is simpler and its measurement accuracy is higher, although its field of view has a blind area. The size and measuring location of mass-produced large-sized products are basically fixed on industrial production sites, so each movement position of the 3D scanning sensor can be determined in advance, and the positions of the planar targets used in the proposed method can be optimized for each moving point of the 3D scanning sensor. Therefore, the wide-field camera with a four-surface mirror is better suited to industrial production than a panoramic camera with curved mirrors; this is the main reason why plane mirrors are used in the proposed method instead of curved mirrors.
where P_s is the 3D coordinate of P in the coordinate system O_s-x_sy_sz_s. T_os can be obtained by calibration before measurement [21].

Light Strip Coding
The proposed method uses an existing binary-coded method [22] to match the light strips of the left and right cameras. The projector first casts the six black-and-white images arranged as in Figure 4a-f. Black is defined as 0 and white as 1. The coding index of the first black light bar region on the left side of the 64 black-and-white light bar regions in Figure 4f is 000000. From left to right, the successive light bar codes are 000001, 000010, 000011, and so on.
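The decoding step described above can be sketched as follows. This is a simplified illustration assuming already-binarized pattern images (real camera images require thresholding, which the binary-coded method [22] handles); the function name is illustrative:

```python
import numpy as np

def decode_strip_indices(patterns):
    """patterns: sequence of 6 binary images (H x W, values 0/1),
    most significant pattern first, as in Figure 4a-f.
    Returns an H x W array of light strip indices 0..63."""
    idx = np.zeros(patterns[0].shape, dtype=np.uint8)
    for p in patterns:
        # Shift in one bit per projected pattern.
        idx = (idx << 1) | p.astype(np.uint8)
    return idx
```

Each pixel accumulates one bit per pattern, so after six patterns every pixel carries the 6-bit index (000000 to 111111) of the light bar region it lies in.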
In Figure 4a-f, the black-and-white light bar regions are used only to construct the coding regions that identify the 64 light strips of each light strip image in Figure 4g-j. For the four light strip images shown in Figure 4g-j, the Steger algorithm [23] is used to extract the light strip center points. First, the Hessian matrix is used to determine the pixel-level coordinate and the normal direction of the light strip center. Then, the sub-pixel coordinate of the light strip center is obtained by solving for the extreme point in the normal direction, as shown in Figure 6a. Finally, a link-constraint method is used to remove wrong light bar centers and to link the correct light strip centers together into a plurality of segments, as shown in Figure 6b.
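The Hessian-based, sub-pixel center extraction can be sketched as follows. This is a simplified illustration of the Steger approach, not the authors' implementation: it omits the linking step and uses an assumed curvature threshold (`curv_thresh`):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def strip_centers(img, sigma=2.0, curv_thresh=0.01):
    # Gaussian partial derivatives of the image.
    rx  = gaussian_filter(img, sigma, order=(0, 1))  # d/dx
    ry  = gaussian_filter(img, sigma, order=(1, 0))  # d/dy
    rxx = gaussian_filter(img, sigma, order=(0, 2))
    rxy = gaussian_filter(img, sigma, order=(1, 1))
    ryy = gaussian_filter(img, sigma, order=(2, 0))
    centers = []
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            H = np.array([[rxx[y, x], rxy[y, x]],
                          [rxy[y, x], ryy[y, x]]])
            evals, evecs = np.linalg.eigh(H)
            k = int(np.argmax(np.abs(evals)))
            # A bright line has strongly negative curvature across it.
            if evals[k] > -curv_thresh:
                continue
            nx, ny = evecs[:, k]              # unit normal to the line
            denom = (rxx[y, x] * nx * nx + 2.0 * rxy[y, x] * nx * ny
                     + ryy[y, x] * ny * ny)
            if denom == 0.0:
                continue
            # Sub-pixel extremum along the normal (2nd-order Taylor).
            t = -(rx[y, x] * nx + ry[y, x] * ny) / denom
            if abs(t * nx) <= 0.5 and abs(t * ny) <= 0.5:
                centers.append((x + t * nx, y + t * ny))
    return centers
```

The eigenvector of the Hessian with the largest-magnitude eigenvalue gives the normal direction; the sub-pixel offset is the extremum of a second-order Taylor expansion along that normal, accepted only if it falls within the current pixel.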

Partial 3D Reconstruction
Section 2.2 shows that each light strip in the projected image corresponds to a unique code index. The light strips captured by the left and right cameras of the raster binocular stereo vision sensor can be matched according to the code index. The corresponding points of the light strip captured by two cameras can be obtained according to the epipolar constraints. Finally, the corresponding points are substituted into the raster binocular stereo vision model to calculate the 3D coordinates of the corresponding points.
The schematic of the raster binocular vision sensor model is shown in Figure 7. The left camera's coordinate system is O_c1-x_c1y_c1z_c1 and the right camera's is O_c2-x_c2y_c2z_c2, where K_1 and K_2 are the intrinsic parameter matrices of the left and right cameras, respectively. The light strip matching between the left and right cameras is achieved by light strip coding, and the epipolar constraints are added to obtain the corresponding light strip center points of the two cameras. The corresponding points are substituted into Equation (4) to calculate their 3D coordinates.
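The 3D reconstruction of a matched pair of center points can be illustrated with a standard linear (DLT) triangulation sketch; this is a generic stereo formulation under the usual pinhole model, not necessarily the authors' exact Equation (4):

```python
import numpy as np

def triangulate(K1, K2, R, t, p1, p2):
    """Linear (DLT) triangulation of one left/right correspondence.
    K1, K2: 3x3 intrinsic matrices; R, t map left-camera coordinates
    to right-camera coordinates; p1, p2: matched pixel coordinates.
    Returns the 3D point in the left-camera frame."""
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R, t.reshape(3, 1)])
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.vstack([p1[0] * P1[2] - P1[0],
                   p1[1] * P1[2] - P1[1],
                   p2[0] * P2[2] - P2[0],
                   p2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

The least-squares solution is the right singular vector of A with the smallest singular value, dehomogenized to metric coordinates.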

Global Unity of Partial 3D Data
The partial 3D reconstruction process of the 3D scanning sensor is introduced in Section 2.3. However, limited by the vision sensor's field of view, the partial 3D reconstruction of the 3D scanning sensor can only measure the local 3D data of large-sized objects. To achieve the overall 3D reconstruction of large-sized objects, all local 3D data need to be integrated into the global coordinate system.
The coordinate system of planar target 1, O_t1-x_t1y_t1z_t1, is selected as the global coordinate system O_G-x_Gy_Gz_G. P_O is the 3D coordinate of the light strip center point P measured by the raster binocular vision sensor in the 3D scanning sensor coordinate system O_o-x_oy_oz_o, and P_G is the 3D coordinate of P in the global coordinate system O_G-x_Gy_Gz_G. As shown in Figure 1, when the wide-field camera of the 3D scanning sensor can "see" both planar targets placed around the large-sized object, T_C,t1 and T_C,t2 can be calculated, and then T_t2,t1. The local 3D coordinates measured by the 3D scanning sensor can then be integrated into the global coordinate system using Equation (3): P_G = T_C,t1 · P_O. When the wide-field camera cannot "see" planar target 1 but can "see" planar target 2, the local 3D data can be integrated into the global coordinate system using Equation (4): P_G = T_t2,t1 · T_C,t2 · P_O. To improve system efficiency and flexibility, multiple planar targets can be arranged around the large-sized object on the measurement site.
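Applying these relations to a measured point cloud can be sketched as follows (the function and argument names are illustrative, not from the paper):

```python
import numpy as np

def to_global(P_O, T_C_t1=None, T_t2_t1=None, T_C_t2=None):
    """Map an (N, 3) array of sensor-frame points P_O into the global
    frame (planar target 1). If target 1 is visible, pass T_C_t1;
    otherwise pass T_t2_t1 and T_C_t2 and the transforms are chained."""
    T = T_C_t1 if T_C_t1 is not None else T_t2_t1 @ T_C_t2
    # Promote to homogeneous coordinates and apply the 4x4 transform.
    Ph = np.hstack([P_O, np.ones((len(P_O), 1))])
    return (Ph @ T.T)[:, :3]
```

Either branch reduces to one homogeneous matrix, so local scans from every sensor position end up in the same target-1 frame regardless of which targets were visible.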

Physical Experiments
The setup of the physical experiment is shown in Figure 8. The raster binocular stereo vision sensor of the 3D scanning sensor consists of two cameras (AVT GC1380H, Allied Vision Technologies, Stadtroda, Germany; 17 mm lenses, resolution 1360 × 1024, measurement volume 500 mm × 380 mm × 400 mm) and one projector (Dell M110, Dell, Round Rock, TX, USA; resolution 1360 × 768). The wide-field camera of the 3D scanning sensor consists of one camera (Point Grey Research, Richmond, BC, Canada; 12 mm lens, resolution 2448 × 2048) and one four-surface plane mirror. The planar target contains a 10 × 10 array of feature points, with a machining accuracy of 5 μm.

System Calibration Results
First, the raster binocular stereo vision sensor and the intrinsic parameters of the wide-field camera are calibrated based on [24,25]. Then, the transformation matrices T_m21, T_m31, T_m41, and T_os are calibrated by [21]. The planar target used for calibration is the same as that used in the measurement system.
The calibration results of the 3D scanning sensor are as follows: a planar target is placed in front of the binocular stereo vision sensor at two positions, and the sensor measures the distances between the feature points of the target. Comparing the real distances with the measured distances, the RMS error is 0.09 mm. The binocular stereo vision sensor also measured the feature points of the planar target 100 times; the repeatability deviation is 0.03 mm. To verify the effectiveness of the proposed method, the following experiments were conducted to evaluate the measurement accuracy. The following subsection describes the procedures and results in detail.
A self-designed method is used to evaluate the global measurement precision. The experimental procedure is as follows: a one-dimensional (1D) target with two characteristic points (the distance between the two points is 1234.15 mm, with a precision of 0.01 mm) is placed in front of the 3D scanning sensor, which measures the characteristic points of the 1D target, as shown in Figure 9. First, the 3D scanning sensor measures the left characteristic point of the 1D target at the first position; then it measures the right characteristic point at the second position. Finally, all characteristic points measured by the scanning sensor at the two positions are integrated into the global coordinate system via the planar targets. The above process is repeated eight times. The distance between the two points is calculated as the measured distance d_m, and the real distance between the two points of the 1D target is the ideal distance d_t = 1234.15 mm. The deviation Δd between d_m and d_t and the RMS error are calculated to evaluate the global accuracy of the proposed method. The images captured by the wide-field camera and the two cameras of the raster binocular stereo vision sensor at the first and second positions are shown in Figure 10a,b, respectively. The distances between the two points of the 1D target measured by the 3D scanning sensor at the eight position pairs and the RMS error are listed in Table 1. The results show that the global measurement accuracy of the proposed method can reach 0.14 mm.
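The deviation and RMS computation used in this evaluation can be sketched as follows (the distances in the usage example are hypothetical, not the values from Table 1):

```python
import numpy as np

def global_accuracy(d_measured, d_true):
    """Return per-trial deviations and the RMS error for repeated
    two-point distance measurements of the 1D target."""
    d = np.asarray(d_measured, dtype=float)
    dev = d - d_true                       # deviation of each trial
    rms = float(np.sqrt(np.mean(dev ** 2)))
    return dev, rms
```

For example, `global_accuracy([1234.29, 1234.01, 1234.22], 1234.15)` returns the three signed deviations and their RMS, mirroring how Δd and the RMS error summarize the eight repetitions.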

Real Data Measurement Experiment
To verify the effectiveness of the proposed method, the following real data measurement experiment was designed. In the experiment, the 3D scanning sensor is moved twice to measure the 3D morphology of two parts of a train wheel. The wide-field camera of the 3D scanning sensor measures the arranged planar targets. With the planar targets as intermediaries, the local 3D data from the two measurements are united into the global coordinate system. The coded images and the light strip images are shown in Figures 11 and 12, respectively.
The 3D morphology of the object measured by the 3D scanning sensor at the first position is shown in Figure 13a, the morphology measured at the second position is shown in Figure 13b, and the united 3D morphology from both positions is shown in Figure 13c. To further validate the effectiveness of the proposed algorithm, we also measured a missile model; its 3D morphology is shown in Figure 14.

Conclusions
Given that existing movable vision measurement methods for the 3D surfaces of large-sized objects have problems such as long operation times, low efficiency, and unsuitability for soft surfaces, this paper introduces a fast, high-precision movable vision measurement method for the 3D surfaces of large-sized objects.
Compared with existing measurement methods, the proposed method combines a raster binocular vision sensor with a wide-field camera to form a scanning sensor, so it is not necessary to paste markers onto the object's surface. Meanwhile, the proposed method performs the partial 3D measurement and the local 3D data integration synchronously, so the target does not need to be moved repeatedly in front of the object and the 3D scanning sensor, which greatly improves measurement efficiency. The physical experiments confirm that, for a 1D target about 1.2 m long, the accuracy of the proposed method can reach 0.14 mm. The proposed method thus offers high flexibility and efficiency.
Since the size and measuring location of mass-produced large-sized products are basically fixed, each movement position of the 3D scanning sensor can be determined in advance, and the positions of the planar targets used in the proposed method can be optimized based on each moving point of the 3D scanning sensor. The proposed method is therefore especially suitable for measuring mass-produced large-sized products on industrial sites.