Article

A Binocular Color Line-Scanning Stereo Vision System for Heavy Rail Surface Detection and Correction Method of Motion Distortion

1 School of Transportation, Ludong University, Yantai 264025, China
2 School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
3 Beijing Institute of Control and Electronics Technology, Beijing 100045, China
* Authors to whom correspondence should be addressed.
J. Imaging 2024, 10(6), 144; https://doi.org/10.3390/jimaging10060144
Submission received: 8 May 2024 / Revised: 6 June 2024 / Accepted: 10 June 2024 / Published: 13 June 2024

Abstract

Thanks to the line-scanning camera, measurement methods based on line-scanning stereo vision offer high optical accuracy, high data-transmission efficiency, and a wide field of view, making them well suited to the continuous operation and high transport speeds of industrial inspection sites. However, the one-dimensional imaging characteristic of the line-scanning camera causes motion distortion during image acquisition, which directly affects detection accuracy. Effectively reducing the influence of motion distortion is therefore the primary problem in ensuring detection accuracy. To obtain a two-dimensional color image and three-dimensional contour data of the heavy rail surface simultaneously, a binocular color line-scanning stereo vision system is designed to collect heavy rail surface data, combined with bright-field illumination from a symmetrical linear light source. To address the image motion distortion caused by system installation error and by a mismatch in the collaborative acquisition frame rate, this paper uses a checkerboard target and a two-step cubature Kalman filter algorithm to solve the nonlinear parameters of the motion distortion model, estimate the real motion, and correct the image information. Experiments show that the accuracy of the data contained in the image is improved by 57.3% after correction.

1. Introduction

For high-speed rail transportation, the quality of heavy rail is a key factor in ensuring the safety of train operations. As an important means of controlling the quality of heavy rail products, surface quality detection of heavy rail directly affects the economic benefits of enterprises and the safety of railway transportation. Unlike general industrial product detection and in-service track detection [1,2], surface quality detection of heavy rail on production lines faces the following problems and challenges: harsh working conditions, the complex surface background of heavy rail during rolling, scattered and random defects, high production and transmission speeds, and continuous operation. Existing surface defect detection methods for heavy rail production lines mainly include manual detection, traditional non-destructive detection, and detection based on machine vision. Among them, machine vision technology, with its non-contact nature and strong adaptability, can effectively meet the requirements of high-speed in-line quality inspection. Compared with other detection methods, machine vision offers higher flexibility and easier automated deployment and adapts better to the usage environment.
As early as the 1990s, machine vision technology was already being applied to surface defect detection in industrial production. In 1990, Piironen et al. introduced a prototype of an automatic visual online metal strip detection system [3]. In 2013, Song et al. [4] presented an automatic recognition method for hot-rolled steel strip surface defects that is robust to intra-class feature variation, illumination, and grayscale changes. Yang et al. investigated an online fault detection technique based on machine vision for conveyor belts [5]. Wang et al. [6] proposed an automatic detection method for rail fastener defects based on machine vision. In 2018, Dong et al. proposed an automatic defect detection network for surface defect detection, which achieved high-precision results on data sets such as steel strips, ceramic tiles, and road surface defects [7]. To optimize the efficiency of rail defect detection, Cheng et al. [8] proposed a detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) and machine vision in 2020. Song et al. [9] proposed a saliency propagation algorithm (MCITF) based on multiple constraints and improved texture features to handle the complex variation of strip surface defects and the similarity between defect classes. Zhang et al. [10] proposed a unified method to detect both common and rare defects on the surface of aluminum profiles, which develops an attention module that provides PMs to improve the accuracy on both common and rare defect categories. In 2021, Zhou [11] studied a rail defect detection system based on machine vision to identify the rail track areas that require polishing. Guo et al. [12] used the Mask R-CNN network to detect defects on the rail surface and perform semantic segmentation of defect areas. In 2022, Ma et al. [13] proposed a novel one-shot unsupervised domain adaptation framework for rail surface defect segmentation under different service times and natural conditions, which effectively improved the robustness of the model to distribution differences. Sun et al. [14] proposed an unsupervised defect detection system for aluminum plate inspection using a combination of bright-field and dark-field illumination in 2023.
At present, the 3D measurement methods used in heavy rail detection research mainly include line-structured-light triangulation and stereo vision measurement. Among them, a stereo vision system based on line-scanning cameras benefits from a large field of view, ultra-high resolution, and high acquisition speed [15]. It offers higher optical accuracy, a wider field of view, and higher data-transmission efficiency, and is better suited to the continuous operation and high transport speeds of industrial inspection sites. It can also collect the corresponding color and texture information while acquiring three-dimensional data [16,17,18]. Therefore, a scheme based on line-scanning stereo vision is also more conducive to acquiring 3D data of the heavy rail surface on the production line.
In 2009, Huke et al. [19] first proposed a three-dimensional measurement method using structured light based on a single line-scanning camera. In 2015, Lilienblum [20] proposed a line-scanning three-dimensional measurement method based on the intersection measurement of two line-scanning cameras, gradually establishing the theoretical basis and equipment model of line-scanning stereo vision. On this basis, Niu et al. proposed an unsupervised stereo saliency detection method based on a binocular line-scanning system, which provides an effective way to locate rail surface defects [21]. In 2021, Wu et al. [22] proposed a linear laser scanning measurement method to complete the 3D scanning of a freeform structure. However, in stereo vision applications, directly using the acquired image data can yield incorrect detection results because of the one-dimensional imaging characteristic of the line-scanning camera. When the scanning plane of the visual sensor is not perpendicular to the transport direction, the scanned image exhibits oblique deformation, which affects how faithfully the surface texture of the heavy rail is described.
The binocular color line-scanning stereo vision system studied in this paper adopts the measurement principle of coplanar intersection. Unlike known measurement methods based on the intersection of different planes, coplanar intersection measurement offers greater flexibility and higher accuracy in obtaining the surface profile of the measured object [23]. The field-of-view planes of the two cameras coincide, which ensures synchronous imaging of measurement points in the space shared by the two cameras. Depth can be calculated simply by searching for corresponding points within the same frame, which is suitable for obtaining three-dimensional point clouds of weakly textured surfaces in various poses. At the same time, coplanar intersection measurement avoids introducing additional motion information and epipolar constraints into the matching process, which greatly improves the speed and accuracy of registration and reduces the computational cost [22,24,25]. The motion distortion in this method arises mainly from the installation error between the camera and the transport device, vibration during transport, and image stretching or compression caused by a frame-rate mismatch between image acquisition and the rotary encoder. Severe motion distortion not only generates erroneous contour information, increases the false alarm rate, and leads to abnormal detection results, but also hampers the screening, classification, and localization of suspected defect areas in the actual detection process because of inaccurate information transmission [26]. To reduce the influence of the above factors, a checkerboard target with coordinate information is used to estimate the actual motion and correct the final image information. This can not only guide device installation but also compensate for the impact of motion distortion. The two-step cubature Kalman filter algorithm is used to iteratively solve the nonlinear parameters of the motion distortion model, and the effectiveness of the correction algorithm is verified through experiments.

2. Hardware Acquisition System

The binocular color line-scanning stereo vision system consists of a binocular color line-scanning camera, an illumination system, and an experimental transmission motion platform.

2.1. Binocular Color Line-Scanning Camera

The hardware of the binocular color line-scanning system mainly includes a line-scanning camera unit, a binocular integration unit, and a cooperative data acquisition and transmission control unit. Compared with an area-scanning camera, the sensor elements of a line-scanning camera are concentrated along the length direction, where their number can easily reach several thousand, while the width direction contains only a few rows. This allows line-scanning CCD cameras to have a larger field of view and higher resolution in the lengthwise direction than area-scanning cameras [1]. In addition, the reduced overall number of photosensitive elements allows line-scanning cameras to achieve higher scanning frequencies. In practical applications, line-scanning cameras can continuously image the products to be inspected by scanning and stitching. Therefore, line-scanning cameras are widely used in the detection of industrial products in continuous, uniform motion, such as metals, plastics, paper, and cloth fabrics.
The binocular color line-scanning stereo vision system is built on binocular vision measurement, and the triangulation principle is used to obtain RGB color images and calculate the corresponding 3D depth information, as shown in Figure 1. The color line-scanning camera system used in this paper is based on the 3DPIXA camera from Chromasens, which is mainly oriented toward high-speed 3D measurement applications such as food production inspection, industrial parts, and natural target image reconstruction. It uses a trilinear CCD line sensor (RGB) that communicates with the PC through Camera Link. In addition to its built-in 3D API, it also supports industrial image processing software such as HALCON (MVTec) for subsequent vision application development. The optical resolution of the camera is up to 70 µm/pixel, and its maximum acquisition speed is up to 1.4 m/s with a maximum frame rate of 21 kHz. In addition, the line-scanning camera has 7142 ultra-high-resolution pixels in each line to capture RGB three-channel color information. The specific parameters of the camera are listed in Table 1.
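As a quick consistency check of these figures (assuming one scan line is acquired per 70 µm of travel, which is an illustrative assumption rather than a vendor formula), the line rate required at the maximum transport speed can be estimated as follows:

```python
# Rough line-rate estimate for a line-scan camera moving relative to the target.
# Assumes one scan line per step of the optical resolution (an illustrative assumption).
optical_resolution_m = 70e-6    # 70 µm/pixel in the transport direction
max_speed_m_per_s = 1.4         # maximum transport speed (1480 mm/s in Table 1, ~1.4 m/s)

required_line_rate_hz = max_speed_m_per_s / optical_resolution_m
print(f"Required line rate: {required_line_rate_hz / 1e3:.1f} kHz")  # ~20 kHz, below the 21 kHz maximum
```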

2.2. Lighting System Selection and Layout

Inappropriate lighting can cause many problems. For example, blotches and overexposure can hide important information, and shadows can cause false edge detection. A reduced signal-to-noise ratio, as well as non-uniform illumination, can make it difficult to select thresholds for image processing. The factors to consider when choosing the optimal lighting scheme include the light intensity, polarization, uniformity, direction, size and shape of the light source, whether the light is diffuse or direct, and the background, as well as the optical characteristics of the inspected objects (color, smoothness, etc.), the working distance, the object size, and luminescence, as shown in Figure 2.
Table 2 shows the main properties of halogen lamps, fluorescent lamps, and LED light sources. As can be seen from the table, the LED light source has high efficiency, small size, low heat generation, low power consumption, stable luminescence, and long life (a red LED can last up to 100,000 h, and other colors can reach 30,000 h), and it can be assembled, through different combinations, into light sources of different shapes and lighting modes, such as ring lights, dome lights, coaxial light sources, and bar lights.
If a black-and-white camera is used, there is no special requirement on the color selection relative to the measured object, and a red LED is the most appropriate choice. Generally, a CCD is not sensitive to violet and blue light, and an uncoated CCD is most sensitive in the near-infrared region. If color imaging is performed, a white light source must be used.
Whether the final image acquisition meets the requirements depends mainly on the layout relationship between the lighting system and the CCD camera. Different layouts, such as linear light sources, planar diffuse light sources, line-scanning CCD cameras, area-scanning CCD cameras, and mixed layouts, have different effects on the resolution and contrast of the collected image. Several common layouts are shown in Figure 2. For binocular line-scanning cameras with ultra-high resolution, a small aperture setting is usually used to reduce lens distortion, minimize the effect of assembly errors, and expand the depth-of-field range.
The binocular color line-scanning stereo vision system in this paper uses a bright-field illumination scheme. As shown in Figure 3, the depth of field varies with the focal length, aperture value, and working distance. When the aperture becomes smaller, regions in front of and behind the subject become sharper, but the overall picture becomes darker because less light is admitted. To capture sharper images, it is therefore necessary to use a small aperture while increasing the amount of incident light, which makes a bright-field lighting scheme necessary. Because a single light source cannot meet this demand, the symmetrical linear LED light source layout shown in Figure 4 is adopted.
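As a rough numerical illustration of this trade-off, the sketch below evaluates the standard thin-lens depth-of-field approximation for a few aperture values at the working distance listed in Table 1; the focal length and circle-of-confusion values are assumptions, not specifications of the 3DPIXA optics.

```python
def depth_of_field(f_mm, f_number, subject_dist_mm, coc_mm):
    """Standard thin-lens depth-of-field approximation.
    f_mm: focal length, f_number: aperture value N,
    subject_dist_mm: focus distance, coc_mm: circle of confusion."""
    H = f_mm**2 / (f_number * coc_mm) + f_mm                          # hyperfocal distance
    near = subject_dist_mm * (H - f_mm) / (H + subject_dist_mm - 2 * f_mm)
    if subject_dist_mm >= H:
        return near, float("inf")
    far = subject_dist_mm * (H - f_mm) / (H - subject_dist_mm)
    return near, far

# Illustrative values only: 50 mm focal length and 0.01 mm circle of confusion are assumptions,
# 796.9 mm is the working distance from Table 1.
for N in (4, 8, 16):
    near, far = depth_of_field(f_mm=50.0, f_number=N, subject_dist_mm=796.9, coc_mm=0.01)
    print(f"f/{N}: depth of field ~ {far - near:.1f} mm")
```

The printed values grow with the f-number, which is the behavior the paragraph above relies on: stopping the aperture down extends the usable depth of field at the cost of light intake.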

2.3. Experimental Transmission Motion Platform

To meet the experimental requirements of the line-scanning stereo vision system, the target object must move relative to the camera, and under normal conditions its motion speed must maintain a stable proportional relationship with the camera acquisition frequency. In this paper, the motion experiment platform shown in Figure 5 is used to carry out the research on the line-scanning stereo vision system; its composition is listed in Table 3. The scanning system adopts the 3DPIXA binocular color line-scanning camera made by Chromasens (Germany), model 3DPIXA-Dual 70 µm. The supporting image acquisition card is a microEnable series card from the German company Silicon Software, model microEnable IV AD4-CL. The linear LED light source is a Corona II. The XLC4 light source controller adjusts the brightness of the light source by controlling the current, with an adjustable range of 200–1800 mA. The power converter device is shown in Figure 6. The motion platform uses an incremental encoder, model Sendix Base KIS40. The overall acquisition and control system runs on a desktop PC with an Intel(R) Core(TM) i7-7700 CPU.
The software mainly includes image acquisition and control software, light source adjustment and control software, camera parameter configuration software, and CS-3D-Viewer V3.2.0 development software based on 3DPIXA, as shown in Table 4.

3. Principles of Triangulation and Stereo Matching

3.1. Triangulation Principle of Binocular Line-Scanning Camera

Triangulation is an effective measurement method in the field of machine vision. Improved triangulation methods achieve better point and stereo triangulation results [27,28], which are then used for on-machine measurement of complex surfaces [29]. The binocular line-scanning camera in this paper adopts the principle of coplanar intersection measurement. When the field-of-view planes of the two line-scanning cameras coincide exactly, the setup can, after rectification, be simplified to the coplanar geometric model shown in Figure 7. Here, the optical centers of the two rectified cameras are $O_R$ and $O_L$, respectively, which are rotated behind the imaging plane to simplify the description of the triangulation principle. $P$ is the point to be measured in space, and $P_L$ and $P_R$ are the imaging points of the spatial point $P$ on the left and right camera planes. $B$ is the baseline distance between the optical centers of the two cameras, and the focal length of both cameras is $f$. $Z_P$ is the depth distance from the spatial point to the camera coordinate system. Let $P_{RL} = B - (y_R - y_L)$; then, according to the principle of similar triangles,
$$\frac{P_{RL}}{B} = \frac{Z_P - f}{Z_P}$$
Let $d = y_R - y_L$; then, the depth distance from the spatial point $P$ to the camera baseline $B$ is
$$Z_P = \frac{f B}{d}$$
where $d$ is the positional deviation between the pixels at which the same scene point is imaged by the two cameras, i.e., the disparity in binocular matching. A minimal numerical sketch of this rectified depth computation is given below.
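The sketch applies Equation (2) directly; the focal length (expressed in pixels) and the baseline value in the example call are placeholders, not the calibrated parameters of the system described here.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_len_px, baseline_mm):
    """Rectified coplanar case, Equation (2): Z = f * B / d.
    Invalid (non-positive) disparities are returned as NaN."""
    disparity = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > 0
    depth[valid] = focal_len_px * baseline_mm / disparity[valid]
    return depth  # same length unit as baseline_mm

# Placeholder values: a focal length of 8000 px and a 100 mm baseline are assumptions.
print(depth_from_disparity([25.0, 40.0], focal_len_px=8000.0, baseline_mm=100.0))
```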
However, in general the intrinsic parameters of the two cameras are not identical; in particular, when the camera models differ or the focal lengths are adjusted differently, the simplified imaging model above no longer applies. The more realistic simplified model is shown in Figure 8. Here, the optical centers of the two cameras are $O_R$ and $O_L$, respectively. $P$ is the position of a point in space, and $P_L$ and $P_R$ are its corresponding imaging points. $B$ is the baseline distance between the two cameras, the focal lengths of the two cameras are $f_1$ and $f_2$, respectively, and $D_{P1}$ is the depth distance from the spatial point to the imaging plane of the right camera coordinate system. In addition, $\alpha$ is the angle between the optical axes of the two cameras, and $\beta$ is the angle between the optical axis of the left camera and the baseline. $\theta_1$ is the angle between the projection line of the spatial point in the right camera and the optical axis of the right camera, and $\theta_2$ is the angle between the projection line of the spatial point in the left camera and the optical axis of the left camera. $l_{x_r}$ is the distance between the projection point and the principal point of the right camera, and $l_{x_l}$ is the distance between the projection point and the principal point of the left camera. According to the geometric properties of the triangle and the sine formula for the triangle area,
$$S_{\triangle P O_R O_L} = \frac{1}{2}\, l_{P O_R}\, l_{P O_L} \sin(\phi_1) = \frac{1}{2}\, B\, l_{P O_L} \sin(\phi_2)$$
where $\phi_1$ is the angle at the spatial point $P$ between its projection lines to the left and right cameras, and $\phi_2$ is the angle between the projection line of the spatial point in the left camera and the baseline. $l_{P O_R}$ and $l_{P O_L}$ are the distances from the spatial point $P$ to the optical centers of the right and left cameras, respectively. The above equation can be simplified as follows:
$$l_{P O_R} = \frac{B \sin(\phi_2)}{\sin(\phi_1)}$$
It can be obtained that the distance from the spatial point to the imaging plane of the right camera is
$$D_{P1} = \left( l_{P O_R} - \frac{f_1}{\sin(\theta_1)} \right) \cos(\theta_1)$$
In summary, it can be obtained that
$$D_{P1} = \left( \frac{B \sin(\beta + \theta_2)}{\sin(\alpha + \theta_1 - \theta_2)} - \frac{f_1}{\sin(\theta_1)} \right) \cos(\theta_1)$$
where
$$\begin{cases} \theta_1 = \arctan\left(\dfrac{l_{x_r}}{f_1}\right) \\[6pt] \theta_2 = \arctan\left(\dfrac{l_{x_l}}{f_2}\right) \end{cases}$$
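For the non-identical-camera case, the sketch below evaluates Equation (7) for the projection angles and Equation (4) for the distance $l_{PO_R}$, and then reports the depth of $P$ along the right camera's optical axis; the paper's $D_{P1}$ in Equations (5) and (6) additionally accounts for the image-plane offset. The numeric values and the angle sign conventions are illustrative assumptions.

```python
import numpy as np

def triangulate_unrectified(l_xr, l_xl, f1, f2, alpha, beta, B):
    """Sketch of the non-identical-camera case (Equations (4) and (7)).
    l_xr, l_xl: image offsets from the principal points (same unit as f1, f2);
    alpha: angle between the two optical axes; beta: angle between the left
    optical axis and the baseline; B: baseline length. Angles in radians;
    the sign conventions used here are assumptions."""
    theta1 = np.arctan(l_xr / f1)            # Equation (7), right camera
    theta2 = np.arctan(l_xl / f2)            # Equation (7), left camera
    phi2 = beta + theta2                     # left projection ray vs. baseline
    phi1 = alpha + theta1 - theta2           # angle at the spatial point P
    l_POR = B * np.sin(phi2) / np.sin(phi1)  # Equation (4), law of sines
    depth_along_axis = l_POR * np.cos(theta1)
    return l_POR, depth_along_axis

# Illustrative values only.
l_POR, z = triangulate_unrectified(l_xr=0.8, l_xl=-0.5, f1=35.0, f2=36.0,
                                   alpha=np.deg2rad(10), beta=np.deg2rad(85), B=100.0)
print(f"|PO_R| = {l_POR:.1f}, depth along right optical axis = {z:.1f}")
```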

3.2. Binocular Vision Stereo Matching

Stereo matching is the key part of stereo vision reconstruction, which restores the spatial information of the 3D world from multiple images [30,31,32]. The purpose is to find the same point in two or more views and then obtain the disparity used for depth estimation. As shown in Figure 9, after epipolar rectification, the reference point in the left view is $P_{reference}$, and the optimal target point $P_{target}$ is found by searching for the corresponding (homonymous) point along the epipolar line of the right view within the disparity range $D_{max}$.
The matching algorithm used for stereo reconstruction in this paper is based on the semi-global matching algorithm SGBM (Semi-Global Block Matching) [33]. An initial disparity map is constructed by computing and selecting the disparity for each pixel, and a global energy function is established. By minimizing this energy function, the disparity assigned to each pixel is optimized. The energy function is as follows:
$$E(D) = \sum_{p}\left( \mathrm{Cost}(p, D_p) + \sum_{q \in N_p} \lambda_1 I_1\big[\,|D_p - D_q| = 1\,\big] + \sum_{q \in N_p} \lambda_2 I_2\big[\,|D_p - D_q| > 1\,\big] \right)$$
where $\mathrm{Cost}(p, D_p)$ is the matching cost, and $\lambda_1 I_1(\cdot)$ and $\lambda_2 I_2(\cdot)$ are the neighborhood disparity penalty terms. $I_1(\cdot)$ and $I_2(\cdot)$ are indicator functions (returning 1 if the condition in brackets is true and 0 otherwise). $D$ is the current overall disparity map, $D_p$ is the disparity at point $p$, and $D_q$ is the disparity at point $q$. $N_p$ is the neighborhood of point $p$, $\lambda_1$ is the coefficient of the penalty term $I_1(\cdot)$ for a neighborhood disparity difference of 1, and $\lambda_2$ is the coefficient of the penalty term $I_2(\cdot)$ for a neighborhood disparity difference greater than 1.
A sliding window is usually used when computing the initial disparity cost. The smaller the matching window, the noisier the disparity map; conversely, the larger the window, the smoother the disparity map. However, a window that is too large easily leads to over-smoothing, increases the probability of mismatching, and produces more void regions with no values in the disparity map. The smoothness of the final disparity map is controlled by the penalty coefficients and penalty terms: the larger $\lambda_2$ is, the smoother the disparity map.
In summary, according to the principles of triangulation and epipolar rectification, and combined with the SGBM stereo matching algorithm, the disparity map between the left and right cameras is obtained. The intrinsic and extrinsic parameters of each camera, and between the cameras of the binocular line-scanning system, are obtained by calibration, and the final depth map is computed from the baseline distance and the disparity. In particular, the left-right consistency check tolerance of the SGBM algorithm is 10, and the matching disparity range is [−165, +165]. Under these conditions, the binocular line-scanning system provides high-precision depth information with a resolution of 14 µm over a range of 52 mm.
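As a concrete illustration of this pipeline, the sketch below runs OpenCV's StereoSGBM implementation of semi-global block matching on a pair of already rectified images and converts the disparity to depth with Equation (2). The file names, block size, focal length, and baseline are placeholders rather than the tuned parameters of the system in this paper; only the disparity range and the left-right consistency tolerance follow the values quoted above.

```python
import cv2
import numpy as np

# Load already rectified grayscale views (file names are placeholders).
left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; P1/P2 play the role of the penalty terms
# lambda_1 and lambda_2 in Equation (8). Values are illustrative.
block_size = 5
sgbm = cv2.StereoSGBM_create(
    minDisparity=-165,
    numDisparities=336,               # multiple of 16, covers roughly [-165, +165]
    blockSize=block_size,
    P1=8 * block_size * block_size,   # penalty for |D_p - D_q| = 1
    P2=32 * block_size * block_size,  # larger penalty for |D_p - D_q| > 1
    disp12MaxDiff=10,                 # left-right consistency tolerance
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)

disparity = sgbm.compute(left, right).astype(np.float32) / 16.0  # fixed-point output, 4 fractional bits

# Depth from disparity (Equation (2)); focal length in pixels and baseline are placeholders.
focal_px, baseline_mm = 8000.0, 100.0
valid = disparity > 0.0               # keep one disparity sign; the convention depends on the rig
disp_safe = np.where(valid, disparity, np.nan)
depth_mm = focal_px * baseline_mm / disp_safe
```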

4. Motion Distortion Correction

4.1. Motion Distortion

The main causes of motion distortion are the installation error between the camera and the transport device and vibration during transport, as shown in Figure 10. First, when there is an angle between the camera's view plane and the motion direction, the scanning imaging characteristics of the line-scanning camera cause the image not only to be stretched or compressed in the direction perpendicular to the field-of-view plane but also to be displaced along the view plane. Second, although the real-time reconstructed contour of the object under test is robust to vibration, during motion stitching the motion relationship between the object and the camera is disturbed by vibration, resulting in errors along the motion direction in the final stitched overall contour. To reduce the influence of these factors, a checkerboard target with coordinate information is used to estimate the actual motion and correct the final image information. As shown in Figure 11, the $XZ$ plane is the scanning plane of the camera's field of view, the point $p(x, y, z)$ in space is a point on the target object, and $n = (n_1, n_2, n_3)$ is the actual direction of motion transmission. $p_t(x_t, y_t, z_t)$ is the vertical projection of point $p(x, y, z)$ onto the scanning plane, and $p'(x', y', z')$ is the real projection of point $p(x, y, z)$ onto the camera image along the motion direction, i.e., the position at which the corresponding spatial point appears in the camera imaging result.
When the relationship between the camera acquisition frequency and the true motion velocity remains unchanged, the distance $L_t$ from point $p$ to point $p_t$ is equal to the distance from point $p_t$ to point $p'$. Therefore, the following relationship can be obtained:
$$\begin{cases} x' = x - \dfrac{y}{n_2}\, n_1 \\[6pt] y' = \dfrac{y}{n_2} \\[6pt] z' = z - \dfrac{y}{n_2}\, n_3 \end{cases}$$
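A small sketch of Equation (9) and its inverse is given below: the forward function maps a true camera-frame point to where it appears in the scanned image for a given transport direction $n$, and the inverse recovers the true point once $n$ is known. The numerical values are illustrative only.

```python
import numpy as np

def distort_point(p_true, n):
    """Forward model of Equation (9): where a true camera-frame point p = (x, y, z)
    appears in the scanned image when the transport direction is n = (n1, n2, n3)."""
    x, y, z = p_true
    n1, n2, n3 = n
    t = y / n2                                      # scan steps until p reaches the view plane
    return np.array([x - t * n1, t, z - t * n3])    # imaged position (x', y', z')

def correct_point(p_img, n):
    """Inverse of Equation (9): recover the true point from its imaged position."""
    xp, yp, zp = p_img
    n1, n2, n3 = n
    return np.array([xp + yp * n1, yp * n2, zp + yp * n3])

# Ideal transport is n = (0, 1, 0); a small installation error tilts it.
n_real = np.array([0.03, 0.998, 0.04])
n_real /= np.linalg.norm(n_real)
p = np.array([12.0, 30.0, 5.0])                     # illustrative true point (mm)
p_img = distort_point(p, n_real)
print(p_img, correct_point(p_img, n_real))          # the correction recovers p exactly
```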
The relationship between the camera and the transmission direction $n = (n_1, n_2, n_3)$ can be obtained using the known relative coordinates of the spatial points on the calibration board. In the camera (image) coordinate system $O_{Cam}(X, Y, Z)$ shown in the figure above, the position of the spatial point is $p(x, y, z)$. In the calibration board coordinate system $O_{Cab}(X, Y, Z)$, the coordinate of the corresponding spatial point is $p_c(x_c, y_c, z_c)$. The relationship between the calibration board coordinate system and the image scanning coordinate system can be expressed as follows:
$$p(x, y, z) = R\, p_c(x_c, y_c, z_c) + T$$
where $R$ and $T$ are the rotation and translation between the two coordinate systems, respectively. The above equation is expressed in coordinate form as
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \\ t_3 \end{bmatrix}$$
To sum up, it can be concluded that
$$\begin{cases} n_2\, y' = r_{21} x_c + r_{22} y_c + r_{23} z_c + t_2 \\ r_{21}^2 + r_{22}^2 + r_{23}^2 = 1 \end{cases}$$
Solving the above equations yields $n_2$, $[r_{21}, r_{22}, r_{23}]$, and $t_2$. Similarly, the other coordinate correspondences between $p_c(x_c, y_c, z_c)$ and $p'(x', y', z')$ can be calculated from Equations (13) and (14).
$$\begin{cases} x' = r_{11} x_c + r_{12} y_c + r_{13} z_c + t_1 - y' n_1 \\ r_{11}^2 + r_{12}^2 + r_{13}^2 = 1 \end{cases}$$
and
$$\begin{cases} z' = r_{31} x_c + r_{32} y_c + r_{33} z_c + t_3 - y' n_3 \\ r_{31}^2 + r_{32}^2 + r_{33}^2 = 1 \end{cases}$$
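For intuition, Equation (12) can also be read as a homogeneous linear system: each detected corner contributes one equation in $(r_{21}, r_{22}, r_{23}, t_2, n_2)$, and the row-norm constraint fixes the scale. The sketch below solves that system directly by SVD on synthetic data; it is an illustrative alternative reading, not the cubature Kalman filter solution actually used in Section 4.2.

```python
import numpy as np

def solve_row2(board_pts, y_img):
    """Illustrative least-squares solution of Equation (12).
    board_pts: (N, 3) corner coordinates (x_c, y_c, z_c) on the calibration target;
    y_img: (N,) scan-direction image coordinates y' of the same corners.
    Returns (r_21, r_22, r_23), t_2 and n_2 under the unit-norm constraint."""
    A = np.column_stack([board_pts, np.ones(len(y_img)), -np.asarray(y_img)])
    _, _, vh = np.linalg.svd(A)          # null-space direction of the stacked equations
    v = vh[-1]
    v /= np.linalg.norm(v[:3])           # enforce r_21^2 + r_22^2 + r_23^2 = 1
    if v[4] < 0:                         # fix the sign so that n_2 > 0
        v = -v
    return v[:3], v[3], v[4]

# Synthetic check with known parameters (illustrative only).
rng = np.random.default_rng(0)
board = np.column_stack([rng.uniform(0, 360, 50), rng.uniform(0, 270, 50), np.zeros(50)])
r2_true = np.array([0.02, 0.999, 0.03]); r2_true /= np.linalg.norm(r2_true)
t2_true, n2_true = 15.0, 0.98
y_img = (board @ r2_true + t2_true) / n2_true
print(solve_row2(board, y_img))          # recovers r2_true, t2_true, n2_true
```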

4.2. Cubature Kalman Filter for Solving the Relevant Parameters

The above equations express the relationship between the image information acquired by the line-scanning camera imaging system and the real motion. They can be integrated as follows:
$$\begin{bmatrix} x' \\ 0 \\ z' \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \\ t_3 \end{bmatrix} - y' \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}, \qquad \text{s.t. } R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}, \; n = \begin{bmatrix} n_1 \\ n_2 \\ n_3 \end{bmatrix}, \; R R^{T} = I, \; n^{T} n = 1$$
In the process of solving, in order to reduce the interference of vibration and other additional factors on image acquisition, this paper uses the cubature Kalman filter to solve the relevant parameters. The specific solution steps are as follows.
The nonlinear system equation and observation equation of motion parameters can be expressed as
$$\begin{aligned} x_{k+1}^{c} &= F(x_k^{c}, \nu_k) = x_k^{c} + \nu(k) \\ y_k &= H(x_{k+1}^{c}, u_k^{1}, u_k^{2}, e_k) = R(x_k^{\theta})\, u_k^{1} + x_k^{t} + u_k^{2}\, x_k^{n} + e(k) \end{aligned}$$
where $u^1 = (x_c, y_c, z_c)$, $u^2 = y'$, and $\theta$ denotes the Euler angles corresponding to the rotation matrix. The vector of parameters to be estimated is
$$x^{c} = [\theta_\alpha, \theta_\beta, \theta_\gamma, t_1, t_2, t_3, n_1, n_2, n_3]^{T} = [x^{\theta}, x^{t}, x^{n}]$$
The flow chart of motion distortion correction based on the cubature Kalman filter is shown in Figure 12 and Figure 13, and its operation steps are shown in Algorithm 1.
Algorithm 1: Motion distortion correction based on cubature Kalman filter
Input: Original calibration target scan image, corresponding depth map, camera intrinsic parameters of the binocular line-scanning system, parameter information of the real calibration target, initialization parameters $x_0^c$, iteration error threshold $Threshold$, iteration upper limit $Number$.
Step I: Image information acquisition phase:
1. Corner detection and image coordinate extraction for the original image of the calibration target.
2. Calculate the 3D coordinates of the detected corner points in the camera coordinate system based on the binocular camera intrinsic parameters, the corresponding height map, and the initial motion transmission correspondence.
3. Based on the information of the real calibration target corner points, place them in the $XY$ plane of the calibration target coordinate system with a point spacing of 30 mm and a $Z$ coordinate of 0.
Step II: Motion parameter solution stage:
4. Initialize the parameters $x_0^c$, in particular $x_0^n = [n_1, n_2, n_3] = [0, 1, 0]$.
5. Match the correspondence between the 3D coordinates of the image corner points and the real coordinates of the spatial corner points of the calibration target according to the structure of the checkerboard pattern.
6. Calculate the rotation and translation between the above corresponding points using the cubature Kalman filter (Step I) to obtain the initial value of $[x_k^\theta, x_k^t] = [\theta_\alpha, \theta_\beta, \theta_\gamma, t_1, t_2, t_3]$ in $x_k^c$.
7. Re-estimate the root mean square error $Error_k$ between the 3D coordinates of the image corner points and the coordinates of the calibration target corner points after rotating and translating them according to $[x_k^\theta, x_k^t]$.
8. If $Error_k > Threshold$ and the number of iterations $k \le Number$, use the cubature Kalman filter (Step II) to calculate $x_k^n = [n_1, n_2, n_3]$ to reduce the error. Based on the updated $x_k^n = [n_1, n_2, n_3]$, compensate and update the 3D coordinates of the image corners, and go to step 5.
9. If $Error_k \le Threshold$ or $k > Number$, output the final $[x_k^\theta, x_k^t]$ and $x_{k-1}^n = [n_1, n_2, n_3]$.
Output: The final $[x_k^\theta, x_k^t]$ and $x_{k-1}^n = [n_1, n_2, n_3]$.
The process of solving the motion state based on the cubature Kalman filter is as follows:
1. Initialize the state vector and its corresponding covariance matrix, $x^c = x_0^c$, $P = P_0^c$, and initialize the variance of the process model, $Q_k^c = Q_0^c$.
2. Estimate the predicted state and the predicted state covariance as
$$x_{k+1|k}^{c} = x_{k|k}^{c}, \qquad P_{k+1|k} = P_{k|k} + Q_k^{c}$$
3. Then estimate the predicted observation of the corresponding points as
$$y_{k+1|k} = R(x_k^{\theta})\, u_k^{1} + x_k^{t} + u_k^{2}\, x_k^{n}$$
4. Finally, estimate the updated state and the corresponding covariance matrix:
$$x_{k+1|k+1}^{c} = x_{k+1|k}^{c} + K_{k+1}\,(y_{k+1} - y_{k+1|k}), \qquad P_{k+1|k+1} = P_{k+1|k} - K_{k+1} P_{yy,k+1|k} K_{k+1}^{T}$$
where
$$K_{k+1} = P_{x^c y, k+1|k}\, P_{yy, k+1|k}^{-1}$$
$$P_{x^c y, k+1|k} = \mathrm{E}\left[ (x_{k+1}^{c} - x_{k+1|k}^{c})(y_{k+1} - y_{k+1|k})^{T} \right]$$
$$P_{yy, k+1|k} = \mathrm{E}\left[ (y_{k+1} - y_{k+1|k})(y_{k+1} - y_{k+1|k})^{T} \right]$$
where $K_{k+1}$ is the Kalman gain, and $P_{x^c y, k+1|k}$ and $P_{yy, k+1|k}$ are the cross covariance and innovation covariance matrices. $y_{k+1|k}$, $P_{x^c y, k+1|k}$, and $P_{yy, k+1|k}$ are obtained by evaluating the cubature points.
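A compact sketch of one predict/update cycle of the cubature Kalman filter for this random-walk state model is given below. The measurement function h is passed in as a callable (in this application it would evaluate Equation (16) for the current corner correspondences); the implementation follows the standard third-degree spherical-radial cubature rule and is a generic illustration rather than the authors' exact code.

```python
import numpy as np

def ckf_step(x, P, y, h, Q, R):
    """One cubature Kalman filter predict/update for the random-walk model
    x_{k+1} = x_k + v_k (Equations (18)-(23)).
    x, P: current state estimate and covariance; y: measurement vector;
    h: nonlinear measurement function; Q, R: process and measurement noise covariances."""
    n = x.size
    P = P + Q                                            # predict (Equation (18)); state unchanged
    S = np.linalg.cholesky(P)
    offsets = np.sqrt(n) * np.hstack([S, -S])            # 2n cubature points: x +/- sqrt(n)*S columns
    X = x[:, None] + offsets
    Y = np.column_stack([h(X[:, i]) for i in range(2 * n)])
    y_pred = Y.mean(axis=1)                              # predicted observation (Equation (19))
    dY = Y - y_pred[:, None]
    dX = X - x[:, None]
    Pyy = dY @ dY.T / (2 * n) + R                        # innovation covariance (Equation (23))
    Pxy = dX @ dY.T / (2 * n)                            # cross covariance (Equation (22))
    K = Pxy @ np.linalg.inv(Pyy)                         # Kalman gain (Equation (21))
    x_new = x + K @ (y - y_pred)                         # state update (Equation (20))
    P_new = P - K @ Pyy @ K.T
    return x_new, P_new
```

In the two-step scheme of Algorithm 1, such an update would be applied first with the pose components $[x^\theta, x^t]$ as the state (Step I) and then with the motion direction $x^n$ (Step II).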

5. Experimental Results and Analysis

The checkerboard target used in the experiment is model GP400-12-9, with external dimensions of 400 mm × 300 mm, a square edge length of 30 mm, a pattern array of 12 × 9, a pattern size of 360 mm × 270 mm, an accuracy of ±0.01 mm, and float glass as the substrate material. The camera uses a rising-edge external trigger; the cooperative acquisition ratio between the image acquisition frame rate and the rotary encoder pulse frequency is 1:14, and the image gain is 3. The external dual light source is set to 800 mA. The angular installation error between the actual camera view plane and the transmission direction is ≤3°.
Figure 14 shows the root mean square error (RMSE) between the 3D coordinates of the image corner points, compensated according to the current motion parameters, and the coordinates of the real calibration target corner points at each iteration. The blue curve represents the variation of the RMSE with the number of iterations. The figure shows that as the number of iterations increases, the results stabilize: the corner reprojection RMSE is reduced from 2.059 mm to 0.8794 mm, and the accuracy of the real coordinate information contained in the image is improved by 57.3%. Convergence is reached after five iterations. The remaining error may be affected by vibration and noise during motion and by the accuracy of the depth map obtained from binocular reconstruction. The variation of each component of $x_k^n = [n_1, n_2, n_3]$ during the iterations is shown in Figure 15; the blue curves represent the variations of the offset components in the x, y, and z directions with the number of iterations.
In addition to the offset of the image information caused by the installation error, image stretching or compression caused by the mismatch between the image acquisition frame rate and the rotary encoder frame rate is another important source of deviation. Figure 16 shows that, in this experiment, there is a stretching distortion between the image acquisition frame rate and the rotary encoder frame rate with a ratio of approximately 1.0246:1; the blue curve represents the variation of the stretching distortion coefficient with the number of iterations. This distortion appears in the normalization of the motion vector $n$: when $n$ is normalized, the normalization coefficient is the stretching distortion coefficient. Figure 17 shows the calibration target image before and after correction, together with the 3D positions of the corresponding image corner points in the camera coordinate system. Figure 18 compares the real corner coordinates with the corner coordinates of the image before and after correction. The error shows that the corrected image reflects the spatial position relationships of the target more realistically and reduces the influence of the system installation error and the cooperative acquisition error. Figure 19 shows a before-and-after comparison of the corrected image of an advertising brochure, where the red and green dots indicate corresponding points in the original image and the corrected image, respectively.
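As a final illustration, once the stretching coefficient along the scan direction has been estimated (about 1.0246 here), a uniform stretch of this kind can be undone by resampling the image along that axis. The snippet below shows the idea with OpenCV, using placeholder file names; it is not the paper's full correction pipeline, which also compensates the lateral offsets through $n$.

```python
import cv2

stretch = 1.0246                                  # estimated stretch along the scan (row) direction
img = cv2.imread("scanned_target.png")            # placeholder file name
h, w = img.shape[:2]
# Resample the scan direction by the inverse of the stretch coefficient.
corrected = cv2.resize(img, (w, int(round(h / stretch))), interpolation=cv2.INTER_LINEAR)
cv2.imwrite("scanned_target_corrected.png", corrected)
```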

6. Conclusions

In this paper, a binocular color line-scanning stereo vision system is designed that uses the triangulation principle and stereo matching to capture high-precision, high-resolution 2D images and 3D contour information of the heavy rail surface. A model is established to address the motion distortion of the captured images caused by camera installation errors and collaborative acquisition mismatch. The parameters of the nonlinear model are solved iteratively using a checkerboard target and the two-step cubature Kalman filter algorithm. Experiments show that the RMSE decreases from 2.059 mm to 0.8794 mm, and the accuracy of the corrected coordinate information in the image is improved by 57.3%.

Author Contributions

Conceptualization, C.W. and K.S.; methodology, C.W. and K.S.; software, M.N.; validation, C.W. and M.N.; investigation, W.L. and M.N.; data curation, W.L. and M.N.; writing—original draft preparation, W.L. and M.N.; writing—review and editing, C.W. and J.L.; project administration, K.S.; funding acquisition, K.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 51805078.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article are not publicly available at this time but will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, J.Z.; Li, Q.Y.; Gan, J.R.; Yu, H.M.; Yang, X. Surface defect detection via entity sparsity pursuit with intrinsic priors. IEEE Trans. Ind. Inform. 2020, 16, 141–150.
  2. Yu, H.M.; Li, Q.Y.; Tan, Y.Q.; Gan, J.R.; Wang, J.Z. A coarse-to-fine model for rail surface defect detection. IEEE Trans. Instrum. Meas. 2019, 68, 656–666.
  3. Piironen, T.; Silven, O.; Pietikäinen, M.; Laitinen, T.; Strömmer, E. Automated visual inspection of rolled metal surfaces. Mach. Vis. Appl. 1990, 3, 247–254.
  4. Song, K.C.; Yan, Y.H. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surf. Sci. 2013, 285, 858–864.
  5. Yang, Y.L.; Miao, C.Y.; Li, X.G.; Mei, X.Z. On-line conveyor belts inspection based on machine vision. Optik 2014, 125, 5803–5807.
  6. Wang, Z.Z.; Wang, S.M. Research of method for detection of rail fastener defects based on machine vision. In Proceedings of the 4th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering, Xi'an, China, 12–13 December 2015.
  7. Dong, H.W.; Song, K.C.; He, Y.; Xu, J.; Yan, Y.H.; Meng, Q.G. PGA-Net: Pyramid feature fusion and global context attention network for automated surface defect detection. IEEE Trans. Ind. Inform. 2019, 16, 7448–7458.
  8. Cheng, Y.; Deng, H.G.; Feng, Y.X. Effects of faster region-based convolutional neural network on the detection efficiency of rail defects under machine vision. In Proceedings of the IEEE 5th Information Technology and Mechatronics Engineering Conference, Chongqing, China, 12–14 June 2020.
  9. Song, G.R.; Song, K.C.; Yan, Y.H. Saliency detection for strip steel surface defects using multiple constraints and improved texture features. Opt. Lasers Eng. 2020, 128, 106000.
  10. Zhang, D.F.; Song, K.C.; Xu, J.; Yu, H. Unified detection method of aluminium profile surface defects: Common and rare defect categories. Opt. Lasers Eng. 2020, 126, 105936.
  11. Zhou, Q.X. A detection system for rail defects based on machine vision. J. Phys. Conf. Ser. 2021, 1748, 022012.
  12. Guo, F.; Qian, Y.; Rizos, D.; Suo, Z.; Chen, X.B. Automatic rail surface defects inspection based on Mask R-CNN. Transp. Res. Rec. 2021, 2675, 655–668.
  13. Ma, S.; Song, K.C.; Niu, M.H.; Tian, H.K.; Wang, Y.Y.; Yan, Y.H. Shape consistent one-shot unsupervised domain adaptation for rail surface defect segmentation. IEEE Trans. Ind. Inform. 2023, 19, 9667–9679.
  14. Sun, Q.; Xu, K.; Liu, H.J.; Wang, J.E. Unsupervised surface defect detection of aluminum sheets with combined bright-field and dark-field illumination. Opt. Lasers Eng. 2023, 168, 107674.
  15. Xie, Q.W.; Liu, R.R.; Sun, Z.; Pei, S.S.; Feng, C. A flexible free-space detection system based on stereo vision. Neurocomputing 2022, 485, 252–262.
  16. Xiao, Y.L.; Wen, Y.F.; Li, S.K.; Zhong, Q.C.; Zhong, J.X. Large-scale structured light 3D shape measurement with reverse photography. Opt. Lasers Eng. 2020, 130, 106086.
  17. Sun, B.; Zhu, J.G.; Yang, L.H.; Yang, S.R.; Guo, Y. Sensor for in-motion continuous 3D shape measurement based on dual line-scan cameras. Sensors 2016, 16, 1949.
  18. Ma, Y.P.; Li, Q.W.; Chu, L.L.; Zhou, Y.Q.; Xu, C. Real-time detection and spatial localization of insulators for UAV inspection based on binocular stereo vision. Remote Sens. 2021, 13, 230.
  19. Denkena, B.; Huke, P. Development of a high resolution pattern projection system using linescan cameras. Proc. SPIE 2009, 7389, 73890F.
  20. Lilienblum, E. A structured light approach for 3-D surface reconstruction with a stereo line-scan system. IEEE Trans. Instrum. Meas. 2015, 64, 1258–1266.
  21. Niu, M.H.; Song, K.C.; Huang, L.M. Unsupervised saliency detection of rail surface defects using stereoscopic images. IEEE Trans. Ind. Inform. 2021, 17, 2271–2281.
  22. Wu, C.Y.; Yang, L.; Luo, Z.; Jiang, W.S. Linear laser scanning measurement method tracking by a binocular vision. Sensors 2022, 22, 3572.
  23. Yang, S.B.; Gao, Y.; Liu, Z.; Zhang, G.J. A calibration method for binocular stereo vision sensor with short-baseline based on 3D flexible control field. Opt. Lasers Eng. 2020, 124, 105817.
  24. Kim, H.; Lee, S. Simultaneous line matching and epipolar geometry estimation based on the intersection context of coplanar line pairs. Pattern Recognit. Lett. 2012, 33, 1349–1363.
  25. Guo, N.; Li, L.; Yan, F.; Li, T.T. Binocular stereo vision calibration based on constrained sparse beam adjustment algorithm. Optik 2020, 208, 163917.
  26. Jia, Z.Y.; Yang, J.H.; Liu, W.; Wang, F.J.; Liu, Y.; Wang, L.L.; Fan, C.N.; Zhao, K. Improved camera calibration method based on perpendicularity compensation for binocular stereo vision measurement system. Opt. Express 2015, 23, 15205–15223.
  27. Park, J.H.; Park, H.W. Fast view interpolation of stereo images using image gradient and disparity triangulation. Signal Process. Image Commun. 2003, 18, 401–416.
  28. Otero, J.; Sanchez, L. Local iterative DLT soft-computing vs. interval-valued stereo calibration and triangulation with uncertainty bounding in 3D reconstruction. Neurocomputing 2015, 167, 44–51.
  29. Ding, D.W.; Ding, W.F.; Huang, R.; Fu, Y.C.; Xu, F.Y. Research progress of laser triangulation on-machine measurement technology for complex surface: A review. Measurement 2023, 216, 113001.
  30. Hamzah, R.A.; Ibrahim, H. Literature survey on stereo vision disparity map algorithms. J. Sens. 2016, 2016, 8742920.
  31. Scharstein, D.; Szeliski, R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 2002, 47, 7–42.
  32. He, K.J.; Sui, C.Y.; Huang, T.Y.; Dai, R.; Lyu, C.Y.; Liu, Y.H. 3D surface reconstruction of transparent objects using laser scanning with LTFtF method. Opt. Lasers Eng. 2022, 148, 106774.
  33. Hirschmuller, H. Stereo processing by semiglobal matching and mutual information. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 328–341.
Figure 1. Binocular color line-scanning camera system: (a) Structure schematic diagram; (b) Picture of the actual system.
Figure 2. Various lighting layout methods and applicable camera types: (a) Bright field line scan; (b) Compact bright field surface scan; (c) Bright dark field surface scan; (d) Bright field/dark field surface scan.
Figure 3. Camera imaging internal related parameters.
Figure 4. Symmetrical linear LED light source layout.
Figure 5. Experimental platform device.
Figure 6. Corona II light source control system.
Figure 7. Coplanar intersection measurement with binocular line-scanning camera.
Figure 8. Simplified model of binocular triangulation.
Figure 9. Schematic diagram of binocular stereo matching: (a) Left camera correction image; (b) Right camera correction image.
Figure 10. Schematic diagram of motion distortion caused by installation error.
Figure 11. Correspondence between real points $p_t(x_t, y_t, z_t)$ and image mapping points $p'(x', y', z')$ caused by installation error.
Figure 12. Image information acquisition stage of the motion distortion correction process based on the cubature Kalman filter.
Figure 13. Parameter solving stage of the motion distortion correction process based on the cubature Kalman filter.
Figure 14. Root mean square error between the corner coordinates mapped from the image and the real coordinates during the iterative correction process.
Figure 15. Changes of the components of $x_k^n = [n_1, n_2, n_3]$ during the iteration.
Figure 16. Stretching distortion coefficient during the iteration.
Figure 17. Comparison of the image and corner coordinates before and after motion distortion correction: (a) Before correction; (b) After correction; (c) Coordinates of corner points before and after correction.
Figure 18. Comparison of the real corner coordinates with the corner coordinates of the image before and after correction: (a) Before correction; (b) After correction.
Figure 19. Before-and-after comparison of corrected images of advertising brochures.
Table 1. Camera parameters of 3DPIXA.
Performance | Parameter
Optical resolution | 70 μm/pixel
Field of view | 500 mm
Pixels | 7142
Pixel unit | 10 × 10 μm
Height resolution | 14 μm
Depth of field | 52 mm
Distance of working surface | 796.9 mm
Detection speed | 1480 mm/s
Table 2. Performance comparison of common light sources.
Performance | Halogen | Fluorescent | LED Light Source
Lifespan (hours) | 5000–7000 | 5000–7000 | 60,000–100,000
Brightness level | bright | brighter | high brightness (multiple LEDs)
Response speed | slow | slow | fast
Characteristic | High heat generation, almost no change in brightness and color temperature, cheap price. | Less heat generation, good diffusivity, suitable for large-area uniform irradiation, and cheap. | Less heat generation, the wavelength can be selected according to the application, the shape is convenient to make, the operating cost is low, and the power consumption is low.
Table 3. Hardware composition and function of the experimental motion platform.
Hardware | Quantity | Function
3DPIXA camera | 1 | Image acquisition
Corona linear LED light source | 2 | Provide light and increase the amount of light intake
XLC4 controller | 2 | Regulate the brightness of the light source
220 V to 24 V AD/DA converter | 2 | Power the light source
220 V to 12 V AD/DA converter | 1 | Power the camera
microEnable image capture card | 2 | Capture digitized video image information and store it
KIS40 encoder | 1 | Cooperatively control the relationship between transmission speed and camera acquisition frequency
Transmission motion platform | 1 | Realize the relative movement of the product to be detected and the camera
Table 4. Software and functions of the experimental motion platform.
Software | Features
XLC4Commander | Connect the XLC4 to the PC to control the brightness of the linear LED light source
Camera Setup Tool (CST) | Configure camera-related preset parameters
microDisplay | Real-time display of image acquisition
CS-3D-Viewer | Provide intrinsic parameter conversion of the camera and the subsequent development SDK
