A Knowledge-Driven Approach for 3D High Temporal-Spatial Measurement of an Arbitrary Contouring Error of CNC Machine Tools Using Monocular Vision

Periodic health checks of contouring errors under unloaded conditions are critical for machine performance evaluation and value-added manufacturing. To overcome the dimension, range and speed limitations of existing measurement devices, a cost-effective knowledge-driven approach for detecting the error motions of arbitrary paths using a single camera is proposed. In combination with the PNP algorithm, the three-dimensional (3D) evaluation of large-scale contouring error at relatively high feed rates can be deduced from a priori geometrical knowledge. The innovations of this paper focus on improving the accuracy, efficiency and capability of the vision measurement. Firstly, a camera calibration method considering the distortion partition of the depth-of-field (DOF) is presented to give an accurate description of the distortion behavior in the entire photography domain. Then, to maximize the utilization of the decimal values involved in the feature encoding, new highly efficient encoding markers are designed on a cooperative target to characterize the motion information of the machine. Accordingly, in the image processing, markers are automatically identified and located by the proposed decoding method based on finding the optimal start bit. Finally, with the selected imaging parameters and the precalibrated position of each marker, the 3D measurement of large-scale contouring error under relatively high dynamic conditions is realized by comparing the curve measured by the PNP algorithm with the nominal one. Both detection and verification experiments are conducted for two types of paths (i.e., planar and spatial trajectories), and the experimental results validate the measurement accuracy and advantages of the proposed method.


Introduction
Nowadays, as high-end equipment places ever-increasing demands on the accuracy of parts, the manufacturing quality of components is directly related to the performance (i.e., static and dynamic performance) of the machines utilized on the production line. To meet these requirements, measurement is inseparable from any basic or complex CNC machine tool machining process. Conventional post-inspection comes too late for the needs of value-added manufacturing, which drives us to evaluate machine performance before machining in order to determine or improve the processing capacity. Given that workpieces are dynamically machined, the contouring performance, i.e., the capacity to accurately run a given trajectory, is a foundation for this evaluation.
In this paper, the error motions of a path are calculated by processing and analyzing the position of features in an image sequence. The objective is to achieve the 3D measurement of an arbitrary large-scale contouring error under relatively high-dynamic conditions using a single camera. To improve the accuracy, efficiency, and measurement capability of the proposed method, we perform the following tasks. Firstly, combined with the line control field, a camera calibration method considering the distortion partition of the DOF is presented to improve the pixel positioning accuracy of the features (Section 2.2). Thereafter, the CNC machine is triggered manually after the workstation allocates memory for image acquisition, and the image sequence of the moving cooperative target is acquired by the camera (Section 3.1). In image processing, the features in the image sequence are then efficiently distinguished and located with the proposed decoding method based on finding the best start bit (Section 3.2). Subsequently, after the optimal PNP algorithm is selected by accuracy comparison analysis (Section 4.1), the time-varying point pairs with known positions in the object and image spaces are used to solve the spatial trajectory. Finally, the contouring error is obtained by comparing the detected path with the nominal one (Section 4.2).

Camera Calibration Method Considering the Distortion Partition of DOF
As shown in Figure 2, $O_C X_C Y_C Z_C$ is the camera coordinate system (CCS), which uses the optical axis as the Z axis. The 2D pixel coordinate system (PCS) $o$-$uv$, whose u- and v-axes are parallel to the corresponding axes of $O_C X_C Y_C Z_C$, has its origin at the upper-left element of the image. A space point $P(X_w, Y_w, Z_w)$ in the world coordinate system (WCS) is projected onto an image point $p(u_n, v_n)$ through the optical center (camera aperture) $O_C$; the camera model is given by Equation (1) [40]. In this paper, the 3D measurement capability of a single camera is guaranteed by the Perspective-n-Point (PNP) algorithm. This feature point-based pose estimation problem was proposed by Fischler [41]. As shown in Figure 2, the PNP problem aims to calculate the six pose parameters (i.e., the rotation $R_C$ and translation $T_C$ matrices) of a calibrated camera in the WCS from $n$ ($n > 3$) known 3D points $P_i$ and their 2D projections $p_i$ in the image.
However, because of the complexity of the optical system, the imperfect imaging of the lens may lead to mapping errors $(\delta_x, \delta_y)$. Thus, the first and most important step is to calibrate the parameters of the imaging model, since for close-range photogrammetry the nonlinear lens distortion, i.e., radial and tangential distortion, is the main contributor to the measurement uncertainty. Therefore, to meet the high-precision measurement requirement, the precise correction of these two manufacturing- and assembly-induced errors is mandatory. According to Brown's model [42], imaging distortion for traditional lenses (i.e., ordinary zoom or prime lenses) satisfies:
$$\begin{cases} \delta_x = u_n \cdot (k_1 + k_2 \cdot r^2 + k_3 \cdot r^4 + \cdots) + p_1 \cdot (r^2 + 2 \cdot u_n^2) + 2 p_2 \cdot u_n \cdot v_n + \cdots \\ \delta_y = v_n \cdot (k_1 + k_2 \cdot r^2 + k_3 \cdot r^4 + \cdots) + p_2 \cdot (r^2 + 2 \cdot u_n^2) + 2 p_1 \cdot u_n \cdot v_n + \cdots \end{cases} \qquad (2)$$
where $p(u_n, v_n)$ denotes the pixel coordinates without distortion; $k_1$, $k_2$ and $k_3$ are the radial distortion coefficients; and $p_1$ and $p_2$ are the first- and second-order tangential distortion coefficients. In traditional camera calibration methods, however, the distortion coefficients are calculated together with K and M by minimizing the reprojection error, so these parameters are coupled with each other. Consequently, a small reprojection error may occur even when the distortion parameters are not well estimated. Therefore, to improve the camera calibration accuracy, the distortion coefficients should be calculated separately. Furthermore, existing calibration methods mainly either use a single set of distortion coefficients to represent the distortion behavior of the whole imaging domain [40] or solve the distortion of object planes at different depths [42,43], when in fact distortion in the DOF is position-dependent. Hence, for high-accuracy distortion correction, the distortion behavior in the DOF needs to be processed more finely.
To this end, a camera calibration method considering the distortion partition of the DOF is proposed. The basic idea is to extend the equal-radius partitioned 2D distortion to the 3D imaging domain and then to calibrate the distortion separately using the plumb-line method [42]. After the optimal distortion parameters obtained in each partition are locked, the extrinsic parameters can be optimized.
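For orientation, camera models of the kind referred to as Equation (1) are commonly written in the textbook pinhole form below. This is a sketch of the standard form and not necessarily the exact expression used in the paper. The scale factor $\lambda$ and the focal lengths $f_x$, $f_y$ (in pixels) are notation assumed here, as is reading $K$ as the intrinsic matrix, $M$ as the extrinsic matrix built from $R_C$ and $T_C$ (taken as mapping world to camera coordinates), and $(u_0, v_0)$ as the principal point.

```latex
\lambda \begin{bmatrix} u_n \\ v_n \\ 1 \end{bmatrix}
= K \, M \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
M = \begin{bmatrix} R_C & T_C \end{bmatrix}
```

In this form the coupling discussed above is visible: the distortion terms act on the image coordinates produced by $K$ and $M$ jointly, so a joint minimization of the reprojection error can trade errors between the three parameter groups.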
To remove the coupling effect of multiple parameters, a line control field is designed and applied to calibrate the image distortion separately using the plumb-line method. As shown in Figure 3, the corners of the control field are used to calibrate the camera intrinsic parameters [40] and to calculate and adjust the pose of the target using the PNP algorithm, while the straight lines are used to calibrate the image distortion separately. Good pattern manufacturing quality is the premise for high-accuracy camera calibration. To this end, the patterns are made by a lithography process, which guarantees a linewidth resolution accuracy of better than 1.0 µm; in addition, the distance error between two corners is less than 1.5 µm.
In terms of partition, for a single 2D object plane perpendicular to the optical axis, the equal-radius partition method is first presented to calculate the distortion more accurately. Considering that the circularly symmetric distortion varies with the distortion radius, the minimum distortion tolerance for the central area of the image is taken as the threshold to calculate the partition radius $r_p$, and the number of subregions $n$ is solved from
$$n \cdot r_p \le r_{max} \le (n + 1) \cdot r_p,$$
where $r_{max} = \sqrt{(I_w - u_0)^2 + (I_l - v_0)^2}$ is the maximum distortion radius, and $I_w$ and $I_l$ denote the image width and image length, respectively.
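The equal-radius partition can be illustrated with a short sketch. The function names, the image size, the principal point and the partition radius below are hypothetical values chosen only for illustration; in the paper, $r_p$ follows from the distortion tolerance of the central image area, which is not modelled here.

```python
import math


def max_distortion_radius(image_width, image_length, u0, v0):
    """r_max = sqrt((I_w - u0)^2 + (I_l - v0)^2), in pixels."""
    return math.hypot(image_width - u0, image_length - v0)


def subregion_count(r_max, r_p):
    """Largest integer n with n * r_p <= r_max, hence r_max <= (n + 1) * r_p."""
    return int(r_max // r_p)


def partition_index(u, v, u0, v0, r_p):
    """1-based index g of the ring (g - 1) * r_p <= r < g * r_p that contains pixel (u, v)."""
    r = math.hypot(u - u0, v - v0)
    return int(r // r_p) + 1


# Hypothetical numbers: 2000 x 1000 pixel image, principal point at its center.
r_max = max_distortion_radius(2000, 1000, 1000, 500)
n = subregion_count(r_max, 300.0)
g = partition_index(1400, 500, 1000, 500, 300.0)
print(f"r_max = {r_max:.1f} px, n = {n}, pixel (1400, 500) lies in ring g = {g}")
```

Each pixel is then corrected with the distortion coefficients calibrated for its own ring rather than with one global coefficient set.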
According to [44], when the lens is focused at an unknown depth $s$, the radial $k_i^{s,s_n}$ and tangential $p_j^{s,s_n}$ distortion parameters at an arbitrary depth $s_n$ can be computed from the calibrated distortion parameters at two known, different depths $s_m$ and $s_k$, using the factor
$$\alpha_{s_n} = \frac{s_k - s_n}{s_k - s_m} \cdot \frac{s_m - c}{s_n - c},$$
where $s_n$, $s_m$ and $s_k$ denote the depths of three object planes perpendicular to the optical axis; $c$, $c_{s_n}$, $c_{s_m}$ and $c_{s_k}$ are the principal distances when the focus plane is at infinity, depth $s_n$, depth $s_m$ and depth $s_k$, respectively; $k_i^{s,s_m}$ and $k_i^{s,s_k}$ are the ith radial distortion parameters of the two known object planes; and $p_j^{s,s_m}$ and $p_j^{s,s_k}$ are the jth tangential distortion parameters of the two known object planes. According to Equation (1), spatial points satisfy:
$$x = f \cdot \frac{X_m}{Z_m} = f \cdot \frac{X_k}{Z_k}, \qquad y = f \cdot \frac{Y_m}{Z_m} = f \cdot \frac{Y_k}{Z_k},$$
where $(X_m, Y_m, Z_m)$ and $(X_k, Y_k, Z_k)$ are the coordinates of the two spatial points in the CCS, and $(x, y)$ denotes their 2D projection on the image plane expressed in mm. Then, for a partition radius $x^2 + y^2 = \rho^2$, we have:
$$\rho^2 = f^2 \cdot \frac{X_m^2 + Y_m^2}{Z_m^2} = f^2 \cdot \frac{X_k^2 + Y_k^2}{Z_k^2}.$$
Solving the equation and rearranging, we get $f \cdot R_m = \rho \cdot Z_m$ and $Z_m \cdot R_k = Z_k \cdot R_m$, where $Z_m$ and $Z_k$ are the depths of the mth ($\Pi_m$) and kth ($\Pi_k$) object planes in the CCS; $R_m$ and $R_k$ are the corresponding partition radii on the two planes; and $f$ is the lens focal length. Let $s_m = Z_m$ and $s_k = Z_k$; the aforementioned in-plane distortion partition is then extended to the 3D DOF. As shown in Figure 4, based on the gth partition on object plane $\Pi_m$ with the partition range $[(g - 1) \cdot R_m,\ g \cdot R_m]$, the corresponding partition ranges on object planes $\Pi_k$ and $\Pi_n$ at the object distances $s_k$ and $s_n$ are $[(g - 1) \cdot (s_k \cdot R_m / s_m),\ g \cdot (s_k \cdot R_m / s_m)]$ and $[(g - 1) \cdot (s_n \cdot R_m / s_m),\ g \cdot (s_n \cdot R_m / s_m)]$, respectively.
Consequently, the radial ${}^{g}k_i^{s,s_n}$ and decentering ${}^{g}p_j^{s,s_n}$ distortion coefficients under the gth partition at an arbitrary object distance $s_n$ can be expressed as in Equation (6). The radial (${}^{g}k_i^{s,s_m}$ and ${}^{g}k_i^{s,s_k}$) and tangential (${}^{g}p_j^{s,s_m}$ and ${}^{g}p_j^{s,s_k}$) distortion in each subregion of the two known planes can be estimated by minimizing the straightness error of the corresponding straight lines. Then, based on Equation (6), the distortion partition and the corresponding distortion coefficients at an arbitrary object distance $s_n$ can be determined, which gives an optimal description of the distortion behavior in the DOF.
To further improve the calibration accuracy, the determined distortion and the intrinsic parameters of the imaging model are locked, and the calibration target (Figure 3) with high-precision grid distance is used to optimize the extrinsic parameters by minimizing a cost function in which $E^{q}_{depth\_dependent}$ is the cost of the target at the qth pose, $R_q$ and $T_q$ are the rotation and translation matrices under the qth pose to be optimized, and ${}^{g}k_i$ and ${}^{g}p_j$ represent the radial and tangential distortion parameters of the gth distortion partition under the qth pose. The Levenberg-Marquardt (LM) algorithm is used to optimize the objective function. It should be emphasized that the constructed distortion calibration model for the entire photography domain depends on the 3D position of a space point. Based on the established camera calibration method, the distortion of an image point can be corrected by selecting the optimal position-dependent distortion model.
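The depth extension described above can be sketched numerically. The following is a minimal illustration that computes only the two quantities stated explicitly in the text, namely the factor $\alpha_{s_n}$ and the scaled partition range on a plane at a different depth. The function names and the sample depths are hypothetical, and the expression that combines the calibrated coefficients of the two known planes is not reproduced.

```python
def alpha_factor(s_n, s_m, s_k, c):
    """alpha_{s_n} = (s_k - s_n) / (s_k - s_m) * (s_m - c) / (s_n - c)."""
    return (s_k - s_n) / (s_k - s_m) * (s_m - c) / (s_n - c)


def partition_range(g, R_m, s_m, s_target):
    """Range of the g-th partition on a plane at depth s_target.

    The partition radius scales linearly with depth, R = s_target * R_m / s_m,
    which follows from Z_m * R_k = Z_k * R_m.
    """
    radius = s_target * R_m / s_m
    return ((g - 1) * radius, g * radius)


# Hypothetical depths in mm: planes at s_m = 500 and s_k = 600, principal distance c = 50.
print(alpha_factor(500.0, 500.0, 600.0, 50.0))   # equals 1 on the plane s_m itself
print(alpha_factor(600.0, 500.0, 600.0, 50.0))   # equals 0 on the plane s_k itself
print(partition_range(2, 10.0, 500.0, 600.0))    # 2nd partition mapped from s_m to s_k
```

The two limiting values of the factor show that it behaves as a depth-dependent weight between the two calibrated planes.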

High Precision 3D Positioning of Machine Tool Movement
In this paper, a 3D metrology method is proposed to detect contouring errors using a single camera. To enable measurement over the working range using a priori geometric information, while maintaining a vision measurement accuracy high enough to evaluate the dynamic performance, both high-quality image acquisition and high-accuracy image processing must be guaranteed. This section describes these two aspects in detail.

High-Quality Image Acquisition for Machine Tool Movement
In practical vision measurement, it is necessary to add features that highlight the information of the object to be measured [45]. Thus, to precisely describe the movement information of the machine, markers should be installed at the end of the workbench as an enhanced feature. However, for a machine motion capture scheme in which bar lights radiate forward light onto traditional reflective markers (Figure 5a), the unsatisfactory image quality (e.g., high reflection and image noise) and low marker manufacturing accuracy (shape error larger than 70 µm) limit high-accuracy contouring error detection. To this end, our previous work [15] proposed a cooperative target based on high-precision lithography and highly uniform backward illumination. For such a "what-you-see-what-you-get" measurement scheme (Figure 5b), the wide-range measurement capability depends entirely on a wide FOV and a large camera resolution. Obviously, this costly measurement scheme is not feasible due to the limitations of the camera hardware. To address the problem, a monocular vision-based 3D high temporal-spatial measurement method for contouring error detection is proposed (Figure 5c). To enhance the marker encoding efficiency, 1024 new fiducial markers (see Section 3.2) are designed and coded on a large-size artifact. As shown in Figure 6, the artifact (Figure 6a) is embedded in a cooperative target (Figure 6b) to describe the motion information. In addition, a flat backlight independent of ambient lighting is employed to enhance the markers to be inspected. A high signal-to-noise ratio (SNR) of 38.7 dB is obtained for the marker image acquired by the proposed method.
The photoetching technology (±0.5 µm) ensures that the shape error of a single feature is less than 1 µm. However, the large warp error of the large-size glass reduces the geometric accuracy among markers [33]. Thus, the 3D position of the markers is calibrated by a HEXAGON OPTIV reference instrument (Qingdao, China; measurement error $E_x$, $E_y$: 0.25 µm + L/900, $E_z$ = 0.5 µm) as a priori information (Figure 7) to ensure the vision measurement accuracy over a wide range.

Accurate Encoding and Identification Method for Coded Targets
As mentioned above, a large-size artifact is designed. To make the proposed wide-range measurement method feasible, a certain number of fiducial markers that are easy to distinguish from each other should be distributed over a large measurement base. To this end, considering the invariance of a radially symmetric form, a fiducial marker with a ring pattern around a center point is designed as the enhanced feature (Figure 8e). The center point represents the motion information of the worktable, while the ring pattern, which consists of encoding regions (bit sequence "1") and non-encoding regions (bit sequence "0"), serves to distinguish the fiducial markers from each other. To perform large-scale contouring error detection using a priori information, the time-varying fiducial markers need to be accurately and automatically identified. In the traditional encoding method, however, the identification number is defined as the smallest decimal value converted from the binary sequence over all cyclic shifts [33,46]. As a result, the utilization of the decimal values involved in the encoding process is less than 1/n due to the inefficient encoding rules, where n is the number of bits. Therefore, an efficient encoding and decoding method based on finding the best start bit is proposed. With this method, an auxiliary circular tag is added to each fiducial marker and points to the lowest bit of the binary code. Since the decimal value corresponding to the ith coded marker is i, the effective utilization of the decimal values is increased to 100%. The 1024 fiducial markers used in this paper can then be encoded by 10 bits, while the traditional method requires at least 15 bits.
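The difference between the two identification rules can be illustrated with a small sketch. The function names are hypothetical, and the bit strings are assumed to be written with the most significant bit first; the physical reading order on the ring is not modelled here.

```python
def cyclic_min_id(bits):
    """Traditional rule: smallest decimal value over all cyclic shifts of the bit string."""
    return min(int(bits[i:] + bits[:i], 2) for i in range(len(bits)))


def tagged_id(bits_from_start_tag):
    """Start-tag rule: the tag fixes where reading begins, so the string converts directly."""
    return int(bits_from_start_tag, 2)


# Three rotations of the same ring pattern collapse to one identifier under the
# traditional rule, but stay distinct once a start tag fixes the first bit.
patterns = ["0011", "0110", "1100"]
print([cyclic_min_id(p) for p in patterns])
print([tagged_id(p) for p in patterns])

# With a start tag, every 10-bit string is a distinct identifier.
ids = {tagged_id(format(i, "010b")) for i in range(2 ** 10)}
print(len(ids))
```

Because rotations of one pattern are indistinguishable without a reference position, the traditional rule wastes most of the available bit strings, which is the inefficiency the start tag removes.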
After encoding, the coded value is calculated by image analysis of the ring pattern defining the code. The main idea of the decoding method is to reorganize the clockwise-arranged bit sequence after finding the start bit, and then to identify the coded value by direct decimal conversion. The identification algorithm can be split into the following steps:
(1) Image preprocessing and geometric ellipse fitting
As shown in Figure 8b, image preprocessing (e.g., binarization, noise rejection) is first conducted on the acquired gray image (Figure 8a); then the morphological clustering method [47] is combined with the form factor $P_{min} \le perimeter^2 / (4\pi \times area) \le P_{max}$ to distinguish the center point and encoding regions. For a detailed description, please refer to our previous work [15]. Thereafter, the five parameters $(x_c, y_c, a, b, \theta)$ of each closed region are obtained by the geometric ellipse fitting method.
(2) Obtain the complete fiducial marker
In practical measurement, imaging parameters with a small FOV and low camera resolution are used to improve the measurable traverse speed. Besides, this paper involves contouring error detection of a real 3D path; hence, the resultant perspective effect increases the difficulty of removing incomplete fiducial markers at the image boundary, as well as of calculating the number of "1" in the encoding region using the area criterion. Therefore, an affine transformation is performed to obtain closed regions perpendicular to the optical axis. The transformation is defined by the five ellipse parameters $(x_c, y_c, a, b, \theta)$ and maps the original image coordinates $(x, y)$ to the transformed coordinates $(x', y')$. Then, based on the central positions and the radius ratio ($r_A : r_B : r_C = 6.5 : 4.5 : 2.5$), the closed regions belonging to the same coded target are obtained (Figure 8d), so that the coded targets in the image can be distinguished from each other. Thereafter, the complete coded targets are determined by $r_A \le d$, where $r_A$ and $d$ are determined by the imaging parameters and the physical dimensions of the coded target.
(3) Arrange vectors clockwise beginning with the start tag First, the center of the inner circle, denoted by B, is located by the grey centroid method [48].
For each complete fiducial marker, a set of straight line vectors → BA i are formed by joining the center point and the centroid of the surrounding encoding regions. Then, straight line vectors are arranged in clockwise → BA 2 · · · → BA 5 → → BA 1 (Figure 9). Afterwards, the start tag is found (see Step 1) and considered to be the initial position to rearrange vectors, and we get First, the center of the inner circle, denoted by B , is located by the grey centroid method [48].  (Figure 9). Afterwards, the start tag is found (see Step 1) and considered to be the initial position to rearrange vectors, and we get       (4) Read the binary sequence clockwise It is noted that the 'start tag' is not involved in the calculation of "0" and "1". The numbers of "1" with respect to each encoding region is deduced by the area criterion area area C U , where area C is the area of a encoding region; area U denotes the area of the unit encoding zone which can be described by a binary "1". Meanwhile, the calculation for the numbers of "0" in the non-encoding regions can be categorized into two cases: A) Case 1: only one encoding region around the central point As shown in Figure 10a, suppose that this encoding region consists of m "1", then the number of "0" with respect to the non-encoding region is   10 n m, and we get the binary sequence   (4) Read the binary sequence clockwise It is noted that the 'start tag' is not involved in the calculation of "0" and "1". The numbers of "1" with respect to each encoding region is deduced by the area criterion C area /U area , where C area is the area of a encoding region; U area denotes the area of the unit encoding zone which can be described by a binary "1". 
Meanwhile, the calculation for the numbers of "0" in the non-encoding regions can be categorized into two cases: A) Case 1: only one encoding region around the central point As shown in Figure 10a, suppose that this encoding region consists of m "1", then the number of "0" with respect to the non-encoding region is n = 10 − m, and we get the binary sequence 11 · · · m 00 · · ·  A BA in Figure 11);  denotes half of the number of "1" contained in the first clockwise encoding region. As illustrated in Figure 11, the coded value can be calculated by the following two cases: Through the above calculation, a clockwise binary sequence, denoted by Binary_sequence, starting from the encoding region closest to the start tag is obtained. Then, the decoding method based on finding the start bit is proposed to deduce the identification number. Let where α = 0.5; β describes the angle between the start tag and its nearest encoding region along clockwise direction (∠A 1 BA 2 in Figure 11); θ denotes half of the number of "1" contained in the first clockwise encoding region. As illustrated in Figure 11, the coded value can be calculated by the following two cases: adjacent encoding regions. By traversing the entire ring pattern, we get the binary sequence of. A BA in Figure 11);  denotes half of the number of "1" contained in the first clockwise encoding region. As illustrated in Figure 11, the coded value can be calculated by the following two cases: Figure 11. Pseudo-code description of calculating the coded value.
A) Case 1: if ＜0 or =0 Figure 11. Pseudo-code description of calculating the coded value.
A) Case 1: if ∆ < 0 or ∆ = 0 As shown in Figure 12, Let ψ = roundn(α + β + θ, 0), and assume the first encoding region consists of k number of "1", where k ≥ ψ. First, ψ numbers of "1" from the back to the front of this sequence are read and then connected with the following sequence to get the Segment_1; while the remain k − ψ number of "1" in the first encoding region is denoted by Segment_2. Finally, the new segment is obtained by connecting Segment_1 to Segment_2.
B) Case 2: ∆ > 0. Let ψ = ∆. The last ψ binary digits of the entire binary sequence are read from the back to the front to form Segment_1 (Figure 13), and the remaining binary sequence is denoted by Segment_2. Then, the new sequence is obtained by connecting Segment_1 to Segment_2.
The detailed pseudo-code description of calculating the coded value is given in Figure 11. Finally, the coded target can be decoded by directly converting the new binary sequence to decimal.
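The two re-ordering cases can be illustrated with a minimal Python sketch. It assumes that ψ and the case (∆ ≤ 0 or ∆ > 0) have already been determined, and that the binary sequence is held as a list of 0/1 integers read clockwise. The function names `new_sequence_case1`, `new_sequence_case2` and `decode` are hypothetical and are not taken from the pseudo-code in Figure 11.

```python
from typing import List


def new_sequence_case1(bits: List[int], psi: int) -> List[int]:
    """Case 1: keep the last psi "1" of the first encoding region at the front.

    The first encoding region is the run of leading "1" (length k >= psi).
    Segment_2 is the first k - psi "1"; Segment_1 is everything after it.
    """
    k = 0
    for b in bits:
        if b != 1:
            break
        k += 1
    if psi < 0 or psi > k:
        raise ValueError("psi must satisfy 0 <= psi <= k")
    shift = k - psi
    segment_1 = bits[shift:]   # last psi "1" plus the following sequence
    segment_2 = bits[:shift]   # remaining k - psi "1"
    return segment_1 + segment_2


def new_sequence_case2(bits: List[int], psi: int) -> List[int]:
    """Case 2: move the last psi digits of the whole sequence to the front."""
    if psi < 0 or psi > len(bits):
        raise ValueError("psi must satisfy 0 <= psi <= len(bits)")
    if psi == 0:
        return list(bits)
    segment_1 = bits[-psi:]    # last psi digits
    segment_2 = bits[:-psi]    # remaining digits
    return segment_1 + segment_2


def decode(bits: List[int]) -> int:
    """Convert the re-ordered binary sequence directly to decimal."""
    return int("".join(str(b) for b in bits), 2)


if __name__ == "__main__":
    sequence = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]  # illustrative 10-bit sequence
    print(decode(new_sequence_case1(sequence, psi=2)))
    print(decode(new_sequence_case2(sequence, psi=2)))
```

Both cases reduce to a cyclic shift of the ring sequence, which is why finding the start bit is enough to make the decoded value independent of where the traversal of the ring begins.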

The Supplementary Materials provide a video of the image sequence processing, showing that all the markers in the image sequence can be accurately identified.

Pose Estimation Algorithms Comparison
Currently, PNP algorithms can be classified as closed-form solution based algorithms (P3P, P4P, P5P), non-iterative algorithms (DLT, TASI) and iterative algorithms. The former two classes are easily affected by noise, while iterative algorithms have obvious advantages in measuring accuracy and stability. Research on PNP algorithms mainly focuses on measurement accuracy, efficiency and stability. For a PNP algorithm to work, at least three (or six) control points are required for a vision system with known (or unknown) intrinsic parameters. In this paper, three classical algorithms, i.e., 'DLS' [49], 'LHM' [50] and 'OPNP' [51], have been selected and tested to find the most accurate one for practical application.
All algorithms were tested by the image reprojection errors of the features at a fixed position without considering the proposed distortion partition model. Specifically, the target is driven by the A axis of the machine to rotate to six angular positions (Figure 14a). At each stop position, several features are used by the three algorithms to construct the pose (R, T) of the calibrated camera. Thereafter, all 23 features are projected back to the image via (R, T), and the measurement accuracy of the three algorithms is compared by the reprojection error, which is the image distance between a projected point and the corresponding observed one.
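The reprojection-error criterion can be sketched as follows. This is a minimal illustration assuming an ideal pinhole model without distortion; the intrinsic matrix `K`, the pose `(R, T)` and the point arrays are placeholder inputs, and the PNP solver that produces `(R, T)` is not shown.

```python
import numpy as np


def project(K: np.ndarray, R: np.ndarray, T: np.ndarray, pts_3d: np.ndarray) -> np.ndarray:
    """Project N x 3 target points into the image with a pinhole model."""
    cam = pts_3d @ R.T + T            # points in the camera frame, N x 3
    uvw = cam @ K.T                   # homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective division


def reprojection_errors(K, R, T, pts_3d, observed_uv) -> np.ndarray:
    """Per-feature image distance (pixels) between projected and observed points."""
    return np.linalg.norm(project(K, R, T, pts_3d) - observed_uv, axis=1)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    K = np.array([[1000.0, 0.0, 500.0],
                  [0.0, 1000.0, 500.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)
    T = np.array([0.0, 0.0, 450.0])
    pts = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
    observed = project(K, R, T, pts) + np.array([0.3, 0.4])
    print(reprojection_errors(K, R, T, pts, observed).mean())
```

Averaging these per-feature distances over all 23 features and the six stop positions gives a single figure per algorithm that can be compared directly.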

Wide Range Contouring Error Detection
To make the vision measurement accurate enough to evaluate the contouring error, a high-speed camera with large resolution is needed. However, the FOV, camera resolution and frames per second (FPS) interact with each other [33], as expressed by Equation (9), where Spatial resolution is measured in mm/pixel and Bandwidth is defined as the amount of data that has to be transmitted per second. Once the camera interface and the capture card are determined, the Bandwidth is a constant. For the "what-you-see-what-you-get" measurement scheme (Figure 5b), the working range depends entirely on the size of the FOV. Hence, to enlarge the working range while maintaining a satisfactory Spatial resolution (Equation (9)), the camera resolution must be increased, which reduces the FPS (i.e., the working speed or time resolution). Conversely, for constant values of Bandwidth and Spatial resolution, increasing the working speed (i.e., FPS) requires the camera resolution to be reduced, and the resulting narrow FOV reduces the working range of the vision system. To summarize, the working range and the working speed cannot both be greatly improved simply by adjusting the camera imaging parameters. Thus, to further enhance the measurement ability of the vision system, a monocular vision-based 3D high temporal-spatial measurement method that exploits an a priori geometric constraint (Figure 7) is proposed in this paper. The basic idea is to improve the working speed (i.e., FPS) of the vision system by sacrificing the FOV and camera resolution, while the wide range measurement capability is realized by the a priori geometric constraint. Specifically, the FOV and resolution of the camera are scaled down so that they only guarantee the proper positioning accuracy of several markers in the image (Figure 5c).
As a result, the measurable speed of the vision system can be increased, while, for wide range measurement, a measurement fixture with 1024 coded markers on the artifact (Figure 6a) is designed. One of the coded markers is selected as the 'reference feature' to represent the whole motion trajectory of the machine, while the others are defined as auxiliary coded markers used for calculating the position of the 'reference feature'.

Therefore, according to Equation (9), the camera's frame rate can be increased by scaling down the FOV and the camera resolution. In theory, image blur could be suppressed by raising the frame rate through a reduced camera resolution alone; in this paper, however, the FOV is also reduced to preserve the vision measurement accuracy at low camera resolution. In the practical implementation, a CoaXPress interface camera with a full resolution of 5120 × 5120 pixels is used. The built-in "Region of Interest" (ROI) function enables the camera to send images at a faster frame rate by sacrificing the camera resolution:

$$\frac{F_{reduce}}{F_{full}} = \frac{R_{reduce}}{R_{full}} \quad (10)$$

As can be seen from Equation (10), when the full resolution $R_{full}$ is reduced to a lower resolution $R_{reduce}$ without changing the focal length of the lens, the camera's FOV $F_{full}$ is reduced to $F_{reduce}$ in the same proportion as the camera resolution. Therefore, the spatial resolution remains unchanged by the adjustment of the imaging parameters, and the vision measurement accuracy is thus preserved. Besides, after the adjustment, the FOV can be further reduced by increasing the focal length, which improves the positioning accuracy of the markers in the FOV owing to the finer spatial resolution of the vision system.
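The proportional scaling described above can be checked with a short Python sketch. The resolutions (5120 and 3072 pixels) and the 60 mm FOV are taken from the text; the 100 mm full-resolution FOV is a hypothetical value chosen only so that the ROI crop yields 60 mm, and the function names are illustrative.

```python
def fov_after_roi(fov_full_mm: float, res_full_px: int, res_reduced_px: int) -> float:
    """FOV after an ROI crop: it shrinks in the same proportion as the resolution."""
    return fov_full_mm * res_reduced_px / res_full_px


def spatial_resolution(fov_mm: float, res_px: int) -> float:
    """Spatial resolution in mm/pixel along one image axis."""
    return fov_mm / res_px


if __name__ == "__main__":
    fov_full = 100.0  # hypothetical full-resolution FOV in mm (assumption)
    fov_roi = fov_after_roi(fov_full, 5120, 3072)
    print(fov_roi)                              # FOV after cropping, in mm
    print(spatial_resolution(fov_full, 5120))   # mm/pixel before the crop
    print(spatial_resolution(fov_roi, 3072))    # mm/pixel after the crop
```

Because the FOV and the resolution shrink by the same factor, the two spatial resolution values coincide, which is the property the ROI adjustment relies on.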
To begin with, we define several coordinate systems. As illustrated in Figure 15, the coordinate frames involved in the measurement are the artifact coordinate system (ACS) $O_A X_A Y_A Z_A$, which provides the high accuracy spatial position information of each marker; the camera coordinate system (CCS) $O_C X_C Y_C Z_C$ and the PCS $ouv$ (defined in Section 2); and the machine coordinate system (MCS) $O_M X_M Y_M Z_M$, in which the contouring error of a movement trajectory is represented. The ACS is fixed on the artifact, with its origin $O_A$ located at the central point of the upper-right marker. The positive $X_A$- and $Y_A$-axes point left and downward, respectively, while the positive $Z_A$-axis points outward. Before the machine movement, the MCS is established and its origin coincides with that of the ACS. Below we take the contouring error detection of a spatial contour (Section 5.1) as an example to describe the 3D high temporal-spatial measurement method in detail:

1) System setup and imaging parameters adjustment
The vision measurement system is installed on a platform outside the machine (Figure 16c) to avoid vibrations. Considering the Z-direction variation (about 55 mm) of the spatial contour, the FOV and camera resolution are set to 60 × 60 mm and 3072 × 3072 pixels, respectively. Then, with the 0.03-0.05 subpixel accuracy of the gray centroid algorithm [48], an ideal measurement accuracy of 2 µm can be obtained with a 50 mm focal length at a focusing distance of 450 mm. Furthermore, under these imaging parameters, more than 9 markers appear in each image frame, which guarantees the feasibility of the PNP algorithm.

2) Camera calibration
The control field (Figure 3) and the artifact (Figure 6a) can be exchanged in the designed cooperative target. Therefore, before the measurement, the cooperative target with the control field is first placed on the worktable. Then, using the coded markers, the camera pose is adjusted with the aid of the PNP algorithm. In this process, the intrinsic matrix K in Equation (1) is repeatedly calibrated using Zhang's method [40] until the optical axis is perpendicular to the control field. Thereafter, the proposed camera calibration method considering the distortion partition of the DOF (Section 2.2) is performed. The specific process and accuracy verification are detailed in Section 5.2.
Figure 15. Principle for 3D high temporal-spatial measurement using PNP algorithm.
3) 3D contouring error detection of wide range trajectories by using PNP algorithm
The coded marker $P_r$, i.e., the origin $O_A$ of the ACS, is used as the 'reference feature' to represent the entire movement trajectory. Since the contouring error is expressed in the MCS (ISO 10791-6), the data transformation matrix between the CCS and the MCS is determined before measurement by a method similar to our previous work [15], as given by Equation (11), where ${}^{C}O_M({}^{C}X_M, {}^{C}Y_M, {}^{C}Z_M)$ are the 3D coordinates of point $P_r$ in the CCS when the CNC machine tool is zeroed. In this position, the MCS is established. The two sets of coordinates $(x_i, y_i, z_i)$ denote the 3D coordinates of the point $P_r$ in the CCS when the worktable (i.e., the artifact) is moved to the ith position along the X-axis and along the Y-axis, respectively. Based on the position data of $P_r$ in the CCS, the direction vectors of the X- and Y-axes of the MCS, i.e., $(m_x, n_x, p_x)$ and $(m_y, n_y, p_y)$, can be fitted by the least squares method, and that of the Z-axis can be obtained by the right-hand rule.
${}^{M}M_C$ represents the transformation matrix from the CCS to the MCS, comprising the rotation matrix $R_{CM}$ and the translation matrix $T_{CM}$. The main difference from [15] is that the 3D coordinates of point $P_r$ are obtained here by the PNP algorithm, whereas in [15] they are calculated by triangulation.
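The construction of the CCS-to-MCS transformation can be sketched in Python as follows. This is an illustrative sketch rather than the form of Equation (11): the line fit uses a singular value decomposition (one common least squares line fit), the Y-axis is re-orthogonalized against X and Z (an added assumption), and the function names are hypothetical.

```python
import numpy as np


def fit_direction(points: np.ndarray) -> np.ndarray:
    """Least squares direction of a 3D point set, oriented from first to last point."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered)
    direction = vt[0]                                  # dominant direction, unit length
    if np.dot(direction, points[-1] - points[0]) < 0:
        direction = -direction
    return direction


def build_ccs_to_mcs(origin_c: np.ndarray, pts_x: np.ndarray, pts_y: np.ndarray):
    """Rotation R and translation T such that P_mcs = R @ P_ccs + T."""
    x_axis = fit_direction(pts_x)                      # (m_x, n_x, p_x)
    y_axis = fit_direction(pts_y)                      # (m_y, n_y, p_y)
    z_axis = np.cross(x_axis, y_axis)                  # right-hand rule
    z_axis /= np.linalg.norm(z_axis)
    y_axis = np.cross(z_axis, x_axis)                  # enforce orthogonality (assumption)
    R = np.vstack([x_axis, y_axis, z_axis])            # rows: MCS axes expressed in the CCS
    T = -R @ origin_c                                  # zeroed position maps to the MCS origin
    return R, T


def to_mcs(R: np.ndarray, T: np.ndarray, p_c: np.ndarray) -> np.ndarray:
    """Transform a point measured in the CCS into the MCS."""
    return R @ p_c + T
```

With this convention, the position of $P_r$ recorded when the machine is zeroed maps to the MCS origin, and positions recorded during pure X or Y moves map onto the corresponding MCS axes.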
During measurement, the CCS and the MCS are fixed, while the ACS performs the interpolation motion together with the worktable. Simultaneously, the moving markers are continuously imaged by the camera. As described in step 1, the maximum frame rate can reach 208 FPS by reducing both the camera resolution and the FOV. To enlarge the measurable range of the detection system under a small FOV, we use the a priori information among the coded markers on the artifact to deduce the 3D position of the 'reference feature' $P_r$. As discussed above, the measurable speed is increased by sacrificing the FOV, so the 'reference feature' is invisible in some images. In this case, together with the pre-calibrated position relationship between the visible markers and the 'reference feature', the visible auxiliary coded markers are used by the OPNP algorithm to deduce the 3D position of the 'reference feature'. Suppose that a total of Q frames are acquired. For the ith frame, to calculate the 3D coordinates of $P_r^i$, the j complete coded markers in the image, denoted by $p_1(u_1, v_1), \ldots, p_j(u_j, v_j)$, are first identified and located using the image processing method proposed in Section 3.2. Let the corresponding points in the ACS be ${}^{A}P_1, \ldots, {}^{A}P_j$. Then, using the pre-calibrated camera parameters, the 3D point ${}^{C}P_r^i$ in the CCS can be solved by the OPNP algorithm, and the measured point ${}^{M}P_r^i$ in the MCS can be calculated by the datum transformation in Equation (11). Finally, the 3D contour $L_r$ is obtained by traversing all the images, and the contouring error E is deduced by comparing the measured trajectory with the nominal one, $E = L_r − L_m$. In this way, the measurable range no longer depends on the size of the FOV but on the size of the artifact, and the two indicators (working range and working speed) can thus be increased simultaneously.
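The two computational steps of this procedure can be sketched in Python. The pose `(R, T)` of the artifact in the CCS is assumed to come from a PNP solver applied to the visible auxiliary markers (the solver itself is not shown). The contouring error is approximated here as the distance from each measured point to the nearest sample of a densely sampled nominal contour, which is an assumption about how $L_r$ and $L_m$ are compared; all names are hypothetical.

```python
import numpy as np


def reference_in_ccs(R: np.ndarray, T: np.ndarray, ref_in_acs=None) -> np.ndarray:
    """3D position of the 'reference feature' in the CCS from the artifact pose.

    The reference feature is the ACS origin, so its ACS coordinates default to zero
    and it can be located even when it lies outside the image.
    """
    if ref_in_acs is None:
        ref_in_acs = np.zeros(3)
    return R @ ref_in_acs + T


def contouring_error(measured: np.ndarray, nominal: np.ndarray) -> np.ndarray:
    """Distance from each measured point (Q x 3) to the nearest nominal sample (N x 3)."""
    diff = measured[:, None, :] - nominal[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)


if __name__ == "__main__":
    # Hypothetical pose and trajectories for illustration only.
    print(reference_in_ccs(np.eye(3), np.array([1.0, 2.0, 3.0])))
    nominal = np.column_stack([np.linspace(0.0, 10.0, 21), np.zeros(21), np.zeros(21)])
    measured = np.array([[2.0, 0.01, 0.0], [5.0, 0.0, -0.02]])
    print(contouring_error(measured, nominal))
```

The nearest-sample distance is only as fine as the sampling of the nominal contour, so the nominal path should be sampled much more densely than the error magnitude of interest.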

Experimental Equipment and Tested Trajectories
As shown in Figure 16, the experimental equipment includes a five-axis CNC machine tool, a camera, a platform, a graphics workstation, a frame grabber and a cooperative target. As the hardware platform, an EoSens ® 25CXP CMOS camera with a full resolution of 5120 × 5120 pixels is selected. The camera is connected to a microEnable 5 frame grabber (VQ8-CXP6D) within a 22" Windows XP-based workstation by a CoaXPress cable. For the software library, the GenICam standard is used to configure and trigger the camera, and images are visualized by the microDisplay software. All algorithms are developed with the aid of the machine vision toolbox of MATLAB. Synchronous triggering between the camera and the machine tool is not strictly required for the monocular vision system; we only need to ensure that the whole movement trajectory is recorded. Thus, the software trigger is first used to allocate memory for image acquisition, and thereafter the machine tool is triggered manually to perform the trajectory interpolation. In the practical measurement, the FOV and camera resolution are set to 60 × 60 mm and 3072 × 3072 pixels, respectively. Although the resulting frame rate can be increased up to 208 FPS, a frame rate of 100 FPS is sufficient and is selected to acquire high-quality marker images. Then, with the 0.03-0.05 subpixel accuracy of the gray centroid algorithm [48], an ideal measurement accuracy of 2 µm can be obtained with a 50 mm focal length at a focusing distance of 450 mm. The aperture is set at f/22 to ensure that the relatively large DOF of 71 mm can accommodate the distance change (about 55 mm) of the spatial contour. In addition, a large artifact with a size of 231 mm × 231 mm is designed to increase the measurable range of the system. Other experimental parameters are shown in Table 1.
Table 1. Experimental parameters. Artifact (see Figure 6a for detail): geometrical accuracy of single coded targets, <1 µm; calibration accuracy of spatial geometric information, 0.5 µm.
Clearly, the large-scale measurement capability is guaranteed by the pre-calibrated large-size artifact. However, the speed advantage of the measurement depends entirely on whether blur-free images can be acquired at the set frame rate. Thus, a performance test is conducted to verify the measurement capacity at relatively high feed rates. As shown in Table 2, we reduced the camera resolution from 5120 × 5120 pixels to 3072 × 3072 pixels and 1024 × 1024 pixels by the built-in ROI function. Correspondingly, three experimental frame rates are obtained: 25 FPS, 100 FPS and 150 FPS.
Based on the above three sets of imaging parameters, images of the markers moving along the X axis of the machine are acquired (the measurement configuration is shown in Figure 16b). In the tests, the feed rates are set to 3 m/min, 5 m/min and 7 m/min, respectively. Subsequently, the grey characteristics of the marker with the code value of 397 are studied. The 3D grey maps captured at different feed rates using the imaging parameters in the three columns of Table 2 are plotted, as well as the cross-section grey curves passing through the point center (Figures 17-19).
Taking the captured static image as a reference, the sharpness of the point edge at different feed rates is measured by comparing the image gradient with that of the static image (i.e., Figures 17a, 18a and 19a). As can be seen from the 3D grey maps in Figure 17b-d, when using the same frame rate (i.e., 25 FPS) as in [15], the images of the central point are obviously degraded. In addition, at 3 m/min, 5 m/min and 7 m/min, the image gradients of the point edge are 11, 8 and 6.5, far less than the reference value of 53; the larger the feed rate, the smoother the point edge. However, the other two sets of imaging parameters (i.e., Figures 18 and 19) give satisfactory results. The 3D grey maps captured at different feed rates (i.e., Figure 18b or Figure 19b) are consistent with the reference (i.e., Figure 18a or Figure 19a). Moreover, the image gradient values of the point edge at different feed rates under these two imaging conditions differ from the reference value by about 0.5 pixel, indicating that the point edge remains sharp. Additionally, as illustrated in Table 2, by reducing the camera resolution, the maximum frame rate is increased from 33 FPS to 208 FPS and 308 FPS, respectively, which verifies the feasibility of the proposed method in measuring relatively high-dynamic contouring error.
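To give a sense of scale for these tests, the distance travelled by a marker between two consecutive frames can be computed from the feed rate and the frame rate. This is only a rough indicator: the actual motion blur depends on the exposure time, which is not stated here, so the values below bound the blur only under the assumption that the exposure does not exceed the frame interval. The function name is hypothetical.

```python
def travel_per_frame_mm(feed_m_per_min: float, fps: float) -> float:
    """Marker displacement (mm) during one frame interval at a given feed rate."""
    feed_mm_per_s = feed_m_per_min * 1000.0 / 60.0
    return feed_mm_per_s / fps


if __name__ == "__main__":
    for fps in (25, 100, 150):        # experimental frame rates from the text
        for feed in (3, 5, 7):        # feed rates in m/min from the text
            print(fps, feed, round(travel_per_frame_mm(feed, fps), 3))
```

At a fixed feed rate, the displacement per frame interval falls in inverse proportion to the frame rate, which is consistent with the sharper point edges observed at 100 FPS and 150 FPS than at 25 FPS.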
To verify the advantages of the proposed knowledge-driven contouring error detection approach over the existing vision and non-vision methods in multi-dimensional, high speed and wide working range measurement of various forms of trajectories, the contouring errors of two types of paths, i.e., a planar contour and a spatial trajectory, are evaluated.
The former is the wide-range butterfly curve (Figure 20a), which is interpolated by the predetermined X- and Y-axis and whose polar equation can be expressed as $r = 6\,e^{\cos(2\theta) - 2\cos(8\theta) + \sin^{5}(\theta/6)}$ ($X \in [-85.601\ \mathrm{mm},\ 84.132\ \mathrm{mm}]$ and $Y \in [-34.554\ \mathrm{mm},\ 35.138\ \mathrm{mm}]$). The working range in the X-axis is twice that of our previous work [15], and about 2.8 times the FOV selected in this paper. The latter is the spatial path shown in Figure 20b, in which the whole curve can be derived by the offset or rotation of the two paths, where $\beta_a = \operatorname{atan}(l_c \sin\beta_c / h)$ and $l_a = \sqrt{l_c^{2}\sin^{2}(\beta_c) + h^{2}}$; $h$, $l_c$ and $\beta_c$ depend on the position of the markers. To achieve contouring error measurement, considering the defocus effect caused by the small DOF under imaging parameters of large focal length and small object distance, two measurement configurations are used. For setup 1 (Figure 16b), the camera is placed above the XOY plane to measure the contouring error of the butterfly curve, while setup 2 (Figure 16c), with the camera mounted in front of the machine tool, is used to detect the motion error of the spatial trajectory.
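As a quick plausibility check of the butterfly curve, the following sketch samples the polar equation and reports the extents of the path. It assumes that the equation reads r = 6·exp(cos 2θ − 2 cos 8θ + sin⁵(θ/6)) and that θ sweeps [−π, π]; the sweep range is an assumption, since it is not stated in the text. Under these assumptions the extents come out close to the X and Y ranges quoted above.

```python
import numpy as np

def butterfly_curve(theta):
    """Return (x, y) in mm for r = 6*exp(cos(2t) - 2*cos(8t) + sin(t/6)**5)."""
    r = 6.0 * np.exp(np.cos(2 * theta) - 2 * np.cos(8 * theta)
                     + np.sin(theta / 6) ** 5)
    return r * np.cos(theta), r * np.sin(theta)

# Assumed parameter sweep (not given in the paper).
theta = np.linspace(-np.pi, np.pi, 20001)
x, y = butterfly_curve(theta)

print(f"X range: [{x.min():.3f}, {x.max():.3f}] mm")
print(f"Y range: [{y.min():.3f}, {y.max():.3f}] mm")
```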

Experiment for Verifying the Proposed Calibration Method
Before conducting the contouring error detection experiment, we verify the measurement accuracy of the distortion calibration method proposed in Section 2.2.

For a clearer description, the verification process is divided into two steps:

1) Accuracy verification of the 3D distortion partition model
As shown in Figure 21a, after the alignment, the control field is driven by the machine to four positions within the DOF, perpendicular to the optical axis. The two object planes at the front and rear positions of the DOF are used as the reference to estimate the distortion behavior of the two middle planes.
(1) Accuracy verification of the equal-radius partition model
As shown in Figure 21b,c, the distortion curve of each subregion differs from that calculated using all the lines in the image. First, the performance of the in-plane distortion partition model is judged by the straightness error after distortion correction. As illustrated in Table 3, the maximum and average straightness errors of each subregion are smaller than those calculated using all the lines in the image, which indicates the accuracy of the proposed partition method. The optimal distortion curve for each subregion can be seen in the enlarged view of Figure 21c.
Table 3. Accuracy verification of the in-plane distortion partition model (columns: Subregion 1, Subregion 2, Subregion 3, Subregion 4, Entire Image).
(2) Accuracy verification of the 3D distortion partition model
Then, based on the front and rear object planes with known depths and the calibrated distortion parameters, the distortion coefficients on each partition of the two middle object planes are estimated by the method in Section 2.2. Thereafter, the derived distortions of the two middle planes are compared with those calculated directly by the plumb-line method to verify the accuracy of the proposed DOF distortion partition model. Table 4 illustrates the difference |C − O| between the distortion calculated with or without the DOF distortion partition model and the observed one. The results indicate that the maximum and average differences are 1.75 µm and 0.86 µm, while the distortion differences calculated without the partition model are more than twice the corresponding differences calculated with it, which shows the high accuracy of the proposed partition method.

2) Accuracy verification of camera calibration
To verify the calibration accuracy, the DOF distortion partition model is embedded in the selected OPNP algorithm to correct the pixel position deviation. Similar to the experimental equipment (Figure 14a) and the verification process in Section 4.1, we first verify the calibration accuracy using the reprojection error in pixels. As illustrated in Figure 22a, the maximum and average reprojection errors of the six angle positions using the OPNP algorithm with the partition model are 0.23 pixel and 0.04 pixel, respectively. Additionally, the calibrated accuracy of the camera on each axis is validated. To perform camera calibration, the image plane should be installed parallel to the control field. Since the CMOS sensor has the same pixel size and imaging resolution in the horizontal and vertical directions, we assume that the camera has the same measurement accuracy in the two directions. In practice, the control field is driven to 13 positions with an interval of 3 mm along the X- and Z-axis of the camera. Then, considering the high accuracy of machine tool static positioning, the vision calibrated accuracy is evaluated by the distance deviation between two adjacent stop positions. As shown in Figure 22b, the maximum and mean calibrated accuracy of the camera in the X/Y- and Z-axis are 3.4 µm/1.6 µm and 4.5 µm/1.4 µm, while the standard deviations are 1.0 µm and 1.6 µm, respectively. The results indicate the high calibrated accuracy of the camera in the three axes.
Besides, we apply the algorithm to construct the angle between two adjacent angular positions, and validate the measurement accuracy of the system by comparing it with the actual motion angle. The accuracy verification results in Figure 22c illustrate that the maximum and mean angle errors are 0.0157° and 0.0132°, which verifies the 3D measurement accuracy of the vision system.
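The per-axis check described above (stop positions at a nominal 3 mm interval, evaluated through the distance between adjacent stops) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the sample positions are hypothetical, and the machine's static positioning is taken as the reference, as in the text.

```python
import numpy as np

def step_deviation_stats(positions_mm, nominal_step_mm=3.0):
    """Max, mean and standard deviation (in µm) of |measured step - nominal step|.

    positions_mm: (n, 3) array of vision-measured stop positions in mm.
    """
    positions = np.asarray(positions_mm, dtype=float)
    # Euclidean distance between each pair of adjacent stop positions.
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    deviation_um = np.abs(steps - nominal_step_mm) * 1000.0
    return deviation_um.max(), deviation_um.mean(), deviation_um.std(ddof=1)

# Hypothetical data: four stops along X, the third one measured 2 µm too far.
measured = np.array([[0.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0],
                     [6.002, 0.0, 0.0],
                     [9.0, 0.0, 0.0]])
max_um, mean_um, std_um = step_deviation_stats(measured)
print(max_um, mean_um, std_um)
```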

Case Study for Illustrating Advantages in 3D High Temporal-Spatial Measurement
In our previous work [15], due to the blurring effect caused by the fast movement of the target, the maximum measurable traverse speed was limited to 2 m/min. To demonstrate the capacity to extend the measurable range and speed simultaneously, as well as the 3D measurement of the contouring error, the contouring error of a large-scale butterfly curve (Figure 20a) is tested at 3 m/min and 5 m/min, respectively.
In practice, the experiment was repeated three times for each feed rate, and the three repetitions are plotted on one graph (e.g., Figure 23). Since the interpolation motion is triggered manually, the positions at which the camera samples the machine tool movement differ among the three groups (e.g., Figure 24c). This is equivalent to increasing the number of sampling points on the motion trajectory, so that the contouring error is estimated more fully. Besides, the number of sampling points varies with the feed rate. For each group of experiments at 3 m/min and 5 m/min, about 2000 (e.g., Figures 23 and 24c) and 1350 (e.g., Figures 25 and 26c) image frames are collected, respectively, because the faster the interpolation speed, the shorter the total duration. Figures 23 and 25 depict the 3D large-scale trajectories constructed in the MCS at the two traverse speeds using the single camera. The results indicate that the method enables large-scale path measurement with a small FOV. Meanwhile, the 3D measured paths with time-varying curvature validate that the vision method overcomes the limitations on the measurable dimension and trajectories of some existing equipment (e.g., ball-bar, cross-grid encoder and R-test). As illustrated in Figures 23 and 25, the paths interpolated by the predetermined X- and Y-axis float above and below the movement plane. The maximum float range reaches 23.2 µm and 29.1 µm, respectively, which is mainly caused by defects of the numerical control system and the machine structure (e.g., vibration and straightness error).

Additionally, as described in [15], a commercial cross-grid encoder (Figure 27) with a high resolution of 0.5 µm is employed to measure the butterfly curve under the same conditions (the test configuration in Figure 27 is the same as in Figure 16b). First, the 3D points measured by vision are projected onto the XY plane of the machine tool coordinate system to obtain the 2D contour. At the same time, the 2D interpolated butterfly contour is measured by the cross-grid encoder under the same conditions. Then, according to ISO 10791-6, the 2D contours measured by the two devices are compared with the nominal one (the commanded path) to obtain the contouring error. Thereafter, the 2D contour measured by the cross-grid encoder is taken as the standard, and the accuracy of the vision method in detecting the contouring error of the butterfly curve is verified by comparing the trajectories measured by the camera and by the cross-grid encoder. Figures 24a and 26a indicate that the 2D paths detected by both the vision method and the cross-grid encoder follow a trend consistent with the nominal ones. Figures 24b and 26b describe the 2D contouring error of the trajectories measured at the two feed rates by the two means, while Figures 24c and 26c depict the verification results of the vision measurement accuracy.
To give a better assessment of both the contouring error and the vision measurement accuracy, a total of 8000 points are sampled by the cross-grid encoder over the whole motion trajectory. For the butterfly path detected by the cross-grid encoder, the maximum and mean contouring errors are 64.9 µm/13.2 µm at 3 m/min (Figure 24b) and 100.7 µm/15.7 µm at 5 m/min (Figure 26b). As illustrated in the two figures, the higher the feed rate, the larger the contouring error, and the maximum contouring error appears where the contour has large curvature. The difference between the maximum and minimum contouring errors can be up to 56.6 µm at 3 m/min and up to 89.2 µm at 5 m/min. The verification data are presented in Figures 24c and 26c. Compared with the paths detected by the cross-grid encoder, the maximum and mean solution errors of the vision method are 11.3 µm/3.4 µm at 3 m/min (Figure 24c) and 14.1 µm/3.9 µm at 5 m/min (Figure 26c), both less than 1/3 of the error to be solved. The results demonstrate that the 3D high temporal-spatial measurement method gives the vision system both wide-range and relatively high-dynamic measurement capabilities. Besides, the standard deviations of the vision measurement accuracy at 3 m/min and 5 m/min are 1.4 µm and 1.7 µm, respectively, which shows the good stability of the measurement accuracy.
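The comparison of a measured 2D contour with the nominal one can be sketched as below. This is only an illustration under a common assumption, namely that the contouring error of a measured point is its shortest distance to the commanded path, here approximated by a densely sampled polyline with distinct consecutive points. The function name and the sample data are hypothetical, and the evaluation procedure of ISO 10791-6 itself is not reproduced.

```python
import numpy as np

def contouring_error(measured, nominal):
    """Shortest distance from each measured point to a nominal polyline.

    measured: (m, d) array of measured points.
    nominal:  (n, d) array of consecutive, distinct points on the commanded path.
    """
    measured = np.asarray(measured, dtype=float)
    nominal = np.asarray(nominal, dtype=float)
    start, end = nominal[:-1], nominal[1:]
    seg = end - start
    seg_len2 = np.einsum("ij,ij->i", seg, seg)
    errors = np.empty(len(measured))
    for k, p in enumerate(measured):
        # Projection parameter of p on every segment, clamped to the segment.
        t = np.clip(np.einsum("ij,ij->i", p - start, seg) / seg_len2, 0.0, 1.0)
        closest = start + t[:, None] * seg
        errors[k] = np.min(np.linalg.norm(p - closest, axis=1))
    return errors

# Hypothetical data in mm: a straight commanded path and two measured points.
nominal_path = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
measured_pts = np.array([[2.5, 0.01], [-1.0, 0.0]])
err = contouring_error(measured_pts, nominal_path)
print(err.max(), err.mean(), err.std())
```

The same function applies to 3D points, since the distance computation does not depend on the dimension.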

Case Study for Highlighting 3D Detection of Contouring Error of a Space Trajectory
As mentioned in Section 5.2, we verified the accuracy of the proposed DOF distortion partition method, as well as the 3D measurement accuracy. On this basis, to further illustrate the advantages of the proposed monocular vision scheme in spatial contouring error detection over the existing equipment (e.g., telescoping ballbar and cross-grid encoder), the contouring performance of the interpolated spatial curve shown in Figure 20b is assessed at 3 m/min.
To improve the vision positioning accuracy of the fiducial markers, in practical measurement the PNP algorithm is first used to estimate the 3D position of point $O_G$ on the artifact. Then, the four-subregion distortion partition in the object plane perpendicular to the optical axis is extended to the DOF. Thereafter, the spatial coordinates of point $O_G$ are re-corrected to a precise solution using the proposed subregion-based distortion model. The measured 3D contour of the spatial path is shown in Figure 28, in which the scatter points with red, green and blue marks are the three groups of movement paths measured by monocular vision, while the black ones form the nominal curve. For each experimental group, 2500 points on the trajectory are sampled by the vision method. As can be seen from the enlarged view, the three repetitions are consistent in reflecting the trend of the contouring error. Where the curvature changes rapidly, large contouring errors induced by servo mismatch are more likely to occur. The difference between the maximum and minimum contouring errors can be up to 72 µm. We performed statistical analysis on the measured data, and the results reveal that the maximum and mean contouring errors caused by the imperfect machine tool are 78.9 µm and 11.7 µm. Additionally, the standard deviation of the contouring error is 9.3 µm, which indicates that the machine has a stable and small contouring error in most positions of the contour, in contrast to positions where the curvature changes drastically. This case study highlights the capability of vision metrology in detecting the contouring performance of a space trajectory.
Although the contouring error of a spatial path can be detected, the camera focusing system is probably not capable of following target changes that involve a large change in distance. This is mainly caused by the relatively small DOF. The key indicators that affect the DOF are working distance, focal length and aperture. Although the DOF of a single camera is greater than that of a binocular camera, both systems inevitably use a small working distance and a large focal length in high-accuracy applications. Consequently, the defocus blur induced by the small DOF degrades the image of the coded markers, which significantly decreases the vision measurement accuracy. Since the DOF is independent of the FOV and camera resolution, the proposed method cannot further enhance the DOF.
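The dependence of the DOF on these indicators can be made explicit with the standard thin-lens approximation below. It is a textbook relation rather than a formula from this paper, and the symbols $N$ (f-number), $c$ (acceptable circle of confusion on the sensor), $u$ (working distance) and $f$ (focal length) are introduced here only for illustration.

```latex
\mathrm{DOF} \approx \frac{2 N c\, u^{2}}{f^{2}}, \qquad f \ll u \ll \frac{f^{2}}{N c}
```

Under this approximation, halving the working distance or doubling the focal length each reduces the DOF by roughly a factor of four, which is consistent with the small DOF that results from a short working distance and a long focal length in high-accuracy setups.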

Remarks on Major Contributors for Measurement Uncertainties
The issue of measurement uncertainty has been explored in prior studies [52]. We reanalyze all the links in the vision measurement process and present the major contributors to the vision measurement uncertainty, which are detailed as follows:

1) Error of a priori information
To achieve vision measurement with the proposed method, artifact calibration is necessary. In this paper, we calibrate the 3D positions of the coded markers (i.e., the a priori information) using the high-accuracy commercial instrument HEXAGON OPTIV reference (measurement uncertainty $E_x$, $E_y$: (0.25 + L/900) µm and $E_z$: 0.5 µm, where L is the measurement length in mm). On the one hand, PNP measurements are valid when more than three markers with known spatial positions are in the FOV. On the other hand, when performing wide-range measurement, the 3D position of the invisible 'reference feature' is derived from the a priori information. Therefore, the calibration error of the marker positions contributes to the vision measurement uncertainty throughout the measurement.

2) Image size of the marker
In this paper, the grey centroid method is applied to locate the center $(x_M, y_M)$ of the marker on the image, which can be described by the classic equation:
$$x_M = \frac{\sum_{i=1}^{n} t\,s_i\,x_i}{\sum_{i=1}^{n} t\,s_i}, \qquad y_M = \frac{\sum_{i=1}^{n} t\,s_i\,y_i}{\sum_{i=1}^{n} t\,s_i} \qquad (12)$$
where $n$ denotes the number of pixels to be processed and $s_i$ represents the grey value at pixel position $(x_i, y_i)$. Depending on whether the pixel participates in the calculation, $t$ is set to 0 or 1. Then, error propagation is applied to Equation (12) to determine the theoretical accuracy. As illustrated in Equation (13), the standard deviation of the centroid is clearly positively correlated with the grayscale noise (i.e., $\sigma_s$) and the marker image size. For instance, for a marker occupying 6 pixels whose grey value is 220, the theoretical standard deviation of the centroid is about $\sigma_{x_M} = \sigma_{y_M} = 0.005$ pixel when the image grey value noise is 0.6. Besides, according to [48], the image size of the marker also influences the center deviation caused by the perspective effect: the larger the image size occupied, the larger the eccentricity. From the above analysis, the target size is also a main factor influencing the measurement uncertainty, although this issue was not studied in our experiments. For high-accuracy applications (smaller than 0.5 µm), the image size of the target should be as small as possible, but at least 5 pixels for noise resistance and algorithm application.
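The grey centroid of Equation (12) can be written compactly as in the sketch below. It assumes that the participation flag t is decided by a simple grey-level threshold, which is one possible choice and is not specified in the paper; the function name, the threshold and the test image are hypothetical.

```python
import numpy as np

def grey_centroid(image, threshold=0.0):
    """Grey-weighted centroid (x_M, y_M) of the pixels brighter than threshold."""
    img = np.asarray(image, dtype=float)
    # t = 1 for pixels that take part in the calculation, 0 otherwise.
    t = (img > threshold).astype(float)
    weights = t * img
    total = weights.sum()
    if total == 0:
        raise ValueError("no pixel exceeds the threshold")
    ys, xs = np.indices(img.shape)  # row index = y, column index = x
    return (weights * xs).sum() / total, (weights * ys).sum() / total

# Hypothetical marker: a 3x3 block of grey value 220 on a dark background.
marker = np.full((5, 5), 10.0)
marker[1:4, 1:4] = 220.0
x_m, y_m = grey_centroid(marker, threshold=50.0)
print(x_m, y_m)
```

Thresholding keeps background pixels out of the sums, so the result is not biased towards the geometric center of the window.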

3) Alignment error
In this paper, a camera calibration method considering the distortion partition of the DOF is proposed, which allows the distortion coefficients at an arbitrary object distance to be calculated from two known object planes (Equation (6)). Therefore, before using the proposed method (Section 2.2) to calibrate the camera parameters, the image plane should be installed parallel to the control field. In practice, in combination with Zhang's method [40], the installation is adjusted repeatedly using the PNP algorithm. Even so, the alignment error can only be controlled within 0.1°, which reduces the subsequent camera calibration accuracy. Additionally, the distortion coefficients describe the goodness-of-fit of the distortion in each subregion by minimizing the straightness error. For regions with equal partition radius $R_1 = R_2 = R_3$ (Figure 29a), according to Equation (2), the calculated accuracy of the distortion coefficients varies with the different distortions $\Delta_1 \neq \Delta_2 \neq \Delta_3$, and the larger the distortion variation, the lower the estimation accuracy of the distortion (see Table 3). Thus, we assume that if the distortion in the object plane is partitioned by an equal-distortion criterion (Figure 29b), i.e., $\Delta_1 = \Delta_2 = \Delta_3$, then a high accuracy of distortion coefficient calculation can be achieved even though the partition radii differ ($R_1 \neq R_2 \neq R_3$). Our further study will focus on this.

Conclusions
In this paper, a 3D high temporal-spatial measurement method and system based on a single camera are proposed. This knowledge-driven approach realizes the 3D detection of contouring errors of arbitrary paths, especially that of interpolated spatial contours. The innovations of this paper lie in improving the accuracy, efficiency and ability (i.e., measurable speed and working range) of the vision measurement, as follows: a camera calibration method considering the distortion partition of the DOF is proposed, which solves the problem that the DOF-dependent imaging distortion seriously restricts the vision measurement accuracy; both a new encoding method and a decoding method based on finding the optimal start bit are proposed, which improve the marker identification efficiency in image processing; finally, together with the a priori information, the 3D measurement of large-scale contouring error under relatively high dynamic conditions is realized by the PNP algorithm. After the performance test of the vision system, contouring errors of both a planar and a spatial trajectory are measured in the laboratory. The statistical analysis results verify the measurement ability of the proposed monocular vision-based method in multi-dimensional measurement (versus double ballbar and cross-grid encoder), wide working range (versus R-test) and various forms of trajectories (versus double ballbar, R-test and cross-grid encoder). Finally, other factors affecting the uncertainty of the vision measurement are analyzed. This technique has potential applications in enhancing the dynamic behavior of low-accuracy CNC machines. The main limitations of the research are the relatively low measurement accuracy compared to binocular vision and the inability to measure trajectories with relatively large variations in depth. Therefore, our next objective focuses on improving the accuracy of the PNP algorithm and extending the DOF of the vision system.

Conflicts of Interest:
The authors declare no conflict of interest.
