Free-Surface Velocity Measurement Using Direct Sensor Orientation-Based STIV

Particle image velocimetry (PIV) is a quantitative flow visualization technique, which greatly improves the ability to characterize various complex flows in laboratory and field environments. However, the deployment of reference objects or ground control points (GCPs) for velocity calibration is still a challenge for in situ free-surface velocity measurements. By combining space-time image velocimetry (STIV) with direct sensor orientation (DSO) photogrammetry, a laser distance meter (LDM)-supported photogrammetric device is designed, to realize the GCPs-free surface velocity measurement under an oblique shooting angle. The velocity calibration with DSO is based on the collinear equation, while the lens distortion, oblique shooting angle, water level variation, and water surface slope are introduced to build an imaging measurement model with explicit physical meaning for parameters. To accurately obtain the in situ position and orientations of the camera utilizing the LDM and its embedded tilt sensor, the camera’s intrinsic parameters and relative position within the LDM are previously calibrated with a planar chessboard. A flume experiment is designed to evaluate the uncertainty of optical flow estimation and velocity calibration. Results show that the proposed DSO-STIV has good transferability and operability for in situ measurements. It is superior to propeller current meters and surface velocity radars in characterizing shallow free-surface flows; this is attributed to its non-intrusive, whole-field, and high-resolution features. In addition, the combined uncertainty of free-surface velocity measurement is analyzed, which provides an alternative solution for error assessment when comparing measurement failures.


Introduction
Particle image velocimetry (PIV) is a quantitative flow visualization technique [1]. It seeds particles with good light scattering to trace their flow and estimates their motion as displacement or velocity vectors between two successive frames via image processing to present the flow field. Compared with traditional velocity measurement methods, it greatly improves the ability to characterize various complex flows in laboratory and field environments; this is attributed to its non-intrusive and whole-field measurement features [2]. Velocity calibration, which transforms the motion vectors in the image coordinate system to the velocity vectors in the world coordinate system, is one of the key issues in PIV [3].
In laboratory flume and river model experiments, cameras generally shoot perpendicular to the test plane. Under these conditions, the scaling relation for orthographic projection could be simply calculated with known-size reference objects set on the test plane [4,5]. A more rigorous approach is to build the invertible coordinate transformation relation between the image plane and the object plane by solving the homography matrix. The direct linear transformation (DLT) is a commonly used method in close-range photogrammetry, which needs at least four non-aligned ground control points (GCPs) on the free surface [6]. The planar homography model is simple and GCPs must be strictly coplanar with the test plane. Otherwise, their projections on the image cannot represent the true elevation of the free surface to be measured. However, these conditions are often difficult to control in practice, resulting in calibration error [7].
In contrast, the conditions of large-scale river surfaces are more complicated. Firstly, in order to survey the river cross-section as completely as possible, the camera needs to be installed on a riverbank and shoot with a small pitch angle to obtain a large field of view, ranging from hundreds to thousands of square meters. This can lead to serious image perspective distortion and the loss of spatial resolution in the far-field image. Image orthorectification can correct the perspective distortion [8][9][10][11]; however, it not only increases the consumption of computation and memory, but also brings in errors induced by gray level interpolation [12]. Secondly, the water level of natural rivers varies quickly and significantly, especially in mountain streams where the variation can be several meters in minutes. This can change the projective relation of the free surface, thereby affecting the scale and position of the measuring area [13]. Moreover, it is quite difficult to deploy GCPs on the river surface [14]. Some studies have tried to establish the projective relation with a floating calibration board [15] or laser points tracing the water surface [16]. This can meet the measurement requirement of flow fields ranging from tens to hundreds of square meters under a large shooting angle. However, these reference objects often have poor visibility under complex illumination conditions such as shadows and glares on the river surface, which makes it difficult to automatically detect and accurately locate objects within the image. The detection error will be significantly amplified when applied to a large-scale area [17]. Currently, the commonly used velocity calibration scheme is based on variable height homography (VHH) [7,[18][19][20]. It improves the measuring precision of bank-based large-scale PIV (LSPIV) systems with small shooting angles by considering the distance from GCPs to the free surface, as well as modeling the elevation of the camera with the water level and the river slope. A limitation of VHH is that at least six non-coplanar control points should be distributed evenly on both sides of the river and surveyed by a total station or differential global positioning system (DGPS). It is a labor-consuming and highrisk task, especially in flood emergency monitoring. In addition, it is found that solving VHH with DLT is sensitive to the number, distribution, and accuracy of GCPs, as well as lens distortion. The solving process trends to be non-convergence under unfavorable conditions and fails to build the coordinate transformation relation [21]. Therefore, the velocity calibration problem has become one of the main bottlenecks of LSPIV.
The two-step method [22] for camera calibration in the computer vision society provides a solution for reducing the dependence of PIV on GCPs. It takes into account the nonlinear distortion of the lens and decomposes the homography matrix into intrinsic and extrinsic camera matrices. The photogrammetric task is divided into two steps: (1) indoor calibration of intrinsic parameters (focal length and principal point) and distorted aberration (radial and tangential distortion); and (2) in situ calibration of extrinsic parameters (translation and rotation). This not only reduces the number of GCPs required by the in situ calibration to four [23], but also improves the flexibility of system deployment. For example, a rotational calibration technique based on precise pan-tilt heads is devised to ensure accurate camera geometry in the wide river setting [24,25]. It can utilize nearfield GCPs with better visibility to calibrate the initial value of the extrinsic parameters. The camera can then be rotated to the measuring position and can compensate for its orientations (azimuth and pitch angles) via the readings of two dials on the pan-tilt heads. Since the above calibration process requires manual adjustment and reading, automatic measurement has not been supported so far. When the camera orientation changes due to external disturbance (such as a storm) the present calibration results will no longer be applicable. Re-surveying GCPs and re-calibrating may cause inconsistencies in the world reference system, which creates problems for data use. Obviously, the development of GCPs-free imaging velocimetry has important significance for online river flow monitoring.
With the advances in sensor and information fusion technology, the positioning and orientation system (POS) is explored for extrinsic parameter measurement and coordinate transformation in both land-based [26] and aerial photogrammetry fields [27]. This technique is known as direct sensor orientation (DSO) or direct georeferencing (DG). At present, it can realize the GCPs-free measurement with decimeter-level precision within several square kilometers. The measurement quality of these systems is highly dependent on the precision of POS. This limitation is a major drawback due to the elevated cost associated with high-end POS units, particularly the inertial system. Generally, it is required that the roll and pitch angle errors of inertial systems (INS) should not be greater than 0.01 • , the azimuth angle error should not be greater than 0.02 • , and the recording frequency should be higher than 50 Hz. The positioning accuracy of GPS should reach the centimeter level, and the minimum sampling interval is generally less than one second. In addition, the potential accuracy of the DSO also depends on the architecture and quality of the GPS/INS integration process, as well as the validity of the system calibration. However, the calibration of multiple sensors, as well as the system mounting parameters (i.e., six elements of exterior orientation), is a delicate and complicated task. For the aerial photogrammetric systems, a special ground control field generally needs to be deployed for regular in-flight calibration. Fortunately, for PIV systems with fixed cameras, the position can be measured after installation and the orientation can be synchronously measured by a tilt sensor when capturing images. The DSO-based PIV has the potential to utilize low-cost sensors and improve the efficiency and safety of in situ measurements.
In this paper, a method which combines DSO and space-time image velocimetry (STIV) is proposed to realize free-surface velocity measurement without GCPs and image orthorectification. Both the safety and efficiency of field operations can be effectively improved using this low-cost measurement method. Section 2 firstly introduces the principle of an FFT-STIV based on the fast Fourier transform (FFT), then mainly focuses on the velocity calibration using DSO. Section 3 introduces the design of a laser distance meter (LDM)supported measurement device and mainly focuses on its calibration. Section 4 presents a flume experiment to validate the method when applied to a small-scale free surface and evaluates its uncertainty. Section 5 concludes the paper and proposes research needs for future work.

Motion Estimation with FFT-STIV
STIV is a motion estimation method for time-averaged optical flow [28]. It assumes that the tracers that follow the surface flow satisfy the continuity of motion within a short time. It makes their position in the two-dimensional (2-D) space-time image (STI) present as a significant directional texture, with a main orientation that is a function of the one-dimensional (1-D) time-averaged velocity on a testing line. Compared to LSPIV, its spatial resolution can achieve single-pixel level, and its computational efficiency is above 10 times than that of the correlation-based method [29]. This makes it particularly suitable for real-time velocity-field observation under a small shooting angle. In this study, the central perspective images (rather than the ortho-rectified ones) are directly used for motion estimation to avoid additional errors. Figure 1 illustrates the outline of the FFT-STIV.
Frames of M are captured from a camera or a video with time intervals of ∆t seconds, in which a testing line with the length of L pixels is set along the flow direction. The testing line can be regarded as the interrogation area (IA) with the width of single pixel. An STI the size of L × M (pixels) can be generated in x-t coordinates. It has a directional texture resembling bright and dark bands formed by the motion of optical flow on the free surface. Assuming that the motion of optical flow is equivalent to the motion of flow tracers, the main orientation of texture (MOT), which is defined as δ, can reflect the magnitude and direction of the time-averaged velocity V (m/s) during time interval of M × ∆t(s) on the testing line: where D (m)represents the distance that tracers traveled in physical coordinates within T (s), d (pixel)represents the distance that tracers traveled in image coordinates within τ (frame), v (pixel/s) represents the optical flow velocity and its sign denotes the motion direction, s (m/pixel)represents the scaling factor (i.e., the spatial resolution) on the testing line, and δ ( • ) represents the main orientation of texture.
where D (m)represents the distance that tracers traveled in physical coordinates within T (s), d (pixel)represents the distance that tracers traveled in image coordinates within τ (frame), v (pixel/s)represents the optical flow velocity and its sign denotes the motion direction, s (m/pixel)represents the scaling factor (i.e., the spatial resolution) on the testing line, and δ (°)represents the main orientation of texture. To overcome the deficiencies under deteriorated light (shadows and glares) and natural seeding conditions (sparse and uneven spatiotemporal distribution), the study attempts to address the issue of MOT detection in the frequency domain according to the auto-registration property of the Fourier transform magnitude spectra F (u,v). This autoregistration depends on the orientation and variation frequency of the image texture, rather than their spatial locations. It makes the FFT-STIV robust to local noises and randomly occurring tracers. The 2-D FFT of STI is noted below: where FFT2 represents the two-dimensional fast Fourier transform, STI(x, y) represents the space-time image, and N represents the size of the space-time image. To avoid "spectrum compression" caused by unequal width and height, the STI is enlarged to the size of N × N pixels via zero-padding. The texture patterns of STI are redistributed in lines passing through the center of its centrosymmetric FTMS, and the main orientation of spectra (MOS) θ is orthogonal to the MOT: where δ(°) represents the main orientation of texture.
To detect the MOS, the F(u,v) is projected to the polar coordinate system where the polar radius r = N/2 pixels and the polar angle ranges from 0° to 180°: To overcome the deficiencies under deteriorated light (shadows and glares) and natural seeding conditions (sparse and uneven spatiotemporal distribution), the study attempts to address the issue of MOT detection in the frequency domain according to the auto-registration property of the Fourier transform magnitude spectra F(u, v). This autoregistration depends on the orientation and variation frequency of the image texture, rather than their spatial locations. It makes the FFT-STIV robust to local noises and randomly occurring tracers. The 2-D FFT of STI is noted below: where FFT2 represents the two-dimensional fast Fourier transform, STI(x, y) represents the space-time image, and N represents the size of the space-time image. To avoid "spectrum compression" caused by unequal width and height, the STI is enlarged to the size of N × N pixels via zero-padding. The texture patterns of STI are re-distributed in lines passing through the center of its centrosymmetric FTMS, and the main orientation of spectra (MOS) θ is orthogonal to the MOT: where δ( • ) represents the main orientation of texture.
To detect the MOS, the F(u, v) is projected to the polar coordinate system where the polar radius r = N/2 pixels and the polar angle ranges from 0 • to 180 • : Micromachines 2022, 13, 1167

of 18
The θ m is determined further by searching and Gaussian fitting the peak value of P(θ): The low frequency background noises are mainly distributed within ±5 • around 0 • and 90 • in the polar projection. Alternative high-pass filtering, or the edge detection procedure, is recommended for background suppression [30] when peak detection of P(θ) is interfered with by a low signal-to-noise ratio.

Velocity Calibration with DSO
Velocity calibration aims to convert the optical flow velocity into the physical velocity and determine the position (starting distance) of the testing line. The essential issue is how to establish the coordinate transformation relation between the object plane and the image plane. The central perspective projection model in photogrammetry provides a theoretical basis for solving such problems. In Figure 2, the object point P(X, Y, Z), projection center C(X C , Y C , Z C ), and image point P(X, Y) are assumed collinear; o and O are the projection points of C(X C , Y C , Z C ) on the image plane and the object plane, respectively. For a perfect pinhole imaging system, the principal point o is at the center of the image, with image plane coordinates (in mm) of: where s (mm) is the pixel size of the image sensor and m × n (pixels) is the image size. The image plane coordinates (x, y) are expressed by their image coordinates (i, j) (pixel) as: 13, x FOR PEER REVIEW 6 of 18 where the pitch (ω), roll (φ), and azimuth (κ) angles are defined as the angles used in order to rotate a (X,Y,Z) geodetic coordinate system and align it with the image coordinate system. Pitch is the rotation around the X axis. Roll is the rotation around the Y axis. Azimuth is the rotation around the Z axis.
There are six unknowns (X C ,Y C ,Z C ,φ,ω,κ) to be solved, which are known as the extrinsic parameters of the camera. Existing methods usually utilize at least three non-collinear GCPs to solve the unknowns, and use (8) to transform coordinates and generate ortho-rectified images for subsequent motion estimation. In this study, the model is reasonably simplified considering the fixed camera position. The object coordinate system is established by taking the reference point R on a fixed ground or sea level as the origin. Point F is the pedal point of point C on the measuring plane. The camera's optical axis is adjusted parallel to the cross-section; thus, it can be assumed that (X C ,Y C ) = (0,0) and κ = 0. The remaining three extrinsic parameters (Z C ,φ,ω) are obtained by distance and tilt sensors to realize the DSO-based photogrammetry. To avoid additional errors and computations induced by image ortho-rectification, the velocity calibration is carried out based on (9) instead of (7). For STIV, the velocity on the testing line is denoted as: Considering the object distance ( OC) is much larger than the image distance (oc), the focal length f can be assumed equal to the image distance. The linear geometric relation between image points and object points is established by similar triangles. If the image plane and the object plane coordinate systems are taken as the reference, respectively, the collinear equation can be expressed in the following direct and indirect forms: where f represents the focal length and Z represents the elevation of the object plane.
The above two equations describe the reciprocal transformation relation of two plane coordinates. The direct form projects object point (X, Y, Z) to image point (x, y). The indirect form projects image point (x, y) to object point (X, Y), but the Z point coordinate is required. The rotation matrix formed by nine coefficients in the equations can be expressed by the camera's orientation angles relative to the object's 3D coordinate system: where the pitch (ω), roll (ϕ), and azimuth (κ) angles are defined as the angles used in order to rotate a (X, Y, Z) geodetic coordinate system and align it with the image coordinate system. Pitch is the rotation around the X axis. Roll is the rotation around the Y axis. Azimuth is the rotation around the Z axis. There are six unknowns (X C , Y C , Z C , ϕ, ω, κ) to be solved, which are known as the extrinsic parameters of the camera. Existing methods usually utilize at least three noncollinear GCPs to solve the unknowns, and use (8) to transform coordinates and generate ortho-rectified images for subsequent motion estimation. In this study, the model is reasonably simplified considering the fixed camera position. The object coordinate system is established by taking the reference point R on a fixed ground or sea level as the origin. Point F is the pedal point of point C on the measuring plane. The camera's optical axis is adjusted parallel to the cross-section; thus, it can be assumed that (X C , Y C ) = (0, 0) and κ = 0. The remaining three extrinsic parameters (Z C , ϕ, ω) are obtained by distance and tilt sensors to realize the DSO-based photogrammetry.
To avoid additional errors and computations induced by image ortho-rectification, the velocity calibration is carried out based on (9) instead of (7). For STIV, the velocity on the testing line is denoted as: where X i+v x , j and X i ,j represent the coordinates of the image point in the object plane.
The location (i.e., starting distance of the cross-section) of the testing line is denoted as: where D C represents the Y-axis projection distance of the point F relative to the reference point. Note that the scaling factor s in (1) is not directly solved by the above velocity calibration process. On the contrary, according to its definition, it can be derived using the physical distances between two adjacent pixels.

Measuring Device
An LDM-supported measuring device ( Figure 3) was designed for the DSO-based FFT-STIV. A USB 3.0 industrial camera (HIKVISION MV-CA013-20UM, made by Hikvision, in Hangzhou, China) was used for image capture. It had a monochrome CMOS sensor with an image size of 1280 × 1024 pixels and a pixel size of 4.8 µm. The frame rate at full resolution was up to 170 fps. In this study, a C-mount lens with a focal length of 8 mm was installed on the camera. The LDM (SNDWAY SW-Q120, made by SNDWAY, in Dongguan, China) was equipped with an embedded dual-axis (pitch and roll) tilt sensor. The measuring ranges of pitch and roll angles were both ±90 • with precision up to 0.1 • . The measuring range and precision of distance can reach 120 m and 3 mm, respectively. The camera and LDM were rigidly mounted on a platform (a quick release shoe of panhead) with parallel optical axes. Their geometric relation was assumed to be invariant. The major error sources of the measuring device included the nonlinear distortion aberration of the lens, as well as the eccentric distances and eccentric angles between the camera and the LDM. To reduce their effect, the corresponding calibration methods are discussed below. calibration process. On the contrary, according to its definition, it can be derived using the physical distances between two adjacent pixels.

Measuring Device
An LDM-supported measuring device (Figure 3) was designed for the DSO-based FFT-STIV. A USB 3.0 industrial camera (HIKVISION MV-CA013-20UM, made by Hikvision, in Hangzhou, China) was used for image capture. It had a monochrome CMOS sensor with an image size of 1280 × 1024 pixels and a pixel size of 4.8 μm. The frame rate at full resolution was up to 170 fps. In this study, a C-mount lens with a focal length of 8 mm was installed on the camera. The LDM (SNDWAY SW-Q120, made by SNDWAY, in Dongguan, China) was equipped with an embedded dual-axis (pitch and roll) tilt sensor. The measuring ranges of pitch and roll angles were both ±90° with precision up to 0.1°. The measuring range and precision of distance can reach 120 m and 3 mm, respectively. The camera and LDM were rigidly mounted on a platform (a quick release shoe of panhead) with parallel optical axes. Their geometric relation was assumed to be invariant. The major error sources of the measuring device included the nonlinear distortion aberration of the lens, as well as the eccentric distances and eccentric angles between the camera and the LDM. To reduce their effect, the corresponding calibration methods are discussed below.

Calibration of Distortion Aberration
Due to the complexity of lens design and manufacture, the real imaging system cannot strictly satisfy the central perspective projection model. This lens distortion deviates the actual image position from that given by the central perspective projection model, especially in the use of wide-angle lenses. In high-precision photogrammetry, a nonlinear imaging model considering distortion aberration should be applied for image correction. Furthermore, the focal length should use the calibrated value rather than the nominal one. In this study, both intrinsic parameters and distortion aberration are calibrated in the laboratory with the planar chessboard method provided by the camera calibration toolbox from OpenCV. A chessboard with 18 × 12 square grids was designed and the side lengths of each grid were 60 mm. A total of nine images taken under different shooting angles

Calibration of Distortion Aberration
Due to the complexity of lens design and manufacture, the real imaging system cannot strictly satisfy the central perspective projection model. This lens distortion deviates the actual image position from that given by the central perspective projection model, especially in the use of wide-angle lenses. In high-precision photogrammetry, a nonlinear imaging model considering distortion aberration should be applied for image correction. Furthermore, the focal length should use the calibrated value rather than the nominal one. In this study, both intrinsic parameters and distortion aberration are calibrated in the laboratory with the planar chessboard method provided by the camera calibration toolbox from OpenCV. A chessboard with 18 × 12 square grids was designed and the side lengths of each grid were 60 mm. A total of nine images taken under different shooting angles were captured for calibration ( Figure 4). The intrinsic parameter matrix K and the distortion parameter matrix D were calculated as follows: where (C x , C y represents the principal point of the distorted image and f x and f y represent the focal lengths in pixels on the x and y directions, with the mean value converted to the actual focal length f in mm via the pixel size: Micromachines 2022, 13, x FOR PEER REVIEW 8 of 18 were captured for calibration ( Figure 4). The intrinsic parameter matrix K and the distortion parameter matrix D were calculated as follows: where (C x ,C y ) represents the principal point of the distorted image and f x and f y represent the focal lengths in pixels on the x and y directions, with the mean value converted to the actual focal length f in mm via the pixel size: k 1 and k 2 represent the radial lens distortion parameters and p 1 and p 2 represent the tangential lens distortion parameters. Figure 5 indicates that all the mean reprojection errors of the nine images are below 0.5 pixels, and the overall mean error is 0.38 pixels.   k 1 and k 2 represent the radial lens distortion parameters and p 1 and p 2 represent the tangential lens distortion parameters. Figure 5 indicates that all the mean reprojection errors of the nine images are below 0.5 pixels, and the overall mean error is 0.38 pixels.
According to the above results, image correction is performed using the nonlinear imaging model below: where (x , y ) and (x, y) are the distorted and undistorted points in the camera coordinate system, which satisfy the following relationship with their corresponding points (u , v ) and (u, v) in the image coordinate system: The above three equations establish the coordinate transformation relation between distorted and undistorted images.

Calibration of Eccentric Distances
The eccentric distances below are defined as the 3-D distances between the optical centers of the camera (C) and the LDM (L) shown in Figure 3b, which are manually measured with a Vernier caliper according to their marked positions after system integration. The key point that should be taken into account is their effect on the elevation of the camera on the object plane: where d and ω L represent the slope distance and pitch angle of LDM relative to the object plane, respectively. In practice, the d should be measured with LDM by aiming at the water boundary rather than the water surface. The Z S is defined as the height of the measured water surface relative to the reference point R, which is supposed to be zero in this case (i.e., point R coincides with point F). In practice, it can be simultaneously measured with the same camera using image-based methods [31].

Calibration of Eccentric Angles
The eccentric angles (∆κ, ∆ω and ∆ϕ) are defined as the differences in three rotation angles between the camera and the LDM, where ∆κ → 0 , ∆ω, and ∆ϕ are calibrated according to the flow chart shown in Figure 6. The basic principle is to determine the least root-mean-square error (RMSE) of the planar grid size: where E X and E Y represent the total RMSEs of measured grid sizes in X and Y directions, respectively: M and N are the number of interior corners on the chessboard, X i,j ,Y i,j represent their object coordinates, and D X , D Y are the actual sizes of the grid. Considering the system integration precision of the device, the minimum value of E(∆ω, ∆φ) was firstly searched within the ±2° neighborhood of ω L and φ L with a step of 0.1°, then refined with the three-point Gaussian fitting method to make the error of eccentric angles less than 0.1°.
To analyze the sensitivity of the calibration method to the pitch angle (ω), three sets of chessboard images were captured for testing (Figure 7). The chessboard was the same as the one used in Figure 4, which was placed horizontally on the ground. The measuring device was installed on a tripod and adjusted to be level according to the bubble. Table 1 indicates that the elevation bias caused by the eccentric distances ∆Z and ∆Y was within 4 mm in this case. It is negligible when it is much less than d. Table 2 compares the measurement errors with and without the calibration of eccentric angles. In the set which directly used ω L and φ L for measurements, the RMSE decreased significantly with the increase in pitch angle. This was caused by a reduction in image resolution due to image perspective distortion, especially in the far-field. In view of this, the chessboard should be placed in the near-field of the image when the calibration is performed at a small pitch angle such as in Figure 7a. The RMSEs calculated with eccentric angle calibration decrease to a similar level with a difference of less than 0.05 mm. This is attributed to the global adjustment effect of the least square method. Note that the RMSEs include the errors of corner detection and angle measurement. According to the grid size in images, the mean relative error is approximately 2%, which is similar to the mean reprojection error in Figure 5. Additionally, it was found that the differences in eccentric angles preliminarily calibrated in the three sets were all within 0.1°, which agrees with the searching step. In com- M and N are the number of interior corners on the chessboard, X i,j , Y i,j represent their object coordinates, and D X , D Y are the actual sizes of the grid. Considering the system integration precision of the device, the minimum value of E(∆ω, ∆ϕ) was firstly searched within the ±2 • neighborhood of ω L and ϕ L with a step of 0.1 • , then refined with the three-point Gaussian fitting method to make the error of eccentric angles less than 0.1 • .
To analyze the sensitivity of the calibration method to the pitch angle (ω), three sets of chessboard images were captured for testing (Figure 7). The chessboard was the same as the one used in Figure 4, which was placed horizontally on the ground. The measuring device was installed on a tripod and adjusted to be level according to the bubble. Table 1 indicates that the elevation bias caused by the eccentric distances ∆Z and ∆Y was within 4 mm in this case. It is negligible when it is much less than d. Table 2 compares the measurement errors with and without the calibration of eccentric angles. In the set which directly used ω L and ϕ L for measurements, the RMSE decreased significantly with the increase in pitch angle. This was caused by a reduction in image resolution due to image perspective distortion, especially in the far-field. In view of this, the chessboard should be placed in the near-field of the image when the calibration is performed at a small pitch angle such as in Figure 7a. The RMSEs calculated with eccentric angle calibration decrease to a similar level with a difference of less than 0.05 mm. This is attributed to the global adjustment effect of the least square method. Note that the RMSEs include the errors of corner detection and angle measurement. According to the grid size in images, the mean relative error is approximately 2%, which is similar to the mean reprojection error in Figure 5. Additionally, it was found that the differences in eccentric angles preliminarily calibrated in the three sets were all within 0.1 • , which agrees with the searching step. In comparison, the differences decrease to 0.05 • after Gaussian fitting, and the measuring precision is further improved for set one with a small pitch angle. Here, the calibration results of set three with the smallest image perspective distortion are chosen for the following measurement, and the corresponding relative error of 2.03% is taken as the uncertainty of the scaling factor u(S).   Table 2. Calibration results of eccentric angles and camera orientations.

Experiment Settings
An experiment was conducted in a recirculating flume to evaluate the proposed method in small-scale surface velocity field measurement. The 4 m-long flume was simply built using PVC rain gutters, as shown in Figure 8. Its section was trapezoid with the upper width of 160 mm, the lower width of 100 mm and the height of 80 mm. The upstream inlet boundary condition was uniform inflow velocity, the downstream outlet boundary condition was free outflow, and the wall boundary condition was non-slip solid wall. The measuring device was installed 0.2 m upstream of the outlet. Since the parameters of the lens cannot be changed after calibration, the height of device was adjusted to make imaging clear. Note that the imaging model in Figure 2 takes the measuring plane as the reference system, while the angle measurement of LDM is based on the geodetic reference system. This difference should be taken into account when calculating the camera orientations. In the experiment, the angle measurements of LDM were ω L = 59.8° and φ L = −1.2°. The vertical and horizontal slopes of the flume were 1.23° and 0°, respectively. It resulted in ω C = 59.272° and φ C = − 1.927° with eccentric angle and slope compensation. The elevation from the camera to the flume bottom is calculated in Table 3. As the water surface was approximately 8 mm from the flume bottom (i.e., ∆l = 8 mm), the actual camera elevation was 0.541 m according to (20). The size of the effective measurement area shown in Figure 8 was approximately 640 × 480 pixels.    Table 2. Calibration results of eccentric angles and camera orientations.

Experiment Settings
An experiment was conducted in a recirculating flume to evaluate the proposed method in small-scale surface velocity field measurement. The 4 m-long flume was simply built using PVC rain gutters, as shown in Figure 8. Its section was trapezoid with the upper width of 160 mm, the lower width of 100 mm and the height of 80 mm. The upstream inlet boundary condition was uniform inflow velocity, the downstream outlet boundary condition was free outflow, and the wall boundary condition was non-slip solid wall. The measuring device was installed 0.2 m upstream of the outlet. Since the parameters of the lens cannot be changed after calibration, the height of device was adjusted to make imaging clear. Note that the imaging model in Figure 2 takes the measuring plane as the reference system, while the angle measurement of LDM is based on the geodetic reference system. This difference should be taken into account when calculating the camera orientations. In the experiment, the angle measurements of LDM were ω L = 59.8 • and ϕ L = −1.2 • .
The vertical and horizontal slopes of the flume were 1.23 • and 0 • , respectively. It resulted in ω C = 59.272 • and ϕ C = −1.927 • with eccentric angle and slope compensation. The elevation from the camera to the flume bottom is calculated in Table 3. As the water surface was approximately 8 mm from the flume bottom (i.e., ∆l = 8 mm), the actual camera elevation was 0.541 m according to (20). The size of the effective measurement area shown in Figure 8 was approximately 640 × 480 pixels.

Comparison of In Situ and Non-In Situ Calibration
In order to verify whether the calibration results can be transferred into the actual measuring environment, a high-precision alumina calibration board was used for in situ calibration. There were 12 × 9 grids with a side length of 15 mm and an accuracy of 0.01 mm on the board. As its size was 200 × 200 mm, which was larger than the water surface to be measured, it was set on the top of the flume. Considering the thickness of the calibration board (i.e., ∆l = 3 mm), the actual elevation of the camera relative to the board was 0.464 m (Table 4). In comparison with the calibration results of set 3 in Table 2, the differences of ∆ω and ∆φ shown in Table 5 were both within the reasonable error range of 0.1°. Therefore, this proves that the calibration method of this integrated measuring device is practically operable. In addition, the Root Mean Square Error (RMSE) relative to the grid size was 0.73%. Note that this value is approximately 1/3 of the relative error given in Section 3.4. It was found that the mean spatial resolutions of the two cases (0.286 × 0.364 mm/pixel vis-à-vis 0.849 × 1.224 mm/pixel) also approximately met this ratio. As a result, it is considered to be a major error source of measurement alongside the shooting angle.   Table 5. Calibration results of eccentric angles and camera orientation to the calibration board.

Velocity Measurement
The flow rate remained stable during the experiment. For the convenience of flow field observation with human eyes, a small number of sediments and surfactants were seeded to the water to produce distinguishable foam patterns on the free surface. The camera continuously captured images for 100 s with a frame rate of 100 fps. The total 10,000 frames were evenly divided into 40 groups for analysis.
Firstly, motion estimation with FFT-STIV was tested. A total of 50 testing lines with a length of 255 pixels and a space interval of 50 pixels were set across the section of flume in each frame. Figure 9 visualizes the time-averaged optical flow field obtained by group

Comparison of In Situ and Non-In Situ Calibration
In order to verify whether the calibration results can be transferred into the actual measuring environment, a high-precision alumina calibration board was used for in situ calibration. There were 12 × 9 grids with a side length of 15 mm and an accuracy of 0.01 mm on the board. As its size was 200 × 200 mm, which was larger than the water surface to be measured, it was set on the top of the flume. Considering the thickness of the calibration board (i.e., ∆l = 3 mm), the actual elevation of the camera relative to the board was 0.464 m (Table 4). In comparison with the calibration results of set 3 in Table 2, the differences of ∆ω and ∆ϕ shown in Table 5 were both within the reasonable error range of 0.1 • . Therefore, this proves that the calibration method of this integrated measuring device is practically operable. In addition, the Root Mean Square Error (RMSE) relative to the grid size was 0.73%. Note that this value is approximately 1/3 of the relative error given in Section 3.4. It was found that the mean spatial resolutions of the two cases (0.286 × 0.364 mm/pixel vis-à-vis 0.849 × 1.224 mm/pixel) also approximately met this ratio. As a result, it is considered to be a major error source of measurement alongside the shooting angle.   Table 5. Calibration results of eccentric angles and camera orientation to the calibration board.

Without Calibration Without Gaussian Fitting With Gaussian Fitting
Item

Velocity Measurement
The flow rate remained stable during the experiment. For the convenience of flow field observation with human eyes, a small number of sediments and surfactants were seeded to the water to produce distinguishable foam patterns on the free surface. The camera continuously captured images for 100 s with a frame rate of 100 fps. The total 10,000 frames were evenly divided into 40 groups for analysis.
Firstly, motion estimation with FFT-STIV was tested. A total of 50 testing lines with a length of 255 pixels and a space interval of 50 pixels were set across the section of flume in each frame. Figure 9 visualizes the time-averaged optical flow field obtained by group 1, where the black arrow line represents the optical flow velocity v x (pixel/frame) corresponding to each testing line. For easy observation, the arrow lines were enlarged by 10 times, and two frames (18 th and 28 th ) with uniform and clear foam patterns were selected as background images. The central coordinates (x 1 , y 1 ) and (x 2 , y 2 ) of eight foams in the two frames were manually labeled as the corresponding points ( Figure 10). Table 6 provides the displacements d x , d y of the corresponding points and compares their optical flow velocities in the x direction (denoted as u x ) with the FFT-STIV measurements. It shows that there is a good consistency between them. Here, the maximum relative error of 3.24% is taken as the uncertainty of motion estimation u(v), which includes errors in manual labeling. , x FOR PEER REVIEW 13 of 18 times, and two frames (18th and 28th) with uniform and clear foam patterns were selected as background images. The central coordinates x 1 , y 1 and x 2 , y 2 of eight foams in the two frames were manually labeled as the corresponding points ( Figure 10). Table 6 provides the displacements d x , d y of the corresponding points and compares their optical flow velocities in the x direction (denoted as u x ) with the FFT-STIV measurements. It shows that there is a good consistency between them. Here, the maximum relative error of 3.24% is taken as the uncertainty of motion estimation u(v), which includes errors in manual labeling.  Secondly, velocity calibration with DSO was tested. Figure 11 presents the cross-section velocity distribution calculated with a total of 40 groups of calibrated velocity fields. It agreed with the pattern of velocity distribution in trapezoidal open channels [32]; the velocities varied dramatically near two side walls due to the frictional effect, but remained stable after a certain distance from them. In theory, the average velocity at the midstream times, and two frames (18th and 28th) with uniform and clear foam patterns were selected as background images. The central coordinates x 1 , y 1 and x 2 , y 2 of eight foams in the two frames were manually labeled as the corresponding points ( Figure 10). Table 6 provides the displacements d x , d y of the corresponding points and compares their optical flow velocities in the x direction (denoted as u x ) with the FFT-STIV measurements. It shows that there is a good consistency between them. Here, the maximum relative error of 3.24% is taken as the uncertainty of motion estimation u(v), which includes errors in manual labeling.  Secondly, velocity calibration with DSO was tested. Figure 11 presents the cross-section velocity distribution calculated with a total of 40 groups of calibrated velocity fields. It agreed with the pattern of velocity distribution in trapezoidal open channels [32]; the velocities varied dramatically near two side walls due to the frictional effect, but remained stable after a certain distance from them. In theory, the average velocity at the midstream  Secondly, velocity calibration with DSO was tested. Figure 11 presents the crosssection velocity distribution calculated with a total of 40 groups of calibrated velocity fields. It agreed with the pattern of velocity distribution in trapezoidal open channels [32]; the velocities varied dramatically near two side walls due to the frictional effect, but remained stable after a certain distance from them. In theory, the average velocity at the midstream (0.3 m) should be maximum; in this case, the measured velocity was approximately 0.02 m/s lower than those near 0.32 m. This is because the large foams were mainly concentrated in the midstream, which did not follow the surface flow as well as the small ones did. The starting distances of each testing line measured with the DSO method ranged from 0.260 m to 0.361 m. The field of view within 0.26 m was occluded by the right-side wall due to the limited observation angle, resulting an unmeasurable area of 0.012 m (shaded area in Figure 11). The measured distance between the two lower boundaries of the trapezoid section was 0.1 m, which was consistent with the true width of the bottom surface of the flume. The conclusion above verifies that the DSO method is effective for free-surface measurement and velocity calibration.  Figure 11). The measured distance between the two l the trapezoid section was 0.1 m, which was consistent with the true w surface of the flume. The conclusion above verifies that the DSO me free-surface measurement and velocity calibration.

Uncertainty Evaluation
The flow rate remained stable during the experiment. For the c field observation with human eyes, the uncertainty evaluation of real ment has always been a difficulty in PIV related studies. In this expe current meter (PCM) and a surface velocity radar (SVR) were set up parison measurement. The PCM was fixed at the midstream (0.3 m) to

Uncertainty Evaluation
The flow rate remained stable during the experiment. For the convenience of flow field observation with human eyes, the uncertainty evaluation of real flow field measurement has always been a difficulty in PIV related studies. In this experiment, a propeller current meter (PCM) and a surface velocity radar (SVR) were set up to carry out a comparison measurement. The PCM was fixed at the midstream (0.3 m) to measure its velocity on the water surface. The average velocity measured over a period of 100 s was 0.496 m/s. The SVR was set on one side of the flume, approximately 0.65 m above the water surface. Its pitch angle was 52 • and the beam heading angle relative to the flow direction was 5 • . The average velocity measured over a period of 100 s was 0.44 m/s, with a maximum of 0.63 m/s and a minimum of 0.38 m/s. Obviously, the measurements of both instruments were much lower than those of the DSO-based FFT-STIV method. The reasons for this are: (a) the water depth was too shallow for the propeller to completely submerge under the water, which results in low spin number; and (b), the water was clear and shallow enough for the electromagnetic wave to pass through the free surface, enough to reach the bottom and reflect back. The actual measured velocity amounts to the depth-averaged one, rather than the superficial one. Therefore, both instruments have limitations under the experimental conditions and cannot be used as effective references for an uncertainty evaluation.
In view of this, the repeatability precision of the DSO-based FFT-STIV method was assessed by analyzing the 40 groups of velocity measurements in this study, as shown in Table 7. For the testing lines in the bottom inner zone (0.265~0.341 m), the deviation between their mean and median values was within 0.004 m/s. The standard deviation (SD) was less than 0.03 m/s, and the relative standard deviation (RSD) was less than 5%, among which 85% were less than 2%. For each testing line, its 40 measurements were firstly sorted into ascending order according to the absolute value of the RSD. The 38th value was then taken as the relative error with a cumulative frequency of 95%, which ranges from 1.66% to 7.53%. Considering factors such as velocity pulsation, this repeatability precision is reasonable for free-surface velocity measurement. In addition, the cumulative frequency errors (CFEs) are approximately twice the relative standard deviations. It can be assumed that the measurement error obeys the normal distribution law. In contrast, the random coarse error caused by strong flow turbulence and poor tracer conditions in the bottom outer zone can be beyond ±10%. To indicate the relationship between spatial resolution and measurement error, Table 7 also gives the scaling factors of each testing line. Although the spatial resolution increases with the distance of the testing line from the camera, the measurement error does not show an increasing trend. This is because of the global adjustment effect of the least square method which is discussed in Section 3.4. Moreover, S x changes less significantly than S v , indicating that the x direction of the image is less affected by perspective distortion. It can be inferred that the accuracy of velocity calibration is higher than that of the starting distance when the optical axis of the camera is perpendicular to the direction of water flow.
According to (1), the physical velocity V only depends on the optical flow velocity v and scaling factor S. Considering the independence of their errors, here we bring the previously mentioned u(S) and u(v) into the following equation to estimate the combined standard uncertainty of velocity: It can be found that this value is close to the measured maximum RSD of 3.81% (testing line 41) in the bottom inner zone. According to the normal distribution law of measurement error, the extended uncertainty of 95% confidence level is estimated by using two times the combined standard uncertainty: This value is also close to the CFE (7.53%) of testing line 41. This observation implies that the uncertainty synthesis method is feasible for the evaluation of DSO-based FFT-STIV.

Conclusions
The DSO-based STIV is proposed to realize 1-D free-surface velocity measurement under an oblique shooting angle. It also provides a GCPs-free solution to the problem of velocity calibration in planar PIV applications where reference objects cannot be deployed in situ. On the basis of the traditional collinear equation, lens distortion, oblique shooting angle, water level variation, and water surface slope are introduced to build an accurate and reasonably simplified imaging model to describe the coordinate transformation relation between image and object planes. The LDM-supported measuring device is low-cost and easy to construct. It only needs to be calibrated once if the camera's intrinsic parameters and relative position within the LDM remain constant. The comparison of non-in situ and in situ calibration results verify good transferability of the calibration method. The effectiveness of motion estimation with FFT-STIV and velocity calibration with DSO were verified step by step based on the designed flume experiment. Finally, the combined uncertainty of velocity measurement was evaluated, which provided an effective means for error analysis when comparing measurement failures. Overall, the DSO-STIV proved superior to propeller current meters and surface velocity radars in surface velocity measurements in extremely shallow water conditions such as those in this study. It not only has high spatial resolution up to a single pixel, but also has good measuring precision with the standard uncertainty controlled within 5% in small-scale applications. In future work, the DSO-STIV will be applied to natural rivers to evaluate the sensitivity and uncertainty of large-scale river surfaces.