3D Particle Field Reconstruction for Tomographic Particle Image Velocimetry Based on a Single Light-Field Camera: A Survey

Cao, Lixia; Gu, Wei; Tian, Xing

doi:10.3390/pr14132101

Open Access Review

by Lixia Cao *, Wei Gu * and Xing Tian

College of Metrology Measurement and Instrument, China Jiliang University, Hangzhou 310018, China

* Authors to whom correspondence should be addressed.

Processes 2026, 14(13), 2101; https://doi.org/10.3390/pr14132101 (registering DOI)

Submission received: 20 April 2026 / Revised: 24 June 2026 / Accepted: 26 June 2026 / Published: 28 June 2026

(This article belongs to the Section Particle Processes)

Abstract

Three-dimensional (3D) particle field reconstruction is a core procedure of tomographic particle image velocimetry (Tomo-PIV). Its accuracy and efficiency directly determine the ability of the PIV system to characterize complex flow fields. Compared with traditional multicamera Tomo-PIV, a single light-field camera offers a compact layout, simple calibration, and strong adaptability, making it widely applicable for 3D flow measurement in confined spaces. This paper systematically reviews recent advances in 3D particle field reconstruction algorithms that use a single light-field camera, including both traditional iterative reconstruction methods and deep learning techniques. First, the imaging mechanism of different light-field cameras, the fundamental theory of light-field Tomo-PIV, and the mathematical foundation of tomographic reconstruction are elaborated to establish a theoretical framework for subsequent algorithm analysis. Next, the advantages, disadvantages, and limitations of traditional iterative reconstruction methods and deep learning techniques are analyzed in terms of reconstruction quality, computational efficiency, inherent defects such as particle elongation and ghost particles, and applicable scenarios. On this basis, the current technical bottlenecks are summarized, including low computational efficiency under high particle concentration, insufficient research on velocity uncertainty quantification, domain mismatch between simulated and experimental datasets, and poor interpretability of deep learning models. Finally, several promising future research directions are discussed, such as the optimization of multiframe correlation-based high-precision reconstruction algorithms, the development of standardized open-source datasets, the interpretability of deep neural networks, and time-resolved flow measurement. This study aims to provide a comprehensive algorithmic reference for researchers in the field and to facilitate the practical application of light-field Tomo-PIV in engineering fluid mechanics and related disciplines.

Keywords:

particle image velocimetry; iterative reconstruction; deep learning; light field

1. Introduction

Three-dimensional (3D) flow measurement in confined spaces plays a key role in aerospace propulsion, internal combustion engines, microfluidics, and other industrial fields. However, owing to the limited space size, insufficient optical windows, complex internal obstacles, and the difficulty of synchronization and calibration among multiple cameras, 3D particle image velocimetry (PIV) systems based on traditional multicamera setups are difficult to arrange in practical applications. In comparison, a single camera configuration is more flexible and less intrusive to the flow field, which can effectively reduce the complexity of system installation, synchronous triggering, and parameter matching [1]. Unlike a traditional camera that only records the light intensity and position of the light field, a light-field camera integrates a microlens array (MLA) in front of a charge-coupled device (CCD) sensor, which splits the scattered light from tracer particles into different directions and forms corresponding images on the sensor [2]. Therefore, a single light-field camera can simultaneously capture the intensity, position, and direction information of scattered light in a single exposure without requiring multiangle acquisition or multicamera stitching [3]. This unique advantage enables it to replace a multicamera system for efficient 3D PIV [1]. In the past decade, with the rapid development of microlens fabrication, sensor technology, and precision manufacturing, tomographic PIV (Tomo-PIV) based on a single light-field camera (light-field Tomo-PIV) has become increasingly mature and has been widely adopted in various flow measurement applications [4,5,6,7,8,9,10,11,12].

The basic principles of light-field Tomo-PIV include the calibration of the light-field camera, calculation of the weight matrix, tomographic reconstruction, implementation of a 3D cross-correlation algorithm, and subsequent processing of the flow field [5]. Calibration methods have been developed for both focused light-field cameras and standard plenoptic cameras. Chen L. et al. proposed a calibration method for focused light-field cameras, such as the Raytrix series [13]. First, a circle detection method based on the circular Hough transform is used to accurately obtain the center point of each subimage in the white raw image. On this basis, the subimages in the checkerboard image are classified, and the image corners are detected to determine the image corner clusters corresponding to the checkerboard corners. Then, the center point of the full disc features is taken as the projection point of the checkerboard corner points on the light-field image, and the camera pose is calculated. Finally, the Levenberg-Marquardt method is used to solve the calibration model. This calibration method can obtain high-precision optical parameters of Raytrix series light-field cameras. However, when there are two or more different media in the object space, this calibration technique introduces errors into the calculation of the weight matrix. A volumetric calibration method for the standard plenoptic camera was proposed by Hall E. M. in 2018 [14]. In that work, the spatial mapping relationship among the MLA coordinates (s, t), the known 3D calibration point coordinates (X, Y, Z), and the corresponding calibration point coordinates in different perspective views (u, v) is established using a third-order polynomial mapping function. Thus, the coordinates of any point in the object space can be related to the coordinates of the MLA and the pixel coordinates of the CCD sensor. 
This calibration method corrects the inaccuracies arising from inconsistent media in the object space, real-world distortions of the main lens and MLA, and thin-lens assumptions. Moreover, it does not require calibration of the optical parameters of the standard plenoptic camera. Shi et al. proposed a calibration model based on Gaussian optics in 2019 [15]. Similarly, their method determines the relationship between a voxel and its corresponding microlenses and pixels, as well as the center and diameter of the circle of confusion formed on the MLA. The distortions introduced by the main lens and the misalignments between the MLA and the CCD sensor were considered. In 2020, Shi et al. further proposed a flexible calibration method for the standard plenoptic camera using plenoptic disk features [16]. These ‘plenoptic disk features’ are extracted directly from the raw light-field image, and a centroid algorithm is applied to identify the point-like features linked to a point in 3D space. This calibration method does not require the intermediate processing steps of generating sub-aperture images or detecting features on those images. Zhu et al. proposed an iterative adjustment strategy for the polynomial model [17]. This method first initializes from a target-based polynomial mapping and then iteratively optimizes it using the residuals of particle sub-aperture images collected from experiments. Rather than relying on a one-shot fit, this calibration method updates the mapping of the entire volume using the relative displacement measured in real particle images. During the model adjustment process, the angular deviation related to different sub-apertures of the main lens is explicitly compensated. To ensure stability, spatial angular coupling and lightweight regularization were applied to suppress overfitting while preserving the inherent nonlinear mapping characteristics of low-frequency imaging. 
This is currently the most advanced calibration technique.

There are five ways to calculate the weight matrix in light-field Tomo-PIV. In Fahringer’s method [18], the model assumes that each light ray originating from the voxel center possesses a finite cross-section equivalent to one microlens pitch. For an orthogonal MLA, any ray from the voxel center through the main lens is invariably captured by four adjacent microlenses and the 16 underlying pixels. Consequently, each light ray’s projection is divided into 4 rectangles on the microlens plane and then into 16 rectangles on the pixel plane by the corresponding 16 pixels. The area of each rectangle determines its contribution coefficient. Owing to the underlying geometric relationship, a linear interpolation scheme can be readily applied to compute the areas of these rectangles. However, the applicability of this linear method is restricted to orthogonal MLA configurations. Additionally, the discretized voxel pitch in the measurement volume should match the microlens pitch to preserve this special geometric relationship. The calculation of the weight matrix requires tens of hours, even with parallel computation [19]. In Shi’s method [20], dense light rays emitted from a given voxel are first traced onto the MLA plane and then onto the CCD sensor. The weight coefficient is given by multiplying the overlap between the dense light rays and the MLA (w₁) by the overlap between the light rays and the pixels (w₂). This method is independent of the MLA geometry but suffers from a high computational cost. In Cao’s method [7], a backward ray-tracing technique traces light rays from the pixel center to the MLA, the main lens, and the measurement volume in the flow field. The contribution of a voxel to a pixel depends on the distance between the voxel and the pixel’s line of sight, modeled via a Gaussian function. 
In practice, the scattered light emitted by particles in the flow field propagates sequentially through a gaseous or liquid medium, air, and the optical system before reaching the CCD sensor. The three methods above for calculating the weight matrix assume that the scattered light from particles in the object space propagates directly through air and the optical system to the CCD sensor. This light transmission model does not account for refraction at the interfaces between media with different refractive indices in the object space. Zhu and Wu proposed a method for calculating the weight matrix that overcomes this problem [8]. First, the mapping relationship between a spatial object point (X, Y, Z) and the MLA coordinates (s, t) is established through Hall E. M.’s volumetric calibration method, and then the weight coefficients of the discrete voxels in the flow field are calculated using a forward ray-tracing technique. However, this calibration method is not applicable when the flow field is inside a cylindrical pipeline or when the calibration plate cannot be placed in the measurement volume. To overcome this limitation, Zhu presented an equivalent ray-tracing method for the calculation of the weight coefficients [21]. A light-field snapshot of a smart calibration board establishes a mapping relationship that links the target points sampled in the flow field to their equivalent points in the air. The weight coefficients are then obtained by the ray-tracing method, where the starting points of ray tracing are shifted from the target points to their equivalent points.

Regarding the cross-correlation algorithm, Zhu et al. presented an approach for determining the optimal parameters for cross-correlation calculation in light-field Tomo-PIV [22]. The selection criterion for the interrogation window size was studied based on an analysis of the valid detection probability of the correlation peak. 
The optimal seeding concentration and the size of tracer particles were explored through synthetic Gaussian vortex field reconstruction. The optimized parameters were employed in a cylinder wake flow measurement in a confined channel. However, they did not study how the reconstruction accuracy of the 3D particle field in light-field Tomo-PIV affects the accuracy of the cross-correlation algorithm. Because research on time-resolved light-field Tomo-PIV is scarce, subsequent processing of the velocity field is also severely limited.
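The Gaussian line-of-sight weighting mentioned above for the backward ray-tracing method can be illustrated with a minimal sketch. It is not the cited authors' implementation: the line of sight is assumed to be already available as a point and a direction in object space, and the function names, the Gaussian width `sigma`, and the truncation factor `cutoff` are hypothetical choices made only for illustration.

```python
import math

def point_to_line_distance(voxel, origin, direction):
    """Perpendicular distance from a voxel centre to a pixel's line of sight."""
    # Vector from a point on the line of sight to the voxel centre.
    v = [voxel[k] - origin[k] for k in range(3)]
    norm = math.sqrt(sum(c * c for c in direction))
    u = [c / norm for c in direction]  # unit direction of the line of sight
    # Subtract the component along the line; what remains is perpendicular.
    along = sum(v[k] * u[k] for k in range(3))
    perp = [v[k] - along * u[k] for k in range(3)]
    return math.sqrt(sum(c * c for c in perp))

def gaussian_weight(voxel, origin, direction, sigma=0.5, cutoff=3.0):
    """Weight of one voxel for one pixel (sigma and cutoff are assumed values)."""
    d = point_to_line_distance(voxel, origin, direction)
    if d > cutoff * sigma:
        return 0.0  # too far from the line of sight to contribute
    return math.exp(-d * d / (2.0 * sigma * sigma))

# Hypothetical line of sight along the Z-axis, coordinates in voxel units.
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
w_on = gaussian_weight((0.0, 0.0, 5.0), origin, direction)    # on the line
w_near = gaussian_weight((0.5, 0.0, 5.0), origin, direction)  # one sigma away
w_far = gaussian_weight((2.0, 0.0, 5.0), origin, direction)   # beyond the cutoff
```

Truncating the Gaussian keeps the weight matrix sparse, which matters because each pixel only interacts with the few voxels near its line of sight.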

Among these steps, accurate and efficient reconstruction of the 3D particle field is critical for obtaining a high-precision velocity field efficiently [23]. In light-field Tomo-PIV, the reconstruction algorithms for the 3D particle field are mainly divided into traditional iterative reconstruction algorithms and deep learning techniques. Traditional iterative reconstruction algorithms focus on iterative optimization, with typical examples including the multiplicative algebraic reconstruction technique (MART) [4], dense ray-tracing-based MART (DRT-MART) [6], expectation maximization (EM) [1], and the prerecognition-based simultaneous algebraic reconstruction technique (Pre-SART) [8]. The development and application of iterative reconstruction algorithms for 3D particle fields based on light-field imaging can be traced back to 2012, when Lynch K. et al. first proposed the framework of light-field Tomo-PIV [4]. They used MART to reconstruct the volume field of an Oseen vortex ring, laying the foundation for the engineering application of this technique. In 2016, Shi et al. focused on the factors influencing particle reconstruction quality and studied the effects of key parameters such as the pixel-to-microlens ratio (PMR), microlens geometry, number of reconstruction iterations, relaxation factor, and voxel-to-pixel ratio on the final particle reconstruction quality [24]. In 2017, Shi et al. further improved the traditional MART algorithm and proposed DRT-MART for light-field Tomo-PIV [6]. Similar to the multiplicative line-of-sight (MLOS) technique used in traditional Tomo-PIV, the pixel positions corresponding to each voxel are determined through the dense ray-tracing technique, and the intensities of the pixels corresponding to each voxel are then multiplied. When the product exceeds a certain threshold, the corresponding voxel is treated as a nonzero voxel. 
Finally, the MART algorithm is used to calculate the 3D particle field distribution of these nonzero voxels. In 2019, Cao et al. proposed the EM algorithm for 3D particle field reconstruction, and they also studied the influence of the optical parameters of the light-field camera on the reconstruction quality of EM [1]. Zhu et al. proposed the prerecognition-based SART algorithm for 3D particle field reconstruction in 2021 [8]. During the prerecognition process, a feature index was introduced to optimize the filtering logic of the particle signals, effectively improving the accuracy of the prerecognition. Afterward, the SART algorithm was used to reconstruct the 3D particle field from the light-field image. Owing to the low depth resolution of the light-field camera, the 3D particle field reconstructed by traditional iterative reconstruction algorithms is elongated along the depth direction (along the Z-axis) [25]. To address this, dual light-field cameras placed perpendicular to each other have been used to capture tracer particle images in light-field Tomo-PIV [26,27,28,29]. The dual-light-field-camera system provides higher depth resolution than a single light-field camera, so the elongation artifact in the reconstructed 3D particle field is reduced. However, the traditional iterative reconstruction algorithms for 3D particle field reconstruction are computationally intensive and typically require from more than 15 min to tens of hours [30,31]. The reconstruction time of a dual-light-field-camera system is longer than that of a single light-field camera. In recent years, owing to the rapid development of deep learning and machine learning, 3D particle field reconstruction based on deep learning has been developed. 
These algorithms can alleviate the elongation of the reconstructed 3D particle field commonly observed in traditional iterative reconstruction methods and significantly improve reconstruction efficiency by leveraging the feature-learning capability of neural networks. However, research on deep learning-based reconstruction of the 3D particle field for light-field Tomo-PIV is currently limited. In 2024, Cao et al. proposed a technique that combines digital refocusing with a 3D U-Net for the reconstruction of the 3D particle field in light-field 3D PIV [32]. Their method can significantly alleviate the elongation of reconstructed 3D particle fields along the depth direction. Moreover, the 3D U-Net network model can improve the reconstruction efficiency of the 3D particle field. Zhu et al. proposed the light-field supervised deep neural network (LF-DNN) method in 2024 for reconstructing a 3D particle field from a light-field image [33]. LF-DNN combines a residual neural network structure and a new hybrid loss function. Their research revealed that LF-DNN outperforms MART and prerecognition MART (PR-MART) in terms of reconstruction quality, mitigation of delay effects, and noise tolerance. LF-DNN also improves the reconstruction efficiency, which is 9.6 times and 7.1 times greater than that of MART and PR-MART, respectively.
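The two-stage idea described above, an MLOS-style multiplication that selects nonzero voxels followed by MART on those voxels, can be sketched on a toy problem. This is an illustrative sketch only, not the DRT-MART code of the cited work: the weight matrix `W` is a small hand-written list of rows rather than the output of dense ray tracing, and the function names, the threshold, the relaxation factor `mu`, and the iteration count are all assumed values.

```python
def mlos_mask(W, P, threshold=0.0):
    """Flag voxels whose contributing pixel intensities have a product above threshold."""
    m, n = len(W), len(W[0])
    mask = []
    for j in range(n):
        prod, seen = 1.0, False
        for i in range(m):
            if W[i][j] > 0.0:      # pixel i sees voxel j
                prod *= P[i]
                seen = True
        mask.append(seen and prod > threshold)
    return mask

def mart(W, P, mask, iterations=50, mu=1.0):
    """Multiplicative update E_j <- E_j * (P_i / (W_i . E)) ** (mu * W_ij)."""
    m, n = len(W), len(W[0])
    # Only voxels kept by the mask start with a nonzero intensity.
    E = [1.0 if mask[j] else 0.0 for j in range(n)]
    for _ in range(iterations):
        for i in range(m):
            proj = sum(W[i][j] * E[j] for j in range(n))
            if proj <= 0.0:
                continue  # no active voxel on this line of sight
            ratio = P[i] / proj
            for j in range(n):
                if W[i][j] > 0.0 and E[j] > 0.0:
                    E[j] *= ratio ** (mu * W[i][j])
    return E

# Toy system: 4 pixels, 3 voxels, voxel 1 is empty in the true field.
W = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
]
E_true = [2.0, 0.0, 1.0]
P = [sum(W[i][j] * E_true[j] for j in range(3)) for i in range(4)]
mask = mlos_mask(W, P)
E = mart(W, P, mask)
```

In this toy case the pixel that sees only voxel 1 records zero intensity, so the product for voxel 1 vanishes and it is excluded before MART runs. Restricting the multiplicative update to the surviving voxels is what reduces the cost of the iterative stage.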

This paper aims to review the recent progress in 3D particle field reconstruction algorithms for light-field Tomo-PIV, including both traditional iterative reconstruction methods and deep learning techniques. First, the fundamental principles of the light-field camera and light-field Tomo-PIV are briefly introduced. Afterward, the working mechanisms of both conventional mainstream 3D particle field reconstruction methods and deep learning-based reconstruction algorithms are elaborated. Next, the advantages, disadvantages, limitations, and technical bottlenecks of traditional iterative reconstruction methods and deep learning techniques are analyzed. Finally, the current challenges in 3D particle field reconstruction for light-field Tomo-PIV are analyzed, and potential future research directions and development trends in this field are discussed. Existing reviews related to light-field flow measurement only partially discuss isolated reconstruction algorithms without unified comparison and systematic integration. Unlike previous fragmented reviews, this work establishes a comprehensive comparative analysis framework for major 3D particle field reconstruction methods, which constitutes its core contribution. By systematically synthesizing fragmented research findings in this field, this review provides a solid theoretical and technical foundation for accelerating the practical application of light-field Tomo-PIV in fluid mechanics and cross-disciplinary flow measurement.

2. Light-Field Imaging and Light-Field Tomo-PIV

2.1. Principles of the Light-Field Imaging

2.1.1. Structure of the Light-Field Camera

The light-field imaging system used in the light-field Tomo-PIV discussed in this paper is MLA-based. Such a system combines optical components, such as an MLA, with a single traditional camera. An MLA is an optical component consisting of numerous tiny lenses arranged in an array. The size of a single lens in an MLA can be as small as a few hundred microns, with a focal length of tens of microns to a few millimeters and a thickness of approximately 1–2 mm. In 1992, Adelson E. H. designed a plenoptic camera in which light from the object space is imaged onto the MLA surface through a main lens [34]. The image on the back focal plane of the MLA is transferred and imaged onto the CCD sensor through a relay lens. However, the introduction of the relay lens causes severe vignetting in the light-field image. In 2005, Ng R. et al. simplified the design of the plenoptic camera by removing the relay lens [35]. An MLA is installed in front of the traditional camera’s CCD sensor so that the CCD sensor lies at the back focal plane of the MLA. However, the spatial resolution of this standard plenoptic camera is low. In response to this issue, Lumsdaine A. and Georgiev T. proposed a focused light-field camera in 2008 [36]. Unlike in the standard plenoptic camera, the distance between the MLA and the CCD sensor is not equal to the focal length of the MLA but is smaller or larger than it. A certain plane in the object space is first imaged by the main lens onto a virtual image plane (VIP) at a certain distance behind the main lens. The VIP is then reimaged on the CCD sensor through the MLA. This secondary imaging method has higher spatial resolution than the standard plenoptic camera. Thurow B. W.’s team used customized optical accessories to accurately assemble the MLA at a specific distance in front of a traditional camera’s CCD sensor, fixing the MLA between the camera’s CCD sensor and the main lens [37]. This light-field camera is used for flow velocity field measurement and combustion diagnostics. Shi et al. also accurately fixed an MLA at a fixed distance in front of a traditional camera’s CCD sensor and used the resulting light-field camera to measure the flow velocity field [24]. Thurow B. W.’s team also assembled a cage-type high-speed light-field camera, coupled two main lenses together to form a relay system, and installed an MLA in front of the relay system [38]. Afterwards, another lens was installed in front of the MLA, and the light-field camera was used to measure flow velocity fields and perform combustion diagnostics.

2.1.2. Conjugate Relationship of the Light-Field Camera

The schematics of the light-field cameras are shown in Figure 1, which illustrates the different optical configurations of light-field cameras used in light-field Tomo-PIV. The optical geometry and conjugate relationships shown in the figure are the fundamental basis for the ray-tracing technique and for tomographic reconstruction. All geometric symbols in the subfigures are consistent and correspond to Equations (1)–(3). The standard plenoptic camera is shown in Figure 1a. From Figure 1a, the conjugate relationship can be expressed as follows:

\begin{cases} \frac{1}{l_{1}} + \frac{1}{l_{m}} = \frac{1}{f} \\ \frac{1}{l_{m}} + \frac{1}{d_{2}} = \frac{1}{f_{m}} \end{cases}

(1)

where l₁ is the distance between the object plane and the main lens plane, l_m is the distance between the main lens and the MLA plane, f is the focal length of the main lens, f_m is the focal length of the MLA, and d₂ is the distance between the MLA and the CCD sensor.

The focused light-field cameras in Keplerian mode and Galilean mode are shown in Figure 1b,c, respectively. In Keplerian mode, the distance between the MLA and the CCD sensor is greater than the focal length of the MLA. Therefore, the VIP is located between the main lens and the MLA. In Galilean mode, the distance between the MLA and the CCD sensor is less than the focal length of the MLA. Therefore, the VIP is located behind the CCD sensor. From Figure 1b,c, the conjugate relationships can be expressed as follows:

\begin{cases} \frac{1}{l_{1}} + \frac{1}{l_{2}} = \frac{1}{f} \\ \frac{1}{d_{1}} + \frac{1}{d_{2}} = \frac{1}{f_{m}} \quad \text{(Keplerian)} \\ \frac{1}{d_{2}} - \frac{1}{d_{1}} = \frac{1}{f_{m}} \quad \text{(Galilean)} \end{cases}

(2)

where l₂ is the distance between the main lens and the VIP, and d₁ is the distance between the VIP and the MLA plane.
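As a numerical illustration of these conjugate relationships, the sketch below evaluates the main-lens relation and the VIP-to-MLA distance d₁ for both modes. The function names and all numerical values are hypothetical. For the Galilean mode the sketch treats the VIP as a virtual object behind the sensor and therefore uses 1/d₂ − 1/d₁ = 1/f_m, the sign arrangement that gives d₁ > d₂ as that geometry requires and that matches the last relation of Equation (3); this sign choice is an assumption of the sketch.

```python
def image_distance(object_distance, focal_length):
    """Thin-lens conjugate relation: 1/object + 1/image = 1/focal."""
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

def vip_to_mla_distance(d2, f_m):
    """Return (mode, d1) for a focused light-field camera.

    d2  : distance between the MLA and the sensor
    f_m : focal length of the microlenses
    """
    if d2 > f_m:
        # Keplerian: the VIP is a real object in front of the MLA.
        return "Keplerian", 1.0 / (1.0 / f_m - 1.0 / d2)
    if d2 < f_m:
        # Galilean: the VIP is a virtual object behind the sensor (assumed sign).
        return "Galilean", 1.0 / (1.0 / d2 - 1.0 / f_m)
    raise ValueError("d2 == f_m corresponds to the standard plenoptic camera")

# Hypothetical values in millimetres, chosen only to give round numbers.
l2 = image_distance(object_distance=300.0, focal_length=100.0)  # main lens to VIP
mode_k, d1_k = vip_to_mla_distance(d2=3.0, f_m=2.0)
mode_g, d1_g = vip_to_mla_distance(d2=1.5, f_m=2.0)
```

With these inputs the main lens places the VIP 150 mm behind it, and both modes give d₁ = 6 mm, which is larger than d₂ in the Galilean case, consistent with a VIP located behind the sensor.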

Figure 1d shows the Raytrix R29. Raytrix uses microlenses with three different focal lengths in the MLA to improve the depth resolution of the focused light-field camera. Therefore, the Raytrix R29 has three different VIPs and three virtual object planes (VOPs). From Figure 1d, the conjugate relationships can be expressed as follows:

\begin{cases} \frac{1}{l_{v1}} + \frac{1}{l'_{v1}} = \frac{1}{f} \\ \frac{1}{l_{v2}} + \frac{1}{l'_{v2}} = \frac{1}{f} \\ \frac{1}{l_{v3}} + \frac{1}{l'_{v3}} = \frac{1}{f} \\ \frac{1}{d_{2}} - \frac{1}{l'_{v2} - l_{m}} = \frac{1}{f_{m2}} \end{cases}

(3)

where l_v1, l_v2, and l_v3 are the distances between the main lens and the VOP1, VOP2, and VOP3, respectively, and l′_v1, l′_v2, and l′_v3 are the distances between the main lens and the VIP1, VIP2, and VIP3, respectively.

In light-field Tomo-PIV, a standard plenoptic camera is typically used to capture light-field images of tracer particles, as in the work of Thurow B. W., Shi, and Zhu. Cao explored the reconstruction quality of 3D particle fields using a focused light-field camera. That research showed that the inverse magnification of the microlenses (M_m) has a significant effect on the reconstruction quality of 3D particle fields. A large |M_m| is beneficial for improving the quality of tomographic reconstruction and expanding the measurement volume along the Z-axis.

2.1.3. F-Number Matching of the Light-Field Camera

Before capturing the light-field image of the tracer particles, it is important to consider the F-number matching relationship of the light-field camera. In both the standard plenoptic camera and the focused light-field camera, the light collected on the CCD sensor originates from the rays gathered by the main lens. A series of subimages under each microlens are produced, as shown in Figure 2. Figure 2 illustrates the distribution of subimages formed by each microlens in the light-field camera. The generation, overlapping, and coverage of these subimages are determined by the optical geometry and F-number matching rules. They provide the fundamental raw data for the ray-tracing technique and subsequent 3D particle field reconstruction in light-field Tomo-PIV. To make full use of the pixels of the CCD sensor, the F-number of the main lens should be adjusted by changing the aperture size of the main lens so that every neighboring subimage is tangent (i.e., F-number matching).

The F-number matching principles of different MLA-based light-field cameras are shown in Figure 3. Reasonable F-number matching ensures effective collection of optical information and avoids subimage overlapping or information loss. This is a key precondition for guaranteeing ray-tracing accuracy and reliable 3D particle field reconstruction in light-field Tomo-PIV. In the light-field camera, the main lens can be regarded as many discrete point light sources. The F-number matching principle of the standard plenoptic camera is shown in Figure 3a. The main lens and the CCD sensor satisfy the conjugate relation with respect to the MLA (the second relation in Equation (1)). The edge point of the subimage comes from the edge of the main lens pupil. To ensure that every neighboring subimage is tangent, the edges of adjacent subimages should be at the same point. The edge point D in the subimage is formed by light from point sources A and B on the main lens pupil passing through adjacent microlenses. Thus, the F-number matching relationship can be calculated from the triangle relation enclosed by the yellow lines:

\frac{P_{m}}{d_{2}} = \frac{P_{l}}{l_{m} + d_{2}} \approx \frac{P_{l}}{l_{m}} \quad (l_{m} \gg d_{2})

(4)

where P_m is the pitch of each microlens in the MLA, and P_l is the aperture diameter of the main lens.
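Equation (4) can be checked numerically with a short sketch that solves it for P_l, both exactly and with the l_m ≫ d₂ approximation. Here P_l is interpreted as the main-lens aperture size seen from the MLA; the function name and the numerical values are hypothetical and serve only to show how small the approximation error is when l_m is much larger than d₂.

```python
def matched_main_lens_aperture(p_m, d2, l_m, approximate=False):
    """Solve Equation (4) for P_l so that neighboring subimages are tangent.

    p_m : microlens pitch
    d2  : distance between the MLA and the sensor
    l_m : distance between the main lens and the MLA
    """
    if approximate:
        # P_m / d2 ~= P_l / l_m, valid when l_m >> d2.
        return p_m * l_m / d2
    # Exact triangle relation: P_m / d2 = P_l / (l_m + d2).
    return p_m * (l_m + d2) / d2

# Hypothetical values in millimetres.
p_l_exact = matched_main_lens_aperture(p_m=0.1, d2=0.5, l_m=100.0)
p_l_approx = matched_main_lens_aperture(p_m=0.1, d2=0.5, l_m=100.0, approximate=True)
relative_error = (p_l_exact - p_l_approx) / p_l_exact
```

For these assumed values the two results differ by about half a percent, which is why the approximate form is commonly sufficient for choosing the main-lens aperture setting.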

Focused light-field cameras, which operate in Keplerian and Galilean modes, are usually matched with the same F-number condition as a standard plenoptic camera [39]. Under that condition, however, neighboring subimages still overlap. The overlapping region cannot be distinguished, so the number of pixels in the subimages that can be used for ray tracing is reduced. To solve this problem, the F-number matching condition of the focused light-field camera is rederived. Figure 3b,c illustrate the F-number matching principles of the focused light-field cameras. The MLA maps the main lens onto a plane that is in front of and behind the CCD sensor for the Keplerian and Galilean modes, respectively. The main lens and the CCD sensor do not satisfy the conjugate relationship. Therefore, points A and B on the main lens are defocused onto the CCD sensor, leading to a blurred image with a diameter of c rather than a point. For the Keplerian mode, the diameter c can be calculated by the following equation:

c = \frac{P_{m} \left[ l_{m} d_{2} - f_{m2} (l_{m} + d_{2}) \right]}{f_{m2} \, l_{m}}

(5)

For the Galilean mode, the diameter c can be calculated by

c = \frac{P_{m} \left[ f_{m2} (d_{2} + l_{m}) - l_{m} d_{2} \right]}{f_{m2} \, l_{m}}

(6)

To ensure that every neighboring subimage is tangent, the F-number matching condition is determined using the trapezoid relation (green lines in Figure 3b,c) rather than the triangle relation.

\frac{P_{m} - c}{d_{2}} = \frac{P_{l} - c}{l_{m} + d_{2}} \approx \frac{P_{l} - c}{l_{m}} \quad (l_{m} \gg d_{2})

(7)
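The blur diameter in Equations (5) and (6) and the trapezoid matching condition in Equation (7) can be evaluated together, as in the sketch below. It assumes a single microlens type, so the focal length written f_m2 in the equations is passed as one parameter `f_m`; the function names and the numerical values are hypothetical. Because Equations (5) and (6) differ only in sign, the sketch takes the absolute value of one expression, which covers both modes as long as the result of the respective equation is positive.

```python
def blur_diameter(p_m, l_m, d2, f_m):
    """Blur diameter c of a main-lens point on the sensor, Equations (5) and (6)."""
    signed = p_m * (l_m * d2 - f_m * (l_m + d2)) / (f_m * l_m)
    # Positive in Keplerian mode (Eq. 5), negative in Galilean mode (Eq. 6).
    return abs(signed)

def matched_aperture_focused(p_m, l_m, d2, f_m):
    """Solve the exact form of Equation (7) for P_l."""
    c = blur_diameter(p_m, l_m, d2, f_m)
    return c + (p_m - c) * (l_m + d2) / d2

# Hypothetical values in millimetres: microlens pitch 0.1, focal length 0.5.
c_kepler = blur_diameter(p_m=0.1, l_m=100.0, d2=0.6, f_m=0.5)   # d2 > f_m
c_galileo = blur_diameter(p_m=0.1, l_m=100.0, d2=0.4, f_m=0.5)  # d2 < f_m
p_l_kepler = matched_aperture_focused(p_m=0.1, l_m=100.0, d2=0.6, f_m=0.5)
p_l_galileo = matched_aperture_focused(p_m=0.1, l_m=100.0, d2=0.4, f_m=0.5)
```

Setting c to zero in `matched_aperture_focused` recovers the triangle relation of Equation (4), which shows that the trapezoid relation is a generalization that accounts for the defocus blur at the subimage edge.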

2.2. Principle of the Light-Field Tomo-PIV

The technical strategy of light-field Tomo-PIV based on a single light-field camera is shown in Figure 4. This figure illustrates the complete workflow of light-field Tomo-PIV based on a single light-field camera, including tracer particle seeding, laser illumination, light-field image acquisition, 3D particle field reconstruction, and velocity field calculation via cross-correlation. This integrated technical route builds the overall framework for flow-field measurement and reflects the core operating logic of the entire measurement system. The principles and procedures of light-field Tomo-PIV are as follows:

•: Tracer particle seeding: Tracer particles are dispersed into the flow field. The particle density should be close to that of the fluid to ensure excellent flow following performance. Tracer particles with diameters of 20 μm and 50 μm are typically used in experiments. In general, a larger particle size produces a stronger scattered light signal, resulting in a higher signal-to-noise ratio and better imaging quality of the light-field image.
•: Dual-pulse laser illumination: A typical dual-pulse volumetric laser used in a light-field Tomo-PIV system is adopted for illumination. The laser emits two short pulses into the measured flow field, with a time interval Δt between the two pulses. This time interval Δt serves as the critical time base for the subsequent flow velocity calculation.
•: Light-field image capture: When the laser pulses illuminate the tracer particles, the particles generate scattered light. A light-field camera performs double frame synchronous exposure, such that the dual laser pulses correspond exactly to the two exposure instants of the camera. In this way, a pair of particle light-field images is recorded. The time interval between the two light-field images is consistent with the laser pulse interval, both of which are equal to Δt.
•: Reconstruction of the 3D particle field: The measurement volume is discretized into a 3D grid composed of voxels, each with a corresponding light intensity value E. The light intensity distribution of the 3D particle field E and the gray levels of the light-field image P on the camera sensor satisfy the following linear projection relationship:

$[\begin{matrix} P_{1} \\ P_{2} \\ ⋮ \\ P_{m} \end{matrix}] = [\begin{matrix} W_{1, 1} & W_{1, 2} & \dots & W_{1, j} & \dots & W_{1, n} \\ W_{2, 1} & W_{2, 2} & \dots & W_{2, j} & \dots & W_{2, n} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ W_{i, 1} & W_{i, 2} & \dots & W_{i, j} & \dots & W_{i, n} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ W_{m, 1} & W_{m, 2} & \dots & W_{m, j} & \dots & W_{m, n} \end{matrix}] [\begin{matrix} E_{1} \\ E_{2} \\ ⋮ \\ E_{n} \end{matrix}]$

(8)

where m is the total number of pixels on the CCD sensor, n is the total number of discretized voxels in the measurement volume, and W_{i,j} is the weight coefficient that gives the contribution of the jth voxel to the ith pixel. Together, these coefficients form the weight matrix W.
Equation (8) represents a typical ill-posed inverse problem, which is usually solved using iterative reconstruction algorithms to reconstruct the 3D spatial distribution of tracer particles from a 2D light-field image. The reconstruction process is complex and computationally intensive.
• 3D cross-correlation and velocity field: Finally, 3D cross-correlation is applied to the 3D particle fields reconstructed at the two successive instants. The location of the peak of the correlation function R gives the 3D displacement of the particle group within the time interval Δt. The 3D cross-correlation is expressed as follows:

$R(\Delta m', \Delta n', \Delta l') = \frac{\sum_{m'=1}^{x_{\mathrm{size}}} \sum_{n'=1}^{y_{\mathrm{size}}} \sum_{l'=1}^{z_{\mathrm{size}}} E_{t}(m', n', l') \, E_{t+\Delta t}(m' + \Delta m', n' + \Delta n', l' + \Delta l')}{\sigma_{t} \, \sigma_{t+\Delta t}}$

(9)

Dividing the displacement found at the peak of R by Δt then gives the 3D velocity distribution of the entire measured flow volume.
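The final step of this workflow can be sketched in Python with NumPy. This is a minimal sketch rather than the implementation used in the surveyed studies: it assumes periodic boundaries (an FFT-based circular correlation), integer-voxel peak detection without sub-voxel fitting, and hypothetical values for the voxel size and Δt. The function name `displacement_from_correlation` is illustrative only.

```python
import numpy as np

def displacement_from_correlation(e_t, e_t_dt):
    """Integer-voxel shift that maximizes the circular 3D cross-correlation."""
    a = e_t - e_t.mean()
    b = e_t_dt - e_t_dt.mean()
    # Correlation theorem: R(d) = sum_m a(m) b(m + d) = IFFT(conj(FFT(a)) * FFT(b))
    corr = np.fft.ifftn(np.conj(np.fft.fftn(a)) * np.fft.fftn(b)).real
    # Normalize by the standard deviations of both volumes (denominator of Eq. (9))
    corr /= a.std() * b.std() * a.size
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices above N/2 correspond to negative shifts
    shift = np.array([p if p <= n // 2 else p - n for p, n in zip(peak, corr.shape)])
    return shift, corr

# Synthetic test volume: 40 random single-voxel particles, shifted by a known amount
rng = np.random.default_rng(0)
vol_t = np.zeros((32, 32, 32))
idx = rng.integers(0, 32, size=(40, 3))
vol_t[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
vol_t_dt = np.roll(vol_t, (3, -2, 1), axis=(0, 1, 2))

shift, corr = displacement_from_correlation(vol_t, vol_t_dt)

dt = 1e-3          # assumed pulse separation in seconds (hypothetical)
voxel_size = 1e-4  # assumed voxel edge length in metres (hypothetical)
velocity = shift * voxel_size / dt  # velocity = displacement / dt
```

In practice the correlation is evaluated per interrogation volume rather than over the whole reconstructed field, and the peak is refined to sub-voxel precision.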

2.3. Evaluation Indicators in the Light-Field Tomo-PIV

2.3.1. Reconstruction Accuracy

In both traditional Tomo-PIV and light-field Tomo-PIV, the reconstruction accuracy of the 3D particle field affects the accuracy of the 3D cross-correlation, so evaluating it is crucial. At present, reconstruction accuracy is assessed in two ways: by comparing the reconstructed position of each particle with its theoretical position, and by the overall reconstruction quality for multiple particles. The position accuracy of each particle and axial errors can affect the placement of the measurement volume [1]. This paper does not discuss the position accuracy of individual particles. The reconstruction quality Q is defined as follows [40]:

Q = \frac{\sum_{j} [E_{1} (X_{j}, Y_{j}, Z_{j}) \cdot E_{0} (X_{j}, Y_{j}, Z_{j})]}{\sqrt{\sum_{j} E_{1}^{2} (X_{j}, Y_{j}, Z_{j}) \cdot \sum_{j} E_{0}^{2} (X_{j}, Y_{j}, Z_{j})}}

(10)

where E₀ is the predetermined 3D intensity distribution at position (X_j, Y_j, Z_j), E₁ is the reconstructed 3D intensity distribution at the same position, and Q (0 ≤ Q ≤ 1) is the reconstruction quality of the 3D particle field, also known as the normalized correlation coefficient. The larger the value of Q, the closer the reconstructed distribution E₁ is to the predetermined distribution E₀. Because no predetermined 3D particle field is available in a Tomo-PIV experiment, Equation (10) cannot be applied to experimental data. As a result, the reconstruction quality of the 3D particle field is generally not evaluated in practical experiments.
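Equation (10) translates directly into a few lines of NumPy. The sketch below is illustrative: the function name `reconstruction_quality` and the toy volumes are assumptions made here, with a single-voxel particle as the predetermined field and a copy elongated over three voxels along the depth axis as the reconstruction.

```python
import numpy as np

def reconstruction_quality(e_rec, e_true):
    """Normalized correlation coefficient Q between two intensity volumes (Eq. (10))."""
    num = np.sum(e_rec * e_true)
    den = np.sqrt(np.sum(e_rec ** 2) * np.sum(e_true ** 2))
    return num / den if den > 0 else 0.0

# Predetermined field: one particle occupying a single voxel
e_true = np.zeros((8, 8, 8))
e_true[4, 4, 4] = 1.0

# Reconstruction elongated over three voxels along the last (depth) axis
e_elongated = np.zeros_like(e_true)
e_elongated[4, 4, 3:6] = 1.0

q_perfect = reconstruction_quality(e_true, e_true)         # identical fields
q_elongated = reconstruction_quality(e_elongated, e_true)  # equals 1 / sqrt(3)
```

Even with the particle center recovered exactly, elongation over three voxels alone lowers Q to about 0.58 in this toy case, which shows how strongly Q penalizes depth elongation.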

2.3.2. Particle Concentration

The concentration of tracer particles is an important parameter that cannot be ignored in Tomo-PIV. The concentration of tracer particles in the flow field affects both the quality of 3D particle field reconstruction and the ability to characterize the structure of the 3D flow field. In traditional multicamera Tomo-PIV, the concentration of tracer particles in the flow field is defined as follows:

c = \frac{N_{\mathrm{particle}}}{N_{\mathrm{pixel}}} \ (\mathrm{ppp})

(11)

where N_particle is the number of tracer particles, and N_pixel is the number of pixels on the CCD sensor. The unit of concentration is ppp, which stands for particles per pixel. The number of tracer particles in the particle image can be determined in image processing by fitting a Gaussian function.

In light-field Tomo-PIV, due to the introduction of MLA in the imaging system, the concentration of tracer particles in the flow field is no longer displayed in the form of ppp, but is defined as follows:

c = \frac{N_{\mathrm{particle}}}{N_{\mathrm{MLA}}} \ (\mathrm{ppm})

(12)

where N_MLA is the number of microlenses. The unit of concentration is ppm, meaning particles per microlens. Because of the MLA, the number of tracer particles in the light-field image cannot be determined through image processing. It is therefore difficult to determine the actual concentration of tracer particles during an experiment, which remains an open problem in light-field Tomo-PIV.

In traditional multicamera Tomo-PIV, the particle concentration is typically maintained at approximately 0.05 ppp. In light-field Tomo-PIV, the particle concentration is generally 1 ppm; that is, the number of particles equals the number of microlenses. However, according to Equation (12), a particle concentration of 1 ppm corresponds to a low-concentration particle field. When the particle concentration is low, the reconstruction algorithms can reconstruct a high-quality 3D particle field. To ensure that each cross-correlation window of the 3D cross-correlation algorithm contains 8–11 particles, a low-concentration 3D particle field usually requires a large window size. This reduces the number of vectors in the velocity field and therefore its spatial resolution, which can make it difficult to characterize the 3D flow structure accurately. When the particle concentration is high, for example, when the number of particles exceeds the number of microlenses, multiple scattering affects particle intensity and causes saturation in the light-field image. This makes it difficult for reconstruction algorithms to reconstruct the 3D particle field accurately, which in turn reduces the accuracy of displacement estimation based on 3D cross-correlation.
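The link between concentration and window size can be made concrete with a short calculation. The sketch below assumes particles are uniformly distributed over the measurement volume and uses hypothetical sizes for the MLA (100 × 100 microlenses) and the voxel grid (200 × 200 × 100); none of these values come from the surveyed studies.

```python
import math

def particle_count(concentration_ppm, n_microlenses):
    """Number of particles implied by Eq. (12): N_particle = c * N_MLA."""
    return concentration_ppm * n_microlenses

def window_edge(concentration_ppm, n_microlenses, n_voxels, particles_per_window=10):
    """Edge length (in voxels) of a cubic window holding the target particle count,
    assuming uniform seeding over the whole voxel grid."""
    density = particle_count(concentration_ppm, n_microlenses) / n_voxels
    return math.ceil((particles_per_window / density) ** (1.0 / 3.0))

n_microlenses = 100 * 100    # hypothetical MLA size
n_voxels = 200 * 200 * 100   # hypothetical voxel grid

edge_1ppm = window_edge(1.0, n_microlenses, n_voxels)    # 16 voxels
edge_01ppm = window_edge(0.1, n_microlenses, n_voxels)   # 35 voxels
```

Under these assumptions, lowering the concentration from 1 ppm to 0.1 ppm roughly doubles the window edge, so far fewer independent velocity vectors fit in the same volume.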

2.3.3. Uncertainty in Tomo-PIV

1. Reconstruction uncertainty

Reconstruction uncertainty refers to the discrepancy between the reconstructed 3D particle distribution and the actual 3D particle field [41,42]. In traditional Tomo-PIV, reconstruction uncertainty typically manifests as reconstruction error, ghost particles, particle elongation, intensity distortion, and particle loss, all of which degrade particle localization accuracy and velocity estimation. Its magnitude depends on multiple factors, including calibration errors, inaccuracies in the weighting matrix, limited viewing angles, an insufficient number of cameras, image noise, particle image overlap, and convergence errors of the reconstruction algorithm. In addition, high particle concentrations may increase the occurrence of ghost particles and reconstruction artifacts, further degrading reconstruction quality. Since these uncertainty sources are strongly coupled, they propagate through the entire measurement chain and ultimately influence the accuracy of the reconstructed velocity field. In light-field Tomo-PIV, few studies address reconstruction uncertainty and the factors that influence it. Fahringer studied the deviation between the 3D particle field reconstructed with MART and the actual 3D particle field in light-field Tomo-PIV [43]. However, the impact of reconstruction accuracy on velocity field accuracy remains insufficiently explored.

2. Calibration uncertainty

Calibration uncertainty refers to errors arising from the determination of the mapping relationship between image coordinates and physical space coordinates [44,45]. In Tomo-PIV, calibration uncertainty directly affects the accuracy of particle projection, ray intersection, and weighting matrix construction, thereby influencing volumetric reconstruction quality and subsequent velocity measurements. The main sources of calibration uncertainty include calibration target positioning errors, main lens distortion modeling errors, image point detection errors, optical misalignment, and residual errors in self-calibration procedures. In addition, environmental factors such as mechanical vibrations and thermal variations may further degrade calibration accuracy. These uncertainties can propagate through the reconstruction process, leading to particle elongation, ghost particles, and reduced velocity measurement accuracy. However, in light-field Tomo-PIV, there is little discussion on the analysis of calibration uncertainty and the impact of calibration uncertainty on the quality of 3D particle field reconstruction and velocity field accuracy.

3. Cross-correlation uncertainty

Cross-correlation uncertainty denotes the random fluctuations and systematic biases in velocity vectors introduced during the volumetric cross-correlation stage that follows volumetric particle field reconstruction [46,47]. These uncertainties dominate the random error component of Tomo-PIV velocity measurements and stem from several factors inherent to the correlation calculation. First, ghost particle noise and insufficient tracer particles within interrogation volumes reduce the signal-to-noise ratio of correlation peaks, causing stochastic jitter during sub-voxel peak fitting. Second, non-uniform velocity gradients inside a single interrogation window generate a systematic bias, which underestimates flow gradients and distorts turbulence statistics. Third, when particle displacements are close to integer numbers of voxels, the peak-locking effect induces periodic offset errors. However, in light-field Tomo-PIV, the impact of reconstruction accuracy and calibration accuracy on velocity field accuracy and uncertainty has not been fully investigated.

3. Traditional Iterative Reconstruction Algorithms and Deep Learning Techniques for 3D Particle Field Reconstruction

3.1. Traditional Iterative Reconstruction Algorithms for 3D Particle Field Reconstruction

3.1.1. MART

MART is widely used in traditional multicamera Tomo-PIV because it requires few iterations and yields high reconstruction quality. Elsinga et al. first used the MART algorithm to solve Equation (8) in traditional multicamera Tomo-PIV. For each voxel, the voxel intensity E is updated by

E(X_{j}, Y_{j}, Z_{j})^{k+1} = E(X_{j}, Y_{j}, Z_{j})^{k} \left[ \frac{P(x_{i}, y_{i})}{\sum_{j \in N_{i}} W_{i,j} E(X_{j}, Y_{j}, Z_{j})^{k}} \right]^{\mu W_{i,j}}

(13)

where μ ∈ [0, 1] is a relaxation factor (the larger the value of μ, the faster the convergence), N_i is the set of voxels intersected by the line of sight of the ith pixel, and k is the iteration index.

In Equation (13), the update is driven by the ratio of the pixel intensity P to the corresponding object projection $\sum_{j \in N_{i}} W_{i,j} E(X_{j}, Y_{j}, Z_{j})^{k}$. Although the algebraic reconstruction technique (ART) is widely used in traditional Tomo-PIV, it is not employed in light-field Tomo-PIV because apparent ray trajectories appear in the reconstructed 3D particle field, which increases the noise in the reconstruction. The MART algorithm also avoids the negative light intensity values that occur during ART iterations. Its advantages are therefore fewer iterations, faster convergence, and light intensity values that remain positive throughout the iteration process.

The MART algorithm also has drawbacks. It relies heavily on the accuracy of the weight matrix [24]: a low-precision weight matrix produces incorrect particles and ghost particles during the iteration process. Direct computation using Equation (13) is extremely expensive, and the reconstruction time grows as the particle concentration increases. The conventional MART algorithm therefore needs structural changes to suit multi-threaded GPU parallel execution. The reconstruction quality of the MART algorithm also decreases gradually as the particle concentration increases [24]. In light-field Tomo-PIV, when the particle concentration is 1 ppm, the reconstruction quality is between 0.35 and 0.5 and the reconstruction time is approximately 0.38 h. The computation time increases significantly with the number of iterations, which is typically on the order of tens [48]. The geometric parameters of the MLA affect the reconstruction quality of the MART algorithm in light-field Tomo-PIV [24]. In particular, a hexagonally arranged MLA yields higher reconstruction quality than a square-arranged MLA.
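The update in Equation (13) can be illustrated with a small dense example. This is a minimal sketch under stated assumptions: a toy random weight matrix stands in for a ray-traced light-field weight matrix, the initial guess is uniform, and pixels are visited sequentially. The function name `mart` and all sizes are illustrative; a practical implementation would use sparse storage and GPU parallelism.

```python
import numpy as np

def mart(W, P, n_iter=50, mu=1.0, eps=1e-12):
    """Sequential MART sweeps following Eq. (13) on a dense weight matrix W (m x n)."""
    n_pixels, n_voxels = W.shape
    E = np.ones(n_voxels)  # uniform positive initial guess
    for _ in range(n_iter):
        for i in range(n_pixels):
            w = W[i]
            projection = w @ E  # current projection of the volume onto pixel i
            if projection <= eps:
                continue  # no voxel on this line of sight carries intensity
            # Multiplicative update; voxels with w = 0 get a factor of exactly 1
            E *= (P[i] / projection) ** (mu * w)
    return E

# Toy problem: 30 pixels, 10 voxels, two particles (all values hypothetical)
rng = np.random.default_rng(0)
W = rng.random((30, 10)) * (rng.random((30, 10)) < 0.4)
E_true = np.zeros(10)
E_true[[2, 7]] = [1.0, 0.5]
P = W @ E_true

E_rec = mart(W, P)
residual_initial = np.linalg.norm(W @ np.ones(10) - P)
residual_final = np.linalg.norm(W @ E_rec - P)
```

Because every update is a multiplication by a non-negative factor, the voxel intensities never become negative, which is the positivity property noted above. A pixel with zero intensity drives every voxel on its line of sight to zero in a single update.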

3.1.2. Expectation-Maximization (EM)

The EM algorithm is a deconvolution algorithm used to restore a sharp image from a blurred one. Dempster et al. established the connection between the EM algorithm and the Richardson–Lucy deconvolution algorithm, demonstrating that the latter can be derived within the EM framework [49]. The Richardson–Lucy algorithm is widely used in astronomy, whereas the EM algorithm is primarily used in medical imaging [50]; the two are different names for the same algorithm. It is based on maximum-likelihood estimation and has a simple iterative scheme that converges quickly to a satisfactory solution. The EM algorithm is therefore used to reconstruct the intensity distribution of the 3D particle field in light-field Tomo-PIV, which is expressed as follows:

E(X_{j}, Y_{j}, Z_{j})^{k+1} = E(X_{j}, Y_{j}, Z_{j})^{k} \left[ \frac{P(x_{i}, y_{i})}{\sum_{j \in N_{i}} W_{i,j} E(X_{j}, Y_{j}, Z_{j})^{k}} W_{i,j} \right]

(14)

The iterative form of the EM algorithm is very similar to that of the MART algorithm, but the EM algorithm is less reliant on the accuracy of the weight matrix. In EM-based 3D particle field reconstruction, the weight matrix is calculated with a backward ray-tracing technique that determines the voxel positions along each pixel's line of sight, and a Gaussian function is used to approximate the elements of the weight matrix. In light-field Tomo-PIV, when the particle concentration is 1 ppm, the reconstruction quality is approximately 0.35 and the reconstruction time is approximately 200 s. However, the EM algorithm requires the entire weight matrix and all pixel intensities to be loaded during the calculation, which demands a large amount of computer memory. In addition, the optical parameters of the light-field camera affect the reconstruction quality of the EM algorithm. A small inverse magnification of the main lens M_l, a small focal length of the MLA f_m, a large microlens pitch P_m, and a large inverse magnification of the MLA |M_m| all help to improve the reconstruction quality, whereas the focal length of the main lens f has little effect on it.
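For comparison with MART, the sketch below implements the standard Richardson–Lucy (maximum-likelihood EM) update in matrix form. It is an assumption that this standard form, which sums the back-projected ratio over all pixels and normalizes by the column sums of the weight matrix, matches the implementation behind Equation (14); the function name `em_reconstruct` and the toy data are illustrative only.

```python
import numpy as np

def em_reconstruct(W, P, n_iter=50, eps=1e-12):
    """Standard Richardson-Lucy / MLEM iterations on a dense weight matrix W (m x n)."""
    E = np.ones(W.shape[1])                   # uniform positive initial guess
    col_sum = np.maximum(W.sum(axis=0), eps)  # sensitivity of each voxel
    for _ in range(n_iter):
        projection = W @ E
        # Ratio of measured to projected intensity, set to 0 where the projection vanishes
        ratio = np.where(projection > eps, P / np.maximum(projection, eps), 0.0)
        E = E * (W.T @ ratio) / col_sum       # back-project the ratio and normalize
    return E

# Toy problem with a strictly positive weight matrix (all values hypothetical)
rng = np.random.default_rng(0)
W = rng.random((30, 10)) + 0.01
E_true = np.zeros(10)
E_true[[2, 7]] = [1.0, 0.5]
P = W @ E_true

E_rec = em_reconstruct(W, P)
residual_initial = np.linalg.norm(W @ np.ones(10) - P)
residual_final = np.linalg.norm(W @ E_rec - P)
```

Unlike the sequential MART loop, every iteration here uses the whole weight matrix and all pixel intensities at once, which reflects the memory requirement noted above.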

3.1.3. DRT-MART

The number of elements in the weight matrix ranges from 10¹² to 10¹⁶ in light-field Tomo-PIV, so calculating the entire weight matrix is complex and computationally intensive. A voxel that coincides with a particle position is a nonzero voxel. Because the particles in the flow field are usually sparse, the number of nonzero voxels is very small and the weight matrix is a large sparse matrix. Research has shown that only nonzero voxels are involved in tomographic reconstruction. Therefore, calculating only the weight matrix entries that correspond to nonzero voxels greatly reduces the number of elements in the weight matrix, which lowers both the storage requirement and the computational burden of tomographic reconstruction. The principle for determining nonzero voxels in light-field Tomo-PIV is shown schematically in Figure 5, which illustrates the workflow for identifying nonzero voxels via dense ray tracing. Only voxels corresponding to actual tracer particles are retained for subsequent tomographic reconstruction, which greatly reduces the scale of the weight matrix and the computational cost of iterative algorithms such as DRT-MART. As shown in Figure 5, ray tracing is used to trace the rays from each voxel through the main lens and the MLA to the CCD sensor, which determines the pixel locations affected by each voxel. If the product of the intensities of the affected pixels exceeds a certain threshold, the corresponding voxel is a nonzero voxel. Finally, the MART algorithm is used to calculate the intensity distribution of the nonzero voxels.
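The nonzero-voxel test described above can be sketched as follows. The sketch reads "the product of the affected pixels" as the product of the intensities of the pixels reached by rays from a voxel, which is an assumption, and it uses the nonzero pattern of a toy weight matrix in place of dense ray tracing. The function name `find_nonzero_voxels` and the threshold value are hypothetical.

```python
import numpy as np

def find_nonzero_voxels(W, P, threshold):
    """Flag voxels whose affected pixels have an intensity product above the threshold."""
    footprint = W > 0  # footprint[i, j] is True if voxel j affects pixel i
    keep = np.zeros(W.shape[1], dtype=bool)
    for j in range(W.shape[1]):
        pixels = P[footprint[:, j]]
        # A single dark pixel in the footprint makes the product zero
        keep[j] = pixels.size > 0 and np.prod(pixels) > threshold
    return keep

# Toy geometry: 4 pixels, 3 voxels (hypothetical)
W = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
P = np.array([0.8, 0.5, 0.0, 0.9])

keep = find_nonzero_voxels(W, P, threshold=0.1)
W_reduced = W[:, keep]  # only these columns enter the MART iterations
```

In this toy case the second voxel is rejected because one of its pixels is dark, so the reduced weight matrix has two columns instead of three.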

The advantage of the DRT-MART algorithm is its fast iteration speed. Given the same computation time, the DRT-MART algorithm performs more iterations than the MART algorithm [48]; for example, it can reach 400 iterations in the time the MART algorithm needs for 23. The elongation of the tracer particles reconstructed by the DRT-MART algorithm is smaller than that of the MART algorithm. Thus, at the same particle concentration, the DRT-MART method achieves better reconstruction quality than the MART method. No ghost particles are observed for either the DRT-MART or the MART algorithm, owing to the multiple perspectives offered by the light-field camera [48]. When the particle concentration is 0.1 ppm in light-field Tomo-PIV, the reconstruction quality of the DRT-MART algorithm reaches around 0.7. However, a particle concentration of 0.1 ppm is too low for velocity field measurement in Tomo-PIV. When the particle concentration is 1 ppm, the reconstruction quality of the DRT-MART algorithm is around 0.5. The DRT-MART study does not report the time required for nonzero voxel recognition based on dense ray tracing, nor how the computation time of DRT-MART changes with increasing particle concentration. In traditional Tomo-PIV, however, the number of nonzero voxels increases with particle concentration, so the recognition time of nonzero voxels can be expected to increase with particle concentration in light-field Tomo-PIV as well. In practical experiments, vibration of the light-field camera reduces the accuracy of the DRT-MART algorithm.

3.1.4. Pre-SART

The Pre-SART algorithm combines the prerecognition method with the SART algorithm and is used in light-field Tomo-PIV. The prerecognition method identifies nonzero voxels in a manner similar to the DRT-MART algorithm. A schematic of the principle of prerecognition is shown in Figure 6, which illustrates the overall workflow of voxel prerecognition before tomographic reconstruction. Combined with the ray-tracing technique, the prerecognition step rapidly screens the effective regions, eliminates invalid zero voxels in advance, and reduces the computational load and iteration time of the subsequent reconstruction algorithm. Prerecognition also exploits the sparsity of tracer particles in the flow field: the vast majority of voxels in the discretized measurement volume are zero voxels, which do not contribute to the light-field image and therefore do not participate in the SART reconstruction. Calculating the weight matrix for nonzero voxels only therefore greatly reduces the number of elements in the weight matrix, which decreases the storage requirement and the computational burden of the SART algorithm. Unlike DRT-MART, this method accounts for the actual imperfections of the light-field camera, such as distortion and volume calibration errors in the optical system, which can cause mismatches between voxels and their corresponding pixels and hence misidentification. An index σ is used to improve the prerecognition accuracy, which is expressed as follows:

σ = \frac{n_{n}}{n_{t}}

(15)

where n_n is the number of pixels affected by a voxel that carry no intensity, and n_t is the total number of pixels affected by that voxel.

A σ_t threshold is introduced to determine whether a voxel P with a given intensity is recognized, which is expressed as follows:

E_{P} = \begin{cases} E_{P}, & \sigma < \sigma_{t} \\ 0, & \sigma > \sigma_{t} \end{cases}

(16)

When σ < σ_t, voxel P keeps its intensity E_P. When σ > σ_t, voxel P is treated as a zero voxel. Only the weight coefficients of nonzero voxels are calculated. Finally, the SART algorithm is used to calculate the intensity distribution of the nonzero voxels, which is expressed by the following equation:

E_{j}^{k+1} = E_{j}^{k} + \mu \frac{\sum_{i} \left( \frac{P_{i} - \sum_{n=1}^{N} W_{i,n} E_{n}^{k}}{\sum_{n=1}^{N} W_{i,n}} \right) W_{i,j}}{\sum_{i} W_{i,j}}

(17)

The advantage of the Pre-SART algorithm is that it reduces the computational burden of the SART algorithm by reducing the number of elements in the weight matrix. Compared with the DRT-MART algorithm, the Pre-SART algorithm takes lens distortion into account and introduces the threshold σ_t for recognizing nonzero voxels. In light-field Tomo-PIV, when the particle concentration is 1 ppm and σ_t = 0.1, the reconstruction quality of the Pre-SART algorithm is around 0.325, the calculation time is about 22.5 min, and the required computer memory is about 1.2 GB. However, the computation time of the Pre-SART algorithm increases with particle concentration. The elongation of the reconstructed particles is slightly reduced at different depths, and the measurement accuracy of the instantaneous velocity field is improved, but the elongation problem of the reconstructed particles remains.
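The two stages of Pre-SART, prerecognition with the index of Equation (15) and the threshold of Equation (16), followed by the SART update of Equation (17), can be sketched on a toy problem. The sketch assumes that the nonzero pattern of a small dense weight matrix represents the pixels affected by each voxel and that a pixel with zero gray level counts as carrying no intensity. The function names `prerecognize` and `sart` are illustrative.

```python
import numpy as np

def prerecognize(W, P, sigma_t=0.1):
    """Keep voxels whose fraction of dark affected pixels is below sigma_t (Eqs. (15)-(16))."""
    footprint = W > 0
    n_t = footprint.sum(axis=0)                           # pixels affected by each voxel
    n_n = (footprint & (P[:, None] <= 0)).sum(axis=0)     # affected pixels with no intensity
    sigma = np.divide(n_n, n_t, out=np.ones(W.shape[1]), where=n_t > 0)
    return sigma < sigma_t

def sart(W, P, n_iter=50, mu=1.0, eps=1e-12):
    """Simultaneous additive update of Eq. (17) on a dense weight matrix."""
    row_sum = np.maximum(W.sum(axis=1), eps)
    col_sum = np.maximum(W.sum(axis=0), eps)
    E = np.zeros(W.shape[1])
    for _ in range(n_iter):
        residual = (P - W @ E) / row_sum          # normalized projection error per pixel
        E = E + mu * (W.T @ residual) / col_sum   # back-projected correction per voxel
    return E

# Toy geometry: 4 pixels, 3 voxels; the middle voxel is empty (hypothetical)
W = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
E_true = np.array([2.0, 0.0, 1.0])
P = W @ E_true

keep = prerecognize(W, P, sigma_t=0.1)
E_full = np.zeros(W.shape[1])
E_full[keep] = sart(W[:, keep], P)  # SART runs on the reduced weight matrix only
```

Here the middle voxel has σ = 0.5 and is discarded before the SART iterations, so the solver works with two unknowns instead of three.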

3.2. Deep Learning Techniques for 3D Particle Field Reconstruction

The 3D particle field reconstructed by traditional iterative reconstruction algorithms is elongated along the depth direction, because the depth resolution of the light-field camera is lower than its lateral resolution. As a result, the positional error of the particles in the depth direction is relatively large, which reduces the reconstruction quality of the 3D particle field. With the rapid development of machine learning in recent years, deep learning has gradually been applied to 3D particle field reconstruction based on light-field imaging, although research on this topic remains limited. To date, deep learning-based 3D particle field reconstruction in light-field Tomo-PIV has relied mainly on two networks, 3D U-Net and LF-DNN. The principles of these two network structures are introduced below.

3.2.1. 3D U-Net

The 3D U-Net was developed by Çiçek et al. as an extension of the original 2D U-Net [32,51] and was originally designed for segmenting 3D medical images. The structure of the 3D U-Net for 3D particle field reconstruction is shown in Figure 7, which presents its encoder–decoder architecture. With downsampling, upsampling, and skip connections, the network takes refocused light-field image stacks as input and outputs reconstructed 3D particle intensity distributions. The 3D U-Net effectively alleviates the particle elongation problem of traditional iterative reconstruction methods. It follows the encoder–decoder concept and forms a U-shaped convolutional neural network from convolution and pooling operations. In the encoder stage, four downsampling operations extract features from the 3D volume; after each one, the size of the 3D volume is reduced and feature information is progressively extracted. In the decoder stage, four upsampling operations restore and decode the abstract features into the output 3D volume. Skip connections between the encoder and decoder supply feature details from the encoder layer of the same resolution and recover the detailed information lost during downsampling. The advantage of the 3D U-Net is its clear, simple, and symmetrical design. However, the network usually has a large number of parameters and a high computational cost, so it requires more GPU memory and a longer training time.

To build the dataset, the digital refocusing algorithm is used to project the light-field image of tracer particles onto discrete voxels, constructing a refocused cone that serves as the input. According to the particle position, the intensity distribution of the 3 × 3 × 3 voxels adjacent to each tracer particle is extracted from the stack of LF-refocused images and used as the output data of the 3D U-Net. Reconstruction with the 3D U-Net therefore does not start from the light-field image of the tracer particles directly: the light-field image must first be converted into a stack of refocused images by the digital refocusing algorithm. The calculation time of the digital refocusing algorithm depends on the number of voxels and is independent of particle concentration. When the number of voxels is 128 (X-axis) × 128 (Y-axis) × 128 (Z-axis), the calculation time of the digital refocusing algorithm is about 140 s and 150 s for the standard plenoptic camera and the Raytrix R29, respectively. A total of 3500 samples are randomly generated for training the 3D U-Net, with the ray-tracing technique used to generate the light-field images of the tracer particles. The validation and test sets account for 10% of the total dataset. Training the 3D U-Net takes about 25.5 h on a GPU. Once trained, the 3D U-Net model can reconstruct light-field images within a range of particle concentrations. The time needed for the 3D U-Net to predict the 3D particle field is very short and has not been reported. When the particle concentration is less than 0.1 ppm for the standard plenoptic camera, the reconstruction quality (Q) of the 3D U-Net is greater than 0.7, which is better than that of the EM algorithm. However, when the particle concentration is high, such as 1 ppm, the reconstruction quality is approximately 0.55. The 3D U-Net is thus suitable for reconstructing low-concentration tracer particle fields but not high-concentration ones. Ref. [32] does not address issues such as generalization from synthetic to experimental data, domain adaptation, overfitting, noise augmentation, or dependence on a specific optical setup.
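The construction of the training target described above can be sketched with NumPy. This is an illustrative reading of the procedure, not code from Ref. [32]: a random array stands in for the refocused image stack, the particle positions are given as voxel indices, and neighborhoods are clipped at the volume boundary. The function name `build_label` is hypothetical.

```python
import numpy as np

def build_label(refocused, particle_indices):
    """Copy the 3 x 3 x 3 neighborhood of each particle from the refocused stack."""
    label = np.zeros_like(refocused)
    for center in particle_indices:
        # Neighborhood of +/- 1 voxel per axis, clipped at the volume boundary
        window = tuple(slice(max(c - 1, 0), min(c + 2, n))
                       for c, n in zip(center, refocused.shape))
        label[window] = refocused[window]
    return label

# Stand-in for a refocused image stack (strictly positive random intensities)
rng = np.random.default_rng(1)
refocused = rng.random((16, 16, 16)) + 0.1

# One interior particle and one particle touching two faces of the volume
particles = [(5, 5, 5), (0, 10, 15)]
label = build_label(refocused, particles)
```

The interior particle contributes a full 27-voxel neighborhood, while the particle at the boundary contributes only the 12 voxels that lie inside the volume.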

3.2.2. LF-DNN

In a supervised DNN architecture, the multiple hidden layers between the input and output are key to its deep structure. Figure 8 shows the overall architecture of the LF-DNN model for 3D particle field reconstruction. The network takes perspective-shift subaperture images as input and adopts residual blocks to mitigate vanishing gradients. It establishes an end-to-end mapping from the raw light-field image to the 3D particle field, achieving high-precision and efficient volumetric reconstruction. As Figure 8 shows, the DNN is used to establish a mapping between 3D particle volumes and 2D particle images. A DNN can employ a variety of layer types, such as fully connected, convolutional, and recurrent layers. Convolutional layers are specifically designed to process structured array data, enabling the DNN to adaptively learn spatial hierarchical features from an input image. This process maps the input image to a new output in which specific features are retained and others are enhanced or altered. Residual blocks are incorporated into the DNN to alleviate the exponential decrease in the gradient of the loss function as it propagates backward from the output layer to the input layer. Each residual block is composed of a primary convolutional layer and a secondary convolutional layer.

For the DNN, a ray-tracing technique is employed to synthetically generate the corresponding light-field images of tracer particles. The perspective-shift images (subaperture images) of the light-field image are used as input rather than the raw light-field image, and the output is the labeled 3D locations of the particles. The network's training task is to learn to generate a high-quality approximation of the 3D particle distribution. The reconstruction quality (Q) of LF-DNN exceeds 0.75 at a concentration of 1.0 ppm, nearly twice that of the MART method, and the elongation of the reconstructed 3D particle field is significantly reduced. A total of 20,000 samples are randomly generated for training LF-DNN. The dataset is split into training, validation, and test subsets with a ratio of 70%, 15%, and 15%. The total training time of LF-DNN is approximately 50.75 h on a CPU and 15.92 h on a GPU. For 3D particle reconstruction, LF-DNN processes a light-field image in 3.7 s on a CPU and 1.2 s on a GPU. By comparison, MART takes 0.38 h to reconstruct a single light-field image and PR-MART takes 0.28 h. These results demonstrate that LF-DNN outperforms the established MART and PR-MART methods.

During model training, regularized training strategies are used for LF-DNN, and the loss values of the training and validation sets are monitored in real time. When the validation loss no longer decreases, training is terminated to prevent overfitting. The model performance on the independent test set also indicates that no obvious overfitting occurs in the current network. To improve model robustness, Gaussian noise of varying intensity is added to the synthetic light-field images during training as noise augmentation, which simulates random camera noise and improves robustness to experimental data.
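The two training safeguards described above, Gaussian noise augmentation and early stopping on the validation loss, can be sketched independently of any deep learning framework. The noise range, the patience value, and the function names `augment_with_noise` and `early_stopping_epoch` are assumptions made for illustration; the LF-DNN study's actual settings are not stated in this section.

```python
import numpy as np

def augment_with_noise(image, rng, sigma_range=(0.0, 0.05)):
    """Add zero-mean Gaussian noise of random strength to a normalized image."""
    sigma = rng.uniform(*sigma_range)  # noise level drawn per sample
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)    # keep gray levels in the valid range

def early_stopping_epoch(val_losses, patience=3):
    """Epoch at which training stops once the validation loss stalls for `patience` epochs."""
    best, wait = np.inf, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0       # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1         # never triggered: train to the end

rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[30:33, 30:33] = 1.0              # synthetic particle image (hypothetical)
noisy = augment_with_noise(clean, rng)

# Hypothetical validation-loss history: best value at epoch 2, then no improvement
stop = early_stopping_epoch([1.0, 0.8, 0.7, 0.71, 0.72, 0.705, 0.69], patience=3)
```

With this loss history the loop stops at epoch 5, three epochs after the best validation loss, so the later improvement at epoch 6 is never reached. That is the usual trade-off when choosing the patience value.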

3.3. Summary of the Advantages and Disadvantages for Reconstruction Algorithms

Table 1 compares the mainstream algorithms for reconstructing 3D particle fields with a single light-field camera in light-field Tomo-PIV, covering both traditional iterative algorithms and deep learning models. For each method, the table summarizes reconstruction quality, computational time, technical advantages, and existing limitations, providing a reference for algorithm selection in practical measurement. Synthetic validation uses numerically generated particle fields with known ground truth to simulate Tomo-PIV, which allows quantitative assessment of reconstruction and velocity measurement accuracy. In contrast, experimental validation relies on real Tomo-PIV measurements, for which no ground truth is available; performance is instead evaluated by comparing the flow structures obtained from ANSYS Fluent simulations with those measured by light-field Tomo-PIV. According to Table 1, among the traditional iterative algorithms, DRT-MART achieves higher reconstruction quality than the MART, EM, and Pre-SART algorithms. Among the deep learning-based methods, the 3D U-Net achieves a reconstruction quality above 0.7 at low tracer particle concentration, but its performance deteriorates at high concentration. A low tracer particle concentration can also result in a low-resolution velocity field, which cannot meet the measurement requirements of light-field Tomo-PIV. At high tracer particle concentration, the LF-DNN algorithm achieves a reconstruction quality above 0.75, comparable to that of traditional multicamera Tomo-PIV. In terms of reconstruction time, the EM algorithm, 3D U-Net, and LF-DNN are faster than the other algorithms, with LF-DNN being the fastest.
The 3D particle fields reconstructed using the MART and EM algorithms are elongated along the depth direction. The DRT-MART and Pre-SART algorithms only slightly alleviate this elongation. By contrast, deep learning-based reconstruction algorithms can alleviate particle elongation in the depth direction. It has been shown that the DRT-MART algorithm does not exhibit ghost particles during the reconstruction process, whereas studies of the other reconstruction algorithms rarely discuss the ghost particle issue. When real-time and high-precision reconstruction are required, LF-DNN is a potential choice for 3D particle field reconstruction, provided that the test data lie within the distribution of the training dataset. Under limited computational resources, the EM, DRT-MART, and Pre-SART algorithms are recommended. Under noisy experimental conditions, the EM algorithm is recommended.
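The reconstruction quality values quoted above are single numbers per method. As a minimal sketch, assuming they follow the normalized-correlation quality factor commonly used in Tomo-PIV (the correlation between reconstructed and reference voxel intensities), such a score could be computed as follows. The function name and the toy volumes are illustrative assumptions, not taken from the reviewed studies.

```python
import numpy as np

def reconstruction_quality(recon, truth):
    """Normalized correlation between reconstructed and reference voxel intensities.

    Returns 1.0 for a volume proportional to the reference and 0.0 when the two
    volumes share no occupied voxel (or when either volume is empty).
    """
    recon = np.asarray(recon, dtype=float).ravel()
    truth = np.asarray(truth, dtype=float).ravel()
    denom = np.sqrt(np.sum(recon ** 2) * np.sum(truth ** 2))
    if denom == 0.0:
        return 0.0
    return float(np.dot(recon, truth) / denom)

# Toy example: one particle in a 5-voxel column along the depth direction.
truth = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
# Hypothetical reconstruction in which the particle is smeared over three voxels.
elongated = np.array([0.0, 1.0, 1.0, 1.0, 0.0])
q_elongated = reconstruction_quality(elongated, truth)
```

In this toy case the smeared particle lowers the score even though its centre is correct, which illustrates why depth elongation and reconstruction quality are discussed together.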

4. Future Prospects

4.1. Traditional Iterative Algorithms

4.1.1. Influence of the Tracer Particle Concentration

Research significance

The concentration of tracer particles is a critical parameter in light-field Tomo-PIV because it directly governs the trade-off between reconstruction quality and computational efficiency. Low particle concentrations may lead to insufficient seeding density, causing poor representation of flow structures and large interrogation volume errors. Conversely, higher particle concentrations can potentially improve spatial resolution. However, they also increase particle image overlap, raise the number of nonzero voxels, and exacerbate the ill-posedness of the tomographic reconstruction problem. Understanding the influence of particle concentration on reconstruction quality, ghost particle generation, and velocity field accuracy is essential for developing reliable experimental guidelines and assessing measurement reliability in light-field Tomo-PIV applications involving complex flows.

Existing bottlenecks

Despite progress in iterative reconstruction algorithms (e.g., DRT-MART and Pre-SART), several key bottlenecks remain. First, the reconstruction time increases significantly with tracer particle concentration and the number of nonzero voxels, so iterative methods may impose a considerable computational burden, particularly in high-concentration cases. Second, main lens distortion, MLA distortion, optical aberrations, inhomogeneous lighting, background noise, refractive index mismatches, and particle overlap are usually overlooked in light-field Tomo-PIV. These effects degrade both volumetric reconstruction quality and subsequent cross-correlation performance, yet their combined effects across different particle concentrations have not been systematically quantified. Third, few studies have addressed rigorous velocity uncertainty quantification in light-field Tomo-PIV. This issue becomes particularly challenging at high particle concentrations, where measurement errors exhibit increasingly nonlinear behavior. Moreover, few studies have investigated how high particle concentrations affect cross-correlation uncertainty, calibration residuals, and the spatial resolution of the velocity field.

Feasible solutions and implementation approaches

To overcome the above limitations, the following approaches are recommended for light-field Tomo-PIV. Adaptive stopping criteria and voxel sparsity strategies could be integrated into DRT-MART and Pre-SART to reduce computational cost under high-particle-concentration conditions. In iterative reconstruction, regularization terms are recommended to penalize isolated or discontinuous voxels in the reconstructed volume [52]. An end-to-end calibration program should also be implemented. Polynomial or ray transfer matrix methods are recommended for modeling main lens distortion, MLA distortion, and optical aberrations. Deconvolution or particle separation algorithms are recommended to reduce particle overlap effects. Finally, synthetic light-field images should be generated with controlled particle concentrations, known velocity fields, and realistic imaging artifacts (including main lens distortion, MLA distortion, noise, and particle overlap) to systematically quantify reconstruction error and velocity uncertainty.

By addressing these aspects, future light-field Tomo-PIV systems can achieve reliable velocity field measurements even under challenging high-seeding-density conditions, with well-characterized accuracy and resolution.
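The adaptive stopping and voxel sparsity ideas mentioned above can be sketched on a toy problem. The code below is a plain MART update on a small dense weight matrix, not DRT-MART or Pre-SART. The pruning rule (empty every voxel seen by a dark pixel) and the stopping rule (stop when the relative residual is small or stalls) are illustrative assumptions, and the function name and tolerances are hypothetical.

```python
import numpy as np

def mart_adaptive(W, I, mu=1.0, max_iter=50, tol=1e-3, eps=1e-12):
    """Plain MART on a dense toy system W @ E = I.

    W: (n_pixels, n_voxels) non-negative weights; I: (n_pixels,) intensities.
    Returns the voxel intensities E and the number of sweeps actually run.
    """
    W = np.asarray(W, dtype=float)
    I = np.asarray(I, dtype=float)
    E = np.ones(W.shape[1])

    # Sparsity step (assumed rule): a voxel seen by any dark pixel is emptied
    # once, so later sweeps only touch the remaining candidate voxels.
    dark = I <= eps
    E[(W[dark] > 0).any(axis=0)] = 0.0
    active = np.flatnonzero(E > 0)
    lit_pixels = np.flatnonzero(~dark)

    prev_res = np.inf
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        for i in lit_pixels:
            w = W[i, active]
            proj = w @ E[active]
            if proj > eps:
                # Multiplicative MART correction for pixel i.
                E[active] *= (I[i] / proj) ** (mu * w)
        # Adaptive stopping (assumed rule): relative residual small or stalled.
        res = np.linalg.norm(W @ E - I) / (np.linalg.norm(I) + eps)
        if res < tol or prev_res - res < tol * prev_res:
            break
        prev_res = res
    return E, sweeps

# Toy system: 3 pixels, 4 voxels; pixel 1 is dark, so voxel 2 is pruned.
W = np.array([[1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 1]], dtype=float)
I = W @ np.array([0.0, 2.0, 0.0, 1.0])
E, sweeps = mart_adaptive(W, I)
```

In a real light-field system the weight matrix is far too large to store densely, so the same two ideas would be applied to a sparse or on-the-fly ray-traced weight representation.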

4.1.2. Reconstruction of Multiframe Light-Field Images

At present, iterative reconstruction algorithms focus only on reconstructing a static 3D particle field from either a single or a pair of light-field images, without considering temporal sequence information.

Research significance

Unsteady flows in confined spaces are widely encountered in aerospace propulsion, internal combustion engines, and microfluidic systems, and are characterized by three-dimensionality, strong fluctuations, transient variations, and non-periodicity. To fully capture flow evolution dynamics, identify coherent flow structures, and analyze unsteady flow characteristics, researchers typically rely on high-temporal-resolution light-field image sequences containing thousands to tens of thousands of consecutive frames. Efficient and accurate 3D particle field reconstruction from such massive sequential data is a fundamental prerequisite for time-resolved flow measurements, dynamic flow analysis, and transient feature characterization. It also determines the practical applicability of light-field Tomo-PIV for long-term monitoring and dynamic testing of complex unsteady flows in engineering scenarios. Therefore, exploring multiframe reconstruction techniques is of great theoretical value and practical engineering significance for expanding the application scope of light-field Tomo-PIV.

Existing bottlenecks

Currently, research on reconstructing 3D particle fields from large-scale multiframe light-field images with traditional iterative algorithms remains very limited in light-field Tomo-PIV. Conventional tomographic iterative methods face significant technical bottlenecks when processing high-temporal-resolution image sequences. First, these algorithms perform independent iterative reconstruction frame by frame, without reusing effective computational results from adjacent frames, which leads to substantial redundant computation. Second, they ignore the strong spatiotemporal correlation and data redundancy between consecutive frames of particle images, and therefore fail to exploit the inherent temporal characteristics of particle motion. Third, the high-dimensional voxel projections and matrix operations involved in tomographic reconstruction incur extremely high computational overhead.

Combined with the above factors, reconstructing long light-field image sequences using traditional iterative algorithms may require computational times ranging from tens to thousands of hours in practical applications. Such low computational efficiency makes traditional iterative algorithms unsuitable for real-time measurement, online data processing, and dynamic analysis of unsteady flows. This severely restricts the development and popularization of time-resolved light-field Tomo-PIV.

Feasible solutions and implementation approaches

To overcome the above limitations and promote the application of multiframe reconstruction in light-field Tomo-PIV, targeted technical improvements and algorithmic optimizations are proposed from multiple perspectives.

First, dedicated fast reconstruction algorithms should be developed for successive light-field image sequences. A sequential computing framework should enable unified processing of multiple frames, rather than independent frame-by-frame iterative reconstruction. Second, spatiotemporal joint reconstruction strategies should be introduced to optimize existing iterative algorithms in light-field Tomo-PIV. Temporal regularization terms can be incorporated into volumetric particle field reconstruction to exploit the temporal coherence between adjacent frames [53,54]. This may improve reconstruction efficiency by reducing redundant iterations and reusing intermediate computational results. Third, the spatiotemporal redundancy in sequential light-field images and the continuity of particle motion should be fully exploited. Doing so could improve reconstruction stability and reduce the random noise and ghost particles that arise in single-frame reconstruction. Finally, the Shake-the-Box (STB) method is an advanced particle tracking technique used in traditional Tomo-PIV, but it has not yet been extended to light-field Tomo-PIV [55]. Extending STB to 3D particle reconstruction in light-field Tomo-PIV is therefore a recommended direction.

The above optimization strategies primarily aim to improve the efficiency of sequential 3D particle field reconstruction without compromising reconstruction accuracy and spatial resolution. Such improvements are essential for time-resolved light-field Tomo-PIV measurements and dynamic analyses of unsteady flows.
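One way to picture temporal regularization is as an extra penalty that pulls the current volume toward the previous frame. The sketch below is a minimal illustration under stated assumptions: it minimizes 0.5*||W E - I||^2 + 0.5*lam*||E - E_prev||^2 with non-negativity by projected gradient descent (a Landweber-type scheme swapped in for simplicity, not MART and not the methods of the cited studies). The function name, `lam`, and the toy system are hypothetical, and a real sequence would need a motion-compensated `E_prev`.

```python
import numpy as np

def reconstruct_with_temporal_prior(W, I, E_prev, lam=0.1, n_iter=500):
    """Projected gradient descent with a quadratic penalty toward E_prev.

    W: (n_pixels, n_voxels) weights; I: (n_pixels,) intensities of the current
    frame; E_prev: (n_voxels,) volume from the previous frame.
    Assumes W is not identically zero, so the step size is finite.
    """
    W = np.asarray(W, dtype=float)
    I = np.asarray(I, dtype=float)
    E_prev = np.asarray(E_prev, dtype=float)
    # Step size 1/L, where L bounds the curvature of the objective.
    step = 1.0 / (np.linalg.norm(W, 2) ** 2 + lam)
    E = np.zeros(W.shape[1])
    for _ in range(n_iter):
        grad = W.T @ (W @ E - I) + lam * (E - E_prev)
        E = np.maximum(E - step * grad, 0.0)  # keep intensities non-negative
    return E

# Toy underdetermined system: 2 pixels, 3 voxels, particle in the middle voxel.
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
E_prev = np.array([0.0, 1.0, 0.0])
I = W @ E_prev  # the particle has not moved between the two frames
E_plain = reconstruct_with_temporal_prior(W, I, E_prev, lam=0.0)
E_temporal = reconstruct_with_temporal_prior(W, I, E_prev, lam=0.1)
```

Both results reproduce the two pixel values, but without the penalty the intensity is spread over all three voxels, whereas the temporal term selects the solution consistent with the previous frame. This is the sense in which temporal coherence can reduce ambiguity in a single-frame reconstruction.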

4.2. Deep Learning

At present, the training datasets for 3D U-Net and LF-DNN are derived solely from forward ray-tracing simulations, without incorporating real light-field measurements. A domain gap therefore exists between simulated data and real experimental light-field images acquired using a single light-field camera. When a model trained on synthetic data is directly transferred to experimental scenarios, light-field imaging noise, main lens aberrations, MLA aberrations, and uneven illumination in real scenes will degrade reconstruction performance. Achieving effective domain adaptation between simulated and experimental data remains one of the key challenges for future research.

The performance of both 3D U-Net and LF-DNN is strongly dependent on the specific optical setup of the light-field camera, including microlens parameters, focal length of the main lens and MLA, F-number matching, and calibration accuracy. Once the optical layout or light-field camera parameters change, the pre-trained model cannot be directly applied and must be retrained or fine-tuned.

4.2.1. Training Time of the Neural Network

Research significance

Dataset scale and training efficiency play vital roles in the development and application of deep learning-based reconstruction models for light-field PIV. An appropriate dataset enables networks to fully learn particle field features and ensures reliable generalization and reconstruction accuracy. Reducing training time can accelerate model optimization and algorithm iteration. This is essential for translating deep learning techniques from laboratory research into practical engineering applications. Therefore, finding a good balance between dataset size and training cost is of great practical significance for light-field Tomo-PIV.

Existing bottlenecks

The size of the light-field dataset is among the critical factors influencing the training time of deep learning models. A larger light-field dataset typically requires more iterations and computational resources for the model to converge, leading to a significant increase in training time. Conversely, while a smaller light-field dataset allows for faster training, it may result in underfitting or poor generalization, as the model fails to adequately learn the feature distribution inherent in the light-field data.

In practice, in addition to light-field dataset size, several other factors affect the overall training duration. These include batch size, the number of network layers and parameters, hardware configuration (e.g., GPU/TPU performance), the choice of optimizer, and the learning rate schedule. For instance, in LF-DNN, training requires approximately 50 h on a CPU but only approximately 15 h on a GPU. Currently, hyperparameters are mostly set empirically, lacking unified optimization standards. Models are highly dependent on high-end computing hardware, and training efficiency drops sharply under ordinary computing conditions. It remains difficult to maximize model performance while controlling training time.

Feasible solutions and implementation approaches

To solve these problems, a series of feasible approaches are put forward. First, data augmentation should be used to enrich sample diversity and improve model generalization without increasing the size of the raw light-field dataset. Second, hyperparameters should be optimized, including batch size, optimizer, and learning rate, to accelerate model convergence. Third, network structures should be simplified to reduce parameters and computational complexity. Fourth, GPU acceleration and distributed training should be utilized to improve computing efficiency. These methods can effectively balance light-field dataset scale and training efficiency, ensure model performance, and accelerate the deployment of deep learning models in practical light-field Tomo-PIV measurement applications.

4.2.2. Creation of the Dataset

Research significance

The availability and quality of training datasets are key factors affecting the prediction accuracy and generalization performance of deep learning reconstruction models. In the field of light-field Tomo-PIV, all existing deep learning methods, including 3D U-Net and LF-DNN, rely on light-field datasets generated by numerical simulations. Standardized, high-quality datasets enable neural networks to fully learn the inherent mapping between light-field images and 3D particle fields. This helps ensure stable reconstruction accuracy under diverse operating conditions.

Moreover, the preprocessing workflow and dataset input format directly affect the overall deployment complexity and application scope of deep learning algorithms. At present, different models adopt independent preprocessing pipelines: 3D U-Net takes digitally refocused image stacks as input, while LF-DNN uses extracted perspective-shifted sub-aperture images. Simplifying data preprocessing and enabling end-to-end learning from raw light-field image can streamline model deployment and lower the barrier for practical applications. This helps promote the adoption of deep learning techniques in light-field Tomo-PIV measurement. Optimizing light-field dataset construction, standardizing preprocessing procedures, and exploring direct raw light-field data input are crucial for deep learning-based particle field reconstruction. They provide important theoretical and practical support for its long-term development.

Existing bottlenecks

The availability of light-field datasets is one of the key factors influencing the accuracy of deep learning predictions. In current deep learning research, numerical simulation is commonly used to generate input–output datasets. For both the 3D U-Net and LF-DNN models, the first step is to generate a light-field image using the forward ray-tracing technique. Currently, deep learning requires preprocessing of light-field images to generate training input–output pairs. The 3D U-Net requires the light-field image to be preprocessed into a refocused image stack. This refocused image stack is generated using a digital refocusing algorithm and serves as the input for the 3D U-Net. For the LF-DNN model, the light-field image must be processed to extract perspective-shifted images, which are then converted into a series of sub-aperture images as network input.

Several prominent bottlenecks exist in current dataset construction and data processing. First, all datasets are derived purely from ray-tracing simulations, leading to a non-negligible domain gap between synthetic and experimental data. This gap is mainly caused by main lens distortion, MLA distortion, uneven illumination, and complex background noise, and it degrades model performance when migrating from simulation to actual light-field Tomo-PIV measurement scenarios. Second, different models adopt customized data preprocessing pipelines without unified standards, which leads to poor dataset compatibility and hinders fair comparisons of algorithm performance. Third, it remains extremely challenging to enable neural networks to directly accept raw light-field images as input. Raw light-field data contain complex multi-directional optical information and redundant pixel information, which increases the difficulty of feature extraction and network training. In addition, the community still lacks publicly available, standardized multi-condition light-field datasets, which limits research reproducibility and hinders the development of new algorithms.

Feasible solutions and implementation approaches

To address the above problems, targeted solutions for light-field dataset construction and data input optimization are proposed. First, simulated light-field datasets can be enhanced by introducing realistic imaging interference factors. Main lens distortion, MLA distortion, optical aberrations, random noise, particle overlap, and non-uniform illumination should be embedded into ray-tracing simulations to reduce the domain gap between synthetic and experimental light-field data. Second, unified dataset generation and preprocessing standards should be established. The rules for ray-tracing parameter settings, data formats, and preprocessing workflows for 3D U-Net and LF-DNN should be standardized to improve dataset compatibility and enable fair cross-model comparison. Third, network structures and feature extraction modules should be redesigned. The front-end convolution and feature fusion layers should be optimized to adapt to the characteristics of raw light-field images. Meanwhile, a technical route for using raw light-field images as direct network input should be gradually explored to eliminate redundant preprocessing steps. Fourth, open-source benchmark datasets covering different particle concentrations, light-field camera parameters, and noise levels should be developed and released to provide a unified platform for the research community.
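As a rough illustration of the first point, the sketch below degrades a clean simulated image with three of the listed factors: non-uniform illumination, photon shot noise, and additive read noise. It is only a post-processing approximation under assumed models (a radial quadratic illumination falloff, Poisson shot noise, Gaussian read noise). Lens and MLA distortion, which would have to be built into the ray-tracing step itself, are not modeled, and all names and default values are hypothetical.

```python
import numpy as np

def degrade_light_field_image(clean, rng, vignette_strength=0.3,
                              photons_at_full_scale=500.0, read_noise_std=0.01):
    """Apply illumination falloff, shot noise, and read noise to a clean image.

    clean: 2D array with intensities in [0, 1] (at least 2 pixels per side);
    rng: numpy.random.Generator.
    """
    clean = np.clip(np.asarray(clean, dtype=float), 0.0, 1.0)
    h, w = clean.shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Squared radius normalized so that the image corners have r2 = 1.
    r2 = ((x - cx) ** 2 + (y - cy) ** 2) / (cx ** 2 + cy ** 2)
    illumination = 1.0 - vignette_strength * r2  # assumed quadratic falloff
    img = clean * illumination
    # Shot noise: Poisson statistics on the expected photon count per pixel.
    img = rng.poisson(img * photons_at_full_scale) / photons_at_full_scale
    # Read noise: additive zero-mean Gaussian.
    img = img + rng.normal(0.0, read_noise_std, size=img.shape)
    return np.clip(img, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[10:13, 20:23] = 1.0  # a hypothetical particle image
noisy = degrade_light_field_image(clean, rng)
```

Passing a seeded generator keeps the degraded dataset reproducible, which matters if such images are to be shared as a benchmark.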

4.2.3. Interpretability

Research significance

The interpretability of deep learning models has become a key research topic in 3D particle field reconstruction using light-field Tomo-PIV. Current convolutional neural networks, including 3D U-Net and LF-DNN, are capable of delivering high-precision reconstruction results, yet they are generally regarded as black-box models. It is difficult to explicitly interpret the internal feature extraction process and the physical mapping from the input light-field image to the output 3D particle field.

Model interpretability is closely associated with the credibility of reconstruction results, error analysis, and the practical deployment of measurement techniques. In light-field Tomo-PIV measurement, researchers need to clarify how network parameters and optical imaging conditions affect reconstruction performance, so as to accurately analyze error sources and optimize experimental schemes. Poor interpretability not only limits the application of deep learning models in high-precision scientific measurement and rigorous flow mechanism research but also hinders the integration of data-driven networks with traditional physical reconstruction models. Therefore, improving the interpretability of deep learning-based reconstruction models is a necessary requirement for standardizing measurement techniques, analyzing reconstruction errors, and expanding the engineering application scope of light-field Tomo-PIV. It is also a core premise for realizing the deep integration of deep learning and physical imaging theories.

Existing bottlenecks

At present, deep learning-based reconstruction models for light-field Tomo-PIV still face prominent bottlenecks in interpretability research. First, most existing studies focus on optimizing reconstruction accuracy and computational efficiency, while rarely exploring the internal mechanisms of the network. The feature transformation process from a 2D light-field image to a 3D particle field cannot be quantitatively analyzed, leaving the causal relationships between network architecture, hyperparameters, and reconstruction performance unclear. Second, there is a lack of effective methods to isolate and quantify the effects of different interference factors (such as main lens distortion, MLA distortion, noise, and particle overlap) on network predictions. When reconstruction errors occur, it is difficult to quickly determine whether they arise from the imaging system, the light-field dataset, or network defects, which greatly increases the difficulty of model debugging and performance optimization. Third, incorporating physical prior knowledge into deep learning networks remains a major challenge. Owing to the limited interpretability of black-box models, it is difficult to embed ray-tracing principles, tomographic projection constraints, and other physical constraints into network training, which prevents physical models and data-driven methods from complementing each other. In addition, unified evaluation criteria and analytical frameworks for the interpretability of reconstruction-oriented neural networks are still limited in this field, and a coherent research framework remains to be established.

Feasible solutions and implementation approaches

To address the above challenges and improve the interpretability of deep learning-based reconstruction models in light-field Tomo-PIV, multiple targeted technical approaches are proposed. First, a network visualization analysis method should be adopted. Feature map visualization, class activation mapping, and other methods should be employed to analyze feature extraction regions in each network layer and to reveal how the network perceives particle features, optical noise, and background information in light-field images. This helps clarify the data transmission process and the feature evolution patterns within the model. Second, attention mechanisms should be introduced into the network architecture. Task-oriented attention modules should be designed to enable the network to focus on effective particle signals and suppress irrelevant interference. Meanwhile, the weight distribution of the attention units should be analyzed to quantify the contributions of different light-field image regions to the final reconstruction result. Third, physical priors should be embedded into the network design and training. The forward ray-tracing model and tomographic projection principles can be combined to construct physics-constrained loss functions, ensuring that the network learning process is consistent with physical optical principles. This not only improves reconstruction accuracy but also makes model behavior more predictable and interpretable. Fourth, systematic ablation experiments should be conducted. The network layers, convolutional parameters, and training hyperparameters should be gradually adjusted to investigate their influence on reconstruction quality, particle elongation, and noise tolerance. The results can then be used to derive design rules for light-field reconstruction networks. Fifth, a complete evaluation system should be established for model interpretability. Quantitative metrics should be developed to evaluate the transparency and stability of the reconstruction network, thereby establishing a universal analysis framework for interpretability research in light-field Tomo-PIV.

In summary, the development of interpretable, verifiable, and stable deep learning methods is an urgent research priority. The above strategies can effectively break the limitations of black-box models, promote the integration of deep learning and physical theories, and lay a solid foundation for the reliable and large-scale application of deep learning techniques in light-field Tomo-PIV.

In addition to existing research directions, physics-guided hybrid reconstruction methods and multiframe spatiotemporal reconstruction are promising trends for light-field Tomo-PIV. Unlike purely data-driven deep learning methods, hybrid reconstruction combines an analytical forward imaging model of the light field with deep neural networks to improve reconstruction performance. Incorporating the forward ray-tracing principle, optical geometric constraints, and physical priors into network training can improve model interpretability, reduce the domain gap between synthetic and experimental data, and alleviate the requirement for large-scale labeled datasets.
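A physics-constrained loss of the kind described above can be written as a supervised term plus a reprojection term. The following is a minimal NumPy sketch under stated assumptions: the forward imaging model is taken to be a linear weight matrix `W` mapping voxel intensities to pixel intensities, both terms are mean squared errors, and `lam` balances them. These choices and all names are hypothetical. An actual training loop would use a differentiable framework instead of NumPy, and the published 3D U-Net and LF-DNN losses may differ.

```python
import numpy as np

def physics_constrained_loss(E_pred, E_true, W, I_meas, lam=0.1):
    """Supervised voxel error plus a forward-model consistency penalty.

    E_pred, E_true: predicted and reference voxel intensities (any shape);
    W: (n_pixels, n_voxels) assumed linear forward model;
    I_meas: measured (or simulated) pixel intensities.
    """
    E_pred = np.asarray(E_pred, dtype=float).ravel()
    E_true = np.asarray(E_true, dtype=float).ravel()
    I_meas = np.asarray(I_meas, dtype=float).ravel()
    data_term = np.mean((E_pred - E_true) ** 2)
    # Reprojection residual: does the predicted volume explain the image?
    physics_term = np.mean((W @ E_pred - I_meas) ** 2)
    return float(data_term + lam * physics_term)

# Toy check with 2 pixels and 3 voxels.
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
E_true = np.array([0.0, 1.0, 0.0])
I_meas = W @ E_true
loss_at_truth = physics_constrained_loss(E_true, E_true, W, I_meas)
loss_overbright = physics_constrained_loss(2.0 * E_true, E_true, W, I_meas, lam=1.0)
```

Because the reprojection term needs only the measured image and the forward model, it can in principle also be evaluated on experimental images that lack ground truth, which is one reason such terms are attractive for narrowing the synthetic-to-experimental gap.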

4.2.4. Time-Resolved Flow Measurement

Research significance

Time-resolved light-field Tomo-PIV measurement relies heavily on multiframe reconstruction to capture transient and unsteady flow features. Spatiotemporal constraints can reduce noise and ghost particles in sequential results. Sequence networks and optical flow methods make full use of particle motion information and improve reconstruction performance under high-particle-concentration conditions. The Lagrangian–Eulerian hybrid scheme unifies particle reconstruction and tracking, supporting comprehensive flow analysis. These techniques are essential to promote the application of light-field Tomo-PIV in dynamic flow experiments.

Existing bottlenecks

Most current algorithms in light-field Tomo-PIV are designed for single-frame reconstruction and process light-field image frames independently. They fail to take advantage of inter-frame temporal correlation, leading to heavy repeated computation and low efficiency for long sequences. Noise and ghost particles accumulate continuously during multiframe processing. In addition, reconstruction and particle tracking are separated in most studies, and there is a lack of effective motion priors for handling high-concentration particle fields, which hinders high-quality time-resolved light-field Tomo-PIV measurement.

Feasible solutions and implementation approaches

To solve these problems, several approaches are proposed for light-field Tomo-PIV. First, temporal regularization and spatiotemporal joint reconstruction are promising methods for suppressing cumulative noise and ghost particles. Second, RNNs and Transformers offer promising approaches for extracting temporal features of particle motion. Third, optical flow algorithms should be introduced to provide motion priors and boost reconstruction performance under high-particle-concentration conditions. Fourth, a Lagrangian–Eulerian hybrid framework should be built to combine reconstruction and particle tracking. These methods could help overcome current technical limitations and facilitate the further development of light-field Tomo-PIV for time-resolved flow measurement.

Funding

This research was funded by the National Natural Science Foundation of China (No. 12302370).

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors wish to express their gratitude to the National Natural Science Foundation of China (No. 12302370).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

Cao, L.X.; Zhang, B.; Li, J.; Song, X.; Tang, Z.; Xu, C. Characteristics of tomographic reconstruction of light-field Tomo-PIV. Opt. Commun. 2019, 442, 132–147. [Google Scholar] [CrossRef]
Georgiev, T.G.; Lumsdaine, A. Reducing plenoptic camera artifacts. Comput. Graph. Forum 2010, 29, 1955–1968. [Google Scholar] [CrossRef]
Lumsdaine, A.; Georgiev, T.G. Full Resolution Light Field Rendering; Technical Report; Indiana University and Adobe Systems: San Jose, CA, USA, 2008; pp. 1–12. [Google Scholar]
Lynch, K.P.; Fahringe, T.W.; Thurow, B.S. Three-dimensional particle image velocimetry using a plenoptic camera. In Proceedings of the 50th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2012; pp. 1–14. [Google Scholar]
Fahringer, T.W.; Lynch, K.P.; Thurow, B.S. Volumetric particle image velocimetry with a single plenoptic camera. Meas. Sci. Technol. 2015, 26, 115201. [Google Scholar] [CrossRef]
Shi, S.X.; Ding, J.F.; New, T.H.; Soria, J. Light-field camera-based 3D volumetric particle image velocimetry with dense ray tracing reconstruction technique. Exp. Fluids 2017, 58, 78. [Google Scholar] [CrossRef]
Cao, L.X.; Zhang, B.; Hossain, M.M.; Li, J.; Xu, C.L. Tomographic reconstruction of light field PIV based on backward ray-tracing technique. Meas. Sci. Technol. 2020, 32, 044007. [Google Scholar]
Zhu, X.Y.; Wu, Z.A.; Li, J.; Zhang, B.; Xu, C.L. A pre-recognition SART algorithm for the volumetric reconstruction of the light field PIV. Opt. Lasers Eng. 2021, 143, 106625. [Google Scholar] [CrossRef]
Bolton, J.T.; Thurow, B.S.; Alvi, F.S.; Arora, N. Single Camera 3D Measurement of a Shock Wave-Turbulent Boundary Layer Interaction. In Proceedings of the AIAA Aerospace Sciences Meeting; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2017; pp. 1–16. [Google Scholar]
Ding, J.F.; Xu, S.M.; Zhao, Z.; Shi, S.X.; Kaufmann, R.; Ganapathisubramani, B. Volumetric Measurement of Synthetic Jet Impingement with Single-camera Light-field PIV. In Proceedings of the 13th International Symposium on Particle Image Velocimetry-ISPIV; Universität der Bundeswehr München: Neubiberg, Germany, 2019; pp. 22–24. [Google Scholar]
Zhao, Z.; Buchner, A.J.; Ding, J.F.; Shi, S.; Atkinson, C.H.; Soria, J. Volumetric Measurements of a Self-similar Adverse Pressure Gradient Turbulent Boundary Layer Using Single-camera Light-field Particle Image Velocimetry. Exp. Fluids 2019, 60, 141. [Google Scholar] [CrossRef]
Ding, J.F.; Lim, H.D.; Sheikh, S.; Xu, S.; Shi, S.; New, T.H. Volumetric Measurement of a Supersonic Jet with Single-Camera Light-Field PIV. In Proceedings of the 19th International Symposium on the Application of Laser and Imaging Techniques to Fluid Mechanics; LISBON Symposia: Lisboa, Portugal, 2018; pp. 1–10.
Chen, L.; Lei, G.; Wang, T.X.; Xu, C.L. Improved blur circle detection method for geometric calibration of multifocus light field cameras. Opt. Eng. 2022, 61, 093101.
Hall, E.M.; Fahringer, T.W.; Guildenbecher, D.R.; Thurow, B.S. Volumetric calibration of a plenoptic camera. Appl. Opt. 2018, 57, 914–923.
Shi, S.X.; Ding, J.F.; New, T.H.; Liu, Y.; Zhang, H. Volumetric calibration enhancements for single-camera light-field PIV. Exp. Fluids 2019, 60, 21.
Zhao, Y.Y.; Li, H.T.; Mei, D.; Shi, S.X. Metric calibration of unfocused plenoptic cameras for three-dimensional shape measurement. Opt. Eng. 2020, 59, 073104.
Zhu, X.Y.; Xu, C.L.; Che, Z.Z.; Zhang, L. Calibration refinement for light field particle image velocimetry through iterative polynomial model tweaking. Phys. Fluids 2026, 38, 013603.
Fahringer, T.W.; Thurow, B.S. Tomographic reconstruction of a 3-D flow field using a plenoptic camera. In Proceedings of the 42nd AIAA Fluid Dynamics Conference and Exhibit; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2012; p. 2826.
Thurow, B.S.; Fahringer, T.W. Recent development of volumetric PIV with a plenoptic camera. In Proceedings of the 10th International Symposium on Particle Image Velocimetry; Delft University of Technology (TU Delft): Delft, The Netherlands, 2013; pp. 1–7.
Ding, J.F.; Wang, J.H.; New, T.H.; Soria, J. Dense ray tracing based reconstruction algorithm for light-field volumetric particle image velocimetry. In Proceedings of the 18th International Symposium on the Application of Laser and Imaging Techniques to Fluid Mechanics, Lisbon, Portugal, 4–7 July 2016.
Zhu, X.Y.; Hossain, M.M.; Li, J.; Zhang, B.; Xu, C. Weight coefficient calculation through equivalent ray tracing method for light field particle image velocimetry. Measurement 2022, 193, 110982.
Zhu, X.Y.; Xu, C.L.; Hossain, M.M.; Li, J.; Zhang, B.; Khoo, B.C. Approach to select optimal cross-correlation parameters for light field particle image velocimetry. Phys. Fluids 2022, 34, 073601.
Elsinga, G.E.; Scarano, F.; Wieneke, B.; van Oudheusden, B.W. Tomographic particle image velocimetry. Exp. Fluids 2006, 41, 933–947.
Shi, S.X.; Wang, J.H.; Ding, J.F.; Zhao, Z.; New, T. Parametric study on light field volumetric particle image velocimetry. Flow Meas. Instrum. 2016, 49, 70–88.
Zhu, X.Y.; Zhang, B.; Li, J.; Xu, C. Volumetric resolution of light field imaging and its effect on the reconstruction of light field PIV. Opt. Commun. 2020, 462, 125263.
Mei, D.; Ding, J.F.; Shi, S.X.; New, T.H.; Soria, J. High resolution volumetric dual-camera light-field PIV. Exp. Fluids 2019, 60, 132.
Zhu, X.Y.; Xu, C.L.; Hossain, M.M.; Khoo, B.C. Fast and accurate flow measurement through dual-camera light field particle image velocimetry and ordered-subset algorithm. Phys. Fluids 2023, 35, 063603.
Zhu, X.Y.; Xu, C.L.; Hossain, M.M.; Khoo, B.C. High-resolution three-dimensional flow measurement through dual-frame light field particle tracking velocimetry. Phys. Fluids 2025, 37, 023617.
Xing, F.; He, X.M.; Wang, D.P.; Tan, H.J. Single camera based dual-view light-field particle imaging velocimetry with isotropic resolution. Opt. Lasers Eng. 2023, 167, 107592.
Ding, Y.; Li, Z.; Chen, Z.; Ji, Y.; Yu, J.; Ye, J. Full-volume 3D fluid flow reconstruction with light field PIV. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8405–8418.
Mei, D.; Wei, Y.; Liu, P.; Yuan, W. Single sensor tomographic particle image velocimetry using kaleidoscopic light field camera. Opt. Lasers Eng. 2024, 180, 108309.
Cao, L.X.; Hossain, M.M.; Li, J.; Xu, C. Three-dimensional particle image velocimetry measurement through three-dimensional U-Net neural network. Phys. Fluids 2024, 36, 16.
Zhu, X.Y.; Fu, M.X.; Xu, C.L.; Hossain, M.; Khoo, B.C. Volumetric reconstruction of flow particles through light field particle image velocimetry and deep neural network. Phys. Fluids 2024, 36, 19.
Adelson, E.H.; Wang, J.Y. A Single Lens Stereo with a Plenoptic Camera. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 99–106.
Ng, R. Digital Light Field Photography. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2006.
Lumsdaine, A.; Georgiev, T. The focused plenoptic camera. In Proceedings of the IEEE International Conference on Computational Photography (ICCP); IEEE: New York, NY, USA, 2009; pp. 1–8.
Tan, Z.P.; Alarcon, R.; Allen, J.; Thurow, B.S.; Moss, A. Development of a high-speed plenoptic imaging system and application to marine biology PIV. Meas. Sci. Technol. 2020, 31, 054005.
Tan, Z.P.; Johnson, K.; Clifford, C.; Thurow, B.S. Development of a modular, high-speed plenoptic-camera for 3D flow-measurement. Opt. Express 2019, 27, 13400–13415.
Perwass, U.; Perwass, C. Digital Imaging System, Plenoptic Optical Device and Image Data Processing Method. US8619177B2, 31 December 2013.
De Silva, C.M.; Baidya, R.; Marusic, I. Enhancing Tomo-PIV reconstruction quality by reducing ghost particles. Meas. Sci. Technol. 2012, 24, 024010.
Worth, N.A.; Nickels, T.B. Acceleration of Tomo-PIV by estimating the initial volume intensity distribution. Exp. Fluids 2008, 45, 847–856.
Novara, M.; Batenburg, K.J.; Scarano, F. Motion tracking-enhanced MART for tomographic PIV. Meas. Sci. Technol. 2010, 21, 035401.
Fahringer, T.W.; Thurow, B.S. Filtered refocusing: A volumetric reconstruction algorithm for plenoptic-PIV. Meas. Sci. Technol. 2016, 27, 094005.
Scarano, F. Tomographic PIV: Principles and practice. Meas. Sci. Technol. 2013, 24, 012001.
Wieneke, B. Volume self-calibration for 3D particle image velocimetry. Exp. Fluids 2008, 45, 549–556.
Raffel, M.; Willert, C.E.; Scarano, F.; Kähler, C.J.; Wereley, S.T.; Kompenhans, J. Particle Image Velocimetry: A Practical Guide; Springer: Berlin/Heidelberg, Germany, 2018.
Atkinson, C.; Soria, J. An efficient simultaneous reconstruction technique for tomographic particle image velocimetry. Exp. Fluids 2009, 47, 553–568.
Shi, S.X.; New, T.H. Development and Application of Light-Field Cameras in Fluid Measurements; Springer: Cham, Switzerland, 2023.
Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–22.
Thiébaut, É. Introduction to image reconstruction and inverse problems. In Optics in Astrophysics; Springer: Dordrecht, The Netherlands, 2005; pp. 1–397.
Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In Proceedings of the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention; Lecture Notes in Computer Science Volume 9901; Springer: Berlin/Heidelberg, Germany, 2016; pp. 424–432.
Lasinger, K.; Vogel, C.; Pock, T.; Schindler, K. Variational 3D-PIV with sparse descriptors. Meas. Sci. Technol. 2018, 29, 064010.
Lynch, K.P.; Scarano, F. An efficient and accurate approach to MTE-MART for time-resolved tomographic PIV. Exp. Fluids 2015, 56, 66.
Gao, Q.; Pan, S.W.; Wang, H.P.; Wei, R.; Wang, J. Particle reconstruction of volumetric particle image velocimetry with the strategy of machine learning. Adv. Aerodyn. 2021, 3, 28.
Novara, M.; Schanz, D.; Schröder, A. Two-pulse 3D particle tracking with shake-the-box. Exp. Fluids 2023, 64, 93.

Figure 1. Structures of the light-field cameras.

Figure 2. Schematic of the subimages under each microlens.

Figure 3. Principles of the F-number of the different light-field cameras.

Figure 4. Technical strategy of the light-field Tomo-PIV based on a single light-field camera.

Figure 5. Determination principle of the nonzero voxels in light-field Tomo-PIV.

Figure 6. Principle of the prerecognition.

Figure 7. Structure of 3D U-Net for 3D particle field reconstruction.

Figure 8. Structure of the LF-DNN model for 3D particle field reconstruction.

Table 1. Quantitative comparison of typical particle reconstruction algorithms using a single light-field camera [1,7,8,32,33,48].

Reconstruction Algorithm	Reconstruction Quality (Particle Concentration)	Reconstruction Time	Advantages	Disadvantages
MART	0.35–0.5 (1 ppm)	0.38 h	(1) Relatively fast convergence with a limited number of iterations; (2) Guarantees non-negative reconstructed intensities; (3) Robust and widely validated for volumetric particle reconstruction.	(1) Requires an accurate weighting matrix and substantial computational resources; (2) Reconstruction time increases significantly with volume size and particle concentration; (3) Susceptible to ghost particles and reconstruction artifacts; (4) Reconstructed particles often exhibit noticeable elongation along the depth direction.
EM	≈0.35 (1 ppm)	0.056 h (200 s)	(1) Simple and easy-to-implement iterative framework; (2) Fast convergence and computational efficiency; (3) Less dependent on the precision of the weighting matrix than MART; (4) Stable reconstruction performance under moderate particle densities.	(1) Requires considerable memory for large-scale reconstructions; (2) Highly sensitive to errors in optical calibration and light-field parameters; (3) Reconstructed particles tend to be elongated in depth; (4) Reconstruction quality deteriorates at high particle concentrations.
DRT-MART	≈0.5 (1 ppm)	Not reported	(1) Effectively suppresses ghost particles through direct ray tracing; (2) Produces particles with shorter elongation lengths and better localization accuracy; (3) Achieves high reconstruction fidelity under sparse seeding conditions.	(1) Computational cost increases with particle concentration; (2) Sensitive to camera vibration and calibration errors; (3) Performance may degrade in densely seeded flow fields.
Pre-SART	≈0.325 (1 ppm)	0.375 h (22.5 min)	(1) Faster reconstruction than conventional algebraic iterative methods; (2) Reduced particle elongation and improved particle localization; (3) Suitable for sparse particle fields.	(1) Reconstruction time still increases with particle concentration; (2) Reconstruction quality decreases in highly dense particle fields; (3) Requires accurate preprocessing and system calibration.
3D U-Net	>0.7 (0.1 ppm); ≈0.55 (1 ppm)	(1) Digital refocusing algorithm: ≈140 s (standard plenoptic camera)/≈150 s (Raytrix R29); (2) Training time: ≈25.5 h (GPU)	(1) Reconstruction time is largely independent of particle concentration after training; (2) Significantly reduces particle elongation artifacts; (3) Capable of learning complex nonlinear mappings from light-field data to 3D particle distributions; (4) Provides high reconstruction accuracy at low particle concentrations.	(1) Requires digitally refocused images as input, introducing additional preprocessing cost; (2) Performance strongly depends on the quality and diversity of the training dataset; (3) Generalization to unseen experimental conditions may be limited; (4) Reconstruction quality decreases at high particle concentrations.
LF-DNN	>0.75 (1 ppm)	(1) Reconstruction time: 3.7 s (CPU)/1.2 s (GPU); (2) Training time: ≈50.75 h (CPU)/≈15.92 h (GPU)	(1) Extremely fast reconstruction after training; (2) Reconstruction time is independent of particle concentration; (3) Produces particles with reduced depth elongation; (4) Achieves high reconstruction accuracy while avoiding iterative optimization.	(1) Requires perspective-shift images as input; (2) Demands a large amount of labeled training data; (3) Long training time and high GPU memory consumption; (4) Generalization capability may deteriorate when imaging conditions differ from those used for training; (5) Limited physical interpretability compared with model-based reconstruction methods.
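
As a rough illustration of two entries in Table 1, the sketch below implements the standard multiplicative MART update in plain Python together with a normalized correlation coefficient of the kind commonly used as the reconstruction quality factor in Tomo-PIV. It is a minimal toy under stated assumptions, not the implementation used in any of the cited works: the function names `mart` and `quality`, the 2 × 2 voxel grid, and the binary weighting matrix are hypothetical, and it is assumed (not confirmed by the table itself) that the quality values in Table 1 correspond to this correlation-type metric.

```python
import math


def mart(weights, pixels, n_iter=20, mu=1.0):
    """Toy MART: weights[i][j] is the (hypothetical) contribution of voxel j
    to pixel i, pixels[i] is the recorded pixel intensity."""
    voxels = [1.0] * len(weights[0])  # uniform, strictly positive initial guess
    for _ in range(n_iter):
        for w_row, intensity in zip(weights, pixels):
            # Forward projection of the current volume onto this pixel.
            projection = sum(w * e for w, e in zip(w_row, voxels))
            if projection <= 0.0:
                continue  # nothing left along this line of sight
            ratio = intensity / projection
            # Multiplicative correction: intensities can never become negative.
            for j, w in enumerate(w_row):
                if w > 0.0:
                    voxels[j] *= ratio ** (mu * w)
    return voxels


def quality(reconstructed, truth):
    """Normalized correlation between reconstructed and true intensity fields."""
    num = sum(r * t for r, t in zip(reconstructed, truth))
    den = math.sqrt(sum(r * r for r in reconstructed) * sum(t * t for t in truth))
    return num / den if den > 0.0 else 0.0


# Hypothetical 2 x 2 voxel grid (v0 v1 / v2 v3) seen along rows and columns.
W = [
    [1.0, 1.0, 0.0, 0.0],  # row 0
    [0.0, 0.0, 1.0, 1.0],  # row 1
    [1.0, 0.0, 1.0, 0.0],  # column 0
    [0.0, 1.0, 0.0, 1.0],  # column 1
]

truth_diag = [1.0, 0.0, 0.0, 1.0]  # two particles on the diagonal
pixels_diag = [sum(w * t for w, t in zip(row, truth_diag)) for row in W]
recon_diag = mart(W, pixels_diag)

print(recon_diag)                       # intensity is spread over all four voxels
print(quality(recon_diag, truth_diag))  # below 1 because of the two ghost voxels
```

In this toy case the two projections cannot distinguish the true diagonal pair from the opposite diagonal, so MART distributes intensity over all four voxels and the quality factor drops below 1. This is a small-scale analogue of the ghost-particle susceptibility listed for MART in the table, while the multiplicative form of the update is what keeps every reconstructed intensity non-negative.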


Share and Cite

MDPI and ACS Style

Cao, L.; Gu, W.; Tian, X. 3D Particle Field Reconstruction for Tomographic Particle Image Velocimetry Based on a Single Light-Field Camera: A Survey. Processes 2026, 14, 2101. https://doi.org/10.3390/pr14132101

