Visual-Feedback-Based Frame-by-Frame Synchronization for 3000 fps Projector–Camera Visual Light Communication †

Abstract: This paper proposes a novel method for synchronizing a high-frame-rate (HFR) camera with an HFR projector, using a visual-feedback-based synchronization algorithm for streaming video sequences in real time on a visible-light-communication (VLC)-based system. The frame rates of the camera and projector are equal, and their phases are synchronized. The visual-feedback-based synchronization algorithm mitigates the complexity and stability issues of wire-based triggering over long distances. The HFR projector projects a binary pattern modulated at 3000 fps. The HFR camera system also operates at 3000 fps; it captures the pattern and generates a delay signal that is applied to the next camera clock cycle so that the camera matches the phase of the HFR projector. To test the synchronization performance, we used an HFR projector–camera-based VLC system in which the proposed synchronization algorithm maximizes bandwidth utilization for high-throughput transmission and efficiently reduces data redundancy. The transmitter of the VLC system encodes the input video sequence into gray code, which is projected via high-definition multimedia interface streaming in the form of 590 × 1060 binary images. At the receiver, a monochrome HFR camera simultaneously captures and decodes 12-bit 512 × 512 images in real time and reconstructs a color video sequence at 60 fps. The efficiency of the visual-feedback-based synchronization algorithm is evaluated by streaming offline and live video sequences, using a VLC system with single and dual projectors, thereby providing a multiple-projector-based system. The results show that the 3000 fps camera was successfully synchronized with a 3000 fps single-projector system and a 1500 fps dual-projector system. It was confirmed that the synchronization algorithm can also be applied to VLC systems, autonomous vehicles, and surveillance applications.


Introduction
Due to the increasing demand for radio-frequency-based wireless communications, an alternative communication system using light as the source has emerged, known as visible-light communication (VLC) [1][2][3]. A light source with frequencies ranging between 400 and 800 THz (wavelengths of 750 and 375 nm, respectively) is used by VLC systems for encoded transmissions. A photodiode detector is used to detect the modulated visible light and decode it to extract the information. With the use of semiconductor light-emitting diodes (LEDs) as a light source that can modulate light at a high speed imperceptible to human vision, a VLC system provides the functionality of both communication and room luminescence [4,5].
surveillance-camera systems. In [54], synchronization was achieved in two stages. During the first stage, the phones were synchronized to the clock of a leader device, using the network time protocol (NTP); during the second stage, all client phone cameras captured a continuous stream that was phase-shifted to achieve better accuracy. The prominent drawback of wireless time synchronization is the nondeterminism of media access time [55]. Another study involved illumination-based, intensity-modulated synchronization, using a 1000 fps camera with a phase-locked-loop algorithm [56]. This algorithm is robust to background light and locks the high camera frame rate to the LED by adjusting the gain parameter to fit the brightness. However, all systems that control the camera clock using visual feedback require an LED as the light source, or they must be controlled using a wired triggering device or NTP. Wired synchronization has higher accuracy, but it is unsuitable for an optical wireless communication system. This paper is an extended version of our previous journal paper [57], in which we proposed a real-time VLC video-streaming system using an HFR projector and camera. Here, we propose a method for synchronizing the HFR camera and projector to resolve inconsistent estimates. This method efficiently reduces data redundancy, mitigates long-term inconsistency issues, and increases the overall bandwidth of the system. A visual-feedback-based control algorithm is used to synchronize the HFR camera with the HFR projector by applying a delay to the camera clock to match the phase of the projector's frame rate. The algorithm generates the delay using the total brightness of an image captured by the HFR camera while a predefined pattern of white and black image sequences is projected. A detailed explanation of the operation of the visual-feedback-based synchronization algorithm is given in Section 2.
The real-time video reconstruction using a VLC system is explained in Section 3. Section 4 provides an evaluation of the performance of the synchronized system via experiments in real time at 60 fps, with the HFR projector and camera both working at 3000 fps. Finally, in Section 5, the conclusions are drawn.

System Configuration
An HFR projector whose projection rate can be controlled is required for synchronization with an HFR camera for high-speed VLC communication. The DLP LightCrafter 4500 (Texas Instruments, Dallas, TX, USA) projector provides a 4000 fps frame rate with bit-plane projection, using 1- to 8-bit projection, and supports a resolution of 912 × 1140. The DLP LightCrafter 4500 is a digital micromirror device (DMD) projection system, in which each pixel is represented by an element of a two-dimensional array of electrically addressable, mechanically tiltable micro-mirrors; such devices are widely used in consumer electronics [58][59][60]. The DLP projector reproduces the signal by modulating the exposure time of the mirrors over a specific operating refresh time based on the projected frame-bit planes. This feature helps with projecting data at the pixel level, and it transforms the image into a pixel-wise bit-plane projection for the VLC system.
The projected images are captured by an extended version of the monochrome FASTCAM SA-X2 (Photron, Tokyo, Japan) HFR camera, which enables real-time 10,000 fps image processing of a megapixel image, using a global electronic shutter with excellent light sensitivity [29]. Figure 1 presents an overview of the FASTCAM SA-X2 with the embedded external board, which was designed to control the delay signal for the camera clock so that the camera outputs images with a resolution of 512 × 512 and a 12-bit dynamic range at 3125 fps in real time. As an external trigger source, an AFG1022 function generator (Tektronix, Beaverton, OR, USA) was used to provide a 3000 Hz square-wave trigger at 3.3 V.


Visual-Feedback-Based Projector-Camera Synchronization
The synchronization method used in our previous paper was software-based. In software synchronization, the HFR camera frame rate is three times the HFR projector frame rate. For example, if the HFR projector frame rate is 1000 fps, then the HFR camera frame rate is 3000 fps. The HFR camera captures three images of a single 1-bit projected image, out of which only the second image, which has better brightness than the other two, is selected. However, the second image is sometimes not in phase with the HFR projector, which results in variation of the image brightness. As a result, we observe inconsistent estimates and data redundancy, and the overall bandwidth of the system decreases. To overcome these limitations of software-based synchronization, phase synchronization of the HFR camera with the HFR projector is required, which can be achieved with the proposed visual-feedback-based projector-camera synchronization system shown in Figure 2. The HFR projector is triggered using one external trigger source (function generator 1), whereas the HFR camera is triggered using another external trigger source (function generator 2). A predefined binary pattern is projected and captured by the HFR camera to calculate the maximum brightness and generate a delay, using a proportional-control-based algorithm. This delay is added to the trigger signal of the HFR camera clock to match the HFR camera and projector phases. As a result, we attain the maximum and minimum brightness by adding a delay to the camera clock of the HFR camera system. To achieve this, the control logic of the timing controller was implemented on an FPGA on the external board attached to the HFR camera, as shown in Figure 3. The sync timing controller accepts the input delay value τ, calculated using the visual-feedback-based algorithm in the software. The value τ is then fed to the sync signal generator as τ_1, with limits between τ_min and τ_max.
The sync signal generator accepts the input clock frequency f from the external trigger source and then adds the delay τ_1 to the input clock to generate the output clock signal, SyncOut. This delayed SyncOut clock signal is given as the SyncIn signal to the HFR camera for triggering image capture, thereby matching the phase of the HFR camera with that of the HFR projector. To calculate the delay value τ for synchronization, the projector continuously projects a predefined binary pattern of '10101010', as shown in Figure 4, at a frame rate F_p with phase P_p, while the camera captures individual frames at a frame rate F_c with phase P_c. The white and black images are represented by 1 for maximum brightness and 0 for minimum brightness, respectively. The ideal case, in which the HFR camera is synchronized with the HFR projector, is shown in Figure 4, where phase P_c is synchronized with phase P_p. There are three cases of phase difference, as shown in Figure 5. In the first two cases (case-1 and case-2), the HFR camera is out of phase with the HFR projector. In case-1, the camera trigger is delayed relative to the projector, whereas in case-2, the camera is triggered before the projector. Thus, projector phase P_p is not equal to camera phase P_c. In case-3, the camera and projector are triggered simultaneously, which is the desired result after synchronization. Therefore, case-1 and case-2 can be synchronized, and the HFR camera can be triggered with the same phase as the HFR projector, by adding a delay τ to F_c, as shown in Figure 6. In case-1 (Figure 6a), there is a delay in triggering F_c; therefore, to match P_c with P_p, we must generate a large delay τ. In case-2, we require only a small delay τ to match P_c with P_p, as F_c is triggered prior to the HFR projector, as shown in Figure 6b.
To calculate the value of τ, we first compute the brightness-based index, R(k) = S(k)/S(k − 1), the ratio of the total brightness of two consecutive frames, where S(k) is the total brightness sum of the input image at the current frame, k, and S(k − 1) is that at the previous frame, k − 1.
Next, we evaluate the error, C(k) = R(k) − R(k − 1), where R(k − 1) is the previous brightness-based index, calculated using Equation (2).
The delay, τ(k), is obtained by proportional control from the delay at the previous frame, τ(k − 1), and the error, C(k), multiplied by a constant proportional gain, K_p: τ(k) = τ(k − 1) + K_p · C(k). The value of τ lies between τ_min and τ_max; τ_max is set to the maximum exposure duration of the camera, and τ_min is zero.
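As a sketch, the per-frame proportional-control update described above can be written as follows; the function name, argument order, and the exact clamping behavior are our illustration, not the authors' FPGA implementation:

```python
# Sketch of the visual-feedback delay update (illustrative names; the paper
# defines R(k) = S(k)/S(k-1), C(k) = R(k) - R(k-1), and a proportional step).

def update_delay(tau_prev, r_prev, s_curr, s_prev, k_p, tau_min, tau_max):
    """One proportional-control step for the camera trigger delay tau."""
    r = s_curr / s_prev                    # brightness-based index R(k)
    c = r - r_prev                         # error C(k)
    tau = tau_prev + k_p * c               # proportional update tau(k)
    tau = min(max(tau, tau_min), tau_max)  # keep tau within [tau_min, tau_max]
    return tau, r
```

When projection and capture are in phase, S(k) alternates steadily between the white and black totals, R(k) settles into a fixed pattern, C(k) tends to zero, and τ stops changing.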

Verification
Considering the above-mentioned synchronization algorithm, experiments were conducted to verify its performance at a high frame rate. The HFR projector alternately projected bit-plane images of black and white patterns through a 24-bit color image. The value of each bit-plane image (black or white) was decided by the 8-bit pixel value of a single channel, which was set to decimal 170 (10101010 in binary). The HFR projector frame rate was set to 3000 fps, and the exposure of each frame was set to the maximum exposure time of 331 µs. The HFR camera was set to 3000 fps, and the exposure time of each frame was set to 1/3015 s (331 µs) to capture the black and white patterns alternately. Because the visual feedback control algorithm depends on the total brightness of an image, the black and white patterns were determined by the maximum and minimum total brightness of the captured image. The output graph of synchronization is shown in Figure 7, where the zoomed portion shows that the total image brightness drops after every 24 bits, owing to the projection of a blank image by the HFR projector between two images. Figure 7 shows that the HFR camera phase, P_c, was initially not in sync with the HFR projector phase, P_p; consequently, the blending of the projected black and white patterns resulted in inaccurate total brightness of the captured images. Therefore, during the initial duration of approximately 6 s, the total brightness information was inaccurate. After this, the synchronization algorithm was executed, and the value of the delay, τ(k), was calculated using S(k), R(k), and C(k). The delay, τ(k), was given as input τ_1 to the sync timing controller on the FPGA of the external board, which generated a SyncOut signal to trigger the HFR camera and match its phase (P_c) with the phase (P_p) of the HFR projector. The proportional gain, K_p, was set to 0.01 to achieve synchronization in a shorter duration with stability.
Figure 7 shows that the calculated delay, τ(k), was 0.177 ms, and the HFR camera-projector system was synchronized in approximately 20 ms. As a result, in the synchronized system, the total brightness of the captured image from the HFR camera increased when a white image was projected and decreased when a black image was projected. HFR camera exposure times of 1/3015, 1/8000, and 1/12,500 s were selected to evaluate the performance of the proposed algorithm, which can work with a very short exposure time of 1/12,500 s and a long exposure time of 1/3015 s, as shown in Figure 8. The projector frame rate was set to 3000 fps with an exposure time of 331 µs, and the bit-plane images of the black and white patterns were projected alternately. The HFR camera frame rate was set to 3000 fps, and the exposure time was kept at 1/3015, 1/8000, and 1/12,500 s under a constant room luminescence of 150 lx. Initially, the HFR camera and projector were not in sync for approximately 3 s, after which the synchronization algorithm was executed to calculate the delay, τ(k), depending on the total brightness of the image. Figure 8 shows that the HFR camera-projector system was synchronized under the different frame exposure times, with different levels of total image brightness. The value of the delay τ(k) was not constant for a particular frame exposure time; it varied depending on whether case-1 or case-2, as discussed above, applied. From this experiment, we can deduce that the system works over a wide range of exposure times, providing flexibility. Figure 8 shows that synchronization is achievable for all selected exposures; the only difference is the total brightness of the image.
Figure 8. Relationship between total brightness and delay when 3000 fps black-and-white projection is captured at different exposures.

Real-Time Video Streaming Using VLC System
In our previous study, we used real-time video streaming with the VLC system [57], which comprised an HFR projector (DLP LightCrafter 4500), an HFR camera (monochrome FASTCAM SA-X2 with an additional external board with an FPGA), and a personal computer (PC), as shown in Figure 9. Figure 10 presents a detailed block diagram of the transmitter and receiver sections. The transmitter section consists of a gray-coded color video sequence and header information that is fed to the HFR projector for bit-plane binary projection. The receiver section consists of a monochrome HFR camera that captures monochrome images sequentially, on which background subtraction is performed to achieve better thresholding of the bit-plane images before combining them. The combined color image is a gray-code image, which is further decoded to pure binary code to reconstruct the original image. With reference to our previous work, we added additional information to the header and introduced a method of background subtraction for each image.

Transmitter
The transmitter encoding system involves three stages, as shown in Figure 11, where the input image, I_t(x, y), is first encoded from pure binary into gray code as I_t^gray(x, y), to which header information, I_h(w, y), is added. Then, the encoded image, I_rgb^gray(m, n), as shown in Equation (5), is fed to the HFR projector for bit-plane projection. The header informs the receiver about the transmitted information and consists of five fields of pixel blocks describing the current image, as shown in Figure 12. The first block, S0, contains all pixel values set to the maximum value of 255 for an 8-bit pixel, which is used to determine the start of a new image. The next five blocks of pixels (F4, F3, F2, F1, and F0) represent a 5-bit frame number, ranging from 0 to 31. Subsequently, two blocks of pixels, C1 and C0, contain 2-bit channel information to represent the red-green-blue (RGB) channels of an image. The next three blocks of pixels, B2, B1, and B0, contain 3-bit information identifying the eight bit planes of a single channel. The C1, C0, B2, B1, and B0 blocks aid in determining the sequence of binary images for reconstruction, whereas the last block, I0, indicates whether the input stream is a webcam stream or a video sequence from two different PCs.
Figure 12. Header information.
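The pure-binary-to-gray-code conversion of the first stage can be sketched per 8-bit channel; the function name is our illustration, assuming the standard binary-reflected gray code, gray = b XOR (b >> 1):

```python
import numpy as np

# Illustrative gray-code encoding of an 8-bit image channel, assuming the
# standard binary-reflected formula gray = b ^ (b >> 1).
def to_gray(channel):
    """Encode an 8-bit image channel into its gray-code representation."""
    b = channel.astype(np.uint8)
    return b ^ (b >> 1)
```

Gray coding guarantees that consecutive intensity values differ in exactly one bit plane, which limits the visual artifacts caused by a single mis-thresholded bit at the receiver.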
After combining the gray-code image with the respective header information, the image is fed to the HFR projector, where spatio-temporal projection is achieved by decomposing the packed 24-bit I_rgb^gray(m, n) image into its equivalent twenty-four 1-bit binary images. The total exposure duration of all patterns in the projection sequence should be less than or equal to the vsync duration; a blank sequence is introduced to complete the vsync exposure time.
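The 24-bit-to-bit-plane decomposition can be sketched as follows; the channel and plane ordering (R, G, B, most significant bit first) is an assumption for illustration, as the paper does not specify it here:

```python
import numpy as np

# Illustrative decomposition of a packed 24-bit RGB image into twenty-four
# 1-bit binary images; channel/plane ordering is an assumption.
def bit_planes_24(rgb):
    """Split an (H, W, 3) uint8 image into a list of 24 binary planes."""
    planes = []
    for ch in range(3):                  # R, G, B channels in turn
        for bit in range(7, -1, -1):     # MSB first within each channel
            planes.append((rgb[:, :, ch] >> bit) & 1)
    return planes
```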

Receiver
The projected bit-plane images were captured by a monochrome HFR camera to reconstruct the transmitted 24-bit RGB image. The operation of the receiver section is shown in detail in Figure 13, where the captured images, C_in(u, v), were collected sequentially according to the header information and reconstructed into a gray-code image, C_RGB(u, v), which was converted back to pure binary code, I_RGB(u, v), to retrieve the transmitted image, I_t(x, y). The HFR camera was synchronized with the HFR projector, using the visual-feedback-based control algorithm. To retrieve the bit information accurately for each projected bit plane, a background subtraction method was used to eliminate the noise introduced by ambient light on the projector screen in an indoor office environment. A thresholding method was used for background subtraction, in which a reference image was subtracted from the input image. The reference image was estimated using the global thresholding method by projecting the maximum and minimum intensities through the HFR projector onto the screen. The threshold value, thr(m, n), at (m, n) was calculated using Equation (6), where B(m, n) is the pixel value at (m, n) of C_in(u, v) captured after projecting the maximum brightness, and D(m, n) is the pixel value at (m, n) of C_in(u, v) captured after projecting a black image.
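A sketch of this per-pixel thresholding follows; since Equation (6) is not reproduced here, taking the midpoint of the bright reference B and the dark reference D is our assumption, consistent with the description of projecting maximum and minimum intensities:

```python
import numpy as np

# Sketch of per-pixel reference thresholding; the midpoint of the bright
# reference B and dark reference D is our assumption for Equation (6).
def binarize(c_in, bright_ref, dark_ref):
    """Threshold a captured frame against per-pixel reference images."""
    thr = (bright_ref.astype(np.float32) + dark_ref.astype(np.float32)) / 2.0
    return (c_in.astype(np.float32) > thr).astype(np.uint8)
```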
The robustness offered by the background subtraction method is discussed in our previous work. We also introduced an additional background subtraction method in which the reference image is updated for each channel to maximize the efficiency of background subtraction. The bit-plane decomposition of an 8-bit image is represented as eight 1-bit binary images, with the higher bit planes containing the more significant visual information and the lower bit planes containing finer details. However, the intensity of the lowest bit hardly changes; therefore, we used the lowest bit of each channel to update the reference image, as shown in Figure 14.
Figure 14. Background subtraction method.
Let the 1-bit image of the projected bit-plane image captured by the HFR camera be C_in(u, v), and let the reconstructed 8-bit images of the three channels be combined to form a single 24-bit RGB color image, C_RGB(u, v). The C_RGB(u, v) image is an encoded gray-code image that is further decoded to a pure-binary-code image at the pixel level to obtain the reconstructed RGB color image, I_RGB(u, v).
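The gray-to-binary decoding step can be sketched as the standard prefix-XOR inverse of gray = b XOR (b >> 1); the function name is our illustration:

```python
import numpy as np

# Illustrative gray-to-binary decoding of an 8-bit channel via prefix XOR,
# the standard inverse of gray = b ^ (b >> 1).
def from_gray(gray):
    """Decode a gray-coded 8-bit image channel back to pure binary."""
    b = gray.astype(np.uint8)
    for shift in (4, 2, 1):    # prefix XOR over all 8 bits
        b = b ^ (b >> shift)
    return b
```

Encoding followed by decoding round-trips every 8-bit value, so the reconstructed channel is bit-exact when all bit planes are thresholded correctly.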

Evaluation Parameter of Image Quality
To assess the quality of the reconstructed images, full-reference objective image-quality metrics, namely the peak signal-to-noise ratio (PSNR) and the multiscale structural similarity index (MS-SSIM) [61], were used. The PSNR compares images over their dynamic range, as expressed in Equation (7), where MSE is the mean-squared error, and MAX_I is the maximum allowable pixel intensity. A higher PSNR indicates better image quality.
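For 8-bit images, the PSNR of Equation (7) can be computed as in the following sketch, using the usual definition PSNR = 10 · log10(MAX_I² / MSE):

```python
import numpy as np

# PSNR in dB between two same-sized images, following Equation (7):
# PSNR = 10 * log10(MAX_I^2 / MSE).
def psnr(reference, test, max_i=255.0):
    """Peak signal-to-noise ratio between a reference and a test image."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")    # identical images
    return 10.0 * np.log10((max_i ** 2) / mse)
```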
The MS-SSIM, as shown in Equation (8), extracts structural information from the field of view based on human visual system (HVS) assumptions. However, it is not very useful for blurred images. The measured score lies between zero and one, where one represents the best image quality.
To evaluate the efficiency of the reconstructed images, the 5-bit frame number in the header was used by assigning a frame number to each input frame, ranging from 1 to 32, thereby creating a packet of 32 frames. The loss is calculated by counting the missing frames within the 32 frames at the receiver. The efficiency of the system was calculated using Equation (9), where F_r is the frame reconstruction efficiency, and S_r represents the frames successfully reconstructed out of the total number of frames, F_t, within one packet of 32 frames. Thus, the image-quality metrics evaluate the quality of the reconstructed images at the receiver, whereas the frame reconstruction efficiency captures the number of frames reconstructed at the receiver and the number lost because of the bandwidth of the system and the luminescence of the HFR projector.
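A minimal sketch of this packet-level bookkeeping follows; the percentage form F_r = (S_r / F_t) · 100 is assumed from the description of Equation (9), and the function name is ours:

```python
# Illustrative frame-reconstruction efficiency for one 32-frame packet;
# F_r = (S_r / F_t) * 100 is assumed from the description of Equation (9).
def frame_reconstruction_efficiency(received_frame_numbers, packet_size=32):
    """Percentage of distinct frames successfully reconstructed in a packet."""
    s_r = len(set(received_frame_numbers))  # successfully reconstructed frames
    return 100.0 * s_r / packet_size
```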

Experiments
To evaluate the performance of the system, we performed various experiments by streaming a saved video and a real-time universal serial bus (USB) camera video and reconstructing them, using the VLC system. The HFR projector streamed 590 × 1080 video at 60 fps, which is a combination of a 590 × 1060 gray-code image and 590 × 20 header information. This combined image is projected in a bit-plane sequence, as shown in Figure 15, where the exposure duration for each bit-plane pattern is 331 µs. Therefore, the time required to project one frame is 24 bits × 331 µs, which is approximately 8000 µs, or 8 ms. Thus, approximately 125 frames can be projected per second using our system; however, owing to the limitations of the HFR projector, we can transmit 590 × 1080 images at a maximum of 60 fps. A 50-mm lens was mounted on the HFR camera, which was set to the same frame rate as the HFR projector (3000 fps). The experimental setup is shown in Figure 16, where the distance between the HFR projector and the screen was 950 mm. The projection display on the screen was 448 × 415 mm. The distance between the HFR camera and the screen was 1130 mm to ensure that the overall area of the projected video on the screen was captured by the camera. For the proposed system, a stored video sequence was used for the single-projector system, and live video streaming from two USB cameras was used to check the synchronization accuracy of the dual-projector system. To check the robustness of the system, the indoor environment was illuminated with luminescence values of 0, 150, and 300 lx, using an external light source. The details of the experimental hardware are specified in Figure 17.
Figure 15. Sequence of bit-plane projection.

Synchronized Real-Time Video Reconstruction
To evaluate the operation of the VLC system after successful synchronization using the visual feedback algorithm, an experiment was performed in which a stored video sequence was streamed in real time. The selected video sequence was from the movie Big Buck Bunny [62]. Initially, the pure-binary-code images of the 24-bit 1920 × 1080 RGB color video sequence were gray-coded and resized to 590 × 1060, alongside the addition of the 590 × 20 header information. The encoded, resized image was projected in bit-plane (binary image) sequences at 3000 fps, and the HFR camera captured 512 × 512 images to reconstruct the output image with a resolution of 510 × 459 by combining all bit planes of the 24-bit RGB image sequentially. Figure 18a shows a high-definition input image of 1920 × 1080 at 60 fps. Figure 18b contains the reconstructed 510 × 459 images, using gray code without background subtraction. Figure 18c depicts the reconstructed 510 × 459 images, using gray code with background subtraction. From the reconstructed images, we can deduce that no artifacts were present when gray-code encoding was used. Next, an experiment was conducted to measure the image quality and performance of the system by sending the saved video at 60 fps, projected at 3000 fps, under different on-screen luminescence conditions of 0, 150, and 300 lx; images were captured at 3000 fps at different exposure times, i.e., 1/3015, 1/8000, and 1/12,500 s. The results of the image-quality analysis of hundreds of reconstructed images with respect to their original images are shown in Figures 19 and 20. The PSNR and MS-SSIM values in Figures 19 and 20 indicate that the image quality was better when captured at an exposure of 1/8000 s than at 1/12,500 or 1/3015 s. Moreover, the background subtraction performed at every frame helped improve the image quality at the various HFR camera exposure times.
Figure 21 shows the performance of the system based on the number of frames reconstructed at the receiver over an observation period of 40 s. The frame reconstruction ratio was almost 100% at 0 lx, whereas at 150 and 300 lx it was nearly 100%, with small losses. The experimental results indicate that the HFR projector and HFR camera were synchronized; otherwise, frame reconstruction would not have been possible, and we would not have been able to reconstruct the video sequence in real time at 60 fps.

Real-Time Video Reconstruction Using Two HFR Projectors
The experimental setup for the two projectors is shown in Figure 22, where the dual projectors are positioned such that their projection areas overlap. The distance between the screen and both HFR projectors was kept the same, at 950 mm, and the HFR camera was set at a distance of 1130 mm with a 50-mm mounted lens. The experimental scene is shown in Figure 22. In this experiment, the input video sequence was streamed from two USB cameras (XIMEA MQ003CG-CM) in 24-bit color with a resolution of 640 × 480 at 60 fps for transmission, and the two cameras were connected to two PCs. The experimental scene comprised a person throwing a football on the floor. HFR projector 2 was set 180° out of phase with respect to HFR projector 1, and both were set to 1500 fps. The bit-plane projection of the image sequence was the same as that in Figure 15. The HFR camera was kept at 3000 fps, that is, double the projection frame rate, so that it captured the images of each projector alternately and reconstructed both videos at 60 fps. Figure 23 shows the two HFR projector input image sequences of 640 × 480 at 60 fps. Figure 23b depicts the reconstructed 510 × 459 images, using gray code without background subtraction. Figure 23c shows the reconstructed 510 × 459 images, using gray code with background subtraction. The sequences were reconstructed alternately from the image sequences projected using HFR projectors 1 and 2. The image quality and performance of the two-projector system were evaluated by projecting a 60 fps video at 3000 fps under different on-screen luminescence conditions, i.e., 0, 150, and 300 lx, and the images were captured at 3000 fps with different exposure times, i.e., 1/3015, 1/8000, and 1/12,500 s. Figures 24-27 show that the PSNR and MS-SSIM values were similar across conditions, with the best result observed for an exposure of 1/3015 s at 0 lx.
However, the MS-SSIM values are more informative than the PSNR values because they better represent the perceived image quality of the system. Figure 28 shows the performance of the frames reconstructed from each projector, reflecting the number of frames reconstructed at the receiver corresponding to each projector. Figure 28 shows that the live USB streaming led to a few frame losses during reconstruction, but these were not significant, and the system could reconstruct the video sequences in real time at 60 fps. The results indicate that multiple-projector synchronization is possible with the HFR camera, and the overall bandwidth of the system was utilized.

Conclusions
In this article, we presented a novel HFR projector-camera synchronization method using a visual feedback algorithm and evaluated its performance by streaming real-time video using the HFR projector-camera-based VLC system. The experimental results show that synchronization can be achieved at a high frame rate and that the system is robust to ambient light and works over a wide range of exposure times. The background subtraction method increased the quality of the reconstructed images under different ambient-light conditions. Real-time video streaming experiments were conducted to evaluate the percentage of frames received at different frame rates and illuminance levels, and it was observed that the frame loss increased slightly with increases in frame rate and illuminance. The bandwidth of the HFR camera and HFR projector system was not fully utilized at 3000 fps when a single-projector system was used, because the system reconstructed a 60 fps streaming video at nearly 60 fps. The dual-projector system therefore proved promising: a full bandwidth of approximately 120 fps was utilized, as the dual-projector system distributed the computational load of one PC across two. Overall, the images reconstructed using the dual projectors had better quality, and the system can be expanded to multiple projectors. The only constraint of the dual-projector system is that HFR projector 2 must be triggered by HFR projector 1.
Author Contributions: All authors contributed to the study design and manuscript preparation. I.I. contributed to the concept of HFR-vision-feedback-based synchronization with an HFR projector for visible-light communication. S.R., K.S. and T.S. designed the high-speed camera-projector system for visible-light communication. A.S. developed the algorithm for visual-feedback-based synchronization of an HFR projector-camera system and evaluated its performance using an HFR projector-camera-based visible-light communication system for real-time video streaming. All authors have read and agreed to the published version of the manuscript.