Image Segmentation and Quantification of Droplet dPCR Based on Thermal Bubble Printing Technology

Thermal inkjet printing can stably generate more than 300,000 picoliter-scale droplets within one second, and an image analysis workflow is used to quantify the positive and negative droplets. In this paper, the SimpleBlobDetector algorithm is used to identify and localize droplets with a volume of 24 pL in bright-field images while suppressing bright spots and scratches during droplet localization. Polynomial surface fitting of the pixel grayscale values of the fluorescence channel image effectively compensates for the image vignetting caused by the optical path, and on the compensated fluorescence image positive and negative droplets can be accurately classified by the k-means clustering algorithm. A 20 µL sample solution in the result reading chip can produce more than 100,000 effective droplets. For 20 randomly sampled images, the effective droplet identification accuracy exceeds 99%, and the classification accuracy of positive and negative droplets exceeds 98% on average. This work overcomes the difficulty of effectively classifying positive and negative droplets in picoliter-scale ddPCR images whose quality is degraded by optical hardware limitations.


Introduction
Digital Polymerase Chain Reaction (dPCR) is a promising third-generation nucleic acid detection technology that is playing an increasingly important role in disease diagnosis, individualized medicine, and genetic testing. dPCR has a lower detection limit than RT-qPCR and is more tolerant to inhibitors, so it can be used to detect SARS-CoV-2 at low concentrations with a reduced incidence of false-negative results [1][2][3]. dPCR technology can be used for accurate detection of fetal aneuploidy [4,5], and cell-free fetal DNA in maternal plasma can be analyzed to detect genetic diseases such as thalassemia and hemophilia [6,7]. dPCR technology can detect tumor cells and DNA fragments released into the patient's blood and body fluids, enabling dynamic monitoring of the patient's disease course [8][9][10]. dPCR dilutes sample DNA molecules and disperses them into different micro reaction units, ultimately leaving only one or a limited number of DNA molecules in each partition, which reduces the competition between target and non-target DNA molecules and improves the sensitivity of the assay [11,12]. Traditional 96/384-well plate analysis cannot meet the requirements of dPCR for the number of reaction partitions, and the microliter scale of the reaction units results in high reagent consumption costs [13]. Micro-processing technology provides an efficient way to generate a large number of microchambers for dPCR with microfluidic chips. Droplet Digital PCR (ddPCR) uses droplets as reaction units and is an ideal dPCR platform, as it is easier to miniaturize the equipment than an integrated microfluidic chip [14][15][16]. The detection sensitivity of droplet digital PCR often depends on the total number of reaction units: the higher the number of reaction units, the better the sensitivity and accuracy of the analysis [17]. In this

High-Throughput Droplet Generation and Acquisition of Image Data
To generate 24 pL droplets, a new droplet generation method is used. ddPCR droplets are generated by a generator controller controlling a thermal bubble inkjet print chip. The droplet-generating chip is a MEMS thermal bubble actuator realized by a semiconductor process [26][27][28][29]. A small volume of sample solution, normally 20 µL, is pipetted into an inkjet print chip, and the print controller generates hundreds of thousands or even millions of droplets at the 24 pL level in a short time and injects them into the manifold, saving reagent costs and achieving high droplet throughput. The droplets are fed into the optical module for image acquisition after 40 cycles of reaction in the thermal cycler. The entire droplet PCR process is shown in Figure 1a.
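As a sanity check on the reported throughput, a 20 µL sample split into 24 pL partitions bounds the achievable droplet count; a quick back-of-the-envelope sketch:

```python
# Upper bound on droplet count from a 20 µL sample split into 24 pL droplets.
sample_volume_pl = 20e-6 / 1e-12   # 20 µL expressed in picoliters -> 2.0e7 pL
droplet_volume_pl = 24.0
max_droplets = sample_volume_pl / droplet_volume_pl
print(int(max_droplets))           # ~833,000 theoretical partitions
```

The paper's figure of more than 100,000 effective droplets is therefore comfortably below the theoretical maximum, leaving room for dead volume and discarded droplets.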
The result reading chip of fixed size is used for CCD camera imaging at a fixed position of the optical system, and each result reading chip yields 100 images. The motion control module of the optical system determines the x and y coordinates of each image position on the result reading chip, moves each region of interest to the specified coordinates, and adjusts the z-axis to the focus distance so that all regions of the result reading chip containing droplets are scanned. The position information of each image is used for imaging under the bright field and all fluorescence channels, and the images are saved for subsequent image analysis. Figure 1b shows the steps of image acquisition by the optical module. The blue box in Figure 1b shows a large picture composed of 100 bright-field images, with one of the 100 bright-field images marked by the red box; the large picture composed of 100 fluorescence channel images has one of them displayed in the green box.
Figure 1. Acquisition of dPCR image data with the high-throughput droplet generation and result reading chip. (a) Flow diagram of a thermal bubble inkjet print head generating dPCR droplets, where the print controller controls the droplet generation chip to generate hundreds of thousands of micro-droplets at the 24 pL level, and the generated droplets are transferred to a result reading chip and subjected to PCR reactions in the PCR instrument. (b) Flow chart of the optical module taking pictures and imaging the result reading chips after the PCR reaction: fixing the CCD camera and moving the result reading chips a small area each time, the optical module images under the bright field and fluorescence field at the corresponding result reading chip position.

Droplet Identification, Localization, and Noise Filtering
Droplets are identified and localized by their outer edges in the bright-field image (Figure 2a). We chose the SimpleBlobDetector feature point detection method provided in OpenCV to segment and locate droplets [30]. The algorithm first sets a series of continuous thresholds to convert the input gray image into multiple binary images; Equation (1) represents this set of thresholds, with range [T1, T2] and step size t. Then the boundary of each binary image is detected to form connected domains, and each boundary center is calculated. All center points are combined and classified into a series of groups (blob features). Finally, the localization and size information of each group in the gray image is determined.
[T1, T1 + t, T1 + 2t, T1 + 3t, ..., T2]    (1)

The fabrication of the result reading chip affects the droplet image quality, reducing the droplet segmentation accuracy and distorting the droplet quantification results. Particle debris, bright spots, and scratches on the result reading chip should be suppressed to improve the accuracy of droplet identification and positioning. Bright spots in bright-field images appear with high gray values and large connected domain areas. In droplet localization and identification, threshold processing together with screening by connected domain area, convexity, and roundness can be used to reject connected areas that do not meet the requirements, suppressing bright spots and scratches [31][32][33][34]. The detected spot area was limited to [500, 2000] pixels, and droplets were screened and bright spots suppressed by setting the roundness and convexity of spots. The roundness of spots is given by Equation (2) and the convexity by Equation (3) [35,36]; both take values in the range 0-1:

Roundness = 4π × Area / Perimeter²    (2)

Convexity = Area / Convex area    (3)

Area, Perimeter, and Convex area denote the area, perimeter, and convex hull area corresponding to each spot.
The droplets identified after screening are shown in Figure 2b, and the bright spots in the green circle are suppressed.
Some background impurities in the green box in Figure 2c are incorrectly identified as droplets. The intensity of the gray value of background impurities in the 16-bit fluorescence image is about 50% of the intensity of the gray value of negative droplets, and the background impurities can be suppressed on the fluorescence image by setting the threshold. The 100 bright-field images and 100 fluorescence images were processed according to the above steps, and we stitched the processed images for easy observation, and the results are shown in Figure S1.
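The impurity suppression step can be sketched as a simple signal threshold. The 0.75 cutoff (sitting between the ~50% impurity level and the negative-droplet level) is an assumed value for illustration, not the paper's parameter:

```python
import numpy as np

def suppress_background(signals, negative_level):
    """Discard detections whose 16-bit fluorescence signal is near the
    background-impurity level (~50% of the negative-droplet gray value),
    keeping real droplets. The 0.75 factor is an illustrative assumption."""
    signals = np.asarray(signals, dtype=float)
    threshold = 0.75 * negative_level   # between impurity (~0.5x) and negative (1.0x)
    return signals[signals >= threshold]

# Hypothetical 16-bit signals: impurities ~5000, negatives ~10000, positives ~30000.
kept = suppress_background([5100, 4900, 10050, 9900, 30500], negative_level=10000)
print(kept)   # impurities removed, droplets kept
```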

Droplet Segmentation in the Fluorescent Image and Signal Quantification
Inherent defects in the optical system and angular deviations in the light received by the CCD camera can affect the uniformity of dPCR fluorescence images and result in image vignetting (the causes of vignetting are explained in Figure S2) [37][38][39]. To reduce the effect of the camera's inhomogeneous light sensitivity on the droplet signal values, the vignetting generated by the optical module is compensated to improve the accuracy of dPCR positive and negative droplet classification. The vignetting compensation steps are as follows:

1.
A Gaussian filter operator is used to blur an image of uniform droplets in the fluorescence channel. The blurred image (Figure 3a) is used as the vignetting correction template. The 3 × 3 Gaussian template g(x, y) is shown in Equation (4); it is multiplied by the constant 1/16 to achieve a weighted average of the pixel values in the 3 × 3 neighborhood [40]:

g(x, y) = (1/16) × [1 2 1; 2 4 2; 1 2 1]    (4)

2.
A quadratic polynomial is chosen as the fitting formula, and the pixel gray value surface of the template image is fitted using the least-squares method; the surface fitting function is shown in Equation (5) [41,42]:

f(x, y) = a0 + a1x + a2y + a3x² + a4xy + a5y²    (5)

The vignetting correction template is sampled to find the surface fitting coefficients, and the fitted surface is drawn as shown in Figure 3b. The maximum value of the fitted surface is divided by the fitted value at each pixel, and the result is normalized to obtain the normalized vignetting compensation model shown in Figure 3c.
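Steps 1-2 can be sketched with plain NumPy: fit a quadratic surface to the template by least squares, then form the compensation model as the surface maximum divided by the fitted value at each pixel. This is a sketch of the technique, not the paper's implementation:

```python
import numpy as np

def fit_quadratic_surface(image):
    """Least-squares fit of f(x, y) = a0 + a1*x + a2*y + a3*x^2 + a4*x*y + a5*y^2
    to the pixel gray values of the (blurred) vignetting template."""
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    A = np.column_stack([np.ones(x.size), x.ravel(), y.ravel(),
                         x.ravel()**2, (x * y).ravel(), y.ravel()**2])
    coef, *_ = np.linalg.lstsq(A, image.ravel().astype(float), rcond=None)
    return (A @ coef).reshape(h, w)

def vignetting_gain(fitted):
    """Compensation model: surface maximum divided by the fitted value at
    each pixel (gain >= 1 toward the darker corners)."""
    return fitted.max() / fitted

# Synthetic template: bright center, darker corners (quadratic fall-off).
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w].astype(float)
template = 1000.0 - 0.1 * (xx - w / 2) ** 2 - 0.1 * (yy - h / 2) ** 2
fitted = fit_quadratic_surface(template)
corrected = template * vignetting_gain(fitted)
print(np.allclose(corrected, corrected.max()))   # vignetting flattened
```

Because the synthetic template is itself an exact quadratic, the fit recovers it and the corrected image is uniform; on real images the correction flattens the surface only up to the fit residual.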
The calculated vignetting compensation model is used to correct the images captured by the camera and eliminate image vignetting. Figure 3d shows the image without vignetting correction; the gray value of the image is counted along the yellow diagonal line in Figure 3d, and the gray value curve is plotted in Figure 3e, showing an overall peak shape in the absence of correction. The image after vignetting correction and its gray value curve are shown in Figure 3f,g; after correction the grayscale curve is flat. Comparing Figure 3e,g, the image vignetting is clearly well compensated. The pixel value surfaces of the images before and after vignetting correction are shown in Figure S3a,b. The average differences between the mean pixel values of the eight edge regions and the mean pixel value of the center region in Figure S4a,b are calculated in the attached Tables S1 and S2; the results are 667 and 71, respectively, showing that the uniformity of the images after vignetting correction is significantly improved. Satellite droplets affect the accuracy of the fluorescence-field droplet signal calculation, so the pixel values of each droplet region are sorted and the signal value of the droplet is taken as the average of the 40-60% portion of the sorted sequence. Figure S5 shows the segmented droplet signal image after vignetting correction, with the satellite droplets marked in the orange box.
The vignetting-corrected images are automatically classified into two classes of positive and negative droplets using the unsupervised k-means clustering algorithm, which achieves the classification mainly by minimizing the intra-cluster variance [43]. The scatter plot of the k-means clustered positive and negative droplet signals is shown in Figure 4a, and the clustering results of positive and negative droplets are labeled in Figure 4b,c, respectively, in the fluorescence channel images, where the positive droplet clustering results are labeled with red circles in Figure 4b.
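The trimmed signal readout and the two-class clustering can be sketched as follows. The NumPy-only 1-D k-means below is an illustrative stand-in for whichever k-means implementation the authors used, and the signal levels are hypothetical:

```python
import numpy as np

def droplet_signal(pixels):
    """Signal value of one droplet: mean of the 40-60% slice of the sorted
    pixel values, which discards satellite-droplet and hot-pixel extremes."""
    s = np.sort(np.asarray(pixels, dtype=float))
    lo, hi = int(0.4 * len(s)), int(0.6 * len(s))
    return s[lo:hi].mean()

def kmeans_1d(signals, iters=50):
    """Two-cluster k-means on 1-D droplet signals (plain NumPy sketch).
    Returns a boolean mask that is True for positive droplets."""
    x = np.asarray(signals, dtype=float)
    centers = np.array([x.min(), x.max()])   # negative / positive seeds
    for _ in range(iters):
        positive = np.abs(x - centers[1]) < np.abs(x - centers[0])
        centers = np.array([x[~positive].mean(), x[positive].mean()])
    return positive

# Hypothetical signals: negatives near 10000, positives near 30000 (16-bit scale).
signals = np.array([10100, 9900, 10050, 30200, 29800, 30100], dtype=float)
print(kmeans_1d(signals))
```

Seeding the two centers at the signal minimum and maximum makes the 1-D case converge in a couple of iterations for well-separated clusters; a poor-separation image would show up as nearly equal cluster centers, which is what the workflow's clustering-center difference check filters out.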


Bright-Field Image Droplet Segmentation Accuracy
The sample solution is prepared, droplets are generated for PCR reactions with the inkjet-based droplet generator, images are acquired using the optical system, and finally the droplets are classified and quantified by the image analysis workflow.
1, 5, 10, and 20 bright-field images were randomly selected from the sample, and three different workflows were applied to count the number of identified droplets and calculate the droplet segmentation accuracy. The image segmentation algorithm was applied to segment the bright-field images and obtain the number of segmented droplets in each image. After marking the outer contours of droplets on the original image, the incorrectly segmented droplets and the droplets not effectively segmented were manually observed and counted (Figure S6 shows both cases; incomplete droplets at the edge of the image are not counted). The number of actually segmented droplets is the number of droplets correctly segmented in the image minus the incorrectly segmented droplets. The total number of droplets in the image is the sum of the actual number of segmented droplets, the number of incorrectly segmented droplets, and the number of droplets not effectively segmented.
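This bookkeeping reduces to a short formula; the counts below are hypothetical and serve only to show the arithmetic:

```python
def segmentation_accuracy(correct, wrong, missed):
    """Bookkeeping from the text: actually segmented droplets = correctly
    segmented minus incorrectly segmented; the total is the actual count
    plus the incorrectly segmented and not-effectively-segmented droplets."""
    actual = correct - wrong
    total = actual + wrong + missed
    return actual / total

# Hypothetical counts for one bright-field image.
print(round(segmentation_accuracy(correct=1000, wrong=5, missed=3), 4))
```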
In this work, we composed four image analysis workflows. Workflow 1 used fixed-threshold screening, Kirsch edge detection operator processing, fixed-threshold image binarization, morphological processing, and area screening; Workflow 2 changed the fixed-threshold image binarization to adaptive-threshold binarization screening and added ellipse detection methods on the basis of Workflow 1; Workflow 3 added the watershed algorithm for secondary segmentation on the basis of Workflow 2. The number of droplets identified in 1, 5, 10, and 20 bright-field images and the accuracy of droplet segmentation were calculated for Workflows 1, 2, and 3, and the results are shown in Table 1. (The specific flow of the three workflows is shown in Figure S7, and the segmentation results in the attached Figure S8.)
Table 1. Accuracy of bright-field image droplet segmentation for the above three segmentation workflows and the workflow with the SimpleBlobDetector algorithm.

The SimpleBlobDetector algorithm was likewise used to count the number of recognized droplets and calculate the droplet segmentation accuracy. Compared with the other three workflows, the accuracy of this bright-field droplet image segmentation algorithm is higher, and the actual correct segmentation recognition rate of droplets in the image reaches more than 99% on average, as shown in Figure 5a. Combined with Figure S8, it can be seen that when droplet imaging in the bright-field image is blurred, the other three workflows leave many droplets unrecognized in areas of the image, and their stability is low. Meanwhile, we also compared our outcome to different workflows and third-party software (Figure S9).

Fluorescence Image Droplet Classification Accuracy
Droplets recognized and localized in the bright-field images (Figure 5a) were used to exclude blank background regions, scratches, and particles for fluorescent image segmentation (Figure 5b). The image analysis workflow calculates the difference between the clustering center values of positive droplets (Figure 5d) and negative droplets (Figure 5f) to filter out droplet images with a poor clustering effect.
We further counted the true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN) of the 20 images. We also calculated accuracy, sensitivity, and specificity for the 20 images; higher values of the three indicate better classification performance, and their calculation formulas are shown in Equations (6)-(8):

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (6)

Sensitivity = TP / (TP + FN)    (7)

Specificity = TN / (TN + FP)    (8)

where accuracy indicates the ratio of the number of correctly classified droplets to the total number of droplets, sensitivity and specificity indicate the proportion of correctly classified positive droplets to all positive droplets, and the proportion of correctly classified negative droplets to all negative droplets, respectively. The statistical means of accuracy, sensitivity, and specificity for the 20 images were 99.75%, 95.32%, and 99.95%.
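Equations (6)-(8) can be computed directly from the per-image confusion counts; the counts below are hypothetical and chosen only to mirror the scale of the reported means:

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, and specificity (Equations (6)-(8)) from the
    per-image confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)     # correctly classified positives / all positives
    specificity = tn / (tn + fp)     # correctly classified negatives / all negatives
    return accuracy, sensitivity, specificity

# Hypothetical confusion counts for one fluorescence image.
acc, sen, spe = classification_metrics(tp=190, fn=10, fp=2, tn=4798)
print(acc, sen, spe)   # ~0.9976, 0.95, ~0.99958
```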

Validation of Positive Droplet Concentration Gradient Detection with Multiple Fluorescence Channels
A series of positive droplet concentrations were prepared by mixing positive droplets containing three dyes (Fam, Hex, Cy5) all at a concentration of 50 nM with negative droplets without dye at different ratios. The positive droplet and negative droplet solutions were mixed in a 4:5 ratio and diluted sequentially to produce four samples with a concentration gradient of 44.44%, 27.50%, 17.18%, and 10.72% of positive droplets, and finally, the four samples were analyzed and the actual measured positive droplet ratios were obtained.
The experimental procedure above was repeated four times under the same experimental conditions, and a total of 48 sets of experimental data were obtained for three channels. The mean value of each concentration measured in each fluorescence channel obtained in the four experiments was counted, and the standard deviation of the mean value of the four measured concentration gradients from the actual concentration was calculated. The correlation curves for the three channels are plotted in Figure 6 and the standard deviation for each concentration is labeled, where the actual measured concentration of the algorithm is strongly correlated with the ideal concentration, with correlation coefficients of 0.9937, 0.9764 and 0.9977 for the Fam, Hex and Cy5 channels, respectively.

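The channel correlation coefficients can be reproduced with a Pearson correlation over the concentration gradient; the measured values below are assumed for illustration and are not the paper's data:

```python
import numpy as np

def concentration_correlation(expected, measured):
    """Pearson correlation between the prepared (ideal) positive-droplet
    fractions and the fractions measured by the image analysis workflow."""
    return float(np.corrcoef(expected, measured)[0, 1])

# Gradient from the text, with hypothetical measured values for illustration.
expected = [44.44, 27.50, 17.18, 10.72]
measured = [43.9, 27.9, 17.5, 10.4]        # assumed readings, not the paper's data
r = concentration_correlation(expected, measured)
print(round(r, 4))
```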

Discussion
The droplet inkjet printing chip in this paper can generate more than 100,000 micro-droplets with a volume of 24 pL, and the inkjet printing system can generate high-throughput droplet micro-reaction units within a second, which helps improve the sensitivity of ddPCR detection. The recognition and classification of droplets are accomplished by means of optical module photography and image processing.
In our work, we developed a method to generate droplets with a diameter of only 32.5 µm. Because of the small droplet size, the effects of scratches, dirt particles, vignetting, etc., on droplet segmentation and fluorescence intensity readout are more significant than for large droplets with diameters over 100 µm. This paper uses the SimpleBlobDetector algorithm to recognize droplets, removes bright spots in the bright-field image by setting droplet area and roundness parameters during recognition, and suppresses background contamination by setting the image threshold range in the fluorescence channel, improving the accuracy of droplet recognition. We compared the droplet recognition accuracy of the SimpleBlobDetector method used in this paper with that of traditional methods such as threshold detection and watershed segmentation and found that our workflow is more stable and has higher droplet recognition accuracy; the effective droplet segmentation accuracy of the blob detection algorithm is higher than 99% for 20 randomly selected bright-field images. We also compared our outcome to different workflows and third-party software. Our image analysis workflow showed higher accuracy and more stable performance.
There is vignetting in the fluorescence field images taken by the optical module, and the vignetting can affect the calculation of the intensity of the droplet signal values and reduce the accuracy of the classification. In this paper, a polynomial surface fitting method is used to calculate the vignetting template of the camera and compensate for the captured image to eliminate the vignetting problem of the image. The interference of satellite droplets with the signal value was suppressed by taking the average of the gray value of the middle 20% of each droplet as the signal value of the droplet. The gray value range was set for each fluorescence channel to suppress abnormal bright spots in the fluorescence channel, and then the signal values of each droplet counted were clustered and analyzed to obtain the number of positive and negative droplets. The statistical accuracy of droplet classification on 20 fluorescence images is higher than 98%, which verifies that the image analysis workflow can accomplish the classification and quantification of high-throughput droplets in the case of poor flow channel plate process and limited photographic accuracy of the optical module.
Bright-field images may suffer blur caused by inaccurate focus during capture, which affects the recognition and localization of droplets by the SimpleBlobDetector algorithm. To increase droplet identification and localization efficiency, the algorithm will be improved to compensate for focus blur in bright-field images. In addition, this paper studied Gaussian surface fitting and polynomial fitting for calculating the vignetting compensation of the fluorescence channel; other fitting methods can be explored to improve the accuracy of vignetting correction. We selected k-means clustering for the classification of positive and negative droplets, and other clustering methods can be explored to improve the classification accuracy. Satellite droplets appear among the droplets printed by the inkjet printing chip; by improving droplet homogeneity and reducing the influence of satellite droplets on the signal value, the accuracy of the initial sample concentration calculation can also be improved.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s22197222/s1, Additional File S1, Figure S1: Localization identification of droplets and suppression of the result reading chip contamination result map, each image is stitched from 100 bright-field images or fluorescent images; Figure S2: The optical module produces the vignetting phenomenon; Figure S3: Pixel value surface plots of fluorescence images before and after vignetting correction; Figure S4: The average gray value of each corner area and center area of the image before and after vignette correction was counted; Figure S5: The resulting graph was obtained by segmenting the droplets in the image after vignetting correction of the fluorescence image; Figure S6: Droplets with wrong segmentation, droplets not validly identified, and droplets with incomplete edges in the image; Figure S7: Flow chart of three types of bright field image droplet segmentation workflows; Figure S8: The results of bright-field image droplet segmentation for the three algorithms; Table S1: The difference between the eight corner areas and the central area and the average of the difference between the eight corner areas and the central area in Figure S4a; Table S2: The difference between the eight corner areas and the central area and the average of the difference between the eight corner areas and the central area in Figure S4b; Figure S9: Accuracy values given by different image analysis methods; Figure S10: Fill different channels with droplets; Figure S11: Droplets in multiple layers versus droplets in one layer; Figure S12: How merged droplets are treated in the image analysis. The droplet classification information by applying image processing algorithm statistics can be downloaded at: Additional File S2.