Systematic Quantification of Cell Confluence in Human Normal Oral Fibroblasts

Abstract: Background: The accurate determination of cell confluence is a critical step for generating reliable results in cell biological experiments. However, the confluence of the same culture may be estimated differently by individual researchers. Herein, we designed a systematic quantification scheme implemented on the Matlab platform, the so-called "Confluence-Viewer" program, to help cell biologists better determine cell confluence. Methods: Human normal oral fibroblasts (hOFs) seeded in 10 cm culture dishes were visualized under an inverted microscope for the acquisition of cell images. The images were subjected to the cell segmentation algorithm with top-hat transformation and the Otsu thresholding technique. A regression model was built using a quadratic model and a shape-preserving piecewise cubic model. Results: The cell segmentation algorithm generated a regression curve that was highly correlated with the cell confluence determined by experienced researchers. However, the correlation was low when compared to the cell confluence determined by novice students. Interestingly, the cell confluence determined by experienced researchers became more diverse when they checked the same images without a time limitation (up to 1 min). Conclusion: This tool could prevent unnecessary human errors and meaningless repeats for novice researchers working on cell-based studies in health care or cancer research.


Introduction
Cell confluence is defined as the percentage of a culture dish or a flask occupied by any type of adherent mammalian cells. It is an important parameter used to determine the phase (lag, log, and plateau) of cell growth in order to correctly conduct further experiments in cell culture biology [1]. Although this is a fundamental step in cell biological research, the successful determination of precise confluence is largely dependent on the experience and intuition of researchers, especially when different

Materials and Methods
Cell line: Human normal oral fibroblasts (hOFs) were cultured in vitro from oral gingival tissues isolated from adult donors undergoing third molar extraction, with the approval of the Institutional Review Board (IRB) of National Taiwan University Hospital, Taiwan (IRB approval number: 203106005RINC). In brief, clinical specimens were processed under sterile conditions using fine forceps to separate the epithelium and fibroblastic tissues. Isolated fibroblastic tissues were covered with a coverslip and left until cells migrated out of the tissues. Once the cells had migrated out, the fibroblastic tissues were removed and the cells were trypsinized and cultured in Dulbecco's modified Eagle's medium (DMEM, Life Technologies Co., Grand Island, NY, USA) containing 10% fetal bovine serum (FBS), 2 mM L-glutamate, and 50 U/mL penicillin (Sigma-Aldrich, St. Louis, MO, USA). Cells were incubated in a humidified incubator with 5% CO2 at 37 °C and passaged every two days. After seeding, cell images were acquired under an inverted microscope (Olympus CKX41, Shinjuku-ku, Tokyo, Japan) with a digital camera (COOLPIX P5000, NIKON CORP., Minato-ku, Tokyo, Japan) every 6 h for a total of 60 h. Five regions of each dish (rightmost, leftmost, upper, lower, and middle positions) were selected for image acquisition. The images were taken with the following parameters: image spatial resolution 3648 × 2736 (72 dpi in both the x and y directions), ISO 800 or 1600, aperture f/5.3, focal length 26.3 mm, and without flash. Figure 1 shows ten typical cell confluence images.

Figure 1. Typical cell images of different confluence levels, from 5.5 to 94.5%. These are the raw color images. The algorithm designed in this study was used to calculate the confluence percentages of these images.
Cell segmentation algorithm: Many image processing techniques exist, and combinations of them can produce thousands of tools. The top-hat transform is a morphological operation used in many image processing tasks, such as feature extraction, background equalization, and image enhancement. The top-hat transform of image $f$ is given by

$$T(f) = f - (f \circ s), \tag{1}$$

where $\circ$ denotes the opening operation and $s$ is the structural element. The structural element has a disk shape with radius 4. After the top-hat transform, the background is equalized and the image is enhanced. The image can then be easily clustered into two categories: foreground (the cells) and background. The well-known Otsu thresholding technique can then be applied to find an optimal threshold value.
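The top-hat step in Equation (1) can be illustrated with a minimal pure-Python sketch. It is not the Confluence-Viewer code: the nested-list image representation, the function names, and the choice to ignore neighbours outside the image border are all assumptions made for illustration, and the toy image values are made up.

```python
def disk_offsets(radius):
    """Return the (dy, dx) offsets covered by a flat disk of the given radius."""
    return [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dy * dy + dx * dx <= radius * radius
    ]


def _neighbourhood_filter(image, offsets, pick):
    """Apply `pick` (min or max) over each pixel's neighbourhood.

    Neighbours that fall outside the image are ignored (an assumption).
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            pick(
                image[y + dy][x + dx]
                for dy, dx in offsets
                if 0 <= y + dy < rows and 0 <= x + dx < cols
            )
            for x in range(cols)
        ]
        for y in range(rows)
    ]


def top_hat(image, radius=4):
    """White top-hat: the image minus its morphological opening."""
    offsets = disk_offsets(radius)
    eroded = _neighbourhood_filter(image, offsets, min)   # erosion
    opened = _neighbourhood_filter(eroded, offsets, max)  # dilation of the erosion
    return [
        [pixel - background for pixel, background in zip(img_row, open_row)]
        for img_row, open_row in zip(image, opened)
    ]


# Toy image (made-up values): a two-level background with one small
# bright feature on each side, mimicking uneven illumination.
image = [[10] * 8 + [60] * 8 for _ in range(9)]
image[4][3] += 100
image[4][12] += 100
flattened = top_hat(image, radius=2)
```

In this toy case the opening recovers the two-level background, so subtracting it leaves both small features at the same height above a flat background, which is the property the thresholding step relies on.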
The Otsu thresholding technique exhaustively searches for the threshold that minimizes the within-class variance, defined as a weighted sum of the variances of the two classes:

$$\sigma_w^2(t) = p_1(t)\,\sigma_1^2(t) + p_2(t)\,\sigma_2^2(t), \tag{2}$$

where $p_1$ and $p_2$ are the probabilities of classes 1 and 2 separated by the threshold $t$, $1 < t < 255$ and $t$ is an integer; and $\sigma_1^2$ and $\sigma_2^2$ are the variances of these two classes. The probabilities $p_1$ and $p_2$ are computed from the 256 bins of the gray-level image histogram:

$$p_1(t) = \sum_{i=0}^{t-1} h(i), \qquad p_2(t) = \sum_{i=t}^{255} h(i),$$

where $h(i)$ is the relative histogram of the gray-level $i$. Notably, $p_1 + p_2 = 1$.
For two classes, as in this study, minimizing the within-class variance is equivalent to maximizing the between-class variance, since $\sigma^2 = \sigma_w^2 + \sigma_b^2$, where $\sigma^2$ is the total variance of the image. The between-class variance is

$$\sigma_b^2(t) = p_1(t)\,[\mu_1(t) - \mu]^2 + p_2(t)\,[\mu_2(t) - \mu]^2 = p_1(t)\,p_2(t)\,[\mu_1(t) - \mu_2(t)]^2, \tag{3}$$

where $\mu_1$ and $\mu_2$ are the mean gray levels of the two classes and $\mu$ is the mean gray level of the whole image. For simplification, we drop the threshold parameter $t$. Since we know the two relationships $p_1 + p_2 = 1$ and $\mu = p_1 \mu_1 + p_2 \mu_2$, the right-hand term in Equation (3) becomes

$$p_1 (1 - p_1) \left[ \mu_1 - \frac{\mu - p_1 \mu_1}{1 - p_1} \right]^2 = \frac{p_1}{1 - p_1} (\mu - \mu_1)^2 .$$

Thus, we obtain a simplified equation to calculate the between-class variance, $\sigma_b^2 = \frac{p_1}{1 - p_1} (\mu - \mu_1)^2$, using only $\mu_1$, $p_1$, and $\mu$. The threshold value $t$ is obtained by maximizing the between-class variance $\sigma_b^2$. After the threshold value is determined, the cell confluence image is binarized, the area above the threshold value is calculated, and the area percentage is saved. We obtained three images for each percentage category for linear regression. The total number of images used for regression analysis was 30. Every image was visually inspected by an expert and assigned to a percentage category.
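The threshold search and the area percentage described above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the authors' Matlab implementation: class 1 is taken to be the gray levels below t, pixels at or above t are counted as foreground, and the pixel values are made up.

```python
def otsu_threshold(pixels):
    """Return the integer threshold t that maximizes the between-class variance.

    Class 1 holds gray levels below t; class 2 holds gray levels t and above.
    Uses the simplified form sigma_b^2 = p1 / (1 - p1) * (mu - mu1)^2.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    mu = sum(level * count for level, count in enumerate(hist)) / total

    best_t, best_var = 0, -1.0
    count1 = 0  # number of pixels in class 1
    sum1 = 0    # sum of gray levels in class 1
    for t in range(1, 256):
        count1 += hist[t - 1]
        sum1 += (t - 1) * hist[t - 1]
        if count1 == 0 or count1 == total:
            continue  # one class is empty, so there is nothing to separate
        p1 = count1 / total
        mu1 = sum1 / count1
        var_b = p1 / (1.0 - p1) * (mu - mu1) ** 2
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t


def confluence_percent(pixels, threshold):
    """Percentage of pixels at or above the threshold (treated as foreground)."""
    foreground = sum(1 for value in pixels if value >= threshold)
    return 100.0 * foreground / len(pixels)


# Made-up pixel values: 60 dark background pixels and 40 bright foreground pixels.
pixels = [20] * 60 + [200] * 40
t = otsu_threshold(pixels)
area = confluence_percent(pixels, t)
```

Working with integer counts rather than accumulated floating-point probabilities avoids a spurious division by a near-zero value of 1 - p1 once every pixel has been assigned to class 1.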
The requirements for using this program are as follows: (1) the illumination setting of the digital camera should be consistent; (2) the camera parameters, such as the focal length and zoom factor, should be fixed.
The "Confluence-Viewer" Matlab source code is openly available on the Matlab File Exchange website [25]. Some of the cell images used to construct the regression model can be downloaded from [26].
Statistical analysis: Pearson's correlation was used to measure the linear correlation between two variables. Pearson's r (the Pearson correlation coefficient) takes a value between −1 and +1. The correlation and statistical significance were analyzed using IBM SPSS Statistics software (ver. 27, Armonk, NY, USA).
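For readers who want to reproduce the coefficient outside SPSS, a minimal sketch of Pearson's r is given below. The function name and the sample values are assumptions for illustration only; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up confluence estimates (%), for illustration only.
algorithm = [10.0, 25.0, 40.0, 60.0, 85.0]
expert = [12.0, 22.0, 45.0, 58.0, 90.0]
r = pearson_r(algorithm, expert)
```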
Regression models: Two models were used for curve fitting: the quadratic model and the shape-preserving piecewise cubic interpolation model. A quadratic model assumes that the data points form a parabolic curve, i.e., $y = a_1 x^2 + a_2 x + a_3$. Three points determine a parabola exactly. When the data consist of more than three points, however, the problem is overdetermined, and we use the least-squares method to minimize the total squared error and analytically determine the three parameters $a_1$, $a_2$, and $a_3$. This quadratic model can be constructed using the Matlab function "polyfit". The shape-preserving piecewise cubic interpolation instead joins adjacent data points with cubic polynomial segments and can be constructed using the Matlab function "interp1" with the parameter "pchip".
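The least-squares quadratic fit can be sketched in a few lines. The version below solves the 3 × 3 normal equations directly in pure Python; it is an illustrative stand-in for Matlab's "polyfit", not the authors' code, and the function names and data points are made up. The shape-preserving cubic interpolation is not reproduced here.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a1*x^2 + a2*x + a3; returns (a1, a2, a3)."""
    rows = [[x * x, x, 1.0] for x in xs]
    # Normal equations (A^T A) a = A^T y, stored as an augmented 3 x 4 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(m[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (m[i][3] - rest) / m[i][i]
    return tuple(coeffs)


def predict(coeffs, x):
    """Evaluate the fitted parabola at x."""
    a1, a2, a3 = coeffs
    return a1 * x * x + a2 * x + a3


# Made-up points lying exactly on y = 2x^2 - 3x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x * x - 3 * x + 1 for x in xs]
coeffs = fit_quadratic(xs, ys)
```

Like "polyfit", this returns the coefficients from the highest power to the lowest. For inputs on a percentage scale the normal equations become poorly conditioned, so a library routine based on QR factorization would be the safer choice in practice.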

Acquisition and Processing of Cell Culture Images
Firstly, we cultured hOFs by seeding a regular cell number ($5 \times 10^6$) in a 10 cm dish, and cell images were acquired using an optical microscope with a digital camera every 6 h. The typical cell images were divided into 10 incremental cell confluence levels, each representing the percentage of the culture area occupied by cells (Figure 1). Herein, we demonstrate an easy-to-implement scheme for this purpose. The flowchart of our scheme is shown in Figure 2. This flowchart describes the scheme using one of the ten datasets. The raw image was recorded in color. For the algorithmic measurement of the cell confluence, the digital images were converted into gray mode using only the red channel. Since the high-resolution image might contain noise, the image was reduced to only 6% of the raw size. This reduces the computational cost without affecting the accuracy of the final result. The uneven illumination problem (as shown in each top-left subfigure of Figure 3) was alleviated using the top-hat transformation (as shown in each middle subfigure of Figure 3), followed by threshold segmentation to generate binary images for determining the percentage of cell confluence. Transformed binary images of different cell confluence levels were obtained (as shown in each bottom-right subfigure of Figure 3). Two regression models were tested, one being the quadratic model and the other being the shape-preserving piecewise cubic interpolation model. The abscissa of Figure 4 denotes the cell confluence area percentage calculated by our scheme. The ordinate is the percentage visually determined by an expert. The circle points denote the 30 image data points. The red curve presents the quadratic model and the black curve presents interpolation using the shape-preserving piecewise cubic model, as described in Section 2. In this way, the regression model was built for the following tests, as shown in Figure 4.

Appl. Sci. 2020, 10, 9146
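The first two preprocessing steps described above (keeping only the red channel, then shrinking the image) can be sketched as follows. Several details are assumptions, since the text does not specify them: the image is represented as nested lists of (R, G, B) tuples, the 6% figure is interpreted as a linear scale factor, and nearest-neighbour sampling is used for simplicity.

```python
def red_channel(rgb_image):
    """Convert an image of (R, G, B) tuples to gray levels using only R."""
    return [[pixel[0] for pixel in row] for row in rgb_image]


def shrink(gray, scale):
    """Downsample by nearest-neighbour sampling (an assumed method)."""
    rows, cols = len(gray), len(gray[0])
    new_rows = max(1, round(rows * scale))
    new_cols = max(1, round(cols * scale))
    return [
        [
            gray[min(rows - 1, int(r / scale))][min(cols - 1, int(c / scale))]
            for c in range(new_cols)
        ]
        for r in range(new_rows)
    ]


# Made-up 50 x 100 test image whose red value equals the column index.
rgb = [[(x, 0, 0) for x in range(100)] for _ in range(50)]
gray = red_channel(rgb)
small = shrink(gray, 0.06)
```

A resampling method that averages neighbouring pixels would also suppress noise, which is one of the reasons the text gives for reducing the image size. Nearest-neighbour sampling is used here only to keep the sketch short.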

Determination of Cell Confluence
The different confluence levels shown in the cell photos were quantified by the algorithm (see Materials and Methods), experienced researchers, and novice students. Photos of the cells, numbered from 1 to 10 in order of increasing cell number, were used for the prediction of cell confluence by humans or the algorithm. Each photo was examined by six experienced researchers and six novice students. The results showed that the best regression of mean cell confluence was obtained using the algorithm prediction (Figure 5A). The mean cell confluence predicted by experienced researchers also generated a better regression than that predicted by novice students (Figure 5B,C). The time for humans to determine the cell confluence for each photo was limited to 1 min. Interestingly, the quality of the regression decreased when no time limitation was set for experienced researchers to determine the cell confluence (Figure 5D).

The Correlation of Cell Confluence Predicted by Novice Students and Experienced Researchers
We used Spearman's rank correlation analysis to compare the cell confluence levels determined by novice students with those determined by experienced researchers and by the algorithm. The results show little correlation between novice students and experienced researchers in the cell confluence determined from cell images, with or without time limitation (Figure 6A,B). The correlation between the novice- and algorithm-determined cell confluence was also low (Figure 6C). The 95% limits of agreement were then determined using Bland-Altman analysis. The comparisons of novices versus experienced researchers with time limitation, novices versus experienced researchers without time limitation, and novices versus the algorithm are shown in Figure 6D-F, respectively. The results suggest that the cell confluence levels determined by novices agree poorly with those determined by experienced researchers or the algorithm.
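Both statistics used in this comparison are simple to compute. The sketch below assumes the usual definitions: Spearman's coefficient is taken as Pearson's r on average ranks, and the Bland-Altman limits are taken as the mean difference ± 1.96 sample standard deviations of the differences. The function names and all input values are made up for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [p - q for p, q in zip(a, b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up confluence estimates (%), for illustration only.
novice = [10.0, 20.0, 30.0, 40.0]
expert = [12.0, 18.0, 33.0, 37.0]
rho = spearman_rho(novice, expert)
bias, lower, upper = bland_altman_limits(novice, expert)
```

A high correlation alone does not imply agreement, since two raters can be perfectly correlated yet differ by a constant offset. That is why the limits of agreement are reported alongside the correlation.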

The Correlation of Cell Confluence Predicted by Experienced Researchers and the Algorithm
We then examined the correlation of cell confluence levels predicted by experienced researchers and the Confluence-Viewer program. Interestingly, the results showed a good correlation between the algorithm- and experienced-researcher-predicted cell confluence with or without time limitation (Figure 7A,B). Although a longer determination time for experienced researchers decreased their prediction accuracy, a high correlation remained between these two conditions (Figure 7C). Thus, the algorithm-determined cell confluence is comparable to experienced researchers' predictions. Again, Bland-Altman analysis was used to determine the 95% limits of agreement. The confluence determination results of experienced researchers with or without time limitation versus the algorithm are shown in Figure 7D,E. Additionally, the cell confluence determined by experienced researchers showed little bias with or without time limitation (Figure 7F). From Figure 7E, we can see that the difference between the results from experienced researchers (without time limitation) and the algorithm is very small. The standard deviation of the difference is the smallest among all comparisons (Figures 6D-F and 7D-F). This result indicates that the proposed software system is reliable and has the potential to replace experienced researchers' work in cell confluence quantification.

Discussion
The measurement of cell growth is the most fundamental technique to be learned by new students who are interested in working in molecular and cell biology labs. Except for routine passages, adherent cell cultures are usually treated with experimental agents before they are trypsinized for counting cell numbers. Therefore, the adequate and accurate determination of cell confluence can greatly influence the results of experiments. For example, different levels of cell confluence represent specific phases of growth curves, which may respond differently to the same agents [27]. An inaccurate determination of cell confluence would lead to poor experimental results, meaningless repeats, and troubleshooting. Our analysis demonstrated that inexperienced students made more diverse decisions on cell confluence than did experienced researchers and the algorithm. A closer correlation was also observed between experienced researchers and the algorithm. In a busy lab, experienced researchers may, for various reasons, neglect to guide novice students on confluence determination. Therefore, our algorithm would save time for experienced researchers and prevent consumable supplies from being wasted on experiments in which misjudged confluence leads to confusing data.
Busschots et al. proposed a non-invasive, non-destructive, and label-free method using the area fraction (AF) output analysis bundled with the ImageJ freeware [28]. In our method, we proposed a scheme and designed a program on the Matlab platform that directly measures the cell shapes and unoccupied space, using hundreds of cell imaging files for testing. In this method, we built a regression model based on morphologic operations, an image enhancement method, and a quadratic regression. This algorithm could precisely recognize similar types of cells with distinct morphology and confluence. In the previous work, the free software ImageJ was used to adjust parameters, and all operations had to be performed manually. For example, "image analysis" included eight steps that had to be carried out by hand. This may not be suitable for processing sets of images. In this study, we wrote the source code ourselves using image processing techniques. This algorithm and its code are fully automatic and can process files in batches. We are able to modify the program if the imaging conditions differ, for example in illumination or focal length, and even if the cell morphology differs from that of fibroblasts. Our scheme established a robust and user-friendly code that has better potential for multiple applications.
We used human oral fibroblasts in the current study. As the primary cells will enter replicative senescence, it may be possible to use this program to assess the growth of oral cells and their responses to compounds against oral ulcers. We also expect to execute the high-throughput analysis of malignant cells treated with therapeutic agents to determine the efficacy of drugs using this Confluence-Viewer program.

Conclusions
In summary, we provided a novel non-invasive and non-destructive algorithm to assist inexperienced students and even novice cell biologists in making consistent, systematic, and fast quantifications of cell confluence. This method was also validated using predictions of cell confluence from experienced researchers, which were highly correlated with the results of the algorithm. Our source code is open, and we expect that this program will be widely used and even modified by cell biologists working in various research fields for more efficient experimental progress.