A Keratin 12 Expression-Based Analysis of Stem-Precursor Cells and Differentiation in the Limbal–Corneal Epithelium Using Single-Cell RNA-Seq Data

Simple Summary The corneal epithelium protects the eye against external insults. It spreads between two domains, an outer rim or limbus, and the vision-critical central cornea. The epithelial cells constantly proliferate at the limbus from resident stem cells. The proliferating cells then migrate toward the central cornea, where there is a constant cell loss. As the cells migrate, they change their gene complexion causing changes in their properties toward the properties needed for their roles at the central cornea; this process is called differentiation. When a limbal stem cell deficiency develops, e.g., due to limbal injury, the limbus-to-cornea cell resupply stops, and the central corneal cell population dwindles. This results in corneal scarring that blocks vision. In-depth knowledge of the gene changes and the stages of change are essential for the design of corneal restorative strategies. Differentiation correlates with increases in the gene Krt12. After discovering that Krt12 increases are stepwise, this feature was exploited to identify the correlating changes in each gene of a set that included the 5647 most abundant corneal epithelial genes. The analysis yielded lists detailing the existence of hundreds to thousands of genes that change within any two sequential stages of Krt12 increase. Abstract The corneal epithelium (CE) is spread between two domains, the outer vascularized limbus and the avascular cornea proper. Epithelial cells undergo constant migration from the limbus to the vision-critical central cornea. Coordinated with this migration, the cells undergo differentiation changes where a pool of unique stem/precursor cells at the limbus yields the mature cells that reach the corneal center. Differentiation is heralded by the expression of the corneal-specific Krt12. Processing data acquired by scRNA-Seq showed that the increase in Krt12 expression occurs in four distinct steps within the limbus, plus a single continuous increase in the cornea. Differential gene analysis demonstrated that these domains reflect discreet stages of CE differentiation and yielded extensive information of the genes undergoing down- or upregulation in the sequential transition from less to more differentiate conditions. The approach allowed the identification of multiple gene cohorts, including (a) the genes which have maximal expression in the most primitive, Krt12-negative cell cohort, which is likely to include the stem/precursor cells; (b) the sets of genes that undergo continuous increase or decrease along the whole differentiation path; and (c) the genes showing maximal positive or negative correlation with the changes in Krt12.


Introduction
The corneal epithelium (CE) constitutes the first line of defense of the vision system against pathogen invasion, and it is also a critical component of optical visual focusing.The mature CE consists of 5-6 layers of stratified non-keratinized epithelium.Its differentiation plan and homeostasis resemble that of the other outer ectoderm-derived stratified epithelia such as the epidermis and various mucosae.Briefly, the surface-exposed cells at the upper Biology 2024, 13 layer continuously exfoliate and this cell loss, in turn, is balanced by the proliferation of cells within the basal layer of the strata.Studies over the last four decades have shown that the short-term needs for cell replacement, whether homeostatic or acute, are fulfilled by a rapidly proliferating and partially differentiated cell type.Because the capacity for cycles of replication of these cells is limited, they have been classically referred to as transient amplifying (TA) cells.The resupply of this short-term response cell population depends on a population of stem/precursor cells which are normally in a semi-quiescent or slow cycling state but can increase their proliferation as needed to replenish a diminished TA cell population.These cells tend to reside within specialized niches and maintain an extended replicative capacity [1,2].Another common feature of most ectoderm-derived stratified epithelia is the generalized expression of the tonofilament-forming acidic-basic cytokeratin pair Krt14 and Krt5.However, epithelia are distinguished from each other by the expression of distinct complements of acidic and basic cytokeratins, likely needed to generate tonofilaments that best suit the specific functional needs of each lining [3].The CE is characterized by the expression of Krt12, not identified in any other tissue, and Krt3, which is rarely expressed in other stratified epithelia [4][5][6][7][8].
In addition to its cytokeratin specificity, the CE presents a unique feature that distinguishes it from all other epithelia.Owing to the avascular nature of the CE, the precursor stem cells are segregated to the vascularized outer annulus that separates the cornea from conjunctival lineages.All basal limbal cells are negative for the tissue-specific cytokeratins till the border with the transparent corneal zone and become strongly positive, in an abrupt manner, as the cell migrates into the avascular cornea or stratify within the limbus [4,[9][10][11].From a health perspective, this dual-domain arrangement represents a biological weakness.Damage to the thin limbal rim interrupts the supply of cells for the maintenance of TAs at the adjacent peripheral corneal zone and opens the door for the centripetal invasion of the pro-vascular conjunctival epithelium.This colonization leads to neovascularization, and, because the CNJE performs poorly as a fluid permeability barrier and pathogen refractor, to stromal scarring and pathogen penetration.The resulting blinding state is referred to as limbal stem cell deficiency (LSCD) [12,13].The correction of an LSCD state requires the transfer of a suitable population of cells through a transplant operation.The success of these transplants depends on the regenerative capacity of the transferred cells [14,15].Hence, an in-depth characterization of the precursor compartment and the stage of early differentiation are valuable tools for the design of best clinical reparative strategies and analysis of their outcomes.
Microarrays and real-time polymerase chain reactions have been used to identify gene profiles that underpin the phenotypes of lineages or specific cell cohorts of the ocular surface epithelia [16][17][18][19].More recently, the advent of the single-cell RNA seq (scRNA) methodology has led to the genetic segregation of the CE into clusters of cells exhibiting similar GE profiles, and through this clustering, to the assignment of genes as primarily belonging to specific limbal, corneal, and conjunctival, basal, or suprabasal domains or cohorts [19][20][21][22][23]. Bearing in mind the intimate relationship between the expression of Krt12 and the differentiation process, this report uses scRNA-Seq data to examine the correlation between the levels of Krt12 expression and the expression of other genes along the limbus to the central cornea differentiation and migration axis.

Tissue Procurement
Human corneas from unidentifiable cadaver donors, obtained under informed consent, were obtained from the Eye-bank from the National Research Disease Interchange (Philadelphia, PA, USA), and processed between 48 and 72 h of death.The procurement request excluded donors who had undergone chemotherapy.No donor details apart from age and sex were released.The donors were Caucasian males of ages 68, 71, and 65.The Icahn School of Medicine Institutional Review Board has determined that as per section 45 CFR Part 46 of the U.S.A. Code of Federal Regulations, unidentifiable cadaver tissue does not constitute research in human subjects (see https://grants.nih.gov/policy/humansubjects/hs-decision.htm(accessed on 7 December 2023)).The use of human tissue in this study was in accordance with the provisions of the Declaration of Helsinki.

Cell Preparation
Corneas were radially split into 8 segments on a cutting board.Each octile was placed on the stage of a stereoscope fitted with a rotatable black plastic board and illuminated by a 150 W Fiber Optic Dual Gooseneck Illuminator (Cole Palmer, Vernon Hills, IL, USA).After accommodating the angle of the two illuminating beans, it was possible to visualize any remaining conjunctival tissue and the limbal zone.A drop of Trypan blue was then added for 30 sec and washed with saline.The stain highlighted the freely accessible conjunctival stroma and its underlying sclera and any area of damaged or missing corneal epithelium.The conjunctiva was removed by picking up the loose tissue and cutting it out with a fine iris scissor to the edge of the scleral stain (CNJ sample, Figure 1).Next, after cutting away and discarding the blue-stained sclera, a limbal-peripheral strip (LiPe sample) was collected by a cut in the peripheral zone which resulted in a width of the peripheral tissue as close as possible to the visible width of the limbus at each particular octile.Finally, cuts were made to collect a section of the adjacent periphery (Pe sample) of a width similar to the width of the periphery included in the LiPe strip, and a segment of the central cornea showing an undamaged overlying epithelium (Co sample).The surgery was performed using the tip of Extra Keen Blue single-edged blades.The octile strips from each sample type were incubated with 2 mg/mL Dispase II (Fisher, Waltham, MA, USA) made in a bicarbonate-free 1:1 mix of Dulbecco's modified minimum essential medium and Ham F-12 (DMEM-F12; Fisher) for 18 h, at 4 • C under a 60 tilting/min motion.At the end of this treatment, sheets of epithelial cells were either free-floating in the Dispase II solution or were lightly attached to the stroma, from which they were released by gently prodding with the tip of a jeweler's forceps.The sheets were incubated in a 0.25% Trypsin (Fisher) solution at 37 • C for 5 min (2 mL/sample), the Trypsin medium was admixed with 4 volumes of DMEM-F12-10% fetal bovine serum (Atlanta Biologics, Flowery Branch, GA, USA), and single-cell suspensions were generated by 40 passes through a 5 mL pipette.
Cells were spun down in a clinical centrifuge and resuspended, in 2 mL of DMEM/F12, triturated again, filtered through a 70 µ filter, and cultured for 3 ½ h in a 25 mL culture flask.After visually confirming the presence of single attached cells displaying side blebs indicating incipient attachment and spreading, the flask was set horizontally for 3-4 min to allow full draining of unattached cells.The accumulated medium was fully aspirated and the lightly attached cells were released by gentle streaming from a 1 mL pipette tip.The protocol recovered about ¼ of the cells in the suspension.This brief cell adhesion protocol harvested most basal epithelial cells, while excluding or drastically minimizing the recovery of supra-basal epithelial cells.
Eighty percent of the recovered cells were used for scRNA-Seq analysis.One full set of CNJ, LiPe, Pe, and Co samples (Exp. 1) was derived from a single donor and two additional LiPe samples (Exp. 2 and 3) from another two corneas.The remaining twenty percent of the samples were complemented with 1 µg/mL propidium iodide (PI) and the forward light scattering (FSC), a relative measure of overall cell size, of the PI-negative cells was used to determine the fraction of Li and Pe derived cells within the population.

Single-Cell RNA-Seq Reading
Libraries were prepared using the 10x Genomics Chromium Controller (Pleasanton, CA, USA) in conjunction with the single-cell 3′ v2 kit.Briefly, the cDNA synthesis, barcoding, and library preparation were carried out according to the manufacturer's instructions.Libraries were sequenced on a Novaseq 6000 instrument.(Illumina, San Diego, CA, USA).Sample demultiplexing, barcode processing, and unique molecular identifier (UMI) counting were performed by using the 10x Genomics pipeline Cell Ranger v.2.1.0with default parameters.Specifically, for each library, raw reads were demultiplexed using the pipeline command 'cell ranger mkfastq' in conjunction with 'bcl2fastq' (v2.17.1.14,Illumina) to produce two fastq files: the read 1 file contains 26 bp reads, each consists of a cell barcode and a unique molecule identifier (UMI), and the read 2 file contains 96 bp reads including cDNA sequences.Reads then were aligned to the human reference genome (GRCh38), filtered, and counted using 'cell ranger count' to generate the gene barcode matrix.The genomics coverage ranged between 94.2 and 96.1% for all the samples.CSV files were derived from the feature barcode matrix data following the 10x Genomics instructions.The CSV files were opened in Microsoft Excel and processed as described below.

Single-Cell RNA-Seq Reading
Libraries were prepared using the 10x Genomics Chromium Controller (Pleasanton, CA, USA) in conjunction with the single-cell 3 ′ v2 kit.Briefly, the cDNA synthesis, barcoding, and library preparation were carried out according to the manufacturer's instructions.Libraries were sequenced on a Novaseq 6000 instrument.(Illumina, San Diego, CA, USA).Sample demultiplexing, barcode processing, and unique molecular identifier (UMI) counting were performed by using the 10x Genomics pipeline Cell Ranger v.2.1.0with default parameters.Specifically, for each library, raw reads were demultiplexed using the pipeline command 'cell ranger mkfastq' in conjunction with 'bcl2fastq' (v2.17.1.14,Illumina) to produce two fastq files: the read 1 file contains 26 bp reads, each consists of a cell barcode and a unique molecule identifier (UMI), and the read 2 file contains 96 bp reads including cDNA sequences.Reads then were aligned to the human reference genome (GRCh38), filtered, and counted using 'cell ranger count' to generate the gene barcode matrix.The genomics coverage ranged between 94.2 and 96.1% for all the samples.CSV files were derived from the feature barcode matrix data following the 10x Genomics instructions.The CSV files were opened in Microsoft Excel and processed as described below.

Data Preprocessing
Figure 1B shows the flow cytometry FSC distributions for all three CE samples submitted for scRNA-Seq.Limbal basal cells are remarkably and uniformly smaller than the adjacent peripheral cells.The size transition occurs sharply at the Li-Pe interphase [24].Suprabasal limbal cells are also much larger than their basal counterparts [20].Basal limbal cells accounted for 53% of the cells in the sample.The percentages for the other two LiPe samples were 62% and 58%, respectively.PI-stained cells amounted to less than 2% in all cases.
In a first step, the samples were purged of non-CE lineage cells.All columns containing the melanocyte markers DCT, Tyr, Tyrp1, and/or Mlan-A (6.1% of the total; Table 1) were deleted.About 20% of the melanocyte-containing markers occurred in columns concurrently expressing keratin epithelial markers (Krt5, Krt14, or Krt12).Hence, it is reasonable to assume that these columns consist of epithelial-melanocyte hetero-cellular duplets [25].Consistent with the expected localization of melanocytes in the limbus, there were few cells expressing melanocyte markers in the pure peripheral or central corneal samples.A few cells expressing the AQP5 CNJE marker [16] or suspected blood-exclusive CD markers were also excluded.The remaining cells expressed at least one of the two cytokeratins, Krt5 and Krt14, which are expressed in all stratified epithelia.The starting cell number, the number and percent of purged cells, and the number of cells remaining after the cell exclusion for each sample are detailed in Table 1.Next, the mean gene count (MGC) per cell was calculated and plotted in decreasing order (Figure 2A).The resulting curves easily revealed an inflection at around an MGC of 0.5 in all cases.This accelerated count decrease is suggestive of cells in a decaying state.Thus, the analysis was limited to cells with MGCs above this value.In addition, all cells with an MGC larger than 2.5 were excluded, as those are likely to include a large percentage of cell duplets.Post-trimming images are shown in a comparative format in Figure 2B.The columns were then normalized to an MGC of 1.All the calculations were performed in a ThinkStation computer controlled by an i9 8-core processor.Gaps between domains represent the exclusion of 20 cells at the domain interfaces in the comparative domain analyses.In the table, D4 has been split into two subdomains, D4i and D4ex.

Domain Identification
The non-CE cell, outlier purged and expression normalized spreadsheets were used to examine the distribution of Krt12. Figure 2C depicts the range of Krt12 expression in the LiPe, Pe, and Co samples for Exp. 1. Figure 2D incorporates the ranges for three LiPe samples each derived from a different donor cornea and processed at different times.The high similarity of the Krt12 plots for all three LiPe samples demonstrates the reproducibility of the scRNA-Seq protocol.Plots of all three LiPe samples in the lowest 0.5% of the full expression range (0-20 A.U.s) highlighted the presence of 5 distinct domains of Krt12 increase, D0 through D4, in all three samples.Figure 2E shows the distribution for the larger LiPe sample, LiPe-1.The pattern of Krt12 increase and the expression ranges were remarkably similar in all three LiPe samples, allowing their amalgamation into a single file.As described below, this amalgamation was essential to

Domain Identification
The non-CE cell, outlier purged and expression normalized spreadsheets were used to examine the distribution of Krt12. Figure 2C depicts the range of Krt12 expression in the LiPe, Pe, and Co samples for Exp. 1. Figure 2D incorporates the ranges for three LiPe samples each derived from a different donor cornea and processed at different times.The high similarity of the Krt12 plots for all three LiPe samples demonstrates the reproducibility of the scRNA-Seq protocol.Plots of all three LiPe samples in the lowest 0.5% of the full expression range (0-20 A.U.s) highlighted the presence of 5 distinct domains of Krt12 increase, D0 through D4, in all three samples.Figure 2E shows the distribution for the larger LiPe sample, LiPe-1.The pattern of Krt12 increase and the expression ranges were remarkably similar in all three LiPe samples, allowing their amalgamation into a single file.As described below, this amalgamation was essential to augment the number of differentially expressed genes (DEGs).The Krt12 changes for the amalgamated files are depicted in Figure 2F.The Krt12 range and the cell number included in each domain are displayed in the inserted table.Additionally, the CNJ, LiPe, and its subdomains, Pe, and Co domains are graphically depicted in Figure 1C, where the red color intensity, or lack of it, qualitatively represents the levels of Krt12 expression in the domain.
In contrast with the Krt12 distribution in the LiPe samples, where 32% of the cells did not express Krt12, the Pe and Co samples contained no Krt12-negative cells, and their Krt12 plots displayed a rapid rise starting from the lowest values (Figure 2E).In the Pe sample, only 20 cells, or about 1% of the total population, expressed Krt12 matching the expression range of D0-D3, 0-5.32 arbitrary units (A.U.) (Figure 2E).Thus, it can be reasonably concluded that the overwhelming majority of cells within the D0-D3 range derive from the limbus.They accounted for 60% of the cells in the amalgamated sample (4003 out of 6667), roughly consistent with the percent of Li cells observed in the flow cytometry plot (Figure 1).

Quality Controls and Validations
The results of this study depend on the validity of the underpinning hypothesis, that there is a close relationship between Krt12 expression level and the degree of differentiation.The results also depend on adequate genomic coverage achieved and the cell health and proper normalization of the cell sets.Concerning genomic coverage, the LiPe-1 sample of the scRNA Seq protocol identified 21,944 genes, or 65 percent of the 33,531 probed genes.This percentage is within the percent of probed genes displayed by a wide variety of organ and cell lines [26].All the other samples displayed a very similar global gene distribution.To probe the cell health and proper normalization, it was reasoned that within the single CE, lineage cells could be expected to globally maintain a very similar number of healthy mitochondria and ribosomes.Table 2 lists the mean ± SD of the ratios between any two adjacent domains and between the first (D0) and last (D4i.2) domains, for the proteincoding mitochondrial genes (MT-) and for the genes coding for the proteins that build the small (RPS) and large (RPL) ribosomal complexes.For these comparisons, the D1, D2, and D3 domains have been amalgamated into a single D123 domain because, for most practical reasons, as shown below, they can be taken as one domain.Figure 3 shows the distribution of ratios for all genes included in each gene family.Overall, the patterns displayed support the claim that all the genes in these three sets maintain a constant rate of expression along all the defined Krt12 expression ranges, indicating that both bioenergetic function and protein biosynthesis capacity are fairly well preserved across the whole dataset.Only in the case of RPL, there may be a statistically significant, though very small, expression decrease.Additionally, the noisier D4i.1-D4i.2comparison for RPS genes includes one R equal to a 1.34 GE decrease and one 1/R = 1.35 (R = 0.741) GE increase.Since there is little if any indication of a statistical difference in expression between these domains, to avoid multiple false positives, it will be prudent to limit the DEG definition to those genes complying with a BHp < 0.01 and a ratio higher than 1.35 or lower than 0.741.Table 2. Mean ± SD global ratios between different Krt12 expression domains for all protein coding.Mitochondria (MT-) and small (RPS) and large (RPL) ribosomes.The number of genes in each cohort is given in parenthesis.The p values were calculated by comparing the ratios for all genes in between domain pairs (<>).efficiency and/or accuracy of translation [32,33], and the genes commonly chosen as constancy controls, ACTG1, ACTB, and GAPDH.Table 2. Mean ± SD global ratios between different Krt12 expression domains for all protein coding.Mitochondria (MT-) and small (RPS) and large (RPL) ribosomes.The number of genes in each cohort is given in parenthesis.The p values were calculated by comparing the ratios for all genes in between domain pairs (<>).Figure 4A,B depicts the expression levels of all the expected correlating and invariant genes, respectively, in relationship to Krt12.In general lines, Figure 4A displays the expected results; all five genes follow the general trend of change for Krt12.Of notice, the analysis yielded an absence of statistical Krt3 and GJA1 between the D0 and D123 expression.However, these genes have very low expression in these two domains, and as In addition to the analysis of global gene sets, selected genes for which it is possible to formulate a priori expectations were also examined.Regarding genes that may increase with differentiation, three prospective genes code for aldehyde-converting enzymes, ALDH3A1, ALDH1A1, and TKT, which are highly expressed in the cornea [27,28].Their very high concentration has been hypothesized to help with the transparency of the CE cells, in similarity to their role as crystallins for the ocular lens.Another gene with ample evidence for a differentiation-dependent gene is GJA1/connexin 43 [29][30][31].Finally, an obvious choice for inclusion in this evaluation is Krt3, the other corneal-specific cytokeratin.For genes that could be expected to remain invariant, the selection included (a) EEFA1 and E1F1, two highly expressed genes that play critical but distinct roles in the efficiency and/or accuracy of translation [32,33], and the genes commonly chosen as constancy controls, ACTG1, ACTB, and GAPDH.

D0vD123
Figure 4A,B depicts the expression levels of all the expected correlating and invariant genes, respectively, in relationship to Krt12.In general lines, Figure 4A displays the expected results; all five genes follow the general trend of change for Krt12.Of notice, the analysis yielded an absence of statistical Krt3 and GJA1 between the D0 and D123 expression.However, these genes have very low expression in these two domains, and as shown below, the current expression matrix is not large enough to generate accurate BHp values for low-expression genes.Figure 4B presents some intriguing results.The ribosomal-associated genes EEF1A and EIF1 show only small, not statistically significant decreases very similar to those described above for the ribosomal gene sets.The genes most commonly used as control genes, though, display intriguing patterns.The actin genes undergo marked decreases with the transition from the limbal to the peripheral cornea (i.e., D123 to D4i.1).The near identity of the changes for both actin genes provides Biology 2024, 13, 145 9 of 23 reasonable assurances of the bona fide nature of these expression shifts.GAPDH changes in the opposite direction.The driving forces and functional significance of these changes remain to be investigated.
shown below, the current expression matrix is not large enough to generate accurate BHp values for low-expression genes.Figure 4B presents some intriguing results.The ribosomal-associated genes EEF1A and EIF1 show only small, not statistically significant decreases very similar to those described above for the ribosomal gene sets.The genes most commonly used as control genes, though, display intriguing patterns.The actin genes undergo marked decreases with the transition from the limbal to the peripheral cornea (i.e., D123 to D4i.1).The near identity of the changes for both actin genes provides reasonable assurances of the bona fide nature of these expression shifts.GAPDH changes in the opposite direction.The driving forces and functional significance of these changes remain to be investigated.

Gene Expression Differential Analysis
Differentially expressed genes (DEGs) were defined as genes with Benjamini-Hoechsberg-adjusted p-values (BHp values) lower than 0.01 in an expression comparison.Due to the high frequency of cells showing nil expression for the low-expression genes, the identification of bona fide statistically significant genes decreased with the decrease in gene expression level.This is exemplified in Figure 5, which examines the relationship between gene expression and the frequency of BHp < 0.01 values in a D0 to D4 comparison.Likewise, the size of the populations being compared can be expected to have a strong influence on the calculation of p-values.This expectation was confirmed by comparing the BHp yields when only half of the D0 and D4 domains (e.g., by excluding even columns) were compared with the yield when using the full population sets; the identified gene lists showed identical D0/D4 expression ratios, but the BHp values for these two lists were two more than 2 orders of magnitude larger, setting a large number of genes with BHp values outside the BHp > 0.01 DEG limit.These features represent the main drawback of the methodology implemented in this study; extending the analysis to a larger set of low-expression genes, many of which might contribute to the cell phenotype will require a much larger cell base than the one used here.Hence, comparisons were

Gene Expression Differential Analysis
Differentially expressed genes (DEGs) were defined as genes with Benjamini-Hoechsbergadjusted p-values (BHp values) lower than 0.01 in an expression comparison.Due to the high frequency of cells showing nil expression for the low-expression genes, the identification of bona fide statistically significant genes decreased with the decrease in gene expression level.This is exemplified in Figure 5, which examines the relationship between gene expression and the frequency of BHp < 0.01 values in a D0 to D4 comparison.Likewise, the size of the populations being compared can be expected to have a strong influence on the calculation of p-values.This expectation was confirmed by comparing the BHp yields when only half of the D0 and D4 domains (e.g., by excluding even columns) were compared with the yield when using the full population sets; the identified gene lists showed identical D0/D4 expression ratios, but the BHp values for these two lists were two more than 2 orders of magnitude larger, setting a large number of genes with BHp values outside the BHp > 0.01 DEG limit.These features represent the main drawback of the methodology implemented in this study; extending the analysis to a larger set of low-expression genes, many of which might contribute to the cell phenotype will require a much larger cell base than the one used here.Hence, comparisons were made using the amalgamated file and were limited to the highest 6647 expressing genes out of the more than 21,000 genes identified by the scRNA Seq.
made using the amalgamated file and were limited to the highest 6647 expressing genes out of the more than 21,000 genes identified by the scRNA Seq.The domains indicated by the graphing of Krt12 rise in Figure 2 were used to establish the gene-to-Krt12 expression correlations with the following alterations.First, in the D1-D2 and D2-D3 comparisons, the 10 last and first 10 cells of each domain were excluded to avoid any difference degrading the effect of a zone where the two domains may intermingle.A D4i.1 domain was defined as the first 1000 cells of D4i, after the exclusion of the first 100 cells post-D3 domain.The D4i.2 domain consisted of the next 1000 cells.This domain trimming is indicated in Figure 2F by gaps in the Krt12 line.The D4ex subdomain was not used in the analyses.The Figure 2F inset provides an account of the range of Krt12 expression in A.U., the start and end of each domain, and the number of cells present in each (#).Table 3 provides an account of the identified DEGs for various domain comparisons, without or with the limitation to ratios exceeding 1.35-fold.The D0-D1 comparison is particularly intriguing because it represents the expression changes associated with the initial gene expression of Krt12 that occur while cells are located within the limbal domain; it identifies the genes associated with the cell cohort enriched in the lineage stem cells.There were nearly twice as many downregulated genes as upregulated ones.Table 4 lists the top downregulated (D0/D1; R > 1) and upregulated (D1/D0; 1/R > 1) DEGs, sorted according to decreasing R or 1/R, respectively.The use of the signal ratios seems more relevant than the use of the BHp values because, as discussed above, the latter parameter is heavily influenced by the level of gene expression.The domains indicated by the graphing of Krt12 rise in Figure 2 were used to establish the gene-to-Krt12 expression correlations with the following alterations.First, in the D1-D2 and D2-D3 comparisons, the 10 last and first 10 cells of each domain were excluded to avoid any difference degrading the effect of a zone where the two domains may intermingle.A D4i.1 domain was defined as the first 1000 cells of D4i, after the exclusion of the first 100 cells post-D3 domain.The D4i.2 domain consisted of the next 1000 cells.This domain trimming is indicated in Figure 2F by gaps in the Krt12 line.The D4ex subdomain was not used in the analyses.The Figure 2F inset provides an account of the range of Krt12 expression in A.U., the start and end of each domain, and the number of cells present in each (#).Table 3 provides an account of the identified DEGs for various domain comparisons, without or with the limitation to ratios exceeding 1.35-fold.The D0-D1 comparison is particularly intriguing because it represents the expression changes associated with the initial gene expression of Krt12 that occur while cells are located within the limbal domain; it identifies the genes associated with the cell cohort enriched in the lineage stem cells.There were nearly twice as many downregulated genes as upregulated ones.Table 4 lists the top downregulated (D0/D1; R > 1) and upregulated (D1/D0; 1/R > 1) DEGs, sorted according to decreasing R or 1/R, respectively.The use of the signal ratios seems more relevant than the use of the BHp values because, as discussed above, the latter parameter is heavily influenced by the level of gene expression.The next sequential comparisons, D1-D2 and D2-D3, yielded no DEGs.The combination of D2 and D3, to form the D23 domain, where the maximal Krt12 is three times as large as the maximal Krt12 in D1, yielded only eight upregulated DEGs exceeding the 1/R > 1.35× threshold.One of these genes, CRTAC1, was present within the upregulated D0-D1 DEG list.But, CRTAC1 and all other seven genes, namely, TGFBI, CPXM2, CLU, IGFBP7, ALDH1A1, IGFBP2, and FTH1, were within the top genes in the upregulated D123-Di4.1 list, suggesting that the change in the Krt12 rate of increase that establishes the D2 and/or D3 domains represents the earliest cell changes associated with the transition toward the gene composition of the corneal periphery cell.The comparison of D0 against the full set of limbal domain Krt12-positive cells (D123) expanded the DEG list by 30-40% from the number of DEGs yielded by the D0-D1 comparison.However, for the gene presented in both lists, the ratios for both down-and upregulated DEGs were statistically identical (Mean ± SD of 1.00 ± 0.04 and 1.00 ± 0.04 for dwon and upregulated, respectively and p > 0.6 for both).Thus, notwithstanding the eight genes mentioned above, most of the increase in the DEG list when using D123 instead of D1 in the comparison with D0 is likely due to the effect of the sample size described above in the p values.
Rationally, the next sequential comparison should be between D3 and the first Di4 subdomain Di4.1.However, given the minimal differences between D1, D2, and D3 and considering the small size of D3 and the strong effect of the population size on the chances for identifying DEGs, comparing the whole D123 with Di4.1 instead seemed a more efficient way for DEG identification.Consistent with the concept of a sudden gene expression pattern change at the Li to Pe transition, this comparison yielded very large down-and upregulated DEGs (Table 3).More than 40% of the total cell population displayed a BHp > 0.01.The top downregulated and upregulated DEGs are presented in Table 5.The comparison between D4i.1 and D4i.2 identified genes that undergo statistically significant change along with the increase in Krt12 GE.Table 6 lists the top genes of each category.The average expression values for Krt12 for the D4i.1 and D4i.2 were 37.1 and 127.3 A.U., respectively, equivalent to a 1/R of 3.43.Using this value as a basis for comparison, only three genes displayed an upregulation of larger magnitude than Krt12.In contrast, 25 genes downregulated at a faster speed than the rate of Krt12 increase.The complete set of data used to calculate the is listed in the Supplement S1.Having identified the DEGs associated with each transition, it was possible to identify DEGs that either increase or decrease in expression at each transition and those that display a differential expression only at one of the Krt12-defined domains.The continuous downand upregulation lists consisted of 151 and 72 genes, respectively.The top genes of these two categories are listed in the upper panels of Table 7.Most of the continuously upregulated genes are well-known genes contributing to the corneal phenotype.In addition, 179 genes undergo downregulation only at the start of Krt12 expression in D1, of which 21 had an R exceeding 1.35, the limit used to screen out false positives.These genes are strong candidates as genes that contribute to the stem/precursor cell phenotype and, thus, deserve future attention.Finally, there were 53 genes showing upregulation only in the D0-D1, but the increases were tenuous; no gene showed an increase over the 1.35 limit.
Table 7. Upper panels: Top genes that undergo continuous down or (left top panel) upregulation (right top panel) along the whole Krt12 expression span in the LiPe sample.R is the D0/D4i.2expression ratio; 1/R is the D4i.2/D0 ratio.Lower panels: Top genes that undergo down (bottom left panel) or upregulation (bottom right panel) only at the D0 to D1 transition.R1 and R123 are the D0/D1 and D0/D123 expression ratios, respectively; 1/R1 and 1/R123 are the D0/D1 (R1) and D0/D123 R123 ratios, respectively.The genes with ratios below 1.35 are displayed in italics.Finally, we used the single large 4700-cell Co (central cornea) sample to identify genes associated with the expression of high levels of Krt12 in the basal central corneal cells.The cell set was divided into four quartiles and DEGs between the first (Q start ) and fourth (Q end ) quartiles were calculated.The total number of DEGs are listed in Table 3 (Q start /Q end ); the top down-, and upregulated genes are shown in Table 8.The Q start /Q end ratio for Krt12 was 4.22.Thus, all 160 genes listed in Table 8 undergo fold changes that exceed the change in Krt12.

Gene Ontology Differential Analysis
From the limbus.These cells are more likely to be approaching stratification than the limbal and peripheral cells.DEGs were evinced from a comparison of the lowest 1000 Krt12 (C0-Q start ) expression level cells vs. the highest 1000 (Co-Q end ).The larger dataset allowed the use of 7300 genes.The comparison identified 2002 downregulated and 4277 upregulated DEGs, i.e., 86 % of the genes in the sample.Table 8 lists the top.
To explore the functional and phenotypic source of the differential gene expression between domains, a differential gene ontology analysis was implemented.The downand upregulated whole gene lists for the D0-D1 and the D123-D4i1 sets were submitted for Panther (ver.18.0) classification at https://www.geneontology.org/(accessed on 6 November 2023).The resulting statistically significant domains overrepresented GO terms unique to either each of the domains were identified.Table 9's upper left and right frames list the top unique GO terms (UGOTs) of D0 not present in D1, and the respective D1 UGOTs, limited to those with a false discovery rate < 0.01 and ranked according to the extent of gene overrepresentation (FE).Table 9's bottom left and right frames list the top unique D123 and D4i.1, respectively.Due to the hierarchical structure of the gene ontology classification, lists contain multiple terms related to a single metabolic or bioenergetic activity.Thus, to allow the inclusion of terms associated with different functions within the list length limitation, we have selectively removed the general category terms.Four intriguing terms within D123, the domain representing all Krt12-positive cells within the limbus, are the positive regulation of protein localization to the Cajal body, maturation of LSU-rRNA, negative regulation of stem cell differentiation, and regulation of stem cell population maintenance.For the Q4i.1 set, it is possible to note lumen acidification of several organelles, desmosome organization, cellular response to arsenious substances, and glycolytic processes.

Gene Expression Correlation Analysis
An alternative inquiry on gene expression within the cornea can be based on the degree of correlation between the change in any gene in comparison with the change in Krt12.Due to the very shallow Krt12 rate of change, this approach is not effective for the small Krt12 increase within the limbal D0-D3 domains.The larger cell list for the central cornea, in contrast, provided a basis for obtaining robust data.The first 4600 cells of the Co sample were organized in ascending Krt12 expression levels and split into twenty 230-cell quantiles (Q1-Q20), and the correlation of each gene expression change with the changes in Krt12 was calculated using Excel's Correl function.The top genes positively and negatively correlating with the Krt12 expression changes, respectively, are listed in the upper frame of Table 10, top frame.Table 10, lower frame lists the genes that show CC values in excess + 0.90 in both the D4 domain of LiPe, the Pe, and the Co samples.The 2000-cell D4i subdomain was divided into 20,100-cell quantiles, and the average keratin expressions were calculated and plotted from Q1 to Q20.As depicted in Figure 6, four of the keratins displayed meaningful change.Krt3-expression increases correlated well with the increases in Krt12, though, with a lag in the response.The universal cytokeratins Krt14 and Krt5 displayed opposite changes; Krt5 changes tracked the Krt12 increases, whereas Krt14 exhibited a gradual continuous decrease.Finally, Krt75, a low-expression cytokeratin, showed a particularly intriguing distribution.Within the D0-D3 zone, it remained nearly constant (p > 0.05; Figure 4 inset) but decreased sharply in D4i.

Cytokeratin Expression
Each stratified epithelium expresses a unique set of tonofilament-forming cytokeratin pairs.This pattern is likely to represent an evolutionary adaptation to optimize each lineage to function in its specific environment.In the CE, the tissue-specific cytokeratins are Krt12 and Krt3.However, the CE expresses a large complement of other cytokeratins, in particular, the universal stratified epithelia Krt5 and Krt14.If cytokeratin expression is related to the optimization of function and the expression of Krt12 changes with the degree of differentiation, the question arises as to whether the expression of these other tonofilament-forming proteins undergoes differentiation-associated changes, and if so, how the changes relate to the change in Krt12.The gathered data present a unique opportunity to examine this issue.The 6647-gene LiPe list contains 15 cytokeratins within D4.

Discussion
ScRNA measurements are usually processed by clustering algorithms that calculate the similarity/dissimilarity of data points.One of the main uses of this methodology is the identification and characterization of heretofore undetected phenotypically distinct minute cell populations residing within an organ or tissue, e.g., immunosurveillance or incipient transformed cells.In the ocular surface, several studies have employed this methodology to assign cells as primary belonging to one of the various distinguishable domains of conjunctival and corneal lineages [20][21][22][23].Following this clustering, using differential analysis, it becomes possible to associate each expressed gene with a specific cluster and calculate a statistical probability of the assignment.Given its significance in the management of limbal stem cell deficiency, the cluster of basal limbal cells, the site of the lineage stem/precursor cells is of particular interest.
The present study uses an alternative approach for the identification of corneal epithelial domains.It is based on the fact that the CE cells undergo a single linear differentiation path, coupled with the hypothesis that the degree of corneal epithelial cell differentiation within the basal cell compartment is reflected in the level of Krt12 expression.The resulting extensive differential expression lists and GO terms between the identified domains bear out the validity of the approach.
A graphic analysis of the rate of change of Krt12 in the LIPe sample led to the identification of four distinct Krt12 expression domains within the limbal zone.Approximately half of the cells were Krt12-negative and the rest showed Krt12 increasing in three discreet steps.These data are the first evidence for global basal limbal cell subsets.The highest Krt12 level in the limbal domains, though, amounts to well less than 1% of the maximal expression level in the central corneal epithelium.This very low expression is unlikely to translate into detectable protein levels.Nevertheless, comparative analysis of gene expression in the D0 vs. D1 domains revealed that at the start of intra-limbal Krt12 expression, a very large number of genes undergo down-or upregulation within the basal limbus.The bona fide stem/precursor cells can be reasonably expected to reside within D0.This proposition seems supported by the D0 UGOTs (Table 9).Firstly, regulation of hemopoiesis and embryonic organ development are direct indicators of a relationship to stemness.Secondly, the highest overrepresented term, the positive regulation of core promoter binding combined with an overrepresentation of genes involved in the negative regulation of DNA-templated transcription and the negative regulation of the RNA biosynthetic process and transcription by RNA polymerase II, yields a picture of cells with high potential for wide gene expression capacity which is prevented from strong expression by mechanisms aimed at slowing RNA and protein synthesis; these are features expected from the quiescent stem/precursor cell.Thirdly, a unique ability to deal with misfolded protein is an expected critical capacity of cells surviving in the tissue for an extended period.The NADH to ubiquinone aerobic electron transport chain UGOT may reflect a higher preference for dependence on oxygen by cells closely associated with the blood circulations than in the D1 cells, which initiate the differentiation toward the anaerobic-preferring feature of the avascular central cornea.The main D1 UGOTs, regulation of plasma membrane organization, intermediate filament cytoskeleton organization, cell-cell junction organization, and keratinocyte differentiation, are indicative of the structural changes in the differentiated phenotype, including a substantial increase in cell size.
The smaller D2 and D3 domains showed very little difference with the D1 domain; they seem to belong to cells that are mostly unchanged in gene complexation from D1.However, the subtle upsurge in these cells of the high expression genes that undergo strong increases as cells undergo a frank transition from the limbal to the periphery zones, indicates that, while still within the limbus by the Krt12 expression criterium, these cells undergo the earliest changes associated with the peripheral-corneal phenotype.It is likely that had the cell number available been larger, more genes would have fallen within the BHp < 0.01 which defines a significant difference in this study.To identify the genes associated with the cell transition from the vascular limbus to the avascular cornea, we compared the whole D123 set against D4i.1, the first 1000 cells of D4 after exclusion of the 100 cells in the transition zone.Consistent with the multiple visible sharp phenotypic changes in the CE cell at the Li-Pe interface, the D123-D4i.1 comparison yielded a very large list of DEGs including about 40% of the probed genes (Table 2).
Regarding the D123-D4i.1 comparison itself, a full interpretation of the UGOTs listed in Table 8 is not easily accomplished.From the top UGOTs for the combined D123 domains, it is clear, though, that abundant RNA processing in the nuclear Cajal body that occurs in the Li ceases or drastically decreases once cells migrate to the Pe.The Cajal body function may be associated with other UGOTs, including regulation of telomere maintenance via telomerase, since the enzyme mRNA has been found to associate with the Cajal body and its telomere length regulation [34,35].Other intriguing UGOTs, potentially reflecting the much less differentiated state of the D123 domain in comparison to D4i.1 are the regulation of hematopoietic progenitor cell differentiation, the regulation of stem cell differentiation, and the regulation of stem cell population maintenance (all these terms are present in both the D0 and D1, and thus do not show up in either the D0 or D1 Table 8 UGOTs).The D4i.1 UGOT list includes processes that increase organelle acidification and cellular response to arsenic-containing substances.Arsenic inhibits various mitochondrial enzymes leading to the uncoupling of oxidative phosphorylation.Thus, both the acidification and the arsenic response terms may simply reflect the genetic changes associated with the transition of the cornea cell from an aerobic to an anaerobic-able gene configuration.
After completing the determination of DEGs associated with each transition, it was possible to categorize genes as continuously changing along the differentiation path or identify genes with selective differential expression at certain stages.This examination identified as many genes undergoing downregulation as upregulation.The latter cohort contains very well-recognized genes associated with the corneal phenotype, such as NQO1 Biology 2024, 13, 145 20 of 23 and the aldehyde dehydrogenases, both important detoxification genes.How the strong downregulation of many genes contributes to the limbal-corneal phenotype remains to be examined.
Probably the most intriguing cohort, though, is that of the genes that only undergo downregulation with the transition from Do-D1; they are likely to be critical genes for the stem/precursor cell or its survival.The top two genes in the list are NR2F2 and ID3.The first is a retinoid-responsive nuclear factor with a wide gene promotion pattern.NR2F2 has been shown to act as a promoter of stemness in the epidermis [36].ID3 is a repressor of basic helix-loop-helix transcription factors and has been shown to support human embryonic stem cell maintenance [37].Both are particularly interesting subjects for further research.The additional two comparisons identified either the genes associated with CE cell maturation in relatively early stages in the peripheral zone next to the limbus (D4i.1-D4i.2comparison; Table 7) or between the lower and higher extremes of Krt12 expression in the central cornea (CoQ start -CoQ end ; Table 8), respectively.
An interesting global feature of the Krt12-linked GE changes can be gleaned from the ratio between the down-and upregulated genes in the various interdomain comparisons (Table 3).Within the LiPe sample, the ratio decreases as differentiation proceeds, equaling 1.68, 1.47, and 1.11 for the D0-D123, D123-D4i.1,and D4i.1-D4i.2comparisons.Since the number of transcripts is the same for all cells in the normalized data, the implication is that gene diversity is been progressively lost as differentiation progresses, in particular within the limbal zone itself.The pattern, though, reverts at the center of the cornea, probably because the downregulation has already been mostly completed there, as suggested by the GE pattern described in Figure 7.  7) or between the lower and higher extremes of Krt12 expression in the central cornea (CoQ start -CoQ end ; Table 8), respectively.
An interesting global feature of the Krt12-linked GE changes can be gleaned from the ratio between the down-and upregulated genes in the various interdomain comparisons (Table 3).Within the LiPe sample, the ratio decreases as differentiation proceeds, equaling 1.68, 1.47, and 1.11 for the D0-D123, D123-D4i.1,and D4i.1-D4i.2comparisons.Since the number of transcripts is the same for all cells in the normalized data, the implication is that gene diversity is been progressively lost as differentiation progresses, in particular within the limbal zone itself.The pattern, though, reverts at the center of the cornea, probably because the downregulation has already been mostly completed there, as suggested by the GE pattern described in Figure 7. Two clustering-based studies of scRNA-Seq data found that the top downregulated gene in the D4i.1 to D4i.2 comparison, GPHA2, is acutely localized to a less differentiated subset of limbal cells [21,22].In this study, though, GPHA2 was substantially expressed in the lower subdomain of the Pe.Furthermore, a D0-D4 plot for GPHA2 showed a distribution similar to that shown in Figure 4 for Krt75.Since Krt75 belongs in the same Table 8 list as GPHA2, it was intriguing to examine the distribution of these two genes in the 20 quantiles of D4 used in Figure 6 and the relationship to other downregulating genes of the D4i.1-D4i.2comparison.Figure 7 describes the changes in GPHA2 and for the four genes with the highest correlation coefficient (>0.95) to its distribution.The inset shows the combination of the previously defined D0-D3 domains with the first four quantiles of D4 for GPHA2 Krt12 and Krt3.It is clear that in our analysis, GPHA2 achieves its maximum expression at the very start of the cell transition to the Pe domain (D4Q1).The other genes in Figure 7 display a very similar domain distribution pattern.The apparent discrepancy between both approaches remains to be resolved.
In summary, the present study demonstrates that the application of Krt12 expression as a gauge of the extent of differentiation is an efficient approach for the identification of Two clustering-based studies of scRNA-Seq data found that the top downregulated gene in the D4i.1 to D4i.2 comparison, GPHA2, is acutely localized to a less differentiated subset of limbal cells [21,22].In this study, though, GPHA2 was substantially expressed in the lower subdomain of the Pe.Furthermore, a D0-D4 plot for GPHA2 showed a distribution similar to that shown in Figure 4 for Krt75.Since Krt75 belongs in the same Table 8 list as GPHA2, it was intriguing to examine the distribution of these two genes in the 20 quantiles of D4 used in Figure 6 and the relationship to other downregulating genes of the D4i.1-D4i.2comparison.Figure 7 describes the changes in GPHA2 and for the four genes with the highest correlation coefficient (>0.95) to its distribution.The inset shows the combination of the previously defined D0-D3 domains with the first four quantiles of D4 for GPHA2 Krt12 and Krt3.It is clear that in our analysis, GPHA2 achieves its maximum expression at the very start of the cell transition to the Pe domain (D4Q1).The other genes in Figure 7 display a very similar domain distribution pattern.The apparent discrepancy between both approaches remains to be resolved.
In summary, the present study demonstrates that the application of Krt12 expression as a gauge of the extent of differentiation is an efficient approach for the identification of the dynamics of gene expression changes underpinning the stem/precursor cell phenotype and the progress of CE differentiation.The Discussion provides a very limited example of the analytical possibilities afforded by the results.A future focus on individual genes may help establish a full representation of the coordination of growth and differentiation in the limbal-corneal epithelial lineage.

Conclusions
The degree of expression of Krt12, the corneal specific cytokeratin, in corneal epithelial basal cells, subjected to single-cell scRNA sequence measurement identifies five different domains characterized by the rate of Krt12 expression increase.Differential gene analysis between these domains demonstrates that they represent defined stages of differentiation.Four of these stages occur within the limbal zone, the seat of the limbal-corneal stem/precursor cells.These results combined with a study of the Krt12 gene correlation generate a whole picture of gene dynamics during all stages of differentiation.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/biology13030145/s1,Supplement S1: All data used to generate Tables 4-8.Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Figure 1 .
Figure 1.Ocular surface tissues, surgical segmentation, and the determination of limbal basal cell content and sub-limbal domains based on Krt12 expression changes.(A) Graphical representation of the corneal surface domains.(B) Flow cytometry forward light scattering (FSC) of the adherent epithelial cells collected after a 3 ½ h culture of epithelial cells harvested from either a limbalperipheral combined zone (LiPe), an adjacent peripheral zone (Pe), or a central corneal zone (Co).FSC is a relative measure of cell size.Note that the cells collected from the LiPe sample contain similar amounts of size high and low SFC cells, whereas the Pe and Co populations consist only of high FSC cells.(C) Graphical representation of the four epithelial samples subjected to the scRNA seq analysis.D0-D4 represent the five subsections within the LiPe population identified by the changes in Krt12 expression.The intensity of the red color and the size of each domain has been drawn as a qualitative representation of the Krt12 level and size of each domain, respectively.The potential presence of non-epithelial cells, namely, melanocytes and blood-derived cells within the conjunctival and LiPe domains is indicated.

Figure 1 .
Figure 1.Ocular surface tissues, surgical segmentation, and the determination of limbal basal cell content and sub-limbal domains based on Krt12 expression changes.(A) Graphical representation of the corneal surface domains.(B) Flow cytometry forward light scattering (FSC) of the adherent epithelial cells collected after a 3 ½ h culture of epithelial cells harvested from either a limbalperipheral combined zone (LiPe), an adjacent peripheral zone (Pe), or a central corneal zone (Co).FSC is a relative measure of cell size.Note that the cells collected from the LiPe sample contain similar amounts of size high and low SFC cells, whereas the Pe and Co populations consist only of high FSC cells.(C) Graphical representation of the four epithelial samples subjected to the scRNA seq analysis.D0-D4 represent the five subsections within the LiPe population identified by the changes in Krt12 expression.The intensity of the red color and the size of each domain has been drawn as a qualitative representation of the Krt12 level and size of each domain, respectively.The potential presence of non-epithelial cells, namely, melanocytes and blood-derived cells within the conjunctival and LiPe domains is indicated.

Figure 2 .
Figure 2. Mean gene count distribution and relative Krt12 expression in the limbal-corneal epithelial samples.(A) Mean gene count distribution of the cell population after excluding non-epithelial and conjunctival cells.In all three corneal related samples, at a certain point, the mean count undergoes a rapid decrease.At the high end of the count, there are small populations of high cell count.(B) The mean gene count distribution following exclusion of the very high and low gene count populations.(C) Krt12 gene expression distribution in the LiPe, Pe, and Co populations.Only the LiPe contains cells with nil or extremely low Krt12 gene expression.(D) The distribution of Krt12 gene expression in three independent scRNA-Seq experiments.(E) The distribution of Krt12 gene expression within the 0-20 A.U. range in the LiPe-1 and Pe-1 samples.Inflections in the rate of increase in Krt12 expression point to five distinct domains, D0 through D4.In the Pe sample, there are no Krt12 nil cells, and only 20 cells have expression levels matching the expression range for the D0-D3 domain.(F) The distribution of Krt12 gene expression within the 0-20 A.U. range in the amalgamated LiPe1-LiPe2-LiPe3 sample.Note the near identity of expression ranges with the LiPe-1 shown in panel E. Gaps between domains represent the exclusion of 20 cells at the domain interfaces in the comparative domain analyses.In the table, D4 has been split into two subdomains, D4i and D4ex.

Figure 2 .
Figure 2. Mean gene count distribution and relative Krt12 expression in the limbal-corneal epithelial samples.(A) Mean gene count distribution of the cell population after excluding non-epithelial and conjunctival cells.In all three corneal related samples, at a certain point, the mean count undergoes a rapid decrease.At the high end of the count, there are small populations of high cell count.(B) The mean gene count distribution following exclusion of the very high and low gene count populations.(C) Krt12 gene expression distribution in the LiPe, Pe, and Co populations.Only the LiPe contains cells with nil or extremely low Krt12 gene expression.(D) The distribution of Krt12 gene expression in three independent scRNA-Seq experiments.(E) The distribution of Krt12 gene expression within the 0-20 A.U. range in the LiPe-1 and Pe-1 samples.Inflections in the rate of increase in Krt12 expression point to five distinct domains, D0 through D4.In the Pe sample, there are no Krt12 nil cells, and only 20 cells have expression levels matching the expression range for the D0-D3 domain.(F) The distribution of Krt12 gene expression within the 0-20 A.U. range in the amalgamated LiPe1-LiPe2-LiPe3 sample.Note the near identity of expression ranges with the LiPe-1 shown in panel E. Gaps between domains represent the exclusion of 20 cells at the domain interfaces in the comparative domain analyses.In the table, D4 has been split into two subdomains, D4i and D4ex.

Figure 3 .
Figure 3. Ratios for each gene included in the MT-, RPS, and RPL gene sets.The dotted lines indicate the 1.25x ratios in either direction.

Figure 3 .
Figure 3. Ratios for each gene included in the MT-, RPS, and RPL gene sets.The dotted lines indicate the 1.25x ratios in either direction.

Figure 4 .
Figure 4. Expression relationships between Krt12 and selected high expression genes with expected behavior.(A) Genes expected to increase with differentiation.(B) Genes expected to be invariant across the spectrum of Krt12 increase.N.S.Not statistically significant.

Figure 4 .
Figure 4. Expression relationships between Krt12 and selected high expression genes with expected behavior.(A) Genes expected to increase with differentiation.(B) Genes expected to be invariant across the spectrum of Krt12 increase.N.S.Not statistically significant.

Figure 5 .
Figure5.The effect of cell expression level on the number of genes complying with the BH-p < 0.01 threshold.Each quantile (Q) consists of 2000 genes in decreasing overall gene expression for a total of 18,000 examined genes.

Figure 5 .
Figure5.The effect of cell expression level on the number of genes complying with the BH-p < 0.01 threshold.Each quantile (Q) consists of 2000 genes in decreasing overall gene expression for a total of 18,000 examined genes.

Table 9 .
Top frames: Gene ontology terms unique to D0 (left frame) or D1 (right frames) in the D0-D1 comparison set.Bottom frames: Gene ontology terms unique to D123 (left frame) or D4i.1 (right frame) in the D123 to D4i.1 comparison.The fold expression overrepresentation index (FE) and the false discovery rate (FDR) are listed.

Figure 6 .
Figure 6.Correlation of GE changes with the changes in Krt12 for cytokeratins.Figure 6. Correlation of GE changes with the changes in Krt12 for cytokeratins.

Figure 6 .
Figure 6.Correlation of GE changes with the changes in Krt12 for cytokeratins.Figure 6. Correlation of GE changes with the changes in Krt12 for cytokeratins.

Figure 7 .
Figure 7. Main frame: The distribution of GPHA2 and four highly correlated genes in the first 2000 cells of the Pe domain of LiPe.Inset: The distribution of GPHA2 between the D0-D3 domains and D4i split into 4 quantiles.N.S., not significant; * BHp = 0.0003.

Figure 7 .
Figure 7. Main frame: The distribution of GPHA2 and four highly correlated genes in the first 2000 cells of the Pe domain of LiPe.Inset: The distribution of GPHA2 between the D0-D3 domains and D4i split into 4 quantiles.N.S., not significant; * BHp = 0.0003.

Funding:
This research was funded by NIH/NEI RO1 EY 030567 and EY029279 and by a Challenge Grant from Research to Prevent Blindness (RPB).Institutional Review Board Statement:The Icahn School of Medicine Institutional Review Board has determined that as per section 45 CFR Part 46 of the U.S.A. Code of Federal Regulations, unidentifiable cadaver tissue does not constitute research in human subjects (see https://grants.nih.gov/policy/humansubjects/hs-decision.htm (accessed on 7 December 2023)).The use of human tissue in this study was in accordance with the provisions of the Declaration of Helsinki.

Table 1 .
Accounting of cell number and calculated cell types present in each scRNA seq sample.* as a percent of epithelial.

Table 3 .
Accounting of down and upregulated DEGs within the domains (D) of the LiPe sample and the start and end Quantiles of the Co sample.The total number of DEGs and the number exceeding a 1.35-fold change for the 6647 genes compared are shown.

Table 3 .
Accounting of down and upregulated DEGs within the domains (D) of the LiPe sample and the start and end Quantiles of the Co sample.The total number of DEGs and the number exceeding a 1.35-fold change for the 6647 genes compared are shown.

Table 4 .
Top genes that are downregulated (top) or upregulated (bottom) with the initiation of Krt12 expression (D0 to D1) within the limbal domain.

Table 5 .
Top downregulated (top) or upregulated (bottom) genes in the transition from the Krt12positive limbal cells (D123) to the first 1000 cells of the D4 (i.e., Pe) domain.

Table 6 .
The top downregulated and upregulated genes in the comparison of the D4i, first and second 1000 cell subdomains.

Table 8 .
The top downregulated and upregulated genes associated with the transition from the lowest to the highest Krt12 expression in the central cornea (CoQ start -CoQ end ).

Table 10 .
Top: Genes with a high positive and negative correlation (crl) with the increase in Krt12 expression in the central cornea.Bottom: Genes that correlated with the Krt12 GE increase in D4, Pe, and Co.