Isolation and Characterization of Human Colon Adenocarcinoma Stem-Like Cells Based on the Endogenous Expression of the Stem Markers

Background: The self-maintenance of cancer stem cells (CSCs) is regulated via pluripotency pathways that promote the most aggressive tumor phenotype. This study aimed to use the activity of these pathways to enrich the CSC subpopulation and to separate cells characterized by OCT4 and SOX2 expression. Methods: To select and analyze CSCs, we used the SORE6x lentiviral reporter plasmid for viral transduction of colon adenocarcinoma cells. Additionally, we assessed cell chemoresistance and clonogenic, invasive, and migratory activity, analyzed mRNA-seq data, and performed intrinsic disorder predisposition protein analysis (IDPPA). Results: We obtained a line of CSC-like cells selected on the basis of the expression of the OCT4 and SOX2 stem cell factors. The enriched CSC-like subpopulation had increased chemoresistance as well as increased clonogenic and migration activities. Bioinformatic analysis of the mRNA-seq data identified the up-regulation of pluripotency, development, drug resistance, and phototransduction pathways, and the down-regulation of pathways related to proliferation, cell cycle, aging, and differentiation. IDPPA indicated that CSC-like cells are predisposed to increased intrinsic protein disorder. Conclusion: The use of the SORE6x reporter construct for CSC enrichment allows us to obtain a CSC-like population that can be used as a model to search for new prognostic factors and potential therapeutic targets for colon cancer treatment.


Introduction
Colon cancer remains one of the most frequent cancers in the developed countries, following lung and prostate cancers among males, and breast cancer among females [1]. Despite the availability of a wide range of drugs, the disease at the IV metastatic stage remains incurable, and the five-year survival of these patients does not exceed 5-10% [2,3]. Therefore, the study of malignant tumors in order to find new therapeutic targets remains one of the priorities of the modern medical and biological community.
Considerable attention is focused on the study of the mechanisms by which intra-tumor heterogeneity arises and is maintained [4]. According to the concept of cancer stem cells (CSCs), the growth and metastasis of a tumor are driven by a small population of tumor cells with specific properties. These cells should have a number of functional characteristics, namely, a high ability to form tumors in the body

To enrich a subpopulation of stem-like cells, we used the method proposed by Tang in 2015 [28], which is based on introducing the SORE6x reporter construct into the cells. The SORE6x plasmid contains six repeats of a region of the Nanog gene promoter that serves as a binding site for the endogenously expressed OCT4 and SOX2 transcription factors (Figure 1C). In the original paper, the authors also pointed out that the design allows "discarding" the products of the Oct4 pseudogenes, which makes it a highly effective method for CSC enrichment. Activation of the reporter confers puromycin resistance and drives the expression of a destabilized form of GFP (with a half-life four times shorter than that of the original GFP). These properties allowed us to select a subpopulation of puromycin-resistant cells and to analyze their CSC-like properties in comparison with the cells of the original BSC8 line. After viral transduction and selection on medium with puromycin for 5 days, more than 98% of the cells died. The remaining puromycin-resistant cancer cells were designated BSC8_SORE+ and used for the transcriptome analysis.
We assumed that the resulting subpopulation of cells has properties similar to those of stem cells, because the reporter design confers resistance only on cells with endogenous expression of the recognized stem factors OCT4 and SOX2. Cancer stem cells are, by definition, responsible for the formation of tumors and the maintenance of their growth, and one of their properties is increased resistance to the cytostatics used in cancer therapy. Because the resulting population of puromycin-resistant cells showed increased resistance to 5-fluorouracil compared with the original cell line, we subcloned this population and compared the chemoresistance of the obtained subclones.

The Increased Cytostatic Resistance of Cell Clones Selected after Reporter Transduction
We compared the functional properties of several cell clones obtained from the population of BSC8_SORE+ cells. During the growth of the clones, we observed an increase in intraclonal heterogeneity, as judged by the level of GFP fluorescence (Figure 1D). To select the most malignant clone, we determined cell chemosensitivity to various concentrations of the most active cytostatics widely used in clinical practice, including oxaliplatin. The results for one of the clones compared with the original line are shown in Figure 2. BSC8_SORE+/Clone 10 showed a seven-fold increase in resistance to 5-fluorouracil (5-FU), a five-fold increase in resistance to SN-38, and a six-fold increase in resistance to oxaliplatin (Figure 2). The doubling time of the BSC8 cells is 17 h, whereas that of the enriched stem-like cells (Clone 10) is 23.3 h (1.3 times longer). Given that chemoresistance is considered one of the major properties of CSCs, further work was done to characterize this clone.

Clone Formation under Normoxia and Hypoxia and Cell Migratory Activity Analysis
Next, we analyzed the expression of stem cell markers in the obtained cell line with endogenous OCT4 and SOX2 expression. The expression levels and localization patterns of the markers in the tumor cells of the original BSC8 line were similar to those of the BSC8_SORE+ line. There was no difference in the proportion of cells expressing CD44; its cytoplasmic expression, as determined by staining, was found in at least 40% of the cells. A significant increase in the proportion of LGR5-positive cells was noted in the BSC8_SORE+ cells. Weak cytoplasmic staining was found when antibodies to OCT4, the key embryonic stem cell transcription factor, were used (Supplementary Figure S2).
We analyzed the functional characteristics of the isolated cell clones, such as clonogenic and migratory activity, mobility, and chemoresistance. This analysis revealed that the cells do not differ in clonogenic activity in semi-liquid agar: they formed the same number of colonies, which were also identical in shape and size. The role of cultivation under hypoxic conditions was also analyzed. Sections of colonospheres formed under hypoxic conditions contained a larger number of GFP-expressing cells, and the colonies had a looser structure in hypoxia (Supplementary Figure S3).
When growing BSC8_SORE+/Clone 10 cells on plastic, we observed an increase in intraclonal cell heterogeneity: the levels of dsGFP expression visibly changed. Flow cytometry analysis revealed two cell populations whose fluorescence intensity differed 100-fold. The brightly fluorescent cells were half as numerous as the cells with a low level of GFP expression (Figure 1D).

To find out to what extent increased GFP expression reflects the "degree of stemness" of the selected cells, we isolated GFP+ cells by cell sorting. After plating the cells at clonal density and culturing them for two weeks, we found that the BSC8_SORE+/Clone 10/GFP+ cells exhibited increased clonogenic activity when cultured on plastic (Figure 3A). A ten-fold increase in the clonogenic activity of GFP+ cells obtained from BSC8_SORE+/Clone 10 cells was observed compared with the control samples (Figure 3B). We did not find differences in the clonogenic activity of the "non-green" (GFP-) cells obtained from BSC8_SORE+/Clone 10: their clonogenic activity was the same as that of the cells of the original cell line (BSC8).
When cells were cultured under hypoxic conditions, there was a slight increase in the number of colonies, both in the original cells and in the SORE+ "stem-like" cells. Hypoxic conditions also resulted in larger colonies (Supplementary Figure S3).
Next, we analyzed cell migratory activity by measuring the "wound repair" rate and by monitoring migration through the membrane of a Boyden chamber towards a chemoattractant (medium with FBS). Compared with the original BSC8 line, the BSC8_SORE+ line was characterized by a somewhat reduced level of mobility: it took these cells more than five days to close the wound completely (Figure 4A,B). In the assay of migration through the pores towards the nutrient, the BSC8_SORE+ cells showed a migration rate more than two times higher than that of the original cell line. The large size of the formed colonies indirectly indicates a high clonogenic activity of the cells that migrated through the membrane (Figure 4C,D).
In our xenograft model experiments, a single-cell suspension of 500,000 cells in 100 µL of Hanks' solution was injected subcutaneously into the thigh area of immune-deficient nude mice. Ten nude mice were used in the experiment (one died). The sizes of the tumors measured on day 40 after the injection of BSC8_SORE+ cells (left thigh) and BSC8 cells (right thigh) are shown in Figure 5 and Supplementary Figure S4. Because of the limited number of nude mice, injections were given in both legs of each animal. Thus, we compared the ability of the injected cells to form new tumors and the sizes of these tumors. Tumorigenic activity was evaluated by measuring the tumor diameters 20, 30, and 40 days after the injection.
Thus, enrichment of the BSC8 population with the help of a reporter containing a Nanog gene promoter region that carries a binding site for the OCT4/SOX2 complex contributes to an increase in tumorigenicity. We believe that our results show the activity of expression of POU-domain transcription factors in the stem-like component of the tumor population and characterize this family of proteins as a marker of cancer stem cells and a potential target for genetic cancer therapy. The characteristics of the BSC8_SORE+ cells presented here give us reason to believe that they are similar to cancer stem cells, and in the rest of the text we refer to them as a subpopulation of CSC-like cells.
Figure 5. Comparison of the tumorigenic activity of the "stem-like" BSC8_SORE+ cells and the original BSC8 cells. The sizes of tumors measured in mice 40 days after the injection of BSC8_SORE+ and BSC8 cells are shown. Evaluation of the significance of the differences using Student's t-test showed that the sizes of tumors generated by BSC8 and BSC8_SORE+ cells were significantly different on day 40, p < 0.01. * p < 0.05 for the difference.

The Transcriptomic Landscape of the SORE+ Cells
The BSC8_SORE+ cancer stem cell-like line demonstrated differential expression of 12831 genes compared with the BSC8 control line. Of these genes, 5973 were up-regulated and 6451 were down-regulated (Figure 6).
To identify the most prominent functional effects, we first constructed the protein-protein interaction network (PIN) for proteins encoded by the most up- or down-regulated genes, with an expression difference of more than 10-fold and with the best interaction confidence (>0.9). The network was constructed using the STRING server (https://string-db.org/, accessed on 12 August 2020). Next, we extracted the whole connected component from the induced and inhibited networks and performed k-means clustering using the same server.
Figure 7 and Supplementary Figure S5 show versions of the induced and inhibited PINs with short and extended legends. The up-regulated and down-regulated networks clearly contain large, tightly linked clusters that are involved simultaneously in several functions, which confirms a modular pattern of transcriptome changes. The distribution of gene functions among the clusters indicated that, in accordance with the experimental data, BSC8_SORE+ cells demonstrated clear manifestations of drug resistance, pluripotency, and a decreased proliferation rate. Below, we provide a detailed description of the main clusters from the up- and down-regulated PINs. Regulators with more than five connections were designated as hub proteins.
Figure 7. The network was constructed using the STRING server with the highest interaction confidence. Clustering was performed by the k-means method. Hub proteins, that is, those with more than five connections, are shown as large buttons; node proteins are shown as plain small buttons. The fold expression difference is not less than 10-fold. The GO biological processes most significantly enriched in the clusters from the up- and down-regulated networks are indicated in bar charts under the networks. Cluster numbers coincide with the numbers of the bars on the charts.
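The hub criterion described above (more than five connections) can be illustrated with a minimal sketch. The edge list and the protein names below are hypothetical placeholders; in practice the edges would come from a STRING export filtered at an interaction confidence above 0.9, and the function name `find_hubs` is our own.

```python
from collections import defaultdict


def find_hubs(edges, min_degree=6):
    """Return {protein: degree} for proteins with at least `min_degree`
    distinct interaction partners (i.e., more than five connections)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {p: len(n) for p, n in neighbors.items() if len(n) >= min_degree}


# Hypothetical toy network: HUB1 interacts with six partners,
# and a few of the partners also interact with each other.
toy_edges = [("HUB1", f"P{i}") for i in range(1, 7)] + [("P1", "P2"), ("P2", "P3")]
hubs = find_hubs(toy_edges)
print(hubs)
```

Counting distinct partners rather than raw edges keeps the degree stable if the same interaction is listed twice in the export.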

Characterization of the PINs and Enrichment of Gene Modules for Genes Differently Expressed in BSC8_SORE+ vs. BSC8
The PIN for the up-regulated genes is shown in Figure 7A and Supplementary Figure S5A. The expression fold differences for the up-regulated genes from this PIN are listed in Supplementary Table S1. The PIN for the induced genes contained 12 clusters. The largest of these clusters were related to the regulation of pluripotency and to pathways maintaining pluripotency. The main hubs of the pluripotency clusters were regulators of stem cell differentiation and maintenance, including NANOGNB (PPIDR PONDR-FIT = 62.8%), POU5F1 (39.4%), LIN28A (46.4%), SALL4 (43.6%), SOX4 (82.3%), SOX2 (95.3%), LIN28A (47.4%), FOXO3 (81.9%), KLF8 (60.7%), and IGF1 (47.7%). Additionally, manifestations of stemness were seen in the induction of Nodal signaling, which is responsible for the regulation of morphogenesis, migration, and epithelial-to-mesenchymal transition, and in the gametogenesis cluster. The induced network also contained several clusters that do not participate in pluripotency regulation directly but can promote stem cell survival. These clusters were implicated in innate immunity [38], ubiquitin protein degradation [39], and phototransduction signaling, which mediates neuronal morphogenesis [40]. The PIN for the down-regulated genes contained 12 clusters. The induced PIN also revealed features of drug resistance, as seen from the clusters of drug metabolism through the cytochrome P450 pathway (with IDO1 as the hub), leptin signaling, and Toll-like receptors. The induction of these clusters both confirmed our experimental results indicating higher drug resistance (specifically, to 5-fluorouracil) and showed which particular pathways confer this property [41,42].
The PIN for the down-regulated genes is shown in Figure 7B and Supplementary Figure S5B. The expression fold differences for the down-regulated genes from this PIN are shown in Supplementary Table S2. The PIN for the down-regulated genes contained 11 clusters and also showed manifestations of stemness. For example, the clusters of aging (the main hubs are AKT (PPIDR PONDR-FIT = 18.3%), mTOR (23.6%), and EDN1 (44.3%)) and differentiation (the main hub is MAPK9 (PPIDR PONDR-FIT = 18.4%)) point to dedifferentiation (Figure 7B, Supplementary Figure S5B). The clusters of cell cycle regulation provide evidence of slow cell cycling, a well-known manifestation of stemness [43]. Many hubs of these clusters were classical cell cycle regulators of the E2F, cyclin (CCN), CDK, RB, and PCNA families. Another important cell cycle-related cluster was implicated in chromatin modification via histones (cluster 1; the main hubs were HIST1H2 (PPIDR PONDR-FIT = 78.4%) and HIST3H2 (PPIDR PONDR-FIT = 44.4%)). The clusters of immune response and inflammation via interferon signaling, chemokines, and TNF (PPIDR PONDR-FIT = 12.4%) in the PIN of down-regulated genes suggest a decreased activity of pro-inflammatory signaling. Since pro-inflammatory signaling stimulates stem cell differentiation [44], the decreased activity of this pathway also points to cell dedifferentiation.
Therefore, the functional picture provided by the PIN clusters for the inhibited and induced genes indicated that, compared with the BSC8 control cells, BSC8_SORE+ cells demonstrate manifestations of pluripotency and drug resistance. An unexpected result was the induction of genes participating in leptin signaling and phototransduction. BSC8_SORE+ cells also showed coordinated down-regulation of signaling involved in cell cycle regulation and phase transition, inflammation, aging, and differentiation, which complements the data obtained with the induced network.

Gene Module Analysis
To obtain a transcriptome-wide picture and to find out whether the discovered effects were manifested globally, we performed Gene Ontology (GO) and Cancer BioSystems analysis of functional gene modules (by averaging gene expression). The sixty most significantly up- and down-regulated modules are shown in Figure 8A-D and Supplementary Tables S3-S6. The results of this analysis provided important support for the PIN analysis data. For example, similar to the trends found by the analysis of PIN clusters, a substantial number of the induced modules from both databases were related to pluripotency, including signaling via NODAL (PPIDR PONDR-FIT = 13.3%) and activin (branches of the TGF-β pathway), and signaling via Yamanaka factors (POU5F1 (PPIDR PONDR-FIT = 39.4%), NANOG (62.8%), and SOX2 (95.3%)). Additionally, several modules were related to TGF-β, c-Kit, and FGF signaling and to EMT and stemness, as well as to male gametogenesis and development (Figure 8A,C, Supplementary Tables S3 and S5). A large number of the up-regulated modules were also involved in the response to various stressors and stimuli, such as drug resistance and antimicrobial immunity. Importantly, gene module analysis confirmed the activation of the phototransduction and leptin signaling pathways.
In accordance with the PIN analysis results, the down-regulated gene modules were related to cell growth, proliferation, cell cycle regulation and phase transition, aging, differentiation, chromosome organization, chromatin remodeling, immune response (including interferon signaling, inflammation, antigen processing, and cytokine signaling), and RNA metabolism (Figure 8B,D, Supplementary Tables S4 and S6). The new down-regulated modules were represented by several pathways related to apoptosis and numerous pathways associated with protein metabolism.
Therefore, the gene module analysis results confirmed that the effects identified for the most deregulated genes by PIN analysis apply to the entire transcriptome.
To verify the resistance of the BSC8_SORE+ cells to 5-fluorouracil by in silico analysis, we used CDRgator (Cancer Drug Resistance navigator) [45]. CDRgator (http://cdrgator.ewha.ac.kr, accessed on 15 September 2020) is a database that unifies data on drug resistance gene signatures for about 1000 cancer cell lines and for cancer cells of resistant patients (for about 30 drugs). Of all the gene signatures, we selected those involved in the regulation of, and responsible for, the 5-fluorouracil resistance of colorectal carcinoma. As a result, we obtained three gene signatures (30, 35, and 600 genes) and compared them with our gene expression data, using the binomial test to evaluate statistical significance (Supplementary Table S7).
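The text does not specify how the binomial test was set up, so the following is only a sketch of one plausible formulation. It assumes that each signature gene is scored as concordant or not with the expected direction of change, and that under the null hypothesis concordance occurs with probability 0.5. The signature size of 30 is taken from the text; the count of 22 concordant genes is hypothetical, as is the function name `binomial_sf`.

```python
from math import comb


def binomial_sf(k, n, p=0.5):
    """One-sided binomial tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: a 30-gene signature in which 22 genes (hypothetical count)
# change in the direction expected for 5-fluorouracil resistance.
n_signature = 30
n_concordant = 22
p_value = binomial_sf(n_concordant, n_signature)
print(f"one-sided p = {p_value:.4g}")
```

A one-sided tail is used here because the question is whether concordance exceeds chance; a two-sided test or a different null probability would be equally defensible depending on how the signatures are defined.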
Since IDPs/IDPRs are known to play crucial roles in PIN regulation, and since many IDPs/IDPRs serve as hub proteins [61,64-68], we also evaluated the intrinsic disorder predispositions of the proteins encoded by the genes that were up- and down-regulated with very high significance in the CSC-like BSC8_SORE+ cell line (Supplementary Tables S8 and S9). The genes encoding these proteins are also listed in Supplementary Tables S1 and S2. Figure 9A represents the results of the CH-CDF analysis of these two protein sets. This approach allows predictive classification of proteins, based on their position within the CH-CDF phase space, into structurally different classes, such as ordered proteins, proteins with compact disorder/hybrid proteins, and proteins with extended disorder [69-72]. This analysis revealed that among the 219 induced proteins, 21 proteins are expected to be highly disordered, 32 proteins might have a molten globular or hybrid structure, and 164 proteins are mostly ordered. Curiously, the 175 inhibited proteins showed a somewhat different disorder status, with 15, 51, and 107 being expected to behave as native coils/native pre-molten globules, native molten globules/hybrid proteins, and ordered proteins, respectively. In other words, 24.2% of the induced proteins and 37.8% of the inhibited proteins are expected to have high levels of intrinsic disorder.
Based on the outputs of per-residue disorder predictors, such as the content of disordered residues (or percent of predicted intrinsically disordered residues, PPIDR) and the mean disorder score (MDS), proteins are typically grouped into three categories: highly ordered (PPIDR = 0-10% or MDS = 0.00-0.25), moderately disordered (PPIDR = 10-30% or MDS = 0.25-0.50), and highly disordered (PPIDR greater than 30% or MDS greater than 0.50). Obviously, the MDS value calculated for a given protein is not directly related to its PPIDR value (e.g., theoretically, a protein with a PPIDR of 100% might have an MDS ranging from 0.5 to 1.0, whereas a protein with a PPIDR of 0% might have any MDS < 0.5). Figure 9B shows the MDS vs. PPIDR plot generated for the induced and inhibited proteins in the CSC-like BSC8_SORE+ cells based on the results of their analysis by the meta-predictor of intrinsic disorder, PONDR® FIT, which is slightly more accurate than any of its component predictors. Note that the results of the analogous analysis using some of the components of PONDR® FIT (PONDR® VLXT, PONDR® VL3, PONDR® VSL2, IUPred_L, and IUPred_S) are provided in Supplementary Materials (Supplementary Tables S8 and S9).
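The three-category grouping described above can be sketched as follows. This is an illustration rather than the authors' pipeline: the per-residue cutoff of 0.5 for calling a residue disordered is the conventional threshold and is treated here as an assumption, the half-open bin boundaries are our choice, and the function names and the four-residue score list are hypothetical.

```python
def disorder_metrics(scores, residue_cutoff=0.5):
    """Compute PPIDR (%) and MDS from per-residue disorder scores in [0, 1].

    Assumption: a residue is counted as disordered when its score is
    at or above `residue_cutoff`.
    """
    n = len(scores)
    ppidr = 100.0 * sum(s >= residue_cutoff for s in scores) / n
    mds = sum(scores) / n
    return ppidr, mds


def classify_by_ppidr(ppidr):
    """Bins: PPIDR < 10%, 10% <= PPIDR < 30%, PPIDR >= 30%."""
    if ppidr < 10:
        return "highly ordered"
    if ppidr < 30:
        return "moderately disordered"
    return "highly disordered"


def classify_by_mds(mds):
    """Bins: MDS < 0.25, 0.25 <= MDS < 0.5, MDS >= 0.5."""
    if mds < 0.25:
        return "highly ordered"
    if mds < 0.5:
        return "moderately disordered"
    return "highly disordered"


# Hypothetical four-residue profile: half of the residues are disordered,
# yet the mean score stays below 0.5, so the two criteria disagree.
ppidr, mds = disorder_metrics([0.1, 0.2, 0.6, 0.9])
print(ppidr, classify_by_ppidr(ppidr))
print(round(mds, 2), classify_by_mds(mds))

# PPIDR values quoted in the text for SOX2 and TNF.
print(classify_by_ppidr(95.3), classify_by_ppidr(12.4))
```

The toy profile shows why PPIDR and MDS are reported side by side: a protein can fall into different categories depending on which measure is used.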
Overall, this PONDR® FIT-based analysis indicated that the majority of the induced and inhibited proteins are either moderately or highly disordered. In fact, in the induced set, 68 (31.1%), 96 (43.8%), and 55 proteins (25.1%) have PPIDR < 10%, 10% ≤ PPIDR < 30%, and PPIDR ≥ 30%, respectively, indicating that 68.9% of these proteins are either highly or moderately disordered. A similar picture is observed when these proteins are grouped based on their MDS values: 95 (43.4%), 94 (42.9%), and 30 (13.7%) have MDS < 0.25, 0.25 ≤ MDS < 0.5, and MDS ≥ 0.5, respectively.
Therefore, based on their MDS values, high or moderate levels of disorder are found in 124 (56.6%) of the induced proteins. Application of the analogous classification to the inhibited proteins indicated that 39 (22.3%), 71 (40.6%), and 65 (37.1%) of them have PPIDR < 10%, 10% ≤ PPIDR < 30%, and PPIDR ≥ 30%, and that 52 (29.7%), 92 (52.6%), and 31 (17.7%) of these proteins have MDS < 0.25, 0.25 ≤ MDS < 0.5, and MDS ≥ 0.5. In other words, based on their PPIDR/MDS values, 77.7%/70.3% of the inhibited proteins belong to the moderately or highly disordered categories. Figure 9B also makes it possible to select the most disordered proteins in the sets, namely proteins with MDS ≥ 0.5 and/or PPIDR ≥ 30%. This analysis indicated that 55 (25.1%) of the induced and 65 (37.1%) of the inhibited proteins satisfy these criteria. Even using PPIDR ≥ 50% as the most stringent criterion for a protein to be classified as highly disordered, we found that 26 (11.9%) of the induced and 29 (16.6%) of the inhibited proteins belong to this category.
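As a quick consistency check, the percentages quoted for the PONDR® FIT-based grouping can be recomputed from the reported counts. The sketch below uses only the counts and set sizes given in the text (219 induced and 175 inhibited proteins); the variable and function names are ours.

```python
def percentages(total, bins):
    """Percent of `total` in each bin, rounded to one decimal place."""
    return [round(100 * count / total, 1) for count in bins]


# Counts per category: ordered, moderately disordered, highly disordered.
induced_ppidr = [68, 96, 55]
induced_mds = [95, 94, 30]
inhibited_ppidr = [39, 71, 65]
inhibited_mds = [52, 92, 31]

for label, total, bins in [
    ("induced, PPIDR", 219, induced_ppidr),
    ("induced, MDS", 219, induced_mds),
    ("inhibited, PPIDR", 175, inhibited_ppidr),
    ("inhibited, MDS", 175, inhibited_mds),
]:
    # Share of proteins that are moderately or highly disordered.
    disordered = round(100 * (bins[1] + bins[2]) / total, 1)
    print(label, percentages(total, bins), "moderate or high:", disordered)
```

Each set of three counts sums to its set size, and the recomputed shares match the values given in the text.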
Curiously, although the inhibited proteins seem to be somewhat more disordered than the induced ones, the Top-10 induced proteins showed higher levels of disorder than the Top-10 inhibited proteins. In fact, the PPIDR scores of the 10 most disordered proteins in the induced set ranged from 97.6% to 76%, whereas the PPIDR scores of the 10 most disordered proteins in the inhibited set were somewhat lower, ranging from 90.8% to 67.1%. Finally, the potential relation of intrinsic disorder to the functionality of the induced and inhibited proteins was highlighted by adding the corresponding PPIDR values to the proteins discussed in the previous sections and in the Discussion section below. Importantly, most of these proteins are either moderately or highly disordered, indicating that intrinsic disorder is needed for the function of proteins associated with the stemness of CSCs.

Discussion
In this study, we obtained a line of the colon adenocarcinoma stem-like cells. The selection was based on the activation of reporter genes driven by the binding of the SOX2 and OCT4 endogenous stem cell factors. Selected cells were characterized by increased clonogenic and migration activity, and also showed increased resistance to cytostatic drugs. Using bioinformatics methods, we analyzed the levels of gene expression, combining them into functional groups, followed by the analysis of their roles in the maintenance and functioning of tumor stem cells. We also checked the intrinsic disorder status of the corresponding proteins.
Stem cell renewal can be achieved by symmetric as well as asymmetric division [73,74]. Asymmetric division of stem cells is a highly conserved and precisely regulated process that results in the formation of two cells differing in potential, morphology, and function [75]. During long-term cultivation, the population of GFP+ cells acquires heterogeneity, which we attribute to the differentiation of a fraction of the cells. FACS sorting of this heterogeneous population allowed us to obtain cells with low and high clonogenic activity. The clonogenic activity of the BSC8_SORE/Clone 10/GFP+ cells was 10 times higher than that of the cells from the original BSC8 line. This confirmed that the GFP+ cells had stem properties.
It is assumed that only cells with unlimited proliferation ability are capable of forming clones both in vitro and in vivo in immunodeficient mice [76,77]. An increase in the clonogenicity of tumor cells under the influence of various factors (for example, activation of the expression of embryonic transcription factors) is interpreted as an increase in "stemness" and, hence, in malignant potential [76]. The transcriptome analysis revealed high levels of expression of a number of stem factors involved in ensuring both normal and tumor stemness. Below, we describe some of these factors in more detail.
Due to the growing interest in tumor stemness, many results have accumulated concerning the expression of OCT4 variants in tumors of different localization. Although OCT4A expression is limited to embryonic cells and embryonic carcinoma cells, it has been noted that OCT4B is expressed in human somatic stem cells, tumor cells, adult tissues, and pluripotent cells [78][79][80][81][82]. Pluripotency is an important feature of CSCs that promotes self-renewal and chemoresistance. Preservation of embryonic stem cell (ESC) pluripotency under different pathophysiological conditions requires a complex interaction between different cellular pathways, including those involved in homeostasis and energy metabolism. However, the exact mechanisms that support CSC pluripotency remain unclear. It seems that the molecular pathway of self-renewal in normal stem cells is the same as that in CSCs in tumors [83]. Many self-renewal regulatory factors, such as OCT4, SOX2, BMI, and NANOG, are expressed in human malignant tumors, and they play an important role in carcinogenesis. The ability of CSCs to self-maintain, a feature they share with ESCs, provides CSCs with the ability to form metastatic tumors and also contributes to the activation of some embryonic mechanisms that protect cells from the effects of chemotherapy and autoimmune aggression. The expression of many oncofetal and testicular antigens generally correlates with a negative clinical prognosis of cancer. We will discuss only some examples of proteins characteristic of CSCs, which were identified during the comparative analysis of the BSC8_SORE+ and the original BSC8 cell line transcriptomes.
As an example of a protein with an increased level of expression, we should mention another important transcription factor encoded by the oncofetal gene SALL4 (PPIDR PONDR-FIT = 43.6%) [84,85]. As in other solid tumors, its increased expression has been described in colon cancer cells [83]. The BSC8_SORE+ cells also show increased expression of TDGF-1/CRIPTO-1 (PPIDR PONDR-FIT = 10.6%), a co-receptor for Nodal signaling pathways that was previously described as a factor of unfavorable prognosis. This factor plays an important role in the maintenance of stem cells and in the activation of the metastatic abilities of cells by increasing motility, increasing MMP2 (PPIDR PONDR-FIT = 2.42%) expression, and inducing EMT and the chemokine receptor CXCR4 [86][87][88][89][90].
The results of numerous studies confirm the role of a complex of pathways involved in cell chemoresistance in the biology of CSCs and the realization of their stem potential. According to the transcriptome analysis, the BSC8_SORE+ cells express elevated levels of ABCB5 (PPIDR PONDR-FIT = 8.04%). Its expression allows tumor cells to form a multidrug resistance (MDR) phenotype enabling resistance to 5-FU chemotherapy [91,92]. A number of studies have shown the role of ABCB5 in the resistance and functioning of cancer stem cells of solid tumors such as oral cavity and colon cancers [93,94]. CYP1A2 (PPIDR PONDR-FIT = 5.43%), which is involved in the phase 1 excretion of xenobiotics, is expressed at an elevated level in the BSC8_SORE+ cells. IDO1 (indoleamine 2,3-dioxygenase) is a key regulator of the activity of these enzymes. IDO1 metabolites (in particular tryptophan metabolites) are capable of inhibiting the functions of cytotoxic T lymphocytes, leading to local immunosuppression [95][96][97][98].
The activity of co-transporters and enzymes can significantly reduce the intracellular concentration of drugs. The combination of the activity of DNA repair systems with the violation of the restriction points of the cell cycle leads to CSC survival accompanied by a gradual accumulation of new mutations. As for DNA repair, it should be noted that increased activity of a number of auxiliary proteins, such as NEK8 (PPIDR PONDR-FIT = 12.4%), which takes part in RAD51-dependent DNA repair (NEK8 regulates DNA damage-induced RAD51 foci formation and replication fork protection), was also shown in the BSC8_SORE+ cells.
An important role in ensuring cytostatic resistance is played by the retention of cells in a state of mitotic rest (G0 phase). "Dormant" tumor cells are not sensitive to most cytostatic agents and are also more resistant to the adverse microenvironmental conditions encountered during distant metastasis formation [99]. One of the distinctive features of the BSC8_SORE+ cells is the slowing of the cell cycle and the inhibition of cell metabolism. According to experimental data, the BSC8_SORE+ cell cycle duration is 23.3 h, which is 1. Signs of metabolic deprivation are also consistent with the slowing of the cell cycle. It is known that cell cycle slowing is one of the key CSC properties and can lead to increased resistance to cycle-dependent cytostatics (5-FU and SN-38) [100]. The selection of a cell population with a low cycling rate is one of the methods of enrichment of highly malignant cells. The selected cells correspond to the CSC criteria: they exhibit increased chemoresistance, clonogenic activity, and migratory activity, and also express various extracellular matrix-degrading enzymes [75,101,102].
The expression levels of chemokine receptors and their ligands, such as CXCR2, CXCL14, CCL16, CXCL17, CCL28, and CCL22 (with the PPIDR PONDR-FIT of 13.1%, 32.3%, 24.2%, 72.3%, 35.4%, and 18.3%, respectively), were also increased in BSC8_SORE+ cells. At the same time, CXCL1, CXCL3, CXCL8, CXCL10, and CXCL11 were repressed (PPIDR PONDR-FIT of 24.5%, 25.2%, 16.2%, 24.5%, and 23.4%, respectively). Under physiological conditions, chemokines participate in the directed movement of immune cells (in particular lymphocytes). They are assumed to play a guiding role in the metastasis of tumor cells to lymph nodes and distant organs [103,104]. There are also papers showing that chemokines participate in stem cell functioning. Overexpression of CCL28 enhances cell proliferation, as well as migration and clonogenic properties [105]. According to the authors, these effects are due to activation of the MAP kinase cascade.
The aforementioned CXCR2 overexpression is consistent with literature data stating that CXCR2 overexpression on the surface of tumor cells is a marker of poor prognosis for some types of cancer [106][107][108][109]. The use of oral CXCR1/2 inhibitors suppresses liver metastasis of colon cancer by inhibiting neovascularization and increasing apoptosis of tumor cells [110]. The importance of angiogenesis for tumor growth and spreading is undeniable. In the obtained stem cells, we observed increased expression of the angiopoietins ANGPT2 and ANGPT4 (PPIDR PONDR-FIT of 8.07% and 32.8%, respectively), which are ligands of the TIE2 receptor on the surface of endothelial cells [111,112]. Their expression promotes VEGF-independent tumor neovascularization. According to the literature, they are negative prognostic markers [113][114][115][116][117].
Finally, our bioinformatics analysis indicated that a significant portion of up- and down-regulated proteins in the CSC-like BSC8_SORE+ cells are moderately or highly disordered. In fact, induced and inhibited proteins contain variable levels of intrinsic disorder, and their PPIDR PONDR-FIT values range from 2.21% to 97.6% and from 2.65% to 90.8%, respectively. Consideration of the presence of intrinsic disorder in proteins associated with the stemness of CSCs provides an important angle for a better understanding of their functionality. In fact, the high disorder content in many stemness-associated proteins can be related to the presence of multiple posttranslational modification (PTM) sites and numerous isoforms generated in these proteins by alternative splicing; it can also define their binding promiscuity and ability to undergo binding-induced folding upon interaction with specific partners. Such disorder-based structural and functional heterogeneity of human proteins associated with CSC stemness is in agreement with the well-established fact that IDPs or hybrid proteins containing ordered domains and IDPRs are typically involved in recognition, regulation, and cell signaling [58,61][118][119][120][121][122][123][124][125][126][127][128][129][130][131][132], and are commonly found among proteins related to the pathogenesis of various human diseases [123][133][134][135][136][137][138][139][140][141]. These findings are also in line with the previously reported roles of intrinsic disorder in reprogramming transcription factors (also known as Yamanaka factors, such as SOX2, OCT3/4 (POU5F1), KLF4, and c-MYC, and the Thomson factors, such as SOX2, OCT3, LIN28, and NANOG), overexpression of which leads to the transformation of terminally differentiated somatic cells into induced pluripotent stem (iPS) cells [142].

Obtaining the Primary Cell Line
The material for this study was obtained on the basis of the patients' informed consent. The ongoing research received the ethical approval of the ethics committee of the St. Petersburg Clinical Research and Practical Center of Specialized Types of Medical Care (Oncologic). Tumor material was obtained during the planned surgical treatment of 12 patients. Tissue fragments were transported to the laboratory within 6-12 h after surgery in saline solution.
The cells were isolated from the primary tissue based on the recommendations of Yu et al. [143] and according to our previous study [144]. Each sample was cut into pieces of approximately one cubic millimeter in size and incubated in a 10-fold (by volume) excess of trypsin (Gibco, Gaithersburg, MD, USA) overnight in a refrigerator at 4 °C, followed by a one-hour incubation at 37 °C. The action of trypsin was inhibited by RPMI medium with 10% FBS (fetal bovine serum). This suspension of disintegrated cells was centrifuged to collect the cells (150× g for 5 min). The cells were then resuspended in Roswell Park Memorial Institute (RPMI) 1640 culture medium with 10% FBS and plated on 10 cm dishes. The remaining "undigested tissue" was treated with a solution of collagenase in RPMI medium with serum containing 500 units/mL of the enzyme and incubated in a Petri dish at 37 °C for one hour. The released cells were collected and placed in a medium for culturing in the incubator. The colorectal cancer cells were cultured in Dulbecco's Modified Eagle Medium (DMEM) (Gibco, Gaithersburg, MD, USA) with 10% FBS (Sigma, Co, St. Louis, MO, USA). Cells were passaged using trypsin solution (Gibco, Gaithersburg, MD, USA) once every three days at a ratio of 1:3. We used PCR-based testing with universal primers specific to the 16S rRNA region to detect mycoplasma contaminants in cell culture [145][146][147].

Packaging of Viral Particles, Infecting Cells in Vitro, and Selecting Clones with Integrated SORE6x Reporter
We used the HEK 293T human embryonic kidney cell line to obtain viruses. Cells were transfected by the calcium phosphate method to introduce the SORE6x plasmid reporter construct (courtesy of Dr. Tang), which is based on the lentiviral integrating vector, as well as auxiliary plasmids to produce viral particles, according to the standard protocol (https://www.epfl.ch/labs/tronolab/wp-content/uploads/2019/06/LV_production.pdf, accessed on 19 July 2020) and our previous study [28]. Usually, infection was done twice (that is, after a day, another 10 µL of viral particles were added). Three days after the virus infection, the cells were transferred to 10-cm culture dishes (Corning Life Sciences, MA, USA) in the culture medium. The next day, the antibiotic puromycin (Sigma-Aldrich Co, St. Louis, MO, USA) was added to the medium at a concentration of 5 µg/mL. Selection was continued for 10-14 days, with the antibiotic-containing medium changed every two days. We selected growing colonies individually and, after treatment with trypsin, plated the cells on a six-well plate.

Immunofluorescence Staining of Formalin-Fixed Paraffin-Embedded Tissue Sections
We hydrated the sections gradually with graded alcohols: washed in 100% ethanol twice for 15 min each time, then in 90% ethanol twice for 15 min each time, and rinsed in deionized H2O for 1 min. Antigen unmasking was performed by heat treatment with 10 mM sodium citrate buffer, pH 6.0, at 95 °C for 5 min. Samples were incubated for 30 min with a blocking solution (1% horse serum in phosphate-buffered saline (PBS)) and washed with three changes of PBS for 5 min each. For staining, sections were incubated with primary antibody diluted in PBS with 1% bovine serum albumin (BSA) for 60 min at room temperature or overnight at 4 °C, followed by washing with three changes of PBS for 5 min each. Secondary antibodies were diluted in PBS with 1% BSA, and samples were incubated at room temperature for 60 min in a dark chamber. Immediately after washing with three changes of PBS for 5 min each, sections were covered with either an aqueous or a hard-set mounting medium. They were examined using a fluorescence microscope with appropriate filters.

Determining the Resistance to Cytostatics by the MTT Method
We evaluated the cell viability using a colorimetric method with MTT (Sigma-Aldrich Co., St. Louis, MO, USA). The method is based on the fact that mitochondrial oxidoreductases of living cells reduce yellow MTT to purple formazan. The amount of formazan produced correlates with the number of viable cells in the population. To determine the sensitivity of colon adenocarcinoma cells to cytostatic drugs (5-Fluorouracil, SN-38, Oxaliplatin), we seeded 50,000 cells per well of a 96-well plate in a volume of 100 µL and added 25 µL of the drug solution at five times the final concentration. The drugs were used at a concentration that allowed us to determine the change in the cell line chemosensitivity (5-FU, 100 ng/mL), or at a range of concentrations for determining IC50. Cytostatic solutions were prepared in several test concentrations. For each cell line studied, we seeded the cells in six control wells without adding the cytostatic. After incubation for two days in a cell incubator, 10 µL of the MTT solution (5 mg/mL in PBS buffer) was added to the medium, followed by incubation for 2-4 h. Then the medium was carefully removed and formazan crystals were dissolved in 100 µL of DMSO (Sigma-Aldrich Co, St. Louis, MO, USA). We measured the optical density at a wavelength of 570 nm on the Thermo Electron Multiskan EX (Invitrogen, Thermo Fisher Scientific; Waltham, MA, USA). For each experimental point, we repeated the measurement six times and calculated the standard error of the mean. The results were presented as the percentage of cells that survived cytostatic treatment relative to the control (the number of cells that were not treated).
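The calculation behind these readouts can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, and the optional blank subtraction is an assumption that the text does not mention.

```python
from math import sqrt
from statistics import mean, stdev


def final_concentration(stock_fold, drug_volume_ul, medium_volume_ul):
    """Fold concentration in the well after adding the drug (1.0 = target)."""
    return stock_fold * drug_volume_ul / (drug_volume_ul + medium_volume_ul)


def viability_percent(treated_od, control_od, blank_od=0.0):
    """Survival as % of untreated control, from OD570 readings.

    blank_od is an assumed optional background value; the text does not
    describe a blank correction, so it defaults to zero.
    """
    treated = mean(treated_od) - blank_od
    control = mean(control_od) - blank_od
    return 100.0 * treated / control


def standard_error(values):
    """Standard error of the mean for replicate readings (n >= 2)."""
    return stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Made-up OD570 values for six replicate wells, for illustration only.
    control = [1.02, 0.98, 1.00, 1.01, 0.99, 1.00]
    treated = [0.52, 0.48, 0.50, 0.51, 0.49, 0.50]
    print(final_concentration(5, 25, 100))
    print(viability_percent(treated, control), standard_error(treated))
```

With the volumes given in the text, `final_concentration(5, 25, 100)` evaluates to 1.0, which is why a five-fold stock is added in 25 µL to 100 µL of cell suspension.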
To verify resistance of BSC8_SORE+ cells to 5-fluorouracil in silico, we used Cancer Drug Resistance Navigator (CDRgator) (http://cdrgator.ewha.ac.kr, accessed on 15 September 2020). CDRgator is a database that unifies data on drug resistance gene signatures for about 1000 cancer cell lines and cancer cells of resistant patients (for about 30 drugs).

Cell Proliferation and Colony Formation Assaying
We seeded the cells into a 96-well plate at 0.5 × 10⁴ cells/well in complete medium at 37 °C. To determine the number of live cells, we added the methylthiazolyldiphenyltetrazolium bromide (MTT) reagent to each well at 0, 24, 48, 72, and 96 h. The absorbance at 570 nm was measured spectrophotometrically with a microplate reader (Bio-Rad Laboratories, Hercules, CA, USA). To determine clonogenic ability, we seeded the cells into a 24-well plate at 200 cells/mL in five replicates and incubated them for 14 days. Then we stained the cell colonies with 0.01% crystal violet and counted them.

Soft Agar Cloning
Cells were counted, resuspended at 2 × 10³ cells/mL in the medium (DMEM with 10% FBS and L-glutamine) containing 0.3% weight/volume (w/v) agar (Bacto, Dickinson, Sparks, MD, USA), and overlaid onto a 30-mm dish containing a solidified bottom layer of 0.6% w/v agar in the same medium. After incubation for 10-15 days at 37 °C and 10% CO2, all dishes were stained by adding 1 mL/dish of 0.01% (w/v) crystal violet (Fronine, Taren Point, NSW, Australia), and the colonies were counted with a dissection microscope. The assay was performed in triplicate. The role of cultivation under hypoxic conditions was analyzed in a hypoxia incubation chamber (StemcellTechnologies, Vancouver, British Columbia, Canada) with certified medical grade pre-mixed gas (1% O2, 5% CO2, 94% N2).

Wound Repair Assay
Cells were plated in 24-well plates at 10⁶ cells/well in 1 mL of the culture medium. Two days later, a wound was scratched in the adherent cell monolayers with an Eppendorf tip, and the medium was changed to DMEM supplemented with 1% FBS (Invitrogen, Thermo Fisher Scientific; Waltham, MA, USA). We examined the wells every day and took the photomicrographs using the EVOS FL Auto Imaging System (Invitrogen, Thermo Fisher Scientific; Waltham, MA, USA). Then we measured the wound width on the photomicrographs using the same well area for each measurement.
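Wound widths measured in this way are commonly converted into a relative closure value so that wells with different initial scratch widths can be compared. The sketch below shows one such conversion; the formula and the function name are assumptions made for illustration, since the text only states that the wound width was measured.

```python
def wound_closure_percent(initial_width, current_width):
    """Percentage of the initial wound width that has closed.

    Both widths must be in the same unit (for example, micrometers
    measured on the photomicrographs at the same well position).
    """
    if initial_width <= 0:
        raise ValueError("initial_width must be positive")
    return 100.0 * (initial_width - current_width) / initial_width


if __name__ == "__main__":
    # Made-up widths in micrometers at day 0 and on a later day.
    print(wound_closure_percent(800.0, 200.0))
```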

Migration Assay
For the migration assay, we used Transwell chambers (Corning Product; Corning, NY, USA) equipped with 8-µm-pore inserts. We plated the cells in serum-free medium on uncoated inserts and incubated them for 48 h. A volume of 600 µL of culture medium containing 20% FBS (Invitrogen, Thermo Fisher Scientific; Waltham, MA, USA) was added to the lower chamber. We removed the non-migrated cells, fixed the cells attached to the bottom of the membrane with 4% paraformaldehyde, stained them with 5% crystal violet (Sigma-Aldrich Co., St. Louis, MO, USA), and counted them at 200× magnification. These experiments were performed in triplicate.

RNA Isolation for Transcriptome Sequencing
Total RNA was isolated from cultured cells and tissue samples (with preliminary homogenization) using the RNeasy Mini Kit (Qiagen). RNA concentration was measured on a NanoDrop 2000 spectrophotometer (Invitrogen, Thermo Fisher Scientific; Waltham, MA, USA). RNA was stored at −80 °C.

Transcriptome Sequencing and Analysis
NGS was done on the Illumina platform, with three biological samples sequenced in parallel for both BSC8_SORE+ and control cells and a read length of 150 nt. The NGS reads were processed similarly to the previous work [147]. The reads were trimmed using Trimmomatic software with default parameters [148]. The trimmed reads were mapped to the canonical nonredundant human transcriptome presented in the RefSeq database [149] using the Bowtie 2 software [150]. This aligner has become a de facto standard within mapping pipelines, showing a remarkable tolerance to both sequencing errors and indels [151]. We analyzed the resulting gene counts using the Limma package (implemented in the R environment), which was specially developed for whole-transcriptome analyses of differentially expressed genes [152]. The extended version of the transcriptome analysis is presented in Supplementary Methods.
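To make the last step of this pipeline more concrete, the sketch below computes library-size-normalized counts per million (CPM) and a log2 fold change between two groups of samples. It is a deliberately simplified stand-in and not the Limma procedure used in the study: it performs no variance modeling or statistical testing, and all names, the pseudocount, and the toy counts are assumptions.

```python
from math import log2


def counts_per_million(counts):
    """Scale raw gene counts of one sample to counts per million."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def mean_log2_cpm(samples, gene, pseudocount=1.0):
    """Average log2(CPM + pseudocount) of one gene across samples."""
    values = [log2(counts_per_million(s)[gene] + pseudocount) for s in samples]
    return sum(values) / len(values)


def log2_fold_change(treated, control, gene, pseudocount=1.0):
    """Difference of mean log2 CPM between treated and control samples."""
    return (mean_log2_cpm(treated, gene, pseudocount)
            - mean_log2_cpm(control, gene, pseudocount))


if __name__ == "__main__":
    # Toy counts for two hypothetical genes in one sample per group.
    sore_plus = [{"GENE_A": 400, "GENE_B": 600}]
    parental = [{"GENE_A": 100, "GENE_B": 900}]
    print(log2_fold_change(sore_plus, parental, "GENE_A", pseudocount=0.0))
```

A log2 fold change of about 3.32 corresponds to a ten-fold difference in expression.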

Gene Module Analysis
The gene module enrichment analysis was similar to the previous work [153][154][155][156]. The biological processes were taken from the GO database [157]. As a source of molecular pathways, we used the NCBI BioSystems [158]. The redundancy of this resource, which is the most complete compendium of molecular pathways from different databases, was eliminated by uniting entries with identical gene sets. The extended method is presented in Supplementary Methods.
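The exact enrichment statistic is given in Supplementary Methods and is not restated here. As an illustration only, a widely used way to score over-representation of a gene module among differentially expressed genes is the one-sided hypergeometric test, sketched below; treating it as the test used in this study would be an assumption, and the function name is hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_module, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: number of genes in the background set
    n_module:     number of background genes annotated to the module
    n_selected:   number of differentially expressed genes
    n_overlap:    number of selected genes that belong to the module
    """
    upper = min(n_module, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_module, k) * comb(n_background - n_module, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 20,000 background genes, a 100-gene module,
    # 219 selected genes, 8 of which fall in the module.
    print(hypergeom_enrichment_p(20000, 100, 219, 8))
```

In practice, p-values obtained for many modules would also need a multiple-testing correction.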

Protein-Protein Interaction Network Analysis
The protein-protein interactions (PPI) were taken from the STRING database [159]. The PPI networks (PINs) were visualized using the STRING server. We analyzed the dense connected components of PINs for proteins encoded by genes differing in expression between BSC8_SORE+ cells and the control BSC8 cell line, as described previously [160][161][162][163][164][165][166]. The expression difference for genes included in the network was at least ten-fold.
The up-regulated and down-regulated genes were analyzed separately.
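The selection and grouping logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: it finds plain connected components by breadth-first search rather than the "dense" components analyzed in the study, the interaction list stands in for an edge list exported from STRING, and all gene names and function names are hypothetical.

```python
from collections import deque


def filter_by_fold(fold_changes, min_fold=10.0):
    """Split genes into up- and down-regulated sets (at least min_fold)."""
    up = {g for g, f in fold_changes.items() if f >= min_fold}
    down = {g for g, f in fold_changes.items() if 0 < f <= 1.0 / min_fold}
    return up, down


def connected_components(genes, interactions):
    """Connected components of the network restricted to the given genes."""
    adjacency = {g: set() for g in genes}
    for a, b in interactions:
        if a in adjacency and b in adjacency and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    # Largest components first.
    return sorted(components, key=len, reverse=True)


if __name__ == "__main__":
    # Hypothetical fold changes (BSC8_SORE+ relative to BSC8) and edges.
    folds = {"G1": 12.0, "G2": 15.0, "G3": 20.0, "G4": 0.05, "G5": 2.0}
    edges = [("G1", "G2"), ("G2", "G5"), ("G4", "G1")]
    up, down = filter_by_fold(folds)
    print(connected_components(up, edges), connected_components(down, edges))
```

Building the up- and down-regulated networks from separate gene sets mirrors the separate analysis of the two groups.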

Intrinsic Disorder Predisposition Analysis
We analyzed the intrinsic disorder predisposition in 219 induced and 175 inhibited proteins using a set of specialized computational tools. The global disorder propensity of target proteins (i.e., their classification as wholly ordered or wholly disordered) was evaluated by the charge-hydropathy-cumulative distribution function (CH-CDF) analysis. The CH-plot classifies query proteins as proteins with substantial amounts of extended disorder (native coils and native pre-molten globules) or proteins with compact globular conformations (native molten globules and ordered proteins) using information on their absolute mean net charge and mean hydropathy [167,168]. The CDF plot discriminates all types of disorder (native coils, native molten globules, and native pre-molten globules) from the ordered proteins [168]. Therefore, the combined CH-CDF plot (where the Y-coordinate of a query protein is its distance from the boundary in the CH-plot and the X-coordinate is the average distance of its CDF curve from the CDF boundary) gives an opportunity for a unique assessment of intrinsic disorder in several categories, allowing predictive classification of proteins into structurally different classes [69][70][71][72]. In fact, based on their positions within the CH-CDF phase space plot, the query proteins are classified as ordered proteins (i.e., those predicted as ordered and compact by both CDF and CH; these are located within the lower-right quadrant, Q1), native molten globules or hybrid proteins containing sizable levels of order and disorder (i.e., proteins predicted to be disordered by CDF but compact by CH-plot; these are found within the lower-left quadrant, Q2), proteins with extended disorder, such as native coils and native pre-molten globules (i.e., proteins predicted to be disordered by both methods; these are located within the upper-left quadrant, Q3), and proteins predicted to be disordered by CH-plot but ordered by CDF (the upper-right quadrant, Q4) [71].
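The quadrant assignment described above amounts to checking the signs of the two distances. A minimal sketch is given below; the sign convention (positive X meaning ordered by CDF, positive Y meaning disordered by CH-plot) is inferred from the quadrant positions named in the text, and the handling of points lying exactly on a boundary, like the function name, is an assumption.

```python
def ch_cdf_quadrant(cdf_distance, ch_distance):
    """Assign a protein to a CH-CDF quadrant from its two distances.

    cdf_distance: X-coordinate, average distance from the CDF boundary
                  (assumed positive when CDF predicts order).
    ch_distance:  Y-coordinate, distance from the CH-plot boundary
                  (assumed positive when CH predicts extended disorder).
    Points exactly on a boundary (value 0) are placed in the right or
    upper quadrant; this tie-breaking rule is an arbitrary assumption.
    """
    if cdf_distance >= 0 and ch_distance < 0:
        return "Q1"  # ordered and compact by both predictors
    if cdf_distance < 0 and ch_distance < 0:
        return "Q2"  # molten globule-like or hybrid proteins
    if cdf_distance < 0 and ch_distance >= 0:
        return "Q3"  # extended disorder by both predictors
    return "Q4"      # disordered by CH-plot but ordered by CDF


if __name__ == "__main__":
    # Made-up coordinates for four hypothetical proteins.
    for point in [(0.2, -0.1), (-0.2, -0.1), (-0.2, 0.1), (0.2, 0.1)]:
        print(point, ch_cdf_quadrant(*point))
```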
Per-residue disorder predisposition of query proteins was evaluated using a set of disorder predictors from the PONDR family, namely PONDR® VLXT [169], PONDR® VL3 [170], PONDR® VSL2 [171], and PONDR® FIT [172], as well as the IUPred computational platform, which allows identification of either short or long regions of intrinsic disorder (IUPred-S and IUPred-L, respectively) [173,174]. The use of multiple computational tools for the prediction of intrinsic disorder in proteins is an accepted practice in the field. This is because different computational tools use different attributes (such as amino acid composition, hydropathy, sequence complexity, etc.) and models to calculate a disorder predisposition score for every amino acid residue in a query protein. As a result, different tools often generate rather different outputs. There is no accepted consensus on which disorder predictor is the best in evaluating the disorder predisposition of a query protein. In reality, since different computational tools are sensitive to different disorder-related aspects of the amino acid sequence, all of them contain some useful information.
The per-residue disorder predisposition scores are on a scale from 0 to 1, where values of 0 indicate fully ordered residues and values of 1 indicate fully disordered residues. Residues with scores above the threshold of 0.5 are considered disordered, residues with disorder scores between 0.25 and 0.5 are considered highly flexible, and residues with disorder scores between 0.15 and 0.25 are classified as flexible. The results of these analyses were used to classify query proteins based on their percent of predicted intrinsically disordered residues (PPIDR) and mean disorder score (MDS). Here, the accepted strategy was used to classify proteins based on their PPIDR values as highly ordered (PPIDR < 10%), moderately disordered (10% ≤ PPIDR < 30%), and highly disordered (PPIDR ≥ 30%) [175]. Similarly, proteins were considered highly ordered, moderately disordered, or highly disordered if their MDS values were MDS < 0.25, 0.25 ≤ MDS < 0.5, or MDS ≥ 0.5, respectively.
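The two summary measures and the associated classes can be computed from a per-residue score profile as sketched below. The function names and the example profile are illustrative; counting a residue with a score of exactly 0.5 as disordered is an assumption about boundary handling.

```python
def ppidr(scores, threshold=0.5):
    """Percent of residues predicted to be disordered (score >= threshold)."""
    disordered = sum(1 for s in scores if s >= threshold)
    return 100.0 * disordered / len(scores)


def mean_disorder_score(scores):
    """Mean of the per-residue disorder scores (MDS)."""
    return sum(scores) / len(scores)


def classify_ppidr(value):
    """Class by PPIDR: < 10%, 10% to < 30%, and >= 30%."""
    if value < 10.0:
        return "highly ordered"
    if value < 30.0:
        return "moderately disordered"
    return "highly disordered"


def classify_mds(value):
    """Class by MDS: < 0.25, 0.25 to < 0.5, and >= 0.5."""
    if value < 0.25:
        return "highly ordered"
    if value < 0.5:
        return "moderately disordered"
    return "highly disordered"


if __name__ == "__main__":
    # Made-up per-residue scores for a very short hypothetical protein.
    profile = [0.1, 0.2, 0.6, 0.9]
    print(ppidr(profile), classify_ppidr(ppidr(profile)))
    print(mean_disorder_score(profile),
          classify_mds(mean_disorder_score(profile)))
```

Because the two measures use independent cut-offs, a protein can fall into different classes by PPIDR and by MDS, which is why both sets of percentages are reported.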

Statistical Evaluation
RNA levels and cell viability were evaluated after three identical tests. Statistical differences were assessed by analysis of variance using Statistica 6.0, and differences with p < 0.05 were considered statistically significant. Mixed-model analysis of variance (ANOVA) or the Student's t-test was used to analyze data from the luciferase reporter assays, and p values less than 0.05 were considered statistically significant.

Conclusions
Achieving significant success in the treatment of malignant tumors is impossible without effective targeting of the stem cells. It is necessary to target genes that provide unlimited proliferation and self-renewal of the stem component. The use of reporter constructs detecting the expression level of these factors as an enrichment method allows visualization and analysis of the most malignant subpopulation of tumor cells.
The use of the lentiviral reporter made it possible to isolate a highly malignant subpopulation of colon adenocarcinoma cells. A comparative analysis of the transcriptome of stem-like cells allowed us to characterize the main signaling pathways involved in their self-maintenance and proliferation. The results of this analysis can serve as further evidence of the high similarity between CSC functioning and the processes of early embryonic development. Reversing the stem phenotype towards terminal differentiation is a promising direction in cancer treatment. The transcriptome analysis revealed a set of genes that can serve as potential prognostic markers and therapeutic targets in colon adenocarcinoma treatment.