An Analysis of Transcriptomic Burden Identifies Biological Progression Roadmaps for Hematological Malignancies and Solid Tumors

Biological paths of tumor progression are difficult to predict without time-series data. Using median shift and abacus transformation in the analysis of RNA sequencing data sets, natural patient stratifications were found based on their transcriptomic burden (TcB). Using gene-behavior analysis, TcB groups were evaluated further to discover biological courses of tumor progression. We found that solid tumors and hematological malignancies (n = 4179) share conserved biological patterns, and biological network complexity decreases at increasing TcB levels. An analysis of gene expression datasets including pediatric leukemia patients revealed TcB patterns with biological directionality and survival implications. A prospective interventional study with PI3K targeted therapy in canine lymphomas proved that directional biological responses are dynamic. To conclude, TcB-enriched biological mechanisms detected the existence of biological trajectories within tumors. Using this prognostic informative novel informatics method, which can be applied to tumor transcriptomes and progressive diseases inspires the design of progression-specific therapeutic approaches.


Introduction
Precision medicine guided by genomics is emerging as a realizable, though limited, approach to improve survival of cancer patients. Currently, most precision medicine strategies focus on targeting select mutational abnormalities with notable successes such as EGFR, c-met [1,2]. The effectiveness of precision medicine can be limited due to a lack of actionable targets, high mutation rates and the dynamically evolving signaling circuitry associated with oncogenic disease progression [3]. These lessons were reported from several genomics guided precision medicine based clinical trials, which includes, SHIVA, Molecular Screening for Cancer Treatment Optimization (MOSCATO-01), Copenhagen Prospective Personalized Oncology (CoPPO), MAST, PERMED 01, and PREDICT [4][5][6][7][8][9]. Despite improvement in progression free survival observed in a minority of patients, representing only 10% of the total patient population eligible to receive targeted therapies based on drug availability and drug-target matching criteria, a large number of cancer patients remain unable to benefit from precision medicine guided therapy [10][11][12][13].

TcB Analysis Pipeline
Gene expression values were log 2 transformed, sum of gene expression values by sample were first calculated, followed by calculation of median shift by each sample was performed and these values were designated as TcB. Then, gene median shift by individual gene was calculated. Median shift = (value-range minimum)/range. Data frame was then rearranged in increasing order of TcB or gene median shift, grouped by TcB as lowTcB (0-0.25), midTcB (0.375-0.625) and highTcB (0.75-1). Differential gene expression between groups by pairwise (edgeR or Dseq2) or within group by pairwise comparing against group median was performed using R and initial analysis was performed using Network analyst version 3.0 [33]. Significant genes (p < 0.0005), FDR 0.05 were used for pathway enrichment analysis by reactome, and gprofiler. GSEA enrichment option for analysis by multiple databases are included in R code.

Availability of Computer Code and Algorithm
TcB calculations, transcriptomic ordering, grouping, significant genes and pathway enrichment analysis were written as R code and provided as Appendix A.

Network and Heatmap Analysis
Network analysis of reactome enriched pathways were performed using Enrich-mentMap tool [34] available from Cytoscape version 3.8.2 [35]. Heatmap analysis represented as row variances, hierarchical clustering based on one minus Pearson's correlation by predefined groups, heatmap collapsing was performed based on group medians were performed using Morpheus available from https://software.broadinstitute.org/morpheus/ accessed on 12 November 2021.

Statistical Analysis
Significant genes by fold change > 1.25 (log 2 ), or 0.25 fold for median shift transformed datasets, FDR and p < 0.005. Reactome enrichment FDR 0.05 and q-value cutoffs p < 0.0001. Quantitative rendering represented as box plots, ribbon plots, lollipop plots, density plots and parallel index plots were generated using OriginPro 2021 (OriginLab Corporation, Northampton, MA, USA).

Canine Lymphoma Study
Canine B cell lymphoma subjects (n = 10) were enrolled and treated with BKM120 in an IRB and IACUC approved veterinary clinical study, performed at Tufts Cummings School of Veterinary Medicine, Grafton, MA. All experiments pertaining to canine clinical study were performed in accordance with relevant guidelines and regulations. Canine subjects received BKM120 2.25 mg/kg orally for 28 consecutive days. BKM120 provided as gift from Novartis. Analysis for tumor response were performed by direct tumor measurement or through the use of CT imaging. Furthermore, fine needle aspirations of the lymphoma nodes were collected on Days 0, 7 and 21 and utilized for RNA isolation and gene expression analysis. Routine blood and urine analysis were performed for toxicity evaluation and as part of standard clinical assessment. RNA isolation was performed using RNeasy mini kit (Qiagen, Germantown, MD, USA) and transcriptomic analysis with canine array. We performed unbiased assessment to determine pertinent biological pathways associated with treatment response as has been shown before [16,36].

DNA, RNA and Protein Synthesis Assay
DNA synthesis by EZClick EdU cell proliferation kit (#K946), RNA synthesis by EZClick Global RNA synthesis assay kit (#K718), Protein synthesis by EZClick Global Protein synthesis assay kit (#K459) were purchased from Biovision (Milpitas, CA, USA), assays were performed following the instructions supplied by the manufacturer, and analyzed by flow cytometry as described before [36].

Cell Cycle Analysis
Fluorescent cell cycle reporter, cell cycle green/red lentiviral EF1a, Puromycin (Sartorius, Bohemia, NY, USA) was transduced in SUDHL-4 cells, as described before [37]. Following puromycin selection, transfected SUDHL-4 cells were plated at 50,000 cells per well in 96 black clear bottom well plate, pre-coated with retronectin and continuously imaged every 2 hours following the release from BKM120, using Incucyte Zoom (Sartorius, Bohemia, NY, USA).

Ordering Gene Expression Signatures by TcB
Approximately 22,000 protein coding genes are expressed at varying levels in the transcriptome of individual cell or patient at any given time. Considering that malignancy is a progressive disease, collection of transcriptomic signatures of various patients represents a biologically disordered dataset. Moreover, malignancies are well known to progress in a manner marked by unrestrained proliferation, which is accompanied by increases in DNA activity, which can be directly linked to increased global transcription [38]. Following these established biological principles, we developed a strategy for rearranging biologically disordered transcriptomes into increasing levels of transcriptional activity, henceforth termed transcriptomic burden (TcB). Evaluating patients with different TcB proved fruitful in identifying biological trajectories associated with tumor progression.
In the first step, we initially re-organize patients based on the median gene expression levels across all patients. In essence, this stratifies patients based on their overall RNA activity (which we refer to a subject's TcB) relative to the total population under study. In practice, we first calculate the sum of normalized RNA expression values for every transcribed gene by each patient, (represented in each column) ( Figure 1A). Then, standardized TcB values were calculated (using sum of gene expression value series, as numerical shift, a directional measure that assigns positional identities to each patient as numbers between 0 and 1 ( Figure 1A).
Once data was re-ordered around patient TcB, a second informatic process was performed to then evaluate individual genes within the different TcB groups. Again, we used median shift for each gene across all patients to clarify the direction of individual gene function in the disease, as "gene shift" ( Figure 1A). This step reordered the transcriptomic data-frame by increasing order of TcB and gene shift, using these metrics as abacus like frames, resulting in a linearly aligned dataset as summarized in Figure 1A. Because 0.5 represents the midpoint on this scale, patients whose TcB shift is between 0 and 1 collectively represent the order of patients based on the progression of their transcriptome from low to high TcB, as shown in Figure 1A,B. Further through systematic comparison of transcriptomic data segmented based on low, mid, and high TcB it is possible to determine progressive biological changes occurring in tumors based on differential gene expression analysis, as shown in Figure 1B.  This TcB analysis method was applied to a previously published diffuse large B cell lymphoma (DLBCL) transcriptomic data as training set consisting of n = 75 patients [26]. The original data set consisted normalized RNA counts reported as TPM (transcript per million) was analyzed as an unordered transcriptome of patients by heat map (Figure 2A). This dataset, transformed as a scatter plot by gene function, illustrated a randomly distributed gene activity across the entire spectrum of the disease (Figure 2A). TcB analysis was then performed on the transcriptomic datasets of 75 DLBCL patients represented by 19,734 genes. Patients were stratified by median shift cutoffs (0-0.25, 0.375-0.625 and 0.75-1, (n = 49), respectively, as low, mid and high TcB) with patients remaining outside of these cutoffs were excluded. TcB analysis showed that the original dataset became resolvable into different populations as shown in scatter and density plot ( Figure 2B). With increasing TcB, we noticed that gene activity spread (measured as gene shift) gradually began to condense, shown in the scatter plot ( Figure 2B). These TcB groups were then used for statistical comparison for differential gene expression across and within groups. There were 1,136 genes that showed significant up or downregulation between TcB groups (p < 0.0005, FDR 0.05), shown as heatmap ( Figure 2C) (see Supplemental Table S1). By using Gprofiler and reactome pathway enrichment analysis tools, we identified and grouped significant genes by their corresponding biological functions (see Supplemental Table S1). We then organized these pathways into interactive biological networks using cytoscape. In a surprising finding, DLBCL patients with low TcB had the most complex network of interconnected pathways as opposed to their mid or high TcB counterparts ( Figure 2D-F). Biological enrichment in low TcB included signaling, translation, and metabolic functions ( Figure 2D and Figure S1). In contrast, a predominance of transcriptional regulation (gene expression) and chromatin modification, with no apparent enrichment for signaling, translational, or metabolic pathways were observed in high TcB groups. Mid TcB groups also showed transcriptional regulation and chromatin modification too, along with cell cycle pathways, transcription regulation, and receptor tyrosine kinase signaling ( Figure 2E,F and Figure S1). Considering that DLBCL is classified by B lymphocyte subtypes (germinal center subtype or activated B cell subtype) or molecular subtypes (myc-bcl-2 double expresser), we assessed if the enriched biological processes were specific to disease subtypes. However, our results indicate that biological enrichments identified through TcB ordering showed considerable homogeneity across DLBCLs, irrespective of cell of origin or molecular subtypes ( Figure S2). Together, these results, based on the comparison of transcriptomic changes across the full spectrum of DLBCL, suggest that tumor progression can be predictably described as gradual changes in biological functions.

TcB Stratification Identifies Conserved Biological Patterns across Tumors
TcB stratification was applied to multiple tumor transcriptomic datasets to further verify the approach. Previously published log 2 transformed datasets representing DLBCL (from National Cancer Institute-NCI, n = 481), [27], acute myeloid leukemia (AML from Oregon Health & Sciences University, n = 451) [29], acute lymphoblastic leukemia (ALL from NCI Target-2018, n = 203) [28], breast cancer (METABRIC, n = 1904) [30] and bladder cancer (TCGA, n = 408) [31] were analyzed. The biological pathways identified for all tumors were then collapsed to reveal higher order processes that drive cellular function, as shown in Figure S2 and Table S1, and the pattern of changes categorized by TcB groups are shown in heatmaps represented in Figure 3A. When tracking the overall behavior of these higher-order functions by averaging expression of genes representing each pathway, we found that every single pathway followed a uniform pattern of changes in all tumors, except in AMLs that originates from hematopoietic stem cells ( Figure 3B). These results clearly revealed that protein translation, electron transport chain and citric acid cycle, transcription, the cell cycle and ECM are enriched higher-order biological functions that were consistently resolved by TcB. Due to our cutoff rule, the analysis was performed on fractions of the original populations ( Figure 3C) though still considered notable sample sizes with significance was stringently determined at the genomic level. These results suggested conservation of these processes across various tumor types could be illuminated by TcB enrichment and analysis.

Correlating Gene Functions and TcBs in Pediatric Solid Tumor Progression
Datasets analyzed thus far represented unique cancer types and did not include normal tissue controls. TcB changes may be a physiological function irrespective of pathological status; therefore, we evaluated higher order processes revealed by TcB analysis with normal controls to see if the data contained artefacts. For this study, we analyzed a recently published transcriptomic dataset consisting of 657 pediatric extracranial tumors with 14 cancer diagnosis types matched to 147 normal tissues [32]. Using TcB analysis, the entire panel was first examined, and datasets representing each tumor type were evaluated separately. Results from the full panel analysis showed that normal tissues had lower TcB values when compared to 14 different types of tumors ( Figure 4A). These results suggested that global transcriptomic activity generally increases in tumors. Unlike other tumors, the higher order biological enrichment detected in this panel was restricted to cytokine activity and ECM genes that increased as TcB increased ( Figure 4B and Table S2). Transcription regulation genes decreased in normal tissue as TcB increased ( Figure 4B and Table S3). Further examining the log 2 mean of combined gene expression of the detected higher order processes compared to normal tissue as a baseline, it became evident that cytokine genes are highly expressed in desmoplastic round cell tumors, melanoma, hepatoblastoma, osteosarcoma, alvelolar soft part sarcoma, and teratoma ( Figure 4C). Most tumors, except yolk sac tumors, neuroblastomas, and Ewings sarcomas, expressed higher levels of ECM genes than normal tissue ( Figure 4C). Similarly, with the exception of yolk sac and Wilms tumors, the expression of genes associated with transcription was reduced in most tumors ( Figure 4C).
Datasets containing n > 75 from this transcriptomic panel consisting of 14 different tumors and normal were then selected and analyzed individually. The transcriptomic dataset used for individual TcB calculations and biological analysis included Ewings sarcoma (n = 98), neuroblastoma (n = 227), normal (n = 104), osteosarcoma (n = 94) and rhabdomyosarcoma (n = 122). As shown by the heatmap of higher order enrichment analysis, gene expression of cell cycle, transcription and translation was decreasing with TcB increases only in tumors, but not in normal tissues ( Figure 4D and Table S2). The results from individual TcB analysis of pediatric tumors and normal tissues indicates that the patterns of changes represented by cell cycle, transcription and translation identified from previous analyses of solid tumors and hematological malignancies ( Figure 3) as higher order enrichments and are unlikely to be an artifact.
We consistently observed declining patterns of cell cycle gene expression along with TcB increases in our analysis, but this is a counterintuitive biological phenomenon for tumor progression. Therefore, we sort to resolve this conundrum through identifying the biological nature of genes that became reduced with higher TcB. We therefore, compared correlation between TcBs and expression values of each gene across the entire dataset as a next step in our assessment. We identified the top 50 significant genes (p < 0.0001), by positive (r > 0.75) and negative (r < −0.75), by Spearman's rank correlation from individual and entire panel analysis of Ewings sarcoma, neuroblastoma, normal, osteosarcoma and rhabdomyosarcoma. TcB clustering analysis of these top 50 genes identified two clusters, with immune function (represented by C1QA, C1QB, C1QC, CD14, CD68, CD74, FCER1G, IFI30, LGALS9, TYROBP, and HLA-DMA) as upregulated ( Figure S4). Furthermore, these tumors also showed significant decreases (log 2 FC, −10 to −5, highTcB versus lowTcB) in the expression of genes that encode for histone proteins ( Figure 4D and Figure S4). This unusual feature of histone complex gene expression declining from lowTcB (log 2 , 4.5-7.56) with higher TcB falling below normal tissues were observed in these tumors ( Figure 4D). In non-transformed human cells, histone gene expression and biosynthesis of 400 million histone proteins are tightly coupled with DNA replication and cell cycle, and essential for cell survival [39]. However, histone levels and expression are reported as inversely correlated with overall transcriptional rate in Drosophila [40]. Taken together, this observation of diminished histone gene expression via TcB correlation provides novel dimension into pathological basis of disease biology in these tumors, and warrants further proteome-level validations. Overall, by comparing the transcriptomes of tumor and normal we conclude that TcB-based biological predictions are not arbitrary, but as aligned with the nature of malignancy.

Charting the Biological Roadmap of Malignant Progression in Pediatric ALL
Our next goal was to determine whether TcB shifts and biological patterns uncovered by this method were associated with progressive steps in malignancy. To this end, we required a study where 2 samples were taken from the same patient over disease progression. The Target 2018 ALL study fit this paired patient and time-varying sample collection criteria. This ALL study analyzed transcriptomic data from patients with initial and relapse diagnoses [28]. TcB analysis was performed among these patients with clinically defined relapse. Of the 41 patients analyzed, 24 developed a shift in TcB (from low to mid or higher) and 17 showed no such shift (remained at the same level) between diagnoses. In our analysis of this group for higher order biological pathway behaviors, we found that log 2 expression profiles of genes involved in mitochondrial translation, TCA/ETC, cell cycle and transcription declined as TcB increased ( Figure 5A). In contrast, ECM genes increased along with TcB ( Figure 5A). Analysis of correlations by TcB shifts from repeated ALL samples collected from same patient at different intervals showed that patterns of TcB shifting from low to mid, and mid to high TcB shift values naturally occur with disease progression ( Figure 5B). Results from these findings suggested that an increase in TcB alone could be an independent characteristic feature of ALL disease progression ( Figure 5B).

Dynamics of TcB Ordered Biological Functions
Understanding the dynamics of TcB enriched higher order processes in the context of tumor progression is crucial notably because it is know that increases in ribosomal biogenesis and translation in G1, transcription in G2, reduction in transcription at M phase are related with cellular progression through cell cycle [41]. Cancer is characterized by disordered proliferation, and the biological features found within different subsets of TcB tumors as well as the features found in all tumor categories are likely to suggest a stepwise dysregulation of the cell cycle involving higher-order processes as the tumor progresses. Therefore, we attempted to determine the dynamics of these properties in cell culture models. Our analysis of the Broad Institute CCLE panel, which consists of 1527 cells lines, and other tumor panels did not resolve using the TcB method. TcBs were found to be skewed towards left, suggesting that most cell lines show greater levels of transcriptional activity than natural tumors, and perhaps reached a stable state during cell culture ( Figure S5). As a result, evaluation of biological trajectory of tumor progression using cell culture models may prove challenging for validating our observations derived from the analysis of tumor transcriptomes.
Given limitations of cell line analysis, we investigated the directional features of these higher order biological functions by adopting a natural disease, canine lymphoma. A longitudinal in vivo study of the tumor transcriptome at three time points was conducted using a phosphoinositide-3 kinase (PI3K) inhibitor, BKM120, as targeted drug ( Figure 6A). BKM120 blocks the oncogenic PI3K signaling mechanism that plays an important role in metabolism and ribosomal biogenesis [42]. Additionally, we and others have previously reported that PI3K signaling is dysregulated in canine lymphoma [24,43]. With PI3K targeting, we rationalized that blocking higher-order functionality associated with low TcB would allow TcB and the properties of higher order biological functions to be reset. A total of ten dogs with B cell lymphoma were enrolled in this trial, and six of them completed the evaluation of BKM120, a PI3K targeted inhibitory therapy that was administered orally in cycles of 28 consecutive days, as summarized in Figure 6A. This evaluation, while lacking power for population-level outcomes, followed Simon's "minimax" design due to the smaller sample size and lower subject enrollment in this large animal study [44]. The data gathered was therefore deemed adequate to gather reliable biological insights. Only 6 of the 10 dogs with lymphoma in this study completed at least one treatment cycle. Clinical characteristics, including breed, age, diagnosis, stage, treatment history, toxicity, and outcome, are summarized in Figure 6B. Based on peripherally measurable lesions as determined by consensus criteria defined by veterinary cooperative oncology group [45], 2/6 dogs treated with BKM120 showed partial response with more than 30% decrease in tumor volume; 1 dog with progressive disease, and 3 dogs remained with stable disease. Two of the six dogs that showed partial response had persistent blood glucose levels and were withdrawn before completing five cycles of BKM120 treatment in 28 days; the remaining four were removed because of lack of improvement or because the physician determined they had reached the end of their lives. TcB shifting from low to mid, and mid to high TcB shift values naturally occur with disease progression ( Figure 5B). Results from these findings suggested that an increase in TcB alone could be an independent characteristic feature of ALL disease progression (Figure 5B).  To further explore the biological directionality of this disease, we examined gene expression changes related to higher order processes in ALL, comparing initial versus relapse samples. The groups were again parsed into subsets that had a TcB shift between sample time and ones that did not. In quantitative directional analysis by lollipop plot, log 2 FC genes (initial vs. relapse) associated with ECM progressive increase and appear between mid-to-high TcB or low-to-high TcB shifts ( Figure 5C). Conversely, log 2 FC of genes representing transcription, cell cycle, TCA/ETC, and mitochondrial translation (relapse vs. initial) was negatively regulated with TcB shifts, but not in ALL without TcB shifts ( Figure 5C). Since these biological changes and patterns were consistent observed in most tumors and seem to exhibit directional activity in ALL, we conclude that genes constituted within higher order biological functions are likely to dictate disease progression in ALL. Together, these findings suggest TcBs and higher-order biological functions are indicators of directional properties in tumor progression (summarized ( Figure 5D) and can have prognostic insight.
The transcriptomic assessments were performed on tumor biopsy samples collected at the time of diagnosis, 1 week and 3 weeks after treatment, and TcB estimation was performed using the dataset (n = 18). Results show that the TcB values of canine lymphoma subjects CL#1, CL#6 remained unchanged, while CL#2, CL#5 showed significant decreases over the course of 3 weeks with BKM120 ( Figure 6C). By applying the median shift transformation, the entire canine transcriptome (represented by n = 30,311 genes) is scaled uniformly, so that global changes in gene function may be directly compared between different time intervals. The kernel density plot of these results clearly indicates that TcB declines gradually after 3 weeks of BKM120 treatment. The declining TcB effect was modest in CL#1, 2 & 6, which had stable or progressive disease ( Figure 6B,C). Of 6 dogs, 2 had partial response showed rapid decline in gene expression within the first week after being treated with BKM120 ( Figure 6B,C). Furthermore, CL#5, the subject with the most elevated gene expression levels, also had more dramatic reduction in gene activity by week 3, though with advanced stage and relapsed disease, this subject also died by week 3 (Figure 6B,C). As TcBs and gene shifts are compared, both net transcriptional activity (as measured by TcB) and gene function (as measured by gene shift) responded to therapy through downward shift ( Figure 6D). Drug treatment affected expression of genes associated with TCA, ribosome/translation, and cell cycle which became gradually upregulated, while genes representing ECM became gradually downregulated over 3 weeks of treatment (for entire population) with BKM120, as shown by lollipop plot mimicking low TcB state in Figure 6E. By comparing the observed biological pattern reversal and TcB decreases in BKM120 treated canine lymphoma ( Figure 6E), it becomes clear that biological functions associated with disease progression follow reversible chronological patterns. Based on the expression profile changes of genes representing higher-order biological processes, each canine subject shows progressive upregulation of genes involved in TCA, ribosome/translation, and cell cycle functions in all of them, with the exception of CL#5 with terminal disease pretreatment and post week 1. CL#5 showed an entirely dominance of genes involved in ECM that reflected a fully developed disease state ( Figure 6F), as supported from the predicted progression pattern of human ALL patients ( Figure 5E). By comparing these canine results to the generalized patterns observed in human DLBCL tumors, we conclude that these higher order biological functions both exhibit orderly responses and are reversible features. The findings were further confirmed by in vitro measurements of replication, transcription, and translation activities using SUDHL-4 (DLBCL cells) and BKM120 treatment ( Figure 7A). The results indicate that translational activity, measured as global protein synthesis, is decreased with BKM120 treatment, which recovers by day 1 after drug release in SUDHL4 cells (Figure 7). Following the initial lag in protein synthesis in SUDHL4 cells, show a delayed global DNA and RNA synthesis (days 1-2) coinciding with delayed G2 recovery, following BKM120 treatment release ( Figure 7A,B). We conclude based on our observations of BKM120 treatment in canine lymphoma and experiments using SUDHL4, as well as our predictions based on TcB analysis of human tumors, that the progression of malignant tumours can be characterized by a sequence of transcriptomic changes relating to protein transcription, replication, and translation. with canine lymphoma subjects on y-axis. (D) Kernel density plot of global genes (n = 30,311) transformed by median shift for overall gene behavior between pre-treatment and treatment intervals in x-axis indicated by each canine subject treated with BKM120 (E) Lollipop plot representing fold change in log2 expression of individual genes representing higher-order biological processes (y-axis) comparing BKM120 treatment intervals vs. pre-treatment and in x-axis averaged by canine lymphoma subjects is represented in x-axis. (F) Heatmap represents the average log2 expression of genes representing higher-order biological processes by individual canines at pre-treatment and BKM120 treatment intervals.

Discussion
Cancer is defined as an uncontrolled growth of body cells that spreads to other parts of the body (source NCI/Cancer.gov) 12 November 2021. Although cancers arise from many different types of cells, the biological principles governing malignant progression seemingly may have common paths. Clinical observations indicate that malignancies, of the same type such as in lymphomas, can be indolent, turn aggressive, show progressive features or undergo cycles of dormant and active states [46]. Presently, there is no direct biological explanation for how these uncontrollable cells manifest an uneven proliferation rate [46]. Although mutation drifts and progressive genomic alterations at the molecular level are recognized in evolving tumors [47,48], the longitudinal biology behind tumor progression is still largely enigmatic. Transcriptome datasets are conventionally analyzed for significant genes or pathways using many different methods for computing differential gene expression have provided interesting seminal findings regarding genetic mutations and ex vivo drug sensitivity [26][27][28][29][30][31]. Yet, these studies did not attempt to generate global biological summaries because there existed no methodologies for linearizing the datasets for unbiased exploitation of all the available information.
Several studies have attempted to unravel the biological progression roadmaps in cancer. Based on the evolution of genetic changes, Fearon and Vogelstein initially proposed a linear tumor progression sequence [49]. Many studies followed this approach by aligning gene expression datasets with mutational events in order to estimate temporal biological patterns in tumor progression [48,50,51]. Nevertheless, genomic datasets from cross-sectional tumor collection can include samples from unknown disease states, treatment status, environmental exposures, etc., which exhibit tremendous mutation heterogeneity and varying loads of mutational burdens. Therefore, aligning mutations with temporal biological order is considered a weak strategy [52]. Despite mutation assessments being considered sufficient guides for determining and administering effective cures, Vogelstein asserts that a crucial need in basic cancer research is a better understanding of the biological pathway trajectories [52]. There have been further models using probabilistic or Bayesian networks. In these models, genes in one pathway become parents of all genes in the next, and parental genes tied by mutation were integrated into the probability model [53,54]. Lastly, progression at the pathway level was inferred from a priori gene assignment, but only when the pathway had many gene sets. However, none of these approaches were successful in identifying homogenous biological trajectories among cancers, which led to the conclusion that cancer progression is non-linear. Major drawbacks of these approaches include the inability to analyze transcriptomic patterns unbiasedly and the lack of biologically appropriate hypotheses for identifying biological trajectories. Our strategy assumes that transcriptional complexity will continue to increase with tumor progression and the transcriptomic burden will increase. DNA levels in a cell remain relatively constant and can be synthesized in a short period of time. However, RNA levels are higher, so cells need longer periods of time to synthesize enough RNA to divide. Malignant cells may proliferate more rapidly when this constraint is removed, and their RNA content must gradually increase. Our results illustrate this generalized concept with several lines of evidence, including TcB shift from low to high-TCB in ALL, reversibility in canine lymphoma, and transcriptional lag in SUDHL4 cells with BKM120 treatment, all indicative of cancer progression.
Transcriptomic analysis through a TcB strategy of linearizing tumor transcriptomes uncovered progressive biological features in tumors, laying the groundwork for more comprehensive analyses. Overall, the results from TcB based analysis support a homogeneity in tumors in terms of higher-order biological choreographers of malignant cell proliferation, whereas the signaling pathways involved in promoting such growth show heterogeneity. Interestingly, the translation (ribogenesis), replication, and transcription that were identified from our TcB analysis also correspond to central dogmatic principles which intersect progression fates of cell cycle. In this respect, our identification of these central dogmas being progressive aligned by TcB as in the same order of the cell cycle path defines biological mechanisms for tumor progression. Our findings also support the theory of embryonic reversal-based definition of malignant progression [55][56][57] in that changes observed in translation (ribogenesis), replication, transcription, and ECM in primary human tumors followed the opposite directions of higher-order processes occurring in human embryonic development [58]. TcB rendering of progressive network features of DLBCL, ALL, and pediatric solid tumors, as well as BKM120 treatment responses in canine lymphomas (see Figures S6 and S7), suggests that biological networks in tumors show continuous changes that are impressive and with predictable pattern. As TcB increases, biological networks that include signaling pathways, ribosomal biogenesis, translation, transcription, and cell cycle remain interconnected across low and mid TcB in DLBCL and ALL ( Figure S7). In contrast, when the tumor transcriptome is enriched with high TcB, biological networks exhibit predominantly ECM reorganization, demonstrating that fully progressed tumors exhibit a biologically distinct state ( Figure S7). Additionally, we found that, when the transcriptomic data was combined with the mutation profiles available for each patient in bladder cancer, a linear alignment by TcB also predicted that the mutations exhibited predictable evolutionary patterns ( Figure S8). Mutation rates increased from low to mid-TcB from 300 to 420/patient but declined at high TcB ( Figure S8A). The number of genes mutated at low TcB (2000) dramatically increased to >8000 from mid to high TcB, which is another indication of tumor progression ( Figure S8B). A word cloud-based analysis of genes frequently altered in bladder cancers revealed that the TTN gene is most frequently altered, with RB1 and CST5 highly altered in low-TcBs but absent at higher-TcBs, p53 and RP11 alterations evident between middle and high-TcBs ( Figure S8C). In contrast, MUC16 and KMT2D steadily increased from low-TcB and became dominant at high-TcB ( Figure S8C). Furthermore, the number and frequency of gene modifications increase with a TcB trajectory, further validating the TcB methodology.
Our TcB transformation and biological learning strategy has limitations, including a stringent cutoff that eliminated portions of patient for gene-centric analysis. The method must be further developed to accommodate samples representing continuous gradients in transcriptomics. Additionally, samples that do not represent the entire spectrum of tumor progression could skew the results. As an example, our analysis of transcriptomic datasets for head and neck cancers, lung adenocarcinomas, and colon carcinomas detected TcB shifts in either low or high levels, suggesting these samples could represent either early or advanced biological conditions, meaning biological trajectory extraction is not possible. Our analysis pipeline included curation of data that eliminated ambiguous biological processes, such as "disease" or "development", since our intent was to illustrate and collapse enrichments based on biochemical activities directly related to cell proliferation. Several bioinformatic tools are required in order to render biological trajectory data, which is limited to the annotations and interactions defined within the databases. Future directions for improving this methodological strategy include procuring appropriate datasets through large prospective trials, integration of clinical outcome and multi-omic datasets, along with a diagnostic or drug-controlled decision making based on the information of TcB analyses to further solidify its use in actionable cancer care.

Conclusions
Precision medicine requires high resolution biological insights to be able to predict tumor attitude and prescribe multifaceted targeted therapeutic strategies for effective cancer treatment [10]. Clinical failures frequently observed with precision oncology suggests progressive changes in biological network evolution as potential "Charlotte Web" factors that could facilitate evading and impeding the therapeutic efficacies of targeted drugs. To acquire further high-resolution biological perspectives into tumor progression, it would be necessary to integrate TcB analysis with transcriptomic data (such as bulk sequencing, single cell transcriptomics, etc.), multi-omics (such as metabolomics and proteomics) to generate a dynamic and multi-dimensional biological roadmap of tumor progression corresponding to each tumor type over time. Detailed rendering of TcB derived network dynamics as outlined in this study and defining combination therapies with integrated prediction of future biological states will have promising potential in precision oncology.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/biomedicines10112720/s1, Figure S1. Network analysis of enriched pathways by cytoscape within TcB groups of Cornell DLBCL. Figure S2. Heatmap represents gene expression patterns consolidated by TcB enriched higher order biological processes across Cornell DLBCL varying by cell of origin (Germinal Center or Activated B cell lymphoma) or molecular subtypes (Myc/Bcl-2 double expresser or not double expresser). Figure S3. Comparison of pathway enrichments by reactome from all TcB groups by hierarchical method and flattening by higher order process. Figure S4 Figure S5. Plot of kernel density shows equal distribution of TcB values (x-axis) across 6 tumor datasets (n = 3522) and skewed distribution in CCLE cell line panel (n = 1527). Figure S6. Canine lymphoma BKM120 trial and gene expression analysis. (A) Heatmap represents overall changes in significant gene expression patterns across canine lymphoma specimens obtained from pretreatment and during one and three weeks treatment with BKM120. (B) Network representation from geneset enrichment analysis using reactome database show up or down regulated processes and interactions from overall impact of BKM120 treatment in canine lymphomas. Figure S7. Biological progression network in (A) DLBCL, (B) ALL and (C) Pediatric solid tumors. Cytoscape representation of biological progression networks enriched from TcB analysis, with nodes representing biological process from lowTcB (blue), midTcB (green) and highTcB (red), with edges indicating interactions among enriched biological process. Figure S8. Gene alterations and TcB trajectory. Gene alterations associated with each bladder cancer patient's transcriptome were aligned by TcB for projection of gene alteration evolution. Graphs show (A) average number of genes altered per patient; and (B) total number of gene alterations across all TcB subgroups. (C) Word cloud analysis for genes frequently altered, together with progression of higher order biological processes associated with increasing TcB trajectory. Tables S1-S3. TcB analyzed solid and hematological tumors, pediatric solid tumor panel and canine lymphoma dataset. Institutional Review Board Statement: Ethics statement. The canine lymphoma studies were conducted following protocol # 115.15 approved by the Institutional Review Board at Tufts Cummings School of Veterinary Medicine, Grafton, MA, USA. If the canine subjects have a performance status of 0 or 1, as defined by the modified ECOG performance index 2004, any pet older than one year old, failed standard therapies, or their owners were not interested in pursuing conventional therapies were approved for enrollment in this study.

Informed Consent Statement:
Informed consent for dogs involved in the study was obtained from pet owners prior to enrollment into the study. Data Availability Statement: "All data generated from this study are available in the main text or the supplementary materials". Transcriptomic datasets were download from www.C-Bioportal.org, accessed on 1 March 2021 and pediatric solid tumor datasets were downloaded from https://omicsoncogenomics.ccr.cancer.gov/cgi-bin/JK, accessed on 12 November 2021.