Resolution of Complex Issues in Genome Regulation and Cancer Requires Non-Linear and Network-Based Thermodynamics

The apparent lack of success in curing cancer over the last four decades of molecular medicine indicates the need for a global re-thinking of both its nature and the biological approaches that we are taking to solve it. The reductionist, one gene/one protein method that has served us well until now, and that still dominates in biomedicine, needs to be complemented by a more systemic/holistic approach to address the huge problem of cross-talk between more than 20,000 protein-coding genes, about 100,000 protein types, and the multiple layers of biological organization. In this perspective, the relationship between the chromatin network organization and gene expression regulation plays a fundamental role. The elucidation of such a relationship requires a non-linear thermodynamics approach to these biological systems. This change of perspective is a necessary step for developing successful ‘tumour-reversion’ therapeutic strategies.


Information Crisis in Bioscience and Cancer Research
The discovery of a linear cause/effect relation has been the signature of an effective explanation since the very beginning of modern science. Everyone knows that any linear relation has limitations (e.g., the physical boundaries of the system) and only holds when particular requirements are met, but these limitations are too often overlooked, especially in biomedical research. Thus, the logic of linear causality penetrated modern education and science with no caveats regarding scale and boundary conditions. The elucidation of the DNA double helix and the breaking of the genetic code, which consists of three-letter nucleotide words that allow transcribed genes to direct protein synthesis, were two great discoveries of the 20th century. This breakthrough strengthened the reductionist trend and favoured the development of genetic engineering up to CRISPR technology in the 21st century. In this case, linearity is embedded in a linear deterministic flow of information, with each gene product considered as an autonomous agent in charge of a specific physiological role. However, in spite of its evident technological success, this approach has not yet answered a simple question: how does the same genome, with the same genes, differentially regulate their expression in the different tissues of a multicellular organism? As wittily said by Carl Woese [1], "an engineering biology might still show us how to get there; it just does not know where "there" is". The lacking "there" is a constitutive problem of modern biology.
The fact that the cancer problem is more complex than we thought and needs re-thinking was recently recognised by the leaders in cancer research Robert Weinberg [2] and Douglas Hanahan [3], after cancer genome sequencing projects failed to support the somatic mutation theory of cancer. The latter, in turn, largely forms the current basis of the costly "precision medicine", which is also beginning to frustrate hopes: "The targeting of lower-level agents (genes and pathways) provides unsatisfactory results at higher levels of this system such as clinical outcomes" [4]. In addition, the reproducibility crisis has been claimed to affect 75% of the published reports in biomedicine, and 95% in cancer research [5][6][7].
The general situation appears as an information crisis in bioscience and biomedicine. The nature of the crisis is multi-faceted: it ranges from strictly scientific issues (the inability to put into context the relations between different layers of biological organization) to a research policy that asks for rapid solutions and, thus, does not finance the very basic research approaches that shake the boundaries of accepted dogma.
Here, we propose a few considerations regarding the origin of both problems, together with some theoretical hints to overcome the crisis.

Regulation of the Human Genome: Networking by Self-Organisation Is the Second Principle of Genome Regulation after Watson-Crick Complementarity
In mammals, only 2% of the genomic DNA codes for translated proteins. Approximately 50% of the DNA is composed of simple reiterated sequences, and a considerable portion of the human genome (45%) consists of mostly silent transposable elements. The clustering of heterochromatin (constitutive and facultative) is differently patterned in relation to the radial gradient between the nucleolus and the nuclear envelope in the cells of different tissues [8][9][10][11][12][13][14][15][16], as in Figure 1; this pattern is altered in cancer, as is known to all pathologists.
In the past, several scientists suggested that heterochromatin regulates the differential expression of genes by biophysical mechanisms, through a force field gradient acting in the nuclear space [17][18][19][20][21]. Currently, there are enough results, including those from nuclear image analysis and chromosome conformation capture techniques, to suggest that the heterochromatin mediating the gene silencing position effect is responsible for the differentiation-specific expression of the genetically active euchromatin [22][23][24][25][26][27][28][29][30]. The above evidence suggests that a three-dimensional (3D) heterochromatin topology created by 3D contacts represents the lacking "there", i.e., the positional information of the supra-chromosomal network in the cell nucleus for transcription speciation. This positional information is specified by the loops of the ~1 Mbp topologically associating chromatin domains (TADs) [31]; these loops join gene enhancers with promoters that are insulated from neighbouring TADs, and can converge into mRNA transcription hubs, likely uniting the genes for specific functions [24][32][33][34].
The integrity of the chromatin network is reciprocally dependent on ongoing transcription [35,36]. This implies that this spatial system should change with transcription speciation by differentiation in normal development or by induction [37]. Understanding how positional information is translated into functional information is crucial for overcoming the above-mentioned knowledge crisis, because it provides the 'missing link' between the specificity of transcription and chromosomal organization.
Figure 1. (a) … showing the principal components and differential replication timing. (b) EC (euchromatin) resides in the nuclear interior, whereas HC (heterochromatin) localizes to the nuclear and nucleolar periphery. The scheme is republished from [14], Figure 1, with the licence provided by Elsevier and Copyright Centre.

The Genome "Maps" of Positional Information Need Phase Transitions
From a purely structural point of view, the position of a system becomes informative if (and only if) it introduces a symmetry break in space. The term 'symmetry break' comes from statistical mechanics and can be interpreted as the passage from a situation in which a system can choose between different equally probable and, thus, symmetric positions to a situation in which one choice becomes much more probable than the others and the system is channelled in the corresponding direction.
In an abstract sense, the genome would be a 'homogeneous space' if the 2 m-long DNA molecule were evenly (symmetrically) distributed in the 10 µm-sized nucleus. The readout of any gene would then be equally probable. This is not the case, as the negatively charged DNA is packed many-fold into chromatin by specific positively charged proteins (histones). Moreover, through contacts between distant genome parts and the self-organisation of the euchromatin and heterochromatin domains, this DNA folding is topologically uneven and thus undergoes a symmetry break. This can create positional information for the differential tissue-specific transcription of genes, which are either exposed in loose euchromatin or hidden in densely packed heterochromatin. In this sense, the establishment of positional information needs a symmetry break. A symmetry break in the genome space means that this space becomes a "map" with recognizable coordinates for the whole genome, forming a network [38][39][40] of wired contacts that impose a topological grid (and subsequently a specific address) on nuclear space. Symmetry breaking causes physical phase transitions that operate by hydration, change of charge, aggregation, and crowding of the genetic material. Physical phase transition is also a prerequisite of self-organisation [41].
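One way to make the notion of a symmetry break concrete is to compare the Shannon entropy of a symmetric distribution with that of a distribution in which one choice dominates. The sketch below is only an illustration under stated assumptions: the eight-position toy system, the probability values, and the use of entropy as the measure are all hypothetical choices made here, not quantities taken from the cited literature.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical toy system with 8 possible positions (assumption for illustration).
n_positions = 8

# Symmetric case: every position is equally probable.
symmetric = [1 / n_positions] * n_positions

# Broken symmetry: one position is strongly favoured (values are arbitrary).
broken = [0.93] + [0.01] * (n_positions - 1)

h_symmetric = shannon_entropy(symmetric)
h_broken = shannon_entropy(broken)

print(f"symmetric distribution: {h_symmetric:.3f} bits")
print(f"broken symmetry:        {h_broken:.3f} bits")
```

In the symmetric case the entropy reaches its maximum for eight positions (3 bits), so knowing the position tells nothing in advance; once one choice dominates, the entropy drops and the position of the system becomes predictable, which is the sense in which it can carry information.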
Therefore, we can safely state that networking by self-organisation is the second principle of genome regulation after Watson-Crick complementarity. This is the 'missing link' that, when fully investigated, could answer Carl Woese's question [1].
The idea of self-organisation in biology has a long history [39], but it only entered nucleome studies in the 21st century [42][43][44], where it was strengthened by the Hi-C genome methodology [45,46]. In parallel, phase transition by self-organized criticality (SOC) was revealed in quantitative studies of transcriptome dynamics during cell fate change, such as the oocyte-to-embryo transition in early embryogenesis and the induction of commitment for cell differentiation [47,48].
Each cell type corresponds to a specific configuration in a genome map. To change cell fate, it is mandatory to destroy the previous order so that a new network [47,49] and, subsequently, a new map of positional information can be created. A system that is able to perform such a task should possess the dissipative features (to be both open and energy-saving) that are necessary for sustaining the onset of new ordered structures by dissipating the energy excess [39,49,50]. The toy model of SOC is a sand-pile onto which new sand is poured, a grain at a time [51]. This system attains a 'stable critical state': when the slope becomes too steep somewhere on the pile, the grains slide down, causing a small avalanche. These small fluctuations do not alter the system (the sand-pile slope): the added sand balances the continuous avalanches and the shape of the pile remains the same. Nevertheless, occasionally, an added grain can cause a large catastrophic avalanche by a domino effect that involves progressively smaller avalanches falling down to the base and, thus, flattening the entire sand-pile (long-range correlation, which is typical of transition states). The small perturbations that are caused by added grains are the counterpart of the continuous solicitations that a biological system experiences from its microenvironment, and its stability comes from the dissipation of such perturbations by small fluctuations around the 'native state'.
A network structure sustains the 'domino effect' that is necessary for spreading the signal. Unlike real sand-piles, where the onset of 'catastrophic avalanches' happens by chance and at no particular location, a 'purposely wired' network is able to discriminate between non-relevant and informative perturbations in a way that is analogous to the allosteric effect in protein molecules [52,53]. This translates into the 'response specificity' of biological systems, in which only certain stimuli (independently of their relative energy) are able to elicit a relevant response [52]. The allosteric effect of an enzyme is the most studied case of a 'non-local' effect mediated by the wiring of the protein structure (contacts among residues generated by folding), where not only the local active site is important, but also the conformational changes of the entire protein molecule that impinge on the functionality of the active site [53]. Similarly, the preferential way of chromatin networking might be determined by the anisotropic (symmetry-breaking) super-packaging of heterochromatin [54], with its aggregation-stimulating capacity to impose 'a sticky silence' on large genome domains [15,55]. The non-coding RNA that is transcribed from heterochromatin [56] can also contribute electrostatically to its packaging [25,57]. The spatial compartmentalization of the nucleus is maintained by relatively simple basic physicochemical principles, including electrostatic and hydrophobic forces that generate an extremely detailed 'nuclear topology', giving a material basis to robust gene expression regulation, as indicated by Ronald Hancock [58]. However, the latest studies on single cells revealed that, at the level of TAD loops, the genome organization is heterogeneous and transcription itself might shape and stabilize the TAD-shaping contacts [59,60].
These data show the complexity of the system and are consistent both with the SOC features described above and with the principles of reciprocal causation [61].
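The sand-pile picture can be explored numerically. The following is a minimal sketch of a two-dimensional sand-pile automaton in the spirit of the toy model described above; the grid size, the toppling threshold of four grains, the open boundaries, and the function names `drop_grain` and `simulate` are assumptions chosen here for illustration and are not taken from [51].

```python
import random

def drop_grain(grid, rng, threshold=4):
    """Add one grain at a random site, relax the pile, and return the
    avalanche size (number of topplings triggered by this grain)."""
    size = len(grid)
    row, col = rng.randrange(size), rng.randrange(size)
    grid[row][col] += 1
    topplings = 0
    unstable = [(row, col)]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[i][j] -= threshold
        topplings += 1
        if grid[i][j] >= threshold:
            unstable.append((i, j))
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            # Grains pushed over the edge leave the system (dissipation).
            if 0 <= ni < size and 0 <= nj < size:
                grid[ni][nj] += 1
                if grid[ni][nj] >= threshold:
                    unstable.append((ni, nj))
    return topplings

def simulate(size=15, grains=5000, seed=0):
    """Pour grains one at a time and record the avalanche caused by each."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanches = [drop_grain(grid, rng) for _ in range(grains)]
    return grid, avalanches

grid, avalanches = simulate()
quiet = sum(1 for a in avalanches if a == 0)
print("grains causing no toppling:", quiet)
print("largest avalanche (topplings):", max(avalanches))
print("mean height per site:", sum(map(sum, grid)) / len(grid) ** 2)
```

Most added grains are absorbed without consequence or trigger only a few topplings, while a single grain occasionally sets off an avalanche that spans a large part of the grid. Plotting a histogram of `avalanches` on log-log axes is a simple way to inspect how avalanche sizes are distributed in this toy system.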

Differential DNA Replication Timing Translates Temporal Information into Positional Information
The discrimination between relevant and non-relevant signals asks for positional information and clearly includes the fourth dimension: time. The positional information in the cell nucleus is, in fact, set by differential replication timing, early for transcriptionally active and late for inactive genome parts [62]; see also Figure 1a. The simultaneously replicating chromatin domains become both similarly epigenetically marked and spatially joined [63][64][65][66]. This replication timing, in turn, correlates with the speed of rapid oscillatory motions of the corresponding TADs [67], which were first identified through the motion of replication clusters [68]. We can safely say that the temporal information is translated into spatial symmetry breaking for differential transcription operating in G1-phase through the mechanism of differential replication timing coupled with the epigenetic marking of the chromatin. This implies that a significant spatial re-arrangement of the nuclear configuration that changes the genome transcription profile can be epigenetically transmitted through subsequent cell divisions [69]. It is evident that we are facing a sort of 'second genetic code' operating with a four-dimensional (4D) language (and thus much more difficult to interpret than the mono-dimensional logic of the genetic code based on Watson-Crick pairing along a linear sequence). This spatio-temporal language deals with self-similar (fractal) folded structures across different organization layers, owing to the need for 10,000-fold packaging of the 2 m-long DNA thread in a cell nucleus [70,71]. An interesting approach for their study in individual cells is correlative microscopy with increasing resolution [72].
The onset of a state transition (and the subsequent state change, which in the case of the nucleus corresponds to the reorganization of the chromatin network) represents a 'giant fluctuation' that invades the entire network [73][74][75]. Thus, the supra-chromosomal network structure should also be arranged in a manner that provides the necessary elasticity and order parameters that correlate with the fluctuation amplitudes [38,76]. Oscillations of the chromatin networks have been registered. They likely start with the pulsation of the clusters of nucleolus-associated heterochromatin [77] that involve the nucleoli (whose pulsations were already known to Balbiani (1883), as cited by [78,79]), and end up in overall transcriptional bursting in individual cells [80][81][82], enabling regulatory information to be coordinated and transmitted in a digital manner [83]. These events increase the coherence and synchronization of the whole cell population, as revealed in different stress conditions [84][85][86]. The oscillations that orchestrate cellular coherence have ultradian periodicity (4 min-4 h) [87]. Currently, genome oscillations are the focus of 4D nucleome studies [25,88,89].

Deterministic Chaos for Cell Fate Change: Inevitable Heterogeneity and Fluctuations
Another crucial issue to consider is the identification of the bifurcation points in state space, where a system shifts from one state to another. The tissue-specific differentiation states are relatively few: only ~250 human tissues and ~440 cell types for 22,000 protein-coding genes, whose possible compositions (in the absence of any preferred global configuration of the entire genome organization) would number many billions [90]. This means that these extremely rare and robust compositions of active/inactive genes (attractors) cannot be set by the laws of linear thermodynamics. It is instead non-linear thermodynamics that can help to set these events, which would be highly improbable under a purely random choice of gene expression levels. Within its action, a rare attractor can be found by the random search of the fluctuating system in a multi-dimensional space of possibilities, which involves the entire cell community in feed-back interactions with a permissive environment; however, the system becomes "hooked" (determined) when an appropriate sensor/inducer can give rise to and drive a channelled trajectory toward it [91]. This type of behaviour of dynamic systems, which "harness stochasticity" and allow for creative choice, was first recognized by Edward Lorenz and termed "deterministic chaos" [92]; it becomes "tamed" (set at the lowest energy state) in the appropriate attractor [93][94][95][96].
From the above considerations, we see that any cellular system changing its fate, when in search of the appropriate differentiation attractor (this first step is designated in biology as commitment), should be both heterogeneous and fluctuating. It also means that, to channel the process further, initially only a few outliers, a small avant-garde group, might be involved, which can then self-organize the whole cell population [97], and that the chaotic 'white noise' of a strongly fluctuating system might be paradoxically favourable [98].
For an experimentalist, catching the ongoing dynamical process requires studying individual cells (microscopy, in situ and flow cytometry, single-cell transcriptome sequencing) and considering not only the population averages, but also population heterogeneity. As analysed by [82], the relationships between inputs and outputs can be deterministically specified, diversity generating, or buffering. It also means that the traditional methods of studying DNA, RNA, and proteins from population extracts may not be informative enough, and at some point may even be misleading, because they show no statistically significant changes when only small initiating groups of cells are already starting to channel the process. This implies that we should not neglect the statistical outliers, which may be the forerunners of the change [98]. Therefore, the reproducibility crisis in biomedicine might be associated not so much with incorrect measurements and statistical errors as with the still unrecognized dissipative nature of the objects under study.
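The size of the configuration space invoked in this argument can be estimated with a short calculation. The sketch below rests on two simplifying assumptions that are not made in the text: each gene is treated as a binary on/off switch, and each cell type is treated as a single exact on/off pattern. The gene and cell-type counts are the ones quoted above; the variable names are illustrative.

```python
import math

n_genes = 22_000     # protein-coding genes, as quoted in the text
n_cell_types = 440   # approximate number of human cell types, as quoted

# Assumption: every gene is either "on" or "off", so there are
# 2**n_genes possible expression patterns. We work with log10 values
# because the number itself is far too large for a float.
log10_patterns = n_genes * math.log10(2)
digits = math.floor(log10_patterns) + 1

# Assumption: each cell type is one exact pattern. The log10 of the
# chance that a uniformly random pattern matches any of them:
log10_hit_probability = math.log10(n_cell_types) - log10_patterns

print(f"2**{n_genes} has {digits} decimal digits")
print(f"log10 P(random pattern is a cell type) = {log10_hit_probability:.1f}")
```

Under these assumptions the number of possible patterns is of the order of 10^6622, so a purely random choice of expression states would essentially never land on one of the few hundred observed attractors. Real expression levels are graded rather than binary, which would enlarge the space further.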
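The masking of a small responding subpopulation by a population average can be illustrated with simulated data. The sketch below is a hypothetical example: the expression levels, the noise, the 0.5% responder fraction, the five-standard-deviation outlier threshold, and the function name `simulate_population` are all assumptions chosen here, not measurements from the cited studies.

```python
import random
import statistics

def simulate_population(n_cells, responder_fraction, rng,
                        baseline=10.0, noise=2.0, responder_level=30.0):
    """Per-cell expression values for one gene (arbitrary units).
    A small fraction of 'responder' cells sits at a much higher level."""
    n_responders = round(n_cells * responder_fraction)
    cells = [rng.gauss(baseline, noise) for _ in range(n_cells - n_responders)]
    cells += [rng.gauss(responder_level, noise) for _ in range(n_responders)]
    return cells

rng = random.Random(1)
control = simulate_population(10_000, 0.0, rng)
treated = simulate_population(10_000, 0.005, rng)  # 0.5% of cells respond

# What a bulk extract would report: the population mean.
mean_shift = statistics.fmean(treated) - statistics.fmean(control)
spread = statistics.stdev(control)

# What single-cell data reveal: cells far outside the control distribution.
threshold = statistics.fmean(control) + 5 * spread
outliers_control = sum(x > threshold for x in control)
outliers_treated = sum(x > threshold for x in treated)

print(f"shift of the mean: {mean_shift:.2f} (cell-to-cell spread: {spread:.2f})")
print(f"outlier cells: control={outliers_control}, treated={outliers_treated}")
```

In this toy setting the shift of the population mean is small compared with the cell-to-cell spread, whereas counting individual cells beyond the threshold exposes the responding group directly.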

Cancer Cell Treatment Resistance Is Ensured by Deterministic Chaos and Reprogramming to the Embryonic State
In particular, non-equilibrium thermodynamics bears on the nature of cancer cells and their resistance to the conventional radio-chemotherapy treatments applied in oncology clinics, which aim to kill these cells. The majority is killed, while a small minority (in some cases only 0.01% or even less) inevitably (at least, in the case of tp53 mutants) escapes death and finds a new attractor by using the regulation of tamed chaos [96,98]. This tiny minority of cells can give rise to cancer relapse. The "statistically irrelevant" behaviour of cancer drug escapers is due to a change in their very nature: they become reprogrammed to the state of an egg, embryo, or adult stem cell, thus acquiring toti- or pluripotency [99][100][101][102][103][104][105][106][107][108][109][110][111][112][113]. The capability for deep reprogramming depends in particular on tp53 (tumour suppressor) insufficiency [114], which is common in the most aggressive cancers [115].
The cells reaching the state equivalent to the embryonal stem cell (ESC) should destroy the positional information of the normal tissue of origin. ESCs themselves display highly dynamic, loosely bound architectural chromatin proteins in a cell nucleus that is less constrained than in differentiated cells [116]. They also set bivalent chromatin domains (as detected by activating and repressing histone marks at the promoters of developmental genes), which enable transcription to fluctuate from repressed to active [117]. Thus, the system can be channelled toward different "predetermined" pathways/attractors. Moreover, some opposing pathways (e.g., proliferation versus death, senescence versus oncogenesis) can be driven even by the same pleiotropic genes, e.g., by c-myc and ras [118,119], or by reversible shifts of pou5f1 from the full to the spliced transcript [120]. In emergency conditions of limited resources, this tactic of coupling opposite effects driven by the same genes, even within the same bi-potential cell undergoing asymmetric division, allows growth factors to be concentrated for a few channelled survivors [91,121]. This is not strange, given that any complex system interprets the incoming signals according to its state [122,123]. This inescapable 'system state dependence' of any observed feature should push toward a general re-thinking of many uncritically accepted statements, like the widely believed tenet that high proliferation speed is the principal marker of cancer cells.
As a matter of fact, the main feature of cancer cells is not their speed of proliferation (embryonal cells can proliferate faster), but their capability to be reprogrammed to the totipotent or pluripotent profile. On the contrary, the reprogramming that is induced by irradiation or anti-cancer drugs is associated with uncoupling from cell division and a transient drop in proliferation, leading to tetraploidy and multi-nucleation [112][124][125][126], and it is precisely those cells that serve (after de-polyploidisation and return to mitoses) for the escape from damage and cancer relapse [127,128]. The whole-genome doubling of the reprogrammed cell, in turn, can allow for the creation and bifurcation of two different epigenomes for the asymmetric cell division into two daughters [121,129].
Hence, these reprogrammed polyploidised cancer cells acquire high genome expression plasticity (the capability to visit and settle in different attractors), which allows them to acquire an otherwise improbable resistance to extinction, the essential hallmark of cancer [103,106,108,130]. The process also involves accelerated cell senescence, the metastable bi-potential state that appears to be "another face" of self-renewal associated with polyploidy [131][132][133][134][135].

Cancer Cells Recapitulate the Stress-Adaptive Programs of Unicellulars and Early Metazoans
This extreme endurance of tumour cells is likely borrowed from the phylogenetic evolution of the cell response to stress [136,137], where elastic epigenetic reprogramming was the main tool in the search for available survival pathways. In this reprogramming, the protozoan and even prokaryotic transcription cassettes (in the latter, first of all, those of the DNA repair pathways) become dominant in the transcriptome of the cancer gene network [138][139][140][141][142], particularly in association with polyploidy [143,144]. Accordingly, cancer is not only reprogrammed to the expression state of a gamete, a two-cell embryo, or a morula of a multicellular organism [100,113], but it can also recapitulate, through reprogramming, the adaptive pathways of unicellular organisms. These organisms developed on Earth over a 3.5 billion-year history and survived six Earth-wide catastrophes, some of which destroyed up to 75% of species.
In turn, early embryonal development in mammals also bears the phylogenetic features of the transition from unicellular to multicellular organisms. This transition evolved few new genes and mostly operated with multi-nuclearity, transient colonial forms, ploidy cycles [145,146], chromatin diminution [147], facultative sex [148], and asexual reproduction [149,150]. Interestingly, the earliest tumours were discovered within the same evolutionary forms, in the basal metazoan Hydra; the authors concluded: "cancer is as old as multicellular life on Earth" [151]. The unicellular-to-multicellular transition, best studied in cellular slime moulds, which go from the unicellular to the multicellular state and back in each life cycle, is mostly based on epigenetic plasticity and self-organisation [152].

Chaos in Cancer Is Akin to Chaos in an Early Embryo Which Serves to Change Cell Fate
Moreover, impressive chaos was found in the genome of the early mammalian embryo (up to the blastocyst stage), in striking similarity with cancer. It manifests as chromosome instability, aberrant mitoses, heteroploidy, anaphase bridges, structural chromosome aberrations, loss of heterozygosity in some single cells, etc. [153,154]. During the oocyte-to-embryo change, a critical phase transition occurs in transcription involving thousands of hitherto low-expressed genes, as determined by Tsuchiya et al. [48], while Peaston and colleagues [155] found the activation of thousands of previously silent endogenous retroelements at the same time. They suggested that the massive activation of retroviruses (released from their heterochromatic shelters by genome-wide demethylation) provides potential scope for large-scale coordinated epigenetic fluctuations that can trigger this cell fate change, thus potentially driving it by the rules of non-equilibrium thermodynamics. The hyper-dynamic behaviour of the structural chromatin proteins, with the diffusion of chromocentres found in ESC [116] (directly indicating the erasure of positional information), corresponds to the same type of regulation. In other words, although it may shock the average biologist, chaos in the early embryo paradoxically serves cell fate determination [156].

Reprogramming of Positional Information Can Be Used for Cancer Reversion Therapy
By the same rules of non-linear thermodynamics and, as mentioned above, aggressive radio-chemotherapy, which induces reprogramming associated with reversible polyploidy and genome instability, paradoxically favours the opposite of the intended effect: the creation, selection, and survival of residual resistant clones. It therefore generally offers no prospects [106,113,128]. The exit from the vicious circle of cancer treatments might be sought either in the interruption of reprogramming (which might be difficult or even impossible) or in the opposite direction, by using the toti-pluripotent reprogrammed state of cancer cells for their channelled normalisation (differentiation, which can epigenetically take over the mutations [157]). For this purpose, the appropriate morphogenic embryonal inducers, regeneration fields [158], and the structured 3D environment (putting tumours in context [159] and changing the chromatin folding) can be used [160]. This approach can, at least, stop the invasion and metastatic spread of cancer, which is the main cause of patient mortality.
In turn, if we consider the perspective of cancer normalisation by differentiation [161][162][163], the open problem of whole-genome regulation by topological mechanisms for tissue differentiation should be solved in parallel. These two converging issues require consideration of the non-linear thermodynamics of living matter, which ought to become the mainstream approach in the very near future.
Up to now, these systemic approaches are still in their infancy as regards therapeutic application, although the theoretical background of cell fate reversion has already been established by many different studies that together form a coherent picture [50][164][165][166]. During the last two decades, it has been firmly ascertained that adult somatic cells, either normal or pathologic, can be efficiently "reprogrammed", recovering the status of Induced Pluripotent Stem Cells (iPSCs), and then redirected to acquire a differentiated phenotype [167]. Still more intriguing is the action of retinoic acid, which was initially demonstrated to promote the reversion of teratocarcinoma [168] and was later recognized to induce a nearly complete differentiation of leukemogenic cells in acute promyelocytic leukemia through the induction of terminal differentiation into granulocytes [169], thus reversing cell fate from cancer to a healthy, fully differentiated state, which can, in turn, lead to partial or complete clinical remission. Recently [170], such a cell fate reversion was also observed for a solid tumour. The most recent achievement, the conversion of invasive breast cancer cells into adipocytes, was shown in a mouse model using a PPAR-gamma agonist (PPAR-gamma being a member of the nuclear receptors regulating adipocyte differentiation) combined with a MEK inhibitor [171].
The oncoming future will eventually verify the 1948 prediction that was made by Warren Weaver (one of the fathers of mathematical information science), who posited that the study of networked systems represents the only possible way to cope with the complexity of life [172].