Challenges and Perspectives of Chemical Biology, a Successful Multidisciplinary Field of Natural Sciences

Objects, goals, and main methods as well as perspectives of chemical biology are discussed. This review is focused on the fundamental aspects of this emerging field of life sciences: chemical space, the small molecule library and chemical sensibilization (small molecule microassays).


Introduction
Chemistry, biology and physics are the essential pillars sustaining the development of the natural sciences. The enormous and still expanding progress in our understanding of biological systems is due to the skillful application of the principles and techniques of organic chemistry, whereby synthetic organic chemistry plays the initiating role in biological discovery. Biology has thus "moved" from the descriptive (phenomenological) level to the molecular (biochemical) level, generating new disciplines (structural biology, molecular biology) that now form part of the natural sciences. Living organisms produce and release chemical compounds into the environment that significantly affect other organisms and give rise to chemical interactions between individuals. That is to say, all organisms generate chemical signals and, in turn, every single one responds to the chemical signals of other organisms. These signals are made up of compounds produced through secondary metabolic pathways, which are intimately related to the primary metabolic pathways and their metabolites (carbohydrates, lipids, proteins, and nucleic acids). In biology, these interactions can be analyzed "top-down", in the direction of decreasing complexity of a biological system: the analysis begins with a cell, a tissue, a limb, or a whole organism and ends at the molecular level, with the molecules that participate in the complex interactions within and/or between such systems. In chemistry, by contrast, "bottom-up" synthesis starts from molecules and native macromolecules and proceeds in the direction of increasing complexity, reaching the whole cell and its higher levels of organization through modular motifs and supramodular functional units [1]. Since 1839 (Schwann and Schleiden) the cell has been recognized as the simplest unit of living organisms.
Furthermore, the cell is a protected region in which diverse small molecules and macromolecular clusters (both endogenous) interact with each other in a harmony reached by self-assembly. Much of the cell's content is an aqueous solution of small molecules (e.g., simple sugars, amino acids, vitamins) and ions (e.g., sodium, chloride, calcium) [2]. To study living systems and biochemical processes further, tools were needed to perturb these systems with small molecules and thereby obtain deeper and more detailed information on how living systems operate (Figure 1). Following this trend, a new discipline, chemical biology, emerged at the interface between synthetic organic chemistry and molecular, structural, and cellular biology. Its principal task is to explain the fundamental ideas of the chemistry of life and to apply knowledge of the behavior of living organisms to the interactions between biological macromolecules (endogenous) and small organic molecules (exogenous). This means carrying the understanding of biological processes through to the molecular level.
Chemical biology differs from biochemistry (biological chemistry) principally in its methods for the chemical analysis of the products of secondary metabolism and their interconversions. It also differs from bio-organic chemistry, whose field of action is the study of secondary metabolism products.
Chemical biologists largely use organic chemistry techniques to explore biological systems and to understand how they work (mechanisms, etc.), whereas biochemists generally use techniques closer to biology to understand the interactions of biomolecules at the descriptive level.
The initial stage of chemical biology consists of the analysis of a biological system or phenomenon of interest (Figure 2). In this analysis, structural information is deduced, for instance the structure of the biomolecules involved in a particular biological phenomenon or of the endogenous small molecules that interact with these macromolecules. Without structure, identifying the function of the system is complicated. This structural information is then used to define unsolved chemical problems, i.e., the development of new methods for the synthesis of small molecules, such as secondary metabolites or inhibitors, that can be used to perturb and examine biological systems. Without synthesis, there will not be enough material to study either the structure or the dynamics of the process. The last stage consists of using the prepared molecules as instruments in adequately designed biological or biochemical experiments [3,4] (Figure 2). The objects, objectives, principal methods, and perspectives of chemical biology are discussed in this review. Emphasis is placed on the central aspects of this emerging field of life science, namely chemical space, the small molecule library, and chemical sensibilization (small molecule microarrays) [5], presented as illustratively as possible. This review does not claim to be complete; covering the foundations and advances of chemical biology in the limited space of a journal article is almost unattainable. The principal goal of the present review is to stimulate the interest of young organic chemistry researchers in the interface between synthetic organic chemistry and biology and to demonstrate the importance of multidisciplinary research in this new area, which has been gaining momentum worldwide.

Chemical Space and Biological Space
The expression "small molecules" appeared during the development of synthetic methodology for organic compounds and refers to their molecular weight (less than 500-700 Da). Synthetic or natural small molecules are used as pharmacological prototypes (models) or as precursors in the construction of new chemical entities of wide and diverse practical utility. They can also be used as crucial instruments to study biological processes.
The virtual chemical world of small molecules and natural macromolecules is immense, so studying it is an arduous task [6][7][8][9]. However, the dynamic interaction of organic chemistry and biology has led to the identification of certain molecular structures that are widely employed within the repertoire of the "natural laboratory". These molecular structures, which are also important in studies of the association of small molecules with natural macromolecules, are known as "privileged structures" [10]. The term "privileged structure" was introduced by Evans in 1988 and is defined as "a molecular structure able to provide different receptor ligands" [11].
Accordingly, synthetic small molecules (prepared by chemists) or natural small molecules (metabolic products of organisms) that can permeate the cell membrane can be used to modulate protein functions in a selective, rapid and reversible way.
Molecules are characterized by a wide range of descriptors, such as shape, physical properties (molecular mass, nucleophilicity, lipophilicity, and dipole moment), topology, etc. [12][13][14]. In this sense, the term "chemical space" is equivalent to the "multi-dimensional descriptor space" that encompasses all of the small carbon-based molecules that could, in principle, be created. This means that within chemical space there are structural or molecular characteristics that define a family of organic compounds [5].
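The notion of chemical space as a multi-dimensional descriptor space can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses just two descriptors (molecular mass and lipophilicity as logP), the three molecules and their approximate descriptor values were entered by hand for this sketch and are not taken from the review, and the min-max scaling with a Euclidean distance is one simple choice among many possible ones.

```python
import math

# Approximate, hand-entered descriptor values for illustration only
# (molecular mass in Da, octanol/water logP); not data from this review.
MOLECULES = {
    "aspirin": (180.16, 1.2),
    "caffeine": (194.19, -0.07),
    "ibuprofen": (206.28, 3.97),
}


def scale_descriptors(table):
    """Min-max scale every descriptor to [0, 1] so that no unit dominates.

    Assumes each descriptor takes at least two different values.
    """
    names = list(table)
    columns = list(zip(*(table[name] for name in names)))
    lows = [min(column) for column in columns]
    spans = [max(column) - min(column) for column in columns]
    return {
        name: tuple(
            (value - low) / span
            for value, low, span in zip(table[name], lows, spans)
        )
        for name in names
    }


def distance(a, b):
    """Euclidean distance between two points of the descriptor space."""
    return math.dist(a, b)


# Each molecule becomes one point in a two-dimensional descriptor space.
POINTS = scale_descriptors(MOLECULES)

if __name__ == "__main__":
    for first in POINTS:
        for second in POINTS:
            if first < second:
                d = distance(POINTS[first], POINTS[second])
                print(f"{first} - {second}: {d:.2f}")
```

With more descriptors (shape, topology, dipole moment, and so on) the tuples simply become longer; molecules that lie close together in this space would be regarded as members of the same structural family.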
On the other hand, it has been estimated that the virtual library of chemical compounds with pharmacological properties could comprise approximately 10^60 bioactive molecules [15,16], although the chemical compounds used by biological systems represent only a very small fraction of this astronomical number and have small molecular masses. It is assumed that the simplest living organisms can self-organize with a few hundred different types of these compounds, while the most complex organisms must contain thousands of different small molecules [5]. Thus, in terms of the number of compounds, the biologically relevant chemical space is clearly a very small fraction of the complete chemical space, which may contain 10^30 to 10^200 possible small molecules [17,18], depending on the parameters used in the calculation (Figure 3). At the same time, it is important to recognize that there are currently approximately 49,000,000 substances registered by the Chemical Abstracts Service (CAS) [19] and only 1,350 small-molecule pharmaceuticals approved by the U.S. FDA [20]. Living systems have evolved over more than a billion years to carry out carefully controlled chemistry in aqueous media, typically at temperatures between 0 and 100 °C. Under these conditions, which are essential for life, many chemical reactions do not occur at an appreciable rate, and most of them would not yield their products in a reproducible and specific way. These reactions therefore require an additional and vital component, an enzyme. Living systems use enzymes, together with other proteins and diverse nucleic acids, to carry out and control these reactions. These macromolecules are responsible for the synthesis, transport, and degradation of every small molecule within the biological environment.
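The proportions quoted above can be made concrete with a short calculation. The sketch below uses only the figures given in the text (about 49,000,000 CAS-registered substances, 1,350 approved small-molecule drugs, and an estimated 10^60 pharmacologically relevant molecules); the variable names are illustrative.

```python
import math

CAS_REGISTERED = 49_000_000   # substances registered by CAS, as quoted in the text
FDA_APPROVED = 1_350          # approved small-molecule drugs, as quoted in the text
DRUG_LIKE_EXPONENT = 60       # estimated 10^60 bioactive molecules, as quoted

# Share of registered substances that became approved drugs (about 0.003 %).
approved_fraction = FDA_APPROVED / CAS_REGISTERED

# log10 of the share of the estimated 10^60 space that has been registered.
explored_exponent = math.log10(CAS_REGISTERED) - DRUG_LIKE_EXPONENT

if __name__ == "__main__":
    print(f"approved / registered = {approved_fraction:.2e}")
    print(f"registered / estimated space = 10^{explored_exponent:.1f}")
```

Under these figures, fewer than three in every hundred thousand registered substances are approved drugs, and all registered substances together cover a fraction of the estimated space on the order of 10^-52.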
It is now known that the genomes of the simplest living systems encode the sequences of fewer than 1,000 different proteins, while humans and all mammals have around 50,000 genes. As a rough order of magnitude, an estimated 50,000 to 100,000 active proteins therefore exist in mammalian bodies, a very small number compared with the total number of proteins that could theoretically exist. For example, a typical natural protein contains about 300 residues (α-amino acids). If only the 20 canonical α-amino acids are combined to produce proteins, the number of possible sequences for this 300-residue model protein is 20 raised to the power 300 (20^300), or about 10^390, and if only a single molecule of each of these polypeptides were produced, their combined mass would vastly exceed that of the known Universe. Natural proteins are therefore also a very select group of molecules [5] (Figure 3).
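The size of this protein sequence space can be checked with a minimal calculation. The number of amino acids (20) and the chain length (300) come from the text; the average residue mass of about 110 Da and the rough 10^53 kg for the ordinary matter of the observable Universe are assumptions introduced here for the comparison and are not taken from the review. Because 20^300 is too large for a floating-point number, the arithmetic is done on base-10 logarithms.

```python
import math

ALPHABET_SIZE = 20    # canonical alpha-amino acids (from the text)
CHAIN_LENGTH = 300    # residues in the model protein (from the text)

# Assumptions for the mass comparison, not taken from the review:
MEAN_RESIDUE_MASS_DA = 110.0     # commonly used average residue mass
DALTON_IN_KG = 1.66054e-27       # one dalton expressed in kilograms
LOG10_UNIVERSE_MASS_KG = 53.0    # rough order of magnitude, ordinary matter

# log10 of the number of distinct 300-residue sequences: 300 * log10(20).
log10_sequences = CHAIN_LENGTH * math.log10(ALPHABET_SIZE)

# Mass of one copy of one such protein, then of one copy of every sequence.
log10_protein_mass_kg = math.log10(
    CHAIN_LENGTH * MEAN_RESIDUE_MASS_DA * DALTON_IN_KG
)
log10_total_mass_kg = log10_sequences + log10_protein_mass_kg

# Python integers are exact, so the digit count can be verified directly.
exact_digits = len(str(ALPHABET_SIZE ** CHAIN_LENGTH))

if __name__ == "__main__":
    print(f"20^300 is about 10^{log10_sequences:.1f} ({exact_digits} digits)")
    print(f"combined mass is about 10^{log10_total_mass_kg:.0f} kg")
```

The logarithm comes out slightly above 390, in agreement with the figure of about 10^390 given in the text, and under the stated assumptions the combined mass exceeds the assumed mass of the Universe by more than 300 orders of magnitude.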
The emergence of macromolecules, which possess the ability to store, distribute information, and translate it into a catalytic function, manifests the dual multi-faceted nature of protein synthesis: as a chain of enzymatic steps of the chemical pathway in the biochemical space and as a process of genetic information transfer in the space of molecular biology.
Because they lie in the biologically relevant chemical space, natural compounds and natural product-like small molecules play an important role as simple instruments for understanding intracellular signaling and dynamic protein-protein or protein-DNA interactions, which are common and fundamental to normal cellular processes and to cellular deregulation. Secondary and primary metabolites co-evolved with proteins and nucleic acids, and their molecular scaffolds and functional groups "were adjusted" over millions of years for specific biochemical purposes. For this reason, natural products and their synthetic analogues cover this biologically relevant chemical space and have high affinities for their respective biological targets.

Small Molecules Library Generation
There are three sources of small molecules from which libraries can be formed: (1) isolation of natural products, (2) chemical and/or chemo-enzymatic derivatization of natural products, and (3) chemical synthesis [21,22]. Traditionally, natural products are studied as complex extract mixtures that are subjected to rigorous separation, analysis, and spectroscopic study, in addition to the evaluation of their biological properties. This process leads to the identification of lead molecules that can act as pharmacological agents, which is why natural products are indisputable models for chemical synthesis and chemical biology.
Chemical synthesis (the preparation of new molecules by means of chemical reactions) has been and still is an important means of generating new molecular libraries. It draws on several strategies and tactics that have developed gradually in response to the demands of other sciences.
The first strategy, Target Oriented Synthesis (TOS), is directed towards a specific target (the desired product) and gives access to a specific region of chemical space. This approach is intimately linked to the development of retro-synthetic analysis [23], which begins with the disconnection of a complex structure in search of simple and appropriate starting materials from which the structurally complex molecule can be prepared (Figure 4). In the retro-synthetic analysis, the complex product is "broken" down, disconnection by disconnection, into chemical species that can be synthesized from available substrates using known reactions. This "top-down" retro-synthetic process runs in the direction opposite to that of the chemical synthesis itself.
As mentioned above, this kind of synthetic organic chemistry explores a dense region of chemical space in a precise area with known properties. However, are these regions of chemical space, defined by a natural product or a known structure, really the best or most fertile regions for the discovery of small structures able to modulate macromolecular functions? This is a highly relevant question for organic chemists, given the great potential offered by small molecules.
The answer lies in the principles of the synthetic approach known as Diversity Oriented Synthesis (DOS), which allows a wide distribution of compounds within chemical space [24,25]. This methodology provides the deliberate, simultaneous, and efficient synthesis of more than one target compound in a diversity-directed approach to a complex problem [26] and allows the construction of small molecule collections that show a range of bioactivities, aiming at the efficient synthesis of molecules with diverse molecular structures [27]. Although structural complexity is not a requirement for molecular diversity, it has been proposed to confer specificity in biological interactions [28]. In the DOS methodology, the synthetic analysis is performed in the "forward" direction, and the strategy is designed so that simple raw materials can be transformed into diversified and complex compounds [29] (Figure 5).

[Figure 4 (labels recovered from the image): desired product ("target molecule"); disclosing the structure, looking for simpler pieces; adequate commercial products ("stored materials"); desired product synthesis.]
[Figure 5 (labels recovered from the image): adequate commercial products ("stored materials"); complex and diverse desired products.]
Members of a DOS library should be "diverse" both in their substituents and in the location of these substituents. To design a DOS approach, it is therefore necessary to bear in mind four types of diversification [30,31]:
• Substituent diversity: this can be incorporated by "combinatorial variation" of the building blocks employed.
• Stereochemical diversity: this can be incorporated using agents able to control asymmetric reactions.
• Functional group diversity: this can be introduced by chemical manipulation.
• Nuclei (core scaffold) diversity: this allows the formation and fusion of different rings.
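The "combinatorial variation" of building blocks mentioned under substituent diversity can be illustrated with a minimal enumeration sketch. The scaffold and substituent labels below are hypothetical placeholders invented for this example; they do not refer to any library described in this review, and a real design would also have to account for stereochemistry and synthetic feasibility.

```python
from itertools import product

# Hypothetical building-block labels, invented for this sketch.
SCAFFOLDS = ["core_A", "core_B", "core_C"]
R1_GROUPS = ["methyl", "phenyl", "benzyl", "allyl"]
R2_GROUPS = ["H", "F", "Cl", "OMe", "NO2"]


def enumerate_library(scaffolds, r1_groups, r2_groups):
    """Return every (scaffold, R1, R2) combination as one library member."""
    return [
        {"scaffold": scaffold, "R1": r1, "R2": r2}
        for scaffold, r1, r2 in product(scaffolds, r1_groups, r2_groups)
    ]


# The library size is the product of the building-block counts: 3 * 4 * 5.
LIBRARY = enumerate_library(SCAFFOLDS, R1_GROUPS, R2_GROUPS)

if __name__ == "__main__":
    print(len(LIBRARY), "members; first member:", LIBRARY[0])
```

The multiplicative growth is the point of the sketch: adding a single further scaffold in this toy example would add 20 members, which is why varying the core (nuclei diversity) enlarges the covered chemical space more strongly than varying one substituent position.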
The DOS methodology includes the use of sequential reactions capable of generating complexity and incorporating molecular diversity into a collection of compounds built from simple starting materials [32]. In these "branching pathways", the product of one reaction is the substrate for the following step, so that a simple raw material is transformed into a diverse and complex series of molecules. The total synthesis of complex natural products (TOS) and the synthesis of structurally diverse libraries (DOS) both require strategies and tactics with well-defined characteristics. These synthetic methods must be robust, flexible, and stereoselective [33].
The increasing interest in the synthesis of libraries of heterocyclic molecules obliges chemists to develop new strategies for the design of promising libraries, which can be generated using different models. An important question must be asked before selecting such a model: how many molecules must a collection contain to be productive and structurally diverse? There are "large" libraries (with more than one million compounds) [34] and "short" libraries (with only 10,000 compounds) [35], both guided by a single natural product, whose principal task is to generate a lead compound, a pharmacological agent, in an effective, rapid, and economical way. However, it is a common misconception that the largest and most diverse collections are automatically better; moreover, in practical terms, "large" libraries (10^10 molecules) are difficult to organize according to the properties of each compound. It may therefore be suitable to generate and examine a "small or short" collection (fewer than 60 compounds) of alkaloid-like molecules or terpenoids. An ideal "structurally diverse" library containing 40 diverse natural molecules has been evaluated theoretically and shown to have superior parameters compared with collections of 46-168 members [36].
To go deeper into biological processes at the molecular level, new libraries based on Biology Oriented Synthesis (BIOS) are needed as well [37]. The main purpose of these libraries, which are based on the structures of bioactive natural products, is to study biological systems by direct perturbation with small molecules [38,39]. The advantages of small molecules (high temporal control, good and easy dosage control, and versatility) allow biological responses to be measured rapidly over a wide range of variables for different cell types (systems) in vivo and in vitro.
In summary, it is necessary to highlight the vital importance of natural products and/or their close analogues in revealing biological mechanisms and to emphasize that almost all of them lie within the biologically relevant chemical space (Figure 3). Natural products have great affinity for macromolecules, e.g., proteins, DNA, and lipid structures, which are products of primary metabolism. At the same time, various classes of compounds, for example terpenes, phenolics, phenylpropanoids or alkaloids, play a prominent role in secondary metabolism. Therefore, any one of these promising products could serve as the initial prototype for the generation of its analogues via the BIOS methodology [40][41][42][43]. Libraries inspired by natural products are thus efficient and promising in the search for pharmacologically active agents. More than 50 libraries based on natural product frameworks can be found in the scientific literature [43,44]. Some of these molecular collections are based on the idea of combinatorial chemistry [45][46][47].

Chemical Sensibilization
Since one of the prime aims of chemical biology is to exploit the power of synthetic organic chemistry to discover and explain the essential molecular pathways in cellular, molecular, and structural biology, modern methods are needed for preparing the new small molecules that will be the main instruments in these studies.
With these instruments in hand, new micro bioassay techniques (chemical sensibilization) are needed that can detect the changes caused by functional perturbation with small molecules and thereby answer biological questions (Figure 6). Different measurement formats can be employed to explore these perturbations in a highly rational and efficient way [48]. In developing a small molecule screening test, three critical factors must be considered: (i) the test type (biochemical, cellular, phenotypic, microassay, etc.); (ii) the detection technology (luminescence, fluorescence, radioactivity, etc.); and (iii) the reagents required (cell lines, enzymatic substrates, purified proteins, antibodies, positive and/or negative controls, etc.). The formats come in diverse shapes and sizes; nevertheless, they can be broadly classified into three categories: (a) High-Throughput Screens (HTS); (b) High-Content Screens (HCS); and (c) Small-Molecule Microarrays (SMM). The first, in which numerous molecules are analyzed rapidly and in parallel to uncover bioactivity, was developed thanks to combinatorial chemistry [49,50]. The second is based on the analysis of cells or organisms by automated imaging techniques to detect multiple phenotypic responses. The third is the most attractive and popular today [51,52]. In this type of assay, small molecules are generally linked covalently to the microarray surface (glass, gel, polymer, etc.) and exposed to the target of interest. These assays allow the identification of novel modulators of different proteins involved in several biological processes.
In the "physiological context", these assays are now automated and work well for cell-free systems (enzymes, proteins, DNA in vitro); in vivo assays based on mammalian tissues, however, are very difficult, expensive, and time-consuming (Figure 7). An intermediate level between enzymatic assays and assays on mammalian tissues is offered by assays on small model organisms (the invertebrates Caenorhabditis elegans and Drosophila melanogaster, and the zebrafish, Danio rerio) [53], because these are relatively simple processes involving easy manipulations. Compared with the first two models, however, the zebrafish, being a vertebrate, is closer to mammals, and its genes are more similar to their mammalian orthologs. The zebrafish model [54][55][56][57] can therefore be classified as a "borderline case" (Figure 7). It has been especially productive in developmental biology [58] and in fields such as chemical genetics [59] and oncology [60,61], among others.
One of the approaches of SMM, and thereby of chemical biology, is the design and preparation of small molecules whose molecular mechanism (mode of action) is based on the inactivation of enzymes implicated in diverse diseases, including parasitic and infectious diseases, among others [62]. In biochemistry, any small molecule that slows down or blocks enzyme catalysis (reversibly or irreversibly) is known as an enzyme inhibitor. Such molecules are often structurally similar to the substrate of the specific enzyme. As the name implies, inhibition of enzyme activity by a reversible inhibitor is reversible, suggesting that noncovalent interactions are involved. If the interaction with the target enzyme is irreversible (usually covalent), the small molecule is referred to as an enzyme inactivator (or irreversible inhibitor); it can prevent the return of enzymatic activity for an extended period of time [62]. Many natural products and/or their close analogues work as enzyme inhibitors or inactivators. Among all the target proteins for potential therapeutic use, enzymes are the most promising for rational inhibitor design. Thus, the discovery of new selective enzyme inhibitors is an exciting approach to the rational discovery of new drugs [62][63][64]. These molecules can be designed using a rational organic-synthetic approach founded on natural products.
In general, most of these compounds serve as base models (prototypes) for pharmaceutical development [65,66] and are invaluable instruments in cellular biology [67]. Moreover, the use of natural product-like small molecules as modulators of protein-protein interactions makes it possible to learn more about complex intracellular signaling processes.
To ensure that a small molecule library designed by the BIOS methodology is effective, three synthetic strategies are available: (1) libraries based on the molecular scaffold of a particular natural product (alkaloid, phytohormone, etc.); (2) libraries derived from sets of natural products with specific substructures; and (3) libraries defined by resemblance to the structural characteristics of natural products. All three strategies have provided positive results and interesting examples [37,68].
Most BIOS/SMM research is devoted to the discovery of new natural product analogues or chemotherapeutic agents with improved and well-defined properties. The studies of Waldmann and Schreiber illustrate this line of work. Prof. Waldmann and co-workers reported a small molecule collection based on the structural framework of dysidiolide, a sesterterpene isolated from the Caribbean sponge Dysidea etheria that inhibits the protein phosphatase Cdc25A. In this study, a compound 27 times more active was found that retains the γ-hydroxybutenolide structural moiety [69] (Figure 8). The research group directed by Prof. Schreiber took the indole alkaloid spirotryprostatin B as the prototype for a new library of 3,232 spirooxindole molecules. This alkaloid, isolated from the saprophytic mold Aspergillus fumigatus, has antimitotic properties. A yeast-based assay was developed to identify enhancers of the growth arrest induced by latrunculin B (a natural product that sequesters monomeric actin and prevents the formation of actin microfilaments), and new spirooxindoles with enhancer properties were synthesized [70] (Figure 9). Shair and co-workers [71] developed a small library (2,527 molecules) inspired by the alkaloid galantamine, a potent AChE inhibitor. Interestingly, the alkaloid structure was selected because of its high degree of functionality and molecular rigidity, not because of its potent activity; the aim was to find molecules capable of perturbing protein trafficking from the endoplasmic reticulum through the Golgi apparatus to the plasma membrane. Using a phenotypic cellular assay with SMM screening, a new molecule, secramine, was identified as a potent inhibitor of VSVG-GFP movement from the Golgi apparatus to the plasma membrane (Figure 10).
In an experiment designed by Schreiber and co-workers [72], an SMM of a 12,396-molecule library based on specific substructures from complex natural products was screened against the fusion protein Hap3p-GST, which resulted in the discovery of the cellular transcription inhibitor haptamide B. A further study by Schreiber and co-workers focused on the identification of new bioactive molecules with a 1,3-dioxane substructure by means of phenotypic tests and several enzymatic bioassays (50 different biotests) [73]. Using a similar strategy, the Schultz group prepared a new N-heterocyclic library containing 45,140 distinct molecules with the purine substructure [74]. Recently, an SMM method was reported in a screening format based on A549 and HeLa mammalian cells with an imaging-based readout. This method should be a valuable aid in the discovery of new potential chemotherapeutic agents [75]. The selected examples above are just a few of the vast number of such studies carried out by scientists working in the field of chemical biology.

Future
Chemical biology must demonstrate how reactions and small molecules can be used in fascinating ways to study biology. More generally, it must also show how the structures and functions of materials produced by chemical or biological means can be analyzed. As the science of synthetic or modified small molecules in the context of living systems, chemical biology is now able to go beyond studies of memory and cognition, detection and signaling, and the comprehension and modulation of cellular circuits. Without a doubt, chemical biology is capable of, and will achieve, success in the discovery of new biological phenomena, expanding the horizons of our knowledge about living beings, including ourselves [76].