Biological Importance of Cotton By-Products Relative to Chemical Constituents of the Cotton Plant

Although cotton has been cultivated for over 7000 years, mainly for its fibre, the potential uses of the other parts of the plant have not been fully explored. Cotton contains many important chemical compounds, yet its phytochemical composition remains poorly understood. In order to add value to waste products of the cotton industry, such as cotton gin trash, this review focuses on the phytochemicals associated with different parts of the cotton plant and their biological activities. Three major classes of compounds and some primary metabolites have previously been identified in the plant. Among these compounds, most terpenoids and their derivatives (51), fatty acids (four), and phenolics (six) were found in the leaves, bolls, stalks, and stems. Biological activities, such as anti-microbial and anti-inflammatory activities, are associated with some of these phytochemicals. For example, β-bisabolol, a sesquiterpenoid enriched in the flowers of cotton plants, may have application in anti-inflammatory products. Considering the abundance of biologically active compounds in the cotton plant, there is scope to develop a novel process within the current cotton fibre production system to separate these valuable phytochemicals and develop them into potentially high-value products. This scenario may present the cotton processing industry with an innovative pathway towards a waste-to-profit solution.


Introduction
Cotton (Gossypium) is naturally a perennial plant that is now commercially cultivated as an annual plant in many parts of the world [1]. The cotton bud is the most utilized part of the plant and is the starting raw material for a wide range of products, such as textiles, edible oil, paper, livestock feed, and medicinal products [2–7]. Cotton fibre has many positive characteristics (comfort, colour retention, absorbency, strength) [2] and, hence, global cultivation has increased to an estimated production of over 23 million tonnes in 2013-2014 [8]. This increase in cotton production has left large quantities of waste after harvesting and processing (ginning), and the disposal of this waste is a growing challenge [9,10].
Non-fibre biomass residues generated from cotton production and processing include cotton gin trash (CGT), post-harvest field trash (PHT), and crushed seeds from which oil has been extracted. PHT consists of the parts of the plant left on the field, while CGT is centralised at gins and is comprised mainly of sticks, burrs (calyx), leaves, and soil [9,11]. These by-products of the cotton industry, although underutilized, are being used as soil composts and, in the case of cottonseed meal, as nutritional supplements in livestock feed [10,12–14]. Other methods of utilization

Cotton Industry and Processing
When fully matured, cotton bolls are picked and transported for processing, leaving the remaining plant as field trash. During ginning (the refining of the harvested cotton), impurities are removed from the cotton fibres and are recovered as a processing by-product (CGT). Cotton seed is also processed to recover cotton seed oil and cotton seed meal. Cotton production therefore generates three categories of waste products: (i) field trash (stems, flowers, leaves, and stalks); (ii) CGT (leaves, fibre, flowers, immature seeds, sticks, and soil) [9]; and (iii) cotton seed meal (from which oil has been extracted) (Figure 1).

Cotton Waste
Cotton by-products have provided producers with additional value, mainly in the form of livestock feed supplements and soil amendments [6,66]. Despite its abundance, field trash is generally viewed as having little value-added potential and is, therefore, not a resource that is utilised in standard farming practices. Field trash is currently slashed and left in the field where it provides some benefits through improving soil carbon and reducing soil erosion. The high cost associated with harvesting field trash for other uses is considered a major economic hurdle.
Cotton seeds constitute 55% of the total ginned cotton by weight, whereas cotton fibre and CGT make up about 35%-40% and 10%, respectively [67]. Although historically viewed as a waste by-product of cotton processing, cotton seed is now considered a high-value co-product and an important part of the cotton processing value chain. Cotton seed is fractionated into high-value oils and high-protein meals, both with applications in the food and feed industries. In contrast, CGT is considered a lower-value waste with little value-adding potential, and the management of CGT is regarded as a financial burden by most ginning operations. CGT is generally disposed of in one of four ways: landfilling as solid waste, composting and land application, incineration and, to a lesser extent, feeding to livestock as a supplement [9,10]. Moreover, disposal options are tightly regulated by local environmental laws, which add further restrictions.
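The mass fractions quoted above for ginned cotton lend themselves to a quick mass balance. The following is a minimal sketch, assuming the shares reported in [67] apply unchanged to any batch; the function name `by_product_mass` and the 1000-tonne batch are hypothetical and chosen only for illustration. Because the fibre share is given as a range, each component is returned as a (low, high) pair, and the quoted shares are approximate, so they need not sum to exactly 100%.

```python
# Approximate mass shares of ginned cotton as quoted in the text [67].
# Each entry is a (low, high) fraction; only fibre is reported as a range.
FRACTIONS = {
    "seed": (0.55, 0.55),
    "fibre": (0.35, 0.40),
    "cgt": (0.10, 0.10),
}


def by_product_mass(ginned_tonnes):
    """Return {component: (low, high)} masses in tonnes for a batch.

    `ginned_tonnes` is the total mass of ginned cotton (hypothetical input).
    """
    if ginned_tonnes < 0:
        raise ValueError("mass must be non-negative")
    return {
        name: (ginned_tonnes * low, ginned_tonnes * high)
        for name, (low, high) in FRACTIONS.items()
    }


if __name__ == "__main__":
    # Illustrative batch size only; not a figure from the review.
    for name, (low, high) in by_product_mass(1000.0).items():
        print(f"{name}: {low:.0f} to {high:.0f} tonnes")
```

A calculation of this kind only gives an order-of-magnitude estimate of how much CGT a gin must manage; actual shares vary with cultivar and harvesting method.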
CGT has a reasonable nutritional profile composed of dry matter (90%), crude protein (12%), total digestible nutrients (47%), calcium (11%), sodium (121 ppm), and iron (963 ppm) [66], and has been proven to contribute to the wellbeing of livestock [6,12]. Despite this, its use has been discouraged (and banned altogether in some countries) owing to the presence of residual chemicals which are used during cultivation [68]. This has resulted in CGT being widely used as fertilizer supplements [69,70] and composts to maintain/conserve soil moisture and composition that improve crop production [71,72].
Waste generated from cotton harvesting and cotton ginning mills is used to replace inorganic filler materials and additives in the production of thermoplastic composites based on poly(lactic acid) (PLA) and low-density polyethylene (LDPE) [73]. By-products from the cotton industry have also been processed to produce fulvic acid and silica [74]. Cotton trash has been investigated in numerous studies as a renewable feedstock for bioethanol production [15,16,75,76]. CGT is well suited as a biofuel feedstock because its high polysaccharide content (up to 50%) allows effective and scalable conversion to biofuels.
A promising, yet less well documented use of CGT is in the manufacturing of biologically active compounds. There are a variety of chemical compounds which occur naturally in cotton plants with wide ranging activities. Given that such compounds are present in the cotton plant, it is plausible that the remaining trash also contains a proportion of biologically active molecules. These compounds and their uses are examined and discussed in the following sections of this review.

Chemical Compounds in Cotton
Different compounds present in cotton play important roles in metabolism or in interactions with the environment. Naturally occurring compounds in cotton include terpenes, phenols, proteins, carbohydrates, fatty acids, and lipids [19] (Table 1). As with most plants, the distribution of these compounds varies between different parts of the cotton plant, with some compounds concentrated in specific parts [77] (Figure 2). The distribution of these chemical compounds is related to their different properties and functions in the plant. The following sections discuss the various compounds found in the cotton plant, highlighting their chemistry as well as their distribution within the plant.

Terpenes
Like most plants, the cotton plant is susceptible to attack by insects, herbivores, and pathogens, and it produces compounds as a defence mechanism to ward them off. Terpenes are an important class of defence compounds synthesized in the cotton plant and are also the largest group of plant defence compounds [124,125]. They are major constituents of the essential oils found in most plants and, as such, have been applied in the food, chemical, and cosmetic industries [126]. Terpenes are composed of units of a five-carbon compound, isoprene (Figure 3), linked together in a head-to-tail fashion [127] to form long chains or rings. They are classified into seven classes by the number of isoprene units they contain: hemiterpenes, monoterpenes, sesquiterpenes, diterpenes, triterpenes, tetraterpenes, and polyterpenes [125,127–129]. Generally, hemiterpenes do not occur as free compounds but are bound to other non-terpene compounds [126], while terpenes modified by oxidation or a rearrangement of the carbon skeleton are referred to as terpenoids.
According to Pare and Tumlinson [130] and Rose and Tumlinson [131], terpenes in cotton can be divided into two groups. The first group consists of constitutive compounds that are present in the storage compartments of the cotton plant and are released immediately after insect feeding or damage. These terpenes include α-pinene, β-pinene, limonene, caryophyllene, α-humulene, and myrcene. The second group consists of inducible compounds, which are synthesized de novo several hours after exposure to pests and herbivores and include β-ocimene, α-farnesene, β-farnesene, and linalool. Some of these terpenes occur in their enantiomeric forms in the plant, with the negative forms reported, e.g., -α-farnesene, -β-farnesene and -β-ocimene [131,132].
The terpenes that occur in the cotton plant are mostly monoterpenes, sesquiterpenes, triterpenes, and terpene derivatives, with monoterpenes, sesquiterpenes, and their derivatives being the most common [133]. The total concentration of terpenes in the cotton plant is unclear, and the accumulation of terpenes varies between plant parts: up to 15.5 mg terpenoids reportedly accrued per fresh weight of cotton leaves [124], while 2.81 mg and 2.49 mg per foliage weight were reported for monoterpenes and sesquiterpenes, respectively.
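The classification of terpenes by their number of isoprene units can be sketched in a few lines of code. This is an illustrative sketch only: the function name `classify_terpene` is hypothetical, the unit counts for diterpenes (four) and tetraterpenes (eight) are the standard values rather than figures stated in this review, and "more than eight units" is used here as an assumed cut-off for polyterpenes.

```python
# Isoprene-unit counts for the seven terpene classes named in the text.
# Each isoprene unit contributes five carbon atoms.
TERPENE_CLASSES = {
    1: "hemiterpene",
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",      # standard value, not stated in the review
    6: "triterpene",
    8: "tetraterpene",   # standard value, not stated in the review
}


def classify_terpene(carbon_atoms):
    """Map a carbon count to a terpene class via its isoprene-unit count."""
    if carbon_atoms <= 0 or carbon_atoms % 5 != 0:
        raise ValueError("carbon count must be a positive multiple of 5")
    units = carbon_atoms // 5
    if units > 8:
        # Assumed cut-off: anything longer is treated as a polyterpene.
        return "polyterpene"
    if units not in TERPENE_CLASSES:
        raise ValueError(f"{units} units is outside the seven-class scheme")
    return TERPENE_CLASSES[units]


if __name__ == "__main__":
    for carbons in (10, 15, 30):
        print(carbons, classify_terpene(carbons))
```

Counts of five or seven units fall outside the seven classes listed above, so the sketch rejects them rather than guessing a class.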

Terpene Biosynthesis
Terpenes are synthesized via the acetate/mevalonate pathway [133] and the mevalonate-independent pathway [127,128]. The non-mevalonate pathway is also referred to as the deoxyxylulose phosphate (DXP) pathway or the methyl erythritol phosphate (MEP) pathway. Although terpene synthesis begins with photosynthesis, most studies identify the combination of three acetyl CoA molecules as the starting point of terpene or terpenoid biosynthesis via the acetate/mevalonate pathway [134].
The mevalonate and non-mevalonate pathways both result in the formation of isopentenyl pyrophosphate (IPP) and dimethyl allyl pyrophosphate (DMAPP), which forms isoprene in a reaction catalysed by isoprene synthase. Monoterpenes are synthesized in the plastids of plant cells from geranyl pyrophosphate (GPP) (Figure 4), which is formed from the combination of DMAPP and IPP catalysed by isoprenyl diphosphate synthases [135]. Sesquiterpenes (Figure 4) are synthesized in the cytosol from farnesyl pyrophosphate (FPP), which is formed from one molecule each of GPP and IPP joined in a head-to-tail combination. Sesquiterpene synthase enzymes convert FPP to sesquiterpenes via ionization reactions [136]. The other terpenes, diterpenes and triterpenes, are synthesized from FPP via the formation of geranyl geranyl diphosphate (GGDP) and squalene, respectively (Figure 4). Both pathways of terpene biosynthesis can thus be summarised as a four-step process: (1) synthesis of IPP and isomerization to DMAPP; (2) addition of more IPP compounds; (3) terpene backbone formation by terpene synthase activity; and (4) enzymatic modification to induce specific functions of the terpenes [129].
Figure 4. Biosynthesis of terpenes from isopentenyl pyrophosphate, a product of the mevalonate and non-mevalonate pathways.
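The chain-building steps described above can be followed by tracking carbon counts alone. The sketch below is a bookkeeping illustration, not a model of the enzymology: the class `Prenyl` and the functions `condense` and `build_precursors` are hypothetical names, and the stoichiometries assumed for GGDP (FPP plus one IPP) and squalene (two FPP) are standard biochemistry that the text does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prenyl:
    """A precursor tracked only by its name and carbon count."""
    name: str
    carbons: int


# The two five-carbon building blocks produced by both pathways.
IPP = Prenyl("IPP", 5)
DMAPP = Prenyl("DMAPP", 5)


def condense(first, second, name):
    """Join two precursors; the product carries the sum of their carbons."""
    return Prenyl(name, first.carbons + second.carbons)


def build_precursors():
    """Return the carbon count of each precursor named in the text."""
    gpp = condense(DMAPP, IPP, "GPP")        # precursor of monoterpenes
    fpp = condense(gpp, IPP, "FPP")          # precursor of sesquiterpenes
    ggdp = condense(fpp, IPP, "GGDP")        # assumed: FPP + IPP, diterpenes
    squalene = condense(fpp, fpp, "squalene")  # assumed: 2 x FPP, triterpenes
    return {p.name: p.carbons for p in (gpp, fpp, ggdp, squalene)}


if __name__ == "__main__":
    for name, carbons in build_precursors().items():
        print(f"{name}: C{carbons}")
```

Each condensation adds a multiple of five carbons, which is why the terpene classes fall at C10, C15, C20, and C30.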

Monoterpenes (C10)
The monoterpenes (C10H16) are a class of terpenes that consist of two isoprene units and can be linear (acyclic), monocyclic (containing one ring), or bicyclic (containing two rings) [137]. Over 1000 monoterpenes are known to occur in nature, and common examples in plants include myrcene (acyclic), limonene (monocyclic), and pinene (bicyclic) (Figure 5) [137]. Together with the sesquiterpenes, monoterpenes are major constituents of essential oils extracted from various plant materials [138]. Biochemical modifications of monoterpenes, such as oxidation, hydroxylation, and rearrangement of atoms, result in the formation of monoterpenoids such as geraniol and linalool [135,139]. In the cotton plant, some cyclic monoterpenes, such as α-pinene, β-pinene, and limonene, belong to the group of constitutive compounds, while others are herbivore-induced [130,132]. Although the monoterpenes found in cotton are distributed in different parts of the plant, including leaves, seeds, flowers, stems, and roots, they are predominantly concentrated within the leaves and flowers (Figure 2) [23,124,140,141].

Sesquiterpenes (C15H24) are composed of three isoprene units in either acyclic or cyclic form and occur in most plants. Sesquiterpenes are not limited to higher plants; they have also been discovered in micro-organisms, such as bacteria and fungi, and in marine organisms [135]. Sesquiterpenes occur in many cotton species and have been extracted from the leaves, flowers, seeds, and bolls of cotton plants [21,142]. Bell [19] reported total concentrations of some sesquiterpenes in essential oil extracted from whole cotton plants of up to 26.12% and 30.1% for G. hirsutum and G. barbadense, respectively. α-Bergamotene, caryophyllene, bisabolene, farnesene, humulene, and copaene are some of the sesquiterpenes commonly associated with cotton (Figure 6), while oxidized forms such as bisabolol, bisabolene oxide, caryophyllene oxide, and other sesquiterpenoids also occur in the cotton plant [19] (Figure 7).
There are no reports of diterpenes, tetraterpenes, or polyterpenes in cotton plants; however, two triterpene derivatives, β-sitosterol and β-amyrin montanate (Figure 8), were reported to occur in cotton leaves by Shakhidoyatov et al. [18]. Triterpenes are generally made of six isoprene units and contain 30 carbon atoms, with a molecular formula of C30H48.
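The molecular formulas quoted for monoterpenes (C10H16), sesquiterpenes (C15H24), and triterpenes (C30H48) all follow from the isoprene unit, C5H8, repeated n times. A minimal sketch of that arithmetic is given below; the function names and the rounded atomic masses are assumptions made for illustration, and the formula applies only to unmodified terpene hydrocarbons, not to oxidized terpenoids.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008}

# One isoprene unit is C5H8.
CARBONS_PER_UNIT = 5
HYDROGENS_PER_UNIT = 8


def terpene_formula(isoprene_units):
    """Return the formula (C5H8)n written out, e.g. 'C10H16' for n = 2."""
    if isoprene_units < 1:
        raise ValueError("at least one isoprene unit is required")
    carbons = CARBONS_PER_UNIT * isoprene_units
    hydrogens = HYDROGENS_PER_UNIT * isoprene_units
    return f"C{carbons}H{hydrogens}"


def terpene_molar_mass(isoprene_units):
    """Return the molar mass in g/mol of the hydrocarbon (C5H8)n."""
    if isoprene_units < 1:
        raise ValueError("at least one isoprene unit is required")
    carbons = CARBONS_PER_UNIT * isoprene_units
    hydrogens = HYDROGENS_PER_UNIT * isoprene_units
    return carbons * ATOMIC_MASS["C"] + hydrogens * ATOMIC_MASS["H"]


if __name__ == "__main__":
    for n in (2, 3, 6):
        print(n, terpene_formula(n), round(terpene_molar_mass(n), 2))
```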

Phenols
Phenolic compounds are secondary metabolites found in most plants and normally comprise one or more hydroxyl groups directly attached to one or more aromatic hydrocarbons [143]. Phenols occur in many lower and higher plants, medicinal plants/herbs, and dietary herbs [144], and their distribution is mainly governed by the physiological roles they play within the plant [143,145].
There are up to nine groups of compounds classified as phenols: phenolic acids, phenolic acid analogs, flavonoids, tannins, stilbenes, curcuminoids, coumarins, lignans, and quinones [144,146]. Despite the wide occurrence of phenols in higher plants, only phenolic acids, phenolic acid analogs, flavonoids, tannins, and coumarins have been reported to occur in cotton seeds (41 ppm), bracts (22.6 ppm), leaves (21.6 ppm), and roots [19]. Phenolic compounds are synthesized within the chloroplast of plant cells through a series of reactions which are preceded by the synthesis of the aromatic amino acids tyrosine and phenylalanine via the shikimate-chorismate pathway. This pathway involves reactions between phosphoenolpyruvate (a by-product of glycolysis) and erythrose 4-phosphate (a by-product of the oxidative pentose phosphate pathway). These two aromatic amino acids, regarded as the major precursors in the synthesis of phenolic compounds, undergo a series of reactions via the phenylpropanoid pathway, resulting in different classes of phenolic compounds (Figure 9). Several other key enzymes are implicated in the conversion of phenols from one class to another.
Figure 9. The generalised biosynthetic pathway of phenolic compounds.

Flavonoids
Flavonoids are the most abundant class of phenolic compounds. Huang, Cai and Zhang [144] reported that over 4000 flavonoids occur in nature, while Cheynier [143] suggested that the number is closer to 8000. Flavonoids derive their name from the Latin word "flavus", meaning "yellow", because of their prevalent yellow colour, and they are largely responsible for the colours of flowers, leaves, barks, fruits, and seeds of most plant species [134]. Flavonoids have a basic skeletal structure of phenyl benzopyrone (C6-C3-C6), comprising two aromatic rings linked by three carbon atoms. Flavonoids occur as free compounds, e.g., quercetin (Figure 10), or as glycosides combined with different sugars [144], e.g., kaempferol 3-glycosides and quercetin 3-glycosides.
There are several different classes of flavonoids, such as the flavones, flavonols, isoflavones, aurones, anthocyanins, biflavonoids, flavanols, and flavanones [134,147] (Figure 11). These classes differ slightly in their chemical structures. The flavonols possess hydroxyl side groups, which distinguishes them from the flavones. Isoflavones differ from flavones in the location of the phenyl group, whereas the anthocyanins differ from other flavonoids by carrying a positive charge. Biflavonoids have a general formula of (C6-C3-C6)2, and aurones possess a chalcone-like group instead of the six-membered ring typical of flavonoids. Several of these flavonoids have been identified in cotton, including flavones and flavonols, which mostly occur as glycosides located in flowers, leaves, and seeds [19]. The most common flavonoids in cotton are glycosides of kaempferol, quercetin, and herbacetin (Figure 12). Flavonoid glycosides are water- and ethanol-soluble, while free flavonoids are only soluble in organic solvents [134].

Phenolic Acids and Analogs
Phenolic acids and their analogs are another group of phenolics that occur in cotton. The hydroxybenzoic acids (HDBA), such as gallic acid (Figure 13), p-hydroxybenzoic acid, protocatechuic acid, and others listed in Table 1, are common secondary metabolites in cotton, as well as being the predominant phenolic acids in nature. The hydroxycinnamic acids (HDCA) are hydroxyl derivatives of cinnamic acid with a basic C6-C3 structure. HDCA identified in cotton plants include chlorogenic acid, ferulic acid (Figure 13), and p-coumaric acid, which are precursors in the biosynthetic pathway to other phenolic compounds such as the lignins, coumarins, and flavonoids [144]. Most phenolic acids have a bitter taste and presumably contribute to the bitter taste of cottonseed products [19]. Gossypol, gossypurpurin, gossyrubilone, and other phenolic acid analogs [30,144] presented in Figure 14 are common secondary compounds isolated from cotton seeds, and they are believed to occur in other parts of the cotton plant as well.

Tannins and Coumarins
Tannins are a large class of water-soluble polyphenolic compounds with molecular weights in the range of 500-4000 g/mol. Plant tannins are divided into two classes: the hydrolysable tannins, which derive their base unit from gallic acid, and the condensed tannins (also termed flavonoid or non-hydrolysable tannins), which arise from proanthocyanidins (condensed flavanols) [144]. Condensed tannins are normally found in combination with alkaloids, polysaccharides, or proteins. This is the class of tannins reported to occur in cotton [148], where they act as pesticides, protecting the cotton plant against predators [19]. The coumarins are another group of phenolic compounds isolated from cotton. Scopoletin, a coumarin derivative, and its glycoside, scopolin (Figure 15), have been identified in cotton plant tissue, confirming the report that coumarins occur in the free form and as glycosides in cotton, as well as in other plants [144].

Fatty Acids, Carbohydrates and Proteins
Fatty acids are carboxylic acids with long aliphatic chains that are synthesized in the cytosol of plant cells from malonyl-CoA, which in turn is derived from acetyl-CoA. Palmitic acid (Figure 16) is a base fatty acid from which other fatty acids are formed by the addition or removal of two-carbon units. The synthesis of palmitic acid (Figure 17) from the precursor malonyl-CoA follows a five-step repeating cycle of acylation, condensation, reduction, dehydration, and reduction, which is catalyzed by the fatty acid synthase complex [149,150].
Saturated fatty acids which occur in the cotton plant include myristic acid (tetradecanoic acid), melissic acid (triacontanoic acid), palmitic acid (hexadecanoic acid), and stearic acid (octadecanoic acid); the monounsaturated palmitoleic acid (9-hexadecenoic acid) has also been reported [18,19,151]. Other unsaturated fatty acids identified in cotton include eicosadienoic acid, linoleic acid (octadecadienoic acid), linolenic acid (octadecatrienoic acid), and elaidic acid (octadecenoic acid) [18,19]. Most fatty acids identified in cotton are free fatty acids (not linked to other molecules) and serve as a source of energy for plant growth [19].
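The two-carbon bookkeeping behind the repeating cycle described above can be made explicit with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes the textbook stoichiometry of one acetyl-CoA primer, one malonyl-CoA per elongation cycle, and two NADPH per cycle (one for each of the two reduction steps), which is not stated in the cited sources.

```python
def fas_requirements(chain_length: int) -> dict:
    """Estimate substrate needs for building an even-chain saturated fatty acid.

    Assumption (textbook stoichiometry, not taken from the review): synthesis
    starts from a 2-carbon acetyl primer and each cycle adds 2 carbons from
    one malonyl-CoA, consuming 2 NADPH (one per reduction step).
    """
    if chain_length < 4 or chain_length % 2 != 0:
        raise ValueError("chain_length must be an even number of carbons >= 4")
    cycles = (chain_length - 2) // 2  # 2-carbon additions after the primer
    return {
        "cycles": cycles,
        "acetyl_coa_primer": 1,
        "malonyl_coa": cycles,
        "nadph": 2 * cycles,
    }


if __name__ == "__main__":
    # Chain lengths of saturated fatty acids named in the text.
    for name, carbons in [("myristic", 14), ("palmitic", 16), ("stearic", 18)]:
        print(name, fas_requirements(carbons))
```

Under these assumptions, palmitic acid (16 carbons) corresponds to seven turns of the cycle. In practice, longer chains such as stearic acid are usually described as elongation products of palmitic acid, so the count for those acids is arithmetic rather than a statement about which enzyme performs each step.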
Cotton, like all plants, contains cellulose and hemicelluloses, the proportions of which vary between different parts of the plant. Cotton fibre itself consists mainly of cellulose, at levels greater than 94% by weight [25]. Raffinose is a unique minor sugar found in cotton plants, predominantly in the seed [117,118].
Alkali- and water-soluble proteins are also found in cotton [152], including the water-soluble globular proteins vicilin and legumin (Table 1) present in the seeds of cotton [20]. Proline-rich protein H6 is involved in the development of the cell wall structure of cotton fibre [153,154].

Genotypes and Varieties
Cotton plants can be categorised as glanded or glandless. Glanded cotton contains pigment glands, distributed in the tissues and organs of the plant, which are rich in gossypol and terpenoid aldehydes [155]. Glandless cotton was developed from the wild-type glanded cotton by McMichael [156] in order to tackle the challenge of extracting gossypol from cottonseed and cottonseed oil [157]. Since then, different varieties of glandless and glanded cotton have been developed, but the absence of pigment glands has made glandless cotton susceptible to infection and pest infestation [155,158]. Glanded cotton contains more proteins, fatty acids, sugars, and terpenoids than glandless cotton [159], with very little variation between varieties within each group when cultivated under the same environmental conditions [151,160]. However, Dowd et al. [151] found that variation in fatty acid composition was influenced more by genotype than by environmental factors, with up to 62.4% of the variation in palmitoleic acid content being controlled by genotype and only 5.4% of the variation in linoleic acid induced by environment.

Differences between Non-Transgenic and Transgenic Cotton
Transgenic Bt cotton is widely cultivated [161,162], which has led to interest in whether the modification alters the chemical composition or nutritional value of the plant. Yan, et al. [163] reported that all chemical compounds present in non-transgenic cotton were also present in transgenic cotton, and indicated that Bt cotton contained higher concentrations of some monoterpenes, e.g., α- and β-pinene, and lower concentrations of myrcene and ocimene when compared to non-transgenic cotton. It has been suggested that the increased production of pinene in Bt cotton can be attributed to the activity of genes which cause the plant to repel insects/pests [164][165][166]. Nutritional evaluation of Bt cotton relative to non-transgenic cotton by Mohanta, et al. [167] revealed slight variations in the concentrations of proteins and carbohydrates between the two types of cotton. Overall, these findings suggest that transgenic cotton differs only slightly from non-transgenic cotton in its general composition of proximate constituents (moisture content, crude fat, and total ash), fibres, minerals, and secondary metabolites.

Pharmacological Properties of Compounds in Cotton
Several studies have emphasized the importance of plants to the pharmaceutical and medical industry [168][169][170]. Cotton is described as a medicinal plant because of the chemical compounds that have been isolated from it [21,83]. A number of compounds found in cotton play pharmacological roles in nature (Table 2) including anti-microbial, anti-inflammatory, cytotoxic, anti-cancer, and contraceptive roles in both humans and animals. Monoterpenes such as myrcene, pinene, camphene, limonene, and sabinene isolated from cotton possess anti-microbial, anti-inflammatory, anti-cancer, anti-oxidant, and gastro-protective properties [28,171,172].

Anti-Microbial Properties
In vitro and in vivo studies have found that compounds derived from cotton elicit various effects in most experimental cells and animals. Monoterpenes such as pinene, present in the leaves of cotton, possess anti-microbial activity against fungi and bacteria. Concentrations as low as 5 µg/mL and 117 µg/mL were reported to have anti-microbial activity towards bacteria and fungi, respectively [173,174]. Only the positive enantiomers of the compound induced this effect. Another compound present in the leaves of cotton is the phenolic acid 4-hydroxybenzoic acid, which has anti-microbial properties against Gram-positive and Gram-negative bacteria at an IC50 value of 160 µg/mL [175]. The degree of anti-microbial activity of these compounds varies across micro-organisms. This was observed in fungal toxicity assays with 4-hydroxybenzoic acid on Ganoderma boninense, where it was active at concentrations as low as 0.5-2.5 µg/mL [176].

Anti-Inflammatory and Anti-Oxidant Properties
Compounds such as trans-caryophyllene, caryophyllene oxide, α-humulene, and β-amyrin exert different anti-inflammatory properties. α-Humulene and trans-caryophyllene are reported to prevent chemical-induced paw oedema in rats, with 50 mg/kg of either compound inducing the same anti-inflammatory effect as 0.5 mg/kg of dexamethasone (a steroid anti-inflammatory medication) [32]. At doses of 12 mg/kg and 25 mg/kg body weight in experimental mice, caryophyllene oxide induced anti-inflammatory and analgesic effects almost equivalent to those of aspirin at a dose of 100 mg/kg body weight [177]. In studies using human peripheral blood mononuclear cells (PBMCs), β-amyrin at 1, 2, and 5 µg/mL promoted the secretion of the cytokine IL-6 [178], which is actively involved in both pro-inflammatory and anti-inflammatory immune responses. In vitro studies using human blood cells showed that β-amyrin and farnesene induced anti-oxidant activities in a time-dependent manner at doses as low as 1 µg/mL [178] and 100 µg/mL [179], respectively.

Table 2. Biological activities of different compounds present in cotton.

Cytotoxic and Contraceptive Properties
Cytotoxic activities associated with compounds isolated from cotton are mostly reported in relation to cancer cell lines. α-Bisabolol, a common compound present in cotton, can induce apoptosis in malignant carcinoma cell lines without affecting the viability of healthy cells [184]. A dose of 2 µM of α-bisabolol is reported to be effective against cancer cell lines, but increasing the dose to 50-250 µM can induce cytotoxicity in normal cells. Another sesquiterpene, caryophyllene oxide, also exhibits cytotoxic properties against cancer cell lines, with a minimum dose of 3.125 µM reducing the viability of the target cells and the effect becoming more pronounced as the dose increases [185]. Gossypol is a major compound present in cottonseed oil and other parts of the cotton plant and has been found to have contraceptive properties in mammals. In human males, a dose of 0.3 mg/kg of body weight can induce azoospermia in a time-dependent manner, whereas in male rats a dose of 30 mg/kg is needed to induce the equivalent effect [192]. The contraceptive property of gossypol is not restricted to males: a study by Randel, Chase, and Wyse [193] indicated that this compound, administered at a dose of 40 mg/kg body weight to female mammals, induces abnormal oestrous cycles and reduced pregnancy rates.
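The cytotoxicity doses above are quoted in µM, whereas the anti-microbial and anti-oxidant doses elsewhere in this section are quoted in µg/mL, which makes direct comparison awkward. The following is a minimal sketch of the unit conversion. The molar masses are assumptions calculated here from the molecular formulas C15H26O (α-bisabolol) and C15H24O (caryophyllene oxide); they are not values given in the cited studies, and the function and dictionary names are hypothetical.

```python
# Assumed molar masses (g/mol), calculated from the molecular formulas
# C15H26O and C15H24O; not taken from the cited studies.
MOLAR_MASS_G_PER_MOL = {
    "alpha-bisabolol": 222.37,
    "caryophyllene oxide": 220.35,
}


def micromolar_to_ug_per_ml(conc_um: float, molar_mass: float) -> float:
    """Convert a concentration in µM to µg/mL.

    1 µmol/L multiplied by a molar mass in g/mol gives µg/L;
    dividing by 1000 converts µg/L to µg/mL.
    """
    return conc_um * molar_mass / 1000.0


if __name__ == "__main__":
    doses_um = [
        ("alpha-bisabolol", 2.0),          # dose reported as effective
        ("alpha-bisabolol", 50.0),         # lower bound of cytotoxic range
        ("alpha-bisabolol", 250.0),        # upper bound of cytotoxic range
        ("caryophyllene oxide", 3.125),    # minimum reported dose
    ]
    for compound, dose in doses_um:
        mass_conc = micromolar_to_ug_per_ml(dose, MOLAR_MASS_G_PER_MOL[compound])
        print(f"{compound}: {dose} µM = {mass_conc:.3f} µg/mL")
```

On these assumptions, the 2 µM dose of α-bisabolol corresponds to somewhat less than 0.5 µg/mL, which is lower than the µg/mL doses quoted for the anti-microbial and anti-oxidant assays. Such a comparison is only indicative, because the assays, cell types, and endpoints differ.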

Conclusions
This review has shown that the whole cotton plant is a reservoir of a wide variety of compounds which have a range of biological functions and exploitable applications. The distribution of compounds in the cotton plant provides knowledge of the chemical content of cotton waste derived from harvesting and ginning operations. Potentially valuable chemical compounds with applications in the food manufacturing, perfumery, and pharmaceutical industries are found in components of these cotton processing by-products (burr, leaves, crushed seeds, sticks, roots, and flowers of the cotton plant). Gossypol, which is known to have contraceptive properties, is not only concentrated in the seeds of cotton but also occurs in the roots and possibly in other parts of the plant. Phenolic compounds and terpenes present in the cotton burr, stem, leaves, flowers, stalks, and roots have insecticidal, herbicidal, and phytotoxic properties that could be exploited. Cotton waste products can therefore be sources of biologically valuable compounds. Special consideration should be given to cotton gin trash (CGT) as a low-cost resource because it is centrally stockpiled and co-located with existing infrastructure. Investigating the occurrence of these chemical compounds in cotton by-products can thus contribute to recycling and adding value to the waste generated from cotton ginning.