The Role of Somaclonal Variation in Plant Genetic Improvement: A Systematic Review

Abstract: The instability of in vitro cultures may cause genetic and epigenetic changes in crops called somaclonal variations. Sometimes, these changes produce beneficial effects; for example, they can be used in breeding programs to generate new cultivars with desirable characteristics. In this article, we present a systematic review designed to answer the following question: How does somaclonal variation contribute to plant genetic improvement? Five electronic databases were searched for articles based on pre-established inclusion and exclusion criteria and with a standardized search string. The somaclonal variation technique has been most frequently applied to ornamental plants, with 49 species cited in 48 articles, and to the main agricultural crops, including sugarcane, rice, banana, potato and wheat, in different countries worldwide. In 69 studies, a technique was applied to evaluate the genetic diversity generated between clones, and, in 63 studies, agronomic performance characteristics were evaluated. Other studies are related to resistance to pathogens, ornamental characteristics and resistance to abiotic stresses. The application of the plant growth regulators (PGRs) benzylaminopurine (BAP) and dichlorophenoxyacetic acid (2,4-D) was the most common method for generating somaclones, and randomly amplified polymorphic DNA (RAPD) molecular markers were the most commonly used markers for identification and characterization. Somaclonal variation has been used in genetic improvement programs for the most economically important crops in the world, generating genetic diversity and supporting the launch of new genotypes resistant to diseases, pests and abiotic stresses. However, much remains to be explored, such as the genetic and epigenetic mechanisms from which somaclonal variation is derived.


Introduction
Plant diseases caused by phytopathogens cost the global economy more than 220 billion dollars annually [1]. At least 70 billion dollars are lost due to invasive pests worldwide, not to mention the loss of biodiversity caused by pathogens. In addition, abiotic factors such as water deficit, salinity and temperature extremes cause approximately 30 billion dollars in losses to global agriculture. This reality threatens the food security of several countries and harms small farmers and individuals living in regions where food security has not yet been achieved [1].
Therefore, genetic improvement programs seek ways to reduce the impacts caused by diseases, pests and abiotic stresses on agricultural crops through the development of resistant or tolerant cultivars. In order to achieve this goal, different strategies are used. Plant cell and tissue culture is traditionally used for the production, conservation and improvement of plant resources through an asexual process in which clonal multiplication is expected to generate genetically uniform plants [2,3]. However, Braun [4] made the first observation and report of variation originating in cell and tissue cultures, later defined as somaclonal variation [5]. Maintaining the genetic fidelity of plants regenerated from in vitro tissue culture has therefore been a longstanding problem [6]. However, in 1981, Larkin and Scowcroft identified somaclonal variation as a potential source of crop improvement, and this was later documented by other researchers [3,7,8].
Since then, new somaclones from different cultures with characteristics useful for breeding, such as resistance to pathogens, tolerance to abiotic stresses and high productivity, have been launched [9][10][11][12][13]. Somaclonal variation, in which clones of genetically identical plants have different phenotypes after regeneration, was observed in most explants subjected to micropropagation. It is more evident when cells are propagated in culture for long periods of time and when explants or micropropagated plants undergo several successive subcultures. The first studies involved genetic and epigenetic variations, which led to the hypothesis that plant growth hormones, such as auxins and cytokinins, could be responsible for these genetic changes observed in plants [14][15][16].
Rai [17] discussed the source and genetic basis of somaclonal variation, its detection methods and the advantages of this tool for agriculture, with the main emphasis on some useful somaclonal variants released as cultivars. Other studies have reviewed the potential application of somaclonal variants in the improvement of horticultural crops [18] and described the current status of understanding the genetic and epigenetic changes that occur during tissue culture [19]. To summarize the current status of knowledge generated on somaclonal variation in plant breeding, this article presents a systematic review (SR) of studies conducted in the last 16 years. The approach presented here makes use of the SR tool, which provides a summary of all the relevant evidence available on the applications of this tool in plant breeding. The following are presented: the main countries that work on somaclonal variation, the somaclones of various cultures generated globally, the purposes of the generated somaclones, the methods for induction of somaclonal variation, the number of subcultures, the PGRs most used in the induction of somaclonal variation and their doses, the explants preferentially used, the main phenotypic characteristics observed in the somaclones, the molecular markers frequently used to detect somaclonal variation and information on the gene expression of some of the somaclones generated.

Materials and Methods
This review was constructed based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines using the open access software State of the Art by SR (Start) v.3.3 Beta 03; the three main steps used were planning, execution and summarization.
In the planning stage, a protocol was built (https://doi.org/10.5281/zenodo.7674327, accessed on 12 February 2023) to monitor the entire review process. The following features were defined: title, objective, keywords, research questions, research sources, research period covered and criteria for the inclusion/exclusion of articles. The main research question guiding the SR was as follows: How does the somaclonal variation technique contribute to plant genetic improvement? Based on this question, the secondary questions, which are described in Table 1, were defined.
Table 1. List of questions about the use of somaclonal variation as a tool in the genetic improvement of agricultural crops to be answered by a systematic review of articles published in the last fifteen years.
Research Questions
Q1. In which cultures has the somaclonal variation technique been applied?
Q2. For what purposes is the somaclonal variation technique applied?
Q3. What PGRs and doses are most used to generate somaclonal variants?
Q4. How many subcultures were made to generate somaclones?
Q5. In which countries is the somaclonal variation technique most often applied?
Q6. Which somaclones have already been generated?
Q7. What are the most frequent changes observed in the phenotypic characteristics of somaclones?
Q8. What molecular tools are used to characterize somaclonal variants?
The execution stage consisted of three phases: research, selection and extraction. The electronic searches were performed using a search string defined with the following keywords: "plant breeding" AND "somaclonal" OR "somaclone variation". This search string was designed to cover the largest possible number of articles on the subject and was used to identify articles in five databases: Web of Science (http://apps.isiknowledge.com), PubMed (http://www.ncbi.nlm.nih.gov/pubmed), Springer (https://www.springer.com/br), Portal of Journals CAPES (http://www.periodicos.capes.gov.br/) and Google Scholar (https://scholar.google.com.br/schhp?hl=en&as_sdt=0,5) (all accessed on 15 February 2020). Each database was searched for articles published over a period of 16 years. Some documents were considered relevant but were published after the selection stage, so they were added manually. The results were exported in the BIBTEX, MEDLINE or RIS formats, which are compatible with Start software.
We used a protocol for the development of the SR, and the search terms were based on the PICOS components (i.e., population, intervention, comparison, outcome and study type) [20] (Table 2). Initially, in the selection phase, only the title, abstract and keywords were read, and the articles that contained the terms defined in the search string within these features were selected. In the extraction phase, the articles were read in full and were accepted according to the predefined inclusion (I) and exclusion (E) criteria: (I) articles that contain in the title, abstract or keywords the terms plant breeding and somaclonal or somaclonal variation; (E) articles published in languages other than English; (E) articles that deviate from the topic; (E) review articles; (E) theses, dissertations and manuals; (E) book chapters; (E) articles published in annals of events; and (E) articles on the evaluation of plant fidelity after in vitro multiplication.
In the summarization step, graphs, tables, word clouds and bibliometric maps were generated to compose the SR. The frequencies of articles were calculated for the questions described in Table 1. The graphs were generated in R software [21] with the ggplot2 and dplyr packages. The bibliometric analyses were performed using VOSviewer 1.6.17 software [22].
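The mechanical part of the inclusion and exclusion criteria can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure implemented in the Start software: the `Record` fields, the `EXCLUDED_TYPES` labels and the `screen` function are hypothetical names, and the search string is read here as "plant breeding" AND ("somaclonal" OR "somaclone variation"), which is an assumption about operator grouping. Criteria that require reading the article (deviation from the topic, fidelity-only studies) are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical bibliographic record exported from a database search."""
    title: str
    abstract: str
    keywords: str
    language: str = "English"
    doc_type: str = "article"


# Document types named in the exclusion criteria (labels are assumptions).
EXCLUDED_TYPES = {"review", "thesis", "dissertation", "manual",
                  "book chapter", "proceedings"}


def matches_search_terms(record: Record) -> bool:
    """Check the search terms in title, abstract and keywords.

    Assumed grouping: "plant breeding" AND ("somaclonal" OR "somaclone variation").
    """
    text = " ".join([record.title, record.abstract, record.keywords]).lower()
    has_breeding = "plant breeding" in text
    has_somaclonal = "somaclonal" in text or "somaclone variation" in text
    return has_breeding and has_somaclonal


def screen(record: Record) -> str:
    """Apply the inclusion criterion first, then the mechanical exclusions."""
    if not matches_search_terms(record):
        return "excluded: search terms absent"
    if record.language != "English":
        return "excluded: language"
    if record.doc_type in EXCLUDED_TYPES:
        return "excluded: document type"
    return "included"
```

A record that passes this filter would still need full-text reading before acceptance, as described for the extraction phase.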

Risk of bias
To evaluate the risk of bias among the articles selected for this SR, we adapted the Cochrane risk of bias tool protocol [23]. Three authors (MSF, AJR and FSN) evaluated the quality of the methods used to select the included studies, and the questions used to assess the risk of bias were the same as those developed for the protocol (found in Table 1). The studies were classified according to the number of questions answered that contributed to the SR. Three classifications were adopted:
1. Low risk of bias (low): articles that answered 100% of the proposed questions.
2. Moderate risk of bias (moderate): articles that answered up to 60% of the questions.
3. High risk of bias (high): articles that answered up to 30% of the questions.
In addition, all the PRISMA guidelines were carefully followed; the PRISMA checklist is available for download at https://doi.org/10.5281/zenodo.7674859 (accessed on 20 February 2020).
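The three-level classification can be illustrated with a short Python sketch. The function name `classify_risk` is hypothetical, and the default of eight questions follows Table 1. The text does not say how articles answering more than 60% but fewer than 100% of the questions were handled; treating them as moderate is an assumption made only for this sketch.

```python
def classify_risk(answered: int, total: int = 8) -> str:
    """Classify an article by the share of research questions it answers.

    Thresholds follow the text: 100% -> low, up to 30% -> high.
    Assumption: every share above 30% and below 100% is treated as moderate.
    """
    if not 0 <= answered <= total:
        raise ValueError("answered must be between 0 and total")
    fraction = answered / total
    if fraction == 1.0:
        return "low"
    if fraction <= 0.30:
        return "high"
    return "moderate"


if __name__ == "__main__":
    for n in (8, 4, 2):
        print(n, "questions answered:", classify_risk(n))
```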

Screening of Studies
Figure 1 represents the PRISMA flow diagram used to screen the articles analysed in this review. The Web of Science was the database that contributed most to this review, with 1192 articles (27%). PubMed Central contributed 1069 articles (25%), followed by Google Scholar with 1010 (23%), Springer with 997 (23%) and the CAPES journal portal with 75 (2%) articles. Eleven important articles were manually added to this review because they reported the generation and study of somaclones with resistance to diseases, abiotic stresses and agronomic and molecular aspects [12,13,15,24-31]. In total, 4351 articles were identified in the databases, of which 882 were duplicates and 3725 were eliminated in the selection process. In the extraction phase, 629 articles were read in full, and 410 were excluded because they did not meet the inclusion criteria. A total of 219 articles were selected for this SR. The manuscripts were stored in an open access digital library available at https://doi.org/10.5281/zenodo.7641768 (accessed on 22 February 2023).
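The database shares and the final count of accepted articles can be checked with simple arithmetic. The Python sketch below uses the per-database counts given in the text; the variable names are illustrative, and the shares are computed relative to the sum of the five database counts, which is an assumption about the denominator used.

```python
# Articles retrieved per database, as reported in the text.
database_hits = {
    "Web of Science": 1192,
    "PubMed Central": 1069,
    "Google Scholar": 1010,
    "Springer": 997,
    "CAPES journal portal": 75,
}

# Assumption: percentages are relative to the sum of these five counts.
total_hits = sum(database_hits.values())
shares = {name: round(100 * n / total_hits) for name, n in database_hits.items()}

# Extraction phase: articles read in full minus those excluded.
read_in_full = 629
excluded_at_extraction = 410
accepted = read_in_full - excluded_at_extraction

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
    print("Accepted articles:", accepted)
```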

Bibliometric Analysis
A bibliometric map was made from the titles of the accepted articles (n = 219) (Figure 2A). There was a predominance of the terms somaclonal variation, somaclonal variant and somaclone between 2010 and 2015, which indicates a trend of publications during this period. The term RAPD (Randomly Amplified Polymorphic DNA) was also predominant in studies published between 2005 and 2015, showing that this molecular technique was used in previous studies and that new approaches related to molecular markers are possibly being adopted nowadays (Figure 2A). A second bibliometric map revealed the five journals with the largest numbers of publications on the theme of somaclonal variation; Plant Cell, Tissue and Organ Culture had the most publications, followed by the African Journal of Science and Technology, In Vitro Cellular and Developmental Biology-Plant, Plant Cell Reports and Euphytica (Figure 2B).

Main Countries and Cultures Evaluated
Studies on somaclonal variation in plant breeding were found in 42 countries, but most are concentrated in India (43) (Figure 3). Other countries that published a relatively high number of articles on the subject were Pakistan (18), China (18), Egypt (14), Brazil (12), Iran (11), the United States (10), Poland (10) and South Korea (9). Countries with fewer than 10 published articles are represented in bright green in the map shown in Figure 3. Regarding the agricultural crops studied, 82 species were evaluated, separated by crop types and summarized in Table S1. The plant species that are among the 10 most important crops in terms of production, according to data from the Food and Agriculture Organization (FAO) of the United Nations, were not separated. The other species were classified by cultivation type: fruits (9 species and 26 articles); forage, grasses and cereals (16 species and 21 articles); vegetables, roots and tubers (7 species and 17 articles); medicinal (13 species and 15 articles); condiments and spices (4 species and 9 articles); and ornamental (24 species and 45 articles) (Table S1). The most studied species were sugarcane (30), rice (18), banana (13), potato (10) and wheat (11) (Table S1).

In India, the largest numbers of studies have been conducted on sugarcane (14), medicinal plants (8) and forage, grasses and cereals (7); in Pakistan, sugarcane (14) and potato (5); in China, rice (6) and ornamental plants (5); in Egypt, potato (3), vegetables, roots and tubers (3) and wheat (3); in Iran, fruits (4); in Brazil, ornamental plants (7) and fruits (2); in the United States and South Korea, ornamental plants (6 and 4, respectively); and in Poland, vegetables, roots and tubers (Figure 3).

Methods for Inducing Somaclonal Variation
Regarding the method used to induce somaclonal variation, 154 articles mentioned only PGRs to induce variation. In 65 articles, previously generated somaclones were studied, and the method used for their generation was not reported (Figure 5). A higher number of studies was directed to evaluate the somaclones in the context of existing genetic diversity (69), followed by studies on agronomic traits for genetic improvement (63), pathogen-resistant somaclones (29), somaclones with ornamental characteristics (22), tolerance to salinity (17), tolerance to abiotic stress (10) and tolerance to water deficit (9) (Figure 5).
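As a consistency check, the counts by induction method and by study purpose reported above can be tallied against the 219 accepted articles. The Python sketch below is illustrative only; the dictionary keys are shortened labels chosen for this example.

```python
# Articles by induction method, as reported in the text.
by_method = {"PGRs only": 154, "method not reported": 65}

# Articles by study purpose, as reported in the text.
by_purpose = {
    "genetic diversity": 69,
    "agronomic traits": 63,
    "pathogen resistance": 29,
    "ornamental characteristics": 22,
    "salinity tolerance": 17,
    "abiotic stress tolerance": 10,
    "water deficit tolerance": 9,
}

total_by_method = sum(by_method.values())
total_by_purpose = sum(by_purpose.values())

# Share of each purpose among all accepted articles, in percent.
purpose_share = {k: round(100 * v / total_by_purpose, 1) for k, v in by_purpose.items()}

if __name__ == "__main__":
    print("Total by method:", total_by_method)
    print("Total by purpose:", total_by_purpose)
    for purpose, pct in sorted(purpose_share.items(), key=lambda kv: -kv[1]):
        print(f"{purpose}: {pct}%")
```

Both tallies sum to the same total, so each accepted article is counted under exactly one method category and one purpose category.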

[...] (KIN/KT); 23 reported indoleacetic acid (IAA); 15 reported indole-3-butyric acid (IBA); and 12 reported thidiazuron (TDZ). Sixty-five articles did not mention the use of PGRs, as they evaluated only somaclones previously generated in other studies (Figure 6). The most used PGRs to generate somaclones with desirable agronomic characteristics in molecular studies of genetic diversity and pathogen resistance were BAP, 2,4-D and NAA, respectively (Figure 6). IAA was mainly used to promote variations related to resistance to pathogens; KIN, IBA and TDZ were used to induce variation in order to obtain the molecular characteristics of genetic and agronomic variability generated in somaclones (Figure 6).
There was high variation between the doses of the PGRs applied in the different manuscripts, varying from 0.01 mg/L to 16 mg/L (Figure 7). In general, the most reported doses differed between PGRs; BAP had the highest number of different doses applied per manuscript, followed by 2,4-D and NAA (Figure 7). The most applied doses for BAP were 1 mg/L (23), 2 mg/L (21), 0.05 mg/L (17) and 3 mg/L (9). For 2,4-D, the most applied doses were 2 mg/L (26), 1 mg/L (18) and 3 mg/L (10) (Figure 8). The most applied doses for NAA were 1 mg/L (13), 0.05 mg/L (11), 2 mg/L (9) and 0.1 mg/L (8). KIN was mostly applied in doses of 0.05 mg/L (8), 1 mg/L (8) and 2 mg/L (7); IAA was preferably applied in doses of 2 mg/L (8) and 1 mg/L (5). The most applied doses for TDZ and IBA were 1 mg/L (8, 6) and 2 mg/L (4, 5), respectively (Figure 7).
Of the articles included in this SR, 17 referred to the time of subculture in months or years, ranging from one month to 40 years. Five studies reported that the subcultures were carried out for one month; subculture times of two months, four months and two years were each reported in three articles. The other subculture times, namely 40, 14 and 10 years and 8 months, were each reported in only one article (Figure 8). The studies that made clear the number of subcultures totaled 38; within these studies, the highest number recorded was 25 subcultures, and the lowest was only 2 subcultures (Figure 8). The most frequently recorded numbers of subcultures were three (7), four (5) and five (5).
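The dose frequencies reported in this section for each PGR can be summarised by their modal dose. The following Python sketch is only illustrative: the `dose_counts` structure and the `modal_doses` helper are hypothetical names, and only the doses and article counts explicitly listed in the text are included.

```python
# Dose (mg/L) -> number of articles, for the doses listed in the text.
dose_counts = {
    "BAP": {1.0: 23, 2.0: 21, 0.05: 17, 3.0: 9},
    "2,4-D": {2.0: 26, 1.0: 18, 3.0: 10},
    "NAA": {1.0: 13, 0.05: 11, 2.0: 9, 0.1: 8},
    "KIN": {0.05: 8, 1.0: 8, 2.0: 7},
    "IAA": {2.0: 8, 1.0: 5},
    "TDZ": {1.0: 8, 2.0: 4},
    "IBA": {1.0: 6, 2.0: 5},
}


def modal_doses(counts: dict[float, int]) -> list[float]:
    """Return the dose(s) reported by the largest number of articles.

    Ties are kept, so the result may contain more than one dose.
    """
    top = max(counts.values())
    return sorted(dose for dose, n in counts.items() if n == top)


if __name__ == "__main__":
    for pgr, counts in dose_counts.items():
        doses = ", ".join(f"{d} mg/L" for d in modal_doses(counts))
        print(f"{pgr}: {doses}")
```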

Types of Explants
Among the sources of explants used, most articles mentioned leaves, except in studies of the species Vitis vinifera, Vanilla planifolia, Pisum sativum, Pennisetum glaucum and plants belonging to the families Poaceae and Orchidaceae. Seeds were the second most used source of explants, and this type of explant was most common among species belonging to the family Orchidaceae, Triticum species and other crops. In the articles included, the most reported cultures where somaclones were produced include Saccharum officinarum, species belonging to the Orchidaceae family and species belonging to the genus Musa. Leaves, seeds and rhizomes were also used as sources of explant (Figure 9).

Phenotypic Modifications
Regarding the most frequent phenotypic modifications in somaclones, 69 studies described phenotypic modifications caused by genetic variation in several cultures (Table 3). Phenotypic changes were observed in plant structure, pigmentation, roots, stems, pseudostems, flowers, leaves, fruits and seeds. Several studies have described morphological changes in leaves, especially changes in colour and length, as detailed in Table 3. Regarding the plant structure, the articles that reported phenotypic changes referred to the presence of dwarf plants in different crops, such as pineapple, coffee and banana (Table 3).

Fruits and Seeds
Crop | Phenotypic change | Reference
[...] | Number of fruits | [80]
Strawberry (Fragaria × ananassa) | Number of fruits, fruit shape and difference in texture | [81]
Chili pepper (Capsicum annuum L.) | Number of fruits and total production of fresh and dried fruits | [82]
Grass pea (Lathyrus sativus L.) | Pod width and length, number of pods/plant, number of seeds/pod | [83]
Musa cv. 'Grand Naine' | Bunch length | [84]
Tomato (Lycopersicon esculentum Mill.) | Number of bunches, number of fruits/plant, fruit firmness and fruit weight | [85]
Wheat (Triticum aestivum L.) | Ear length and grain yield | [86]
Rice (Oryza sativa L.) cv PR113 | Grains per panicle, grain weight and grain yield per plant | [87]
Sorghum (Sorghum bicolor L.) | Increase in seed size and grain yield | [88]
Millet (Eleusine coracana) | Grain yield per plant | [89]
Tomato (Lycopersicon esculentum Mill.) | Number of fruits | [90]

In relation to changes in pigmentation, albino phenotypes were documented only in millet and wheat crops. In the roots, the potato crop showed a reduction in number and conformity, and date palm and wheat crops showed an increase in root length (Table 3). Changes in the stems were reported mainly for sugarcane, where phenotypes with colour variation, smaller or larger diameter, greater length and an increased number of internodes were described (Table 3).
Phenotypic changes in the pseudostem were observed only for banana genotypes, with increases in length and variations in colour. In relation to leaves, the alterations were reported mainly in medicinal plant species, to increase substances used for therapeutic purposes, and in ornamental species, where genotypes with variegation or alterations in colour and conformity are commercially desirable. Similarly, morphological changes in flowers have been documented only in ornamental plants. On the other hand, changes in fruits and seeds were reported in important food crops, mainly to increase the number of fruits in tomato and grain yields in rice, sorghum and corn (Table 3).

Molecular Studies
To detect somaclonal variations and analyse the genetic stability of plants grown in vitro, DNA-based molecular markers are the most commonly used approach. Many molecular markers were used in the studies included in this review, which varied according to culture and evaluation purpose (Table S2). As already shown in our bibliometric analysis, randomly amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) molecular markers were used in most studies by the year 2018, with a change in recent years to a greater number of studies with other markers, such as methylation-sensitive amplification polymorphism (MSAP), simple sequence repeat (SSR), single nucleotide variant (SNV) and amplified fragment length polymorphism (AFLP) markers (Figure 10). In the last year, only analyses applying single nucleotide polymorphism (SNP) markers were reported. As expected for the set of included articles, the objective of using each of the different molecular markers reported is to verify the mechanism related to somaclonal variation, either by methylation in DNA or changes in the sequence of DNA base pairs. Some articles also evaluate, through markers, the presence of mutations (Table S2).
Among the 219 accepted articles, 12 evaluated the gene expression of the generated somaclones. Studies of expression of genes related to disease resistance, ornamental traits, protein expression and other molecular mechanisms are described in detail in Table S3.
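To illustrate how dominant markers such as RAPD are commonly scored, the Python sketch below treats each amplified band as present (1) or absent (0) and computes the percentage of polymorphic bands and the Jaccard similarity between profiles. The band profiles are invented for illustration and do not come from any study in this review; the function names are hypothetical, and the reviewed articles may have used other similarity coefficients or software.

```python
def percent_polymorphism(profiles: dict[str, list[int]]) -> float:
    """Percentage of band positions that differ among the scored plants."""
    rows = list(profiles.values())
    n_bands = len(rows[0])
    polymorphic = sum(
        1 for i in range(n_bands) if len({row[i] for row in rows}) > 1
    )
    return 100 * polymorphic / n_bands


def jaccard_similarity(a: list[int], b: list[int]) -> float:
    """Shared bands divided by bands present in at least one profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


# Hypothetical band scores (1 = band present, 0 = absent) for eight bands.
profiles = {
    "donor":       [1, 1, 0, 1, 1, 0, 1, 1],
    "somaclone_1": [1, 1, 0, 1, 1, 0, 1, 1],
    "somaclone_2": [1, 0, 0, 1, 1, 1, 1, 1],
}

if __name__ == "__main__":
    print("Polymorphic bands (%):", percent_polymorphism(profiles))
    for name in ("somaclone_1", "somaclone_2"):
        sim = jaccard_similarity(profiles["donor"], profiles[name])
        print(f"Jaccard similarity donor vs {name}: {sim:.2f}")
```

In this invented example, a somaclone whose profile is identical to the donor would be scored as showing no detectable variation, whereas a lower similarity flags a candidate variant for further evaluation.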
A word cloud was made to identify the relevant genes analysed in somaclone studies, where the size of the name of each gene indicates the number of articles that describe the expression of the gene (Figure 11). The most frequent genes were PMADS4, Expansin and OP J-06, respectively. PMADS4 genes are considered higher-order protein complexes, responsible for changes in floral morphology in somaclonal variants. The Expansin gene is related to cell expansion; in the articles of this review, this gene was related to dwarfism events in somaclones. The Op J-06 genes are responsible for the Foc (Fusarium oxysporum f. sp. cubense) resistance response in banana somaclonal variants. Other genes were also noted in the word cloud, which indicates their expression in many studies of this review, such as the TDFs genes, which are fragments derived from transcription, and the RPK2 genes, which are involved in signal transduction. These are in addition to NPR1 genes, which function as master regulators of the plant hormone salicylic acid (SA) signalling and play an essential role in plant immunity (Figure 11).

Risk of Bias
The articles that answered 100% of the questions were classified as having a low risk of bias (180), and the articles that answered up to 60% of the questions were classified as having a moderate risk of bias (39) (Table S4). Manuscripts that answered up to 30% of the questions were not included, as they were considered as having a high risk of bias. The results indicate that the selected articles composing this SR are of high quality.

Screening of Studies
This SR comprises articles that aimed to generate somaclonal variants or study somaclones generated or marketed in the last 16 years. Therefore, many articles were eliminated in the extraction stage (410) because they dealt only with genetic variability without breeding purposes, where somaclonal variation is labelled in germplasm banks or in seedlings for field planting as an undesirable characteristic; in these cases, the objective is to ensure the genetic fidelity of plants. On the other hand, we included in our SR a set of 219 articles that deal specifically with the use of the technique for obtaining somaclonal variants with characteristics desirable for plant breeding programs. Although our study includes a very large number of articles, which makes it difficult to extract and discuss all the data in detail, we list the main data obtained in summary form to derive conclusions and tendencies regarding the proposed subject.
Our bibliometric analysis confirmed that the term "somaclone" became more frequent in the last two decades, when studies on the induction of somaclonal variation began to be developed for genetic purposes (Figure 2). At that time, several journals focused on tissue culture and biotechnology began to publish articles with terms related to "somaclonal variation" (Figure 3). In previous years, however, the changes arising from in vitro cultivation described in different studies were tested to evaluate the genetic fidelity of plants in relation to the original plant, not to generate somaclones for the genetic improvement of crops. Thus, the term "somaclones" has become more frequent in recent years for this purpose [17,91,92].

Cultures Evaluated in Different Countries
Among the countries that perform studies on somaclonal variation, India stands out as the country with the largest number of studies on this technique and is also the country that has generated the largest number of somaclones in the world, especially for sugarcane (Figure 4). India is one of the largest producers of sugarcane in the world [1], which may explain the significant number of studies on somaclonal variation in this crop included in this SR.
Raza et al. [50] obtained the same results with somaclones of the BL4 cultivar. In turn, Doule et al. [47] and Nikam et al. [93] obtained somaclones with high Brix values that are useful for commercial cultivation. The sugarcane somaclonal variants Co94012 and VSI434 were developed in India and presented desirable characteristics, such as high yield, high sucrose content and moderate resistance to red rot. Somaclone VSI434 is the second sugarcane cultivar launched in India using somaclonal variation [43].
Ethanol production increased from 662 million litres in 1980 to 61 billion litres in 2018, and it is estimated that in 2022 the demand for ethanol will reach 97 billion litres worldwide. Currently, the United States leads the global ethanol market, followed by Brazil. Brazil is the main producer of sugarcane in the world, responsible for 40% of global production of this crop, which is the main raw material in the Brazilian ethanol industry. The development of sugarcane somaclones may contribute to increased ethanol production, increasing the production of biofuels worldwide [94–97].
A large variety of somaclones have been released for some plant species, especially ornamental plant crops; this sector has wide possibilities due to the great diversity that exists among ornamental species. The climate, altitude, culture of a region, etc., contribute to the genetic diversity among species of ornamental plants in different countries [32,98]. Many somaclones are generated from ornamental plants, especially Chrysanthemum and Cereus, the most common ornamental plants included in this SR [78,79,98]. The genetic variability that occurs in vitro, such as changes in colours, textures and plant size, contributes to the emergence of new phenotypic characteristics, enabling the launch of new ornamental plants and contributing significantly to this agribusiness.
Other crops with somaclones that have been generated for commercial purposes in the global food industry are rice, banana, potato and wheat [29,99]. La Candelaria and Yerua are two rice somaclones that were used as sources of alleles for the development of new strains with tolerance to salinity [100]. Wheat crops have also generated somaclones with tolerance to this abiotic factor [101]. Other wheat somaclones were allele sources for the development of new somaclone strains with higher root growth under drought stress [42].
The generation of somaclonal variants allowed the selection and commercialization of some somaclones in certain crops. In banana, to obtain cultivars of the Cavendish subgroup tolerant to Fusarium oxysporum f. sp. cubense tropical race 4 (Foc-TR4), Sun et al. [102] identified somaclonal variants and selected nine resistant banana plants that survived in fields severely infested with Foc in China in 2010. Hwang and Ko [103] generated the cultivar 'Formosana' (GCTCV-218), a Foc-TR4-tolerant banana somaclone, which is already in use by farmers and traders in some Asian countries.

Methods for Inducing Somaclonal Variation
Among the methods used for induction of somaclonal variation, methods that depend on PGRs were cited in 148 studies in the SR. BAP and 2,4-D at doses of 0.5 mg/L, 1 mg/L and 2 mg/L were the most commonly used. BAP is a cytokinin used for regulating the growth and development of plants in vitro [14]. The identification of genetic variation in micropropagated plants indicates that BAP has become a tool for breeding programs, since this regulator has been used to induce somaclones with desirable characteristics. The second most used PGR was 2,4-D, particularly in callus culture, since one of its functions is to promote callogenesis, an important process for the indirect production of plants. Calli contain cells or groups of cells that have active cell division centres. According to Corpes et al. [104], the balance between auxins and cytokinins may directly influence the process of callus formation and development.
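Because PGR doses are reported here in mg/L while other studies may report molar concentrations, a small conversion can help when comparing protocols. The sketch below converts the three doses cited above into micromolar values. The molar masses are standard reference values for BAP and 2,4-D and are not taken from the reviewed articles; the function name is an illustrative assumption.

```python
# Molar masses in g/mol (standard reference values, not from the reviewed articles).
MOLAR_MASS = {
    "BAP": 225.25,    # 6-benzylaminopurine, C12H11N5
    "2,4-D": 221.04,  # 2,4-dichlorophenoxyacetic acid, C8H6Cl2O3
}


def mg_per_l_to_micromolar(dose_mg_per_l, molar_mass):
    """Convert a dose in mg/L to µmol/L.

    mg/L divided by g/mol gives mmol/L; multiplying by 1000 gives µmol/L.
    """
    return dose_mg_per_l / molar_mass * 1000


# Doses most commonly reported in this review.
for pgr, mass in MOLAR_MASS.items():
    for dose in (0.5, 1.0, 2.0):
        micromolar = mg_per_l_to_micromolar(dose, mass)
        print(f"{pgr}: {dose} mg/L = {micromolar:.2f} µM")
```

Since the two molar masses are close, equal mg/L doses of BAP and 2,4-D correspond to similar molar concentrations.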
The use of these PGRs in high doses, combined with the number of subcultures, causes stress that leads to cellular instability, triggering genetic or epigenetic variations in plants in vitro. Genetic alterations, such as insertions, deletions or substitutions of DNA base pairs, are permanent, usually heritable and non-reversible. Epigenetic changes are changes in the DNA methylation pattern and can be reversible, causing the loss of epigenetic characteristics generated in a plant [17,92,105]. Another factor of paramount importance for studies on the induction of somaclonal variation is the number of subcultures, which directly relates to the stress caused to the plant in vitro and induces genetic variation in plants. The use of PGRs, such as cytokinins and auxins, directly affects the genetic variation in plants subjected to subcultures, providing genetic variability and allowing the selection of traits of interest for breeding programs [14,18,92,106–109].
PGRs and the number of subcultures interfere with the generation of genetic variations in vitro and are of fundamental importance in the induction of somaclonal variation [110]. The combination of a high number of subcultures and a culture medium containing TDZ allowed the selection of somaclones resistant to Fusarium wilt (subtropical race 4, Foc STR4) in the cultivars 'Prata Anã' (Musa, AAB) [11] and 'Grand Naine' (Musa, AAA) [12]. According to the literature studied, stem apices were the most popular explants for induction of somaclonal variation in banana. The explant most commonly used to induce somaclonal variation in sugarcane was young leaf meristem tissue [111]. This type of explant is preferable because the formation of embryogenic calli occurs in young leaves close to the meristem, inducing greater genetic variation [112,113]. Another widely used explant was seeds, especially in orchids. The successful use of seeds as explants in in vitro culture is attributed to their year-round availability in most crops, their capacity to form callus and their greater bud growth in direct regeneration [114,115].

Phenotypic Modifications
In nature, genetic variation arises more slowly than under in vitro induction and can take hundreds to thousands of years. Therefore, some genetic alterations observed in the field may come from micropropagated plants that were subjected to PGRs and frequent subcultures [16]. The occurrence of somaclonal variation in micropropagated plants has been studied for many years, and these variations occur in diverse crops subjected to in vitro cultivation. Somaclones can be identified in a greenhouse, in the field and in vitro by observing changes in plant traits, such as leaf colour, texture, etiolation and other phenotypic changes (Table 2).
Epigenetic changes are responsible for phenotypic changes observed in somaclones, and these changes, such as loss of DNA methylation, may be reversible [15].
DNA methylation in the form of 5-methylcytosine (5mC) is an important epigenetic marker involved in gene expression and plays an important role in plant regulation and development [116]; in plants, it usually occurs in cytosine bases in all sequence contexts [92,117,118].
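To illustrate what "all sequence contexts" means in practice, the sketch below counts forward-strand cytosines in a DNA string by the three contexts conventionally distinguished in plants: CG, CHG and CHH, where H is A, C or T. This is an illustrative assumption about how the contexts might be tallied, not a method from the reviewed studies; the function name and example sequence are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
def cytosine_contexts(seq):
    """Count forward-strand cytosines by context: CG, CHG or CHH (H = A, C or T)."""
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1 : i + 3]  # up to two bases after the cytosine
        if len(downstream) >= 1 and downstream[0] == "G":
            counts["CG"] += 1
        elif len(downstream) == 2 and downstream[1] == "G":
            counts["CHG"] += 1
        elif len(downstream) == 2:
            counts["CHH"] += 1
        # Cytosines too close to the 3' end to be classified are skipped.
    return counts


# Hypothetical example sequence.
print(cytosine_contexts("ACGTCAGCCTA"))
```

A methylation study would then compare, per context, how many of these sites are methylated in the somaclone versus the source plant.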
Although not recorded in our data, genetic and/or epigenetic changes that occur in vitro can also generate chimeras (mosaics). In chimeras, the variations affect the function of chloroplasts in different regions of the tissues of the same plant. This event occurs through variations in their plastomes, i.e., the region responsible for governing the expression of genes related to photosynthesis, with this change resulting in an albino phenotype [16]. These changes are responsible for the altered morphological characteristics of micropropagated plants.
In our study, we described the phenotypic changes in different parts of plants micropropagated in vitro to obtain somaclones (Table 3). In general, our data demonstrate that in vitro micropropagation with the PGRs BAP, 2,4-D, NAA, TDZ, IAA and IBA at different doses, together with successive subcultures, has the potential to cause modifications that are desirable for the genetic improvement of various crops of agricultural and commercial importance.
Many results showed that supplementation with a high concentration of 6-benzylaminopurine (4.0 mg/L BAP), alone or combined with indole-butyric acid (IBA), produces a higher percentage of dwarf variants [32–37]. Plants with the dwarf phenotype have been reported for some crops; the phenotype serves as a marker for the presence of variation, as an important characteristic to facilitate cultural treatments and management in monocultured species, or as a characteristic of ornamental interest [36,37]. In pineapple, useful mutants were identified with less spiny leaves that are easier to manage in the field and hence, represent another dwarf phenotype with ornamental value [35]. In wheat crop, a new strain of buckwheat, AS34, was developed by somatic variation and will be useful in wheat breeding programs, particularly because the modification of tall commercial varieties reduces the risk of lodging; this is one of the most important agronomic characteristics of wheat [34].
The morphological alterations were seen more often in plants of ornamental and medicinal interest. The SVT14 variants of caladium (Caladium × hortulanum Birdsey) presented rounder and thicker leaves and, in Chrysanthemum (Dendranthema grandiflora), changes were described in relation to the number of flowers, flower size, flower weight, leaf weight, stem weight or plant size, as well as a reduction in flowering induction time [68,78,79].
In the tobacco crop (Nicotiana tabacum), promising somaclones were developed with increases in the length, width and number of leaves that can contribute to higher productivity of the crop [59]. Morphological changes in fruits and seeds were also found. Our results showed that tomato stood out, with studies that obtained somaclones with changes in the number of fruits, an agronomic characteristic of great importance for this crop [80,85,90].
Our data also showed promising results for obtaining improved cultivars in relation to grain yield, which is a target characteristic for the genetic improvement of large agricultural crops, such as corn, rice and wheat [86,87,89].

Molecular Studies
Some changes in the plant genome are not morphologically identified, and even visible changes require molecular evaluation. Thus, molecular markers are often used to identify these variations [3]. Several molecular markers based on the polymerase chain reaction (PCR), such as AFLPs, ISSRs, SSRs, start codon targeted (SCoT) polymorphisms and RAPDs, have been used to identify somaclonal variation [41]. RAPD markers were the most commonly used to identify genetic variation in the studies included in this review [99,119]. According to our data, RAPD marker tests were widely applied to select these variations in micropropagated seedlings, mainly up to the year 2018 (Figure 10). Although these markers are currently reported as highly variable and are falling into disuse, their adoption over this period is understandable: the technique is simple, inexpensive and easy to apply in laboratories with limited technical resources, and the studies included in this SR date back to 2007.
In addition, the use of RAPD markers depends on genetic markers located in parts of the DNA sequence, and large amounts of DNA are not required to locate the sequences. These markers are polymorphic and reveal genetic variation as differences in banding patterns, thus making it possible to perform genetic mapping to indicate genetic diversity in parental genotypes; this is very useful for identifying variants among genotypes in germplasm banks with genetic characteristics that differ from clones of genotypes stored in banks [120,121]. However, we note that there may be a trend toward refinements of the RAPD technique, such as sequence characterized amplified region (SCAR), DNA amplification fingerprinting (DAF) and sequence-related amplified polymorphism (SRAP).
The ISSR marker is a low-cost and highly efficient method that detects very small genetic variations and is widely used in studies of plant genetic diversity and to determine genetic relationships. Similar to RAPD markers, ISSRs are dominant markers and do not require prior sequencing. The advantages of the AFLP technique, besides its low cost, are the detection of a larger number of loci and a wide coverage of the genome. AFLP markers are capable of detecting genetic variations such as chimeras and of identifying mutants [122–124].
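Dominant markers such as RAPD, ISSR and AFLP are typically scored as the presence (1) or absence (0) of a band at each locus. The following minimal sketch, using entirely hypothetical band profiles, shows two summaries often derived from such scores: the percentage of polymorphic loci across samples and the Jaccard similarity between a mother plant and each regenerant. The function names and data are illustrative assumptions and do not reproduce any analysis from the reviewed articles.

```python
def percent_polymorphism(band_matrix):
    """Percentage of loci (columns) at which the samples (rows) do not all agree."""
    n_loci = len(band_matrix[0])
    polymorphic = sum(
        1 for j in range(n_loci)
        if len({row[j] for row in band_matrix}) > 1
    )
    return 100 * polymorphic / n_loci


def jaccard_similarity(a, b):
    """Bands shared by both profiles divided by bands present in at least one."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


# Hypothetical band profiles (1 = band present, 0 = band absent) at eight loci.
mother = [1, 1, 0, 1, 1, 0, 1, 1]
clone_a = [1, 1, 0, 1, 1, 0, 1, 1]  # identical to the mother plant
clone_b = [1, 0, 1, 1, 1, 1, 1, 1]  # differs at three loci

print(percent_polymorphism([mother, clone_a, clone_b]))
print(jaccard_similarity(mother, clone_a))
print(jaccard_similarity(mother, clone_b))
```

In this toy data set, clone_a would be scored as genetically uniform with the mother plant, whereas clone_b would be flagged as a candidate somaclonal variant for further evaluation.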
The IRAP and REMAP markers are based on retrotransposons. Retrotransposons move through an RNA molecule, are dispersed throughout the plant genome and can contain thousands of copies, thus contributing to size, structure, diversity and variation in the genome, which may affect gene function. The IRAP and REMAP markers are, therefore, considered very efficient molecular markers to investigate genetic variability in plants [125]. Such markers were used to study genetic variation induced by tissue culture in date palms (Phoenix dactylifera L.) and alkaligrass (Puccinellia chinampoensis Ohwi) [126,127]. Other studies have demonstrated the efficacy of these markers to evaluate genetic diversity and stability in crops such as beans [128], Egyptian barley [129] and date palm [130].
Single nucleotide polymorphisms (SNPs) can be applied to characterize allelic variation, for genome-wide mapping and as a tool for marker-assisted selection. In the last decade, the identification of SNPs has played an important role in molecular genetics, providing a better understanding of genetic architecture and enabling the identification of several economically important characteristics in various crops [131–134].
Some articles addressed the gene expression of the generated somaclones, providing information about the genes involved in the expression of morphological and genetic traits (Table S3). Analysis of the expression of genes involved in resistance to Fusarium oxysporum f. sp. cubense tropical race 4 (Foc TR4) in Guijiao 9, a somaclonal variant of banana belonging to the Cavendish subgroup, revealed that during the onset of infection by Foc TR4, resistant Guijiao 9 showed a higher number of differentially expressed genes (DEGs) than the susceptible Williams cultivar. Multiple resistance pathways were activated in Guijiao 9, and the DEGs were involved in plant-pathogen interactions, signal transduction, secondary metabolism and other processes. This suggests that the pathogen response is regulated by multigene networks of DEGs related to resistance [102].
In the study of Lee et al. [77], gene expression analysis was used to evaluate levels of endoreduplication in the variants of Phalaenopsis WP, an ornamental species. The study indicated that the high levels of endoreduplication in these variants are associated with changes in the normal growth of petals and leaves. In addition, high expression levels of the HPY2 gene are associated with endoreduplication only in some cases, indicating that additional genes are involved in the induction of polyploidy in Phalaenopsis WP variants. However, the PMADS4 gene studied was highly expressed in the petals of normal plants compared to those of somaclones, indicating its normal function in the development of floral parts. Hsu et al. [135] also studied gene expression in somaclones and found five sequences that showed higher expression levels in the wild plant than in Phalaenopsis Hsiang Fei cv. HF. These genes correspond to sequences encoding casein kinase, isocitrate dehydrogenase, cytochrome P450, EMF2 and an unknown protein. Two other sequences found in this study, whose roles were unknown, were expressed at a higher level in the somaclone plant than in the wild-type plant. The authors concluded that mosaic colour patterns and aberrant flower shapes may be caused by these genes in somaclonal variants of Phalaenopsis Hsiang Fei cv. HF. Further studies on the gene expression of somaclones are needed and may provide a more complete view of the genes involved in the changes that occur in somaclones. Understanding the mechanisms of somaclonal variation, as well as the expressed genes, may provide an alternative for generating somaclones of any crop using previously described genes.

Conclusions
A total of 219 articles published between 2007 and 2022 were included in this review, encompassing a large number of studies in which somaclonal variants of various crops were generated. The in vitro genetic diversity created in several plant species and agricultural crops has led to the emergence of characteristics related to resistance to biotic factors, improved agronomic performance and tolerance to abiotic stresses. Somaclonal variation has been used in genetic improvement programs of several crops worldwide, generating genetic diversity and supporting the launch of new genotypes of important agricultural crops, such as sugarcane, wheat, rice, potato, banana and ornamental and medicinal plants, among others, with resistance to diseases, pests and abiotic stresses.
India, Pakistan, China, Egypt, Iran and Brazil have the largest numbers of studies on somaclonal variation in the world. Studies on sugarcane, ornamental plants and fruit plants have been the most common over the last 16 years. Studies involving the induction of somaclonal variation focused on the identification of molecular genetic variation, the selection of useful agronomic traits, resistance to pathogens, tolerance to salinity and tolerance to water deficit. Studies evaluating somaclones with tolerance to other abiotic stresses, such as lead, copper and other toxic metals, were also cited. This indicates that the induction of somaclonal variation has been explored in recent decades from several perspectives.
PGRs and frequent subcultures are the most commonly used techniques for the induction of somaclonal variation according to the results of this review. The PGRs BAP and 2,4-D at doses of 0.5, 1 and 2 mg/L were the most commonly used. Inducing somaclonal variation with subcultures and PGRs at these concentrations does not require very sophisticated techniques; this makes the approach accessible for studies of somaclonal variation in breeding programs. In addition, the launch of new cultivars derived from somaclonal variation is not a bureaucratic process and is considered inexpensive; it differs from the development of cultivars derived from other methods, such as genetically modified (GM) crops, which face major social and ethical obstacles.
Techniques for inducing somaclonal variation have been applied to a variety of crops. With the success of these techniques, many cultivars with agronomic characteristics useful for agriculture, such as nutrient quality, yield, disease resistance and tolerance to abiotic stress, should be included in different genetic improvement programs, and future studies may provide relevant information. Each year, new cultivars are launched, and many are being studied and evaluated for marketing purposes.
The mechanisms involved in somaclonal variation, the expression of genes in the generated somaclones and the biochemical and molecular pathways involved in the selection of somaclonal variants still need to be further explored. Future molecular research may help in the identification of somaclonal variants through polymorphic fragments involved in the process of somaclonal variation and in the selection of genes associated with unique characteristics of somaclones. The expansion of knowledge on the genetic and epigenetic mechanisms of somaclonal variation will increase its use in crop breeding.

Agronomy 2023

Figure 1. PRISMA flowchart. Process of selecting articles for inclusion or exclusion in a systematic review of the application of the somaclonal variation technique for plant genetic improvement; n = number of articles.

Figure 2. Bibliometric maps of manuscripts in the last 16 years regarding somaclonal variation in plant genetic breeding. Frequency of keywords (A); frequency of the scientific journals that published the most (B).

Figure 3. Number of articles published on somaclonal variation around the world in the last 16 years and main plant species studied. The countries shown in light green have the lowest number of articles about somaclonal variation. The medium to intense green colours represent countries with approximately 21 studies on somaclonal variation, and the red colour represents countries with a higher number of studies on somaclonal variation.

Figure 4. Number of generated somaclones separated by culture in the studies included in a systematic review of the application of somaclonal variation in plant genetic improvement.

Figure 5. Number of articles separated by the reported purpose for generating somaclones. The data were generated for a systematic review of the application of somaclonal variation in plant genetic improvement.

Of the articles that reported the use of PGRs, 68 reported benzylaminopurine (BAP); 62, dichlorophenoxyacetic acid (2,4-D); 40, α-naphthaleneacetic acid (NAA); 25 kinin

Figure 8. Pie charts summarizing the data on subculture time and number of subcultures in articles published in the last 16 years and included in the systematic review "The role of somaclonal variation in plant genetic improvement: a systematic review".

Figure 9. Most frequently used explants for induction of somaclonal variation per culture. The data were obtained from 219 articles included in the review of the application of somaclonal variation for plant breeding.

Figure 10. Frequency of molecular markers associated with strategies to identify genetic variation over the last 16 years. The data were obtained from articles included in the systematic review of the application of somaclonal variation in plant breeding.

Figure 11. Word cloud of the frequency of genes with differentiated expression shared in manuscripts regarding somaclonal variation in plant breeding.

Table 2. Definition of the PICO terms for the research question addressed in this study of somaclonal variation over the last 16 years.

Table 3. Morphological characteristics associated with the somaclonal variation event in different cultures.