A Critical View of Different Botanical, Molecular, and Chemical Techniques Used in Authentication of Plant Materials for Cosmetic Applications

A number of approaches can be implemented to ensure plant-based material authentication for cosmetic applications. Doing this requires knowledge and data dealing with botany, molecular biology, and analytical chemistry, the main techniques of which are described here. A comprehensive and critical view of the methods is provided with comments as well as examples of their application domains.


Introduction
Plant sourcing is of growing importance for the cosmetic industry for a number of reasons, including regulations, ethics, marketing, and security. Indeed, this industry is increasingly trying to replace animal-sourced components. Ethical issues have become important for consumers over the last decades. A trademark can suffer a harsh backlash when ethically doubtful behavior of a company is revealed. The Nagoya Protocol has triggered many countries to enact increasingly stringent laws and regulations aimed at preventing so-called "biopiracy" in a number of different fields, such as food and cosmetics. Control and norms are increasingly strict, not only by state agencies but also between B2B partners or during press investigations. All potentially harmful compounds have to be absent from raw plant materials, whatever their origin. Such contaminants can result from harvest conditions (plant or plant pathogen content, residues of phytosanitary products, physiological issues) or can originate from improper post-harvest conditions (molds, labeling errors, voluntary adulteration). Globalization further complicates the task, because most of the providers are located in different countries with different regulations, but also common (vernacular) plant names can differ, thus increasing the risk of confusion or intentional frauds. The origin of plants harvested or cultivated becomes of great importance, not only for cosmetic companies, but also for the lawmakers, resellers, and consumers. In order to minimize this risk, proactive behavior is essential. It requires the presence of reliable people in the place of production (able to identify correctly local plants) and also the proper training of harvesters (to avoid confusion between similar species, to identify the correct stage of maturity, etc.). Nevertheless, a posteriori controls based on botanical, molecular, and/or chemical aspects remain necessary to ensure correct authentication and secure sourcing. 
Identification as well as authentication of a vegetal raw material requires a wide range of knowledge and technical skills and has to be taken into account as early as the design of the product. The plant species has to be the right one, and its content in active molecules should comply with specifications. The quality, efficacy, and obviously the safety of products made from plant material can be affected by a number of factors. Indeed, even subtle variations in the raw plant material can greatly influence these three aspects and also alter the results of clinical trials based on these plants or plant extracts. The Food and Drug Administration (FDA) of the United States maintains a precise, up-to-date database of plants associated in the literature with human and/or animal intoxication. Plants prone to contamination or falsification include: (1) those that are not well characterized due to their recent introduction to the market, (2) those for which the raw material is subject to trade limitation or restriction, and (3) those for which quality assessment is considered insufficient or inadequate. Adulteration (voluntary or not) or contamination can trigger adverse effects whose cause can be difficult to determine after the fact. Unfortunately, cases of intentional adulteration still exist in the cosmetic industry. The control of supply chains and good manufacturing practices minimize these risks and have in part solved this problem. Nevertheless, although a number of regulations aim at ensuring the conformity of a product to a defined composition, they often lack guidelines describing precisely the analytical techniques to be performed to ensure conformity to the norm. As an example, the FDA directive 21 CFR 11.5 describes accurately the good practice for the manufacturing, labeling, and storage of products used as food supplements, but fails to mention specific analyses. 
Before evaluating the quality of raw material to produce cosmetics, a crucial question should be the quality of the tests used to characterize this starting material. Indeed, a study cannot be considered valid scientifically if the tested material has not been authenticated and characterized, so that this material can be replicated and produced on a larger scale. In the case of plants, there may be erroneous identification of the collected plant, falsification with other species, or contamination by foreign ingredients. While many studies refer to the use of standardized botanical material, this usually implies that chemical normalization based on the quantification of one or more selected marker compounds has been performed. However, although chemical normalization is very important, it has limited utility when the starting material is not well characterized in botanical terms. As an example, the phytochemical profile of Tanacetum parthenium (L.) Sch.Bip. is not the same as that of Magnolia grandiflora L. but both contain the bioactive molecule parthenolide. Thus, a parthenolide-based phytochemical authentication in absence of botanical characterization could be misleading. Without proper product authentication, the resulting studies cannot be extrapolated to other products on the market or compared to other studies due to inconsistencies in the identity of the plant matrix. Authentication tools can vary widely depending on the industrial plant or the process, from a simple botanic or morphological identification of a plant to more elaborate approaches based on genetic and/or analytical chemistry methods. Most of the identification methods in use nowadays are derived from a basic scientific area called pharmacognosy, which is the study of medicinal drugs derived from plants or other natural sources, a field of inquiry that goes back hundreds of years. 
Each identification of the origin of a plant material requires the use of a unique set of authentication tools to provide proof of identity. Most of these tools have been enhanced through recent technological advances and require extensive knowledge, information, infrastructure, equipment, and skills to implement.
Methods described herein can be used to control externally procured products or to select batches or suppliers in order to detect:
- Fraud, intentional or not, for example raw plant material that does not conform to the desired plant, the sample provided, or the expected biological activities. The plant material can be picked by mistake because of its physical resemblance to the desired species.
- Falsification of the product origin (place of picking, date of harvest).
- Adulteration by partial contamination with other plants.
- Adulteration by adding cheap products devoid of interest in order to increase the weight (e.g., wheat flour mixed with any white powder).
- Toxins resulting from the proliferation of microorganisms before or after harvest.
- Remnants of phytosanitary products (forbidden ones or ones applied too late before harvest).
Adulterants may be of lower or no quality but, more problematically, may also be toxic in themselves. For example, Pinus seeds in 2009 provoked thousands of cases of dysgeusia (a disturbance of the sense of taste, temporary in these cases), which were reported by a food safety committee [1]. This issue was due to the trading of seeds of Pinus armandii Franch. in place of one of the safe-for-food species such as Pinus pinea L. or Pinus koraiensis Siebold & Zucc. In some cases, contamination of foodstuffs has resulted from the deliberate addition of toxic ingredients for the purpose of deceiving quality control, as was the case with melamine, added to inflate the protein levels measured by assays of baby milk or soy cake. The examples given below come mostly from the field of food or medicinal ingredients but are applicable to the supply of plants for cosmetic use and, indeed, some products are used in both the food and cosmetic industries. For example, vanilla and lavender are used in both domains and are often subject to fraud [2]. 
True saffron is often, totally or partly, replaced by much less expensive plant material providing a similar colour. Note that fraud can also be the result solely of misidentification of a plant species during picking.

Botany in Authentication
It is easy to confuse one species with another when focusing only on morphological traits supposedly specific to one species (e.g., Mexican argemone and milk thistle share the same eccentric leaf pattern). The initial step in the identification and authentication of botanical materials requires conventional data concerning the taxon and botanical knowledge. The picking of plants should be accompanied by the scientific (Latin) and vernacular names, the name of the person who picked it, the date of collection and its place (GPS data when possible, description of the landscape), a code number related to collection, the organs harvested, the organoleptic properties, and the storage method. A visual representation (drawing and/or photography) of the plant before and after picking can be useful. When collecting plants for the research and development step, a voucher of the plant can be deposited in a public or private herbarium. This sample should show the organs which will constitute the harvest and should be dried and stored according to best practices for herbaria (these can include treatments devoted to a better preservation of the color over time). This sample is then mounted on a herbarium sheet in an appropriate format to retain the identifiable morphological characteristics of the sample, thus assisting in the identification and authentication of the overall population collected. The herbarium sample must also include morphological identification "keys". These taxonomic keys (fruits, flowers, etc.) are often used as macro-characteristics for the identification of most plant genera. These preliminary steps are too often ignored because the collectors think they know what they are collecting and can guarantee the identity of the plant material. The authentication will take place either at the time of collection or after drying and/or storage. 
Observation with the naked eye or under a magnifying glass is a very fast and effective way to detect rough substitutions, especially if a substitution is already suspected and an unequivocal criterion allows for the distinction between the desired species and the one provided. The recognition criteria based on number, size, hairiness, and shape of organs often allow for distinguishing between closely related species, even when only dried material is available. Thus Salvia (sage) or chamomile may have foliar or floral morphologies sufficiently distinct to allow detection on a dry sample. Salvia sclarea L., widely used in cosmetics, presents long bracts not found in other species of sage. The main issue is that botany is a field-based science, and consequently, it is quite difficult to identify a plant that one has never encountered before. A botanist originating from the area where the plant is growing is the best person to do the job, but it can be hard to hire one. Classical floras such as Bonnier & Layens [3] have keys to the taxa based on the floral organs; indeed, identifying a plant before or after flowering is a brain teaser. The taxonomy of plant species should be established according to morphological characteristics using botanical guidelines. There are various published floras and botanical monographs with taxonomic keys from different floristic regions and countries that are useful for species identification. The reproductive organs together with vegetative ones, i.e., stem, leaf, and underground organs and their modifications, are also crucial and should be carefully investigated during plant taxonomic identification. The Internet is also a good source of alternative keys to taxa based on traits other than flower organs, but one should be cautious with websites or databases not owned by recognized academic institutions. "The plant list" and "The International Plant Names Index" are reliable sources of current taxonomic data. 
They can be found using the websites http://www.theplantlist.org/ and http://www.ipni.org/index.html, respectively. Keep in mind that, contrary to a commonly held idea, a photograph is often less descriptive than a drawing, which stresses the specific features of a given plant; photography should be used mainly to confirm the identification. Unfortunately, drying or grinding can hinder the identification of plants, and organs such as flowers can be absent from the samples (often constituted only of roots or leaves). Confusion is increased by the fact that vernacular names can be very numerous and differ between places (even nearby ones), and even botanical taxa are prone to changes. Linnaeus, in 1753, initiated the binomial system of names with a generic name and a specific epithet. Before him, names were Latin descriptions of the plant or its use, e.g., the white clover was named Trifolium caule repente spicis depressis siliquis tetraspermis; now it is Trifolium repens L. This binomial name is unique for a given species, but nevertheless, research in botany provides new data which lead to changes in taxonomy; thus newer names can replace old ones, or genera can shift from one family to another. As long as an old flora (or a botanist) does not disappear, an obsolete version of the name can stay in use, which explains the existence of synonyms, especially if the plant and name were quite popular. The case of Coleus is a good example: it was first named Ocimum scutellarioides L. by Linnaeus himself. The first name given is called the basionym, and the L. stands for the taxonomist who gave the name, here Linnaeus. Some decades ago, its correct name was Coleus blumei (a name given in honor of the botanist Blume, who brought it from Indonesia to Europe) and it is still referred to by this name by gardeners and plant growers. However, in 2006, the proper name was changed to Solenostemon scutellarioides (L.) Codd (it should have become Solenostemon blumei, but such a name already existed for another plant!) and, in 2012, molecular biology data indicated that the genus Solenostemon showed no striking differences from the genus Plectranthus. Now, Coleus is officially named Plectranthus scutellarioides (L.) R.Br., but most people, even scientists, still use the name Coleus.

Molecular Biology Methods Based on Genetic Fingerprints
Appearing in the 1980s, and popularized by their use in forensic science, molecular markers are based on the differences in the genetic (DNA) sequences existing between individuals and species. A molecular marker can be defined as a gene or a DNA sequence located on a chromosome that has the potential to discriminate two genotypes (individuals or species). The knowledge of the whole DNA sequence is not necessarily needed because a number of protocols are based only on the detection of differences in size or on matching of flanking sequences, which does not require knowledge of the sequence in between. Nevertheless, databases comprising the patterns obtained for different species are often indispensable for fraud detection. One very important advantage of these methods is that they require only minute quantities of sample and work with whatever part of the plant is available, as DNA is the same in every part of a given plant, contrary to its phytochemical content. A further advantage is that molecular markers are efficient whatever the age or environment of the plant. Molecular markers can even be used on end products that underwent extensive and complex processing, as some DNA often remains after all of the industrial steps of production. Even a very low percentage of fraudulent addition can be detected, which is not always the case for methods employing chemical analysis. For example, the adulteration of saffron (Crocus sativus L.) with petals of safflower (Carthamus tinctorius L.) can be detected at ratios as low as 1% [4]. Similarly, the fraudulent incorporation of wild curcuma (Curcuma zedoaria (Christm.) Roscoe) into culinary curcuma (Curcuma longa L.) can be detected using RAPD, a PCR-based method [5].
Compared to morphological characteristics (botany) and phytochemical markers, this DNA-based technology can provide an efficient, accurate, and less costly way to test the authenticity of hundreds of samples simultaneously because the process can be easily automated. The DNA-based authentication of medicinal plants can be used as a tool for both quality control and safety monitoring of herbal cosmetic, pharmaceutical, and nutraceutical products. Compared to the genetic fingerprinting techniques commonly used in human medical science or forensic science, this type of analysis of plant-derived samples represents a unique challenge due to the lack of available sequence data and the limited knowledge of the genetic diversity present in many plant species, which are two important obstacles to the development of simple and reliable genetic markers. Recently developed genetic profiling techniques can be applied to those plant species and/or related species (or potential adulterants) for which genomic data are available. The methods we will describe here can, for the most part, be adapted to efficient high-throughput analysis.
DNA obtained from freshly harvested, young, and fast-growing tissue is generally of better quality. On the other hand, powdered plant samples composed of highly differentiated tissues contain relatively poor-quality DNA and the extraction steps are technically much more challenging. The storage, drying, and grinding of plants during the manufacturing process also tend to lead to degradation of the genetic material. In addition, a potential complexation with chemical agents or adjuvants can also hinder the manipulation and analysis of plant material when using technologies based on enzymatic reactions such as PCR. Commercial genomic DNA extraction kits can be used to isolate good-quality DNA from various plant materials without the need to change the extraction procedure for different species. Samples can be fresh, air-dried, freeze-dried, or frozen and ground for DNA extraction. Several methods can then be used to determine the identity of the samples. Genetic information acquired during such experiments should be stored in a database (internal or public) for easier access and comparison purposes.

Presentation of the Different Types of Molecular Markers
Molecular markers are based on the fact that cleavage by restriction enzymes and the pairing of primers are each specific to short sequences of the genetic code, which may vary between individuals or species. These events will therefore not occur in the same way in the genetic material of different species. Markers can also be based on the existence of very short sequences repeated in the genome, with a different number of copies according to the organism. Over time, markers have become more varied and better adapted to identity searches between genotypes, varieties, species, etc. They were also developed through PCR, which is associated with most of the techniques used; PCR detects a DNA sequence through the recognition of two parts of the genetic code located at its ends. It is very sensitive because even tiny traces of DNA are enough. It also makes it possible to detect contamination by microorganisms by searching for their specific genes. The disadvantage is the necessity of having an idea of what one is looking for, since part of the target sequence must be known to perform a conventional PCR (with specific primers). However, this restriction does not apply to techniques using short primers of about a dozen bases (e.g., RAPD, see below). PCR, in its quantitative version (qPCR), is the technique used to detect GMOs, which may concern cosmetics, for example through soy. Only GMO varieties for which the introduced sequences are known can be detected, which is currently the case for GMOs grown in most countries. PCR can be used: (i) To detect the presence of very small quantities of product, because even a minimal percentage of contamination will be visible, making it superior to biochemical methods. 
Thus, it was possible to detect adulterations of olive oils [6] or the presence of powdered onions and garlic in what was presented as pepper powder [7]; (ii) To detect the presence of pathogenic and toxigenic fungi without relying on the presence of a corresponding mycotoxin, with the main advantage of being an early detection system. However, without a precise quantification of these toxins, it does not provide sufficient information about the possible use of the materials. Mycotoxins are toxic chemical compounds produced by certain fungi found in crops affected by diseases, even when the contamination seems only superficial. Two fungal genera that synthesize such carcinogenic compounds may be found: Fusarium, producing fumonisins, and Aspergillus, producing aflatoxins. Several can be detected at once with the multiplex PCR technique [8].
Thanks to technical evolutions and the ingenuity of researchers, we now have a large number of molecular markers adapted to different types of detection. They include restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), random amplification of polymorphic DNA (RAPD), simple sequence repeats (SSRs), loop-mediated isothermal amplification (LAMP), single nucleotide polymorphisms (SNPs), etc. Other techniques such as DNA barcoding, microsatellite-based markers, or Next Generation Sequencing (NGS)-based markers are relatively recent. Each of these techniques, whether it targets a particular component of the genome or is completely arbitrary, faces unique methodological, technical, and material challenges. Therefore, no DNA marker can today be considered ideal, and the choice of a particular marker will depend on the objectives. A detailed description of these various markers can be found in the following publications: RFLP: Restriction Fragment Length Polymorphism [9]; RAPD: Randomly Amplified Polymorphic DNA [10]; AFLP: Amplified Fragment Length Polymorphism [11]; VNTR: Variable Number of Tandem Repeats, or minisatellites [12]; SSR: Simple Sequence Repeat, or microsatellites [13]; CAPS: Cleaved Amplified Polymorphic Sequence [14]; COS: Conserved Ortholog Set [15]; SNP: Single Nucleotide Polymorphism [16]; In/Del: Insertion Deletion [17]; SCAR: derived from RAPD but more reliable (they nevertheless require prior sequencing of the fragment amplified by RAPD) [18].

Restriction Fragment Length Polymorphisms (RFLP)
RFLPs are considered to be one of the first developments in the field of genetic markers and are at the origin of the development of molecular genetics. The technique is based on the principle of restriction fragment length variation due to the appearance of mutations in binding/cleavage sites recognized by restriction enzymes. In addition, any reorganization in the genomic region flanked by restriction sites that disrupts their distribution, and thus causes polymorphism, also contributes to the RFLP. Digested fragments that vary in size are separated by gel electrophoresis and then visualized by Southern blot analysis, i.e., by hybridization to specific probes. The positive point of this marker is its codominant nature. As for other DNA-based markers, there is extensive literature (mainly on the PCR-RFLP technique) that suggests the use of this technique in the identification of medicinal plants. The limitations of this technique are the limited sensitivity of RFLP-associated detection, because it is very difficult to obtain valid profiles (especially from old samples), the long duration of analysis, and the potential use of radioactivity, which can, however, be avoided by using fluorescence, luminescence, or PCR-based detection systems. Moreover, this technique is difficult to automate.
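The principle of restriction fragment length variation can be illustrated with a minimal in silico digest. The sketch below is only an illustration: the two sequences are invented for the example (they are not real plant loci), and the function name `digest` is a hypothetical helper. It uses the EcoRI recognition site (GAATTC, cut after the first G) and shows how a single point mutation in one site changes the fragment pattern.

```python
import re

def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return the fragment lengths obtained by cutting `sequence`
    at every occurrence of the recognition `site`.
    `cut_offset` is the cut position inside the site (1 for G^AATTC)."""
    # A lookahead is used so that overlapping sites would also be found.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC

# Invented sequences, for illustration only.
species_a = "ATGC" * 5 + "GAATTC" + "TTAGG" * 6 + "GAATTC" + "CCGTA" * 4
# Same locus with a point mutation (A -> G) that destroys the first site.
species_b = species_a.replace("GAATTC", "GAGTTC", 1)

print(digest(species_a, ECORI_SITE, ECORI_OFFSET))  # three fragments
print(digest(species_b, ECORI_SITE, ECORI_OFFSET))  # two fragments
```

In a real RFLP experiment the fragments are of course sized on a gel rather than computed, but the comparison of fragment-length lists is the same in spirit.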

Simple Sequence Repeats (SSR) or Microsatellites
The SSR markers consist of monomeric DNA units of two to five base pairs repeated several times at a specific locus. These markers are entirely codominant and widely used for marker-assisted selection, lineage establishment, population structure establishment, and so on. These microsatellite markers are locally duplicated in tandem and dispersed throughout the genome, so it is important to amplify a locus-specific microsatellite using primers that are specific to that locus. The design of these primers obviously involves the precise identification of microsatellites by a sequence examination. To be quite relevant, the identification of these microsatellite markers also involves the cloning of a library enriched in markers and the analysis of their sequences. This can be done directly by sequencing, but the clones containing repeats can also be identified and characterized by hybridization techniques using a labeled probe (fluorescence, bioluminescence, or radioactive) containing the repeats. The DNA segment containing the marker(s) is then verified by sequencing and used for the design of specific primers. These SSR markers are known for their hypervariability resulting from DNA polymerase replicative accidents (adding or removing one or two nucleotides) occurring during DNA replication at these repeat regions. Nevertheless, it requires much time and cost to isolate and characterize each SSR locus when the DNA sequence of a plant species is not available. Another disadvantage could be the appearance of null alleles due to either a too low temperature of hybridization of the primers, nucleotide sequence divergence (between varieties for example), poor quality, or a low amount of DNA [19]. This can lead to difficulties in determining allelic and genotypic frequencies and underestimation of the degree of heterozygosity [20].
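As a rough illustration of how an SSR locus distinguishes alleles, the following sketch counts the tandem repeats of a motif in two amplicons sharing the same flanking sequences. The flanking sequences, the GA motif, and the repeat numbers are all invented for the example, and `count_repeats` is a hypothetical helper, not part of any published protocol.

```python
import re

def count_repeats(amplicon: str, motif: str) -> int:
    """Return the length, in repeat units, of the longest
    uninterrupted tandem run of `motif` in `amplicon`."""
    runs = re.findall(f"(?:{motif})+", amplicon)
    return max((len(run) // len(motif) for run in runs), default=0)

# Hypothetical locus-specific flanking sequences (primer binding regions).
flank5, flank3 = "ACGTTGCA", "TGGCATCC"

# Two invented alleles differing only by the number of GA repeats.
allele_1 = flank5 + "GA" * 12 + flank3
allele_2 = flank5 + "GA" * 15 + flank3

for name, allele in (("allele 1", allele_1), ("allele 2", allele_2)):
    print(name, "repeats:", count_repeats(allele, "GA"), "amplicon length:", len(allele))
```

Because the two alleles give amplicons of different lengths, a heterozygous individual would show both products, which is what makes the marker codominant.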

Random Amplification of Polymorphic DNA (RAPD)
RAPD technology uses short synthetic oligonucleotide primers (typically 10 to 12 bases) to amplify genomic DNA by PCR at a low hybridization temperature. The RAPD primers are randomly designed, with a sequence containing at least 40% GC and no palindromic sequence, to avoid the creation of stem-loop secondary structures. Since the primers used are short and the hybridization temperature is low (generally between 28 and 38 °C), the profile of the amplified DNA varies in size as a function of the degree of nucleotide sequence homology between the primers and the complementary genomic sequences, as well as the extension time selected by the operator. The amplified fragments thus generated are separated by agarose gel electrophoresis as a function of their respective sizes. In most cases, the RAPD fragments are derived from the amplification of a single locus, thus providing a polymorphism based on the presence or absence of bands related to the amplified fragments observed on the gel. The amplification bands obtained may vary in intensity, which may be due to differences in the number of copies or the relative abundance of the sequences. However, many studies question this use of RAPD markers, showing that no reliable correlation can be established between the number of copies and the amplified band intensity. These differences in band intensity may simply be related to the efficiency of the PCR amplification, which may vary according to other parameters, such as the quantity and quality of the DNA, the presence of contaminants, etc.
Laboratories with limited budgets prefer RAPDs because the entire process requires only a PCR thermocycler and agarose gel electrophoresis, and only a low level of technical skill. RAPDs can effectively differentiate taxa below the level of the species because they reflect both the coding and noncoding regions of the genome. However, the major disadvantage of RAPD markers is their poor reproducibility, especially between laboratories, which is mainly related to the low hybridization temperature of the primers used. This problem can be partly solved by choosing a suitable DNA extraction protocol to remove all contaminants, by optimizing the parameters of the PCR reaction, by evaluating different pairs of oligonucleotide primers, and by taking into account only reproducibly amplified DNA fragments.
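The two design rules given above for RAPD primers (at least 40% GC, no palindromic sequence) can be sketched as a simple filter. This is only an illustration under stated assumptions: the minimum palindrome length of 6 bases, the 10-base primer length, and all function names are choices made for the example, not values taken from a published protocol.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)

def has_palindrome(seq: str, min_len: int = 6) -> bool:
    """True if `seq` contains a stretch of at least `min_len` bases that
    equals its own reverse complement (min_len = 6 is an assumption)."""
    for size in range(min_len, len(seq) + 1):
        for start in range(len(seq) - size + 1):
            window = seq[start:start + size]
            if window == reverse_complement(window):
                return True
    return False

def is_acceptable_rapd_primer(seq: str, min_gc: float = 0.40) -> bool:
    return gc_fraction(seq) >= min_gc and not has_palindrome(seq)

def random_primers(n: int, length: int = 10, seed: int = 0) -> list[str]:
    """Draw random primers and keep only those passing the filter."""
    rng = random.Random(seed)
    primers = []
    while len(primers) < n:
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if is_acceptable_rapd_primer(candidate):
            primers.append(candidate)
    return primers

print(random_primers(5))
```

A fixed seed is used so that the same candidate set is produced on each run, which is convenient when primer sets have to be compared between laboratories.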

Amplified Fragment Length Polymorphism (AFLP)
This technique is a combination of the RFLP and RAPD techniques, based on the detection of restriction fragments after PCR amplification. It can be used on DNA of diverse origin, complexity, or quality. AFLP markers can be produced without any prior knowledge of the sequence using a limited set of primers. The amplification fragments are generated after digestion and then ligation of adapters that serve as hybridization sites for the nested primers used in the PCR amplification. The amplified products are then separated by 6% denaturing polyacrylamide gel electrophoresis and detected by autoradiography (when radioactive labeling is used), fluorescence, or luminescence. AFLP markers are generally considered to be excellent markers for the analysis of genetic diversity and therefore for sample authentication.
Nevertheless, as with RAPD markers, they are dominant markers. The technique also requires the purification of nondegraded, high molecular weight DNA. In addition, the procedure may involve the use of harmful radioactive material, although this last point can be easily overcome by using efficient fluorescence-based detection.
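Since dominant markers such as AFLP or RAPD are scored as the presence or absence of bands, a sample can be compared to a reference profile with a simple similarity index. The sketch below uses the Jaccard index, which is one common choice among several and is an assumption of this example, as are the invented band sizes (in base pairs, assumed to be already binned) and the absence of any acceptance threshold.

```python
def jaccard(profile_a: set[int], profile_b: set[int]) -> float:
    """Jaccard similarity between two band profiles, each given as
    the set of band sizes (bp) scored as present."""
    union = profile_a | profile_b
    if not union:
        return 1.0  # two empty profiles are considered identical
    return len(profile_a & profile_b) / len(union)

# Invented band profiles, for illustration only.
reference = {112, 158, 203, 275, 340, 412}
sample_conforming = {112, 158, 203, 275, 340, 412}
sample_suspect = {112, 158, 230, 275, 390}

print("conforming:", jaccard(reference, sample_conforming))
print("suspect:   ", jaccard(reference, sample_suspect))
```

In practice, the similarity value at which a batch is rejected has to be set from the variability observed among authenticated reference samples of the species concerned.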

Inter-Simple Sequence Repeats (ISSR)
The ISSRs are dominant markers based on PCR amplification of a DNA segment located between two microsatellite repeat regions separated by an amplifiable distance. The principle of this technique relies on the use of microsatellite sequences as primers, making it possible to target multiple genomic loci, leading to the amplification of intercalated genomic DNA sequences of different sizes that can be used as markers. The primers for ISSRs can be either 5′-anchored or nonanchored, with one to four degenerate bases extended into the 5′ flanking sequences. Compared to RAPDs, the primers used in the ISSR approach are much longer (18- to 25-mers), which allows hybridization of the primers at higher temperatures, thus leading to higher stringency and giving ISSR markers much higher reproducibility than RAPDs. However, this reproducibility can vary according to the detection method used or the quality of the samples to be analyzed.

Sequence Characterized Amplified Regions (SCAR)
These are certainly the most effective markers for plant material authentication, based on the simultaneous use of a sequence-specific mono-locus marker and a codominant marker, in which the forward and reverse primers have been designed in particular regions, including AFLP, RAPD, and/or ISSR markers related to a trait of interest. The SCAR marker can be either a specific gene or a random DNA fragment present in the genome of an organism, and the primers for amplification are randomly located in (or close to) AFLP, RAPD, and/or ISSR markers. The designed primers are used to distinguish the target species from other related species by the presence of a unique, distinct, and intense band in the desired sample. The length of the primers (from 20 to 25 bases) and their GC contents are chosen to be very specific to the target sequence. It is a fast, reliable, and highly reproducible marker commonly used in molecular biology. The markers developed from AFLP and SSR markers are more reproducible but also more expensive, longer to implement, and more technically demanding. To convert a single RAPD, AFLP, ISSR, or SSR marker into a SCAR marker, the sequence corresponding to that marker must be purified and sequenced. The nucleotide sequence should then be analyzed and its unique character confirmed by comparing it with known DNA sequences available in various databases, in order to synthesize specific SCAR primers. These markers have the advantage of being codominant and are highly reproducible. Molecular biologists prefer these markers, and the most recent data show that they are the most appropriate for the authentication of plant samples. The only limit to the widespread use of this type of marker is the need for sequence data to design specific PCR primers. A good example of the application of this technique is the detection of adulteration of one of the ginseng species, Panax quinquefolius, using other cheaper species like P. ginseng [21]. 
A 25 bp insertion is present in the SCAR of P. ginseng but is lacking in P. quinquefolius. Primers designed from this sequence were able to authenticate six Panax species and two adulterants.
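The primer criteria mentioned above (a length of 20 to 25 bases and a controlled GC content) can be screened with a few lines of code. The following is only a minimal sketch: the primer sequences are invented for illustration, and the 40-60% GC window is an assumed rule of thumb, not a value given in the text.

```python
def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def primer_passes(primer: str, min_len: int = 20, max_len: int = 25,
                  gc_min: float = 40.0, gc_max: float = 60.0) -> bool:
    """Check primer length (20-25 bases, as in the text) and a GC window.

    The 40-60 % GC window is an assumed rule of thumb, not taken from the text.
    """
    return (min_len <= len(primer) <= max_len
            and gc_min <= gc_content(primer) <= gc_max)


# Hypothetical primer sequences, invented for illustration only.
candidates = {
    "scar_fwd_demo": "ATGCGTACCTGAGCTTAGCCA",   # 21 bases
    "too_short_demo": "ATGCGTACC",              # 9 bases
}
for name, seq in candidates.items():
    print(name, len(seq), round(gc_content(seq), 1), primer_passes(seq))
```

Such a filter only removes obviously unsuitable candidates; the specificity of a retained primer pair still has to be confirmed against sequence databases and experimentally.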

DNA Barcoding
DNA barcoding can be defined as the use of short nuclear DNA sequences, or sequences deriving from intracellular organelles, to identify an organism. This technique has now been used for more than a decade, following the work of Hebert and colleagues [22], who first proposed that the DNA coding for the mitochondrial enzyme CO1 (cytochrome oxidase 1) could be used to generate a DNA barcode in animals. As mitochondrial genes in plants evolve slowly, with very low substitution rates, they are poorly suited to DNA barcode generation. An alternative is the use of nuclear and especially chloroplast genomes, which have much higher substitution rates. In particular, many regions of the chloroplast genome have been evaluated for use in plant systems by "The Consortium for the Barcode of Life Plant Working Group" [23]. For plants, two chloroplast genes, matK and rbcL (chosen in 2009), replace the CO1 gene; they are more suitable because CO1 does not evolve as fast in plants. In fact, a good candidate gene for DNA barcoding is one that evolves rapidly and is found in all organisms. The gene matK, coding for a maturase that excises introns, and the gene rbcL, coding for the large subunit of Rubisco, an enzyme involved in photosynthesis and therefore specific to plants [24], are good candidates. The chloroplast gene rbcL is considered to be universal but has a rather low species resolution, while, conversely, the matK gene is highly discriminating, i.e., it allows for a good separation of plant species. It is therefore interesting to note that the combination of these two markers (i.e., their joint use) could show better results in terms of species discrimination, even when the taxa concerned are genetically very close. More recent work, adding the nuclear ITS region to the matK + rbcL combination, has resulted in a much better ability to discriminate closely related species.
Bruni et al. [25] showed the use of DNA barcoding to distinguish toxic species from nontoxic cultivated species in the Solanum genus (which includes potato and eggplant, but also many species rich in alkaloids). Nevertheless, it has been reported that DNA barcoding does not distinguish between recently diverged species, as observed in a study of DNA barcoding in Lamiaceae of the genera Mentha, Ocimum, Origanum, Salvia, Thymus, and Rosmarinus (analysis using the combination matK + rbcL and ITS) [26]. It should also be mentioned that the use of DNA barcoding depends to a large extent on the amplification of the DNA, which can be degraded. This is one of the most difficult aspects because the characterized sequences are longer than 500 bp, so that degradation can hinder amplification [27]. Therefore, like other classical markers, DNA barcoding should not be considered a universal marker.
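The benefit of combining a universal but poorly resolving locus (rbcL) with a more discriminating one (matK) can be illustrated with a toy comparison. This is a hedged sketch only: the sequences and taxon names are invented, the fragments are assumed to be already aligned, and real barcodes are hundreds of bases long and compared against curated reference databases.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(query: dict, references: dict, loci=("rbcL", "matK")):
    """Score each reference taxon over the concatenated loci; return the best (taxon, score)."""
    q = "".join(query[locus] for locus in loci)
    scores = {
        taxon: identity(q, "".join(seqs[locus] for locus in loci))
        for taxon, seqs in references.items()
    }
    return max(scores.items(), key=lambda item: item[1])


# Toy, invented fragments; species_A and species_B share the same rbcL fragment.
references = {
    "species_A": {"rbcL": "ATGTCACCACAA", "matK": "TTCGAAGGATCC"},
    "species_B": {"rbcL": "ATGTCACCACAA", "matK": "TTCAAAGGTTCC"},
}
query = {"rbcL": "ATGTCACCACAA", "matK": "TTCAAAGGTTCC"}

# With rbcL alone both references score 1.0, so the tie cannot be resolved.
print({t: identity(query["rbcL"], s["rbcL"]) for t, s in references.items()})
# Adding matK separates the two taxa.
print(best_match(query, references))
```

In this toy case the rbcL fragment is identical in both references, so only the joint use of the two loci assigns the query to one taxon.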

Loop Mediated Isothermal Amplification (LAMP)
This recent technique aims to dispense with conventional PCR equipment because the detection is done visually, thanks to a color reaction that is proportional to the amplification of the target DNA, which is carried out at a single, constant temperature. The advantages are its speed and simplicity, which make detection on site possible without going through a laboratory [28][29][30]. LAMP amplification is a molecular biology technique that relies on strand displacement DNA synthesis using a particular polymerase, Bst DNA polymerase [28]. Unlike other DNA markers based on the use of polymerases, here two pairs of primers (internal and external) are needed to amplify the target gene. The specificity, efficiency, and speed of implementation are very high. Most studies agree that this technique can specifically amplify a given gene within a genome, with discrimination down to a single nucleotide. The DNA synthesis (polymerization step catalyzed by Bst polymerase) takes place over one hour at a single temperature between 60 and 66 °C, depending on the length and sequence of the primer pairs used. In addition, this technique does not require specific equipment such as a thermocycler; simple incubators or block heaters are sufficient to provide and maintain the amplification temperature. Separation of the amplified fragments by gel electrophoresis is also not required, since LAMP products can be detected by the turbidity resulting from the production of a large amount of a by-product, the pyrophosphate ion, which forms an insoluble white precipitate of magnesium pyrophosphate. Recently, the measurement of the "end-point" fluorescence emitted by the SYBR green incorporated during amplification has made it possible to increase the precision and reduce the detection threshold. The applications of LAMP markers for plant identification are still rather rare, but published work attests to the relevance of its use in plant authentication studies. 
However, the design of the primers used by LAMP technology is complex and a minimum of two pairs of primers is needed to identify six different regions of sequence of the target DNA region.

Next Generation Sequencing (NGS)
The emergence of reversible chain termination sequencing coupled with high-resolution detection capability, popularized as Next Generation Sequencing (NGS), allows for the simultaneous sequencing of a large number of molecules, thus generating a huge amount of sequence data at once. The technique is based on the principle of fragmentation and immobilization on a solid support of DNA templates, which are then amplified and sequenced. Some platforms use pyrosequencing, in which the pyrophosphate ions released during the addition of nucleotides by the DNA polymerase are detected. NGS technology relies heavily on the use of complex algorithms for filtering, assembling, and analyzing sequences. Many NGS data-analysis tools are now commercially available. It should be noted that the transcriptome of a species can also be analyzed by sequencing its cDNA (RNA sequencing, RNAseq) rather than studying the entire genome. Access to this type of transcriptomic data makes it possible to provide details on the biosynthesis capacities of a particular plant organ grown in a given environment. These data are cell-type specific and depend on the culture conditions, and therefore can be used to identify genes involved in a metabolic pathway of interest for cosmetic applications. In this context, NGS data have already been widely used to identify molecular signatures in the transcriptome related to particular physiological functions of the tissues under consideration. This method is fast but also expensive.

Important Remarks about DNA Markers for Authentication
The genetic profile of medicinal and aromatic plant species will find applications in the quality control of cosmetics obtained from these species, not only by the cosmetics industry, but also by the organizations responsible for monitoring their quality. While technologies based on DNA analysis are undeniably superior to other methods in the authentication process, it should be noted that they have a number of disadvantages in that they require high quality DNA, which is not always compatible with the treatments used during the harvesting, drying, or storage of raw materials. In particular, changes in temperature and/or pH can lead to partial degradation (fragmentation) of the DNA, making PCR-based assays difficult. Samples may also have been contaminated by endophytic fungi that may distort the results obtained by techniques based on the analysis of dominant markers such as RAPD, AFLP, and ISSR markers. This type of contamination can also influence the results obtained by conventional DNA sequencing or NGS-type approaches. However, this can be overcome with a primer design specific to the sample to be analyzed. Many markers, such as the ITS region of the 18S, 5.8S, and 26S ribosomal cistron, may exhibit intraspecific sequence variation due to nonfunctional paralogous sequences (pseudogenes), which compromises their use for DNA barcoding approaches, for which only orthologous sequences can be used. Consequently, cloning steps and sequencing of PCR products are sometimes needed to ensure the identity of the sequences analyzed. In order to develop a marker for the identification of taxa, it is necessary to perform specific DNA analyses of not only closely related species and/or common botanical varieties but also possible contaminants, which may make the process longer, more difficult, and more expensive. 
If one of the major advantages of DNA markers is that they are not tissue-specific, this can also become a problem when possible adulterants come from another part of the same plant. For example, sometimes only the flower of a plant has cosmetic properties, but other parts of it may be mixed in, either voluntarily to increase the mass for profit or unintentionally during harvest. In this type of situation, only a chemical analysis will make it possible to ensure the absence of potentially dangerous compounds.

Analytical Chemistry Techniques Based on Biochemical Patterns
These techniques allow for the discrimination of chemotypes within species, as well as maturity stages and origins. They are often quantitative analyses able to detect frauds by adulteration or substitution, but also insufficient contents in active components due to an inadequate maturation state or incorrect drying or storage. Nowadays, the best authentication protocols are based on analytical chemistry, such as HPLC (discussed later), capillary electrophoresis, or gas chromatography, associated with powerful detection means. For a method to be considered valid, a sampling among a population must be made in order to establish a representative reference pattern. Once identified, some of the main components of the sample will be selected as phytochemical markers, which have to be representative of the species (it is even better if they are specific to this species) and, if possible, responsible for the plant biological activity of interest for cosmetics. As for the comparison of fingerprints in criminology, the more numerous the markers, the higher the level of confidence of the analysis for authentication. Indeed, using several phytochemical markers reduces the possibility of undetected adulteration.
For a valid method, the availability of standards and powerful identification methods, such as mass spectrometry, are both required. Standards (either natural or synthesized) should be commercially available and their purity guaranteed; otherwise, "homemade" standards can be used if they meet these requirements and can be purified in mg-scale quantities. Most often, the validation of the method (assessing precision, specificity, robustness, and reproducibility) is the weak point for the adoption of a method in quality control. The method must also be selective and linear in the required range, and present low limits of detection (LOD) and limits of quantitation (LOQ). Validation can be normalized or acquired only for one lab. In every case, the development of a method is time consuming but provides a valuable tool for the identification and authentication of plant samples. A more global analysis of a phytochemical pattern can be considered, with the help of a statistical analysis such as hierarchical clustering analysis (HCA) or principal component analysis (PCA), in order to reflect the chromatogram as a whole or in part and compare it to a compiled population of reference samples. The main advantage of such an approach is that identifying all the peaks is not required to determine whether a sample is similar to those of the reference panel. Nevertheless, identifying the molecules responsible for the biological activity in the extract should always be considered mandatory.
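As an illustration of the linearity, LOD, and LOQ requirements, the sketch below fits a calibration line and derives the two limits from it. It assumes the widely used convention LOD = 3.3 σ/S and LOQ = 10 σ/S (σ being the residual standard deviation of the calibration line and S its slope); this convention is not specified in the text, and the calibration data are invented.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lod_loq(x, y):
    """Estimate LOD and LOQ as 3.3*sigma/slope and 10*sigma/slope.

    sigma is the residual standard deviation of the calibration line
    (n - 2 degrees of freedom). This convention is an assumption here.
    """
    slope, intercept = linear_fit(x, y)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    sigma = sqrt(sum(r ** 2 for r in residuals) / (len(x) - 2))
    return 3.3 * sigma / slope, 10 * sigma / slope


# Invented calibration data: standard concentration (mg/L) vs. peak area.
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
area = [10.4, 19.8, 50.9, 99.2, 200.6]
lod, loq = lod_loq(conc, area)
print(f"LOD ~ {lod:.2f} mg/L, LOQ ~ {loq:.2f} mg/L")
```

Other estimation approaches exist (for example, those based on the signal-to-noise ratio), and the one retained should be stated in the validation file.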

High Performance Liquid Chromatography (HPLC)
HPLC is undoubtedly the most useful and widespread method for the separation of the molecules constituting a complex mixture. It gives access to both qualitative and quantitative data. Compounds found in minute quantities, such as mycotoxins and pesticide residues, are targets of choice for this method of detection and quantitation [31,32]. This technique uses a stationary phase fixed in a column, through which flows the mobile liquid phase carrying along the molecules of an extract. These molecules are more or less retained, according to their affinity with the stationary phase. Reverse phase columns are well suited to the separation of natural molecules. They are filled with silica grafted with alkyl chains (often C18), which provides good resolution and confers hydrophobicity to the column. Chiral HPLC columns present a particular grafting which allows for the separation of enantiomers, thus making chiral HPLC, along with circular dichroism, one of the few techniques able to distinguish between these isomers. This feature is of great importance for natural compounds, which are frequently found under different isomeric forms, among which only one is active and/or is allowed to reach the market. The mobile phase is generally made of a mixture of solvents, such as alcohol-water or acetonitrile-acidified water, the ratio of which changes during the analysis. Molecules are detected by their absorption ability, most often in the UV domain. A diode array detector (DAD) after the column allows for simultaneous detection and measurement at a number of wavelengths during the separation of the extract components. It helps to differentiate and thus to quantify the different molecules that could be present under the same peak. Although an HPLC-UV analytical protocol can be adapted to a wide variety of samples, UV detection is sometimes not feasible because the absorbance in the UV is too low. In this case, an Evaporative Light Scattering (ELS) detector could be used. 
ELS offers near-universal detection of nonvolatile and semivolatile sample components. It is not dependent on light absorption but relies on the scattering of light by the component after nebulization, and the scattered signal increases with the concentration. Moreover, ELS detectors are not prone to the problems arising from UV absorbing solvents. When using HPLC, the availability of several phytochemical markers makes the authentication more reliable, as it is highly unlikely that the peaks of the adulterant components overlap perfectly with all the peaks of an authentic sample. Although efficient, HPLC methods are often quite long, which is a major drawback when assaying a number of samples. For this reason, Ultra Performance Liquid Chromatography (UPLC) has been developed. Unlike HPLC, it requires only 15 min per sample (instead of as long as 80 min for many HPLC protocols). The same detection methods are applicable, and the maximal resolution and limit of detection (in the ng/mL range) are kept unchanged. On the downside, HPLC is quite expensive in terms of equipment and requires skilled labor for both running the analyses and interpreting the data. For small companies, High-Performance Thin-Layer Chromatography (HPTLC) can provide a cheaper and quicker routine method. As opposed to column chromatography (e.g., GC, HPLC), HPTLC utilizes a flat (planar) stationary phase of silica for separation, and the mobile phase is a mixture of apolar and polar solvents. Silica particles are smaller than for TLC (5-7 µm as compared to 10-15 µm for TLC) and thus allow for a better resolution. Detection is performed under a UV lamp or with a dye solution. Robotized devices are available, as well as software based on densitometry analysis of images, helping identification and quantitation.
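The idea that several phytochemical markers make authentication more reliable can be sketched as a simple fingerprint check: each reference marker must be found in the sample chromatogram at a compatible retention time and relative peak area. This is only an illustrative sketch; the marker names, retention times, areas, and tolerances are all invented assumptions, not values from a validated method.

```python
def missing_markers(reference, sample_peaks, rt_tol=0.10, area_tol=0.30):
    """Return the reference markers that have no matching peak in the sample.

    reference: dict marker name -> (retention time in min, relative peak area).
    sample_peaks: list of (retention time in min, relative peak area).
    rt_tol (min) and area_tol (fraction of the reference area) are assumed tolerances.
    """
    missing = []
    for name, (ref_rt, ref_area) in reference.items():
        found = any(
            abs(rt - ref_rt) <= rt_tol and abs(area - ref_area) <= area_tol * ref_area
            for rt, area in sample_peaks
        )
        if not found:
            missing.append(name)
    return missing


# Invented reference fingerprint and two invented sample chromatograms.
reference = {"marker_1": (4.20, 0.35), "marker_2": (7.85, 0.40), "marker_3": (12.10, 0.25)}
authentic = [(4.23, 0.33), (7.80, 0.42), (12.15, 0.25)]
suspect = [(4.21, 0.34), (7.83, 0.41), (9.60, 0.25)]

print("authentic sample, missing markers:", missing_markers(reference, authentic))
print("suspect sample, missing markers:", missing_markers(reference, suspect))
```

With a single marker the suspect sample could pass unnoticed; requiring all three markers exposes the peak that does not belong to the reference fingerprint.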

Gas Chromatography (GC)
GC is a technique of interest for volatile compounds but is not well suited to molecules having a high molecular mass. The mobile phase is an inert carrier gas (typically helium or nitrogen), which does not react with the molecules to be analyzed and drives the volatile molecules through the stationary liquid or solid phase, which will retain, and thus retard, them more or less depending on their affinity. It is generally coupled with mass spectrometry (see below). The analysis performed by Detaillat et al. [33] used GC to identify the Pinus seed species which had provoked the dysgeusia cases mentioned earlier.

Mass Spectrometry (MS)
To address identification issues, a number of researchers have developed analytical techniques based on spectroscopy. They rely on detection methods with wide spectra, typically of the whole chemical pattern of an extract. One of the first methods of this kind was GC coupled with MS. In this technique, the volatile components of a plant extract are injected into the GC apparatus and, after separation, they are identified by their specific mass. The ratio between identified peaks can be used as a basis for further statistical analysis, allowing for the grouping of samples with similar contents. MS has also been coupled with HPLC. MS relies on the ionization of the molecules; the mass-to-charge ratios (m/z) of the resulting ions and fragments are precisely determined, allowing the structure of a given compound to be determined. Nowadays, high resolution MS (HRMS) can perform this measurement with a high accuracy. A limitation of MS is that, on its own, it generally does not allow for distinguishing isomers, which share the same mass.
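The isomer limitation can be made concrete by computing the monoisotopic mass that HRMS would measure. The sketch below is a simplified illustration: it handles only plain formulas containing C, H, N, and O (no brackets or charges), and the glucose/fructose pair is an example chosen here, not one taken from the text.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C6H12O6' (no brackets)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def mz_protonated(formula: str) -> float:
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


# Two isomers with the same molecular formula (illustrative example).
isomers = {"glucose": "C6H12O6", "fructose": "C6H12O6"}
for name, formula in isomers.items():
    print(name, round(mz_protonated(formula), 4))
```

Both compounds give the same m/z, however accurate the measurement, which is why a prior chromatographic separation or a complementary technique is needed to tell them apart.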

Nuclear Magnetic Resonance (NMR)
NMR is another excellent way to identify compounds. It is used to detect adulterations of beverages and oils [34]. It is based on the capacity of certain atoms, such as carbon and hydrogen, to transiently absorb the energy of electromagnetic radiation and then to re-emit it, by resonance, at a frequency specific to each atom. Proton and carbon resonances are sufficient for most analyses. ¹³C has a natural abundance of about 1%, which is sufficient for its use. Sometimes combined with HPLC, NMR allows for the identification of the molecules of an extract (2D NMR) and the determination of a molecule's structure.

Infrared Spectroscopy (IR)
Fourier-transform infrared spectroscopy (FTIR, 400-4000 cm⁻¹) as well as near infrared spectroscopy (NIR, 4000-12,500 cm⁻¹) have been used to authenticate samples. These IR methods are based on the vibrations (stretching and bending movements) of the bonds in the molecules found in samples. NIR has the advantage of being quick, with the drawback of not allowing for precise knowledge of the content because the resolution is poor. NIR methods can also be affected by the water content of the samples. FTIR provides better spectroscopic resolution but can also suffer from the presence of water in samples. It relies on the ability of the components to diffuse the infrared light, which allows them to be listed in a mixture like oil [35]. Contrary to FTIR and NIR, which are based on the absorbance of IR signals, FT-Raman measures the scattering resulting from excitation at a specific wavelength. This eliminates the water problem and allows the analysis to be performed even through some kinds of wrapping. FTIR and FT-Raman can be extremely useful for the authentication of samples but require a wide and precise knowledge of what each peak represents in relation to the various molecules of the sample.

Metabolomic Approaches
Coupling these spectroscopic methods with chemometric/metabolomic analyses begins with an initial spectroscopic measurement of authentic samples to fill a spectra database, allowing for comparisons. Once the compiled spectra are available, the signals are submitted to alignment, regional exclusion, binning (gathering of the data), normalization, and scaling. Eventually, the use of statistical software based on HCA or PCA allows for the grouping of populations of samples on the basis of similarities within these populations. The major drawbacks of these methods are the lack of a profound understanding of the whole range of signals in the spectra and the need for expensive apparatuses and skilled people to operate them.
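Three of the preprocessing steps listed above (binning, normalization, and scaling) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the spectra are invented and assumed to be already aligned, regional exclusion is omitted, normalization is done to the total signal, and scaling is unit-variance autoscaling, which is only one of several possible choices.

```python
from math import sqrt


def bin_spectrum(intensities, bin_size):
    """Sum consecutive points into bins of bin_size points (the last bin may be shorter)."""
    return [sum(intensities[i:i + bin_size])
            for i in range(0, len(intensities), bin_size)]


def normalize_total(spectrum):
    """Scale a spectrum so that its intensities sum to 1."""
    total = sum(spectrum)
    return [v / total for v in spectrum]


def autoscale(matrix):
    """Mean-center each variable (column) and divide by its standard deviation."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        # Columns with (near) zero variance carry no information: set them to 0.
        scaled_cols.append([(v - mean) / sd if sd > 1e-12 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


# Invented raw spectra (rows = samples, columns = data points).
raw = [
    [1, 3, 2, 6, 4, 4],
    [2, 3, 3, 5, 4, 3],
    [9, 1, 1, 3, 2, 4],
]
binned = [bin_spectrum(s, 2) for s in raw]          # 6 points -> 3 bins
normalized = [normalize_total(s) for s in binned]   # each row now sums to 1
scaled = autoscale(normalized)                      # ready for HCA or PCA
for row in scaled:
    print([round(v, 2) for v in row])
```

The resulting matrix (samples in rows, binned variables in columns) is the kind of input that HCA or PCA software then uses to group samples.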

Isotopic Abundance
Depending on the kind of climate and their type of photosynthesis, plants synthesize compounds whose isotopic composition differs from that of the source of the chemical element (generally CO₂). This is due to isotopic discrimination when enzymes like Rubisco fix the CO₂ or when CO₂ has to enter the leaves through tiny openings, the stomata. As an example, ¹³C will be less abundant in sugars made by the plant than in the CO₂ contained in the air (around 1.1%). This will also differ depending on whether a sugar beet, a vine, or a maize plant biosynthesized the sugar, because the latter has a C4 type of photosynthesis and the vine has probably been grown in a drier climate [36]. Indeed, drought stress is also a variable parameter that influences the isotopic content; thus, it is also possible to discriminate between plants belonging to the same species grown in different years and at different locations, enabling fraud detection for the origin of commodities. This method is already in common use to fight fraud in wines [37] or honeys [38,39], for which frauds about the origin are not rare. It is therefore usable for cosmetic ingredients for which a region of origin is claimed.
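Isotopic depletion of this kind is conventionally reported in delta notation, i.e., as the relative deviation (in per mil) of the ¹³C/¹²C ratio of the sample from that of a reference standard. The sketch below shows the calculation only; the ratios, sample names, and the value used for the standard are illustrative placeholders, not measured or certified values.

```python
def delta13c(r_sample: float, r_standard: float) -> float:
    """Delta value in per mil: (R_sample / R_standard - 1) * 1000, with R = 13C/12C."""
    return (r_sample / r_standard - 1.0) * 1000.0


# Illustrative ratios only. R_STANDARD is a placeholder: in practice it must be
# the ratio of the reference material actually used by the laboratory.
R_STANDARD = 0.0112
samples = {"sample_claimed_origin": 0.01090, "sample_suspect": 0.01105}
for name, ratio in samples.items():
    print(name, round(delta13c(ratio, R_STANDARD), 1))
```

A more negative value indicates a stronger depletion in ¹³C relative to the standard; comparing such values with those of authentic reference samples is what allows a claimed origin or photosynthetic type to be checked.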

Immunological Techniques
Although widespread for the detection of adulteration of animal products (e.g., milks) or of the presence of plant allergens in food (e.g., peanut), immunological techniques such as the ELISA assay are poorly documented for plant commodities. Nevertheless, there are some reports of the use of this technique for medicinal plants such as Saussurea [40]. Indeed, contrary to animal products, which concern mass markets for a few animal species, developing such a test would be economically relevant only for the most common plant species, as it requires specific knowledge of the antigens and the production of antibodies against them.

Conclusions
The choice of technique will depend on the data available in a given domain and on the more or less processed material to be assayed, as well as the kind of adulteration or quality defects to be detected. For example, molecular markers can help to distinguish between two species and thus can detect adulterations with a wrong plant, but they are unable to provide evidence of poor content in an active molecule, the latter requiring analytical chemistry methods. The authentication methods described here require prior knowledge concerning the plants to be identified in order to be able, based on comparisons of morphological criteria, molecular markers, or phytochemical profiles, to affirm a sample's identity or its dissimilarity with the species to which it is supposed to belong. It is therefore important to have access to databases with botanical, genomic, and biochemical information. These data must be generated, which requires extensive research. This work becomes technically and financially feasible with the appearance of technical innovations which allow for data generation at high throughput. While only 20 years ago a global consortium of laboratories and astronomical sums were needed to sequence the (small) genome of Arabidopsis thaliana (L.) Heynh., the first plant sequence to be published [41], it is now possible for only a few thousand euros to sequence any plant because there has been an exponential progress in this area. Similarly, mass spectrometry techniques have in recent years made essential progress. Generating data is not enough; they have to be accessible and allow for easy comparisons. For this purpose, it is necessary to create databases and servers containing all of these data. 
The approach of Cosmetopoeia goes in this direction and should not be limited to botanical and ethno-botanical data but should also include the implementation of the development of these tools in order to make these data accessible to a large consortium of private and public actors [42].