Development of a DNA Metabarcoding Method for the Identification of Insects in Food

Insects have the potential to become an efficient and reliable food source for humans in the future and could contribute to solving problems with the current food chain. Analytical methods to verify the authenticity of foods are essential for consumer acceptance. We present a DNA metabarcoding method that enables the identification and differentiation of insects in food. The method, developed on Illumina platforms, targets a 200 bp mitochondrial 16S rDNA fragment, which we found to be suitable for distinguishing more than 1000 insect species. We designed a novel universal primer pair for a singleplex PCR assay. Individual DNA extracts from reference samples, DNA extracts from model foods and commercially available food products were investigated. In all of the samples investigated, the insect species were correctly identified. The developed DNA metabarcoding method has a high potential to identify and differentiate insect DNA in the context of food authentication in routine analysis.


Introduction
Limited land available for livestock production, increasing greenhouse gas emissions and a rising world population make it challenging to meet the growing demand for meat and protein worldwide [1]. Insects have the potential to become a sustainable, efficient, and reliable source of food for humans and to overcome some of the burdens of the meat industry [2]. Depending on their species and state of metamorphosis, insects can contain remarkable amounts of proteins, calories, fat, vitamins, and minerals, and can therefore complement or even replace meat in the human diet [3]. In large parts of the world such as Africa, Asia and South America, entomophagy (the consumption of insects as a food source for humans) is common in traditional cuisine [1]. Insects are widespread in many regions worldwide and comparatively easy to propagate. For many years, they have been among the most important sources of nutrients, especially for developing countries that are regularly affected by starvation [3,4]. In Europe, entomophagy is not yet widespread, but since the beginning of the 21st century, public and economic interest has been growing due to EU subsidies.
The production and placing on the market of insects and parts thereof are regulated in Europe by the legislation on Novel Foods [5]. The yellow mealworm (Tenebrio molitor), or rather its larva, was the first insect to be approved in the EU, followed by the migratory locust (Locusta migratoria), the house cricket (Acheta domesticus), and the buffalo worm larvae (Alphitobius diaperinus) [6–9]. Further insect species can be expected to be approved by the EU. Insects can not only be an intentional component in food products but can also occur unintentionally as storage pests [3]. Due to possible production of allergenic substances and symbiosis with mycotoxins, the presence of pests can be of great importance to human health [10]. Food authenticity is important in terms of food fraud, the quality and safety of ingredients and cross-contamination. Food can be considered authentic if it is in its original state and complies with its declaration. Premium or high-priced products are especially prone to adulteration with cheap or low-quality ingredients and thus need to be verified by analytical methods to support control [11]. At least the insects recently approved for food use should be detectable and discriminable from other insect species. DNA-based analytical methods are gaining importance because they enable specific and fast analysis and have a broad range of applications. DNA is not only present in almost all foods but is also quite heat tolerant and can therefore be used as a parameter in processed foods [12]. Polymerase chain reaction (PCR) assays for insect identification have been developed in singleplex and multiplex PCR format [13,14]. However, the number of detectable insects is low because only a small number of detection methods for insects in food currently exists. Multiplex methods are also limited by the number of optical channels in the detection unit of the real-time PCR device.
In addition, a method should allow the analysis of highly processed foods, but published assays may fail to amplify degraded DNA because the designed primer system forms PCR products that are too long. These limitations can be overcome by using methods of barcode sequencing with universal primer systems. DNA barcodes are usually composed of conserved regions at both ends and a variable part between the primer binding sites to discriminate between the species of interest [15,16]. In traditional DNA barcoding, PCR products gained through amplification of the designated DNA barcode region, e.g., the cytochrome oxidase I gene [17], are then subjected to Sanger sequencing [18–20].
To increase the efficiency of this method, a combination of DNA barcoding with next-generation sequencing (NGS) is favorable [21,22]. So-called DNA metabarcoding enables the simultaneous detection of a larger number of species and their identification through reference sequences [23,24]. For this purpose, correct database entries are required [25]. DNA metabarcoding methods to identify and differentiate species have already been developed and published, e.g., for the detection of mammals and birds in foods, and of bivalve species in seafood [26–29].
In this study, we aimed to develop a DNA metabarcoding method that uses relatively short PCR products of approximately 200 bp in length, allowing identification and differentiation of insect species in processed food products. The method was developed using the Illumina MiSeq® and iSeq® platforms.

Samples
Insect samples (pure material of individual species) from the Institute for Sustainable Plant Production, Vienna, Austrian Agency for Health and Food Safety (AGES) and insect-containing food obtained from supermarkets and online shops were used for the experiments. Experts at the Institute for Sustainable Plant Production confirmed the identity of the insect species used as reference samples. Preferably, reference samples were ordered alive or alternatively already dried or frozen. Furthermore, self-made insect cookies and burgers (which have served as model foods in previous studies) were obtained from the Food Control Authority of the Canton of Zurich, Zurich, Switzerland. The cookies contained three insect species in equal proportions, while the burgers had an asynchronous composition from 0.1 to 10% [14]. All samples were kept at a temperature of −20 °C until DNA extraction was performed. The samples used for the development of the method, including the four insect species commonly consumed in Europe, are listed in Table 1. The selection criterion was the affiliation of these insects to the main representatives of edible insects [1].

DNA Extraction
First, all samples were either cut into smaller pieces or homogenized in a mortar or lab mill. The samples were then lysed in the presence of a hexadecyltrimethylammonium bromide/polyvinylpyrrolidone extraction solution (CTAB/PVP buffer) and proteinase K at elevated temperature under constant shaking. DNA extraction was then performed using a commercially available kit. The Maxwell RSC Pure-Food GMO and Authentication Kit from Promega (Madison, WI, USA) and the Maxwell® 16 instrument (Promega, Madison, WI, USA) were used for DNA isolation following the manufacturer's instructions. The DNA extraction procedure was verified by including negative and positive extraction controls. The yield of the DNA extracts was measured fluorometrically using the Qubit® 2.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). The Qubit® dsDNA broad range assay kit (2–1000 ng) and, for low DNA concentrations, the Qubit® dsDNA high sensitivity assay kit (0.2–100 ng) were used according to the manufacturer's protocol. The purity of the DNA was also checked using the ratio of the absorption at 260 and 280 nm (QIAxpert spectrophotometer, Qiagen, Hilden, Germany). The DNA extracts were frozen at −20 °C until further use.

Reference Sequences and Primer Design
We used the "Worldwide list of recorded edible insects, Jongema, 2017" as the basis for our search for insect DNA sequences. Reference sequences in FASTA format for individual species were downloaded from the National Center for Biotechnology Information (NCBI, Bethesda, MD, USA) and imported into the CLC Genomics Workbench software (version 11, Qiagen, Hilden, Germany). Preferably, entire mitochondrial DNA sequences were derived from the NCBI RefSeq database due to their expert-proven reliability. The sequences of the mitochondrial 16S rDNA were extracted from the complete genomes and multiply aligned using the default settings of the CLC Genomics Workbench software. The primers were designed manually on the basis of this multiple sequence alignment. Four forward and three reverse primers were designed and tested in 12 combinations to amplify a ~200 bp barcode region of mitochondrial 16S ribosomal DNA from different insect species. The sequences of the primers tested are shown in Table 2. The formation of primer dimers was checked using the OligoAnalyzer Tool provided by Integrated DNA Technologies (IDT, Coralville, IA, USA). Calculations of the annealing temperature of the primers were performed using specialized computer programs as displayed in the TIB Molbiol product description (Berlin, Germany). The target-specific primers, including the overhang adapter sequences, were purchased from TIB Molbiol (Berlin, Germany).

Table 2. Primer sequences tested in this study.

Illumina Overhang Adapter Sequences
Forward: TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
Reverse: GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
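The way the overhang adapters combine with the locus-specific primers, and why four forward and three reverse primers give 12 pairs, can be sketched as follows. This is only an illustration: the locus-specific sequences are placeholders (runs of N), not the primers of this study, and the primer names other than Fwd-I-3 and Rev-I-1 are assumed from the naming pattern.

```python
from itertools import product

# Illumina overhang adapter sequences as listed in Table 2.
FWD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# Hypothetical placeholders: NOT the locus-specific primer sequences of this study.
# Names follow the pattern of Fwd-I-3 / Rev-I-1 and are otherwise assumed.
forward_primers = {f"Fwd-I-{i}": "N" * 20 for i in range(1, 5)}  # four forward primers
reverse_primers = {f"Rev-I-{i}": "N" * 20 for i in range(1, 4)}  # three reverse primers


def with_overhang(overhang: str, locus_specific: str) -> str:
    """Return the full oligo: 5' overhang adapter followed by the locus-specific part."""
    return overhang + locus_specific


# Every forward primer is paired with every reverse primer.
combinations = [
    (fwd_name, rev_name,
     with_overhang(FWD_OVERHANG, fwd_seq),
     with_overhang(REV_OVERHANG, rev_seq))
    for (fwd_name, fwd_seq), (rev_name, rev_seq)
    in product(forward_primers.items(), reverse_primers.items())
]

print(len(combinations), "primer pairs to test")
for fwd_name, rev_name, _, _ in combinations[:3]:
    print(fwd_name, "+", rev_name)
```
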
To verify successful amplification with the designed primers, real-time PCR of DNA from positive controls was performed (PCR results of the reference samples are shown in Figure S1, Supplementary Materials). The individual DNA extracts of the insect species were used as reference samples or positive controls (see Table 1). During PCR optimization, the DNA input amount of 12.5 ng and the amount of 'ready-to-use' HotStarTaq Master Mix Kit from Qiagen (Hilden, Germany) were kept constant and applied as previously published [26]. The annealing temperature (58–62 °C), primer concentrations (final concentrations 0.1–0.8 µM), the addition of magnesium chloride solution (1.5 mM or 3 mM MgCl2) and PCR cycle numbers (30, 35 and 40) were varied. Real-time PCR reactions were carried out using a fluorescent intercalating dye (EvaGreen®, 20× in water) in 96-well plates on the LightCycler® 480 System (Roche, Penzberg, Germany). The correct length of the PCR products was checked by agarose gel electrophoresis, and melting curve analysis was used to detect any non-specific artifacts. The volume of the PCR reactions was 25 µL, made up of 22.5 µL reaction mix and 2.5 µL of diluted DNA sample (5 ng/µL) as template. For the no-template control (NTC), water was used instead of DNA. Possible contamination was checked by including negative extraction controls. The reaction mixture comprised Master Mix with fluorescent dye, primers, nuclease-free water and, where applicable, additional magnesium chloride solution.
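The volumes stated above are internally consistent: 2.5 µL of template at 5 ng/µL gives the 12.5 ng DNA input, and 22.5 µL of reaction mix completes the 25 µL reaction. The sketch below works through this arithmetic for one optimization condition. The stock concentrations (a 2× master mix, 10 µM primer stocks, a 25 mM MgCl2 stock) are assumptions made for illustration and are not given in the text.

```python
def reaction_setup(primer_final_um, extra_mgcl2_mm=0.0,
                   total_ul=25.0, template_ul=2.5, template_ng_per_ul=5.0,
                   primer_stock_um=10.0,   # assumed stock concentration
                   mgcl2_stock_mm=25.0,    # assumed stock concentration
                   master_mix_fold=2.0,    # assumed 2x master mix
                   dye_fold=20.0):         # EvaGreen is stated as 20x
    """Return component volumes (µL) for one real-time PCR reaction."""
    master_mix_ul = total_ul / master_mix_fold
    dye_ul = total_ul / dye_fold
    # C1 * V1 = C2 * V2 for each primer and for the added MgCl2.
    primer_each_ul = primer_final_um * total_ul / primer_stock_um
    mgcl2_ul = extra_mgcl2_mm * total_ul / mgcl2_stock_mm
    water_ul = (total_ul - template_ul - master_mix_ul - dye_ul
                - 2 * primer_each_ul - mgcl2_ul)
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "master_mix_ul": master_mix_ul,
        "dye_ul": dye_ul,
        "primer_each_ul": primer_each_ul,
        "mgcl2_ul": mgcl2_ul,
        "water_ul": water_ul,
        "template_ul": template_ul,
        "reaction_mix_ul": total_ul - template_ul,
        "template_ng": template_ul * template_ng_per_ul,
    }


# Upper end of the tested ranges: 0.8 µM primers, 3 mM additional MgCl2.
setup = reaction_setup(primer_final_um=0.8, extra_mgcl2_mm=3.0)
for name, value in setup.items():
    print(f"{name}: {value:.2f}")
```

With other stock concentrations only the primer, MgCl2 and water volumes change; the template input and the 22.5 µL reaction mix volume stay fixed.
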

Library Preparation and NGS
DNA sequencing of the samples was performed using the MiSeq® and iSeq® 100 platforms from Illumina (San Diego, CA, USA). The DNA extracts were typically diluted to a concentration of 5 ng/µL; those with a lower concentration were used undiluted.
DNA libraries were prepared as described previously [26] with minor modifications (magnetic beads volume: 36 µL; average library size: 226 bp; the iSeq® 100 instrument denatured the diluted libraries automatically during the sequencing process). The DNA library was diluted with 10 mM Tris-HCl at pH 8.6 to a concentration of 4 nM (MiSeq®) or 1 nM (iSeq® 100). The concentration of the pooled DNA libraries (5 µL for MiSeq® and 7 µL for iSeq® 100) was measured using the Qubit® 4.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). All paired-end sequencing runs were carried out using either the iSeq® 100 i1 Reagent v2 (300 cycles) or the MiSeq® Reagent Kit v2 (300 cycles) at a final loading concentration of 8 pM. A 5% PhiX spike-in was used as sequencing control.
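Diluting a library to 4 nM or 1 nM requires converting the fluorometric mass concentration into a molar one. The text does not state which formula was used; the sketch below applies the common approximation of 660 g/mol per base pair of double-stranded DNA together with the reported average library size of 226 bp. The example concentration of 2.0 ng/µL and the 20 µL final volume are hypothetical.

```python
AVG_MASS_PER_BP = 660.0  # g/mol per base pair of dsDNA (common approximation)


def library_molarity_nm(conc_ng_per_ul: float, avg_size_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/µL to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * avg_size_bp)


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float):
    """Return (library µL, diluent µL) needed to reach target_nm in final_ul."""
    if target_nm > stock_nm:
        raise ValueError("library is already more dilute than the target")
    library_ul = target_nm * final_ul / stock_nm
    return library_ul, final_ul - library_ul


# Hypothetical library measured at 2.0 ng/µL, average size 226 bp.
stock_nm = library_molarity_nm(2.0, 226)
lib_ul, diluent_ul = dilution_volumes(stock_nm, target_nm=4.0, final_ul=20.0)
print(f"stock: {stock_nm:.2f} nM")
print(f"for 4 nM in 20 µL: {lib_ul:.2f} µL library + {diluent_ul:.2f} µL 10 mM Tris-HCl")
```
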
Reference samples and the DNA extracts from model foods were sequenced on both sequencing platforms (two sequencing runs, one replicate per run). The commercial food products were sequenced with the MiSeq® or the iSeq® 100 platform (Illumina, San Diego, CA, USA). The obtained DNA sequences of the reference samples were compared after sequencing on the MiSeq® and the iSeq® 100 platform for each individual reference sample. The data of the sequence comparison are presented in Supplementary Figure S2.

Results and Discussion
The purpose of the present study was to develop a DNA metabarcoding method that can be used for the authentication of various insect species and products thereof. The samples tested consisted of 18 reference samples, DNA extracts from insect cookies and burgers, as well as from commercial food products.
To enable detection of species in both raw and processed insect-containing food products, we focused our search on DNA barcodes no longer than 200 bp. Mitochondrial DNA, in particular the mitochondrial 16S ribosomal DNA gene, was chosen as a source of markers since we have already used this gene for our mammalian and poultry assay. Furthermore, the DNA libraries should be sequenced with 300-cycle Illumina reagent kits to allow for the simultaneous analysis of insect samples along with those of mammalian and poultry species using the recently published DNA metabarcoding method [26]. We designed primers targeting a region of the mitochondrial 16S ribosomal RNA gene (Table 2). All primers were tested for their applicability on the different DNA extracts from reference samples (Table 1). With the primer pair Fwd-I-3 and Rev-I-1, a high amount of PCR product of the expected length was obtained, and this primer set was therefore considered applicable for use in practice. The optimal PCR conditions were determined as follows: HotStarTaq […]
The pairwise comparison tool of the CLC Genomics Workbench software was used to compare the selected DNA barcode region of 1100 insect species to identify similarities and differences. A typical graphical representation of a sequence comparison of DNA sequences from the 18 reference samples is shown in Figure 2. A color scheme is used to highlight the relationship between the DNA barcode regions, with blue representing differences and dark red representing high similarity in the variable region of the DNA sequences. Analysis of the data showed that 92% of all insects (DNA barcodes) under investigation can be discriminated from each other.
The sequence alignment data revealed that the selected DNA barcode region cannot discriminate between all species of the following genera: Drosophila spp., Chrysomya spp., Bactrocera spp., Cheumatopsyche spp., Sinopodisma spp., Fruhstorferiola spp., Chorthippus spp., Stenocatantops spp., Gomphocerus spp., Traulia spp., Filchnerella spp., Bryodema spp., Oedaleus spp., Tetrix spp., Cryptolestes spp., Anax spp., Euphaea spp., Actias spp., Dendrolimus spp., Ostrinia spp., Magicicada spp., Bombus spp., Bryodemella spp., Culex spp., Pomacea spp., Pontia spp., Rapisma spp. and Vespa spp. Furthermore, the following pairings cannot be distinguished with the developed marker system, because the base sequence in the variable region of the amplified barcode is identical:
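The idea behind the in silico discriminability check can be illustrated with a minimal sketch. It is not the CLC Genomics Workbench pairwise comparison: it simply groups species whose variable barcode region is identical and reports the share of barcodes that are unique. Taking "discriminable" to mean "not identical to any other barcode in the set" is an assumption, and the species names and sequences are toy data.

```python
from collections import defaultdict


def discriminable_fraction(barcodes: dict):
    """Return (fraction of species with a unique barcode, groups sharing a barcode).

    barcodes maps a species name to the variable region of its barcode.
    """
    groups = defaultdict(list)
    for species, seq in barcodes.items():
        groups[seq.upper()].append(species)
    unique = sum(1 for members in groups.values() if len(members) == 1)
    shared = sorted(sorted(members) for members in groups.values() if len(members) > 1)
    return unique / len(barcodes), shared


# Toy data (hypothetical sequences, not taken from the reference database).
toy_barcodes = {
    "species_A": "ACGTTGCAAT",
    "species_B": "ACGTTGCTAT",
    "species_C": "TTGACCGTAA",
    "species_D": "GGCATTACGA",
    "species_E": "GGCATTACGA",  # identical to species_D, so not distinguishable
}

fraction, shared_groups = discriminable_fraction(toy_barcodes)
print(f"{fraction:.0%} of barcodes are unique")
print("indistinguishable groups:", shared_groups)
```
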

Analysis of DNA Extracts from Reference Samples
The optimized DNA metabarcoding method was applied to identify insect species in individual DNA extracts from reference samples. The results obtained by DNA metabarcoding are shown in Table 3. The table displays average values for the total raw reads, the total reads that passed the analysis pipeline in Galaxy, and the total reads correctly assigned to the eighteen species (based on two replicates, with one exception for "Pachnoda marginata"). The number of correctly assigned reads ranged from approximately 17,000 to 148,000 for the samples included in the sequencing experiment, resulting in a clear identification of the insect species. The four EU-approved edible insect species and all other insect species tested were identified at a high rate (>97% identity with reference sequences) using this workflow. The comparison of the DNA sequences after sequencing of the reference samples on the MiSeq® and the iSeq® 100 platform showed no deviation (Figure S2). Although all of the eighteen reference samples were correctly assigned on both sequencing platforms, in the case of Galleria mellonella (Greater wax moth), Gryllodes sigillatus (Tropical house cricket), Plodia interpunctella (Indian meal moth), and Lethocerus indicus (Water bug), the sequences obtained by next-generation sequencing were not identical to the reference sequences. There were up to four mismatches between the reads of the individual representative sequences and the corresponding reference sequences in the user-defined database imported from NCBI, indicating gaps or errors in the database (Figure S2).
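To see why up to four mismatches still allow a clear assignment, note that four mismatches over a barcode of roughly 200 bp correspond to 98% identity, which is above the 97% threshold. The sketch below shows a simplified, ungapped best-hit assignment. It is not the Galaxy pipeline used in the study, and the reference sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped identity (%) between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def assign(read: str, references: dict, min_identity: float = 97.0):
    """Return (best species or None, best identity) for one representative read."""
    best_name, best_identity = None, 0.0
    for name, ref in references.items():
        identity = percent_identity(read, ref)
        if identity > best_identity:
            best_name, best_identity = name, identity
    if best_identity < min_identity:
        return None, best_identity
    return best_name, best_identity


# Hypothetical 200 bp references (not real 16S rDNA sequences).
references = {
    "species_A": "ACGT" * 50,
    "species_B": "TTGCA" * 40,
}

# A read that differs from species_A at four positions.
read = list(references["species_A"])
for pos in (0, 50, 100, 150):
    read[pos] = "T"
read = "".join(read)

species, identity = assign(read, references)
print(species, f"{identity:.1f}%")
```
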

Analysis of DNA Extracts from Model Foods
We investigated the suitability of the DNA metabarcoding method for processed and heat-treated food samples with known insect species composition. To this end, we analyzed model food products (five insect cookies and four insect burgers) that a Swiss laboratory had prepared in a previous research project. Detailed product information is given in Köppel et al., 2019 [14]. In general, the cookies and burgers contained three insect species (Tenebrio molitor, Acheta domesticus, Locusta migratoria) in proportions from 0.1 to 10.0% (w/w) in the presence of wheat flour or ground meat, respectively. The results obtained for the nine model foods are summarized in Table 4. The DNA metabarcoding method allowed the correct and sensitive identification of all insect species present down to a spiking level of 0.1% in model food samples. It was also shown that the barcode developed, with a length of 200 base pairs, allows discrimination of the three insect species in the heat-treated model samples, even if the spiking material was prepared asynchronously. These results indicate that the DNA metabarcoding method based on the primer set Fwd-I-3 and Rev-I-1 is applicable for the detection of insect species in processed food products.

Analysis of Insect Samples Commercially Available
To assess the suitability of our DNA metabarcoding method for commercially available foods, 38 food products declared to contain insects were analyzed. According to the declaration, 23 samples (1–23) should contain buffalo worm species, four samples (24–27) should contain mealworm species, and 11 samples (28–38) should contain cricket species. In order to represent a wide spectrum of available products, both pure insect samples in dried, milled or roasted form and mixed products with a very low insect content of only 0.1% were selected. The results showed that insect DNA could be detected in all samples, and the number of correctly assigned sequences was at least approximately 73,000 reads. Table 5 summarizes the results obtained for the 38 commercial food products from supermarkets and online stores. Our results confirmed the presence of the three species according to their declaration and the suitability of the method for the identification of insect components down to a content of only 0.1%.

Conclusions
DNA metabarcoding is considered an advanced tool for monitoring food authenticity, and a reference method for the detection of animal species and birds has already been developed [29]. In this study we developed a DNA metabarcoding method that has great potential for identifying insect species in food and can serve as an effective screening method for species authentication in food products that may contain insects. A singleplex PCR assay was developed for the amplification of the short DNA target region of the mitochondrial 16S rDNA gene that serves as a DNA barcode. The applicability of the novel DNA metabarcoding method was investigated by analyzing individual DNA extracts from reference samples, nine heat-treated model foods, as well as DNA extracts from 38 commercially available food products. Analysis of the tested samples demonstrated that the method is suitable for insect identification, even in processed or complex foods, down to an insect content of only 0.1%. This sensitivity was also achieved in model foods with asynchronous composition of insect-containing ingredients. The 38 commercial foods with declared insect ingredients, including compound products and pure products in dried, roasted and powdered form, were checked for correct labeling. Notably, the declared insect ingredient was confirmed in all 38 commercial products tested.
For many insects, PCR methods are lacking for reliable detection, so a major advantage of DNA metabarcoding is the simultaneous detection of a large number of insect species in one testing approach. To determine further performance parameters of the DNA metabarcoding method presented and to assess whether the method also allows semi-quantitative statements, an interlaboratory validation of the method should be carried out. Although the insect species currently relevant in food production were clearly detected, successful differentiation on species level was not possible for all samples examined in silico.
Limiting factors for the application of metabarcoding methods are still the lack of sequencing equipment in laboratories and gaps in sequence database content, especially for insect species. The method is transferable between platforms (it runs successfully on both Illumina MiSeq® and iSeq® 100 instruments), and by combining different applications (joint sequencing of plant and animal species, bacteria, etc.), the costs can be kept sufficiently low should laboratories consider purchasing such equipment.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods12051086/s1, Figure S1: amplification plots Insect reference material; Figure S2: sequence alignments Insect reference material; Table S1: database entries Insect DNA sequences.

Institutional Review Board Statement:
Ethical review and approval was waived for this study as no live animals were used or slaughtered to achieve the aims of the study.

Data Availability Statement:
The datasets generated during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.