Complete Mitogenomes of Three Carangidae (Perciformes) Fishes: Genome Description and Phylogenetic Considerations

Carangidae are ecologically and economically important marine fish. The complete mitogenomes of three Carangidae species (Alectis indicus, Decapterus tabl, and Alepes djedaba) were sequenced, characterized, and compared with 29 other species of the family Carangidae in this study. The length of the three mitogenomes ranged from 16,530 to 16,610 bp, and the structures included 2 rRNA genes (12S rRNA and 16S rRNA), 1 control region (a non-coding region), 13 protein-coding genes, and 22 tRNA genes. Among the 22 tRNA genes, only tRNA-Ser (GCT) was not folded into a typical cloverleaf secondary structure and had no recognizable DHU stem. The full-length sequences and protein-coding genes (PCGs) of the mitogenomes of the three species all had obvious AT biases. The majority of the AT-skew and GC-skew values of the PCGs among the three species were negative, demonstrating bases T and C were more plentiful than A and G. Analyses of Ka/Ks and overall p-genetic distance demonstrated that ATP8 showed the highest evolutionary rate and COXI/COXII were the most conserved genes in the three species. The phylogenetic tree based on PCGs sequences of mitogenomes using maximum likelihood and Bayesian inference analyses showed that three clades were divided corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae. The monophyly of each superfamily was generally well supported. The divergence time analyses showed that Carangidae evolved during three geological periods, the Cretaceous, Paleogene, and Neogene. A. indicus began to differentiate from other species about 27.20 million years ago (Mya) in the early Miocene, while D. tabl (21.25 Mya) and A. djedaba (14.67 Mya) differentiated in the middle Oligocene.


Introduction
The typical mitochondrial genome (mitogenome) of fish consists of circular double-stranded DNA molecules totaling 16~19 kb, which are composed of 37 genes comprising two rRNA genes (12S

Genome Organization and Base Composition
The complete mitogenomes of A. indicus, D. tabl, and A. djedaba were 16,553, 16,545, and 16,563 bp in length, respectively. Most genes were encoded by the heavy (H) chain, except for eight tRNA genes (Tyr, Cys, Glu, Pro, Ala, Asn, Ser, and Gln) and the ND6 gene, which were encoded by the light (L) chain ( Figure 1).
The GC content of the 37 genes varied from 29.85% to 58.33%, and eleven intergenic spacers (IGSs) were identified totaling 66 bp in all three fishes ( Table 1). The largest IGS was located on the L strand between tRNA-Cys and tRNA-Asn, which was 38, 37, and 37 bp in A. indicus, D. tabl, and A. djedaba, respectively ( Table 1). The overlap of mitochondrial genes made the genome even tighter. Seven overlaps were identified in each species, totaling 25 bp in A. indicus, 22 bp in D. tabl, and 19 bp in A. djedaba (Table 1). The GC content of the 37 genes varied from 29.85% to 58.33%, and eleven intergenic spacers (IGSs) were identified totaling 66 bp in all three fishes ( Table 1). The largest IGS was located on the L strand between tRNA-Cys and tRNA-Asn, which was 38, 37, and 37 bp in A. indicus, D. tabl, and A. djedaba, respectively ( Table 1). The overlap of mitochondrial genes made the genome even tighter. Seven overlaps were identified in each species, totaling 25 bp in A. indicus, 22 bp in D. tabl, and 19 bp in A. djedaba (Table 1).  In incremental order, the average base compositions of the three complete mitogenomes were 16.64% G, 25.67% T, 27.74% A, and 29.94% C (Table S1), showing that the C and A content was much higher than the G and T content. This phenomenon has also been found in other species of the family Carangidae, all of which have negative AT-skew and positive GC-skew values (Table S1), showing significant chain-specific or chain asymmetry bias. The mechanism underlying chain bias was often interpreted as evidence of asymmetrical directional mutation stress [20]. The genes with the highest average GC content were identified in tRNA-Thr (52.78-58.33%) and tRNA-Ser (50.75-57.35%) ( Table 1).

Protein-Coding Genes (PCGs)
The 13 PCGs in the mitogenomes ranged from 165 (ATP8) to 1839 (ND5) bp (Table S2). The total length of the 13 PCGs ranged from 11,425 to 11,428 bp (Table S2), accounting for 68.99-69.05% of the entire mitogenomes. Most of the PCGs started with a typical ATG codon, except for ATP6 (in A. djedaba) and COXI (in all three species), which started with ATA and GTG, respectively (Table 1). TAA or TAG was the termination codon in most PCGs and TAA was the most frequently used codon. However, the CYTB, ND2, ND3, ND4, COX2, and COX3 genes in A. indicus and D. tabl, and the COX2, ND4, and CYTB genes in A. djedaba had incomplete termination codons TA or T ( Table 1).
The majority of the AT-skew and GC-skew values of the PCGs among the three species were negative, demonstrating that bases T and C were more plentiful than A and G ( Figure 2). The lowest AT-skew and highest GC-skew values were all found in the ND6 gene, which is unusual but similar to other observations of strand asymmetry [21][22][23]. The AT-skews for the concatenated sequence of the PCGs in all 32 species were negative, completely different from those of the entire genome. The GC-skews were all less than zero, which is identical to the complete genome (Table S1). Therefore, in view of the strange phenomenon of an AT-skew < 1 in the protein coding sequence, we analyzed the base composition of different positions of codons in the 13 PCGs, and found that the content of base T in the second codon reached 40.5% (Table S2), resulting in the increase of T content in the whole protein coding sequence, which explained the phenomenon of AT-skew < 1 in the protein coding sequence. We used the ratio of non-synonymous/synonymous mutations (Ka/Ks) value and overall pgenetic distance to describe the evolutionary rate and conservation of PCGs. The average Ka/Ks was 0.0253 in the 13 PCGs of the three Carangidae and ranged from 0.0059 (COXI) to 0.0967 (ATP8) ( Figure  3). The Ka/Ks values of all PCGs were < 1, suggesting purifying selection on the functional genes [24]. In theory, purified selection would eliminate harmful mutations in the population [25]. The ATP8 gene had the highest Ka/Ks value, and it is also found in fungi [26], flies [27], snails [28], and insects [29]. The COXI gene had the lowest value, which showed it was the most conserved gene. The highest overall p-genetic distances were in the ND6 and ATP8 genes (0.197 and 0.200, respectively), while the lowest were in the three cytochrome oxidase genes (COXI: 0.138, COXII: 0.128, COXIII: 0.135) ( Figure  4) based on comparisons of the full-length PCGs. This suggested lowest evolutionary rates in the three cytochrome oxidase genes and higher rates in ATP8 and ND6. Overall p-genetic distances once again showed that the ATP8 gene had the highest rate of evolution and was the least conserved gene.  We used the ratio of non-synonymous/synonymous mutations (Ka/Ks) value and overall p-genetic distance to describe the evolutionary rate and conservation of PCGs. The average Ka/Ks was 0.0253 in the 13 PCGs of the three Carangidae and ranged from 0.0059 (COXI) to 0.0967 (ATP8) (Figure 3). The Ka/Ks values of all PCGs were < 1, suggesting purifying selection on the functional genes [24]. In theory, purified selection would eliminate harmful mutations in the population [25]. The ATP8 gene had the highest Ka/Ks value, and it is also found in fungi [26], flies [27], snails [28], and insects [29]. The COXI gene had the lowest value, which showed it was the most conserved gene. The highest overall p-genetic distances were in the ND6 and ATP8 genes (0.197 and 0.200, respectively), while the lowest were in the three cytochrome oxidase genes (COXI: 0.138, COXII: 0.128, COXIII: 0.135) ( Figure 4) based on comparisons of the full-length PCGs. This suggested lowest evolutionary rates in the three cytochrome oxidase genes and higher rates in ATP8 and ND6. Overall p-genetic distances once again showed that the ATP8 gene had the highest rate of evolution and was the least conserved gene. We used the ratio of non-synonymous/synonymous mutations (Ka/Ks) value and overall pgenetic distance to describe the evolutionary rate and conservation of PCGs. The average Ka/Ks was 0.0253 in the 13 PCGs of the three Carangidae and ranged from 0.0059 (COXI) to 0.0967 (ATP8) ( Figure  3). The Ka/Ks values of all PCGs were < 1, suggesting purifying selection on the functional genes [24]. In theory, purified selection would eliminate harmful mutations in the population [25]. The ATP8 gene had the highest Ka/Ks value, and it is also found in fungi [26], flies [27], snails [28], and insects [29]. The COXI gene had the lowest value, which showed it was the most conserved gene. The highest overall p-genetic distances were in the ND6 and ATP8 genes (0.197 and 0.200, respectively), while the lowest were in the three cytochrome oxidase genes (COXI: 0.138, COXII: 0.128, COXIII: 0.135) ( Figure  4) based on comparisons of the full-length PCGs. This suggested lowest evolutionary rates in the three cytochrome oxidase genes and higher rates in ATP8 and ND6. Overall p-genetic distances once again showed that the ATP8 gene had the highest rate of evolution and was the least conserved gene.   Note: Values were calculated using the first and second nucleotide positions, the entire codon, and over the full sequence.

Mitochondrial Gene Codon Usage
Serine and leucine were encoded by six codons each, while the other amino acids were encoded by four or two codons ( Figure 5). Codon frequency referred to the frequency of codon occurrence in all PCGs. Leu (CUC, CUA), Phe (UUC), Ile (AUC), and Ala (GCC) were the most frequent codons, while termination codons (UAG) and Cys (UGU) were the least frequent ( Figure 5). The relative synonymous codon usage (RSCU) refers to the relative probability that a particular codon was encoded in a synonymous codon of the corresponding amino acid. Arg (CGA), Lys (AAA), Gln (CAA), and Ser (TCC) were the most frequent, while Ser (AGT, TCG), Thr (ACG), Leu (TTG), Ala (GCG), and Pro (CCG) were rare in the RSCU analysis ( Figure 5). The chain asymmetry caused by the difference in base composition could also affect codon usage. C and A bases were more frequent in the third positions of codons than were U and G bases (Table S2), which showed that the codons NNA and NNC were a minority, whereas the synonymous codons NNU and NNG were the majority.

Mitochondrial Gene Codon Usage
Serine and leucine were encoded by six codons each, while the other amino acids were encoded by four or two codons ( Figure 5). Codon frequency referred to the frequency of codon occurrence in all PCGs. Leu (CUC, CUA), Phe (UUC), Ile (AUC), and Ala (GCC) were the most frequent codons, while termination codons (UAG) and Cys (UGU) were the least frequent ( Figure 5). The relative synonymous codon usage (RSCU) refers to the relative probability that a particular codon was encoded in a synonymous codon of the corresponding amino acid. Arg (CGA), Lys (AAA), Gln (CAA), and Ser (TCC) were the most frequent, while Ser (AGT, TCG), Thr (ACG), Leu (TTG), Ala (GCG), and Pro (CCG) were rare in the RSCU analysis ( Figure 5). The chain asymmetry caused by the difference in base composition could also affect codon usage. C and A bases were more frequent in the third positions of codons than were U and G bases (Table S2), which showed that the codons NNA and NNC were a minority, whereas the synonymous codons NNU and NNG were the majority. Note: Values were calculated using the first and second nucleotide positions, the entire codon, and over the full sequence.

Mitochondrial Gene Codon Usage
Serine and leucine were encoded by six codons each, while the other amino acids were encoded by four or two codons ( Figure 5). Codon frequency referred to the frequency of codon occurrence in all PCGs. Leu (CUC, CUA), Phe (UUC), Ile (AUC), and Ala (GCC) were the most frequent codons, while termination codons (UAG) and Cys (UGU) were the least frequent ( Figure 5). The relative synonymous codon usage (RSCU) refers to the relative probability that a particular codon was encoded in a synonymous codon of the corresponding amino acid. Arg (CGA), Lys (AAA), Gln (CAA), and Ser (TCC) were the most frequent, while Ser (AGT, TCG), Thr (ACG), Leu (TTG), Ala (GCG), and Pro (CCG) were rare in the RSCU analysis ( Figure 5). The chain asymmetry caused by the difference in base composition could also affect codon usage. C and A bases were more frequent in the third positions of codons than were U and G bases (Table S2), which showed that the codons NNA and NNC were a minority, whereas the synonymous codons NNU and NNG were the majority.

Ribosomal and Transfer RNA Genes
The length of the 12S rRNA gene ranged from 951 to 954 bp, while the 16S rRNA gene ranged from 1716 to 1728 bp (Table 1). Consistent with most Osteichthyes, the 12S and 16S rRNA genes were separated by tRNA-Val and located between the tRNA-Leu and tRNA-Phe genes. The AT content ranged from 52.53% to 53.77% (Table S1). The average AT-and GC-skews of these two rRNA genes were 0.178 to 0.205 and -0.119 to -0.082, respectively (Table S2).
Twenty two tRNA genes were identified in the mitogenomes of the three species (Table 1). Their sizes ranged from 66 (tRNA-Cys) to 75 (tRNA-Leu and tRNA-Lys) bp. The distribution of nucleotides was not identical to the rest of the entire mitogenome: the rRNAs and PCGs had a negative GC-skew value, while the 22 tRNA genes had a positive GC-skew value from 0.035 to 0.055 among the three species (Table S2). Codons and anticodons typically showed one-to-one correspondence. However, leucine was determined by two anticodons (UAG and UAA) and serine was determined by UGA and GCU in the three species (Table 1). The typical cloverleaf secondary structure was predicted for all tRNA genes of the three species, except tRNA-Ser (GCT), although many U-G base pairs and noncomplementary pairs existed in the stem regions. According to a previous report, stem mismatches were repaired by post-transcriptional editing and were common in mitochondrial tRNA genes [30]. There was no recognizable DHU stem in tRNA-Ser (GCT) in the mitogenomes of the three species, which is common in most vertebrate mitogenomes [31,32]. Nevertheless, it functioned like typical tRNAs to fit the ribosomes by adjusting its structural conformation [33].

Ribosomal and Transfer RNA Genes
The length of the 12S rRNA gene ranged from 951 to 954 bp, while the 16S rRNA gene ranged from 1716 to 1728 bp (Table 1). Consistent with most Osteichthyes, the 12S and 16S rRNA genes were separated by tRNA-Val and located between the tRNA-Leu and tRNA-Phe genes. The AT content ranged from 52.53% to 53.77% (Table S1). The average AT-and GC-skews of these two rRNA genes were 0.178 to 0.205 and −0.119 to −0.082, respectively (Table S2).
Twenty two tRNA genes were identified in the mitogenomes of the three species (Table 1). Their sizes ranged from 66 (tRNA-Cys) to 75 (tRNA-Leu and tRNA-Lys) bp. The distribution of nucleotides was not identical to the rest of the entire mitogenome: the rRNAs and PCGs had a negative GC-skew value, while the 22 tRNA genes had a positive GC-skew value from 0.035 to 0.055 among the three species (Table S2). Codons and anticodons typically showed one-to-one correspondence. However, leucine was determined by two anticodons (UAG and UAA) and serine was determined by UGA and GCU in the three species (Table 1). The typical cloverleaf secondary structure was predicted for all tRNA genes of the three species, except tRNA-Ser (GCT), although many U-G base pairs and non-complementary pairs existed in the stem regions. According to a previous report, stem mismatches were repaired by post-transcriptional editing and were common in mitochondrial tRNA genes [30]. There was no recognizable DHU stem in tRNA-Ser (GCT) in the mitogenomes of the three species, which is common in most vertebrate mitogenomes [31,32]. Nevertheless, it functioned like typical tRNAs to fit the ribosomes by adjusting its structural conformation [33].

CGView Comparison Tool (CCT) Map
Using the A. indicus mitogenome as the reference sequence, all available mitogenomes in the family Carangidae were compared using CCT. This visual map showed that the differences of the species sequence of the family Carangidae mainly came from the control region sequence, and the other parts of the sequences were very similar (Figure 6). A. indicus was most closely related to A. ciliaris and then with Uraspis secunda and U. helvola (Figure 6). The identity in the whole genome sequence was lower than that of the PCGs (Figure 6 and Figure S2). COXI was the most conserved gene, which makes it suitable for evolutionary analyses and molecular phylogenetics in the family Carangidae. By contrast, ATP8 was the least conserved gene in all species. The result was consistent with the previous analysis of Ka/Ks and overall p-genetic distance.

CGView Comparison Tool (CCT) Map
Using the A. indicus mitogenome as the reference sequence, all available mitogenomes in the family Carangidae were compared using CCT. This visual map showed that the differences of the species sequence of the family Carangidae mainly came from the control region sequence, and the other parts of the sequences were very similar (Figure 6). A. indicus was most closely related to A. ciliaris and then with Uraspis secunda and U. helvola (Figure 6). The identity in the whole genome sequence was lower than that of the PCGs (Figure 6 and Figure S2). COXI was the most conserved gene, which makes it suitable for evolutionary analyses and molecular phylogenetics in the family Carangidae. By contrast, ATP8 was the least conserved gene in all species. The result was consistent with the previous analysis of Ka/Ks and overall p-genetic distance.  Cluster of orthologous groups (COG) could indicate the presence of certain proteins in a given species in a particular PCG, and could be used to determine whether a particular metabolic pathway exists in a species [34]. Twelve COGs were detected in 11 PCGs, excluding ATP8 and ND6 ( Figure  S1). ND5 had two COGs (p COG and c COG), and each of the other 10 PCGs had a single COG (c COG). Each type of COG had its own unique functions [35]. The functions of the 12 COGs identified were related to metabolism. The p COG participated in inorganic ion transport and metabolism while  Cluster of orthologous groups (COG) could indicate the presence of certain proteins in a given species in a particular PCG, and could be used to determine whether a particular metabolic pathway exists in a species [34]. Twelve COGs were detected in 11 PCGs, excluding ATP8 and ND6 ( Figure S1). ND5 had two COGs (p COG and c COG), and each of the other 10 PCGs had a single COG (c COG). Each type of COG had its own unique functions [35]. The functions of the 12 COGs identified were related to metabolism. The p COG participated in inorganic ion transport and metabolism while the c COG participated in energy production and transformation [36]. The COGs in ND5 indicated that ND5 is involved in two processes: inorganic ion transport and metabolism and energy production and conversion.

Phylogenetic Analyses
The phylogenetic relationships of 32 Carangidae species were reconstructed based on the concatenated sequence of their mitochondrial PCGs genes (Figure 7). Phylogenetic trees inferred from the maximum likelihood (ML) and Bayesian inference (BI) analyses had largely identical topological structure and are illustrated together in Figure 7. The topology produced three clades corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae. The monophyly of each subfamily was well supported with high branch support values (ML bootstrap values > 0.85, BI posterior probabilities = 1.00). The phylogenetic relationships support the groupings (Trachinotinae + (Naucratinae + Caranginae)), which were consistent with morphological taxonomy [37] and previous molecular systematics studies based on single mitochondrial gene sequences [18,19]. The intergeneric and interspecific taxonomic positions were also clearly resolved within each of the subfamilies. A. indicus was most closely related to A. ciliaris, and they formed a group (genus Alectis) that was a sister to genus Carangoides. D. tabl was on the same branch with the other Decapterus species and formed a sister group with genus Trachurus. A. djedaba together with A. kleinii formed a group (genus Alepes) with the genus Atule as sister to them. the c COG participated in energy production and transformation [36]. The COGs in ND5 indicated that ND5 is involved in two processes: inorganic ion transport and metabolism and energy production and conversion.

Phylogenetic Analyses
The phylogenetic relationships of 32 Carangidae species were reconstructed based on the concatenated sequence of their mitochondrial PCGs genes (Figure 7). Phylogenetic trees inferred from the maximum likelihood (ML) and Bayesian inference (BI) analyses had largely identical topological structure and are illustrated together in Figure 7. The topology produced three clades corresponding to the subfamilies Caranginae, Naucratinae, and Trachinotinae. The monophyly of each subfamily was well supported with high branch support values (ML bootstrap values > 0.85, BI posterior probabilities = 1.00). The phylogenetic relationships support the groupings (Trachinotinae + (Naucratinae + Caranginae)), which were consistent with morphological taxonomy [37] and previous molecular systematics studies based on single mitochondrial gene sequences [18,19]. The intergeneric and interspecific taxonomic positions were also clearly resolved within each of the subfamilies. A. indicus was most closely related to A. ciliaris, and they formed a group (genus Alectis) that was a sister to genus Carangoides. D. tabl was on the same branch with the other Decapterus species and formed a sister group with genus Trachurus. A. djedaba together with A. kleinii formed a group (genus Alepes) with the genus Atule as sister to them.
Furthermore, Kaiwarinus equula is considered a synonym of Carangoides equula according to the current taxonomic system [38]. However, phylogenetic analysis in this study showed that K. equula and Carangoides spp. were on the different branches (Figure 7), supporting that K. equula belongs to an independent genus and it should be the accepted name instead of C. equula.  Furthermore, Kaiwarinus equula is considered a synonym of Carangoides equula according to the current taxonomic system [38]. However, phylogenetic analysis in this study showed that K. equula and Carangoides spp. were on the different branches (Figure 7), supporting that K. equula belongs to an independent genus and it should be the accepted name instead of C. equula.

Estimation of Divergence Times
The divergence timescale within Carangidae was estimated based on phylogenetic trees by using the RelTime-ML method. The divergence times showed that A. indicus began to differentiate from other species about 27.20 million years ago (Mya) within the early Miocene, while D. tabl, 21.25 Mya, and A. djedaba, 14.67 Mya, did so in the middle Oligocene (Figure 8 and Figure S2). The Carangidae began to differentiate at the end of the Cretaceous, about 79 Mya, when the Cretaceous-Paleogene extinction event happened. The split between Trachinotinae and (Naucratinae + Caranginae) was 66.69 Mya within the middle Paleocene epoch. The divergence time of the subfamily Naucratinae and Caranginae could be dated back to 65.91 Mya. Most of the species differentiated in the Paleogene (Figure 8 and Figure S2), supporting previous findings that the Paleogene was the heyday of species evolution [39]. This phenomenon might be due to the huge environmental changes caused by the Cretaceous mass extinction, which led to the outbreak of species diversity [40]. Previous studies based on Cytb sequences showed that the genera Trachurus and Seriola [41] began to differentiate at 20.1 and 55 Mya, respectively. In this study, the full mitogenome sequence analysis indicated that the genera Trachurus and Seriola began to differentiate at 25.51 and 57.41 Mya ( Figure S2), which is a little earlier than the previous estimates.

Estimation of Divergence Times
The divergence timescale within Carangidae was estimated based on phylogenetic trees by using the RelTime-ML method. The divergence times showed that A. indicus began to differentiate from other species about 27.20 million years ago (Mya) within the early Miocene, while D. tabl, 21.25 Mya,and A. djedaba,14.67 Mya, did so in the middle Oligocene (Figure 8 and Figure S2). The Carangidae began to differentiate at the end of the Cretaceous, about 79 Mya, when the Cretaceous-Paleogene extinction event happened. The split between Trachinotinae and (Naucratinae + Caranginae) was 66.69 Mya within the middle Paleocene epoch. The divergence time of the subfamily Naucratinae and Caranginae could be dated back to 65.91 Mya. Most of the species differentiated in the Paleogene (Figure 8 and Figure S2), supporting previous findings that the Paleogene was the heyday of species evolution [39]. This phenomenon might be due to the huge environmental changes caused by the Cretaceous mass extinction, which led to the outbreak of species diversity [40]. Previous studies based on Cytb sequences showed that the genera Trachurus and Seriola [41] began to differentiate at 20.1 and 55 Mya, respectively. In this study, the full mitogenome sequence analysis indicated that the genera Trachurus and Seriola began to differentiate at 25.51 and 57.41 Mya ( Figure  S2), which is a little earlier than the previous estimates.

Sample Collection and DNA Extraction
Wild samples of A. indicus, D. tabl, and A. djedaba were captured in the South China Sea at 22°48' Figure 8. Chronogram for the 32 species of Carangidae with a single outgroup Larimichthys crocea based on the Bayesian topology calculated from analysis of the 13 PCGs. Divergence times were estimated using three calibrations. The red fonts represent the three species sequenced in this study, and the blue fonts represent the outgroup species.  2015), respectively. The dorsal muscle was harvested and returned to the laboratory in anhydrous alcohol, and genomic DNA was extracted using a DNeasy Blood and Tissue kit (QIAGEN, (Qiagen, Venlo, The Netherlands) following the manufacturer's protocol. DNA was quantified by ultra-micro spectrophotometry and gel electrophoresis. All animal experiments were conducted in accordance with the guidelines and approval of the Animal Research and Ethics Committees of South China Agricultural University.

Sequencing, Assembly, and Annotation
A library was constructed on the Illumina platform. High-quality DNA samples were randomly broken into fragments for paired-end sequencing, and the DNA libraries were constructed according to the procedure of Illumina DNA library construction. After selecting the fragments, finishing the ends, and adding poly A, adapters were added to both ends of the DNA fragment with ligase. The ligated product was amplified, selected, and purified. Finally, a 100 bp library was generated for sequencing on the Illumina HiSeq2500 instrument [42].
The SOAPdenovo2 (BGI, Shenzhen, China) software package was used for de novo assembly of the mitogenomes [43]. OGDRAW 1.3.1 (Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany) was used to draw a mitochondrial genetic map [44]. The PCGs and ribosomal genes were identified by blast [45]. The rules for the vertebrate mitochondria genetic code were used [46]. The structure of the transfer RNA genes was predicted with tRNA scan-SE ver. 1.3.1 (Washington University School of Medicine, St Louis, Missouri, USA) and confirmed with MitoFish (http://mitofish.aori.u-tokyo.ac.jp/) [47].

Sequence Analyses
The mitochondrial strand asymmetry was assessed using the formulas: [48]. The tandem repeat finder Dot2dot was used to identify repeat sequences [49]. Codon frequency referred to the occurrence frequency of codons in all PCGs and relative synonymous codon usage (RSCU) referred to the relative probability that a particular codon was encoded in the synonymous codon of the corresponding amino acid. The codon frequency and RSCU values were calculated with MEGA 7.0 (Tokyo Metropolitan University, Tokyo, Japan) [50] and both indicate codon usage [51]. The RSCU values were independent of the amino acid usage, and the codon abundance directly reflected the degree of preference of codon usage [52]. If there was no preference for codon use, the RSCU value of the codon was equal to 1. When the RSCU value of a codon was > l, it indicated that the codon was used relatively frequently and vice versa.

Phylogenetic Analyses
The concatenated sequences of 13 PCGs (without stop codons) extracted from the reported 32 mitogenomes of Carangidae species, including A. indicus, D. tabl, and A. djedaba, were used to construct phylogenetic trees with MEGA 7.0, MrBayes 3.1 (University of California, La Jolla, San Diego, CA, USA) [53], and iTOL (https://itol.embl.de/) [54], using Larimichthys crocea (GenBank accession number: EU339149) as the outgroup. Bayesian Inference (BI) was analyzed in MrBayes 3.1 [55] and maximum likelihood (ML) analysis was performed in MEGA 7.0. When performing Bayesian analysis, general time reversible (GTR) was adopted in the data substitution model and the ratio of the difference between sites was set as invgamma. The Markov chain Monte Carlo method was used to estimate the posterior probability by calculating 10,000 generations, sampling once every 10 generations, and discarding 250 aged samples from the obtained samples to summarize and establish a shared tree.
The standard deviation of the split frequency of the BI tree was less than 0.01 to guarantee the BI tree was reliable. The ML tree was constructed with 1000 bootstrap replicates. DeepFin (http://deepfin.org/) [56] and the NCBI taxonomy server [57] were used for further analyses.

Divergence Times
Divergence times of the 32 Carangidae were estimated with the GTR + G + I modeling and RelTime-ML method with MEGA 7.0 [58]. The TimeTree was computed using two calibration constraints. In addition, the divergence times of 27 genera of Carangidae were estimated using the TimeTree database [59]. To ensure consistency between the divergence times and phylogenetic tree, the divergence times were generated by importing the Bayesian phylogenetic tree into MEGA 7.0. The CGView comparison tool was used to draw a circular genetic similarity map of Carangidae using A. indicus mitogenome sequence as the reference sequence [60].

Conclusions
The study characterized the complete mitogenomes of three Carangidae species (i.e., Alectis indicus, Decapterus tabl, and Alepes djedaba). They are composed of 37 genes comprising two rRNA genes, 22 tRNA genes, 13 protein-coding genes, and a non-coding region. Comprehensive analyses of Ka/Ks, overall p-genetic distance, and a CCT map demonstrated the highest evolutionary rate in ATP8, while COXI/COXII appear to have the lowest evolutionary rate. The phylogenetic trees inferred from the maximum likelihood and Bayesian inference using mitochondrial PCGs genes supported three clades correspondent to subfamily and the groupings (Trachinotinae + (Naucratinae + Caranginae)) and also provided support for the monophyly of these subfamilies. However, complete mitogenome sequences from Scomberoidinae, the fourth subfamily in Carangidae, were absent in the phylogenetic analyses, making their position still unclear. More sampling of Scomberoidinae will be needed for a thorough phylogenetic analysis of the major groups within Carangidae.