pTC Plasmids from Sulfolobus Species in the Geothermal Area of Tengchong, China: Genomic Conservation and Naturally-Occurring Variations as a Result of Transposition by Mobile Genetic Elements

Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.


Introduction
Plasmids occur widely in archaea. Research on archaeal plasmids has been focused on a narrow range of archaea including members of Sulfolobaceae, Haloarchaeaceae, Thermococcaceae and various methanogens. So far, over 60 plasmids have been isolated from archaea. Among them, a total of 24 plasmids or plasmid virus hybrids are derived from Sulfolobaceae, with all but two (pDL10 and pAH1 from Acidianus) from Sulfolobus species, and, therefore, they are known as Sulfolobus plasmids. Sulfolobus plasmids include primarily cryptic and conjugative plasmids (CPs). All known archaeal CPs (13 in total) were isolated from species of Sulfolobaceae. The first Sulfolobus CP, pNOB8, was isolated from Sulfolobus sp. NOB8H2, a strain obtained from a hotspring in Japan [1,2]. Later, pING family plasmids (pKEF9, pHVE14, pARN3 and pARN4) and pSOG family plasmids were isolated from different S. islandicus isolates derived from Icelandic hotspring samples [3][4][5]. Recently, pAH1 was obtained from Acidianus hospitalis [6]. A survey shows that up to 3% of the Sulfolobus isolates from terrestrial hotsprings in Iceland contain a CP [5].
Comparative genomic analysis reveals the architectural conservation among Sulfolobus CPs of different geographical origins. The genomes of the CPs are characterized by the presence of three functionally distinct regions, i.e., sections A, B and C [5,7]. Section A encodes conjugative functions. Proteins encoded by section A ORFs are transmembrane proteins presumably involved in DNA transfer. They include highly conserved homologues of the bacterial conjugative proteins TraG and TrbE [2]. Section B contains a putative origin of replication with characteristic sequence repeats. Section C is proposed to encode proteins for the initiation and regulation of plasmid replication [7]. Despite the availability of increasing genomic data on Sulfolobus CPs, little is known about the molecular mechanisms of replication and host interactions of these plasmids.
Known Sulfolobus CPs have shown striking variation in genome stability. While some of the Sulfolobus CPs (e.g., pAH1, pNOB8, pSOG2) are relatively stable, at least in their natural host, others (e.g., pNOB8-33, pING1) are extremely unstable. A number of mechanisms, including the host defense systems such as the CRISPR-CAS systems, homologous recombination and transposition by mobile genetic elements (e.g., ISs), have been proposed to be responsible for the instability of the CPs [2,[7][8][9]. In any case, the plasticity of Sulfolobus CPs likely results from CP-host interactions which contribute to the co-evolution of plasmids and their archaeal hosts.
Here, we report the isolation and characterization of the conjugative plasmid pTC1 from Sulfolobus tengchongensis RT8-4, a strain obtained from the hot spring in Tengchong, Yunnan, China. The genome sequence of pTC1 was determined and compared with other known Sulfolobus CPs. We also conducted a survey on pTC1 and its plasmid variants in the geothermal area of Tengchong. We found that pTC plasmids were widespread in the area of survey. Genomic variations of the pTC plasmids as a result of transposition by an IS or an MITE were characterized to gain insight into the generation of the diversity of the plasmids.

Sample Collection and Screening for Plasmids
Water and sediment samples were collected from hot springs and mud holes with temperatures ranging from 61 to 94 °C and pH from 2.0 to 6.0 in a geothermal area in Tengchong, Yunnan, China, and transported in sterile 50-mL centrifuge tubes at ambient temperature. An aliquot of each sample (1-2 mL of spring water or 1-2 g of mud) was inoculated into Zillig's medium (100 mL) supplemented with 0.2% tryptone [3]. Incubation was carried out at 80 °C with moderate shaking. When microbial growth became apparent (1-2 weeks), the enrichment culture was tested for the presence of covalently closed circular DNA (cccDNA) by using the alkaline lysis method [10]. If cccDNA was detected, the enrichment culture was streaked onto 0.8% Gelrite plates containing Zillig's medium supplemented with 0.2% tryptone. The plates were incubated for 5-7 days at 80 °C . Single colonies formed on the plates were picked and grown in Zillig's liquid medium. The presence of plasmids was determined by extraction of the total DNA by alkaline lysis, restriction digestion of the DNA and agarose gel electrophoresis of the restriction digests [3,4,11]. Isolates containing plasmids were identified by amplifying the 16S rRNA gene with a pair of Archaea-specific primers [12] and subsequent sequencing. The culture was also subjected to electron microscopy to determine if the extrachromosomal genetic element was a plasmid or a virus. S. tengchongensis strain RT8-4 [13] was also tested as described above for the presence of plasmids.

Electron Microscopy
Samples were applied to carbon grids, negatively stained with 2% uranyl acetate and examined under a HitachiH-600A electron microscope [13].

DNA Sequencing and Genome Analysis
Plasmid pTC1 DNA was extracted from S. tengchongensis RT8-4 by the alkaline lysis method and purified using a plasmid purification kit (Biodev Co., Beijing, China). The purified plasmid DNA was digested with EcoRI. Restriction fragments were purified by electrophoresis in 1% agarose and subsequent extraction using a Gel Extraction Kit (QIAGEN), and cloned into pBluescript KSII. Clones containing plasmid fragments were collected, and the inserts were sequenced from both ends. PCR reactions were conducted to confirm the linking relationships of the fragments and to fill gaps. Sequence assembly was performed using DNASTAR. Open reading frames (ORFs) encoding polypeptides of >60 amino acid residues in size were identified by the ORF Finder at NCBI. Two ORFs that code for polypeptides shorter than 60 amino acid residues were also identified because of their high E-values. ORFs were searched against the GenBank by BLASTP at NCBI. The sequence of pTC1 has been deposited in GenBank (Accession Number NC_005969.1).

Detection of Transposition by ISSte1 and MITE at the Specific Sites
Enrichment cultures were established with hot spring and mud hole samples as described above. Cells were harvested from each enrichment culture by centrifugation. Total DNA was isolated by phenol/chloroform extraction. Plasmid DNA was extracted by using the alkaline lysis method. PCR amplification was carried out on the total DNA or the plasmid DNA using either the primer pair 5'-TTGGATTATCATTAGCATTTTGTCATC/5'-CTTCAATAACCTCAATAAGCCCC for the ISSte1 site or the primer pair 5'-CACCCCATTGAGTGGACGAT/5'-CCTCATCTACTCTTTACAG for the MITE site. Products shorter than expected for the insertion of the full-length ISSte1 or MITE were cloned into pEasy-T1 (TransGen Biotech, Beijing, China). Following transformation of the resulting plasmids into E. coli DH5α (TransGen Biotech), transformants were selected and grown in liquid LB medium. The plasmids were isolated, and the inserts were sequenced.

Identification of Plasmid pTC1
We previously reported the isolation and identification of Sulfolobus tengchongensis strain RT8-4 from a hot spring in Tengchong in Southwestern China [13]. In a subsequent study, we isolated total DNA from the strain and digested it with various restriction endonucleases. When the digests were subjected to agarose gel electrophoresis and ethidium bromide staining, a stoichiometric pattern of bands with intensity significantly higher than that of smearing bands derived from the genomic DNA of the host was observed, suggesting the presence of an extrachromasomal genetic element in the host cell ( Figure 1). This suggestion was supported by the finding that extraction of episomal DNA from S. tengchongensis RT8-4 by using the alkaline lysis procedure yielded high molecular weight DNA species that migrated faster than the genomic DNA of the host. To determine whether the DNA was derived from a virus or a plasmid, we examined the culture supernatant of S. tengchongensis RT8-4, and no virus-like particles were observed under an electron microscope. Based on these results, we conclude that a plasmid existed in S. tengchongensis RT8-4, and designated this plasmid as pTC1.
pTC1 plasmid appeared to exist in high copy number in the original enrichment cultures, but was unstable to plating or sub-culturing. Upon repeated culture transfers, the copy number of pTC1 gradually decreased to levels undetectable even by Southern hybridization. A similar result was obtained for pTC2, the only other pTC plasmid obtained in a colony-purified strain, from a close relative of S. tengchongenesis RT8-4 based on 16S rRNA sequence similarity. The conjugational transferability of pTC1 and pTC2 was tested from the appropriate donor strains using the assays described by Schleper et al. [1,14] with a plasmid-free S. tengchongenesis or S. solfataricus strain as a recipient. Despite repeated efforts, however, no conjugation was detected. It is possible that the tested recipient strains were not appropriate for the assays. In addition, pTC plasmids were not integrated into the host genomes, as revealed by Southern hybridization.

Genomic Analysis of pTC1
Both strands of plasmid pTC1 were sequenced. The restriction patterns of the plasmid, predicted from the genomic sequence, agreed well with the results of restriction digestion. As revealed by sequencing, pTC1 is a circular double-stranded DNA molecule of 20,417 bp in length. The G+C content of pTC1 was 41.4%, which was higher than that of its host (34.4%) [13] as well as those of other known Sulfolobus CPs. The plasmid contains 26 open reading frames (ORFs) (Figure 2; Table 1). In most of the plasmid, genes are tightly packed or even overlapping. Among all known Sulfolobus CPs, pTC1 is the smallest and carries fewest ORFs.  To determine the similarity of pTC1 ORFs to homologous ORFs from other CPs, we assembled a database consisting of ORFs from 12 other known Sulfolobus CPs (i.e., pNOB8, pING1, pARN3, pARN4, pAH1, pSOG1, pSOG2, pLD8501, pYN01, pKEF9, pHVE14 and pMGB1). As revealed by BLASTP with an E-value cutoff of <10 −3 , only three ORFs from pTC1 (i.e., ORF108, ORF92a and ORF162) have no homologues in the other twelve Sulfolobus CPs. All the three ORFs are located in section C of the plasmid genome (see below). The rest of the pTC1 ORFs share 22.50%~69.99% similarity at the amino acid sequence level with their respective best matches in the other 12 CPs.
Genomic comparison shows that pTC1 resembles known Sulfolobus CPs in architecture [6,9]. Like all other Sulfolobus CPs, pTC1 comprises three conserved and functionally distinct genomic sections ( Figure 2 and Table 1). Section A of pTC1 accounts for about half of the plasmid (~12.5 kb) and contains 15 ORFs that are conserved among Sulfolobus CPs. All section A proteins, except for those encoded by ORF144, 124, 213, 142 and 99b, are predicted to possess transmembrane helix motifs, and are believed to be involved in plasmid conjugation. These include two highly conserved Sulfolobus CP proteins that are likely homologues of the bacterial conjugative ATPase proteins VirD/TraG (ORF621) and TrbE (ORF1019). Bacterial TraG is the coupling protein that connects the membrane-spanning mating pair formation (Mpf) complex with the cytoplasmic nucleoprotein relaxosome complex at the cytoplasmic membrane during conjugative transfer of DNA [15]. However, it is worth noting that no genes encoding relaxase, a bacterial homologue which is involved in the mobilization of the conjugational plasmid transfer [9], were found in the pTC1 genome. Also unusual was the observation that ORFs probably involved in transcriptional regulation (i.e., ORF144, 124, 213 and 99b) interrupted the otherwise contiguous cluster of ORFs encoding transmembrane proteins. The contiguous array of these proteins is highly conserved among other Sulfolobus CPs.
Section B contains multiple sequence repeats and no ORFs. Based on Z-curve and GC-skew analyses, the origin of replication for pTC1 is located in a 209-bp stretch, downstream of ORF142, in a large intergenic region (Figure 2). The 209-bp region shares limited sequence similarity to the replication origin of pSOG2, pKEF9, pING1, pARN3, and pSOG1 ( Figure 3). However, the conserved sequence TCTATACCCCC, found in the putative origin of the other CPs [9], is not present in pTC1 (Figure 3). On the other hand, there are two perfect direct repeats (TCCCCGGAACT) and three imperfect direct repeats (TCTCTCCNNCT) in this region as well as four imperfect repeats (AGCAGCGCTTGYCCT) 529-959 bp downstream of this region. The presence of these direct and inverted repeating sequences supports the possibility that pTC1 initiates replication in this region [16]. Section C, which includes a tightly packed cluster of six to nine genes, is conserved among previously characterized Sulfolobus CPs and has been implicated in the initiation and regulation of plasmid DNA replication [7,9]. These genes encode hypothetical relaxase (e.g., some pING ORFs), putative RepA (pING1_ORF99, pKEF9_ORF106), …, various types of transcriptional regulators, such as CopG (pAH1_p28, pKEF9_p28, pYN01_p11, pNOB8_24, pMGB1_p26), MarR-like protein(pAH1_p22, pKEF9_p24, etc.), Zinc finger protein (pAH1_p36), etc., as well as 1-4 hypothetical proteins. Surprisingly, section C in pTC1 is not well conserved as compared to the corresponding region in other Sulfolobus CPs. Most of the CPs encode a putative RepA protein. However, no homologue of a gene encoding putative RepA was found in pTC1. A plrA gene was identified in pTC1. PlrA, a highly conserved transcriptional regulator in Sulfolobus plasmids, is speculated to be involved in plasmid segregation [9]. All previously described CPs contain an ORF for a pNOB8-type integrase of the tyrosine recombinase superfamily. However, no homologue of the integrase gene was found in pTC1. Instead, an insertion element and a miniature inverted-repeat transposable element (MITE) were detected in section C (see below).
The insertion sequence (IS), denoted ISSte1 [17], is 762 bp in length and located in the intergenic region between ORF107 and ORF104 (Figure 4a). It contains a transposase (Tpase) gene (ORF204) flanked by 16-bp perfect inverted repeats (IR: GTGTGTGTCCAACAAT). Notably, there is a pair of perfect inverted repeats (IR': GGGTCAGGACGGG) between the left IR and ORF204. The insertion of ISSte1 has led to the duplication of the target site of 8 bp (DR: ATCACAAA). Sequence analysis shows that ORF204-encoded Tpase contains the conserved DDE domain ( Figure S1) [17]. This, together with the size and sequence of the IRs, places ISSte1 in the IS6 family [17]. MITEs are mobile genetic elements derived from ISs with DDE Tpases, and contain flanking terminal IRs but no internal Tpase gene [17][18][19]. The MITE sequence identified in pTC1 is only 142 bp long (Figure 4b). Intriguingly, the entire element resides in ORF108. It would be of interest to determine if ORF108 is functional. If the ORF encodes a functional protein, the presence or absence of the MITE at the site of transposition might be of significance to the biology of the plasmid. It is clear, however, that ORF108 is not essential for the plasmid since it is absent from all other known Sulfolobus CPs. The MITE of pTC1 carries long imperfect IRs (LIR: AATCGAGAACGGTGTCAT, RIR: ATGAGACATTCCTCGTT) at the two ends and is bounded by long imperfect DRs (LDR: AAGTGAGGTGATGTAAATGGTCAC, RDR: CAGTATATCATGTACTGGTAC. The intervening sequence between the two flanking IRs shares no significant similarity with known sequences in the public database, indicating that the element is a type II MITE. Since both IRs and DRs of the MITE are not well conserved, it appears that the element has lost the mobility due to mutation.

pTC Plasmids in the Geothermal Area of Tengchong
To determine the distribution of pTC1 and its variants in acidic hot springs and mud holes in Tengchong, we collected 35 samples at various spots in a general area of ~0.1 km 2 . Enrichment cultures from 27 of the samples were successfully established in Zillig's medium supplemented with 0.2% tryptone [3]. Plasmid DNAs were isolated from 11 of the enrichment cultures by using the alkaline lysis method. Restriction analysis showed that the EcoRI cleavage patterns of seven of the plasmids resembled that of pTC1, whereas those of the other four were also indistinguishable from that of pTC1 except for one fragment ( Figure 5). An additional fragment of ~400 bp appeared in one of the pTC variants, whereas the 6.8-kb EcoRI fragment, which harbored ISStel, was shortened to 6.0, 5.5 and 5.2 kb, respectively, in the other three variants. The ~400-bp EcoRI fragment of the former pTC variant, denoted pTC2, was sequenced. As shown in Figure 4 and Figure S2b in supplementary, pTC2 was 163 bp shorter than pTC1 in that fragment, and the MITE found in pTC1 was absent from pTC2 except for a single DR of the element. Therefore, pTC1 appears to have resulted from the insertion of the MITE in pTC2. These results suggest that pTC plasmids were widely distributed in the geothermal area of Tengchong, and pTC1 was likely a predominant species of these plasmids.
The presence of the insertion sequences in pTC1 prompted us to look for naturally-occurring variants of this plasmid. PCR reactions were carried out on total DNA or plasmid DNA samples isolated from each enrichment culture using a pair of primers targeting sequences upstream and downstream of ISStel. The primers were designed based on the sequences inside the upstream ORF104 and downstream ORF107 to ensure specific amplification of the sequence from pTC plasmids. Because of the absence of the integration of pTC into the host genome, it would be most likely that the specific PCR products were derived from the sequences of free pTC plasmids. A total of 90 PCR fragments shorter than that expected for the transposition by a full-length ISStel were obtained and subsequently sequenced. These sequences fall into nine categories (Figure 4 and Figure S2a in supplementary). Only a small fraction (14.4%) of the fragments contained the unaltered plasmid sequence, suggesting that either no transposition by ISStel had occurred at the site or an inserted ISStel had undergone precise excision, leaving behind a single copy of the target sequence (i.e., DR). In any case, the target site appears to be efficiently used for ISStel transposition. No perfect reversion of the transposition by ISStel to the target site was observed. In the remainder of the fragments, transposition of ISStel to the target site and subsequent excision of the inserted IS in an imprecise manner were detected. A large deletion spanning the entire ISStel as well as both DRs and a 14-bp 3'-flanking sequence was found in 70.1% of these fragments. Some of them carried a small additional deletion either upstream (17.4%) or downstream (12.7%) of the target site for ISStel transposition. The rest of these fragments (29.8%) had smaller deletions, which started 8-65 bp downstream of LIR and mostly ended 5-19 bp downstream of RDR. So all of them retained LDR and LIR, but lost the Tpase gene. Taken together, these observations suggest that deletion of ISStel proceeded more frequently in an imprecise manner than in a precise manner in natural environments. A diverse array of pTC variants associated with the transposition of MITE and subsequent deletion of the inserted element was also observed. As shown in Figure 4 and Figure S2b in supplementary, the MITE sequence was either entirely absent or deleted in different fashions. The sequence lacking the entire MITE may have yet to undergo transposition by the MITE. It is also possible that the sequence resulted from the precise deletion of the inserted MITE form a pTC1-like plasmid. Since the LIR of the pTC1 MITE is identical to the target sequence, the divergence between the two inverted repeats suggests that the MITE in pTC1, acquired through transposition, has probably become immobilized through mutation of the RIR. Imprecise deletion of the MITE apparently occurred as observed with ISStel.

Discussion
A large number of extrachromosomal genetic elements have been isolated from members of the family Sulfolobaceae thriving in various geothermal areas around the globe. In our previous study, we showed that S. tengchongensis RT8-4, isolated from a hot spring in Tengchong, China, was infected by the spindle-like virus STSV1 [20]. Subsequent analyses led to the identification of the plasmid pTC1 in the same Sulfolobus isolate. In our surveys, pTC1 and its variants were identified from 11 out of 27 enrichment cultures established with samples collected from various spots in the geothermal area of Tengchong. When a more sensitive PCR-based assay targeting the pTC1 sequences flanking ISStel was employed, 14 out of 20 samples produced positive results. Therefore, pTC plasmids are widespread in the Tengchong area. No other plasmids were found in our surveys.
The apparent conservation of pTC1 section A, especially the presence of ORFs encoding bacterial homologues of VirD/TraG (ORF621) and TrbE (ORF1019), suggests that pTC1 and its variants are conjugative plasmids. However, despite repeated efforts, we were unable to demonstrate the conjugational transfer of the plasmids under our experimental conditions. The lack of conjugation activity might be attributed to the absence of an identifiable relaxase-encoding gene in pTC plasmids. However, this argument is weakened by the finding that several known Sulfolobus CPs lacking the relaxase (e.g., pSOG1, 2) were capable of conjugational transfer [21,22]. It is possible that pTC plasmids lack other ORFs or cis elements required for conjugational transfer. It is also possible that the conjugation assay conditions, e.g., selection of recipient strains, were inadequate for the detection of the plasmid transfer. Therefore, more tests are needed before a conclusion can be drawn.
The copy number of pTC1 in the cells in the original enrichment cultures was high but rapidly decreased upon sub-culturing. pTC1 differs most significantly from other known Sulfolobus CPs in section C of the plasmid, which is proposed to encode proteins for plasmid DNA replication. For example, unlike many other known Sulfolobus CPs, pTC1 possesses no repA gene. The absence of a RepA homologue in pTC1 suggests that this protein is probably not essential for the replication of the plasmid. This and the observation that RepA was also missing from pSOG plasmids [9] suggest that an unidentified protein may serve as a replication initiator in Sulfolobus CPs lacking RepA. In addition, pTC1 encodes no Par-like proteins, which are involved in partitioning of bacterial plasmids. This provides a possible explanation for the instability of pTC plasmids during repeated culture transfers.
Mobile elements (i.e., ISSte1 and MITE) play an important role in generating the diversity of pTC plasmids. ISSte1 appears to be efficiently inserted into pTC plasmids in nature since no plasmids that had not undergone the transposition were isolated from over 30 hot spring and mud hole samples. Interestingly, the inserted ISSte1 was subjected to inactivation and removal in a range of manners, leaving behind various forms of IS relics. However, in no relics was a Tpase-encoding gene found. Perfect reversal of the transposition process, which would have left two DRs in tandem in the resulting plasmid, was not detected. The insertion element in only a small fraction (14.4%) of the plasmids containing various forms of IS relics was removed precisely, returning the insertion site to the pre-transposition state. As observed for the spontaneous deletion in bacteria, the non-tandem DRs at the boundaries of the inserted ISSte1 probably promoted the precise excision by host-encoded functions [23,24]. The low frequency of the event is not unexpected in view of the finding that precise excision of the ISs was rarely detected in Sulfolobus [25]. The remaining relics presumably resulted from imprecise excision of inserted ISSte1 in different manners. Degradation of ISSte1 in an imprecise mode did not seem to take place randomly but preferentially yielded relics retaining 5'-end parts of the ISSte1. The molecular mechanisms for the processes are unclear, but they may include abortive transposition by transposase [26]. It is equally possible that unknown host defense systems are involved in the removal of ISSte1. The MITE in pTC1 apparently underwent removal in a similar fashion. Taken together, our results are consistent with the notion [27] that the host cell may have developed a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.