Size Polymorphism in Alleles of the Myoglobin Gene from Biomphalaria Mollusks

Introns are common among all eukaryotes, while only a limited number of introns are found in prokaryotes. Globins and globin-like proteins are widely distributed in nature, being found even in prokaryotes, and a wide range of intron-exon patterns has been reported in several eukaryotic globin genes. Globin genes in invertebrates show considerable variation in the positions of introns; globins can be found without introns, with only one intron or with three introns in different positions. In this work we analyzed the introns in the myoglobin gene from Biomphalaria glabrata, B. straminea and B. tenagophila. In the Biomphalaria genus, the myoglobin gene has three introns; these were amplified by PCR and analyzed by PCR-RFLP. Results showed that the size (number of nucleotides) and the nucleotide sequence of the gene coding for myoglobin are variable in the three species. We observed the presence of size polymorphisms in introns 2 and 3; this characterizes a homozygous/heterozygous profile and indicates the existence of two alleles of different sizes in each species of Biomphalaria. This polymorphism could be explored for specific identification of Biomphalaria individuals.


Introduction
In 1977, several research groups published their findings about DNA coding sequences (exons) broken by non-coding sequences (introns) in eukaryotes [1][2][3]. Introns are widespread among eukaryotes [4,5], while only a limited number of introns are found in prokaryotes.
Cavalier-Smith [6] suggested that introns were selfish DNA with no distinct cellular function; however, the presence of these non-coding sequences is still much debated [7]. Since the work of the ENCODE Project Consortium, it has been known that intronic regions are transcribed in humans, indicating that protein-coding loci are more transcriptionally complex than previously thought [8,9].
Independently in 1978, Doolittle [10] and Darnell [11] proposed the "Introns-Early theory", which states that introns are very early elements in the genome that have been present since the eukaryote/prokaryote divergence. According to the "Introns-Late theory", introns appeared recently in the eukaryotic genome [6,12,13]. A growing collection of introns has been found in different positions in some animal species, while closely related species do not have these introns [14][15][16][17][18].
In the evolutionary history of introns, the selective pressure that modulates their evolution and the relative contributions of intron loss and intron insertion in eukaryotic genes are not understood [19]. In interpreting the evidence that a few introns may have a common position among animals, plants and fungi, one must bear in mind their evolutionary history. Comparisons of rRNA sequences support the idea that the three kingdoms of multicellular eukaryotes form monophyletic groups, which seem to have originated almost simultaneously, possibly from a common ancestor. This would also explain the conservation of the position of some introns among homologous genes within these kingdoms, without having to invoke the presence of introns in these genes from the very beginning [20].
Many authors have attempted to interpret the distribution of introns in terms of a mixture of movement and removal of introns, but the difficulty of phase-shifting movement raises serious doubts about these interpretations. In contrast, some gene families for which phylogenies can be traced show patterns that clearly indicate intron insertions [21].
Protein-coding sequences are interrupted by introns at specific points, which can be used to investigate, for instance, the evolution of protein families. The gene encoding the ancestral globin chain is assumed to be interrupted by three introns inserted in the B, E and G helices. However, the conservation of this intron pattern and the exact insertion positions of the introns during evolution are a matter of ongoing discussion [22].
Intron location and sequence may also shed light on the origin of some protein genes, for instance the polymeric globin genes occurring in mollusks and arthropods [23]. In this way introns could be used to study the evolution as well as the origin, phylogeny and systematics of the phylum Mollusca. The mollusk Biomphalaria glabrata has three introns in the myoglobin gene at positions A3.2, B12.2 and G7.0; the last two are conserved.
Identifying individuals from the Biomphalaria genus is a difficult task because they are very similar in morphology. The classic identification methodology is based on comparative morphology of the shells and anatomy of the reproductive organs; however, there is much intra-specific variation [24]. Classic methods of identifying Biomphalaria species are important, but combining them with molecular techniques can yield a more robust diagnosis [25]. In 1996, Vidigal and co-workers [26] showed that it is possible to distinguish B. glabrata and B. tenagophila by ribosomal gene amplification. In 1991, Knight and co-workers [27] used RFLP to show intra-specific variability in B. glabrata susceptibility and resistance to Schistosoma mansoni. These results show the efficiency of molecular techniques for identifying these mollusks.
In this work, we analyze the introns of the myoglobin gene of mollusks from the Biomphalaria genus in order to evaluate their potential use for precise identification in systematic studies.

Results and Discussion
The size (in base pairs) of the amplicons of intron 1 was analyzed by polyacrylamide gel electrophoresis. All individuals of the three Biomphalaria species showed a DNA fragment of about 1,300 bp (Figure 1).
Amplification of intron 2 of B. straminea generated an amplicon larger than 1,300 bp, of about 1,500 bp. In B. glabrata and B. tenagophila, some individuals had a large amplicon of about 1,400 bp, some had a smaller fragment (about 1,300 bp for B. glabrata and about 1,200 bp for B. tenagophila), and some showed both fragments.
Clear and satisfactory amplification using genomic DNA as a template is not a simple task, since nonspecific amplifications and artifacts can contaminate the true results. Once the best amplification protocol had been established, it was repeated at least five times. The intron sizes described in the text are based on the monitoring of all amplifications; the gels presented as figures have some artifacts that do not correspond to amplification of introns.
The models for the putative alleles from the three Biomphalaria species are shown in Figure 4. They were built from the nucleotide sequence of the myoglobin gene from B. glabrata deposited in the database and from the cDNA sequences of the B. tenagophila and B. straminea myoglobins deposited in GenBank (accession nos. EF646378 and EF646379), which are very similar to B. glabrata myoglobin (>96% identity).
PCR-RFLP was used to obtain rapid information about the nucleotide sequence of the myoglobin introns from Biomphalaria; for individuals that had introns of different sizes, the largest fragment was digested. Intron 1 was not cleaved by any of the restriction enzymes in any individual of the Biomphalaria species. Intron 2 from B. straminea and B. tenagophila was digested by Hind III only; however, it was cut at different places, generating fragments of different sizes.
Intron 3 from B. straminea was cut by Xba I, Xho I, Bgl II and Hind III, and the same intron from B. tenagophila was cut by Xba I and Hind III. Intron 3 from B. glabrata was not cut by any enzyme (Figure 5).
In 2004, Sokolova and Boulding [28] described size polymorphism in one intron of the enzyme N-aminopeptidase among individual mollusks of the Littorina genus. A similar result has been obtained in this work in the myoglobin gene of the three species of Biomphalaria mollusks. We observed that B. tenagophila, B. glabrata and B. straminea have size polymorphism in the myoglobin gene. This polymorphism, which characterizes a profile of homozygosity/heterozygosity for the introns of the myoglobin gene, had not been described for B. glabrata, whose gene has been characterized.
The first intron of B. glabrata has 1,116 bp, the second intron has 1,008 bp and the third has 582 bp [18]. The difference in size between these published introns of B. glabrata and those obtained in this work is due to segments of exons and/or of the UTR that are amplified together with the intron in the PCR. The same flanking segments are present in the amplicons of the other Biomphalaria species analyzed. The first globin gene structures to be elucidated were those of the human hemoglobin α and β chains. These genes have two introns conserved exactly at positions B12.2 (α-helix B in the protein) and G6.3 [29]. According to Dewilde et al. [22], the globin gene of B. glabrata has a structure of three introns and four exons. The B. glabrata myoglobin gene, which encodes a monodomain globin, contains an unprecedented intron in the A helix, at position A3.2. It must, therefore, be considered a newly inserted intron.
The analyses of the amplicons of the introns from B. tenagophila and B. straminea suggest the same pattern for the myoglobin gene.
The typical gene structure of three exons/two introns is found in all vertebrate globins and in most annelids [30]. The results observed for the amplicons of introns 2 and 3 suggest that the Biomphalaria species studied present a homozygous/heterozygous pattern in the second and third introns of the myoglobin gene, which can be observed by their size. The polymorphisms are confined to introns 2 and 3. It is possible that the size of intron 1 is conserved because this intron position is evolutionarily the most recent.
The globin genes of vertebrates and plants contain, respectively, two and three introns, and their insertion positions are conserved. The three intron/four exon pattern of plants is proposed to be ancestral, and all other globin gene structures would be derived mainly by intron loss [31]. Several non-vertebrate and protozoan globin genes have been characterized, and it has become clear that the intron/exon pattern is less conserved than originally expected [32].
B. straminea presents a size polymorphism in the third intron of the myoglobin gene. B. tenagophila and B. glabrata contain a size polymorphism in the second intron, and none of the three species showed a size polymorphism in the first intron. This kind of polymorphism and the homozygous/heterozygous pattern had not been shown for the previously described myoglobin gene of B. glabrata.
Analysis of single individuals indicates that there are two alleles of different sizes coding for myoglobin in the Biomphalaria genus. The difference is about 200 bp, and each Biomphalaria species presents polymorphism in one intron only. The smaller alleles were observed in B. glabrata and the larger ones in B. straminea. Judging by its size, the nucleotide sequence of the B. glabrata myoglobin gene deposited in the database could be the smaller allele of this species.
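The reading of band patterns as genotypes can be illustrated with a minimal Python sketch, which is not part of the original study. It assumes that one band per intron means a homozygous individual and two bands a heterozygous one. The function name call_genotype, the 50 bp tolerance for merging bands and the example sizes (the approximate intron 2 amplicons reported for B. tenagophila) are illustrative assumptions.

```python
def call_genotype(band_sizes_bp, tolerance_bp=50):
    """Classify one individual from the amplicon sizes seen for one intron.

    Bands that differ by tolerance_bp or less are treated as the same allele.
    The tolerance is a hypothetical gel-resolution threshold, not a value
    taken from the paper.
    """
    alleles = []
    for size in sorted(band_sizes_bp):
        # Register a new allele only if this band is clearly larger
        # than the last allele already recorded.
        if not alleles or size - alleles[-1] > tolerance_bp:
            alleles.append(size)
    if len(alleles) == 1:
        return "homozygous", alleles
    if len(alleles) == 2:
        return "heterozygous", alleles
    raise ValueError("expected one or two alleles, got %d" % len(alleles))


# Approximate intron 2 amplicon sizes reported for B. tenagophila.
print(call_genotype([1400]))        # single large band
print(call_genotype([1200, 1400]))  # both bands present
```

A real analysis would estimate band sizes against the molecular weight markers on each gel, so the tolerance would have to reflect the resolution of the gel actually used.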
No differences in length were observed among the primary sequences (amino acids) of the myoglobins from the three Biomphalaria species, so the contribution of the exons to the size of the amplicons is the same for all three species. The differences among the myoglobin introns from B. glabrata, B. straminea and B. tenagophila suggest a potential for using them as molecular markers for the specific identification of these mollusks.
Analysis of the distribution of the possible alleles of the gene encoding myoglobin among individuals of the three species of Biomphalaria did not indicate a relation to the geographical origin of the mollusks, but this relationship cannot be ruled out, since a larger number of individuals would need to be analyzed.
Sequencing heterozygous genotypes whose alleles differ in size is a difficult task, and RFLP analysis is a way to get quick information about the nucleotide sequence [28]. Analysis by PCR-RFLP of the myoglobin introns showed that introns 2 and 3 are different in the three Biomphalaria species. Intron 1 was not cut by any of the five enzymes chosen, but this does not mean that the nucleotide sequence is the same, since introns are elements that accumulate a large number of mutations during evolution [7].
The Biomphalaria genus comprises morphologically very similar organisms, which makes their identification difficult. Classical identification is based on comparative morphology of the shells and anatomy of the reproductive organs, but the enormous intraspecific variation of morphological and anatomical features compromises species-level classification. Identification may also be hampered by the small size of the specimens, whose distinctive morphological characters are not very evident [24].
Monis [25] highlights the importance of traditional methods of identification and suggests combining them with molecular techniques to obtain a more robust and reliable result. In 1996, Vidigal and co-workers [26] demonstrated that it is possible to specifically differentiate B. glabrata and B. tenagophila by amplification of portions of their ribosomal gene. In 2001, the same authors were able to separate the species B. glabrata, B. tenagophila and B. straminea by PCR-RFLP of the internal transcribed spacer (ITS), a stretch of non-functional RNA situated between the structural ribosomal RNAs.
In the same context, Knight et al. [27], using the gene for ribosomal RNA to detect size polymorphisms in restriction fragments (RFLP), were able to show intraspecific variation in B. glabrata susceptible and resistant to infection by Schistosoma mansoni. These results show the efficiency of molecular techniques in the identification of these mollusks.

Genomic DNA Extraction of Snails
Genomic DNA was purified from foot tissue of each individual of Biomphalaria using a DNA purification kit (Wizard Genomic DNA Purification, Promega) according to recommendations from the manufacturer. The DNA quantification and purity were analyzed by spectrophotometry at 260 and 280 nm using a UV-160A UV-Visible recording spectrophotometer (Shimadzu). The DNA quality was determined by 1% agarose gel electrophoresis. The DNA was not pooled at any step.
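As an illustration of how readings at 260 and 280 nm are commonly turned into a concentration and a purity estimate, the following Python sketch uses the widely used convention that an A260 of 1.0 corresponds to about 50 ng/µL of double-stranded DNA in a 1 cm light path. The paper does not state the conversion factor or the readings it obtained, so the factor, the function names and the example values are assumptions.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # common convention for dsDNA, 1 cm path (assumed)


def dna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate the dsDNA concentration of the undiluted sample from A260."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio; values near 1.8 are usually taken as clean DNA."""
    return a260 / a280


# Hypothetical readings for a sample diluted 1:10 before measurement.
a260, a280 = 0.50, 0.27
print(dna_concentration_ng_per_ul(a260, dilution_factor=10))
print(round(purity_ratio(a260, a280), 2))
```

A ratio well below 1.8 is generally read as a sign of protein contamination, in which case the sample would be re-purified before PCR.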

Amplification of the Introns from the Biomphalaria Myoglobin Gene
Amplification of the introns was performed by PCR using three sets of primers, which were designed from the nucleotide sequence of the B. glabrata myoglobin gene available in GenBank (accession code U89283). The primers were BIO 1F and BIO 2 (Dewilde et al. 1998) and other primers annealing in exon regions or in the UTR (untranslated region), chosen to cover the complete sequence of the three introns of the myoglobin gene (Figure 6). All amplifications were reproduced at least five times.
The PCR for intron 1 was performed for 30 cycles of 94 °C for 30 s, 65 °C for 60 s and 72 °C for 2 min, followed by a final elongation step at 72 °C for 10 min. Introns 2 and 3 were amplified under the same conditions, except for an annealing temperature of 60 °C. The amplicons were analyzed by 6% polyacrylamide gel electrophoresis with silver staining; the size markers ΦX174/Hae III digest (Amersham Biosciences) and 1 kb Plus DNA ladder (Invitrogen) were applied.
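The cycling conditions can be written down as data, which makes the difference between the two programs explicit and allows the total programmed hold time to be checked. The Python sketch below is only an illustration: the names build_program and total_hold_seconds are hypothetical, and the total ignores ramping between temperatures and any initial denaturation step, since neither is described in the text.

```python
def build_program(annealing_temp_c, cycles=30):
    """Return the cycling program as (step name, temperature in C, seconds)."""
    cycle = [
        ("denaturation", 94, 30),
        ("annealing", annealing_temp_c, 60),
        ("extension", 72, 120),
    ]
    final = ("final elongation", 72, 600)
    return {"cycles": cycles, "cycle": cycle, "final": final}


def total_hold_seconds(program):
    """Sum the programmed hold times, excluding temperature ramps."""
    per_cycle = sum(seconds for _, _, seconds in program["cycle"])
    return program["cycles"] * per_cycle + program["final"][2]


intron1 = build_program(annealing_temp_c=65)     # intron 1
introns2_3 = build_program(annealing_temp_c=60)  # introns 2 and 3
print(total_hold_seconds(intron1) / 60)  # 115.0 minutes of programmed holds
```

Both programs have the same duration; only the annealing temperature differs.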

PCR-RFLP
PCR-RFLP was used to analyze the similarity of the nucleotide sequences of the myoglobin introns. The three introns of the myoglobin gene from the three Biomphalaria species were amplified as previously described, and the amplicons were subjected to digestion with the restriction endonucleases Xba I, Bgl II, Xho I, Hind III and Bam HI (Invitrogen). Only intronic fragments were digested, because the aim was to observe whether the nucleotide sequence of the introns differed among species of Biomphalaria. Using the full gene (whole-gene amplification) in RFLP would not make it possible to tell whether a given cut lies in an intron or an exon, or in which intron. Reactions were performed according to the manufacturer's instructions and were evaluated by 6% polyacrylamide gel electrophoresis with silver staining; the size marker ΦX174/Hae III digest (Amersham Biosciences) was applied.
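When an amplicon sequence is available, the fragment pattern expected from such a digestion can be predicted in silico. The Python sketch below is a minimal illustration and not the procedure used in this work, in which digestion was assessed on gels. It relies on the standard recognition sequences of the five enzymes, all of which are palindromic and cut after the first base of the site on the top strand. The function name digest and the toy sequence are hypothetical.

```python
# Standard recognition sequences (5'->3'); each enzyme cuts after the first base.
SITES = {
    "Xba I": "TCTAGA",
    "Bgl II": "AGATCT",
    "Xho I": "CTCGAG",
    "Hind III": "AAGCTT",
    "Bam HI": "GGATCC",
}
CUT_OFFSET = 1


def digest(sequence, enzyme):
    """Return the fragment lengths (bp) after complete digestion of a linear DNA."""
    seq = sequence.upper()
    site = SITES[enzyme]
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(site, pos + 1)
    # Fragment lengths are the distances between consecutive cut positions.
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Toy 16 bp sequence (not real intron data) carrying one Hind III site.
toy = "GGGG" + "AAGCTT" + "CCCCCC"
print(digest(toy, "Hind III"))  # [5, 11]
print(digest(toy, "Bam HI"))    # [16], no site, fragment left uncut
```

Because the sites are palindromic, scanning one strand is sufficient. An uncut amplicon, as observed for intron 1, corresponds to a single fragment of full length.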

Conclusion
The gene encoding myoglobin from B. glabrata, B. straminea and B. tenagophila displays size polymorphisms (in base pairs) in its intronic regions, which gives this gene a profile of homozygosity/heterozygosity. Given the differences among the myoglobin introns from B. glabrata, B. straminea and B. tenagophila, these introns could be used as molecular markers for the specific identification of these mollusks.