Mechanisms used for genomic proliferation by thermophilic group II introns


Abstract

Mobile group II introns, which are found in bacterial and organellar genomes, are site-specific retroelements hypothesized to be evolutionary ancestors of spliceosomal introns and retrotransposons in higher organisms. Most bacteria, however, contain no more than one or a few group II introns, making it unclear how introns could have proliferated to higher copy numbers in eukaryotic genomes. An exception is the thermophilic cyanobacterium Thermosynechococcus elongatus, which contains 28 closely related copies of a group II intron, constituting ~1.3% of the genome. Here, by using a combination of bioinformatics and mobility assays at different temperatures, we identified mechanisms that contribute to the proliferation of T. elongatus group II introns. These mechanisms include divergence of DNA target specificity to avoid target site saturation; adaptation of some intron-encoded reverse transcriptases to splice and mobilize multiple degenerate introns that do not encode reverse transcriptases, leading to a common splicing apparatus; and preferential insertion within other mobile introns or insertion elements, which provide new unoccupied sites in expanding non-essential DNA regions. Additionally, unlike mesophilic group II introns, the thermophilic T. elongatus introns rely on elevated temperatures to help promote DNA strand separation, enabling access to a larger number of DNA target sites by base pairing of the intron RNA, with minimal constraint from the reverse transcriptase. Our results provide insight into group II intron proliferation mechanisms and show that higher temperatures, which are thought to have prevailed on Earth during the emergence of eukaryotes, favor intron proliferation by increasing the accessibility of DNA target sites. We also identify actively mobile thermophilic introns, which may be useful for structural studies, gene targeting in thermophiles, and as a source of thermostable reverse transcriptases. © 2010 Mohr et al.

Figures

  • Figure 1. T. elongatus group II intron families and insertion sites. The 25 intact introns are classified into six families (F1–F6) based on their EBS sequences. Three other introns are fragments (TeI3g retains ~340 nts of the 3′ part of the intron starting in the En domain of the IEP; TeI3m lacks regions upstream of DIVa(3′); and TeI3n has a large internal deletion between DId(5′) and DIVa1). Colors highlight EBS sequences and complementary nucleotide residues in the IBS sequences. The EBS2 sequence of TeI4h could not be identified unambiguously from the secondary structure model and was defined by in vivo selections with donor and recipient plasmids in which potential EBS2 and IBS2 nucleotide residues were randomized (G.M. and A.M.L., unpublished data). doi:10.1371/journal.pbio.1000391.g001
  • Figure 2. T. elongatus group II intron RNA secondary structure and IEP. (A) Predicted secondary structure of TeI4h. Differences in TeI4c, TeI4f, and TeI3c are indicated in red, boxed, and blue letters, respectively. The structure consists of six conserved domains (DI–DVI). Subdomains and further subdivisions are denoted with letters followed by numbers (e.g., DIc1). Greek letters indicate nucleotide sequences involved in long-range tertiary interactions [2]; the 5′ and 3′ exons (E1 and E2, respectively) are boxed; and splice sites are indicated by open arrowheads. The gray boxes show a region of DIII that is replaced by a different sequence in TeI3c (blue, inset). (B) Secondary structure of DIV of ORF-containing TeI4 introns. The figure shows the secondary structure of DIV of TeI4h, with differences in TeI4c, 4f, and 4g indicated in red, boxed, and white letters in black boxes, respectively. The two potential start codons and the stop codon of the intron ORF are circled, and the arrow between the two potential start codons indicates the site at which TeI3c and other F3 introns insert into TeI4c and other F1 introns, resulting in the formation of twintrons. Regions that differ substantially in the ORF-less TeI3 introns are shaded gray. (C) Secondary structure of DIV of the ORF-less TeI3 introns. The figure shows the secondary structure of TeI3c, with differences in TeI3f, 3k, and 3l indicated in orange, green, and purple, respectively. Regions that differ from the ORF-containing TeI4 introns are shaded gray. Potential base pairings between the DIVa1 and DIVa2 loops are indicated at the upper right. A red circle highlights the extra U residue in DIVa1 of ORF-less introns (see also Figure S2). (D) Schematic of the TeI4h IEP. Conserved protein domains are: RT, containing conserved amino acid sequence blocks RT1–7 characteristic of the finger and palm regions of retroviral RTs; X/Thumb, region associated with maturase activity and corresponding in part to the RT thumb; D, DNA binding; and En, DNA endonuclease; RT-0 is a region conserved in the RTs of non-LTR retroelements [1,29]. Multiple sequence alignments of the TeI4h and other IEPs are shown in Figure S1. doi:10.1371/journal.pbio.1000391.g002
  • Figure 3. Phylogeny of T. elongatus introns. The figure shows a phylogram for all 25 intact T. elongatus introns. TeI4 introns were aligned with TeI3 introns by deleting ORF sequences in DIVb (positions 755–2290 of TeI4h). RNA sequences were aligned with ClustalX [49], and the alignment was refined manually and used as input for Phylip (ver. 3.69, with default parameters [50]). The phylogenies were generated with program modules DNAdist and DNAcomp using all of the Distance settings (F84, Kimura, Jukes-Cantor, LogDet) independently and varying the out-group (EcI5 or random Te intron). Trees were visualized with Treeview [51,52] and were essentially the same regardless of distance or out-group settings. Support for the major groupings of the phylogram was obtained by bootstrapping 1,000 data sets (using Seqboot from Phylip ver. 3.69) and using these as input for DNAdist. The output of the latter program was then used to obtain a consensus tree with Consense. The numbers indicate the percentage of times a particular grouping occurred in the 1,000 data sets. doi:10.1371/journal.pbio.1000391.g003
  • Figure 4. TeI4h intron mobility assays. (A) E. coli genetic assay of intron mobility. The CapR donor plasmid uses a T7lac promoter (PT7lac) to express a ΔORF intron (I-ΔORF) with short flanking 5′ and 3′ exons (E1 and E2, respectively) and the IEP downstream of E2. The intron, which carries a T7 promoter (PT7) in DIVb, integrates into a target site (ligated E1–E2 sequences) cloned in an AmpR recipient plasmid upstream of a promoterless tetR gene, thereby activating that gene. The donor and recipient plasmids are derivatives of pACD2X and pBRR-tet, respectively (see Materials and Methods). The assays are done in E. coli HMS174(DE3), which contains an IPTG-inducible T7 RNA polymerase, with intron expression induced with 500 µM IPTG for 1 h at different temperatures. Mobility efficiencies are calculated as the ratio of (TetR+AmpR)/AmpR colonies. (B) Mobility efficiency of the TeI4h-ΔORF (blue) and Ll.LtrB-ΔORF (red) introns as a function of induction temperature. The donor plasmid for the Ll.LtrB-ΔORF intron was pACD2X [47]. doi:10.1371/journal.pbio.1000391.g004
  • Table 1. Mobility efficiencies of TeI4h and effect of mutations at different temperatures.
  • Table 2. Mobility efficiencies for intron donor plasmids expressing different combinations of T. elongatus group II intron RNAs and IEPs.
  • Figure 5. Identification of critical nucleotide residues in the distal 5′-exon and 3′-exon regions of the DNA target sites of T. elongatus introns. (A) Intron donor plasmid TeI4h*/4h* at 37°C. (B) Intron donor plasmid TeI4h*/4h* at 48°C. (C) Intron donor plasmid TeI4c/4c at 48°C. (D) Intron donor plasmid TeI3c/4c at 48°C. Selection experiments were done in E. coli HMS174(DE3) containing the indicated intron donor plasmid and a recipient plasmid library randomized at the positions shown, as described in Materials and Methods. After selection by plating on LB medium containing antibiotics, AmpR+TetR colonies were analyzed by colony PCR and sequencing of the 5′- and 3′-integration junctions to identify nucleotide residues in active target sites. The WebLogo representation [39] depicts nucleotide frequencies at each randomized position in ~100 selected target sites, corrected for biases in the initial pool based on sequences of ~100 randomly chosen recipient plasmids [28]. The sequence of the intron-insertion site in the T. elongatus genome is shown, with white bases on black background indicating randomized nucleotides belonging to IBS2. Summarized below are nucleotide frequencies (percent) at each randomized position in (i) active target sites after intron insertion (“selected”), (ii) randomly chosen recipient plasmids from the original pool (“pool”), and (iii) active target sites corrected for nucleotide frequency biases in the initial pools (“corrected”). The latter were used to generate the WebLogos. In some cases, percentage totals do not equal 100 due to rounding off. doi:10.1371/journal.pbio.1000391.g005
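
The two calculations named in the Figure 4 and Figure 5 captions can be illustrated with a minimal Python sketch. The mobility efficiency follows the ratio stated in the Figure 4 caption, (TetR+AmpR)/AmpR colonies. The pool correction is only an assumption: the caption says selected frequencies were corrected for biases in the initial pool but does not give the formula, so the sketch divides each selected frequency by the pool frequency and renormalizes to 100%. The function names and all example numbers are hypothetical and are not data from the paper.

```python
BASES = "ACGT"


def mobility_efficiency(tet_amp_colonies, amp_colonies):
    """Ratio of (TetR+AmpR) colonies to AmpR colonies (Figure 4 caption)."""
    if amp_colonies <= 0:
        raise ValueError("need at least one AmpR colony to form the ratio")
    if tet_amp_colonies < 0:
        raise ValueError("colony counts cannot be negative")
    return tet_amp_colonies / amp_colonies


def corrected_frequencies(selected, pool):
    """Correct selected base frequencies (percent) for bias in the pool.

    ASSUMPTION: the correction is selected/pool per base, renormalized so
    the four bases sum to 100%. The source does not state the formula.
    """
    weights = {}
    for base in BASES:
        sel_freq = selected.get(base, 0.0)
        pool_freq = pool.get(base, 0.0)
        if pool_freq <= 0:
            if sel_freq > 0:
                raise ValueError(f"base {base} selected but absent from pool")
            weights[base] = 0.0
        else:
            weights[base] = sel_freq / pool_freq
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no selected bases to normalize")
    return {base: 100.0 * w / total for base, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical counts: 25 TetR+AmpR colonies per 1000 AmpR colonies.
    print(mobility_efficiency(25, 1000))
    # Hypothetical position where the pool is A-rich; selection mirrors the
    # pool exactly, so the corrected frequencies come out uniform.
    skewed = {"A": 40.0, "C": 20.0, "G": 20.0, "T": 20.0}
    print(corrected_frequencies(skewed, skewed))
```

Under this assumed correction, a position whose selected frequencies simply mirror the pool yields a flat profile, which is the behavior one would want before drawing a sequence logo: only enrichment relative to the starting library shows up as signal.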

References

  • The CLUSTAL X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools
  • Treeview: An application to display phylogenetic trees on personal computers
  • WebLogo: A sequence logo generator

Citation (APA)

Mohr, G., Ghanem, E., & Lambowitz, A. M. (2010). Mechanisms used for genomic proliferation by thermophilic group II introns. PLoS Biology, 8(6). https://doi.org/10.1371/journal.pbio.1000391
