Computational prediction and molecular confirmation of Helitron transposons in the maize genome.

Chunguang Du, Jason Caronna, Limei He, Hugo K Dooner.

Abstract

BACKGROUND: Helitrons represent a new class of transposable elements recently uncovered in plants and animals. One remarkable feature of Helitrons is their ability to capture gene sequences, which makes them of considerable potential evolutionary importance. However, because Helitrons lack the typical structural features of other DNA transposable elements, identifying them is a challenge. Currently, most researchers identify Helitrons manually by comparing sequences. With the maize whole genome sequencing project underway, an automated computational Helitron searching tool is needed. The characterization of Helitron activities in maize needs to be addressed in order to better understand the impact of Helitrons on the organization of the genome.
RESULTS: We developed and implemented a heuristic searching algorithm in PERL for identifying Helitrons. Our HelitronFinder program will (i) take FASTA-formatted DNA sequences as input and identify the hairpin looping patterns, and (ii) exploit the consensus 5' and 3' end sequences of known Helitrons to identify putative ends. We randomly selected five predicted Helitrons from the program's high quality output for molecular verification. Four out of the five predicted Helitrons were confirmed by PCR assays and DNA sequencing in different maize inbred lines. The HelitronFinder program identified two adjacent, head-to-tail, dissimilar Helitrons in a maize BAC sequence.
CONCLUSION: We have identified 140 new Helitron candidates in maize with our computational tool HelitronFinder by searching maize DNA sequences currently available in GenBank. Four out of five candidates were confirmed to be real by empirical methods, thus validating the predictions of HelitronFinder. Additional points to emerge from our study are that Helitrons do not always insert at an AT dinucleotide in the host sequences, that they can insert immediately adjacent to an existing Helitron, and that their movement may cause changes in the flanking region, such as deletions.

Year:  2008        PMID: 18226261      PMCID: PMC2267711          DOI: 10.1186/1471-2164-9-51

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Helitrons represent a new class of transposable elements recently uncovered in animals and plants [1], including maize [2-4]. The first two Helitrons described in maize were the causative agents of stable mutations: one in the shrunken2 mutant sh2-7527 [2] and another one in the barren stalk1 reference mutant ba1-Ref [3]. The termini of a 6525-bp Helitron in the ba1-Ref mutant share striking similarity with those of the Helitron insertion in the sh2-7527 mutant, indicating that they belong to the same family. Lai et al. [4] reported that two Helitrons, HelA and HelB, accounted for all of the genic differences distinguishing two previously described bz locus haplotypes [5]. HelA is 5.9 kb long and contains sequences for three of the four genes found only in the McC bz-locus haplotype. A nearly identical copy of HelA was isolated from a different chromosomal site in the B73 inbred. Both sites appear to be polymorphic in maize, suggesting that these Helitrons have been active recently.

Basic Helitron features include:
• Conserved TC and CTAG sequences at the 5' and 3' termini, respectively
• Palindromes (16- to 20-bp 'hairpin loops') 10–15 bp upstream of the 3' terminus
• Flanking A and T host nucleotides at the 5' and 3' termini, respectively

Figure 1, from a recent paper [4] comparing Helitron end sequences, contains the 5' and 3' termini of the maize Helitrons HelA-1 and HelB from line McC, HelA-2 from B73, the Helitron insertions in mutants sh2-7527 and ba1-Ref, and the rice Helitron2_OS. Helitron sequences are in uppercase letters and the invariant host nucleotides where the Helitrons insert are in lowercase letters. Conserved nucleotides at the 5' and 3' termini are in bold uppercase letters and the inverted repeats at the 3' termini are underlined. The nonconserved body of the Helitrons is represented by dots.
Figure 1

Helitron end sequence alignment by Lai et al. [4]. It contains the 5' and 3' termini of the maize Helitrons HelA-1 and HelB from line McC, HelA-2 from B73, the Helitron insertions in mutants sh2-7527 and ba1-Ref, and the rice Helitron2_OS. Helitron sequences are in uppercase letters and the invariant host nucleotides where the Helitrons insert are in lowercase letters. Conserved nucleotides at the 5' and 3' termini are in bold uppercase letters and the inverted repeats at the 3' termini are underlined. The nonconserved body of the Helitrons is represented by dots.

Besides the typical Helitron features they all share, there are two invariant CGs located 10 bp apart in each member of the palindromic repeat, the second one occurring just 9 bp from the 3' end. In the HelA subgroup, there is an invariant AA dinucleotide between the palindromic repeats. The 3' terminal 30 bp of HelA are highly conserved among Helitrons: of those 30 bp, HelA shares 26 and 24 bp, respectively, with the Helitrons previously identified as the causative agents of mutations at sh2 and ba1.

One remarkable feature of Helitrons is their ability to capture gene sequences, a feature that makes them of considerable potential evolutionary importance. However, because Helitrons lack the typical structural features of other DNA transposable elements, identifying them is a challenge. Currently, most researchers identify Helitrons manually by comparing sequences. For example, Wang and Dooner [6] identified Helitrons by vertical comparisons of the bz regions from eight different maize inbred lines. Although very precise, this approach is time consuming. Recently, a model-based approach to Helitron identification was introduced for Arabidopsis thaliana [7]. With the maize whole genome sequencing project underway, an automated computational Helitron searching tool is needed. The characterization of Helitron activities in the maize genome also needs to be addressed in order to better understand the impact of Helitrons on the organization of the maize genome.

Results

Identification of Helitrons by in silico Analysis

Non-autonomous Helitrons in maize fall into two main categories: Hel1 (or HelA) and Hel2 (or HelB). The majority of identified Helitrons in maize are of the HelA type (listed in Table 1, which was kindly provided by Dr. S. Lal), so our HelitronFinder program is focussed exclusively on the prediction of maize HelA type Helitrons.
Table 1

Known HelA Type Helitrons in Maize

Helitron  Maize line  Accession  Start   End     Size   Source
HelA-1b   W22         DQ186636   1       5189    5189   He & Dooner, 2005 [10]
HelA-1c   W22         DQ186637   1       5189    5189   Li & Dooner, 2005 [11]
Hel1-1                AF293457   1       ~17700         Lal et al., 2003 [2]
Hel1-2                AY645947   1       6525    6525   Gupta et al., 2005 [3]
Hel1-3a   B73         AF466934   8370    82950   34581  Gupta et al., 2005 [3]
Hel1-3b   B73         AF466932   38471   73780   35310  Gupta et al., 2005 [3]
Hel1-4    BSSS53      AF090447   4408    22158   17751  Gupta et al., 2005 [3]
Hel1-5a   McC         DQ186635   1       5858    5858   Lai et al., 2005 [4]
Hel1-7a   B73         AY664413   210885  205938  4946   Morgante et al., 2005 [5]
Hel1-7c   Mo17        DQ002408   47752   56904   9153   Brunner et al., 2005 [12]
Hel1-7d   Mo17        DQ002406   61262   66313   5052   Brunner et al., 2005 [12]
Hel1-8    B73         AY664413   240549  259755  19207  Morgante et al., 2005 [5]
Hel1-9    B73         AY664413   7748    5070    2677   Morgante et al., 2005 [5]
Hel1-10   B73         AY664414   89533   81613   7919   Morgante et al., 2005 [5]
Hel1-12   B73         AY371488   96529   89735   6793   Morgante et al., 2005 [5]
Hel1-13   B73         AY530951   134622  138054  3433   Morgante et al., 2005 [5]
Hel1-14   B73         AY664419   262092  273049  10958  Morgante et al., 2005 [5]
Hel1-15   B73         AY664415   266036  267537  1502   Morgante et al., 2005 [5]

All the Helitrons in this table, which was kindly provided by Dr. S. Lal, have been published. The pertinent references are listed under the "Source" column. The accession numbers refer to entries in the GenBank sequence database: the Helitron coordinates in the sequence are identified under the "Start" and "End" columns.

The 'hairpin loop' and the CTAG termini at the 3' end of known Helitrons are the key characteristics for the identification of new Helitrons. The most challenging part is to identify the 5' end. For this purpose, we selected the first 25 nucleotides from the 5' end of each known Helitron of Table 1 and aligned them using Clustal [8]. There is a strong similarity in the first 18 nucleotides among the aligned Helitrons (Fig. 2). The consensus from the alignment is our main criterion to search for the 5' end of new Helitrons.
Figure 2

Alignment of the first 25 nucleotides of known maize Helitron 5' ends. A * means that all the sequences at that particular location are the same. There is a strong similarity in the first 18 nucleotides among the aligned Helitrons. The consensus from the alignment is our main criterion to search for the 5' end of new Helitrons.

We chose the first 18 nucleotides from Figure 2 as our 5' end search criterion: TC[TC][CA]TA[CT]TA[CA][TC][TCA][TA][T or none]AAG. Ambiguous nucleotides at a particular location are included within brackets []. The 3' ends of known Helitrons have CTAG termini. For HelA type Helitrons, the double 'A' is often in the middle of the 'CG' bases in the hairpin loop (Fig. 3). The approaches used for searching 3' ends are detailed in Figure 4.
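The bracket notation of the 5' end criterion maps directly onto a regular expression. The following is a minimal Python sketch of that mapping, not the HelitronFinder source (which is written in PERL). It assumes that "[T or none]" can be written as an optional T; the function name `find_5prime_ends` and the toy sequence are hypothetical and serve only as illustration.

```python
import re

# 5' end criterion quoted in the text, written as a regular expression.
# Assumption: "[T or none]" is expressed as an optional T ("T?").
FIVE_PRIME = re.compile(r"TC[TC][CA]TA[CT]TA[CA][TC][TCA][TA]T?AAG")


def find_5prime_ends(seq):
    """Return (start, matched_text) for each candidate 5' end (0-based start)."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in FIVE_PRIME.finditer(seq)]


# Hypothetical toy sequence (not maize data) with one planted 5' end.
toy_5prime = "GGGA" + "TCTCTACTACTATTAAG" + "CCCC"
for start, text in find_5prime_ends(toy_5prime):
    print(start, text)
```

Because `finditer` reports non-overlapping matches from left to right, a real scan would typically also be run on the reverse complement of the input to cover both orientations.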
Figure 3

Alignment of the last 50 nucleotides of known maize Helitrons 3' end. A * means that all the sequences at that particular location are the same. The 3' ends of known Helitrons have CTAG termini. For HelA type Helitrons, the double 'A' is often in the middle of the 'CG' bases in the hairpin loop.

Figure 4

The heuristic algorithm for searching 3' end of Helitrons. The 'hairpin loop' and the CTAG termini at the 3' end of known Helitrons are the key characteristics for the identification of new Helitrons. For HelA type Helitrons, the double 'A' is often in the middle of the 'CG' bases in the hairpin loop.

We downloaded maize sequences from the GenBank non-redundant database to our local Sun workstation and used the HelitronFinder program to predict Helitron candidates. There are 44 and 102 predicted Helitrons in our "high quality" and "medium quality" outputs, respectively. The output files are in text format, with a GenBank accession number for each predicted Helitron. Outputs specifically identify Helitron sequences as being in a forward or reverse complement orientation. The HelitronFinder program also successfully identified all the known Helitrons listed in Table 1.

Confirmation of Helitrons by Molecular Analysis

We randomly selected five predicted Helitrons from the program's high quality output for molecular verification. PCR primers were designed based on the flanking sequence of each predicted Helitron. We surveyed 11 maize inbred and genetic lines for three of the five Helitron candidates and 15 lines for the other two. Four sets of primers successfully amplified either the Helitron-occupied or the Helitron-vacant site from different lines. The PCR products highlighted in bold in Table 2 were cloned and sequenced for further confirmation.
Table 2

Molecular Verification of Helitrons

Germplasm  Silico 1  Silico 2  Silico 3  Silico 4
4Co63      Vacant    Occupied  x         Vacant
A188       x         x         x         Vacant
A636       Vacant    Occupied  Occupied  Vacant
B73        Vacant    Occupied  Occupied  Vacant
BSSS53     Occupied  x         x         Vacant
McC        Vacant    Occupied  Vacant    Occupied
H99        Vacant    Vacant    x         Vacant
M14        Vacant    Occupied  x         Occupied
Mo17       Vacant    Vacant    x         Vacant
W22        Vacant    Occupied  Vacant    x
W23        Occupied  x         Vacant    Occupied
CML139                         x         Vacant
I137 TN                        x         Vacant
Ki3                            x         x
bz-R                           Vacant    Vacant

Silico 1, Silico 2, Silico 3, and Silico 4 are Helitron candidates predicted by the HelitronFinder program. 4Co63, A188, A636, B73, BSSS53, H99, M14, Mo17, W22, W23, CML139, I137 TN, and Ki3 are inbred lines. McC and bz-R are genetic lines. Vacant: amplified PCR product lacks a Helitron.

Occupied: amplified PCR product contains a Helitron, which was confirmed by sequencing.

x: no PCR product detected.

Blank: line not tested for the corresponding Helitron candidate.

The PCR products highlighted in yellow have been cloned and sequenced for further confirmation.

Four out of five selected predicted Helitrons were confirmed by PCR products from different maize inbred lines (Table 2). They are named Silico 1, 2, 3, and 4, and are predicted from BSSS53, B73, B73, and McC sequences, respectively. The "occupied" and "vacant" entries denote PCR bands corresponding to the presence and absence of Helitrons, respectively. The x sign stands for no PCR amplification product. In addition to the inbreds from whose sequences they were predicted, Helitrons were detected in other inbreds. Thus, Silico 1 is present in W23, besides BSSS53, but absent in eight other inbred lines; Silico 2 is present in 4Co63, A636, McC, M14, and W22, besides B73, but absent in H99 and Mo17; Silico 3 is present in A636, besides B73, but absent in McC, W22, W23, and a bz-R genetic line, and Silico 4 is present in M14 and W23, besides McC, but absent in 4Co63, A188, A636, B73, H99, Mo17, CML139, I137TN, and bz-R.

Silico1 was predicted from the BSSS53 sequence. A Helitron-occupied site was also detected in W23 while Helitron-vacant sites were detected in 4Co63, A636, B73, McC, H99, M14, Mo17, and W22 (Fig. 5). This result reveals +/- polymorphism among different inbred lines and confirms that the predicted Helitron, Silico1, is genuine.
Figure 5

Silico1 PCR products. Lanes: 1, size markers; 2, 4Co63; 3, A188; 4, A636; 5, B73; 6, BSSS53; 7, McC; 8, H99; 9, M14; 10, Mo17; 11, W22; 12, W23; 13, H2O. Silico1 is predicted from BSSS53 via our HelitronFinder software and is underlined in order to differentiate it from other lines. A Helitron-occupied site was also detected in W23 while Helitron-vacant sites were detected in 4Co63, A636, B73, McC, H99, M14, Mo17, and W22.

Silico3 was predicted from the B73 maize sequences. A total of 15 maize lines were used for molecular verification of this HelitronFinder prediction (Fig. 6). Both B73 and A636 show Helitron occupied sites, whereas lines McC, A188, W22, W23, and bz-R show Helitron vacant sites. In addition to the Helitron band amplified from B73, there was a faint band of the same size as the vacant site. We cloned and sequenced this product and confirmed it to be a vacant site.
Figure 6

Silico 3 PCR products. Lanes: 1, size markers; 2, blank; 3, 4Co63; 4, A188; 5, A636; 6, B73; 7, BSSS53; 8, McC; 9, CML139; 10, H99; 11, I137TN; 12, Ki3; 13, M14; 14, Mo17; 15, W22; 16, W23; 17, bz-R. Silico 3 is predicted from B73 via our HelitronFinder software and is underlined in order to differentiate it from other lines. Both B73 and A636 show Helitron occupied sites, whereas lines McC, A188, W22, W23, and bz-R show Helitron vacant sites. In addition to the Helitron band amplified from B73, there was a faint band of the same size as the vacant site. We sequenced this product and confirmed it to be a vacant site.

Characterizations of Helitrons in the Maize Genome

Discovery of Two Adjacent Helitrons

The HelitronFinder program identified two adjacent, head-to-tail Helitrons in a maize BAC sequence with GenBank accession number AF466202 (Fig. 7). This is the first case of back-to-back Helitrons detected in the maize genome. A peculiarity of these head-to-tail Helitron configurations is that the TC 5' terminus of the second Helitron follows the CTAG 3' terminus of the first, creating a novel G/T junction, rather than the A/T junction normally found at a Helitron's 5' end. Pritham and Feschotte [9] reported several cases of perfect head-to-tail junctions of two Helitron elements in the genome of the bat Myotis lucifugus. They suggested that these were tandem repeats of Helitrons in the Myotis lucifugus genome. They also argued that one would expect the A of the host target site to occur between the CTAG end of the first element and the TC start of the second element if the elements had inserted independently. We aligned these two adjacent maize Helitrons and found that the sequences differed significantly and contained different genes or gene fragments. This indicates they are not tandem repeats, but arose by consecutive insertions.
Figure 7

Two adjacent Helitrons detected in the r1 region of B73 (GenBank accession number AF466202).

We designed four pairs of primers for these two Helitrons, F1/R1, F3/R3, F2/R4, and F4/R4 (Fig. 8). F and R represent forward and reverse primers, respectively. According to the PCR products in Table 3, we detected both Helitrons in lines A636 and B73, only Helitron No.2 in lines McC, W22, and W23, and neither Helitron in lines A188, CML139, H99, Ki3, M14, or Mo17. This result lends itself to two interpretations. One possibility is that Helitron No.2 (left) inserted into the maize genome first and that Helitron No.1 (right) inserted subsequently, and noncanonically, at the GT dinucleotide found at the 3' end of Helitron No.2. An alternative is that Helitron No.1 inserted first and Helitron No.2 inserted subsequently, and canonically, at the AT dinucleotide created by the host A and the T at the 5' end of Helitron No.1. Following the formation of this head-to-tail configuration (found in lines B73 and A636), Helitron No.1 would have excised cleanly (see next section), leaving only Helitron No.2 at the insertion site (as in McC, W22, and W23).
Figure 8

Location of PCR primers flanking and internal to adjacent Helitrons identified in sequence AF466202. We designed four pairs of primers for these two Helitrons: F1/R1, F3/R3, F2/R4, and F4/R4. F and R represent forward and reverse primers, respectively.

Table 3

Molecular Analysis of Two Adjacent Helitrons

Inbred line  F1 + R1  F3 + R3  F2 + R4  F4 + R4  Conclusions
4Co63        x        x        N/A      1 kb
A188         x        x        0.7 kb   N/A      No.1- No.2-
A636         3 kb     0.6 kb   x        x        No.1+ No.2+
B73          3 kb     0.6 kb   x        1 kb     No.1+ No.2+
BSSS53       x        x        N/A      N/A
McC          0.5 kb   0.6 kb   x        0.7 kb   No.1- No.2+
CML139       x        x        0.7 kb   N/A      No.1- No.2-
H99          x        x        0.7 kb   x        No.1- No.2-
I137TN       x        N/A      N/A      1 kb
Ki3          x        x        0.7 kb   N/A      No.1- No.2-
M14          x        x        0.7 kb   N/A      No.1- No.2-
Mo17         x        x        0.7 kb   N/A      No.1- No.2-
W22          0.5 kb   0.6 kb   x        N/A      No.1- No.2+
W23          0.5 kb   0.6 kb   N/A      N/A      No.1- No.2+

PCR results from different primer combinations.

+: Helitron present

-: Helitron absent

x: no PCR amplification

N/A: no PCR test

Conclusions were based on PCR results. Both No.1 and No.2 Helitrons were detected in lines A636 and B73, only Helitron No.2 in lines McC, W22, and W23, and neither Helitron in lines A188, CML139, H99, Ki3, M14, or Mo17. No conclusion could be reached for 4Co63, BSSS53, and I137 TN based on the above PCR results.

A Putative Helitron Somatic Excision

We further cloned and sequenced the PCR products of Silico3 from lines A636, B73, McC, W22, W23, and bz-R. Fig. 9 presents the sequence alignment showing the insertion of the predicted Helitron Silico3 in A636 and B73. There is no Helitron insertion in McC (C7053), W22, W23, or bz-R. The sequence results validate the HelitronFinder prediction. Interestingly, in addition to an occupied site, B73 also shows a weak Silico3 vacant-site-sized band (Fig. 6). Sequencing of this PCR product confirmed it to be an unoccupied site (Fig. 9). Because there are no sequence polymorphisms in the adjacent sequences, we cannot rule out the possibility that this band arose from DNA contamination in the B73 DNA preparation. Alternatively, this band may represent Helitron somatic excision products, which have been found at other polymorphic sites in maize (Y. Li and H.K. Dooner, unpublished data). This would be a surprising result, given that Helitrons presumably transpose by a rolling circle transposition mechanism that does not generate empty sites.
Figure 9

Alignment of Silico 3 sequences indicating the insertion of the predicted Helitron Silico3 in A636 and B73. There is no Helitron insertion in McC, W22, W23, or bz-R. It is interesting that, in addition to an occupied site, B73 also shows a weak Silico3 vacant-site-sized band in Fig. 6. Sequencing of this PCR product confirmed it to be an unoccupied site.

Deletion of Helitron Flanking Regions

The PCR products of Silico1 (Fig. 5) from A636, B73, BSSS53, Mo17, W23, and 4Co63 were also cloned and sequenced. In addition to the BSSS53 inbred line from which Silico1 was predicted, we were able to amplify and sequence the 5' end of Silico1 from W23. The sequences of Silico 1 occupied and vacant sites are aligned in Fig. 10. Silico1 is present in W23 and BSSS53 and absent from B73, A636, 4Co63, and Mo17. The 3' flanking region in B73 is identical to that in BSSS53. However, the 3' end flanking regions of Silico1 in A636, 4Co63, and Mo17 are missing 38 nucleotides. The presence of the same deletion in three different lines points to a common origin of this chromosomal segment. Possibly, the deletion arose following the imprecise excision of Silico 1 from an occupied site in a common progenitor of these lines.
Figure 10

Alignment of Silico 1 sequences. Silico1 is present in W23 and BSSS53 and absent from B73, A636, 4Co63, and Mo17. The 3' flanking region in B73 is identical to that in BSSS53. However, the 3' end flanking regions of Silico1 in A636, 4Co63, and Mo17 are missing 38 nucleotides.

Discussion

Helitrons are novel transposons that have not been well characterized experimentally. Implementing our maize Helitron discovery algorithm, we found two adjacent Helitrons, which we arbitrarily named No.1 and No.2, in the r1 region of B73 (Figs. 7 and 8). Here, we propose two models for how these adjacent Helitrons arose. One hypothesis is that these are tandem repeats, which arose by the Helitron's rolling circle mechanism of replication, as postulated by Pritham and Feschotte [9]. An alternative hypothesis is that one Helitron inserted next to an existing Helitron. The sequence data support the latter model. Helitron No.1 contains an S-receptor kinase gene with only one exon, whereas Helitron No.2 carries an aldose reductase gene. We attempted to align these two Helitrons, excluding the S-receptor kinase and aldose reductase genes. There are large differences between the two Helitrons, indicating that Helitrons No.1 and No.2 do not represent tandem repeats. Our characterization of PCR products from several maize lines supports the second hypothesis of two independent insertions, but the order of insertion is not clear. Helitron No.2 could have inserted first, and No.1 subsequently, next to the 3' end of No.2, in which case No.1 would have inserted at a GT site, instead of the canonical AT site. Alternatively, the two Helitrons could have inserted in reverse order, followed by the precise excision of Helitron No.1 in a common progenitor of modern maize lines having only Helitron No.2 at the insertion site.

Most known Helitrons in Table 1 carry gene fragments and not fully functional genes. One of the two adjacent Helitrons (No.1) contains a gene with only one exon. We searched GenBank with both nucleotide and amino acid sequence queries and found a cognate single-exon gene in rice. This may indicate that Helitron No.1 carries a fully functional gene. It is not clear at this point how Helitrons acquire host sequences, but it is important to learn whether Helitrons have the ability to trap fully functional genes and mobilize them around the genome. More studies need to be conducted to determine if the gene inserted into Helitron No.1 is a fully functional gene.

We detected a putative Helitron excision product in the B73 inbred (Fig. 9), but could not rule out DNA contamination because of the absence of polymorphisms in the adjacent sequences. All four confirmed Helitrons are present in some inbred lines and absent in others. This suggests that Helitrons have been active in the maize genome. We speculate that insertions and excisions of Helitrons can cause changes in the flanking regions, such as the 38-bp deletion shown in Fig. 10.

Conclusion

We have identified 140 new Helitron candidates in maize with our computational tool HelitronFinder. Four out of five candidates were confirmed to be real by empirical methods, thus validating the predictions of our program. Additional points to emerge from our study are that Helitrons may not always insert at an AT dinucleotide in the host sequences, that they can insert immediately adjacent to an existing Helitron, and that Helitron movement may cause changes in the flanking region, such as deletions.

Methods

Heuristic Search Algorithm of HelitronFinder

The HelitronFinder program is written in PERL and uses regular expressions to look for the specified Helitron patterns in the maize genome. The update_blastdb.pl script provided by NCBI was modified to work with the HelitronFinder program to download the maize genome DNA sequences in FASTA format when requested. HelitronFinder searches the input DNA sequences in both the forward and reverse directions. For each direction, there are two main subroutines to search for the 5' and 3' ends, respectively. The 5' end subroutine uses the consensus derived from Figure 2 as its search criterion. This is relatively straightforward. However, the 3' end structure is more complex, requiring a search for 16- to 20-bp palindromes in the DNA sequences. More specifically, we look for palindromes containing the self-pairing CG and the double A in the middle that characterize HelA type Helitrons. Then, the subroutine identifies 3' CTRR termini (R = A or G) within 20 bp downstream of the palindrome and outputs the sequences from the beginning of the palindrome to the 3' CTRR terminus, along with their coordinates. For each possible instance of a 5' end, the subroutine lists the closest 3' ends within 50,000 bases.

The HelitronFinder program has two levels of constraints for the searching criteria, high quality and medium quality.

The 5' end criterion of the high quality constraint is:
(TC[CT][CA]TA[CT]TA[CA][TC][ATC][ATC])([ATCG])([TA]TAAG)

The 3' end criterion of the high quality constraint is:
(CG)([ATCG]{3,5})(AA)([ATCG]{3,5})(CG)([ATCG]{9})(CTAGT)

The double 'A' (AA) between the two CG pairs is one of the characteristics of HelA type Helitrons. The high quality searching criterion mainly targets this type of Helitron.

For the medium quality searching criterion, we use fewer constraints than for the high quality criterion. The 5' end consensus is kept as close to the high quality one as possible. However, we pick a less conserved 3' end, as below:
(CG)([ATCG]{9,12})(CG)([ATCG]{1,13})(CT[AG][AG]T)

This criterion is able to predict HelB type Helitrons as well.
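The search steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the PERL implementation. The three patterns are the ones quoted in the text, with the spaces introduced by typesetting removed. The function names, the rule of taking the first downstream 3' hit as "closest", the reuse of the high quality 5' pattern in medium quality mode, and the toy sequence are all assumptions made for this example. The sketch applies only the quoted patterns and does not separately test the hairpin for self-complementarity.

```python
import re

# Patterns quoted in the text (spaces removed).
HQ_5PRIME = re.compile(r"(TC[CT][CA]TA[CT]TA[CA][TC][ATC][ATC])([ATCG])([TA]TAAG)")
HQ_3PRIME = re.compile(r"(CG)([ATCG]{3,5})(AA)([ATCG]{3,5})(CG)([ATCG]{9})(CTAGT)")
MED_3PRIME = re.compile(r"(CG)([ATCG]{9,12})(CG)([ATCG]{1,13})(CT[AG][AG]T)")

MAX_SPAN = 50_000  # maximum distance between 5' and 3' ends stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string; non-ACGT symbols are left as-is."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidates_one_strand(seq, three_prime=HQ_3PRIME):
    """Pair each 5' hit with the first downstream 3' hit within MAX_SPAN.

    Returns (start, end) with 0-based start and exclusive end. The last
    base of the 3' pattern is the flanking host T, so it is excluded.
    """
    ends = [m.end() - 1 for m in three_prime.finditer(seq)]  # ascending
    pairs = []
    for m in HQ_5PRIME.finditer(seq):
        for end in ends:
            if end <= m.end():
                continue  # this 3' end does not lie downstream of the 5' hit
            if end - m.start() <= MAX_SPAN:
                pairs.append((m.start(), end))
            break  # later 3' ends are farther away, so stop here
    return pairs


def find_candidates(seq, three_prime=HQ_3PRIME):
    """Search both orientations; coordinates refer to the input sequence."""
    seq = seq.upper()
    n = len(seq)
    hits = [("+", s, e) for s, e in candidates_one_strand(seq, three_prime)]
    rc = reverse_complement(seq)
    hits += [("-", n - e, n - s) for s, e in candidates_one_strand(rc, three_prime)]
    return hits


# Hypothetical toy element (not maize data): host A, 5' end, body, 3' end, host T.
five = "TCTCTACTACTAT" + "A" + "TTAAG"
three = "CGGCAAATGCCG" + "ATATATATA" + "CTAGT"
toy = "A" + five + "GATTACA" * 3 + three
print(find_candidates(toy))
```

Passing `three_prime=MED_3PRIME` switches the sketch to the less conserved 3' criterion. Searching the reverse complement and mapping the coordinates back is one simple way to handle both orientations; writing reversed patterns would be an alternative.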

Primer Design

PCR primer pairs were designed based on the 500 bp of sequences flanking each Helitron end.

Silico 1 primers:
Forward CTGCACCACCGTCTCTACAA
Reverse TAGCCGCTCCTAAGAAGCAC

Silico 2 primers:
Forward GCGACCAAACCATAGCAAAA
Reverse AGGGGCATGAGTAGCTTCCT

Silico 3 primers:
Forward1 (F1) CCACTTCTCCAGTTCCTTGG
Reverse1 (R1) GGGCGTAACATCATGTCATT
Forward2 (F2) GTTGGGACCCAGCTGTTAGA
Reverse2 (R2) ACCAAGAAGTTGGCCTCTCC
Forward3 (F3) AGGGTTTTCGTTGGAGGAGT
Reverse3 (R3) GATTCGAGTGTCCGCTTGAT
Forward4 (F4) AAGACAGCGGCTAGGGTTTT
Reverse4 (R4) TGTTTTGCACGGTGTGGTAG

Silico 4 primers:
Forward TATCCCCGAGTCAAAACTGC
Reverse CGACGACAGCTTCACTGACA

Cloning, Sequencing

PCR products were then cloned into the pGEM-T Easy vector (Promega). Sequences were obtained on a 3700 DNA Analyzer using BigDye v3.1 terminator reactions (Applied Biosystems). Consensus sequences were used for analysis.

Availability and Requirements

The HelitronFinder program is available for public access at The detailed description and sample run are also provided at the website.

Authors' contributions

CD conceived, designed and coordinated the study, carried out the sequence alignment and drafted the manuscript. JC implemented HelitronFinder in PERL. LH carried out the PCR and sequence analysis of the predicted Helitrons and helped to draft the manuscript. HKD designed and coordinated the study and helped to write the manuscript.