Rita Rey-Baños, Luis E Sáenz de Miera, Pedro García, Marcelino Pérez de la Vega.
Abstract
Retrotransposons with long terminal repeats (LTR-RTs) are widespread mobile elements in eukaryotic genomes. We obtained a total of 81 partial LTR-RT sequences from lentil corresponding to internal retrotransposon components and LTRs. Sequences were obtained by PCR from genomic DNA. Approximately 37% of the LTR-RT internal sequences contained premature stop codons, indicating that these elements must be non-autonomous. LTR sequences were obtained using the iPBS technique, which amplifies sequences between LTR-RTs. A total of 193 retrotransposon-derived genetic markers, mainly iPBS, were used to obtain a genetic linkage map from 94 F7 recombinant inbred lines derived from the cross between the cultivar Lupa and the wild ancestor L. culinaris subsp. orientalis. The genetic map included 136 markers located in eight linkage groups. Clusters of tightly linked retrotransposon-derived markers were detected in linkage groups LG1, LG2, and LG6, denoting a non-random genomic distribution. Phylogenetic analyses identified the LTR-RT families to which the internal and LTR sequences belong. Ty3-gypsy elements were more frequent than Ty1-copia, mainly due to the high frequency of Ogre elements in lentil, as also occurs in other species of the tribe Vicieae. LTR and internal sequences were used to analyze in silico their distribution among the contigs of the lentil draft genome. Up to 8.8% of the lentil contigs contained at least one LTR-RT-like sequence. A statistical analysis suggested a non-random distribution of these elements within the lentil genome. In most cases (between 72% and 97%, depending on the LTR-RT type) none of the internal sequences was detected between a pair of flanking LTR sequences, suggesting that defective and non-autonomous LTR-RTs are very frequent in lentil. The results indicate that LTR-RTs are abundant and widespread throughout the lentil genome and that they are a suitable source of genetic markers for further genetic analyses.
Year: 2017 PMID: 28448614 PMCID: PMC5407846 DOI: 10.1371/journal.pone.0176728
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Schematic representation of representative LTR retrotransposons.
The main characteristics of autonomous and non-autonomous elements are represented. LTR retrotransposons have long terminal repeats (LTRs) in direct orientation. Autonomous elements contain at least two genes, called gag and pol. The gag gene encodes a capsid-like protein and the pol gene encodes a polyprotein that is responsible for protease (PR), reverse transcriptase (RT), RNase H (RH) and integrase (INT) activities. PBS, primer binding site; PPT, polypurine tract. Non-autonomous elements, such as large retrotransposon derivatives (LARDs) and terminal repeat retrotransposons in miniature (TRIMs), lack most or all of the coding sequence. Non-LTR retrotransposons are divided into long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs). LINE coding regions include a gag-like protein (ORF), an endonuclease (EN) and a reverse transcriptase (RT). Both LINEs and SINEs usually terminate in a poly(A) sequence [5]. Thick lines below the elements indicate the sequences amplified in lentil in this work; the first letter in the nomenclature is c for sequences identified as copia elements and g for those identified as gypsy elements. Drawings are not to scale.
Fig 2. Genetic map obtained with markers showing Mendelian segregations.
Markers from the parent L. culinaris Lupa are shown in red, that is, these bands were observed in Lupa but were absent in the other parent; the reverse holds for markers in black. Linkage groups are numbered from LG1 to LG8. A LOD score of 4 was used. Markers preceded by P are iPBSs, those preceded by R are REMAPs, and S indicates the SSR markers included. Partial distances in cM are indicated to the left of the LGs, while the total LG distance is displayed at the bottom. The inset to the right shows the boxplot distribution of the distance in cM between consecutive markers.
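The two criteria named in this caption, Mendelian segregation and a LOD threshold of 4, can be illustrated with a short calculation. The following is only a sketch under stated assumptions: it assumes a 1:1 expected ratio of the two parental band states in recombinant inbred lines and a simple binomial likelihood for the proportion of recombinant lines. The function names and the example counts are hypothetical, and the record does not state which software or test the authors used.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_005 = 3.841


def segregation_chi2(n_a, n_b):
    """Chi-square statistic for a 1:1 ratio of the two parental band states."""
    expected = (n_a + n_b) / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def _log10_likelihood(p, k, n):
    """Binomial log10-likelihood of k recombinant lines out of n; 0*log(0) is taken as 0."""
    out = 0.0
    if k:
        out += k * math.log10(p)
    if n - k:
        out += (n - k) * math.log10(1 - p)
    return out


def lod_ril(n_lines, n_recombinant):
    """LOD score comparing the observed recombinant proportion with 0.5 (no linkage).

    Meaningful for recombinant proportions of at most 0.5.
    """
    p_hat = n_recombinant / n_lines
    return (_log10_likelihood(p_hat, n_recombinant, n_lines)
            - _log10_likelihood(0.5, n_recombinant, n_lines))


# Hypothetical marker scored as 50 Lupa-type and 44 wild-type lines out of 94.
chi2 = segregation_chi2(50, 44)
print(f"chi2 = {chi2:.3f}, fits 1:1: {chi2 < CHI2_CRIT_DF1_005}")

# Hypothetical marker pair with 10 recombinant lines out of 94.
print(f"LOD = {lod_ril(94, 10):.2f}, linked at LOD 4: {lod_ril(94, 10) > 4}")
```

A marker pair would be placed in the same linkage group under this sketch only when its LOD exceeds the threshold of 4 stated in the caption.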
Table 1. Partial retrotransposon sequences obtained from lentil cultivar Lupa.
| Primers | Amplified family | Clone nomenclature (length in bp) |
|---|---|---|
| Copia (ca) | ||
| Copia-314 (259), | ||
| Copia-302, | ||
| Copia-305, Copia-308, Copia-310, | ||
| Gypsy (ga) | ||
| Gypsy (ga) | Gypsy1-104, | |
| Copia (ga) | ||
| Tnana (cb) | ||
| Tnana (cc) | ||
| Tnana (cd) | ||
| Tnana (cc) | ||
| Tnana (gb) | ||
| RNaseH/MseI (ce) | Copia | |
| RNaseH/MseI (gc) |
1 Lengths in bp (within parentheses) refer to all the preceding sequences.
2 Letters in parentheses indicate the retrotransposon region amplified, as represented in Fig 1. See supplementary Table 1 for the primers used.
3 Gr nomenclature refers to the dendrogram groups of Fig 3.
4 Sequences underlined were used in the in silico search of the lentil draft genome.
Fig 3. Phylogenetic trees of reverse transcriptase sequences.
Trees show the relationships between lentil sequences and Medicago truncatula (Mtr) sequences. A, Ty1-copia sequences; B, Ty3-gypsy sequences. Lentil sequences are within boxes indicating the different groups (Gr) to which they belong. Groups were related to the M. truncatula clades as described by Piednoël et al. [46], and the M. truncatula sequence numbers follow Wang and Liu [44]. Red color denotes the presence of premature stop codons in the reading frames.
Table 2. Retrotransposon long terminal repeat sequences amplified from lentil cultivar Lupa.
| LTR name ( | Family | Primers (length in bp) |
|---|---|---|
| LTR-Angela ( | ||
| LTR-Glycine ( | ||
| LTR-SIRE ( | ||
| LTR-Peabody ( | ||
| LTR-Ogre ( | ||
| LTR-Cassandra ( |
1 Length in bp (within parentheses) refers to all the preceding sequences.
2 Letters in parentheses indicate the retrotransposon region amplified, as represented in Fig 1.
3 Sequence derived from Smýkal et al. (2009).
Table 3. Number of in silico hits of lentil retrotransposon sequence types in relation to distances between consecutive hits.
| “Probes” | Distance > 10 bp | Distance > 1,000 bp | Distance > 10,000 bp | Distance > 50,000 bp | Number of contigs |
|---|---|---|---|---|---|
| | 27 | 27 | 26 | 26 | 26 |
| | 82 | 82 | 81 | 81 | 81 |
| | 88 | 88 | 86 | 85 | 85 |
| | 35 | 35 | 34 | 34 | 34 |
| | 14 | 14 | 14 | 13 | 13 |
| | 349 | 347 | 341 | 337 | 337 |
| | 18 | 18 | 18 | 18 | 18 |
| | 626 | 624 | 612 | 605 | 605 |
| | 4,772 | 4,566 | 4,485 | 4,362 | 4,326 |
| | 789 | 783 | 771 | 761 | 761 |
| | 2,292 | 2,264 | 2,233 | 2,190 | 2,175 |
| | 4,489 | 4,363 | 4,271 | 4,131 | 4,079 |
| | 5,207 | 4,918 | 4,847 | 4,736 | 4,701 |
| | 27 | 27 | 25 | 25 | 25 |
| | 3,079 | 3,036 | 2,967 | 2,907 | 2,892 |
| | 7,968 | 7,875 | 7,482 | 7,101 | 6,974 |
| | 24,656 | 23,857 | 22,727 | 20,384 | 19,715 |
| Tnana copia | 274 | 263 | 168 | 163 | 163 |
| Tnana gypsy | 15,064 | 14,507 | 14,078 | 13,757 | 13,677 |
| RNaseH copia | 478 | 478 | 472 | 463 | 463 |
| RNaseH gypsy | 3,515 | 3,456 | 3,410 | 3,356 | 3,341 |
| | 1,378 | 1,365 | 1,251 | 1,224 | 1,220 |
| | 29,838 | 27,937 | 26,442 | 23,706 | 22,971 |
| LTR-Angela | 3,709 | 2,521 | 2,239 | 2,106 | 2,079 |
| LTR-Smykal | 2,448 | 2,370 | 2,086 | 1,965 | 1,941 |
| LTR-Glycine | 26 | 26 | 20 | 20 | 20 |
| LTR-SIRE | 10,454 | 10,331 | 8,696 | 7,847 | 7,659 |
| | 14,144 | 13,090 | 10,898 (25.7%) | 9,680 | 9,409 |
| LTR-Peabody | 15,529 | 14,964 | 12,885 | 11,756 | 11,467 |
| LTR-Ogre | 20,802 | 20,618 | 19,135 | 16,475 | 15,913 |
| | 36,323 | 35,451 | 30,559 (72.0%) | 25,193 | 24,096 |
| LTR-Cassandra ( | 1,141 | 1,062 | 998 (2.3%) | 979 | 973 |
1 Rtr = reverse transcriptase. Group refers to the different clusters of Fig 3.
2 The values in these lines are not the sum of the previous numbers, because two different in silico “probes” can match within the distance range and are then considered the same hit.
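The counting rule behind the distance columns, where matches closer than a threshold are treated as a single hit, can be sketched as follows. This is an illustration under assumptions: hits are represented by start positions on one contig, the distance is measured between consecutive positions, and the comparison is strict (greater than the threshold). The function name `count_hits` and the example positions are hypothetical; the record does not describe the authors' actual implementation.

```python
def count_hits(positions, min_distance):
    """Count hits on one contig, merging matches not more than min_distance bp apart.

    A new hit is counted only when the gap to the previous match
    (in sorted order) is strictly greater than min_distance.
    """
    ordered = sorted(positions)
    if not ordered:
        return 0
    hits = 1
    previous = ordered[0]
    for pos in ordered[1:]:
        if pos - previous > min_distance:
            hits += 1
        previous = pos
    return hits


# Hypothetical match positions (bp) from several "probes" on one contig.
matches = [100, 105, 5000, 20000]
for threshold in (10, 1_000, 10_000, 50_000):
    print(threshold, count_hits(matches, threshold))
```

Because matches from different "probes" are pooled before merging, a combined count can be smaller than the sum of the per-probe counts, which is the situation the footnote describes.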
Fig 4. Contig length (bp) density distribution.
The figure shows the distributions of contig lengths; blue and red lines indicate contigs with at least one lentil LTR-RT sequence and contigs with 10 or more lentil LTR-RT sequences, respectively. The black line indicates the distribution of all lentil contigs, V0.8 genome. n = number of contigs.
Fig 5. Boxplot distributions of contig length according to the number of “hits” generated by the lentil LTR-RT sequences.
Numbers at the bottom indicate the number of hits per contig, while those at the top indicate the number of contigs in each class. The first distribution was obtained when two hits were considered different if they were separated by at least 10 bp, the second when hits were separated by at least 10,000 bp.
Fig 6. Goodness-of-fit test of a Poisson distribution for the LTR-RT number, plotted against the square root of the number of contigs.
The continuous line indicates the theoretical distribution and the bars indicate the observed number of contigs within each class. A between-hit distance of > 10,000 bp was considered.
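A comparison of this kind can be sketched with the standard library alone. The example below assumes the classes are the number of hits per contig (0, 1, 2, ...), estimates the Poisson mean from the observed counts, and reports a chi-square statistic over the listed classes. The observed counts are invented for illustration and are not the lentil data; the function names are hypothetical, and the expected counts ignore the tail beyond the last class, which is a simplification.

```python
import math


def poisson_expected(observed):
    """Return (mean, expected counts) where observed[k] is the number of contigs with k hits."""
    n = sum(observed)
    lam = sum(k * count for k, count in enumerate(observed)) / n
    expected = [
        n * math.exp(-lam) * lam ** k / math.factorial(k)
        for k in range(len(observed))
    ]
    return lam, expected


def chi_square(observed, expected):
    """Pearson chi-square statistic over the listed classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented counts: 90 contigs with 0 hits, 5 with 1, 2 with 2, 3 with 3.
observed = [90, 5, 2, 3]
lam, expected = poisson_expected(observed)
print(f"estimated mean = {lam:.2f}")
print(f"chi-square = {chi_square(observed, expected):.1f}")

# Square-root scale, as used on the figure's vertical axis.
for k, (o, e) in enumerate(zip(observed, expected)):
    print(k, round(math.sqrt(o), 2), round(math.sqrt(e), 2))
```

A large statistic relative to the chi-square critical value for the corresponding degrees of freedom would argue against a random (Poisson) distribution of elements among contigs.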
Fig 7. Diagram of the contigs in the lentil genome containing the highest number of LTR-RT sequences.
Contig size (above) and contig name (below) are indicated. Arrows indicate the orientation of the sequences. Blue boxes indicate putative RT flanked by two LTRs; red boxes indicate the presence of a reverse transcriptase sequence between LTRs. LTRs are named according to their lineage (Table 2) and the internal sequences according to the nomenclature used in Table 1.