Dario Copetti, Jianwei Zhang, Moaine El Baidouri, Dongying Gao, Jun Wang, Elena Barghini, Rosa M Cossu, Angelina Angelova, Carlos E Maldonado L, Stefan Roffler, Hajime Ohyanagi, Thomas Wicker, Chuanzhu Fan, Andrea Zuccolo, Mingsheng Chen, Antonio Costa de Oliveira, Bin Han, Robert Henry, Yue-Ie Hsing, Nori Kurata, Wen Wang, Scott A Jackson, Olivier Panaud, Rod A Wing.
Abstract
BACKGROUND: Comparative evolutionary analysis of whole genomes requires not only accurate annotation of gene space, but also proper annotation of the repetitive fraction, which is often the largest component of most if not all genomes larger than 50 kb in size.
Year: 2015 PMID: 26194356 PMCID: PMC4508813 DOI: 10.1186/s12864-015-1762-3
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Features and source materials for the 11 Oryza species and Leersia perrieri. Approximately 1.5× coverage of single-end raw Illumina reads was used for de novo repeat library construction, and assembled sequences were used to detect full-length elements. Two independent RE libraries were developed for each of the two O. sativa subspecies. The genome assembly of O. officinalis was not available for the full-length element characterization.
| Species | Genome Type | Est. Genome Size | | Illumina Reads | | | Assembled Genome | | |
|---|---|---|---|---|---|---|---|---|---|
| | | Mb | Ref. (a) | Avg. Read Length (bp) | # of Reads (M) | Source | Status (b) | Size (Mb) | Source |
| | AA | 389 | 1 | 97 | 6.5 | R. Wing – Unpubl. | chr | 373.2 | |
| | AA | 439 | 3 | 98 | 6.8 | SRA ERR120613 | chr | 338.0 | EMBL-EBI PRJEB4137 |
| | AA | 466 | 2 | 100 | 7.0 | R. Wing – Unpubl. | chr | 374.5 | |
| | AA | 448 | 3 | 79 | 8.7 | Y. Hsing – Unpubl. | chr | 337.9 | NCBI AWHD00000000 |
| | AA | 357 | 4 | 75 | 7.2 | R. Wing – Unpubl. | chr | 285.0 | NCBI ADWL00000000 |
| | AA | 411 | 5 | 116 | 5.0 | R. Wing – Unpubl. | chr | 308.3 | NCBI ABRL00000000 |
| | AA | 464 | 5 | 101 | 6.9 | R. Wing – Unpubl. | chr | 372.9 | NCBI ALNU02000000 |
| | AA | 352 | 5 | 75 | 8.0 | W. Wang – Unpubl. | scf | 344.6 | W. Wang – Unpubl. |
| | AA | 435 | 5 | 99 | 5.9 | R. Wing – Unpubl. | chr | 335.7 | NCBI ALNW00000000 |
| | BB | 425 | 3 | 97 | 6.6 | R. Wing – Unpubl. | chr | 393.8 | NCBI AVCL00000000 |
| | CC | 651 | 3 | 107 | 9.6 | N. Kurata – Unpubl. | - | - | - |
| | FF | 362 | 3 | 100 | 5.5 | SRA SRR350707 | chr | 260.8 | NCBI AGAT01000000 |
| | - | 323 | 6 | 100 | 4.9 | R. Wing – Unpubl. | chr | 266.7 | NCBI ALNV00000000 |

(a) References: a: [10]; b: [29]; c: [30]; d: [31]; e: [32]; f: Arunuganathan K. pers. comm.
(b) Status: chr: chromosome pseudomolecules; scf: scaffolds
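The "approximately 1.5×" figure in the caption can be sanity-checked from the table itself, since sequencing depth is roughly (number of reads × average read length) / estimated genome size. The following is a minimal sketch of that arithmetic, not the authors' procedure; the names `ROWS` and `coverage` are illustrative, and rows are listed in table order without species names because those are not given in this excerpt.

```python
# Values copied from the table above, in table order.
# Each tuple: (estimated genome size in Mb, average read length in bp,
#              number of reads in millions).
ROWS = [
    (389, 97, 6.5),
    (439, 98, 6.8),
    (466, 100, 7.0),
    (448, 79, 8.7),
    (357, 75, 7.2),
    (411, 116, 5.0),
    (464, 101, 6.9),
    (352, 75, 8.0),
    (435, 99, 5.9),
    (425, 97, 6.6),
    (651, 107, 9.6),
    (362, 100, 5.5),
    (323, 100, 4.9),
]


def coverage(genome_mb, read_len_bp, reads_m):
    """Approximate sequencing depth as sequenced bases / genome size.

    Millions of reads times read length in bp gives megabases of sequence,
    so dividing by the genome size in Mb yields a unitless fold coverage.
    """
    return reads_m * read_len_bp / genome_mb


coverages = [coverage(*row) for row in ROWS]

for (genome_mb, read_len_bp, reads_m), cov in zip(ROWS, coverages):
    print(f"{genome_mb:>4} Mb  {reads_m:>4} M x {read_len_bp:>3} bp  ->  {cov:.2f}x")
```

Under these assumptions every row lands close to the stated 1.5× depth, which suggests the read sets were subsampled relative to each estimated genome size.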
Full-length TEs. Number of complete elements identified in the 12 genome assemblies. For each element type, both the total and the non-redundant number of complete elements are listed. Redundancy removal and the count of full-length Helitron copies followed a different strategy (see Additional file 2).
| Species | LTR-R | | TRIM | | | | NA-DNAT | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | Total | Non-Red. | Total | Non-Red. | Total | Non-Red. | Total | Non-Red. | Total | Tot. Ends | FL copies |
| | 3240 | 1278 | 2911 | 11 | 280 | 230 | 10468 | 172 | 77 | - | - |
| | 1498 | 945 | 2724 | 11 | 276 | 197 | 10504 | 172 | - | - | - |
| | 1905 | 1253 | 3252 | 11 | 288 | 274 | 9596 | 168 | - | - | - |
| | 1283 | 945 | 2248 | 11 | 232 | 175 | 10176 | 168 | - | - | - |
| | 1520 | 865 | 2297 | 11 | 221 | 106 | 7416 | 162 | 93 | - | - |
| | 1041 | 580 | 2384 | 11 | 244 | 206 | 9863 | 169 | - | - | - |
| | 637 | 424 | 2470 | 11 | 264 | 200 | 9205 | 169 | - | - | - |
| | 77 | 59 | 2344 | 11 | 344 | 293 | 5910 | 156 | - | 2279 | 338 |
| | 339 | 288 | 2351 | 11 | 257 | 234 | 7316 | 162 | - | 2122 | 714 |
| | 2906 | 2359 | 1659 | 7 | 120 | 103 | 6720 | 147 | - | 2437 | 949 |
| | 395 | 360 | 1699 | 12 | 13 | 11 | 4963 | 104 | - | - | - |
| | 872 | 810 | 1624 | 6 | 116 | 110 | 3541 | 115 | - | 893 | 196 |
| Total | 15713 | 10166 | 27963 | 124 | 2655 | 2139 | 95678 | 1864 | 170 | 7731 | 2197 |
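The "Total" row can be checked against the per-species rows by summing each of the eleven numeric columns. The sketch below does only that consistency check; `ROWS`, `REPORTED_TOTALS`, and `column_sums` are illustrative names, and "-" entries are entered as `None` and skipped, which is an assumption about how the totals were formed.

```python
# Per-species rows copied from the table above, in table order.
# "-" cells are represented as None (assumed to contribute nothing).
ROWS = [
    [3240, 1278, 2911, 11, 280, 230, 10468, 172, 77, None, None],
    [1498, 945, 2724, 11, 276, 197, 10504, 172, None, None, None],
    [1905, 1253, 3252, 11, 288, 274, 9596, 168, None, None, None],
    [1283, 945, 2248, 11, 232, 175, 10176, 168, None, None, None],
    [1520, 865, 2297, 11, 221, 106, 7416, 162, 93, None, None],
    [1041, 580, 2384, 11, 244, 206, 9863, 169, None, None, None],
    [637, 424, 2470, 11, 264, 200, 9205, 169, None, None, None],
    [77, 59, 2344, 11, 344, 293, 5910, 156, None, 2279, 338],
    [339, 288, 2351, 11, 257, 234, 7316, 162, None, 2122, 714],
    [2906, 2359, 1659, 7, 120, 103, 6720, 147, None, 2437, 949],
    [395, 360, 1699, 12, 13, 11, 4963, 104, None, None, None],
    [872, 810, 1624, 6, 116, 110, 3541, 115, None, 893, 196],
]

# The "Total" row as printed in the table.
REPORTED_TOTALS = [15713, 10166, 27963, 124, 2655, 2139, 95678, 1864, 170, 7731, 2197]


def column_sums(rows):
    """Sum each column across rows, ignoring missing (None) cells."""
    return [sum(v for v in col if v is not None) for col in zip(*rows)]


computed = column_sums(ROWS)

# Collect (column index, computed, reported) for any column that disagrees.
mismatches = [
    (i, c, r)
    for i, (c, r) in enumerate(zip(computed, REPORTED_TOTALS))
    if c != r
]

print("all totals consistent" if not mismatches else f"mismatches: {mismatches}")
```

Note that the non-redundant totals here are plain sums over species, so an element family shared by several genomes is counted once per genome.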
Fig. 1 Structure of the Rice TE database. The Rice TE database (RiTE-db) is composed of three data sets: publicly characterized TEs and repeats, de novo repeat libraries, and full-length elements isolated from genome assemblies. Results can be downloaded to suit users' needs or used to build custom database queries for BLAST searches.
Fig. 2 RiTE database search function. Four main classifiers and checkboxes allow for the customization of search parameters (a), which are listed and visualized (b). Search results can be used as a dynamic BLAST database (c) or downloaded locally by using the links provided via email.
Fig. 3 The RiTE database BLAST function. A query sequence can be aligned to the entire RiTE-db or to a customized subset of sequences. Alignment parameters can be tuned, and the results can be visualized as pairwise alignments or in a tabular format.
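For readers who want to post-process tabular BLAST results downloaded from a search, the following is a minimal parsing sketch. It assumes the output follows the standard 12-column tab-separated layout produced by NCBI BLAST+ (`-outfmt 6`); whether RiTE-db emits exactly this layout is not stated in the source. The names `Hit` and `parse_tabular`, and the example line with its subject identifier, are hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One alignment row in the default BLAST+ tabular column order."""
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_tabular(text: str) -> List[Hit]:
    """Parse tab-separated BLAST rows, skipping blank and comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(
            Hit(
                f[0], f[1], float(f[2]),
                int(f[3]), int(f[4]), int(f[5]),
                int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                float(f[10]), float(f[11]),
            )
        )
    return hits


# Hypothetical example row; identifiers and values are made up for illustration.
EXAMPLE = "query1\tTE_hypothetical_001\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-95\t350\n"

hits = parse_tabular(EXAMPLE)

# Keep only strong matches, e.g. at least 90% identity over 100 bp or more.
strong = [h for h in hits if h.pident >= 90.0 and h.length >= 100]
print(len(hits), len(strong))
```

The identity and length thresholds in the filter are arbitrary placeholders and would need tuning to the element class being searched.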