Jeremy D DeBarry, Renyi Liu, Jeffrey L Bennetzen.
Abstract
BACKGROUND: Higher eukaryotic genomes are typically large, complex and filled with both genes and multiple classes of repetitive DNA. The repetitive DNAs, primarily transposable elements, are a rapidly evolving genome component that can provide the raw material for novel selected functions and also indicate the mechanisms and history of genome evolution in any ancestral lineage. Despite their abundance, universality and significance, studies of genomic repeat content have been largely limited to analyses of the repeats in fully sequenced genomes.
Year: 2008 PMID: 18474116 PMCID: PMC2412881 DOI: 10.1186/1471-2105-9-235
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematic of the AAARF algorithm. Shown is an illustration of the AAARF process used to assemble builds representing high copy number repeat families from sample sequence data.
Sanger and 454 results compared to the seven most abundant LTR retrotransposon families in maize
| 10.7 | 11–14 | 1.6 | 23706 (FL) | 8 | 11269 (F) | 10 | |
| 9.4 | 8.5–10 | 1.3 | 10041 (FL) | 4 | 9432 (FL) | 3 | |
| 7.1 | 6.5–9 | 1.3 | 8150 (FL) | 3 | 9599 (FL) | 2 | |
| 4.8 | 7.3 | 0.6 | 7412 (FL) | 7 | 1225 (I) | 4 | |
| 3.9 | 10.5–13.5 | 0.6 | 8469 (F) | 4 | 6459 (I) | 3 | |
| 3.5 | 8.5 | 0.6 | 7264 (F) | 5 | 1600 (I) | 4 | |
| 3.1 | 11.7 | 2.7 | 8971 (I) | 1 | 2107 (I) | 5 |
(a) Percent genome composition data from Meyers et al. 2001.
(b) Xilon percent genome composition from Meyers, pers. comm.
FL = Builds representing full-length copies of repeat families, F = Builds representing fragmented copies of repeat families, I = Incomplete builds
Overall results of AAARF tests of Sanger and simulated 454 data sets
| 10000 | 7821671 (0.33%) | 180 | 46 | 5 | 49 | 23 | 57 | |
| 50419 | 5045000 (0.21%) | 63 | 2 | 2 | 12 | 1 | 46 |
(a) Builds not further analyzed
Parameter settings for Sanger and 454 tests
| 150 | 89 | 1.00E-25 | 150 | 3 | 2 | |
| 30 | 88 | 1.00E-10 | 30 | 3 | 2 | |
| 150 | 50 | 90 | 1.00E-10 | 13 | ||
| 30 | 40 | 15 | 10 | 4 | BL2SEQ Word Size: 7 | |
Figure 2. Comparison of an AAARF-produced build to the sample sequence dataset. Shown is the BLAST result of an AAARF-generated build compared to the sample sequence dataset used to create it. The bottom, metered line represents the full-length Opie build (8,150 bp) from the Sanger sequence test. Smaller lines above represent sample sequences. Regions of shared similarity between the build and the sample sequences are indicated by the position of the sample sequences relative to the build. A region of increased coverage on the build, combined with sample sequences whose similarity to the build stops at the same position (boxed area), indicates the likely presence of an LTR.
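The reasoning in the Figure 2 caption (elevated coverage plus many sample sequences whose similarity stops at one position) can be illustrated with a minimal sketch. This is not AAARF code. It assumes the BLAST hits have already been reduced to 0-based, half-open `(start, end)` intervals on the build, and the function names `coverage_depth` and `shared_endpoints`, as well as the `min_support` threshold, are hypothetical choices made for this illustration.

```python
from collections import Counter


def coverage_depth(build_length, alignments):
    """Return per-base depth along the build.

    `alignments` is an assumed input format: (start, end) pairs,
    0-based and half-open, giving where each sample sequence
    matches the build.
    """
    # Difference array: +1 where an alignment begins, -1 where it ends.
    diff = [0] * (build_length + 1)
    for start, end in alignments:
        diff[start] += 1
        diff[end] -= 1

    depth = []
    running = 0
    for i in range(build_length):
        running += diff[i]
        depth.append(running)
    return depth


def shared_endpoints(alignments, min_support=3):
    """Return build positions where at least `min_support` alignments
    start or stop (the threshold is an illustrative assumption)."""
    counts = Counter()
    for start, end in alignments:
        counts[start] += 1
        counts[end] += 1
    return sorted(pos for pos, n in counts.items() if n >= min_support)


# Toy data (invented for illustration, not taken from the paper).
toy_alignments = [(0, 8), (2, 8), (5, 8), (5, 12), (6, 15), (10, 20)]
toy_depth = coverage_depth(20, toy_alignments)
toy_boundaries = shared_endpoints(toy_alignments)
```

In the toy data, depth peaks just before position 8 and three alignments stop there, which is the kind of pattern the caption associates with a likely LTR boundary. On real data, both signals would need to be interpreted together and with a threshold suited to the sampling depth.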