Cibele G Sotero-Caio, Roy N Platt, Alexander Suh, David A Ray.
Abstract
Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly, including the gigantic, TE-rich genomes of salamanders and lungfishes.
Keywords: retrotransposons; transposable element; transposons; vertebrate
Year: 2017 PMID: 28158585 PMCID: PMC5381603 DOI: 10.1093/gbe/evw264
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Examples of the major TE types and a general classification scheme. Structural features are presented as follows: LTR retrotransposons – nucleocapsid protein (GAG), envelope protein (ENV), polyprotein (POL) that includes a protease (PRO), integrase (IN), reverse transcriptase (RT), and an RNase H (RH). Non-LTR retrotransposons – nuclear chaperone protein (ORF1), endonuclease domain (EN), RNA polymerase III promoter (A and B), poly-A or other repetitive tail (A(n) or ATTCTRTG(n)). Class II elements – transposase (TPASE), zinc finger domain (ZnF), replicase (REPL), helicase (HELI), polymerase B (PolB), ATPase (ATP), proteins of unknown function (?) and integrase (INT). Coding regions are depicted as boxes. Non-coding regions are depicted as lines. In some instances, the ENV gene in endogenous retroviruses may be missing (dotted line). Triangles represent repeated DNA sequences and the orientation of the triangle reflects the orientation of terminal repeats, if present.
Fig. 2. (A) A general phylogeny of vertebrates. (B) Pie charts comparing the percentage of each genome derived from TEs. The area of each pie chart is proportional to genome size, except for the West African lungfish and the axolotl, whose genomes are exceptionally large. A scaled pie chart representing the human genome is presented to illustrate this difference. In all cases, the TE content is based on estimates made by the individual research teams studying each genome. The methods employed therefore varied, and the values represent the authors’ best estimates obtained using those methods.