Jingyin Yu, Tao Ke, Sadia Tehrim, Fengming Sun, Boshou Liao, Wei Hua.
Abstract
Tandem duplication is a widespread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions lead to the expansion of gene families and increase gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet there has been surprisingly little effort to systematically integrate the large amount of publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can easily download tandem duplicated gene arrays at any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available.
Year: 2015 PMID: 25797062 PMCID: PMC4369376 DOI: 10.1093/database/bav017
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic illustration of the PTGBase sitemap. (A) Analysis flowchart to generate the tandem arrays and tandem duplicated genes. (B) Diagram of the PTGBase web server. (C) Web interface of the PTGBase sitemap.
Statistics of tandem duplicated genes of genome-sequenced plant species in PTGBase
| Common name | Genome size | Gene number | Number of TD genes | Number of TD clusters | Release version |
|---|---|---|---|---|---|
| Lyrate rockcress | 206.7M | 32 670 | 3609 | 1485 | Version 1.0 (Apr 2011) |
| Arabidopsis | 125M | 35 386 | 3503 | 1383 | TAIR 9.0 (Jun 2009) |
| Heterokont algae | 57M | 11 501 | 176 | 84 | JGI 1.0 (Sep 2007) |
| Purple false brome | 260M | 31 029 | 3233 | 1326 | Phytozome v6.0 |
| Cabbage | 630M | 45 758 | 3443 | 1514 | Version 1.0 |
| Chinese cabbage | 485M | 41 173 | 4501 | 1918 | Version 1.1 |
| Pigeonpea | 833M | 48 680 | 3837 | 1736 | IIPG v1.0 |
| Papaya | 370M | 27 950 | 1937 | 822 | ASGPB v1.0 |
| Green algae | 130M | 17 113 | 1220 | 521 | Version 4.2 |
| Microalgae | 46M | 9791 | 451 | 207 | JGI 1.0 (Sep 2010) |
| Chickpea | 738M | 28 269 | 1396 | 610 | Version 1.0 |
| Watermelon | 425M | 23 440 | 2248 | 863 | Version 1.0 |
| Orange | 367M | 36 450 | 3253 | 1261 | CITRUS v1.0 (2012) |
| Cucumber | 350M | 26 682 | 2077 | 830 | Phytozome v6.0 |
| Strawberry | 240M | 34 809 | 3823 | 1582 | GDR v1.0 |
| Soybean | 1100M | 75 778 | 5764 | 2384 | v1.1 (Jun 2013) |
| Cotton | 761.4M | 40 976 | 4674 | 1866 | Version 2.1 |
| Flax | 373M | 43 484 | 4169 | 1823 | Phytozome v9.1 v1.0 |
| Lotus | 472M | 42 399 | 1211 | 543 | Release 2.5 |
| Apple | 742M | 63 541 | 16 602 | 6638 | GDR v1.0 |
| Barrel medic | 500M | 53 423 | 3761 | 1653 | Mt3.5 v3 (Jun 2011) |
| Banana | 523M | 36 549 | 1352 | 571 | CIRAD v1.0 |
| Rice | 466M | 67 393 | 4544 | 1931 | IRGSP v1.0 |
| Diatom algae | 27.4M | 10 402 | 440 | 206 | JGI 2.0 (May 2007) |
| Moss | 480M | 38 354 | 797 | 381 | Version 1.6 (Jan 2008) |
| Western poplar | 480M | 45 033 | 5224 | 2084 | JGI 2.0 (Feb 2010) |
| Plum flower | 280M | 31 390 | 5059 | 1985 | prunusmumegenome v1.0 |
| Castor bean | 350M | 31 221 | 2613 | 1075 | Release 0.1 (May 2008) |
| Selaginella | 212M | 22 285 | 1676 | 748 | Version 1.0 (Dec 2007) |
| Sesame | 357M | 27 148 | 2848 | 1126 | Version 1.0 |
| Millet | 490M | 38 801 | 4879 | 2027 | Phytozome v9.1 |
| Tomato | 900M | 34 727 | 4173 | 1640 | Version 2.3 |
| Potato | 844M | 39 031 | 4504 | 1839 | Version 3.4 |
| Sorghum | 730M | 29 448 | 4256 | 1664 | Sbi 1.4 (Dec 2007) |
| Rockface star-violet | 140M | 27 132 | 2316 | 980 | Thellungiella v2.0 |
| Cacao | 430M | 46 143 | 3867 | 1604 | Release 0.9 (Sep 2010) |
| Grape vine | 490M | 26 346 | 3668 | 1405 | Genoscope (Aug 2007) |
| Green alga | 138M | 15 285 | 1402 | 595 | JGI 1.0 (Jun 2007) |
| Maize | 2300M | 63 293 | 2837 | 1220 | Release 5a (Nov 2010) |
TD, tandem duplicated.
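Two derived quantities make the table easier to compare across species: the share of a genome's genes that are tandem duplicated, and the average number of genes per tandem cluster. The sketch below computes both for three rows copied from the table. The tuple layout and the helper names `td_gene_share` and `mean_cluster_size` are assumptions made for illustration; they are not part of PTGBase.

```python
# Illustrative only: three rows copied from the statistics table above.
# The tuple layout and helper names are assumptions made for this sketch.
ROWS = [
    # (common name, gene number, TD genes, TD clusters)
    ("Arabidopsis", 35386, 3503, 1383),
    ("Apple", 63541, 16602, 6638),
    ("Maize", 63293, 2837, 1220),
]


def td_gene_share(gene_number: int, td_genes: int) -> float:
    """Percentage of all predicted genes that are tandem duplicated."""
    return 100.0 * td_genes / gene_number


def mean_cluster_size(td_genes: int, td_clusters: int) -> float:
    """Average number of member genes per tandem duplicated cluster."""
    return td_genes / td_clusters


if __name__ == "__main__":
    for name, genes, td_genes, td_clusters in ROWS:
        share = td_gene_share(genes, td_genes)
        size = mean_cluster_size(td_genes, td_clusters)
        print(f"{name}: {share:.2f}% of genes in tandem arrays, "
              f"{size:.2f} genes per cluster")
```

For these three rows the share ranges from roughly 4.5% (maize) to roughly 26% (apple), while the mean cluster size stays between about 2.3 and 2.5 genes.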
Functional classification of tandem duplicated genes in PTGBase
| Common name | Number of tandem duplicated genes | InterPro gene number | InterPro percentage (%) | Gene Ontology gene number | Gene Ontology percentage (%) |
|---|---|---|---|---|---|
| Lyrate rockcress | 3609 | 3490 | 96.70 | 2428 | 67.28 |
| Arabidopsis | 3503 | 3453 | 98.57 | 2579 | 73.62 |
| Heterokont algae | 176 | 160 | 90.91 | 121 | 68.75 |
| Purple false brome | 3233 | 3149 | 97.40 | 2401 | 74.27 |
| Cabbage | 3443 | 3287 | 95.47 | 2345 | 68.11 |
| Chinese cabbage | 4501 | 4238 | 94.16 | 3200 | 71.10 |
| Pigeonpea | 3837 | 3532 | 92.05 | 2346 | 61.14 |
| Papaya | 1937 | 1824 | 94.17 | 1353 | 69.85 |
| Green algae | 1220 | 1068 | 87.54 | 552 | 45.25 |
| Microalgae | 451 | 438 | 97.12 | 263 | 58.31 |
| Chickpea | 1396 | 1284 | 91.98 | 971 | 69.56 |
| Watermelon | 2248 | 2168 | 96.44 | 1679 | 74.69 |
| Orange | 3253 | 3105 | 95.45 | 2385 | 73.32 |
| Cucumber | 2077 | 2024 | 97.45 | 1520 | 73.18 |
| Strawberry | 3823 | 3594 | 94.01 | 2649 | 69.29 |
| Soybean | 5764 | 5646 | 97.95 | 4335 | 75.21 |
| Cotton | 4674 | 4357 | 93.22 | 3235 | 69.21 |
| Flax | 4169 | 4061 | 97.41 | 3012 | 72.25 |
| Lotus | 1211 | 1177 | 97.19 | 888 | 73.33 |
| Apple | 16 602 | 14 085 | 84.84 | 10 361 | 62.41 |
| Barrel medic | 3761 | 3498 | 93.01 | 2544 | 67.64 |
| Banana | 1352 | 1328 | 98.22 | 1082 | 80.03 |
| Rice | 4544 | 4374 | 96.26 | 3110 | 68.44 |
| Diatom algae | 440 | 406 | 92.27 | 206 | 46.82 |
| Moss | 797 | 720 | 90.34 | 530 | 66.50 |
| Western poplar | 5224 | 5100 | 97.63 | 3839 | 73.49 |
| Plum flower | 5059 | 4805 | 94.98 | 3551 | 70.19 |
| Castor bean | 2613 | 2556 | 97.82 | 1908 | 73.02 |
| Selaginella | 1676 | 1483 | 88.48 | 954 | 56.92 |
| Sesame | 2848 | 2697 | 94.70 | 2073 | 72.79 |
| Millet | 4879 | 4582 | 93.91 | 3255 | 66.71 |
| Tomato | 4173 | 3960 | 94.90 | 2962 | 70.98 |
| Potato | 4504 | 4008 | 88.99 | 2932 | 65.10 |
| Sorghum | 4256 | 4169 | 97.96 | 3133 | 73.61 |
| Rockface star-violet | 2316 | 2230 | 96.29 | 1686 | 72.80 |
| Cacao | 3867 | 3787 | 97.93 | 2858 | 73.91 |
| Grape vine | 3668 | 3596 | 98.04 | 2909 | 79.31 |
| Green alga | 1402 | 1087 | 77.53 | 615 | 43.87 |
| Maize | 2837 | 2379 | 83.86 | 1647 | 58.05 |
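The percentage columns appear to be the annotated gene count divided by the species' total number of tandem duplicated genes. The sketch below reproduces the Arabidopsis row under that assumption; the function name `annotation_percentage` and the dictionary layout are hypothetical and used only for illustration.

```python
def annotation_percentage(annotated: int, td_genes: int) -> float:
    """Share of tandem duplicated genes carrying an annotation, in percent.

    Assumes the table's percentage is annotated / total TD genes,
    rounded to two decimal places.
    """
    if td_genes <= 0:
        raise ValueError("td_genes must be a positive count")
    return round(100.0 * annotated / td_genes, 2)


# Counts copied from the Arabidopsis row of the table above.
ARABIDOPSIS = {"td_genes": 3503, "interpro": 3453, "gene_ontology": 2579}

if __name__ == "__main__":
    total = ARABIDOPSIS["td_genes"]
    for source in ("interpro", "gene_ontology"):
        pct = annotation_percentage(ARABIDOPSIS[source], total)
        print(f"{source}: {pct:.2f}%")
```

Run on the Arabidopsis counts, this gives 98.57% for InterPro and 73.62% for Gene Ontology, matching the table.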
Figure 2. Major browsing function modules of PTGBase. (A) Overview of browsing functions for tandem arrays in PTGBase. (B) Browsing the tandem duplicated genes by tandem array in specific plants. (C) Detailed annotation of a tandem duplicated gene cluster. (D) Detailed annotation of tandem duplicated genes in PTGBase.