Yupeng Wang, Haibao Tang, Jeremy D Debarry, Xu Tan, Jingping Li, Xiyin Wang, Tae-ho Lee, Huizhe Jin, Barry Marler, Hui Guo, Jessica C Kissinger, Andrew H Paterson.
Abstract
MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/.
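To make the idea of aligning regions "using genes as anchors" concrete, the following is a minimal sketch of collinear chain detection. It is not the MCScanX implementation or its scoring scheme: the function name `longest_collinear_chain`, the `max_gap` parameter and its value, and the toy anchor data are all assumptions made for illustration, and only same-orientation chains are handled.

```python
from typing import List, Tuple

# An anchor is a pair of homologous genes, given as
# (gene order index in region A, gene order index in region B).
Anchor = Tuple[int, int]


def longest_collinear_chain(anchors: List[Anchor], max_gap: int = 25) -> List[Anchor]:
    """Return the longest chain of anchors that increases on both axes.

    Consecutive anchors in the chain may be at most `max_gap` genes apart
    on either axis. `max_gap` is a hypothetical parameter for this sketch.
    """
    pts = sorted(set(anchors))
    if not pts:
        return []

    best_len = [1] * len(pts)   # best chain length ending at each anchor
    prev = [-1] * len(pts)      # back-pointer to the previous anchor in that chain

    # Simple O(n^2) dynamic programming over anchors sorted by position in A.
    for i, (a_i, b_i) in enumerate(pts):
        for j in range(i):
            a_j, b_j = pts[j]
            if 0 < a_i - a_j <= max_gap and 0 < b_i - b_j <= max_gap:
                if best_len[j] + 1 > best_len[i]:
                    best_len[i] = best_len[j] + 1
                    prev[i] = j

    # Trace back from the anchor that ends the longest chain.
    end = max(range(len(pts)), key=best_len.__getitem__)
    chain = []
    while end != -1:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]


if __name__ == "__main__":
    toy_anchors = [(1, 10), (2, 11), (3, 50), (4, 12), (6, 14), (40, 15)]
    print(longest_collinear_chain(toy_anchors))
```

In this toy input, the anchors `(3, 50)` and `(40, 15)` are excluded because they lie more than `max_gap` genes from the rest of the chain on one axis.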
Year: 2012 PMID: 22217600 PMCID: PMC3326336 DOI: 10.1093/nar/gkr1293
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The structure of the MCScanX package illustrating major components and their dependencies.
Figure 2. Sample HTML output displaying multiple alignments of collinear blocks by MCScanX. The first and second columns show duplication depth and gene symbol at each locus of reference chromosomes, where tandems are marked in red. The remaining columns show aligned collinear blocks, where only the symbols of anchor genes are shown.
Figure 3. Different types of plots showing patterns of synteny and collinearity: (A) dual synteny plot, (B) circle plot, (C) dot plot and (D) bar plot, generated by ‘dual synteny plotter’, ‘circle plotter’, ‘dot plotter’ and ‘bar plotter’, respectively. Chromosomes are labeled in the format ‘species abbreviation’ + ‘chromosome ID’. os, Oryza sativa; sb, Sorghum bicolor.
Numbers of collinear ortholog pairs, total ortholog pairs and percentage of collinear ortholog pairs in selected angiosperm genomes (each cell lists the three values in that order)
| Species | Pt | Gm | Vv | Os | Bd | Sb | Zm |
|---|---|---|---|---|---|---|---|
| At | 14 278, 46 944, 30.4% | 17 498, 58 038, 30.1% | 7378, 24 086, 30.6% | 319, 24 992, 1.3% | 202, 22 719, 0.9% | 350, 24 120, 1.5% | 142, 24 689, 0.6% |
| Pt | – | 34 545, 92 901, 37.2% | 15 734, 38 727, 40.6% | 2121, 37 575, 5.6% | 1632, 32 790, 5.0% | 1523, 36 059, 4.2% | 687, 35 596, 1.9% |
| Gm | – | – | 18 310, 47 652, 38.4% | 1437, 46 916, 3.1% | 1308, 43 130, 3.0% | 1263, 46 631, 2.7% | 501, 47 326, 1.1% |
| Vv | – | – | – | 1315, 19 678, 6.7% | 981, 18 080, 5.4% | 1194, 19 137, 6.2% | 293, 19 501, 1.5% |
| Os | – | – | – | – | 15 492, 34 413, 45.0% | 15 664, 39 695, 39.5% | 14 112, 35 206, 40.1% |
| Bd | – | – | – | – | – | 14 070, 32 701, 43.0% | 13 111, 30 841, 42.5% |
| Sb | – | – | – | – | – | – | 18 084, 36 826, 49.1% |
At, Arabidopsis thaliana; Pt, Populus trichocarpa; Gm, Glycine max; Vv, Vitis vinifera; Os, Oryza sativa; Bd, Brachypodium distachyon; Sb, Sorghum bicolor; Zm, Zea mays.
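As a worked check of how the three values for each species pair relate, the percentage is the number of collinear ortholog pairs divided by the total number of ortholog pairs. The short sketch below recomputes a few of the reported percentages; the helper name `collinear_percentage` is hypothetical, and rounding to one decimal place is assumed from the way the figures are printed.

```python
def collinear_percentage(collinear_pairs: int, total_pairs: int) -> float:
    """Percentage of ortholog pairs that are collinear, rounded to one decimal."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    return round(100.0 * collinear_pairs / total_pairs, 1)


if __name__ == "__main__":
    # (collinear pairs, total pairs) copied from the table above.
    examples = {
        "At-Pt": (14278, 46944),
        "At-Zm": (142, 24689),
        "Sb-Zm": (18084, 36826),
    }
    for pair, (collinear, total) in examples.items():
        print(pair, collinear_percentage(collinear, total))
```

The recomputed values for these three pairs match the reported 30.4%, 0.6% and 49.1%.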
Numbers of genes from different origins as classified by duplicate gene classifier in eight angiosperm genomes
| Species | No. of genes | Singletons (%) | WGD (%) | Tandem (%) | Proximal (%) | Dispersed (%) |
|---|---|---|---|---|---|---|
| | 27 105 | 5272 (19.5) | 7321 (27.0) | 769 (2.8) | 892 (3.3) | 12 851 (47.4) |
| | 40 650 | 5014 (12.3) | 20 989 (51.6) | 713 (1.8) | 999 (2.5) | 12 935 (31.8) |
| | 46 360 | 1459 (3.1) | 35 233 (76.0) | 582 (1.3) | 670 (1.4) | 8416 (18.2) |
| | 23 647 | 6275 (26.5) | 3539 (15.0) | 688 (2.9) | 1590 (6.7) | 11 555 (48.9) |
| | 40 634 | 12 720 (31.3) | 5896 (14.5) | 960 (2.4) | 2184 (5.4) | 18 874 (46.4) |
| | 25 524 | 4842 (19.0) | 4575 (17.9) | 697 (2.7) | 827 (3.2) | 14 583 (57.1) |
| | 34 564 | 5839 (16.9) | 5260 (15.2) | 895 (2.6) | 1283 (3.7) | 21 287 (61.6) |
| | 39 365 | 8212 (20.9) | 11 506 (29.2) | 774 (2.0) | 1175 (3.0) | 17 698 (45.0) |
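The counts in this table can be checked for internal consistency: in the first data row the five origin categories add up to the gene total, which is consistent with each gene being assigned to exactly one category. The sketch below performs that check and recomputes the percentages; the function name `origin_shares` and the dictionary layout are assumptions for illustration, not output of duplicate gene classifier.

```python
from typing import Dict


def origin_shares(total_genes: int, counts: Dict[str, int]) -> Dict[str, float]:
    """Return each origin category as a percentage of total_genes (one decimal).

    Raises ValueError if the category counts do not sum to total_genes.
    """
    if sum(counts.values()) != total_genes:
        raise ValueError("category counts do not sum to the gene total")
    return {name: round(100.0 * n / total_genes, 1) for name, n in counts.items()}


if __name__ == "__main__":
    # First data row of the table above (27 105 genes).
    first_row = {
        "Singletons": 5272,
        "WGD": 7321,
        "Tandem": 769,
        "Proximal": 892,
        "Dispersed": 12851,
    }
    print(origin_shares(27105, first_row))
```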
Figure 4. Circle plot showing collinearity in the MADS box gene family, drawn over a gray background of all collinear blocks in Arabidopsis. The circle plot can be generated by ‘family tree plotter’. Chromosomes are labeled in the format ‘species abbreviation’ + ‘chromosome ID’. at, Arabidopsis thaliana.
Figure 5. Phylogenetic tree of the MADS box gene family in Arabidopsis annotated with collinear and tandem relationships. Curves connecting pairs of gene names indicate either a collinear relationship (red) or a tandem relationship (blue). This annotated tree is output by ‘family tree plotter’.
Functional comparison of different synteny and collinearity detection tools (‘+’ and ‘−’ represent ‘yes’ and ‘no’, respectively)
| Tool | Year published | Graphic visualization | Multiple genomes | Multi-alignments | Evolutionary analyses of synteny and collinearity | Analyses of gene families |
|---|---|---|---|---|---|---|
| i-ADHoRe 3 | 2011 | + | + | + | − | − |
| LineUp | 2003 | − | − | − | − | − |
| TEAM | 2003 | − | + | − | − | − |
| MCMuSeC | 2009 | − | + | − | − | − |
| OrthoClusterDB | 2009 | + | + | + | − | − |
| DiagHunter | 2003 | + | − | − | − | − |
| DAGChainer | 2004 | + | − | − | − | − |
| ColinearScan | 2006 | − | − | − | − | − |
| MCScan | 2008 | − | + | + | − | − |
| SyMAP 3.4 | 2011 | + | + | − | − | − |
| FISH | 2003 | − | − | − | − | − |
| Cyntenator | 2010 | − | + | + | − | − |
| MicroSyn | 2011 | + | + | − | − | + |
| Cinteny | 2007 | + | + | − | − | − |
| MCScanX | 2012 | + | + | + | + | + |