Giulia Soldà, Mikita Suyama, Paride Pelucchi, Silvia Boi, Alessandro Guffanti, Ermanno Rizzi, Peer Bork, Maria Luisa Tenchini, Francesca D Ciccarelli.
Abstract
BACKGROUND: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm).
Year: 2008 PMID: 18416813 PMCID: PMC2330155 DOI: 10.1186/1471-2164-9-174
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Classes of overlapping genes. Overlapping gene clusters (OGCs) were classified by the extent of the overlap (complete or partial) and by the reciprocal direction of transcription of the genes involved (same or opposite strand). Convergent overlaps involve the 3' termini of both genes, while divergent overlaps involve the 5' ends (UTR and/or CDS). A complete overlap occurs when the entire sequence of one gene is contained within another gene. In nested OGCs one gene lies completely within an intron of the other, while embedded genes can share more than one intron or exon.
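The classification in Figure 1 can be illustrated with a minimal Python sketch. It assumes each gene is reduced to a genomic interval plus a strand; the `Gene` class and `classify_overlap` function are hypothetical names introduced here, not part of the original analysis. Distinguishing nested from embedded genes would additionally require exon and intron coordinates, so both are reported here simply as "complete".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene model: an interval on one strand."""
    start: int   # leftmost genomic coordinate (inclusive)
    end: int     # rightmost genomic coordinate (inclusive)
    strand: str  # "+" or "-"


def classify_overlap(a: Gene, b: Gene) -> str:
    """Classify two genes by overlap extent and relative orientation."""
    # No shared positions: the genes do not overlap at all.
    if a.end < b.start or b.end < a.start:
        return "none"

    same_strand = a.strand == b.strand

    # Complete overlap: one interval lies entirely inside the other.
    a_contains_b = a.start <= b.start and b.end <= a.end
    b_contains_a = b.start <= a.start and a.end <= b.end
    if a_contains_b or b_contains_a:
        return "complete parallel" if same_strand else "complete antiparallel"

    # Partial overlap on the same strand.
    if same_strand:
        return "partial parallel"

    # Partial overlap on opposite strands: look at the leftmost gene.
    # If it is on "+", its 3' end meets the 3' end of the "-" gene
    # (convergent); otherwise the two 5' ends meet (divergent).
    left = a if a.start < b.start else b
    return "partial convergent" if left.strand == "+" else "partial divergent"


if __name__ == "__main__":
    print(classify_overlap(Gene(100, 500, "+"), Gene(400, 900, "-")))
    print(classify_overlap(Gene(100, 500, "-"), Gene(400, 900, "+")))
    print(classify_overlap(Gene(100, 900, "+"), Gene(300, 400, "-")))
```

The three calls in the usage block correspond to a partial convergent, a partial divergent, and a complete antiparallel arrangement, respectively.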
Overlapping genes in five Metazoa.
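| Species | Total genes | Unique genes | OGCs (real) | OGCs (random, mean) | OG pairs (real) | OG pairs (random, mean) | OGs (real) | OGs (real, % of unique genes) | OGs (random, mean) | OGs (random, % of unique genes) |
|---|---|---|---|---|---|---|---|---|---|---|
| Hs | 23073 | 17794 | 663 | 2374.1 | 749 | 4954.0 | 1409 | 7.9 | 6630.8 | 37.3 |
| Mm | 17970 | 17040 | 656 | 2112.9 | 662 | 4293.7 | 1400 | 8.2 | 5873.7 | 34.5 |
| Dr | 6672 | 6506 | 108 | 396.9 | 155 | 524.2 | 262 | 4.0 | 899.1 | 13.8 |
| Dm | 18768 | 13416 | 1505 | 2172.3 | 2022 | 7483.1 | 3514 | 26.2 | 7876.0 | 58.7 |
| Ce | 21124 | 19359 | 404 | 3615.6 | 494 | 8653.1 | 898 | 4.6 | 10442.9 | 53.9 |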
Unique genes refers to the actual number of sequences used in the analysis, after filtering for splice variants. For each species, the counts of overlapping genes (OGs), overlapping gene pairs (OG pairs), and overlapping gene clusters (OGCs) from both real data and random simulations are shown. In the latter case the average over ten simulations is reported together with the standard deviation (SD). Abbreviations: Hs, Homo sapiens; Mm, Mus musculus; Dr, Danio rerio; Dm, Drosophila melanogaster; Ce, Caenorhabditis elegans.
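As a rough check on the table above, the percentage columns can be recomputed from the counts. The sketch below assumes that the rows follow the order of the abbreviations (Hs, Mm, Dr, Dm, Ce) and that the second, seventh and ninth values in each row are the unique genes, the observed overlapping genes, and the mean number of overlapping genes in the random simulations. Both are interpretations of the extracted table rather than statements from the source, and the function name is hypothetical.

```python
# Values copied from the table; the column meanings are an assumption
# (unique genes, observed OGs, mean OGs over the random simulations).
TABLE = {
    "Hs": (17794, 1409, 6630.8),
    "Mm": (17040, 1400, 5873.7),
    "Dr": (6506, 262, 899.1),
    "Dm": (13416, 3514, 7876.0),
    "Ce": (19359, 898, 10442.9),
}


def percent_of_unique(count: float, unique_genes: int) -> float:
    """Share of unique genes, as a percentage rounded to one decimal."""
    return round(100.0 * count / unique_genes, 1)


if __name__ == "__main__":
    for species, (unique, observed, random_mean) in TABLE.items():
        obs_pct = percent_of_unique(observed, unique)
        rnd_pct = percent_of_unique(random_mean, unique)
        print(f"{species}: observed {obs_pct}% vs random {rnd_pct}%")
```

Under these assumptions the recomputed values match the percentage columns in every row, and in every species the observed share of overlapping genes is lower than the random expectation.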
Figure 2. Comparative analysis of OGCs in Metazoa. For all species, the bar for each analyzed feature of the observed overlapping gene sets is followed by the bar for the random expectation. Since the simulations were repeated ten times, the random bars carry the corresponding standard deviation. A. OGC composition. OGCs were analyzed on the basis of the number of genes composing each cluster. There are 5 OGCs with more than 4 components in human, 11 in mouse, 5 in zebrafish, 48 in fly and 7 in worm. B. Type of overlap. Occurrence of partial and complete overlaps in both 2-component and multicomponent OGCs. C. Gene reciprocal arrangement. Distribution of OGCs according to the overlap type (refer to Figure 1). D. Features of the overlapping regions. The plot reports the number of overlaps involving coding sequence for one gene (CDS/UTR or CDS/intron overlaps) or for both genes, and the number of overlaps involving only noncoding sequence (UTR/UTR and UTR/intron).
Figure 3. Conservation of overlapping genes and OGCs within Metazoa. A. Schematic representation of the procedure for detecting the conservation of overlapping genes (red spots) and OGCs (red pairs) between two species. The same pipeline was applied to each pair of species considered in the analysis. B. Pairwise conservation of overlapping genes within Metazoa. In the first column, the numbers in brackets represent the total number of overlapping genes for that species. C. Pairwise conservation of OGCs within Metazoa.
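The caption does not spell out the pipeline, so the following Python sketch shows only one plausible reading of a pairwise conservation test, not the authors' procedure. It assumes that an overlapping pair in species A counts as conserved when both genes have an ortholog in species B and those two orthologs also form an overlapping pair in B. The one-to-one ortholog map, the gene identifiers, and the function name are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def conserved_pairs(
    pairs_a: Iterable[Pair],
    orthologs_a_to_b: Dict[str, str],
    pairs_b: Iterable[Pair],
) -> List[Pair]:
    """Return the pairs from species A whose orthologs also overlap in B.

    Assumption: one-to-one orthology, given as a plain dict.
    """
    # Store species-B pairs order-independently for fast lookup.
    overlapping_in_b = {frozenset(pair) for pair in pairs_b}

    conserved = []
    for gene1, gene2 in pairs_a:
        ortho1 = orthologs_a_to_b.get(gene1)
        ortho2 = orthologs_a_to_b.get(gene2)
        # Both genes need an ortholog before the pair can be tested.
        if ortho1 is None or ortho2 is None:
            continue
        if frozenset((ortho1, ortho2)) in overlapping_in_b:
            conserved.append((gene1, gene2))
    return conserved


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    human_pairs = [("hA", "hB"), ("hC", "hD"), ("hE", "hF")]
    mouse_pairs = [("mB", "mA"), ("mC", "mX")]
    orthologs = {"hA": "mA", "hB": "mB", "hC": "mC", "hD": "mD", "hE": "mE"}
    print(conserved_pairs(human_pairs, orthologs, mouse_pairs))
```

In the toy data only the first pair is conserved: the second pair has orthologs that do not overlap each other in the second species, and the third pair lacks an ortholog for one gene.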
Overlapping genes conservation between human and mouse.
| Arrangement | Human OG pairs | Conserved in mouse | Conserved (%) |
|---|---|---|---|
| Total | 749 | 282 | 37.65 |
| partial convergent | 328 | 153 | 46.65 |
| partial divergent | 115 | 14 | 12.17 |
| partial parallel | 33 | 5 | 15.15 |
| nested antiparallel | 152 | 79 | 51.97 |
| nested parallel | 75 | 22 | 29.33 |
| embedded antiparallel | 16 | 3 | 18.75 |
| embedded parallel | 30 | 6 | 20.00 |
Conservation of overlapping gene (OG) pairs according to their reciprocal arrangement.
Gene structure comparison between human and mouse.
| 68.5 kb | 58.6 kb | 49.3 kb | 31.4 kb | 35.5 kb | 31.4 kb | 55.4 kb | 39.6 kb | |
| 12.4 | 12.2 | 11.3 | 10.8 | 11.45 | 11.4 | 10.2 | 9.0 | |
| 174 | 200 | 184 | - | - | 154 | - | - | |
| 108 | 82 | 42 | - | - | 16 | - | - | |
The number of overlapping gene pairs in each analyzed dataset is reported in brackets. CDS overlaps refer to the overlapping genes whose CDS coordinates are superimposed, while UTR overlaps refer to those cases where the gene coordinates (calculated from transcript start to transcript end) are superimposed.
Figure 4. Analysis of the co-ordinate expression of human overlapping genes. Expression patterns of human overlapping genes on the basis of their reciprocal arrangement.