Chih-Hung Jen, Ioannis Michalopoulos, David R Westhead, Peter Meyer.
Abstract
BACKGROUND: Overlapping transcripts in antisense orientation have the potential to form double-stranded RNA (dsRNA), a substrate for a number of different RNA-modification pathways. One prominent route for dsRNA is its breakdown by Dicer enzyme complexes into small RNAs, a pathway that is widely exploited by RNA interference technology to inactivate defined genes in transgenic lines. The significance of this pathway for endogenous gene regulation remains unclear.
Year: 2005 PMID: 15960803 PMCID: PMC1175971 DOI: 10.1186/gb-2005-6-6-r51
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. A comparison of the arrangements of overlapping gene pairs in Arabidopsis thaliana. A and A' label the start and end of the sense transcript; B' and B label the start and end of the antisense transcript. The total number of genes involved in groups 1, 2 and 3 is 2,157, of which 2,147 are unique; the remaining 10 comprise four genes that are members of both group 1 and group 2 pairs, five genes that are members of both group 1 and group 3 pairs, and one gene that is a member of both a group 1 and group 3 pair.
Figure 2. The organization of convergent overlapping gene pairs with respect to the protein coding capacity of the sense and antisense transcripts.
COPs with sense-antisense overlaps within the coding regions
| Sense gene ID | Annotation | Antisense gene ID | Annotation | ORF overlap (bp) |
| AT1G08260 | DNA-directed DNA polymerase epsilon catalytic subunit, putative | AT1G08270 | Expressed protein | 45 |
| AT1G52010 | Mutator-like transposase family | AT1G52020 | Pseudogene, Ulp1 protease family | 44 |
| AT1G52087 | Hypothetical protein | AT1G52090 | Hypothetical protein | 72 |
| AT1G68935 | Expressed protein | AT1G68940 | Armadillo/beta-catenin repeat protein-related/U-box domain-containing protein | 698 |
| AT2G12855 | Gypsy-like retrotransposon family | AT2G12860 | Gypsy-like retrotransposon family | 116 |
| AT2G19330 | Leucine-rich repeat family protein | AT2G19340 | Membrane protein, putative | 141 |
| AT3G59940 | Kelch repeat-containing F-box family protein | AT3G59950 | Autophagy 4b (APG4b) | 10 |
| AT4G02200 | Drought-responsive family protein | AT4G02210 | Expressed protein | 13 |
| AT4G21366 | S-locus protein kinase-related | AT4G21370 | S-locus protein kinase, putative | 72 |
| AT4G29830 | Transducin family protein/WD-40 repeat family protein | AT4G29840 | Threonine synthase, chloroplast | 587 |
| AT5G18210 | Short-chain dehydrogenase/reductase (SDR) family protein | AT5G18220 | Glycosyl hydrolase family 17 protein | 6 |
| AT5G28232 | Mutator-like transposase family | AT5G28235 | Ulp1 protease family protein | 29 |
| AT5G48200 | Hypothetical protein | AT5G48205 | Hypothetical protein | 334 |
Homology assessment for 89 COPs families that contain 2-11 family members
| COPs family number | Number of family members | Sense-gene-encoded proteins with a similarity E-value < 0.001 | 1 kb sense promoter regions with a similarity E-value < 0.001 | Antisense-gene-encoded proteins with a similarity E-value < 0.001 | 1 kb antisense promoter regions with a similarity E-value < 0.001 |
| 1 | 11 | 11 | 7 | 2 | 2 |
| 2 | 8 | 8 | 2 | 0 | 0 |
| 3 | 8 | 8 | 0 | 0 | 2 |
| 4 | 7 | 7 | 0 | 0 | 0 |
| 5-6 | 7 | 7 | 2 | 0 | 0 |
| 7 | 6 | 6 | 2 | 2 | 0 |
| 8-9 | 5 | 5 | 0 | 0 | 0 |
| 10 | 4 | 4 | 2 | 0 | 0 |
| 11-12 | 4 | 4 | 0 | 0 | 0 |
| 13-14 | 4 | 4 | 0 | 2 | 0 |
| 15 | 4 | 4 | 2 | 2 | 0 |
| 16-21 | 3 | 3 | 0 | 0 | 0 |
| 22-72 | 2 | 2 | 0 | 0 | 0 |
| 73-75 | 2 | 2 | 2 | 0 | 0 |
| 76-77 | 2 | 2 | 0 | 2 | 2 |
| 78-89 | 2 | 2 | 0 | 2 | 0 |
The numbers refer to the family members that share sequence similarity, at an E-value below 0.001, with at least one other family member. Among the COPs families, homology is well conserved among sense-gene-encoded proteins, whereas sequence conservation is rare among antisense-gene-encoded proteins. With the exception of the family 1 sense gene promoters, homology is also poorly conserved among the promoter regions of sense and antisense genes.
Expression analysis of 1,866 COPs genes based on expression data from the GSE636 annotated gene-expression database
| Tissue | % of expressed genes among 26,939 | % of expressed genes among 1,866 overlapping COPs genes | % of COPs with jointly expressed sense and antisense genes (observed value) | % of COPs with jointly expressed sense and antisense genes (expected value) |
| Flowers | 52.5% | 67.9% | 45.6% | 46.1% |
| Roots | 51.4% | 63.4% | 38.5% | 40.2% |
| Suspension culture | 53.1% | 66.3% | 42.7% | 44.0% |
| 7 day old seedlings | 49.8% | 64.1% | 40.4% | 41.1% |
Expression of 1,596 COPs genes based on the NASC microarray database
| Tissue | % of expressed genes among 22,746 | % of expressed genes among 1,596 overlapping COPs genes | % of COPs with jointly expressed sense and antisense genes (observed value) | % of COPs with jointly expressed sense and antisense genes (expected value) |
| Flowers | 62.9% | 82.5% | 67.7% | 68.1% |
| Pollen | 31.7% | 36.0% | 12.3% | 13.0% |
| Seedlings, green parts | 57.3% | 76.7% | 57.8% | 58.8% |
| Cotyledons | 55.7% | 75.3% | 56.1% | 56.7% |
| Leaves | 55.6% | 74.8% | 54.5% | 56.0% |
| Roots | 62.6% | 76.9% | 58.3% | 59.1% |
| Hypocotyl | 62.9% | 82.1% | 66.8% | 67.4% |
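The expected values in the two expression tables above are consistent with a simple independence model: if a fraction p of COPs genes is expressed in a tissue, and sense and antisense partners are expressed independently, the expected fraction of pairs with both partners expressed is p squared. The source does not state how the expected column was derived, so the following is a minimal sketch under that assumption; the function name `expected_joint` is hypothetical.

```python
def expected_joint(pct_expressed: float) -> float:
    """Expected % of pairs with both genes expressed, assuming independence.

    pct_expressed is the % of COPs genes expressed in a tissue (table column 3).
    """
    p = pct_expressed / 100.0
    return round(p * p * 100.0, 1)


# (tissue, % expressed among COPs genes, tabulated expected %), copied from the tables
gse636 = [
    ("Flowers", 67.9, 46.1),
    ("Roots", 63.4, 40.2),
    ("Suspension culture", 66.3, 44.0),
    ("7 day old seedlings", 64.1, 41.1),
]
nasc = [
    ("Flowers", 82.5, 68.1),
    ("Pollen", 36.0, 13.0),
    ("Seedlings, green parts", 76.7, 58.8),
    ("Cotyledons", 75.3, 56.7),
    ("Leaves", 74.8, 56.0),
    ("Roots", 76.9, 59.1),
    ("Hypocotyl", 82.1, 67.4),
]

for name, rows in (("GSE636", gse636), ("NASC", nasc)):
    for tissue, pct, tabulated in rows:
        print(f"{name:7s} {tissue:24s} computed={expected_joint(pct):5.1f} tabulated={tabulated:5.1f}")
```

Comparing the computed column with the observed column then shows whether joint expression of sense and antisense genes departs from what independent expression would give.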
Representation of spliced genes among COPs, and correlation analysis for transcript modifications among these genes
| | Total genes | COPs | P value |
| COP genes show a strong positive bias for splicing | |||
| Total | 30,624 | 1,912 | |
| Spliced | 21,157 (69.1%) | 1,723 (90.1%) | 4.7e-113 |
| Spliced COP genes show a positive bias for alternative splicing | |||
| Spliced | 21,157 | 1,723 | |
| Alternatively spliced | 2,331 (11.0%) | 268 (15.6%) | 1.3e-9 |
| Alternatively spliced COP genes do not show a significant bias for alternative splicing at the last intron, TSS variation or polyadenylation site variation | |||
| Alternatively spliced | 2,331 | 268 | |
| Last intron alternative splicing | 1,662 (71.3%) | 195 (72.8%) | 0.31 |
| TSS variation | 1,424 (61%) | 158 (59.0%) | 0.24 |
| Polyadenylation site variation | 1,019 (43.7%) | 107 (39.9%) | 0.10 |
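The P values in the table above quantify how surprising the COPs counts are relative to the genome-wide counts. The source does not name the statistical test, so the sketch below assumes a one-sided hypergeometric test (drawing the COPs genes without replacement from all genes); it is an illustration, and its results need not reproduce the tabulated values exactly. The helper names are hypothetical, and the sum is carried out in log space because the binomial coefficients are far too large for floating point.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the feature."""
    log_denom = log_comb(N, n)
    total = 0.0
    for i in range(k, min(K, n) + 1):
        if n - i > N - K:  # not enough genes without the feature to fill the draw
            continue
        total += math.exp(log_comb(K, i) + log_comb(N - K, n - i) - log_denom)
    return total


# Spliced genes: 21,157 of 30,624 overall versus 1,723 of 1,912 COPs genes
p_spliced = hypergeom_upper_tail(1723, 30624, 21157, 1912)

# Alternatively spliced genes: 2,331 of 21,157 spliced genes versus 268 of 1,723
p_alt = hypergeom_upper_tail(268, 21157, 2331, 1723)

print(f"spliced: {p_spliced:.2e}")
print(f"alternatively spliced: {p_alt:.2e}")
```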
Analysis of preferences for alternative splicing and polyadenylation site variation among spliced COPs genes, depending on the termination site of the antisense transcript
| Spliced COP genes with an antisense transcript not overlapping a sense transcript intron region show a significant negative bias for alternative splicing | |||
| | COPs genes | COPs with an antisense gene ending 3,000-0 bp before the sense I/E boundary | P value |
| Spliced genes | 1,723 | 1,497 | |
| Alternatively spliced genes | 268 (15.6%) | 217 (14.5%) | 0.0018 |
| Spliced COPs genes with an antisense transcript overlapping a sense transcript intron region show a significant positive bias for alternative splicing | |||
| | COPs genes | COPs with an antisense gene ending 0-3,000 bp behind the sense I/E boundary | P value |
| Spliced genes | 1,723 | 226 | |
| Alternatively spliced genes | 268 (15.6%) | 51 (22.6%) | 0.0018 |
| | COPs genes | COPs with an antisense gene ending > 40 bp behind the sense I/E boundary | P value |
| Spliced genes | 1,723 | 129 | |
| Alternatively spliced genes | 268 (15.6%) | 35 (27.1%) | 0.00032 |
| Alternatively spliced COPs sense genes with an antisense transcript ending more than 40 bp behind their last I/E boundary show a positive bias for polyadenylation site variation | |||
| | COPs genes | COPs with an antisense gene ending > 40 bp behind the sense I/E boundary | P value |
| Alternatively spliced | 268 | 35 | |
| Polyadenylation site variation | 107 (39.9%) | 25 (71.4%) | 5.5e-05 |
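The two subgroups in the table above (antisense gene ending before versus behind the sense I/E boundary) partition the 1,723 spliced COPs genes: 1,497 + 226 = 1,723 genes and 217 + 51 = 268 alternatively spliced genes. Under a hypergeometric model, which is an assumption here because the source does not name the test, enrichment in one subgroup and depletion in the other are the same event, which would explain why both rows report the same P value. The sketch below checks that identity with exact integer arithmetic; `window_probability` is a hypothetical helper name.

```python
from math import comb


def window_probability(lo: int, hi: int, N: int, K: int, n: int) -> float:
    """P(lo <= X <= hi) for X ~ Hypergeometric(N, K, n), using exact integers."""
    # comb(a, b) returns 0 when b > a, so impossible outcomes contribute nothing.
    ways = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return ways / comb(N, n)


N, K = 1723, 268             # spliced COPs genes; alternatively spliced among them
n_behind, k_behind = 226, 51     # antisense ends 0-3,000 bp behind the boundary
n_before, k_before = 1497, 217   # antisense ends 3,000-0 bp before the boundary

# Enrichment among the 226 genes: P(X >= 51)
p_enriched = window_probability(k_behind, min(K, n_behind), N, K, n_behind)
# Depletion among the complementary 1,497 genes: P(Y <= 217)
p_depleted = window_probability(0, k_before, N, K, n_before)

# Polyadenylation site variation: 107 of 268 overall versus 25 of 35
p_polya = window_probability(25, 35, 268, 107, 35)

print(f"enriched: {p_enriched:.4g}")
print(f"depleted: {p_depleted:.4g}")
print(f"polyadenylation site variation: {p_polya:.4g}")
```

Because the two subgroup tests are mirror images, only one of them carries independent evidence; the > 40 bp and polyadenylation rows are separate comparisons.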
Figure 3. Illustration of the distance between the end of the antisense transcript and the last intron-exon boundary of the sense transcript. Negative values refer to a termination of the antisense transcript 5' to the intron-exon boundary.