Xiu-Jie Wang, Terry Gaasterland, Nam-Hai Chua.
Abstract
BACKGROUND: Natural antisense transcripts (NATs) are a class of endogenous coding or non-protein-coding RNAs with sequence complementarity to other transcripts. Several lines of evidence have shown that cis- and trans-NATs may participate in a broad range of gene regulatory events. Genome-wide identification of cis-NATs in human, mouse and rice has revealed their widespread occurrence in eukaryotes. However, little is known about cis-NATs in the model plant Arabidopsis thaliana.
Year: 2005 PMID: 15833117 PMCID: PMC1088958 DOI: 10.1186/gb-2005-6-4-r30
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Relationships between NAT pairs from different datasets. (a) Overlap between cDNA-NAT pairs and genomic-NAT pairs. Among the 332 cDNA-NAT pairs, 145 pairs have corresponding annotated genes for both transcripts. For the other 187 cDNA-NAT pairs, at least one transcript has no counterpart in the current Arabidopsis genome annotation. (b) Overlap between cDNA-, genomic- and genomic-cDNA-NAT pairs. All cDNA-NAT pairs are included in genomic-cDNA-NAT pairs. Blue circle, cDNA-NATs; red circle, genomic-NATs; green circle, genomic-cDNA-NATs.
Structure analysis of NAT pairs
| Category | Number of pairs | | | |
| | cDNA-NAT | genomic-NAT | genomic-cDNA-NAT | Total |
| Tail to tail (3' to 3') | 181 | 737 | 48 | 966 (72.1%) |
| Head to head (5' to 5') | 97 | 31 | 57 | 185 (13.8%) |
| One transcript contained entirely within the other transcript | 51 | 35 | 90 | 176 (13.1%) |
| Two transcripts overlap only within introns | 3 | 4 | 6 | 13 (1.0%) |
| Total | 332 | 807 | 201 | 1,340 (100%) |
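The first three categories in the table can be told apart from transcript coordinates alone. The following is a minimal sketch of one way to do that, not the authors' procedure. It assumes each transcript is given as an inclusive `(start, end)` genomic interval with `start <= end`, that one transcript lies on the plus strand and the other on the minus strand, and the function name `classify_overlap` is hypothetical. The fourth category (overlap only within introns) needs exon structure and is not handled here.

```python
from typing import Optional, Tuple

# (start, end) in genomic coordinates, inclusive, with start <= end (assumed convention).
Interval = Tuple[int, int]


def classify_overlap(plus: Interval, minus: Interval) -> Optional[str]:
    """Classify how a plus-strand and a minus-strand transcript overlap.

    Returns "tail to tail", "head to head", "contained", or None when the
    two intervals do not overlap at all.
    """
    p_start, p_end = plus
    m_start, m_end = minus

    # No shared genomic position: not a cis-NAT candidate.
    if p_end < m_start or m_end < p_start:
        return None

    # One transcript lies entirely within the other.
    plus_in_minus = m_start <= p_start and p_end <= m_end
    minus_in_plus = p_start <= m_start and m_end <= p_end
    if plus_in_minus or minus_in_plus:
        return "contained"

    # Partial overlap. On the plus strand the 3' end is the high coordinate;
    # on the minus strand the 3' end is the low coordinate.
    if p_start < m_start:
        # Plus transcript is upstream: the two 3' ends overlap (convergent).
        return "tail to tail"
    # Minus transcript is upstream: the two 5' ends overlap (divergent).
    return "head to head"


print(classify_overlap((100, 500), (400, 900)))
print(classify_overlap((400, 900), (100, 500)))
```

The coordinates in the two calls are invented for illustration and do not correspond to any NAT pair in the tables.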
Figure 2. Distribution of genomic overlap lengths of NATs. The overlap length of each NAT pair in exons was calculated. The number of NAT pairs (y-axis) is plotted against the overlap lengths (in nucleotides) of exons in each NAT pair (x-axis).
Chromosomal distribution of NAT pairs
| Chromosome | Number of NAT pairs | | | | Chromosome size (Mb) |
| | cDNA-NAT | genomic-NAT | genomic-cDNA-NAT | Total | |
| 1 | 85 | 216 | 55 | 356 | 29.1 |
| 2 | 41 | 120 | 40 | 201 | 19.6 |
| 3 | 69 | 142 | 46 | 257 | 23.2 |
| 4 | 48 | 129 | 29 | 206 | 17.5 |
| 5 | 89 | 200 | 31 | 320 | 26.0 |
| Total | 332 | 807 | 201 | 1,340 | 115.4 |
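To see whether NAT pairs are spread evenly across chromosomes, the totals can be divided by chromosome size. The short calculation below is an illustrative sketch using only the numbers in the table above. The variable names are hypothetical, and pairs per Mb is just one possible normalization (per annotated gene would be another).

```python
# Total NAT pairs and chromosome sizes (Mb), copied from the table above.
nat_pairs = {1: 356, 2: 201, 3: 257, 4: 206, 5: 320}
size_mb = {1: 29.1, 2: 19.6, 3: 23.2, 4: 17.5, 5: 26.0}

# NAT pairs per megabase for each chromosome.
density = {chrom: nat_pairs[chrom] / size_mb[chrom] for chrom in nat_pairs}

# Genome-wide density: all pairs divided by the summed chromosome sizes.
genome_density = sum(nat_pairs.values()) / sum(size_mb.values())

for chrom in sorted(density):
    print(f"chromosome {chrom}: {density[chrom]:.1f} NAT pairs per Mb")
print(f"genome-wide: {genome_density:.1f} NAT pairs per Mb")
```

With these figures, every chromosome falls between roughly 10 and 12.5 pairs per Mb, close to the genome-wide value of about 11.6.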
Splicing pattern and coding potential of Arabidopsis full-length cDNAs and annotated genes
| | UniGene cDNAs | RIKEN cDNAs | Annotated genes |
| Total transcripts | 20,683 | 13,181 | 29,993 |
| Number of transcripts with perfect genome match | 17,814 | 12,877 | 29,993 |
| Number of transcripts with ORFs | 16,621 | 12,544 | 26,207 |
| Number of non-spliced transcripts with ORFs | 2,534 | 1,555 | 4,722 |
| Number of transcripts without ORFs | 1,193 | 333 | 3,786 |
| Number of non-spliced transcripts without ORFs | 466 | 130 | 3,786 |
The splicing pattern of each transcript was obtained by aligning its corresponding cDNA sequences to the Arabidopsis genome using sim4. The coding potential of the genomic sequence of each transcript was examined by GeneScan.
Summary of MPSS matches for NAT pairs
| | Number of NAT pairs | | | |
| | cDNA-NAT | genomic-NAT | genomic-cDNA-NAT | Total |
| Total NAT pairs | 332 | 807 | 201 | 1,340 |
| Number of pairs with MPSS matches on both strands | | | | |
| Total | 103 | 293 | 59 | 455 |
| Expressed absolutely in different libraries | 14 | 49 | 15 | 78 |
| Expressed mainly in different libraries, occasionally in same libraries | 89 | 244 | 44 | 377 |
Examples of NAT pairs with MPSS matches on both strands
| ID | Strand | CAF | INF | LEF | ROF | SIF | AP1 | AP3 | AGM | INS | ROS | SAP | S04 | S52 | LES |
| Pair A | |||||||||||||||
| At1g09750 | + | ||||||||||||||
| At1g09760 | - | ||||||||||||||
| Pair B | |||||||||||||||
| At1g72060 | + | N | 2 | N | N | N | N | N | 2 | 1 | |||||
| At1g72070 | - | N | 1 | N | N | N | N | N | N | N | |||||
Distinct expression of sense and antisense transcripts of NAT pair A was observed in all but one library. In the library where both transcripts of pair A were expressed, the abundance of one transcript was significantly higher than that of the other. For NAT pair B, the sense and antisense transcripts were expressed differentially in different libraries. Libraries in which both transcripts of a NAT pair were expressed are shown in bold; libraries in which transcripts of only one gene of a NAT pair were expressed are shown in italics. Abbreviations for libraries: CAF, callus - actively growing, classic MPSS; INF, inflorescence - mixed stage, immature buds, classic MPSS; LEF, leaves - 21 day, untreated, classic MPSS; ROF, root - 21 day, untreated, classic MPSS; SIF, silique - 24-48 h post-fertilization, classic MPSS; AP1, ap1-10 inflorescence - mixed stage, immature buds; AP3, ap3-6 inflorescence - mixed stage, immature buds; AGM, agamous inflorescence - mixed stage, immature buds; INS, inflorescence - mixed stage, immature buds; ROS, root - 21 day, untreated; SAP, sup/ap1 inflorescence - mixed stage, immature buds; S04, leaves, 4 h after salicylic acid treatment; S52, leaves, 52 h after salicylic acid treatment; LES, leaves - 21 day, untreated.
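The distinction drawn in this legend, between libraries where both transcripts of a pair are detected and libraries where only one is, can be sketched as a small comparison of per-library abundances. This is an illustrative sketch rather than the authors' analysis. The function name `compare_expression`, the category labels, and the rule that any abundance above zero counts as expressed are all assumptions, and the abundances in the example are made up.

```python
from typing import Dict, List, Tuple


def compare_expression(
    sense: Dict[str, int], antisense: Dict[str, int]
) -> Tuple[str, List[str]]:
    """Return a category label and the libraries where both transcripts are detected.

    A transcript counts as expressed in a library when its abundance is > 0
    (an assumed cutoff). Libraries missing from a dict are treated as 0.
    Intended for pairs that have matches on both strands.
    """
    libraries = sorted(set(sense) | set(antisense))
    shared = [
        lib
        for lib in libraries
        if sense.get(lib, 0) > 0 and antisense.get(lib, 0) > 0
    ]
    category = "different libraries only" if not shared else "shared in some libraries"
    return category, shared


# Made-up abundances for three libraries, not values from the paper.
sense_counts = {"INF": 12, "LEF": 0, "ROF": 3}
antisense_counts = {"INF": 0, "LEF": 7, "ROF": 1}
print(compare_expression(sense_counts, antisense_counts))
```

In this toy input the two transcripts share only the ROF library, so the pair would fall in the "shared in some libraries" group.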
Figure 3. Distribution of coexpressed and dominantly expressed NAT pairs in different libraries. The number of coexpressed NAT pairs in each library is shown as a blue bar and the number of dominantly expressed NAT pairs as a red bar. See the legend of Table 5 for library information.
siRNA matches of NAT pairs
| Category of NAT pairs | Gene ID | Strand | Overlap length (nucleotides) | Description |
| Genomic-NAT | At2g06510 | + | 506 | Replication protein, putative |
| | At2g06520 | - | | Membrane protein, putative |
| | At4g35850 | + | 360 | Pentatricopeptide (PPR) repeat-containing protein |
| | At4g35860 | - | | Ras-related GTP-binding protein, putative |
| | At5g20720 | + | 294 | Chaperonin, chloroplast |
| | At5g20730 | - | | Auxin-responsive factor |
| | At5g41680 | + | 587 | Protein kinase family protein |
| | At5g41685 | - | | Mitochondrial import receptor subunit TOM7 |
| | At5g48870 | + | 118 | Small nuclear ribonucleoprotein, putative |
| | At5g48880 | - | | Acetyl-CoA C-acyltransferase 1 |
| cDNA-NAT | RAFL19-56-G17 | + | 1,209 | No coding potential |
| | RAFL09-70-E21 | - | | Expressed protein |
| | At#S18901030 | + | 52 | Putative transcription factor |
| | At#S18898439 | - | | Pentatricopeptide (PPR) repeat-containing protein |
| | At#S18900150 | + | 884 | No coding potential |
| | At#S18898471 | - | | Expressed protein |
| | At#S18912025 | + | 1,149 | No coding potential |
| | At#S18898946 | - | | TCP family transcription factor |
| Genomic-cDNA-NAT | At1g07725 | + | 1,640 | Exocyst subunit EXO70 family protein |
| | At#S18898556 | - | | No coding potential |
| | At2g16587 | + | 379 | Expressed protein |
| | RAFL19-48-E15 | - | | No coding potential |
Conserved NAT pairs of Arabidopsis and rice
| | ID | Strand | Overlap pattern | Overlap length (nucleotides) | Description |
| NAT pair 1 | At5g02820 | + | Tail to tail | 1,138 | DNA topoisomerase VIA |
| | At5g02830 | - | | | PPR repeat-containing protein |
| Rice | J033010B03 | + | Tail to tail | 1 | DNA topoisomerase VIA |
| | J013135M09 | - | | | PPR repeat-containing protein |
| NAT pair 2 | At5g54270 | + | Tail to tail | 1,047 | Chlorophyll A-B binding protein |
| | At5g54280 | - | | | Myosin heavy chain |
| Rice | 006-301-C08 | + | Tail to tail | 4,425 | Chlorophyll A-B binding protein |
| | J013155K02 | - | | | Myosin heavy chain |