Keiichi Mochida, Yukiko Uehara-Yamaguchi, Fuminori Takahashi, Takuhiro Yoshida, Tetsuya Sakurai, Kazuo Shinozaki.
Abstract
A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a "one-stop" information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops.
Year: 2013 PMID: 24130698 PMCID: PMC3793998 DOI: 10.1371/journal.pone.0075265
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Collection of RNA samples for constructing a Brachypodium full-length cDNA library.
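| | Tissue | Treatment | Growth condition |
| Normal tissues | Seed at germination | | Hydro-culture, growth chamber |
| Normal tissues | Shoot | | Hydro-culture, growth chamber |
| Normal tissues | Leaf at vegetative stage | | Pot soil, greenhouse |
| Normal tissues | Leaf after flowering | | Pot soil, greenhouse |
| Normal tissues | Root | | Hydro-culture, growth chamber |
| Normal tissues | Crown | | Hydro-culture, growth chamber |
| Normal tissues | Spikelet at flowering | | Pot soil, greenhouse |
| Normal tissues | Spikelet (DAP1–5) | | Pot soil, greenhouse |
| Normal tissues | Spikelet (DAP7–10) | | Pot soil, greenhouse |
| Normal tissues | Spikelet (DAP20–30) | | Pot soil, greenhouse |
| Normal tissues | Callus | | Culture on agarose gel, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | 250 mM NaCl 5 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | 250 mM NaCl 24 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | 100 µM ABA 5 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | 100 µM ABA 24 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Cold stress 4°C 5 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Cold stress 4°C 24 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Drought stress (on filter paper) 5 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Drought stress (on filter paper) 24 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Heat shock 42°C 2 h | Hydro-culture, growth chamber |
| Stress tissues | Leaves at 2 weeks after germination | Wounding 1 h | Hydro-culture, growth chamber |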
Sequence resources to update structural gene annotation in the Brachypodium genome.
| | | No. sequences | Min. length (bp) | Max. length (bp) | Mean length (bp) |
| Clones sequenced from both ends | | 38,446 | | | |
| Full-length cDNA reached from both ends (Contigs) | | 16,079 | 150 | 1,152 | 808.9 |
| Partial full-length cDNA sequences (ESTs) | Sanger FL cDNA ESTs 5′ | 39,358 | 103 | 640 | 581.3 |
| Partial full-length cDNA sequences (ESTs) | Sanger FL cDNA ESTs 3′ | 38,805 | 105 | 613 | 556.3 |
| Total sequences in the structural annotation | | 94,242 | | | |
| Total sequences mapped onto the Bd21 genome | | 91,309 | | | |
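As a quick consistency check on the table above, the following sketch recomputes the totals from the individual rows and derives the fraction of sequences that mapped onto the Bd21 genome. The counts are copied from the table; the variable names are illustrative only.

```python
# Sequence counts copied from the table above.
contigs = 16_079
ests_5prime = 39_358
ests_3prime = 38_805
total_in_annotation = 94_242
mapped_to_bd21 = 91_309

# ESTs from both ends, and the total used for structural annotation.
total_ests = ests_5prime + ests_3prime
recomputed_total = contigs + total_ests

# Share of the input sequences that could be placed on the genome.
mapped_fraction = mapped_to_bd21 / total_in_annotation

print(f"ESTs from both ends: {total_ests:,}")
print(f"Contigs + ESTs: {recomputed_total:,}")
print(f"Mapped fraction: {mapped_fraction:.1%}")
```

The EST sum reproduces the 78,163 ESTs quoted in the abstract, and contigs plus ESTs reproduce the 94,242 sequences listed as the annotation input.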
Brachypodium gene models corresponding to RBFL sequences.
| | Phytozome 8.0 | MIPS1.2 |
| Gene models with RBFL ESTs | 6,432 | 6,415 |
| Gene models with RBFL full-read contigs | 4,081 | 4,085 |
| Total | 10,513 | 10,500 |
Statistics of updated structural gene annotation in the Brachypodium Bd21 genome.
| | Phytozome 8.0 | MIPS1.2 |
| Gene models in the original annotation | 31,029 | 31,029 |
| Updated annotation based on the PASA analysis | 31,690 | 31,688 |
| Fused to one gene model (removed from previous annotation) | 20 | 20 |
| Remaining gene models | 31,009 | 31,009 |
| Remaining gene models with modification | 6,241 | 6,180 |
| Remaining gene models with modification in CDS regions | 719 | 717 |
| Remaining gene models with modifications in 5′UTRs | 5,399 | 5,340 |
| 5′UTR addition | 3,222 | 3,195 |
| 5′UTR extension | 2,038 | 2,021 |
| Other modifications in 5′UTRs | 139 | 124 |
| Remaining gene models with modifications in 3′UTRs | 2,606 | 2,571 |
| 3′UTR addition | 965 | 967 |
| 3′UTR extension | 1,471 | 1,451 |
| Other modifications in 3′UTRs | 170 | 153 |
| Newly defined gene models | 681 | 679 |
| Novel gene models (novel loci and TUs) | 362 | 362 |
| Newly added isoforms | 304 | 302 |
| Fused | 10 | 10 |
| Others | 5 | 5 |
Other modifications in UTRs include UTR removal, abbreviation, and any other structural changes.
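The rows of the annotation statistics table are related by simple arithmetic: the original gene models minus those removed by fusion give the remaining models, and adding the newly defined models gives the updated total. The sketch below checks this for both annotation columns. The counts are copied from the table; the helper name `is_consistent` is hypothetical.

```python
# Gene-model counts copied from the table above, as (Phytozome 8.0, MIPS1.2).
original = (31_029, 31_029)
fused_and_removed = (20, 20)
remaining = (31_009, 31_009)
newly_defined = (681, 679)
updated_total = (31_690, 31_688)

# Breakdown of the newly defined gene models.
new_breakdown = {
    "novel loci and TUs": (362, 362),
    "newly added isoforms": (304, 302),
    "fused": (10, 10),
    "others": (5, 5),
}


def is_consistent(col: int) -> bool:
    """Check that the rows for one annotation column add up."""
    kept = original[col] - fused_and_removed[col]
    new_sum = sum(counts[col] for counts in new_breakdown.values())
    return (
        kept == remaining[col]
        and new_sum == newly_defined[col]
        and kept + newly_defined[col] == updated_total[col]
    )


for col, name in enumerate(("Phytozome 8.0", "MIPS1.2")):
    print(name, "consistent:", is_consistent(col))
```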
Figure 1. Examples of updated structural annotation based on full-length cDNA sequences in the Brachypodium genome: UTR addition (A), newly identified exon (B), newly identified transcription unit (C), newly identified gene model (D) and fused gene model (E).
Forward and reverse sequence reads of the Brachypodium full-length cDNA are represented in green and red color, respectively. Modified gene structures are indicated with red lines.
Figure 2. Promoter architecture in the Brachypodium genome analyzed using updated gene structural annotation.
Distribution of all cis-motifs searched in the −3000 bp promoter regions from putative translation start sites (A). An example of the cis-motif data found in the −3000 bp promoter regions, as displayed on the genome browser in the RBFLDB (B). Enriched stress-responsive cis-motifs in the −1000 bp promoter regions of genes classified into functional categories based on MapMan ontology, hierarchically clustered by functional category and by cis-motif (C).
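The kind of promoter scan summarized in this figure can be illustrated with a minimal sketch: given an upstream sequence, report where each motif occurs relative to the start site. This is not the authors' pipeline. The two motifs, their names, the toy sequence, and the function names are assumptions chosen for illustration, and only the forward strand is searched.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile an IUPAC motif; the lookahead reports overlapping hits too."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile(f"(?=({body}))")


def scan_promoter(promoter: str, motifs: dict) -> list:
    """Return (motif name, position) pairs on the forward strand.

    Positions are relative to the start site: -1 is the base immediately
    upstream and -len(promoter) is the most distal base.
    """
    seq = promoter.upper()
    hits = []
    for name, motif in motifs.items():
        for match in motif_to_regex(motif).finditer(seq):
            hits.append((name, match.start() - len(seq)))
    return sorted(hits, key=lambda hit: hit[1])


# Illustrative motifs and a toy upstream sequence (both assumptions).
motifs = {"ABRE-like": "ACGTGKC", "DRE-like": "RCCGAC"}
promoter = "TTACGTGGCAAGCCGACTT"
hits = scan_promoter(promoter, motifs)
print(hits)
```

A genome-wide analysis would apply such a scan to the upstream region of every gene model, search the reverse strand as well, and then tabulate hit positions per motif to obtain a distribution like the one in panel A.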
Gene models encoding transcription factors cloned as full-length cDNA.
| TF | GramineaeTFDB | RBFL | TF | GramineaeTFDB | RBFL |
| (R1)R2R3_Myb | 104 | 27 | HSF | 30 | 14 |
| ABI3VP1 | 48 | 4 | JUMONJI | 19 | 2 |
| AP2_EREBP | 153 | 62 | LFY | 1 | 0 |
| ARF | 36 | 11 | LIM | 20 | 8 |
| ARID | 10 | 4 | LUG | 5 | 1 |
| Alfin-like | 12 | 8 | MADS | 75 | 8 |
| Aux_IAA | 38 | 22 | MBF1 | 3 | 2 |
| BBR-BPC | 4 | 1 | Myb_related | 45 | 19 |
| BES1 | 7 | 3 | NAC | 103 | 30 |
| C2C2_Zn-CO-like | 34 | 20 | Nin-like | 15 | 1 |
| C2C2_Zn-Dof | 27 | 9 | PHD | 178 | 43 |
| C2C2_Zn-GATA | 24 | 10 | PLATZ | 14 | 4 |
| C2C2_Zn-YABBY | 13 | 4 | PcG | 51 | 5 |
| C2H2_Zn | 102 | 24 | S1Fa-like | 2 | 2 |
| C3H-TypeI | 78 | 30 | SBP | 18 | 4 |
| CAMTA | 10 | 2 | SRS | 4 | 1 |
| CCAAT_Dr1 | 1 | 1 | TCP | 21 | 5 |
| CCAAT_HAP2 | 12 | 5 | TUB | 12 | 7 |
| CCAAT_HAP3 | 16 | 4 | Trihelix | 8 | 4 |
| CCAAT_HAP5 | 13 | 4 | ULT | 1 | 0 |
| CPP | 11 | 5 | VOZ | 2 | 1 |
| E2F_DP | 8 | 2 | WRKY_Zn | 89 | 23 |
| EIL | 6 | 3 | Whirly | 2 | 1 |
| GARP_ARRB | 9 | 5 | ZIM | 19 | 14 |
| GARP_G2-like | 56 | 18 | atypical_MYB | 36 | 14 |
| GRAS | 47 | 10 | bHLH | 151 | 41 |
| GRF | 28 | 1 | bZIP | 96 | 44 |
| GeBP | 16 | 5 | zf-HD | 16 | 2 |
| HB | 101 | 41 | zf-TAZ | 9 | 4 |
| HMG-box | 11 | 6 | (ambiguous) | (11) | (2) |
| HRT | 1 | 0 | |||
| Total | 2,092 | 657 |
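To put the table in proportion, the sketch below computes the share of GramineaeTFDB gene models that were cloned as full-length cDNA, overall and for a handful of families. The counts are copied from the table; the choice of families and the variable names are illustrative.

```python
# (GramineaeTFDB gene models, RBFL full-length cDNA clones), from the table.
selected_families = {
    "AP2_EREBP": (153, 62),
    "bZIP": (96, 44),
    "NAC": (103, 30),
    "MADS": (75, 8),
}
total_models, total_cloned = 2_092, 657

# Fraction of each family represented in the full-length cDNA collection.
coverage = {
    family: cloned / models
    for family, (models, cloned) in selected_families.items()
}
overall_coverage = total_cloned / total_models

# List families from best to worst covered.
ranked = sorted(coverage, key=coverage.get, reverse=True)
for family in ranked:
    print(f"{family}: {coverage[family]:.1%}")
print(f"All families: {overall_coverage:.1%}")
```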
Figure 3. Comparative mapping of Pooideae cDNAs or gene models to the Brachypodium genome or barley genome assembly.
The updated Brachypodium gene models, derived from the MIPS 1.2 annotation using RBFL cDNAs, were used for this comparative mapping analysis. Summary of the comparative mapping of barley and wheat cDNAs to the Brachypodium genome (Bdi Genome): for barley full-length cDNAs (Hvu FLcDNA), gene models annotated in the barley Morex genome (Hvu Morex cDNA), wheat full-length cDNAs (Tae FLcDNA), and wheat gene models from a shotgun genome assembly (wheat cDNA (UK454)), the numbers of queries mapped to genic regions, mapped to intergenic regions, or not mapped are shown, together with the numbers of Brachypodium genes located in regions with mapped Triticeae cDNAs or gene models (A). An example of the mapping of Triticeae cDNAs alongside the Brachypodium gene annotation (B). Summary of the comparative mapping to the barley Morex genome assembly (Hvu Genome): for Brachypodium transcripts (Bdi Transcript), barley full-length cDNAs (Hvu FLcDNA), wheat full-length cDNAs (Tae FLcDNA), and wheat gene models from a shotgun genome assembly (wheat cDNA (UK454)), the numbers of queries mapped to genic regions, mapped to intergenic regions, or not mapped are shown, together with the numbers of barley genes located in regions with mapped Pooideae cDNAs or gene models (C). An example of the mapping of Pooideae cDNAs alongside the barley gene annotation (D).