Isabelle Lesur, Grégoire Le Provost, Pascal Bento, Corinne Da Silva, Jean-Charles Leplé, Florent Murat, Saneyoshi Ueno, Jerôme Bartholomé, Céline Lalanne, François Ehrenmann, Céline Noirot, Christian Burban, Valérie Léger, Joelle Amselem, Caroline Belser, Hadi Quesneville, Michael Stierschneider, Silvia Fluch, Lasse Feldhahn, Mika Tarkka, Sylvie Herrmann, François Buscot, Christophe Klopp, Antoine Kremer, Jérôme Salse, Jean-Marc Aury, Christophe Plomion.
Abstract
BACKGROUND: Many northern-hemisphere forests are dominated by oaks. These species extend over diverse environmental conditions and are thus interesting models for studies of plant adaptation and speciation. The genomic toolbox is an important asset for exploring the functional variation associated with natural selection.
Year: 2015 PMID: 25765701 PMCID: PMC4350297 DOI: 10.1186/s12864-015-1331-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schematic representation of the bioinformatic analysis.
Description of oak transcriptomic assemblies
A

| | Long-reads assembly | Short-reads assembly | Meta-assembly (OCV3) |
|---|---|---|---|
| Number of trimmed sequences considered in the assembly | 2,888,537 | 417,337,626 | / |
| Number of trimmed sequences incorporated in the assembly | 2,003,295 | 417,337,626 | 274,867 |
| Number of contigs > 100 bp, after redundancy reduction | 44,272 | 230,595 | 192,097 |
| Number of singletons | 300,373 | 40,119,145 | / |
| Consensus (total bp) | / | / | 199,278,344 |
| Average contig size (bp) | 937 | 877 | 1,037 |

B

| | OCV1 | OCV2 | OCV3 |
|---|---|---|---|
| Number of contigs | 69,154 | 65,712 | 192,097 |
| Mean length (bp) | 705 | 1,003 | 1,037 |
| Median (bp) | 606 | 734 | 597 |
| N50 (bp) | 908 | 1,545 | 1,879 |
| Consensus (bp) | 48,751,826 | 65,913,455 | 199,278,344 |
| Nb of annotated contigs in SwissProt | 34,614 | 38,292 | 63,857 |
| Nb of unique SwissProt ID | 13,333 | 16,429 | 17,476 |

C

| Subset | Number of contigs | Assembled sequences (bp) | Mean contig size (bp) | Median contig size (bp) | N50 (bp) |
|---|---|---|---|---|---|
| OCV3-91k | 90,786 | 148,088,893 | 1,631 | 1,292 | 2,329 |
| OCV3-101k | 101,311 | 51,187,294 | 505 | 328 | 697 |
| OCV3 | 192,097 | 199,278,344 | 1,037 | 597 | 1,879 |

A Characteristics of the long-reads, short-reads and meta assemblies (OCV3).
B Side-by-side comparison of oak unigene sets (OCV1: assembly from Ueno et al. [8], OCV2: assembly from Tarkka et al. [22], and OCV3: this paper). N50 length is defined as the length for which the collection of all contigs of that length or longer contains at least half of the total of the lengths of the contigs.
C Comparison between OCV3-91k (Unigenes with BlastX hit) and OCV3-101k (Unigenes without BlastX hit) subsets: assembled sequences (in bp), mean and median contig sizes, N50.
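
The N50 definition given in note B can be illustrated with a minimal Python sketch. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper; the code only restates the definition (the contig length at which contigs of that length or longer hold at least half of the total assembled length) and is not the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 (in bp) of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk the contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first length whose cumulative sum reaches half the total.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: eight contigs, 32 bp in total.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

Because N50 weights contigs by their length, it can differ markedly from the mean and median contig sizes reported in the same tables.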
BlastX results for OCV3 contigs against the SwissProt database and the proteomes of five species

| | | | | | | SwissProt | Total |
|---|---|---|---|---|---|---|---|
| Nb of oak contigs with a hit | 84,852 | 82,655 | 81,849 | 78,143 | 77,784 | 63,857 | 90,786 |
| Nb of proteins with a hit | 18,587 | 16,573 | 23,053 | 22,338 | 18,661 | 17,476 | / |
| Total nb of proteins or accessions | 28,701 | 26,346 | 45,033 | 46,315 | 35,386 | 540,732 | / |

Mapping results against the genome and gene models
A

| | OCV3 | OCV3-91k | OCV3-101k |
|---|---|---|---|
| Number of sequences | 192,097 | 90,786 | 101,311 |
| Number of mapped sequences | 59,851 (31.1%) | 53,600 (59%) | 6,251 (6.2%) |
| Number of matches | 64,292 | 54,954 | 9,338 |
| Number of matched exons | 209,795 | 200,252 | 9,543 |
| Number of exons/model | 3.26 | 3.64 | 1.02 |
| Mean of identity percent | 83.99% | 82.75% | 91.26% |
| Number of monoexonics | 29,767 | 20,591 | 9,176 |

B

| | OCV3-91k | OCV3-101k |
|---|---|---|
| Number of sequences | 90,786 | 101,311 |
| Number of mapped contigs | 64,001 (70.5%) | 8,380 (8.27%) |
| Number of matched Prunus persica gene models | 17,038 (59.4%) | 5,265 (18.3%) |

A Mapping results (BLAT software) for OCV3, OCV3-91k (Unigenes with BlastX hit) and OCV3-101k (unigenes without BlastX hit) against the Prunus persica genome.
B Mapping results (BlastN) for OCV3-91k and OCV3-101k against the Prunus persica gene models.
Figure 2. Gene ontology classification of OCV3-91k contigs. GO Slim terms contributing to the annotation of the 77,277 OCV3-91k contigs. Red: Biological process. Green: Cellular component. Blue: Molecular function.
Figure 3. Oak genome orthologous relationships. A. Simplified phylogenetic tree placing oaks among sequenced malvid (Arabidopsis, eucalyptus), fabid (peach, poplar) and basal rosid (grape) genomes. B. Distribution of Ks values (x-axis) for orthologous gene pairs between oak and malvids (Arabidopsis, eucalyptus: in blue), between oak and fabids (peach, poplar: in red) and between oak and grape (in green). Grape is considered to be the modern representative of the ancestral eudicot genome. C. Distribution of peach gene models (blue curve) and 9,549 orthologous gene pairs with oak (red curve) along the 8 peach chromosomes. Peach chromosomes are illustrated as a mosaic of 7 colours highlighting the features of the ancestral eudicot chromosomes (A1, A4, A7, A10, A13, A16, A19) as in [19].
Subset of genes differentially expressed between the ecodormancy and swelling bud stages

| GO term | Gene | Arabidopsis locus | | Localization in the functional network |
|---|---|---|---|---|
| Ribosome biogenesis | T13C7.4 (60S ribosomal protein L14) | AT2G20450 | >100 | |
| Ribosome biogenesis | F12L6.5 (ribosomal protein L23A) | AT2G39460 | >100 | |
| Ribosome biogenesis | T9J14.13 (ribosomal protein S24e) | AT3G04920 | >100 | |
| Ribosome biogenesis | Ribosomal protein L232A | AT3G55280 | >100 | |
| Ribosome biogenesis | T25K17.40 (ribosomal protein L31e) | AT4G26230 | >100 | |
| Ribosome biogenesis | K16F13.2 (40S ribosomal protein S27-3) | AT5G47930 | >100 | |
| Ribosome biogenesis | MUP24.13 (60S ribosomal protein L12) | AT5G60670 | >100 | |
| Ribosome biogenesis | F10K1.22 (60S ribosomal protein L35a) | AT1G07070 | >100 | |
| Ribosome biogenesis | T2P11.7 (60S ribosomal protein L34) | AT1G26880 | >100 | |
| Ribosome biogenesis | F19K6.12 (60S ribosomal protein L37) | AT1G52300 | >100 | |
| Ribosome biogenesis | STV1 (ribosomal protein L24) | AT3G53020 | >100 | |
| Ribosome biogenesis | Zinc-binding ribosomal protein | AT3G60245 | >100 | |
| Ribosome biogenesis | PRPL11 (plastid ribosomal protein L11) | AT1G32990 | >100 | |
| Ubiquitin dependent protein catabolic process | UBC28 (ubiquitin conjugating enzyme 28) | AT1G64230 | >100 | Neighbors of sbi-miR169r-3p_agpf_35 |
| Ubiquitin dependent protein catabolic process | FKF1 (flavin-binding kelch repeat F box 1) | AT1G68050 | >100 | Neighbors of ELF3, Neighbors of GI |
| Ubiquitin dependent protein catabolic process | UBQ11 (ubiquitin 11) | AT4G05050 | >100 | Neighbors of heat shock |
| Ubiquitin dependent protein catabolic process | ASK2 (Arabidopsis SKP-Like2) | AT5G42190 | >100 | |
| Ubiquitin dependent protein catabolic process | ATUBA1 (ubiquitin activating enzyme 1) | AT2G30110 | >100 | |
| Response to cold | DREB1A (DREB subfamily A-1) | AT4G25480 | >100 | Neighbors of cold stress, DREB and CBF |
| Response to cold | CBF1 (C repeat/DRE binding factor 1) | AT4G25490 | >100 | Neighbors of cold stress, DREB and CBF |
| Response to cold | LOS1 (low expression of osmotically responsive genes 1) | AT1G56070 | >100 | Neighbors of cold stress |
| Response to cold | LTI30 (low temperature-induced 30) | AT3G50970 | >30 | Neighbors of cold stress, ABA and CBF |
| Response to cold | RCI3 (rare cold inducible gene 3) | AT1G05260 | >100 | Neighbors of cold stress |
| Response to cold | Fib (Fibrillin 1A) | AT4G04020 | >50 | Neighbors of ABA |
| Response to water deprivation | ATBI-1 (Bax inhibitor 1) | AT5G47120 | >100 | Neighbors of drought and COLI |
| Response to water deprivation | SIP3 (CBL interacting protein kinase 6) | AT4G30960 | >100 | |
| Response to water deprivation | CBL9 (calcineurin B like protein 9) | AT5G47100 | >100 | Neighbors of ABA, drought and cold |
| Response to gibberellin stimulus | Gasa1 (GAST1 protein homolog 1) | AT1G75750 | >100 | |
| Response to gibberellin stimulus | Gasa2 (GAST1 protein homolog 2) | AT4G09610 | >100 | |
| Response to gibberellin stimulus | AGL20 (Agamous like 20) | AT2G45660 | >100 | Neighbors of AP1 |
| Response to high light intensity | Bag6 (Bcl-2-associated athanogene 6) | AT2G46240 | >100 | Neighbors of heat shock |
| | | | | |
| DNA dependent DNA replication initiation | MCM6 | AT5G44635 | >3 | |
| DNA dependent DNA replication initiation | MCM3 | AT5G46280 | >3 | |
| DNA dependent DNA replication initiation | PRL (prolifera) | AT4G02060 | >2 | Neighbors of cell cycle, DNA replication |
| DNA dependent DNA replication initiation | CDC45 (cell division cycle 45) | AT3G25100 | >6 | Neighbors of mitosis, DNA replication |
| DNA dependent DNA replication initiation | T12C22.19 (MCM2) | AT1G44900 | >2 | |
| Regulation of cell cycle and cell division | CYCB 1;4 (cyclin dependent protein kinase) | AT2G26760 | >10 | |
| Regulation of cell cycle and cell division | CYCD1;1 (cyclin D-type protein) | AT1G70210 | >10 | Neighbors of CYCD1;1 and CYCD1;3 |
| Regulation of cell cycle and cell division | CYCD5;1 (cyclin D-type protein) | AT4G37630 | >100 | Neighbors of cell cycle |
| Regulation of cell cycle and cell division | CYCA3;2 (cyclin D-type protein) | AT1G47210 | >2 | Neighbors of morphogenesis and cell differentiation |
| Regulation of cell cycle and cell division | CYCD3;1 (cyclin D-type protein) | AT4G34160 | >7 | Neighbors of CYCD1;1 and CYCD1;3 |
| Response to auxin | OBP1 (OBF binding protein) | AT3G50410 | >3 | Neighbors of cell cycle |
| Response to auxin | Aux1 (auxin influx transporter) | AT2G38120 | >6 | Neighbors of primordium elongation and cell differentiation |
| Response to gibberellin | Gasa4 (GAST1 protein homolog 4) | AT5G15230 | >10 | Neighbors of heat shock and flower development |
| Response to gibberellin | Myb26 (MYB domain protein 26) | AT3G13890 | >100 | Neighbors of cell development |
| Response to brassinosteroid | Bas1 (cytochrome P450) | AT2G26710 | >7 | Neighbors of leaf development and hormone |
| Response to brassinosteroid | T5I8.2 (hercules receptor protein kinase 2) | AT1G30570 | >40 | Neighbors of brassinolide |
| Response to sucrose stimulation | GBF6 (leucine zipper 11) | AT4G34590 | >2 | |
| Response to sucrose stimulation | GASA6 (GA stimulated arabidopsis 6) | AT1G74670 | >10 | |

The most differentially expressed GO terms identified in the enrichment analysis are indicated in the first column for each dormancy stage.
The localization of the genes in the functional network is indicated in the last column when available.
Figure 4. OCV3-91k contigs differentially expressed between six pairs of tissues (ecodB: ecodormant bud, swB: swelling bud, XY: differentiating secondary xylem, RO: root, LE: leaf and CA: dedifferentiated callus). Number of significantly differentially expressed contigs identified by three (upper number) and two (lower number) statistical methods.
Figure 5. Venn diagram showing the number of differentially expressed genes during bud dormancy release identified by 3 statistical algorithms; 6,004 differentially expressed contigs were identified by the R statistics method, 823 by DESeq and 1,632 by edgeR (see ).
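
The agreement counts summarised in Figures 4 and 5 (contigs called by three or by two statistical methods) can be sketched in Python as follows. This is only an illustration under stated assumptions: the contig identifiers and set contents are hypothetical, the helper names `support_counts` and `supported_by` are invented for this sketch, and "two methods" is read here as exactly two, which the legend does not specify.

```python
from collections import Counter


def support_counts(*method_calls):
    """For each contig, count how many methods called it differentially expressed."""
    counts = Counter()
    for calls in method_calls:
        # set() guards against a contig listed twice by the same method.
        counts.update(set(calls))
    return counts


def supported_by(counts, k):
    """Return the contigs called by exactly k methods (assumed reading of the legend)."""
    return {contig for contig, n in counts.items() if n == k}


# Hypothetical call sets, one per statistical method; IDs are placeholders.
r_stat_calls = {"contig_1", "contig_2", "contig_3", "contig_4"}
deseq_calls = {"contig_2", "contig_3"}
edger_calls = {"contig_3", "contig_4", "contig_5"}

counts = support_counts(r_stat_calls, deseq_calls, edger_calls)
called_by_three = supported_by(counts, 3)
called_by_two = supported_by(counts, 2)
print(len(called_by_three), len(called_by_two))
```

Counting support per contig, rather than intersecting the sets pairwise, yields every region of the Venn diagram from a single pass over the call lists.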
Figure 6. Functional network predicted from the list of genes upregulated in ecodormant bud (in red) or swelling bud (in blue) based on the subset of 663 differentially expressed OCV3-91k contigs (listed in Additional file ). It is possible to zoom in on particular parts of the network in the .tif file.