Catherine J Nock, Abdul Baten, Bronwyn J Barkla, Agnelo Furtado, Robert J Henry, Graham J King.
Abstract
BACKGROUND: The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741.
Keywords: Crop; Gene space; Genome; Macadamia; Proteaceae; Rainforest; Transcriptome
Year: 2016 PMID: 27855648 PMCID: PMC5114810 DOI: 10.1186/s12864-016-3272-3
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Macadamia integrifolia genome and transcriptome sequencing, assembly and annotation statistics
| Library type | Reads post QC (millions) | Nucleotides post QC (Gb) |
|---|---|---|
| Genome sequencing: | | |
| Illumina GAIIx 480 bp insert (2x150 bp PE) | 101.7 | 30.51 |
| Illumina GAIIx 700 bp insert (2x150 bp PE) | 48.6 | 14.58 |
| Illumina HiSeq 8000 bp insert (2x100 bp MP) | 32.4 | 6.48 |
| Total | 182.7 | 51.57 |
| Transcriptome sequencing: | | |
| Illumina HiSeq Flower (2x100 bp PE) | 82.1 | 16 |
| Illumina HiSeq Shoot (2x100 bp PE) | 70 | 13.7 |
| Illumina HiSeq Leaf (2x100 bp PE) | 76 | 14.9 |
| Total | 228.1 | 44.6 |

| Genome assembly | Contigs | Scaffolds |
|---|---|---|
| Number | 210,726 | 193,493 |
| Minimum size (bp) | 388 | 500 |
| Maximum size (bp) | 379,349 | 643,490 |
| N50 (bp) | 3,522 | 4,745 |
| Total assembly length (Mb) | 477 | 518 |

| Transcriptome assembly | Statistics |
|---|---|
| Number of transcripts | 298,030 |
| Maximum transcript length (bp) | 17,814 |
| Minimum transcript length (bp) | 224 |
| Mean transcript length (bp) | 823 |
| Standard deviation (bp) | 886 |
| Total length (bp) | 245,373,045 |
| N50 (bp) | 1,339 |

| Genome annotation | Statistics |
|---|---|
| Number of gene models | 35,337 |
| Average gene length (bp) | 2,518 |
| Average coding sequence length (bp) | 1,090 |
| Gene models similar to | 74% |
| Gene models similar to | 79% |
| Eukaryotic 458 CORE genes available (a) | 96% |

(a) BLASTP 1e-05
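
The N50 values reported for the assemblies summarise contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The Python sketch below illustrates that calculation and, separately, a sequencing-yield check. The function names and the toy contig lengths are illustrative assumptions and are not taken from the published pipeline. The yield helper assumes that read counts are given in millions of pairs and nucleotide totals in Gb, which is arithmetically consistent with the genome sequencing rows (for example, 101.7 x 2 x 150 bp gives 30.51).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def paired_end_yield_gb(read_pairs_millions, read_length_bp):
    """Nucleotide yield in Gb, assuming two reads of equal length per pair."""
    return read_pairs_millions * 1e6 * 2 * read_length_bp / 1e9


# Toy contig lengths (hypothetical, not from the macadamia assembly).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))  # prints 8

# Yield for the 480 bp insert library under the stated unit assumptions.
print(round(paired_end_yield_gb(101.7, 150), 2))
```

The transcriptome nucleotide totals are somewhat lower than pairs times read length would give, which would be expected if quality trimming shortened some reads.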
Fig. 1 Repeat content of the macadamia genome showing the relative proportions of the long terminal repeat (LTR), long and short interspersed (LINE, SINE), DNA element, simple, low-complexity and unclassified repeats identified using RepeatMasker and RepeatModeler
Fig. 2 Venn diagram showing the distribution of gene families (orthologous clusters) among six plant species: the early-diverging eudicots Macadamia integrifolia and Nelumbo nucifera, and the core eudicots Arabidopsis thaliana, Eucalyptus grandis, Populus trichocarpa and Vitis vinifera
Hypergeometric test for significantly enriched biological process gene ontology (GO) terms of macadamia-specific gene clusters compared to those identified among six eudicot species
| GO ID | Name | P-value | Macadamia-specific clusters | Macadamia-specific genes | Six species total clusters | Six species total genes |
|---|---|---|---|---|---|---|
| Plant defense | | | | | | |
| GO:0002764 | immune response-regulatory signaling | 1.06E-5 | 7 | 18 | 9 | 23 |
| GO:0016045 | detection of bacterium | 3.61E-4 | 8 | 22 | 16 | 53 |
| GO:0010359 | regulation of anion channel activity | 6.23E-4 | 8 | 24 | 17 | 59 |
| GO:0010204 | defense response signaling pathway | 0.00102 | 9 | 28 | 18 | 86 |
| Terpenoid synthesis | | | | | | |
| GO:0016114 | terpenoid biosynthetic process | 0.03620 | 6 | 25 | 16 | 102 |
| GO:0033383 | geranyl diphosphate metabolic process | 0.00367 | 3 | 10 | 3 | 10 |
| GO:0043693 | monoterpene biosynthetic process | 0.01299 | 3 | 14 | 4 | 44 |
| GO:0006200 | obsolete ATP catabolic process | 0.00653 | 4 | 9 | 5 | 10 |
| GO:0009820 | alkaloid metabolic process | 0.02675 | 5 | 12 | 11 | 55 |
| GO:0006075 | (1->3)-beta-D-glucan biosynthetic process | 0.03727 | 4 | 21 | 9 | 90 |
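
The enrichment table is based on a hypergeometric test, which asks how likely it is to see at least k annotated items in a sample of n when K of the N items in the background carry the annotation. The sketch below computes that upper-tail probability with the Python standard library only. The function name and the toy numbers are assumptions for illustration: the record does not give the background totals, the software used, or any multiple-testing correction, so the values in the table cannot be reproduced from this snippet.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: background items with the annotation,
    n: sample size, k: sampled items with the annotation.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    upper = min(K, n)
    # Sum the probability mass from k up to the largest possible count.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(N, n)


# Toy example (hypothetical numbers): 10 clusters in the background,
# 4 carry a given GO term, 3 clusters are sampled, at least 2 carry the term.
p = hypergeom_upper_tail(k=2, N=10, K=4, n=3)
print(p)  # (6*6 + 4*1) / 120 = 1/3
```

In practice, p-values across many GO terms would normally be adjusted for multiple testing before terms are reported as enriched.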
Fig. 3 Overview of the biosynthetic and catabolic pathway of cyanogenic glycosides in plants
Candidate genes for cyanogenesis in macadamia
| Macadamia gene model | FPKM | Similar to | BLASTP E-value |
|---|---|---|---|
| CYP79 | | | |
| Maca026950-RA | 872.93 | Phenylalanine N-monooxygenase (a) | 4.2E-177 |
| | | Tyrosine N-monooxygenase (b) | 4.1E-172 |
| | | CYP79D15, AC gene (c) | 7.7E-171 |
| CYP71 | | | |
| Maca027151-RA | 756.71 | CYP71B16 Cytochrome P450 (a) | 2.9E-083 |
| Maca024545-RA | 48.04 | CYP71B20 Cytochrome P450 (a) | 2.6E-126 |
| Maca026817-RA | 36.31 | CYP71B34 Cytochrome P450 (a) | 5.6E-120 |
| Maca030139-RA | 14.52 | CYP71A1 Cytochrome P450 (d) | 2.1E-100 |
| UGT85 | | | |
| Maca010817-RA | 29.91 | UGT85A2 UDP-glycosyltransferase (a) | 6.0E-176 |
| Maca026370-RA | 16.19 | UGT85A2 UDP-glycosyltransferase (a) | 1.3E-179 |
| Maca030319-RA | 9.96 | UGT85B1 Cyanohydrin glucosyltransferase (b) | 5.9E-119 |
| | | UGT85A2 UDP-glycosyltransferase (a) | 4.5E-177 |
| β-glucosidase | | | |
| Maca000104-RA | 248.96 | BGLU9 Beta-glucosidase (a) | 9.0E-144 |
| Maca007594-RA | 14.20 | Cyanogenic beta-glucosidase, LI gene (c) | 1.4E-111 |
| HNL | | | |
| Maca017028-RA | 4.63 | (R)-mandelonitrile lyase-like (a) | 6.5E-194 |
| | | (R)-mandelonitrile lyase, MDL1 gene (e) | 4.6E-074 |

(a) Arabidopsis thaliana; (b) Sorghum bicolor; (c) Trifolium repens; (d) Persea americana; (e) Prunus dulcis
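
Expression in the candidate gene table is reported as FPKM (fragments per kilobase of transcript per million mapped fragments), which normalises a raw fragment count by both transcript length and sequencing depth. The sketch below shows the standard definition only; the function name and the example counts are hypothetical, and the record does not state which tool or settings produced the published values.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library with 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))  # prints 25.0
```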