Maria Kyriakidou, Helen H Tai, Noelle L Anglin, David Ellis, Martina V Strömvik.
Abstract
Polyploidy, or duplication of an entire genome, occurs in the majority of angiosperms. Understanding polyploid genomes is important for improving the crops that humans rely on for sustenance and basic nutrition. As climate change continues to pose a potential threat to agricultural production, there will be increasing demand for plant cultivars that can resist biotic and abiotic stresses and also provide needed and improved nutrition. In the past decade, Next Generation Sequencing (NGS) has fundamentally changed the genomics landscape by providing tools for the exploration of polyploid genomes. Here, we review the challenges of assembling polyploid plant genomes, and also present recent advances in genomic resources and functional tools in molecular genetics and breeding. As genomes of diploid and less heterozygous progenitor species are increasingly available, we discuss the lack of complexity of these currently available reference genomes as they relate to polyploid crops. Finally, we review recent approaches to haplotyping by phasing and the impact of third generation technologies on polyploid plant genome assembly.
Keywords: genome assembly; plant genomics; polyploidy; reference genome; third generation sequencing
Year: 2018 PMID: 30519250 PMCID: PMC6258962 DOI: 10.3389/fpls.2018.01660
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
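Sequenced plant polyploid genomes through May 2018.
| No. | Size (Mb) | Assembly level | Release date | Ploidy | Reference or submitter | |
|---|---|---|---|---|---|---|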
| 1 | 206.823 | Scaffold | 2009-11-30 | Tetraploid | Hu et al., | |
| 2 | 978.972 | Chromosome | 2010-01-05 | Allotetraploid | Schmutz et al., | |
| 3 | 15344.7 | Chromosome 3B | 2010-07-15 | Allohexaploid | Choulet et al., | |
| 4 | 705.934 | Scaffold | 2011-05-24 | Autotetraploid | Potato Genome Sequencing Consortium, | |
| 5 | 604.217 | Contig | 2013-09-16 | Tetraploid | Huang et al., | |
| 6 | 214.356 | Scaffold | 2013-11-27 | Tetraploid | Hirakawa et al., | |
| 7 | 697.762 | Scaffold | 2013-11-27 | Allooctaploid | Hirakawa et al., | |
| 8 | 566.55 | Chromosome | 2013-12-18 | 2n, 4n (Beyaz et al., | Dohm et al., | |
| 9 | 45.1659 | Chromosome | 2014-04-16 | Tetraploid | Oryza Chr3 Short Arm Comparative Sequencing Project | |
| 10 | 641.356 | Chromosome | 2014-04-17 | Hexaploid | Kagale et al., | |
| 11 | 976.191 | Chromosome | 2014-05-05 | Allotetraploid | Chalhoub et al., | |
| 12 | 488.954 | Chromosome | 2014-05-22 | Hexaploid | NCBI | |
| 13 | 3643.47 | Scaffold | 2014-05-29 | Allotetraploid | Sierro et al., | |
| 14 | 607.318 | Scaffold | 2015-04-08 | Allotetraploid | Cannarozzi et al., | |
| 15 | 2189.14 | Chromosome | 2015-04-29 | Allotetraploid | Li F. et al., | |
| 16 | 334.384 | Scaffold | 2016-03-15 | Tetraploid | Tanaka et al., | |
| 17 | 563.439 | Scaffold | 2016-03-15 | Allotetraploid | Tanaka et al., | |
| 18 | 397.01 | Scaffold | 2016-03-15 | Allotetraploid | Tanaka et al., | |
| 19 | 455.349 | Scaffold | 2016-05-21 | 2n, 3n hybrids (Wu et al., | South China Botanic Garden, CAS | |
| 20 | 711.72 | Scaffold | 2016-06-13 | Tetraploid | BIO-FD & C CO., LTD | |
| 21 | 1333.55 | Scaffold | 2016-07-11 | Tetraploid | Jarvis et al., | |
| 22 | 954.861 | Chromosome | 2016-07-19 | Allotetraploid | Zhejiang University | |
| 23 | 1748.25 | Scaffold | 2016-07-29 | 2n, 3n, 4n (Van Huylenbroeck et al., | Korea Research Institute of Science and Biotechnology (Kim et al., | |
| 24 | 2566.74 | Scaffold | 2016-10-28 | Tetraploid | Huazhong Agricultural University | |
| 25 | 285.614 | Scaffold | 2016-12-27 | 2n to 6n (Kausar et al., | Urasaki et al., | |
| 26 | 263.788 | Scaffold | 2016-12-30 | Tetraploid (Rothfels and Heimburger, | Butts et al., | |
| 27 | 268.431 | Scaffold | 2017-01-29 | Tetraploid | Lomonosov Moscow State University | |
| 28 | 1169.95 | Contig | 2017-03-03 | It varies (D'Hont, | Riaño-Pachón and Mattiello, | |
| 29 | 295.462 | Scaffold | 2017-03-31 | Hexaploid | Costa et al., | |
| 30 | 10495 | Chromosome | 2017-05-18 | Tetraploid | WEWseq consortium | |
| 31 | 100.689 | Chromosome | 2017-05-31 | 16-ploid | Lan et al., | |
| 32 | 1195.99 | Scaffold | 2017-06-08 | Allotetraploid | Hittalmani et al., | |
| 33 | 456.675 | Chromosome | 2017-07-28 | Tetraploid | Iwate Biotechnology Research Center | |
| 34 | 837.013 | Contig | 2017-08-26 | Autohexaploid | Yang et al., | |
| 35 | 1486.61 | Scaffold | 2017-10-23 | Hexaploid | Zhejiang University | |
| 36 | 629.656 | Scaffold | 2017-10-31 | Autotetraploid | Zhou et al., | |
| 37 | 1141.15 | Chromosome | 2017-11-01 | 2n, 4n, 6n (Besnard et al., | Unver et al., | |
| 38 | 2197.49 | Contig | 2018-01-03 | Hexaploid | Institute of Bioengineering, RAS | |
| 39 | 839.915 | Scaffold | 2018-01-19 | Autotetraploid | Sichuan Agricultural University | |
| 40 | 848.309 | Scaffold | 2018-01-23 | Allotetraploid | China Agricultural University | |
| 41 | 1124.89 | Scaffold | 2018-02-06 | Hexaploid | USDA-ARS | |
| 42 | 220.961 | Scaffold | 2018-02-12 | 2n, 4n etc (Xin-Hua et al., | Center for Cellular and Molecular Platforms | |
| 43 | 67.3266 | Contig | 2018-02-26 | Hexaploid | The Sainsbury Laboratory | |
| 44 | 850.677 | Chromosome | 2018-04-09 | Tetraploid | Shanghai Center for Plant Stress Biology | |
| 45 | 2618.65 | Chromosome | 2018-04-23 | Tetraploid | Henan Agricultural University | |
| 46 | 2538.28 | Chromosome | 2018-05-02 | Allotetraploid | International Peanut Genome Initiative | |
| 47 | 1792.86 | Scaffold | 2018-05-08 | Tetraploid | Shen et al., |
The release date refers to the first release of the genomes in NCBI, before any improvement of the assemblies. Some have been updated after this date.
Third generation sequencing platforms.
| Platform | Read type and length | Limitations | |
|---|---|---|---|
| PacBio | Single molecule long-reads, average length ~ 10–18 Kb | False insertions in the raw reads, high error rate. Error correction algorithms are required | |
| Oxford Nanopore | Single molecule long-reads, average length ~ 10 Kb, max 100 Kb | Raw reads with false deletions and homopolymer errors. Requirement for error correction algorithms | |
| Illumina Synthetic Long Reads | Synthetic long-reads derived from the short sequencing reads, average length ~ 100 Kb | High rate of false indels (insertions, deletions). They require good trimming and correction algorithms | |
| 10X Genomics | Linked reads derived from short-read sequences, average length ~ 100 Kb | Needs designed algorithms and aligners, poor resolution of locally repetitive sequences. Sparse sequencing | |
| BioNano Genomics | Optical mapping of long, fluorescently labeled DNA fragments, average length ~ 250 Kb | Not many algorithms available for a reliable alignment between the optical map and the genome assembly | |
| Hi-C | Pairs of short reads with an average length ~ 100 bp, method originally developed to study the 3D folding of the genome | Scattered sequencing with variable genomic distance between pairs |
10X Genomics is very similar to Illumina's SLR, with the difference that 10X Genomics can process more and larger fragments, and the assembly of the different fragments does not necessarily depend on the sequencing coverage. Illumina's SLR system synthesizes the sequence of each DNA fragment, in contrast to 10X Genomics, where the reads show only a part of each DNA fragment. NA, not applicable.
Figure 1. Approaches for reference-based genome assembly. (A) Shorter-read guided assembly. In this method, shorter reads are aligned against the reference genome, a consensus assembly is generated, and structural variations are detected. It can also be used to detect contamination in the sequenced reads. This approach is used when genomes are re-sequenced to detect polymorphisms in individuals. (B) Guided de novo genome assembly of shorter reads. Previously de novo assembled shorter reads are aligned against the reference or a closely related genome to extend the existing contigs. (C) Longer-read guided assembly. Longer reads are aligned against the reference genome, a consensus genome assembly is constructed, and structural variations are detected. (D) Guided de novo genome assembly of longer reads. Longer reads are de novo assembled into contigs, which are aligned against the reference or a closely related genome to be extended.
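The read-guided approach in panels (A) and (C) can be illustrated with a toy sketch. It assumes the reads have already been aligned without gaps, so each read is supplied as a hypothetical `(start, sequence)` pair. The function name `reference_guided_consensus` is illustrative and does not come from any assembler, and only substitutions are reported, not structural variations.

```python
from collections import Counter


def reference_guided_consensus(reference, aligned_reads):
    """Build a majority-vote consensus from gap-free aligned reads.

    aligned_reads is a list of (start, sequence) pairs with 0-based start
    positions on the reference (an assumption: alignment is done upstream).
    Returns the consensus string and a list of (position, ref_base, alt_base).
    """
    # One base counter per reference position.
    columns = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    consensus = []
    variants = []
    for pos, ref_base in enumerate(reference):
        counts = columns[pos]
        if not counts:
            # No read covers this position: fall back to the reference base.
            consensus.append(ref_base)
            continue
        base, _ = counts.most_common(1)[0]
        consensus.append(base)
        if base != ref_base:
            variants.append((pos, ref_base, base))
    return "".join(consensus), variants


if __name__ == "__main__":
    ref = "ACGTACGT"
    reads = [(0, "ACGA"), (2, "GAAC"), (3, "AACG")]
    print(reference_guided_consensus(ref, reads))
```

In a polyploid genome a single majority vote collapses the haplotypes into one sequence, which is one reason the phasing approaches mentioned in the abstract are needed.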
Figure 2. Approaches for de novo genome assembly. (A) Short read assembly. Genome assembly using only shorter reads and any assembly tool to construct contiguous sequences/contigs. (B) Longer reads assembly. Contig (red) assembly using longer reads (long, linked reads, optical maps) followed by scaffold assembly and gap filling. (C) Hybrid genome assembly. In this method, shorter reads can be assembled into contigs and the longer reads can be used for error correction (errors represented by Xs), then the corrected contigs can be assembled into scaffolds and the gaps filled. (D) Hybrid genome assembly using pre-assembled contigs. Longer reads are aligned against de novo pre-assembled contigs from shorter reads, followed by contig extension.
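As a rough illustration of the contig construction step in panel (A), the sketch below merges reads greedily by their longest suffix-prefix overlap. This is a simplified stand-in and not the algorithm of any particular assembly tool. The function names and the `min_overlap` parameter are assumptions made for the example, and it ignores sequencing errors, reverse complements, and repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (>= min_len), else 0."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the largest overlap.

    Returns the list of contigs left when no pair overlaps by at least
    min_overlap bases.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_overlap)
                if k > best_len:
                    best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, writing the shared overlap only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"]))
```

Greedy merging of this kind tends to join reads from different homologous or homeologous copies when they share long identical stretches, which hints at why polyploid genomes are harder to assemble than diploid ones.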
Host databases of various plant genetic and genomic resources.
| Database | Data type | Species coverage | |
|---|---|---|---|
| Genbank | Genomic | Various plant species | |
| EMBL | Genomic | Various plant species | |
| DDBJ | Genomic | Various plant species | |
| UniProt | Protein and functional | Various plant species | |
| NCBI | Genomic | Various plant species | |
| GOLD | Genomic, metagenomics, transcriptomic | Various plant species | |
| Phytozome | Genomic | 92 assembled and annotated plant species | |
| Plantgdb | Genomic, transcriptomic | 27 assembled and annotated plant species | |
| Sol | Genomic | 11 | |
| Gramene | Genomic, genetic markers, QTLs | 53 plant species | |
| MaizeGCB | Genomic, annotations, tool host | ||
| Tair | Genetic and molecular biology data | ||
| CottonGEN | Genomic, Genetic and breeding resources | 49 | |
| PLEXdb | Gene expression | 14 plant species | |
| RicePro | Gene expression | ||
| CerealsDB | Genetic markers | ||
| PeanutBase | Genome, MAS, QTLs, Germplasm | ||
| SoyKb | Genetic markers, genomic resources | ||
| SoyBase | Genetic markers, QTLs, genomic resources | ||
| PGDBj | Genetic markers, QTLs, genomic resources | 80 plant species | |
| SNP-Seek | Genotype, Phenotype and Variety information | ||
| GrainGenes | Genome, Genetic markers, QTLs, genomic resources | ||
| ASRP | small RNA | ||
| CSRDB | small RNA | ||
| BrassicaInfo | Genomic | 7 | |
| BRAD | Genomics, Genetic Markers and Maps | ||
| Ensembl Plants | Genomic | 45 plant species | |
| Ipomoea Genome Hub | Genomic, EST | ||
| PGSC | Genomic, annotation | ||
| GDR | Genomics, Genetics, breeding | ||
| HWG | Genomics, Transcriptomics, Genetic Markers | Forest trees and woody plants |