Tilman Schell, Barbara Feldmeyer, Hanno Schmidt, Bastian Greshake, Oliver Tills, Manuela Truebano, Simon D Rundle, Juraj Paule, Ingo Ebersberger, Markus Pfenninger.
Year: 2017 PMID: 28204581 PMCID: PMC5381561 DOI: 10.1093/gbe/evx032
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. The number of described species (Dunn and Ryan 2015; GIGA Community of Scientists 2014) and the fraction of sequenced genomes (http://www.ncbi.nlm.nih.gov/genome/browse/, accessed September 1, 2016). Animal phyla were obtained from Dunn et al. (2014). Only phyla with a genomic record are displayed. Note the logarithmic scaling.
Available Mollusc Genomes
| Species | Assembly Length/Estimated Genome Size = % Assembled | Number of Sequences/N50 (* = contigs) | Coverage (×)/Technology | Gap [%] | BUSCOs Present [%] | Number of Annotated Proteins |
|---|---|---|---|---|---|---|
| | 2.4 Gb/2.7 Gb = 89% | 151,674/475 kb | 92/Illumina | 15.1 | 73.8 | 23,994 |
| | 906 kb/1.7 Gb | * 1,057/855 bp | 3/Roche 454 | 0 | 0 | — |
| | 663 kb/? | * 778/849 bp | 3/Roche 454 | 0 | 0 | — |
| | 558 Mb/890 Mb | 7,659/402 kb | 100/Illumina | 11.8 | 82 | 45,406 |
| | 1.6 Gb/1.9 Gb | * 2,315,965/1,067 bp | 17/Illumina | 0 | 1.6 | — |
| | 360 Mb/421 Mb | 4,469/1,870 kb | 8.87/Sanger | 16.9 | 97.0 | 23,822 |
| | 579 Mb/1,460 Mb = 39.7% | 295,348/3,160 bp | 25.6/Illumina | 0.00062 | 16.6 | — |
| | 2,160 Mb/2,757 Mb = 78% | 1,126,156/2,681 bp | 28.5/Illumina | 0 | 44 | — |
| | 927 Mb/1,760 Mb | 4,332/918 kb | 66/Illumina | 20.4 | 94.1 | 27,591 |
| | 916 Mb/929 Mb | 331,401/48 kb | 27.5/Roche 454 | 1.9 | 89.1 | 36,675 |
| | 833 Mb/1,193 Mb | * 328,378/5.8 kb | | 0 | 88 | — |
Note.—An overview from column 2 can be found in supplementary figure 4, Supplementary Material online. Column 5: Fraction of N's in the assembly. Column 6: BUSCOs (Benchmarking Universal Single-Copy Orthologs): N_Metazoa = 843; Present = complete + fragmented.
References: Genome sizes are from the genome publications, if not cited separately.
(Albertin et al. 2015).
(Peñarrubia, Sanz, et al. 2015).
(Gregory 2003).
(Peñarrubia, Araguas, et al. 2015).
(Zhang et al. 2012).
(González-Tizón et al. 2000).
(Nguyen et al. 2014).
(Rodríguez-Juíz et al. 1996).
(Simakov et al. 2013).
(Hinegardner 1974).
(Kenny et al. 2015).
(Barghi et al. 2016).
(Moroz et al.) GCF_000002075.1.
(Lasek & Dower 2013).
(Matty Knight, Coen M. Adema, Nithya Raghavan, Eric S. Loker) GCF_000457365.1.
(unpublished—Ashworth Laboratories 2016) GCA_900036025.1.
(Vinogradov 1998).
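The "% Assembled" value in column 2 of the table above is given for only some rows. As a minimal sketch, assuming it is simply the assembly length divided by the estimated genome size, the figure can be recomputed as follows. The function name `percent_assembled` is hypothetical, and the input pairs are copied from table rows and converted to Mb.

```python
def percent_assembled(assembly_mb: float, genome_mb: float) -> float:
    """Return the assembly length as a percentage of the estimated genome size."""
    if genome_mb <= 0:
        raise ValueError("estimated genome size must be positive")
    return 100.0 * assembly_mb / genome_mb


# (assembly length, estimated genome size) in Mb, copied from rows of the table.
rows = [(2400, 2700), (579, 1460), (2160, 2757)]

for assembly_mb, genome_mb in rows:
    pct = percent_assembled(assembly_mb, genome_mb)
    print(f"{assembly_mb} Mb / {genome_mb} Mb = {pct:.1f}%")
```

The three pairs reproduce the rounded values printed in the table (89%, 39.7%, and 78%). The row with an unknown genome size ("?") cannot be evaluated this way.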
Fig. 2. Photograph of Radix auricularia. Picture by Markus Pfenninger.
Characteristics of the Radix auricularia Genome and Draft Assembly
| Parameter | Value |
|---|---|
| Haploid chromosome number | 17 |
| Estimated genome length | 1.51 Gb (Vinogradov 1998) |
| Estimated genome length, flow cytometry | 1.58 Gb ± 21.5 Mb (this study) |
| Estimated genome length, from sequencing coverage | 1.60 Gb |
| Total assembly length | 0.91 Gb (single-copy or high-complexity regions) |
| Number of scaffolds | 4,823 |
| N50 | 578,730 bp |
| Gaps | 6.4% N |
| Coverage | 72× |
| Estimation of gene completeness | 94.6% of BUSCO genes present |
| Gene prediction | 17,338 genes |
| Gene space (UTR, Exons, Introns etc.) | 200.6 Mb = 21.9% of assembly |
| Gene length (median) | 8.0 kb |
| Gene fragmentation | 147,195 exons |
| Exon space | 25.3 Mb = 2.8% of assembly (1.6% of total genome) |
| Exon length (median) | 125 bp |
| Protein length (median) | 332 AA |
Note.—BUSCOs (Benchmarking Universal Single-Copy Orthologs): N_Metazoa = 843; Present = complete + fragmented.
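Both tables report an N50 value without defining it. Under the usual definition, N50 is the length L such that sequences of length L or longer together contain at least half of all assembled bases. The sketch below illustrates that definition only. The function name `n50` and the toy scaffold lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up scaffold lengths (in bp), not taken from the paper.
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_scaffolds))
```

For the toy list, the total is 54 bp, and the three longest sequences (10 + 9 + 8 = 27 bp) reach half of it, so the N50 is 8.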
Fig. 3. Re-mapping statistics. For details, see supplementary note 2, Supplementary Material online. (A) Coverage distribution per position. The peak is located at a coverage of 72×. The x-axis is given in log-scale. (B) Insert size distributions for the three mate pair libraries with insert sizes of 2, 5, and 10 kb. The high fraction of mate pairs with insert sizes close to 0 is due to the repetitive nature of the Radix genome (supplementary table 3, Supplementary Material online). In particular, repeat stretches that are not properly resolved in the genome assembly interfere with the proper placement of reads.
Fig. 4. Collapsed repeats. (A) Coverage of continuous unambiguous sequence parts of the scaffolds. Outliers from boxplots are not shown. The red line represents the most frequent coverage of 72× (fig. 3A). (B) Positions annotated as repeats along continuous unambiguous sequence parts of the scaffolds.
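The idea behind the collapsed-repeat caption can be illustrated with a common heuristic: a region whose read coverage is a multiple of the modal coverage (72× here) probably represents several genomic copies merged into one assembled sequence. The sketch below is an assumption-laden illustration, not the authors' procedure. The function names and the example region are hypothetical; only the modal coverage of 72× comes from the captions.

```python
MODAL_COVERAGE = 72.0  # most frequent coverage, as reported in the figure captions


def estimated_copies(region_coverage, modal_coverage=MODAL_COVERAGE):
    """Estimate how many genomic copies are collapsed into one assembled region.

    Assumes coverage scales linearly with copy number; never returns less than 1.
    """
    if modal_coverage <= 0:
        raise ValueError("modal coverage must be positive")
    return max(1, round(region_coverage / modal_coverage))


def hidden_length(region_length, region_coverage, modal_coverage=MODAL_COVERAGE):
    """Estimate the bases absent from the assembly because copies were collapsed."""
    extra_copies = estimated_copies(region_coverage, modal_coverage) - 1
    return region_length * extra_copies


# Hypothetical 1,000 bp region covered at 216x, i.e. three times the modal coverage.
print(estimated_copies(216))     # copies represented by the region
print(hidden_length(1000, 216))  # bases estimated to be missing from the assembly
```

Summing such hidden lengths over all high-coverage regions gives a rough idea of how much repetitive sequence could separate the assembly length from the estimated genome size. The linear scaling assumption ignores mapping biases, so the result is at best an approximation.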