Valentine Murigneux [1,2], Subash Kumar Rai [1,2], Agnelo Furtado [3], Timothy J C Bruxner [2], Wei Tian [4,5], Ivon Harliwong [4,5], Hanmin Wei [4,6], Bicheng Yang [4,5], Qianyu Ye [4,5], Ellis Anderson [6,7], Qing Mao [6,7], Radoje Drmanac [4,6,7], Ou Wang [4], Brock A Peters [4,6,7], Mengyang Xu [4,8], Pei Wu [4,9], Bruce Topp [3], Lachlan J M Coin [1,2,10], Robert J Henry [3].

1. Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia.
2. Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia.
3. Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia.
4. BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China.
5. BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia.
6. MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China.
7. Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA.
8. BGI-Qingdao, Building 2, No. 2 Hengyunshan Road, Qingdao 266555, China.
9. BGI-Tianjin, Airport Business Park, Building E3, Airport Economics Area, Tianjin 300308, China.
10. Department of Microbiology and Immunology, University of Melbourne at The Peter Doherty Institute for Infection and Immunity, 792 Elizabeth Street, Melbourne, VIC 3004, Australia.
Abstract
BACKGROUND: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample.

RESULTS: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements.

CONCLUSIONS: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.