To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics.

R A Leo Elworth¹, Qi Wang², Pavan K Kota³, C J Barberan⁴, Benjamin Coleman⁴, Advait Balaji¹, Gaurav Gupta⁴, Richard G Baraniuk⁴, Anshumali Shrivastava¹,⁴, Todd J Treangen¹,².

Abstract

As computational biologists continue to be inundated by ever-increasing amounts of metagenomic data, the need for data analysis approaches that keep up with the pace of sequence archives has remained a challenge. In recent years, the accelerated pace of genomic data availability has been accompanied by the application of a wide array of highly efficient approaches from other fields to the field of metagenomics. For instance, sketching algorithms such as MinHash have seen rapid and widespread adoption. These techniques handle increasingly large datasets with minimal sacrifices in quality for tasks such as sequence similarity calculations. Here, we briefly review the fundamentals of the most impactful probabilistic and signal processing algorithms. We also highlight more recent advances to augment previous reviews in these areas that have taken a broader approach. We then explore the application of these techniques to metagenomics, discuss their pros and cons, and speculate on their future directions.
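
As a rough illustration of the MinHash sketching idea mentioned in the abstract, the following is a minimal bottom-s sketch in Python. It is not taken from the paper or from any particular tool: the function names, the choice of BLAKE2b as the hash, and the default values of `k` and `s` are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=4, s=16):
    """Bottom-s MinHash sketch: the s smallest hash values over the k-mer set."""
    return sorted(hash64(x) for x in kmers(seq, k))[:s]


def jaccard_estimate(sk_a, sk_b, s=16):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches.

    Take the s smallest hashes of the merged sketches and count how many
    of them occur in both sketches.
    """
    merged = sorted(set(sk_a) | set(sk_b))[:s]
    if not merged:
        return 0.0
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)


a = sketch("ACGTACGTGG")
b = sketch("ACGTACGTCC")
print(jaccard_estimate(a, b))
```

Because each toy sequence here has fewer than `s` distinct k-mers, the sketches hold every hash and the estimate coincides with the exact Jaccard similarity. On real data the sketch is far smaller than the k-mer set, and the estimate is approximate.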

Year: 2020 PMID: 32338745 PMCID: PMC7261164 DOI: 10.1093/nar/gkaa265

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

