SEQanswers: an open access community for collaboratively decoding genomes.

Jing-Woei Li, Robert Schmieder, R Matthew Ward, Joann Delenick, Eric C Olivares, David Mittelman.

Abstract

SUMMARY: The affordability of high-throughput sequencing has created an unprecedented surge in the use of genomic data in basic, translational and clinical research. The rapid evolution of sequencing technology, coupled with its broad adoption across biology and medicine, necessitates fast, collaborative interdisciplinary discussion. SEQanswers provides a real-time knowledge-sharing resource to address this need, covering experimental and computational aspects of sequencing and sequence analysis. Developers of popular analysis tools are among the >4000 active members, and ~40 peer-reviewed publications have referenced SEQanswers. AVAILABILITY: The SEQanswers community is freely accessible at http://SEQanswers.com/

Year: 2012 PMID: 22419780 PMCID: PMC3338018 DOI: 10.1093/bioinformatics/bts128

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

The Human Genome Project represents one of the greatest concerted achievements of the life sciences. This massive global effort jump-started the genomics era and enabled more ambitious and collaborative projects such as the Cancer Genome Atlas (Cancer Genome Atlas Research Network, 2008), the 1000 Genomes Project (1000 Genomes Project Consortium, 2010) and the Human Microbiome Project (The NIH HMP Working Group). These large population-scale studies, powered by high-throughput sequencing (HTS) technologies, have generated massive amounts of genomic data with the potential to revolutionize genetics and medicine. The translation of these data to actionable medicine, however, is complicated by the challenges of extracting meaningful information from HTS data (Mardis, 2010). The challenge is not purely computational, as bioinformatics is bound by the experimental methods employed to produce genomic data (Alkan et al.). A successful experiment minimizes false positives and depends on the optimization of an entire pipeline, from sample preparation to computational analysis. As HTS begins to transform nearly all aspects of biological and medical science, more labs will incorporate the production and analysis of genomic data into their studies. However, these experimental and computational methods are evolving at an incredible pace, and it is increasingly challenging for smaller research groups outside of major genome centers to stay current. Real-time, interdisciplinary collaboration helps large genome centers optimize analysis pipelines and methods, and allows smaller groups to exploit them, even if they do not have the resources to facilitate the initial development.

2 THE SEQANSWERS COMMUNITY

SEQanswers was launched in 2007 as an open forum to enable scientists across disciplines to collaboratively advance genomics and, particularly, HTS technologies. To date, there are >4000 active users visiting the online community each month. There is a rapidly growing number of discussion threads (currently >10 000) that span topics from sequencing platforms and experimental design to data analysis and biological interpretation (Fig. 1A). The SEQanswers community is truly global (Supplementary Fig. S1) and includes members from major genome centers and individual groups, as well as key developers of popular data analysis tools and methods. The community currently hosts >300 new questions and 1800 new responses per month. This high rate of participation has led to rapid responses to questions, shortening initial response time from a week in early 2008 to less than a day in 2011 (Fig. 1B). Collaborative and transparent discussion on SEQanswers has triggered the development of new experimental techniques, data analysis methods and pipelines, as well as collaborative assessment of analysis standards (Supplementary Table S1). This innovation is captured in part by >30 peer-reviewed publications that cite SEQanswers so far (http://seqanswers.com/wiki/Papers_Referencing_SEQanswers).

Fig. 1.

SEQanswers is an active and fast-growing community. (A) Monthly contributions to SEQanswers measured by the number of new posts (blue points/line) and discussions (orange points/line). Discussion counts include threads with at least two posts and exclude those with no answers. Also excluded are automated publication announcements. (B) The average response time to a new forum thread.

SEQanswers is not the only online resource for knowledge sharing and collaboration: major sequencing technology companies have platform-centric user communities, but these are often restricted to customers and exclude the greater scientific community. In contrast, BioStar (Parnell et al.), an open, community-driven bioinformatics resource, currently hosts >2300 threads, which far exceeds the sum of discussions found on communities operated by sequencing companies (Supplementary Table S2). BioStar's principal feature is to enable researchers to ask questions and obtain brief answers, ranked by community vote, to bioinformatics-related problems. BioStar's success can be attributed to its well-defined scope and focus on a simple question and answer format for bioinformatics. However, this format precludes other forms of collaboration, discussion and debate. SEQanswers differs from BioStar both in format and scope. In an almost complementary capacity, SEQanswers eschews the Q&A format in favor of a more traditional forum format to facilitate collective discussion of technologies, methods and standards of practice. The traditional forum format emphasizes the chronology and evolution of collective thought, rather than focusing on identifying a single, best answer. The scope of SEQanswers also differs from BioStar's exclusive bioinformatics focus: it includes all aspects of genomics, both experimental and computational. Finally, in recognition of the sometimes lengthy and tediously detailed threads that can emerge from sequential discussion, we have developed a manually curated database, SEQwiki (Li et al.), that consists of frequently asked questions, analysis methods, tutorials and sequencing service providers.

3 CONCLUSION

The massive amounts of data and rapid pace of genome technology development necessitate innovations in scientific communication. The current standard for scientific communication between disparate research groups focuses on peer-reviewed research published in traditional scientific journals. These journals have evolved for the Internet age, especially with the new emphasis on open access and fast publishing from both new, exclusively open access journals and traditional journals that have created new outlets for open access publication. While scientific journals will continue to have important roles as curators of research and referees for the peer-review process, there is an opportunity for open, internet-based platforms to supplement traditional journals by enabling the rapid exchange of results, techniques and data, the latter two being crucial for advancing research, but notoriously difficult to access. SEQanswers was designed to address this need for genomics. It has since developed into a thriving community that offers a wealth of information, including discussions that have facilitated the construction of analysis pipelines and consensus on standards in the genomics community.

7 in total

1. A map of human genome variation from population-scale sequencing.

Authors: Gonçalo R Abecasis; David Altshuler; Adam Auton; Lisa D Brooks; Richard M Durbin; Richard A Gibbs; Matt E Hurles; Gil A McVean
Journal: Nature Date: 2010-10-28 Impact factor: 49.962

2. The NIH Human Microbiome Project.

Authors: Jane Peterson; Susan Garges; Maria Giovanni; Pamela McInnes; Lu Wang; Jeffery A Schloss; Vivien Bonazzi; Jean E McEwen; Kris A Wetterstrand; Carolyn Deal; Carl C Baker; Valentina Di Francesco; T Kevin Howcroft; Robert W Karp; R Dwayne Lunsford; Christopher R Wellington; Tsegahiwot Belachew; Michael Wright; Christina Giblin; Hagit David; Melody Mills; Rachelle Salomon; Christopher Mullins; Beena Akolkar; Lisa Begg; Cindy Davis; Lindsey Grandison; Michael Humble; Jag Khalsa; A Roger Little; Hannah Peavy; Carol Pontzer; Matthew Portnoy; Michael H Sayre; Pamela Starke-Reed; Samir Zakhari; Jennifer Read; Bracie Watson; Mark Guyer
Journal: Genome Res Date: 2009-10-09 Impact factor: 9.043

3. Limitations of next-generation genome sequence assembly.

Authors: Can Alkan; Saba Sajjadian; Evan E Eichler
Journal: Nat Methods Date: 2010-11-21 Impact factor: 28.547

4. The $1,000 genome, the $100,000 analysis?

Authors: Elaine R Mardis
Journal: Genome Med Date: 2010-11-26 Impact factor: 11.117

5. The SEQanswers wiki: a wiki database of tools for high-throughput sequencing analysis.

Authors: Jing-Woei Li; Keith Robison; Marcel Martin; Andreas Sjödin; Björn Usadel; Matthew Young; Eric C Olivares; Dan M Bolser
Journal: Nucleic Acids Res Date: 2011-11-15 Impact factor: 16.971

6. BioStar: an online question & answer resource for the bioinformatics community.

Authors: Laurence D Parnell; Pierre Lindenbaum; Khader Shameer; Giovanni Marco Dall'Olio; Daniel C Swan; Lars Juhl Jensen; Simon J Cockell; Brent S Pedersen; Mary E Mangan; Christopher A Miller; Istvan Albert
Journal: PLoS Comput Biol Date: 2011-10-27 Impact factor: 4.475

7. Comprehensive genomic characterization defines human glioblastoma genes and core pathways.

Authors: Cancer Genome Atlas Research Network
Journal: Nature Date: 2008-09-04 Impact factor: 49.962
