Alexandre G de Brevern, Jean-Philippe Meyniel, Cécile Fairhead, Cécile Neuvéglise, Alain Malpertuy.
Abstract
Sequencing the human genome began in 1994, and 10 years of work were necessary to provide a nearly complete sequence. Nowadays, NGS technologies allow a whole human genome to be sequenced in a few days. This deluge of data challenges scientists in many ways: they face data management issues as well as analysis and visualization difficulties caused by the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way data are managed and analysed. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from the web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to a loss of interactivity with data during analysis because of long processing times, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data. We illustrate this point with a focus on the Amadea platform. Finally, we discuss the visualization challenges posed by Big Data and present the latest innovations in JavaScript graphics libraries.
Year: 2015 PMID: 26125026 PMCID: PMC4466500 DOI: 10.1155/2015/904541
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Human genome sequencing costs. Evolution of the costs from mid-2001 to the present; the important technologies are indicated.
Commonly used scientific Workflow Management software.
| Platform name | Initial creator | License | Bioinfo | Website |
|---|---|---|---|---|
| Galaxy | Emory University (USA) and Penn State University (USA) | Free ($) | +++ | |
| KDE | Inforsense (UK) | Commercial | ++ | |
| Kepler | UC Davis, UC San Diego, and UC Santa Barbara (USA) | Free | +/− | |
| Knime | University of Konstanz (Germany) | Free and commercial | ++ | |
| Pipeline Pilot | Accelrys (USA) | Commercial | +++ | |
| Taverna workbench | EBI (UK) | Free | +++ | |
| VIBE | Incogen (USA) | Commercial | ++ | |
The “Bioinfo” column shows how widely each tool is used in bioinformatics: “+++” indicates that the software is used by many teams, “++” that it is starting to be used, “+/−” that very few teams use it in this field, and “−” that it has not yet been used in this field. The “$” sign indicates that additional fees are required for access to third-party services such as cloud platforms.
Figure 2. Amadea software interface. Users create a workflow by dragging and dropping components onto the “working area.”
Figure 3. Real-time access to all intermediate results in Amadea. By clicking, the user has instantaneous access to (1) the output from the data source “chip array data,” (2) the output from the “Entrez Gene” database, and (3) the results from the “Get Gene Info” component.
List of free tools for NGS data visualization or including visualization components.
| Software | Authors | Remarks | Project website |
|---|---|---|---|
| Artemis | Carver et al. | Software for integrated visualization and computational analysis | |
| CisGenome Browser | Jiang et al. | Genomic data visualization | |
| Girafe | Toedling et al. | Visualization of genome intervals with aligned reads. Required software: R/Bioconductor | |
| IGV (Integrative Genomics Viewer) | Robinson et al. | Genome browser and interactive exploration of large, integrated genomic datasets | |
| JBrowse | Westesson et al. | Web-based genome browser | |
| MagicViewer | Hou et al. | Assembly visualization and genetic variation annotation tool | |
| NGSView | Arner et al. | Sequence alignment editor | |
| ngs.plot | Shen et al. | Mining and visualization of NGS data | |
| Savant | Fiume et al. | Software for sequence annotation, visualization, and analysis | |
| seqMonk | Babraham Bioinformatics | Genome browser | |
| Tablet | Milne et al. | Graphical viewer for next-generation sequence assemblies and alignments | |
| TGNet | Riba-Grognuz et al. | Method to evaluate genome scaffolding. Required software: Blat | |