Pablo H C G de Sá, Fábio Miranda, Adonney Veras, Diego Magalhães de Melo, Siomar Soares, Kenny Pinheiro, Luis Guimarães, Vasco Azevedo, Artur Silva, Rommel T J Ramos.
Abstract
The advent of NGS (Next Generation Sequencing) technologies has resulted in an exponential increase in the number of complete genomes available in biological databases. This advance has allowed the development of several computational tools for analyzing large amounts of data at each step of the process, from processing and quality filtering to gap filling and manual curation. The tools developed for gap closure are very useful because they yield more complete genomes, which influences downstream analyses of genomic plasticity and comparative genomics. However, gap filling remains a challenge for genome assembly and often requires manual intervention. Here, we present GapBlaster, a graphical application to evaluate and close gaps. GapBlaster was developed in the Java programming language. The software aligns the contigs obtained in the genome assembly against a draft genome or scaffold, using BLAST or MUMmer, to close gaps. All identified alignments of contigs that extend through the gaps in the draft sequence are then presented to the user for further evaluation via the GapBlaster graphical interface. GapBlaster presents significant results compared to other similar software and has the advantage of offering a graphical interface for manual curation of the gaps. The GapBlaster program, the user guide and the test datasets are freely available at https://sourceforge.net/projects/gapblaster2015/. It requires Sun JDK 8 and BLAST or MUMmer.
Year: 2016 PMID: 27171416 PMCID: PMC4865197 DOI: 10.1371/journal.pone.0155327
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
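The abstract describes presenting the user with contigs whose alignments extend through a gap in the draft sequence. The following is a minimal sketch of that selection step, written in Python rather than GapBlaster's Java. It assumes the alignment hits have already been parsed from BLAST or MUMmer output into scaffold coordinates. The `Hit` record, the `flank` tolerance and the rule that a contig must align on both sides of a gap are illustrative assumptions, not GapBlaster's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record: one contig alignment on the scaffold."""
    contig: str
    start: int  # 0-based scaffold coordinate, inclusive
    end: int    # 0-based scaffold coordinate, exclusive


def spanning_contigs(gaps, hits, flank=50):
    """Map each gap (start, end) to the contigs aligned on both of its flanks.

    Assumption: a contig is a candidate for a gap if one of its hits ends
    within `flank` bases before the gap and another of its hits starts
    within `flank` bases after the gap.
    """
    candidates = {}
    for gap_start, gap_end in gaps:
        # Contigs with an alignment ending at or just before the gap.
        left = {h.contig for h in hits
                if gap_start - flank <= h.end <= gap_start}
        # Contigs with an alignment starting at or just after the gap.
        right = {h.contig for h in hits
                 if gap_end <= h.start <= gap_end + flank}
        candidates[(gap_start, gap_end)] = sorted(left & right)
    return candidates


if __name__ == "__main__":
    gaps = [(20, 30)]
    hits = [Hit("c1", 0, 20), Hit("c1", 30, 50),
            Hit("c2", 0, 20), Hit("c3", 35, 60)]
    print(spanning_contigs(gaps, hits, flank=10))
```

In this toy input only `c1` aligns on both flanks of the gap, so it is the only candidate that would be offered for manual evaluation. A real tool would also need to check strand, identity and the consistency of the contig coordinates between the two flanks.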
Sequencing information of the genomes used in the analysis.
| Organism | Platform | Library | Read Length | Insert Size | Number of Reads |
|---|---|---|---|---|---|
| C. pseudotuberculosis 262 | Ion Torrent PGM | Fragment | ~220 bp | N/A | 1765213 |
| S. aureus | Illumina | Paired-end | ~101 bp | 180 bp | 1294104 |
| S. aureus | Illumina | Mate-Pair | ~37 bp | 3500 bp | 3494070 |
| R. sphaeroides | Illumina | Paired-end | ~101 bp | 180 bp | 2050868 |
| R. sphaeroides | Illumina | Mate-Pair | ~101 bp | 3500 bp | 2050868 |
Information of the reference genomes used to validate the filled-in gaps.
| Organism | C. pseudotuberculosis 262 | S. aureus | R. sphaeroides |
|---|---|---|---|
| | 2325749 | 2872916 | 4628173 |
| | 52.17 | 0.3276 | 68.77 |
| | 1 | 1 | 2 |
| | 0 | 0 | 5 |
| | CP012022.1 | CP007690.1 | GCA_000273405.1 |
| | 8007 | 10709 | 38073 |
| | 6612 | 9460 | 35353 |
Genome assembly information for C. pseudotuberculosis 262, S. aureus and R. sphaeroides.
| Organism | Assembler | Bases (with N) | #Scaffolds |
|---|---|---|---|
| C. pseudotuberculosis 262 | SPAdes | 2893857 | 4611 |
| S. aureus | ABySS | 3893185 | 5012 |
| S. aureus | ABySS2 | 3821622 | 125 |
| S. aureus | Allpaths-LG | 2880676 | 19 |
| S. aureus | Bambus2 | 2862930 | 17 |
| S. aureus | MSR-CA | 2872905 | 17 |
| S. aureus | SGA | 3128388 | 546 |
| S. aureus | SOAPdenovo | 2924135 | 175 |
| S. aureus | Velvet | 2877995 | 173 |
| R. sphaeroides | ABySS | 5160167 | 2714 |
| R. sphaeroides | ABySS2 | 5331930 | 480 |
| R. sphaeroides | Allpaths-LG | 4609785 | 38 |
| R. sphaeroides | Bambus2 | 4428612 | 92 |
| R. sphaeroides | CABOG | 4259679 | 130 |
| R. sphaeroides | MSR-CA | 4498559 | 44 |
| R. sphaeroides | SGA | 5614693 | 2096 |
| R. sphaeroides | SOAPdenovo | 4627058 | 312 |
| R. sphaeroides | Velvet | 4615068 | 382 |
Gap closure results for the Corynebacterium genome.
| #Gaps | #N | #Gaps GB | #N GB | #Gaps FGAP | #N FGAP |
|---|---|---|---|---|---|
| 24 | 1794 | 11 | 931 | 5 | 360 |
Results of the gap closure analysis for Corynebacterium. #Gaps is the number of gaps and #N is the total gap length (number of Ns) in the original assembly. #Gaps GB and #N GB are the numbers of gaps and Ns remaining after GapBlaster. #Gaps FGAP and #N FGAP are the numbers of gaps and Ns remaining after FGAP.
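The two quantities reported in these tables, the number of gaps and the number of Ns, can be computed directly from a scaffold sequence. The sketch below assumes that a gap is any maximal run of `N` characters (upper or lower case); the source does not state a minimum run length, so none is applied here, and the function name `gap_stats` is hypothetical.

```python
import re


def gap_stats(sequence):
    """Return (number_of_gaps, total_Ns) for one scaffold sequence.

    Assumption: every maximal run of N/n counts as one gap.
    """
    run_lengths = [m.end() - m.start()
                   for m in re.finditer(r"[Nn]+", sequence)]
    return len(run_lengths), sum(run_lengths)


if __name__ == "__main__":
    # Two runs of N: one of length 4 and one of length 2.
    print(gap_stats("ACGTNNNNACGTNNAC"))
```

For a multi-scaffold assembly, the per-scaffold results would be summed to obtain the totals listed as #Gaps and #N.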
Gap closure results for GAGE Assemblies.
| Organism | Assembler | #Gaps | #N | #Gaps GB | #N GB | #Gaps FGAP | #N FGAP | #Gaps GF | #N GF |
|---|---|---|---|---|---|---|---|---|---|
| S. aureus | ABySS | 66 | 55882 | 55 | 47614 | 45 | 51127 | 69 | 56355 |
| S. aureus | ABySS2 | 33 | 9391 | 27 | 7780 | 17 | 4850 | 35 | 10003 |
| S. aureus | Allpaths-LG | 23 | 9875 | 20 | 9446 | 15 | 8755 | 40 | 10472 |
| S. aureus | Bambus2 | 95 | 29201 | 93 | 29159 | 80 | 27459 | 98 | 30771 |
| S. aureus | MSR-CA | 81 | 10353 | 72 | 7868 | 47 | 7861 | 80 | 11651 |
| S. aureus | SGA | 654 | 300607 | 642 | 292067 | 634 | 298252 | 654 | 312284 |
| S. aureus | SOAPdenovo | 9 | 4857 | 8 | 4837 | 7 | 4708 | 9 | 5010 |
| S. aureus | Velvet | 128 | 17688 | 124 | 17473 | 94 | 15406 | 127 | 19863 |
| R. sphaeroides | ABySS | 261 | 114525 | 261 | 114525 | 256 | 113886 | 306 | 118298 |
| R. sphaeroides | ABySS2 | 235 | 62570 | 233 | 62128 | 228 | 60323 | 290 | 68052 |
| R. sphaeroides | Allpaths-LG | 90 | 21329 | 87 | 20733 | 82 | 19500 | 164 | 24001 |
| R. sphaeroides | Bambus2 | 85 | 57041 | 83 | 56402 | 80 | 55990 | 84 | 56930 |
| R. sphaeroides | CABOG | 193 | 21547 | 192 | 20892 | 190 | 21065 | 191 | 25011 |
| R. sphaeroides | MSR-CA | 356 | 32628 | 349 | 26189 | 347 | 31174 | 336 | 37494 |
| R. sphaeroides | SGA | 938 | 1145600 | 938 | 1145600 | 930 | 1144955 | 930 | 1159235 |
| R. sphaeroides | SOAPdenovo | 38 | 10461 | 37 | 9601 | 37 | 10097 | 38 | 11176 |
| R. sphaeroides | Velvet | 427 | 86815 | 424 | 86785 | 404 | 86063 | 415 | 94150 |
Results of the gap closure process for the data produced by GAGE with several assemblers for S. aureus and R. sphaeroides. #Gaps is the number of gaps and #N is the total gap length (number of Ns) in the original assembly. #Gaps GB and #N GB are the numbers of gaps and Ns remaining after GapBlaster. #Gaps FGAP and #N FGAP are the numbers of gaps and Ns remaining after FGAP. #Gaps GF and #N GF are the numbers of gaps and Ns remaining after GapFiller.
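Because the table reports remaining gaps and Ns rather than closed ones, the effect of each tool has to be derived by subtraction. The sketch below shows that arithmetic for the first row of the table (66 gaps and 55882 Ns before; 55 gaps and 47614 Ns after GapBlaster; 69 gaps and 56355 Ns after GapFiller). The function name `closure_summary` and the choice to express the result as a percentage of the original gap count are illustrative assumptions.

```python
def closure_summary(gaps_before, n_before, gaps_after, n_after):
    """Return (gaps_closed, ns_filled, percent_gaps_closed).

    Negative values mean the tool left more gaps or Ns than the
    original assembly contained.
    """
    gaps_closed = gaps_before - gaps_after
    ns_filled = n_before - n_after
    percent_closed = round(100 * gaps_closed / gaps_before, 1)
    return gaps_closed, ns_filled, percent_closed


if __name__ == "__main__":
    # First row of the table: original assembly vs. GapBlaster output.
    print(closure_summary(66, 55882, 55, 47614))
    # Same row: original assembly vs. GapFiller output.
    print(closure_summary(66, 55882, 69, 56355))
```

For this row, GapBlaster closes 11 gaps and fills 8268 Ns, while the GapFiller columns show 3 more gaps and 473 more Ns than the original assembly, which is why the second result is negative.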
Comparison of the original results of FGAP and after manual curation with GapBlaster.
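| Organism | Assembler | #Gaps FGAP | #N FGAP | #Gaps after GB | #N after GB |
|---|---|---|---|---|---|
| S. aureus | ABySS | 45 | 51127 | 41 | 45439 |
| S. aureus | MSR-CA | 47 | 7861 | 46 | 6359 |
| S. aureus | SGA | 634 | 298252 | 629 | 290825 |
| R. sphaeroides | ABySS2 | 228 | 60323 | 227 | 60040 |
| R. sphaeroides | Allpaths-LG | 82 | 19500 | 81 | 19494 |
| R. sphaeroides | Bambus2 | 80 | 55990 | 79 | 55402 |
| R. sphaeroides | CABOG | 190 | 21065 | 188 | 19568 |
| R. sphaeroides | MSR-CA | 347 | 31174 | 343 | 25592 |
| R. sphaeroides | SOAPdenovo | 37 | 10097 | 36 | 9237 |
| C. pseudotuberculosis 262 | SPAdes | 5 | 360 | 3 | 251 |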
The results produced by FGAP were used as input for GapBlaster; only the organisms and assemblies that were improved are shown. #Gaps FGAP and #N FGAP are the numbers of gaps and Ns, respectively, in the FGAP results. #Gaps after GB and #N after GB are the numbers of gaps and Ns remaining after the use of GapBlaster.
Comparison of the original results of GapFiller and after manual curation with GapBlaster.
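| Organism | Assembler | #Gaps GF | #N GF | #Gaps after GB | #N after GB |
|---|---|---|---|---|---|
| S. aureus | ABySS | 69 | 56355 | 66 | 54837 |
| S. aureus | ABySS2 | 35 | 10003 | 30 | 8741 |
| S. aureus | Allpaths-LG | 40 | 10472 | 39 | 10455 |
| S. aureus | Bambus2 | 98 | 30771 | 97 | 30725 |
| S. aureus | MSR-CA | 80 | 11651 | 76 | 9794 |
| S. aureus | SGA | 654 | 312284 | 646 | 307095 |
| R. sphaeroides | ABySS | 306 | 118298 | 304 | 118287 |
| R. sphaeroides | ABySS2 | 290 | 68052 | 288 | 67740 |
| R. sphaeroides | Allpaths-LG | 164 | 24001 | 163 | 23780 |
| R. sphaeroides | CABOG | 191 | 25011 | 190 | 24336 |
| R. sphaeroides | MSR-CA | 336 | 37494 | 333 | 33590 |
| R. sphaeroides | SGA | 930 | 1159235 | 929 | 1159162 |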
The results produced by GapFiller were used as input for GapBlaster; only the organisms and assemblies that were improved are shown. #Gaps GF and #N GF are the numbers of gaps and Ns, respectively, in the GapFiller results. #Gaps after GB and #N after GB are the numbers of gaps and Ns remaining after the use of GapBlaster.
Comparison of the features of GapBlaster, FGAP and GapFiller.
| Features | GapBlaster | FGAP | GapFiller |
|---|---|---|---|
| Alignment method | BLAST+, legacy BLAST or MUMmer | BLAST+ | Bowtie or BWA |
| Set Flank Alignment | Yes | Yes | Yes |
| Allow Manual Curation | Yes | No | No |
| Perform Automatic Analysis | Yes | Yes | Yes |
| Based on paired-reads | No | No | Yes |
| Use contigs to fill gaps | Yes | Yes | No |
| Graphical interface | Yes | No | No |
| Improve gap filling results of other software | Yes | Not tested | Not tested |
| Correctly fill gaps? | Yes | Yes | Yes |