Chin Lung Lu, Kun-Tze Chen, Shih-Yuan Huang, Hsien-Tai Chiu.
Abstract
BACKGROUND: Next-generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest. However, most draft genomes are just collections of independent contigs, whose relative positions and orientations along the genome being sequenced are unknown. Although several tools have been developed to order and orient the contigs of draft genomes, more accurate tools are still needed.
Year: 2014 PMID: 25431302 PMCID: PMC4253983 DOI: 10.1186/s12859-014-0381-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The web interface of CAR.
Figure 2. The dot plot of draft and reference chromosomes before contig assembly.
Figure 3. The dot plot of assembled draft and reference chromosomes after contig assembly.
Draft chromosomal genomes used in the testing dataset
| Accession | Size (bp) | # Contig | COV (%) |
|---|---|---|---|
| NC_013926 | 1,486,778 | 35 | 98.63 |
| NC_000964 | 4,215,606 | 5 | 99.97 |
| NC_010816 | 2,375,792 | 58 | 85.47 |
| NC_003317 | 2,117,144 | 41 | 90.83 |
| NC_003318 | 1,177,787 | 12 | 99.77 |
| NC_015857 | 2,138,342 | 55 | 87.47 |
| NC_015858 | 1,260,926 | 34 | 84.38 |
| NC_007650 | 2,914,771 | 15 | 70.34 |
| NC_007651 | 3,809,201 | 28 | 89.90 |
| NC_002620 | 1,072,950 | 4 | 99.09 |
| NC_014393 | 5,262,222 | 297 | 96.54 |
| NC_012590 | 2,790,189 | 90 | 92.94 |
| NC_004369 | 3,147,090 | 118 | 95.09 |
| NC_012803 | 2,501,097 | 126 | 86.25 |
| NC_009525 | 4,419,977 | 220 | 76.84 |
| NC_000908 | 580,076 | 24 | 78.54 |
| NC_009142 | 8,212,805 | 238 | 97.10 |
| NC_015437 | 2,568,361 | 53 | 94.01 |
| NC_014623 | 10,260,756 | 472 | 99.10 |
| NC_003028 | 2,160,842 | 209 | 90.31 |
| NC_013456 | 3,259,580 | 176 | 91.43 |
| NC_013457 | 1,829,445 | 33 | 95.31 |
| NC_008149 | 4,534,590 | 17 | 83.86 |
The column “# Contig” shows the number of contigs selected for experiments of contig assembly by excluding, for example, those contigs not mapped to reference chromosome. The column “COV” gives the fraction of each genome or chromosome covered by selected contigs.
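As an illustration of how a COV-style value could be computed, the sketch below merges the reference intervals covered by the selected contigs and reports the covered share of the chromosome as a percentage. It is not the authors' procedure: the function name `coverage_percent` and the representation of each contig as a half-open `(start, end)` interval on the reference are assumptions made for this example.

```python
def coverage_percent(intervals, genome_length):
    """Percent of a genome covered by at least one interval.

    `intervals` holds half-open (start, end) positions on the reference;
    this input format is an assumption of the sketch, not taken from the paper.
    """
    covered = 0
    cur_start, cur_end = None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the running block: close it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the running block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / genome_length
```

For example, intervals `(0, 100)`, `(50, 150)` and `(300, 400)` on a reference of length 1000 cover 250 positions once overlaps are merged, so the function returns 25.0.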
Comparison of average sensitivity for various contig assembly tools
| Tool | | Top 10 | |
|---|---|---|---|
| CAR (PROmer) | 62.71 (67.50) | 49.87 (56.25) | 37.33 (32.25) |
| SIS (PROmer) | 60.82 (67.50) | 48.53 (54.55) | 36.14 (30.40) |
| Mauve Aligner | 60.19 (65.22) | 46.40 (46.88) | 32.86 (22.47) |
| r2cat | 61.64 (78.13) | 43.56 (38.52) | 30.01 (20.51) |
| CAR (NUCmer) | 57.04 (73.68) | 43.38 (39.01) | 28.19 (7.41) |
| SIS (NUCmer) | 55.41 (72.73) | 42.70 (36.67) | 27.56 (6.40) |
| OSLay | 48.38 (62.50) | 34.43 (12.90) | 21.18 (0.60) |
| fillScaffolds (NUCmer) | 49.04 (56.41) | 34.23 (21.83) | 21.36 (4.53) |
| fillScaffolds (PROmer) | 45.19 (50.00) | 33.18 (25.93) | 21.76 (8.75) |
| CONTIGuator | 45.66 (50.00) | 31.53 (15.43) | 19.29 (0.68) |
| Projector2 | 42.58 (40.17) | 29.18 (20.49) | 18.63 (5.00) |
| ABACAS | 33.42 (28.57) | 23.64 (0.38) | 13.01 (0.00) |
This table is sorted in descending order according to the average values shown in the “Top 10” column, where the values in parentheses are medians.
Comparison of average precision for various contig assembly tools
| Tool | | Top 10 | |
|---|---|---|---|
| CAR (PROmer) | 68.50 (73.91) | 56.54 (66.04) | 43.30 (40.00) |
| SIS (PROmer) | 66.47 (73.91) | 54.96 (60.00) | 41.84 (38.98) |
| CAR (NUCmer) | 63.71 (81.25) | 51.49 (56.25) | 35.36 (22.22) |
| SIS (NUCmer) | 61.99 (76.92) | 50.54 (56.25) | 34.36 (20.99) |
| OSLay | 61.86 (75.00) | 49.57 (59.41) | 38.00 (33.33) |
| r2cat | 65.59 (79.17) | 48.38 (48.61) | 34.91 (26.67) |
| Mauve Aligner | 60.19 (65.22) | 46.41 (46.88) | 32.88 (22.47) |
| CONTIGuator | 58.95 (66.67) | 41.83 (42.33) | 28.23 (11.11) |
| Projector2 | 57.85 (64.29) | 41.63 (37.50) | 29.04 (20.00) |
| fillScaffolds (NUCmer) | 54.50 (59.26) | 40.34 (30.88) | 26.57 (12.40) |
| fillScaffolds (PROmer) | 48.79 (51.46) | 37.14 (29.15) | 24.67 (12.50) |
| ABACAS | 46.88 (50.00) | 31.54 (14.29) | 20.43 (0.00) |
This table is sorted in descending order according to the average values displayed in the “Top 10” column, where the values in parentheses are medians.
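The text here does not define sensitivity and precision. A common convention for contig ordering, assumed in the sketch below, counts contig joins (adjacent contig pairs with orientation): sensitivity is the share of reference joins that the tool recovers, and precision is the share of the tool's joins that are correct. The signed-integer encoding of contigs and the names `joins` and `sensitivity_precision` are hypothetical, chosen only for this illustration.

```python
def joins(scaffold):
    """Set of oriented joins in a scaffold of signed contig ids.

    A join (x, y) read from the opposite strand is (-y, -x), so each
    join is stored in a canonical form covering both readings.
    """
    result = set()
    for x, y in zip(scaffold, scaffold[1:]):
        result.add(min((x, y), (-y, -x)))
    return result


def sensitivity_precision(predicted, reference):
    """Return (sensitivity, precision) in percent under the assumed definitions."""
    pred, ref = joins(predicted), joins(reference)
    correct = len(pred & ref)
    sensitivity = 100.0 * correct / len(ref) if ref else 0.0
    precision = 100.0 * correct / len(pred) if pred else 0.0
    return sensitivity, precision
```

With a reference order `[1, 2, 3, 4, 5]` and a predicted scaffold `[1, 2, 3]`, both predicted joins are correct but only two of the four reference joins are recovered, giving a sensitivity of 50.0 and a precision of 100.0.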
Figure 4. Average sensitivity obtained by each tool when the reference genome varies from the closest to the farthest in the phylogenetic distance.
Figure 5. Average precision obtained by each tool when the reference genome varies from the closest to the farthest in the phylogenetic distance.
Comparison of genome coverage for various contig assembly tools
| Tool | | Top 10 | |
|---|---|---|---|
| CAR (PROmer) | 63.73 (74.68) | 50.67 (58.82) | 37.81 (34.88) |
| SIS (PROmer) | 61.51 (73.85) | 49.81 (55.08) | 37.00 (33.78) |
| Mauve Aligner | 60.57 (72.30) | 46.09 (45.07) | 32.54 (22.62) |
| CAR (NUCmer) | 57.87 (76.06) | 44.30 (44.23) | 29.19 (12.16) |
| SIS (NUCmer) | 56.95 (74.68) | 44.21 (47.43) | 28.81 (10.60) |
| r2cat | 59.21 (71.69) | 41.63 (36.84) | 28.85 (19.48) |
| OSLay | 49.36 (68.09) | 35.71 (13.85) | 21.72 (0.52) |
| fillScaffolds (NUCmer) | 48.47 (61.49) | 33.07 (16.28) | 20.81 (5.26) |
| CONTIGuator | 47.33 (60.06) | 32.87 (17.95) | 19.54 (0.44) |
| fillScaffolds (PROmer) | 43.59 (42.91) | 31.08 (16.95) | 19.99 (7.04) |
| Projector2 | 47.54 (51.58) | 31.07 (20.10) | 20.06 (7.09) |
| ABACAS | 27.48 (8.15) | 21.43 (0.12) | 11.41 (0.00) |
This table is sorted in descending order according to the average values shown in the “Top 10” column, where the values in parentheses are medians.
Figure 6. Average genome coverage obtained by each tool when the reference genome varies from the closest to the farthest in the phylogenetic distance.