Thiago de Jesus Sousa, Doglas Parise, Rodrigo Profeta, Mariana Teixeira Dornelles Parise, Anne Cybelle Pinto Gomide, Rodrigo Bentos Kato, Felipe Luiz Pereira, Henrique Cesar Pereira Figueiredo, Rommel Ramos, Bertram Brenig, Artur Luiz da Costa da Silva, Preetam Ghosh, Debmalya Barh, Aristóteles Góes-Neto, Vasco Azevedo.
Abstract
The number of draft genomes deposited in GenBank, the database of the National Center for Biotechnology Information (NCBI), exceeds the number of complete genomes. Draft genomes are assemblies that contain gaps or misassembled regions. Because they lack genomic information, such drafts hinder a complete understanding of the biology and evolution of the organism. To overcome this problem, strategies to improve the assembly process are continuously being developed. The greatest challenge to assembly remains the presence of repetitive DNA regions. This article highlights the use of optical mapping to detect and correct assembly errors in Corynebacterium pseudotuberculosis. We also demonstrate that a reference genome should be chosen with caution to avoid assembly errors and loss of genetic information.
Year: 2019 PMID: 31705053 PMCID: PMC6841979 DOI: 10.1038/s41598-019-52695-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Information on sequencing and assembling of strains.
| Strains | Sequencing | Reads | Assembly Software | Length (Mb) | Mapped reads (%) | Accession number | Reference |
|---|---|---|---|---|---|---|---|
| 1002B | Ion PGM 200 bp | 739,755 | Mira v. 3.9.18 | 2.33511 | 99.70 | CP012837.1 | Mariano |
| 29156 | Ion PGM 200 bp | 1,400,026 | Newbler v. 2.9 | 2.33865 | 98.02 | CP010795.1 | This work |
| I19 | Ion PGM 400 bp | 1,255,111 | SPAdes v. 3.6.0 | 2.33759 | 99.64 | CP002251.2 | This work |
| 31 | Ion PGM 400 bp | 1,394,211 | SPAdes v. 3.6.0 | 2.40296 | 99.57 | CP003421.3 | Viana |
| 162 | Ion PGM 200 bp | 2,050,404 | Newbler v. 2.9 | 2.36587 | 98.00 | CP003652.2 | This work |
| 258 | Ion PGM 200 bp | 260,169 | SPAdes v. 3.6.0 | 2.36982 | 99.41 | CP003540.2 | Mariano |
| CIP52.97 | Ion PGM 400 bp | 1,427,084 | Mira v. 3.9.18 | 2.36939 | 99.68 | CP003061.2 | This work |
| MB302 | Ion PGM 400 bp | 1,832,580 | Newbler v. 2.9 | 2.36881 | 99.59 | CP021982.1 | Baraúna |
| T1 | Ion PGM 200 bp | 1,118,022 | Newbler v. 2.9 | 2.3372 | 95.93 | CP015100.1 | Almeida |
| MB11 | Ion PGM 200 bp | 6,753,458 | Mira v. 4.0.2 | 2.36342 | 99.24 | CP013260.1 | Baraúna |
Figure 1. Optical map alignment of the selected ovis biovar strains. Comparisons between the first and the new version (when available) are shown for C. pseudotuberculosis 1002 and 1002B (A); C. pseudotuberculosis I19 (B); C. pseudotuberculosis 29156 (C); C. pseudotuberculosis FRC41 (D); and C. pseudotuberculosis T1 (E). The highlighted regions R1 and R2 mark inversion errors.
Figure 2. Optical map alignment of the selected equi biovar strains. Comparisons between the first and the new version (when available) are shown for C. pseudotuberculosis 31 (A); C. pseudotuberculosis Cp162 (B); C. pseudotuberculosis MB11 (C); C. pseudotuberculosis 258 (D); C. pseudotuberculosis CIP52.97 (E); and C. pseudotuberculosis MB302 (F).
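Optical map alignments such as those in Figures 1 and 2 compare the ordered restriction fragment sizes of an optical map with those predicted from an assembled sequence. The following is a minimal sketch of that prediction step (an in silico digest), not the software used in the study. The function name, the toy sequence, the recognition site `GGTACC` and the cut offset are all placeholders chosen for illustration; this record does not state which enzyme was used. The sketch also treats the sequence as linear, whereas a bacterial chromosome is circular.

```python
def in_silico_digest(sequence, site, cut_offset=0):
    """Return ordered fragment lengths after cutting a LINEAR sequence
    at every occurrence of `site` (cut placed `cut_offset` bases into the site)."""
    sequence = sequence.upper()
    site = site.upper()

    # Collect every cut position, allowing overlapping site matches.
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)

    # Fragment lengths are the distances between consecutive boundaries.
    boundaries = [0] + cut_positions + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Toy sequence and placeholder site (hypothetical, for illustration only).
toy = "AAAAGGTACCTTTTTTGGTACCAA"
fragments = in_silico_digest(toy, "GGTACC", cut_offset=5)
print(fragments)
```

An inversion between two restriction sites would leave the total length unchanged but reverse the order of the fragment sizes in that interval, which is the kind of discrepancy an alignment against the optical map can expose.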
Comparison between deposited and new version assembly of CpI19, Cp1002 (Cp1002B), Cp258, Cp162, Cp31, and CpCIP52.97 strains.
| Isolates | Bases (bp) | CDS | Pseudogenes |
|---|---|---|---|
| I19 (1st) | 2,337,730 | 2,095 | 57 |
| I19 (2nd) | 2,337,594 | 2,129 | 45 |
| 1002 (2nd) | 2,335,113 | 2,095 | 47 |
| 1002B (1st) | 2,335,107 | 2,071 | 43 |
| 258 (1st) | 2,314,404 | 2,088 | 46 |
| 258 (2nd) | 2,369,817 | 2,129 | 34 |
| 162 (1st) | 2,293,464 | 2,002 | 87 |
| 162 (2nd) | 2,365,874 | 2,099 | 43 |
| 31 (1st) | 2,297,010 | 2,063 | 46 |
| 31 (3rd) | 2,402,956 | 2,173 | 4 |
| CIP52.97 (1st) | 2,320,595 | 2,060 | 75 |
| CIP52.97 (2nd) | 2,369,387 | 2,187 | 62 |
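The change between the deposited and the new version of each strain can be read off the table by subtraction. The short sketch below does that arithmetic for the five strains whose rows pair unambiguously; the dictionary layout and the function name `version_deltas` are assumptions of this sketch, and the 1002/1002B rows are left out because their version labels do not pair cleanly in this record.

```python
# Values copied from the comparison table above: (bases in bp, CDS, pseudogenes).
versions = {
    "I19": {"deposited": (2_337_730, 2_095, 57), "new": (2_337_594, 2_129, 45)},
    "258": {"deposited": (2_314_404, 2_088, 46), "new": (2_369_817, 2_129, 34)},
    "162": {"deposited": (2_293_464, 2_002, 87), "new": (2_365_874, 2_099, 43)},
    "31": {"deposited": (2_297_010, 2_063, 46), "new": (2_402_956, 2_173, 4)},
    "CIP52.97": {"deposited": (2_320_595, 2_060, 75), "new": (2_369_387, 2_187, 62)},
}


def version_deltas(table):
    """Return {strain: (delta_bp, delta_cds, delta_pseudogenes)}, new minus deposited."""
    return {
        strain: tuple(n - d for d, n in zip(rows["deposited"], rows["new"]))
        for strain, rows in table.items()
    }


deltas = version_deltas(versions)
for strain, (d_bp, d_cds, d_pseudo) in deltas.items():
    print(f"{strain}: {d_bp:+,} bp, {d_cds:+} CDS, {d_pseudo:+} pseudogenes")
```

For every strain in this subset the new version has more predicted CDS and fewer pseudogenes than the deposited one, and all but I19 also gain sequence length.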
Figure 3. Comparative BRIG analysis of ovis biovar strains. Comparative genomic maps of the older versions (outermost circles) and their respective versions with optical map (inner black circles). (A) C. pseudotuberculosis 1002B. (B) C. pseudotuberculosis I19.
Figure 4. Comparative BRIG analysis of equi biovar strains. Comparative genomic maps of the older versions (outermost circles in purple) and their respective versions with optical map (inner black circles). (A) C. pseudotuberculosis 258. (B) C. pseudotuberculosis Cp162. (C) C. pseudotuberculosis 31. (D) C. pseudotuberculosis CIP52.97.
Figure 5. Comparative MAUVE analysis of ovis biovar strains. Genome alignment of C. pseudotuberculosis 1002B, C. pseudotuberculosis 29156, C. pseudotuberculosis FRC41, C. pseudotuberculosis I19 and C. pseudotuberculosis T1.
Figure 6. Comparative MAUVE analysis of equi biovar strains. Genome alignment of C. pseudotuberculosis 31, C. pseudotuberculosis 258, C. pseudotuberculosis Cp162, C. pseudotuberculosis CIP52.97, C. pseudotuberculosis MB302 and C. pseudotuberculosis MB11.
Information about the quality metrics of the optical maps used.
| Strains | Enzyme | Length (bp) | Number of fragments | Average fragment size (bp) | Maximum fragment size (bp) | Minimum fragment size (bp) | Whole genome coverage |
|---|---|---|---|---|---|---|---|
| 1002B | | 2,335,144 | 353 | 6,615.139 | 38,715 | 903 | 99.998% |
| 29156 | | 2,351,288 | 368 | 6,389.37 | 38,839 | 1,275 | 99.462% |
| I19 | | 2,326,586 | 333 | 6,986.745 | 38,215 | 1,460 | 100.473% |
| FRC41 | | 2,341,893 | 369 | 6,346.593 | 38,942 | 1,394 | 99.830% |
| 31 | | 2,372,071 | 346 | 6,855.697 | 35,806 | 1,544 | 101.302% |
| 162 | | 2,345,656 | 362 | 6,479.713 | 28,013 | 1,509 | 100.861% |
| 258 | | 2,366,195 | 346 | 6,838.714 | 36,249 | 1,667 | 100.153% |
| CIP52.97 | | 2,352,141 | 347 | 6,778.504 | 36,145 | 1,567 | 100.733% |
| MB302 | | 2,363,709 | 362 | 6,529.583 | 36,517 | 1,471 | 100.215% |
| T1 | | 2,350,532 | 193 | 12,178.922 | 54,138 | 1,787 | 99.432% |
| MB11 | | 2,347,572 | 366 | 6,414.131 | 36,155 | 1,333 | 100.679% |
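Two columns of this table can be cross-checked with simple arithmetic. The average fragment size equals the optical map length divided by the number of fragments. The whole genome coverage values also appear consistent with the assembly length divided by the optical map length, using the assembly lengths from the version-comparison table; that second relationship is an inference from the numbers, not something stated in this record, and the function names below are hypothetical.

```python
# (optical map length in bp, number of fragments), copied from the table above.
optical_maps = {
    "1002B": (2_335_144, 353),
    "29156": (2_351_288, 368),
    "I19": (2_326_586, 333),
    "T1": (2_350_532, 193),
}

# Assembly lengths in bp from the version-comparison table.
# ASSUMPTION: these are the assemblies the coverage column refers to.
assembly_lengths = {"1002B": 2_335_107, "I19": 2_337_594}


def average_fragment_size(map_length, n_fragments):
    """Mean restriction fragment size in bp."""
    return map_length / n_fragments


def coverage_percent(assembly_length, map_length):
    """Assembly length as a percentage of optical map length (assumed definition)."""
    return 100.0 * assembly_length / map_length


for strain, (map_length, n_fragments) in optical_maps.items():
    line = f"{strain}: mean fragment {average_fragment_size(map_length, n_fragments):,.3f} bp"
    if strain in assembly_lengths:
        cov = coverage_percent(assembly_lengths[strain], map_length)
        line += f", coverage {cov:.3f}%"
    print(line)
```

Under this reading, a coverage above 100% means the assembled sequence is slightly longer than the optical map estimate, as for I19, and a value below 100% means it is slightly shorter, as for 1002B.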