Emeline Cherchame, Guy Ilango, Sabrina Cadel-Six.
Abstract
With the advent of next-generation whole-genome sequencing (WGS), the need for good-quality, well-characterised Salmonella genomes has increased in recent years. Good-quality complete genomes are often required for assembly reference mapping or phylogenetic single nucleotide polymorphism (SNP) analysis. Complete genomes or contigs from specific sources or serovars are also sought for clustering analysis or source attribution studies. New bioinformatics tools are therefore needed to extract good-quality, well-characterised genomes from public databases. Here, we developed SalmoDEST, an open-source Python tool that extracts Salmonella genomes with a coverage higher than 50x and a genome length over 4 Mb from the GenBank database, as complete genomes or contigs, while verifying the serovar to which they belong and identifying the corresponding multilocus sequence typing (MLST) profile. To validate the ability of SalmoDEST to screen for and retrieve good-quality genomes, we compared our results for S. Typhi complete genomes with those available in the literature and extracted Salmonella genomes of strains from bovine sources isolated worldwide. Finally, we provide in this study a list of 239 high-quality complete genomes for 123 serovars of Salmonella. SalmoDEST is a handy, easy-to-use open-source tool for extracting complete genomes or contigs that can be used routinely in public health, food safety and research laboratories. SalmoDEST (SALMOnella Download gEnome Serotype sT) is available at https://github.com/I-Guy/SalmoDEST.
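The two quality thresholds stated in the abstract (coverage above 50x, genome length above 4 Mb) can be illustrated with a minimal Python sketch. This is not SalmoDEST's actual code: the `AssemblyRecord` class, its field names and the example identifiers are hypothetical, and 4 Mb is assumed to mean 4,000,000 base pairs.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; the strict ">" comparison mirrors
# "higher than 50x" and "over 4 Mb". 4 Mb is assumed to be 4,000,000 bp.
MIN_COVERAGE = 50.0
MIN_LENGTH_BP = 4_000_000


@dataclass
class AssemblyRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    accession: str
    coverage: float   # sequencing depth, in x
    length_bp: int    # total assembly length, in base pairs


def passes_quality(record: AssemblyRecord) -> bool:
    """Return True when the record exceeds both thresholds."""
    return record.coverage > MIN_COVERAGE and record.length_bp > MIN_LENGTH_BP


def filter_good_quality(records):
    """Keep only the records that pass both quality checks."""
    return [r for r in records if passes_quality(r)]


# Made-up records for illustration; these are not real accessions.
records = [
    AssemblyRecord("EXAMPLE_A", 120.0, 4_800_000),  # passes both checks
    AssemblyRecord("EXAMPLE_B", 35.0, 4_900_000),   # coverage too low
    AssemblyRecord("EXAMPLE_C", 80.0, 3_500_000),   # genome too short
    AssemblyRecord("EXAMPLE_D", 50.0, 4_700_000),   # coverage not above 50x
]
kept = [r.accession for r in filter_good_quality(records)]
print(kept)
```

Serovar verification and MLST profile determination, which the tool also performs, are outside the scope of this sketch.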
Keywords: MLST profile determination; SalmoDEST; Salmonella; complete reference genomes; good-quality genomes; serovar prediction
Year: 2022 PMID: 35221678 PMCID: PMC8874161 DOI: 10.1177/11779322221080264
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Figure 1. SALMOnella Download gEnome Serotype sT (SalmoDEST) pipeline.
ST, sequence type.
Figure 2. Histogram of serovar diversity among the 1040 complete Salmonella genomes downloaded from the NCBI GenBank database using the SalmoDEST tool developed in this study. Only serovars with more than five complete genomes and a complete antigenic formula are shown, with the exception of S. 4,[5],12:i:- and S. 1,3,19:g,s,t:-.
List of good-quality complete Salmonella genomes (ID, serovar and MLST profile predictions) downloaded from the NCBI GenBank database on 28 June 2021.
| Predicted_serovar | MLST Profile | Accession number | Predicted_serovar | MLST Profile | Accession number | Predicted_serovar | MLST Profile | Accession number |
|---|---|---|---|---|---|---|---|---|
| 1,3,19:g,s,t:- | 217 | CP038604.1 | II 56:b:z6 | 5324 | CP029995.1 | Oranienburg | 3613 | CP033344.1 |
| Abaetetuba | 2041 | CP007532.1 | II 56:z10:e,n,x,z15 | 2403 | CP029992.1 | Orion | 684 | CP030235.1 |
| Aberdeen | 426 | LS483453.1 | II 58:d:z6 | 3379 | CP070222.1 | Oslo | 1370 | CP030231.1 |
| Abony | 1483 | CP007534.1 | II 58:l,z13,z28:- | 1141 | LS483477.1 | Ouakam | 1610 | CP022116.1 |
| Adjame | 3929 | CP049881.1 | IIIa -:z4,z23:- | 106 | CP053584.1 | Panama | 48 | CP012346.1 |
| Adjame | 4023 | CP054827.1 | IIIa 40:z4,z23:- | 6216 | CP041011.1 | Paratyphi A | 85 | CP000026.1 |
| Agona | 13 | CP025452.1 | IIIa 41:z4,z23:- | 2131 | CP000880.1 | Paratyphi A | 129 | CP009049.1 |
| Albany or Duesseldorf | 292 | CP019177.1 | IIIa 48:z36:- | 3711 | LR134150.1 | Paratyphi B | 28 | CP020492.1 |
| Albert | 19 | CP044188.1 | IIIa 53:z4,z23,z32:- | 2127 | CP022504.1 | Paratyphi B var. L(+) tartrate + | 307 | CP000886.1 |
| Anatum | 64 | CP029800.1 | IIIa 53:z4,z23:- | 874 | LR133910.1 | Paratyphi C or Choleraesuis or Typhisuis | 66 | AE017220.1 |
| Anatum | 2167 | CP014620.1 | IIIa 62:z36:- | 2402 | CP006693.1 | Paratyphi C or Choleraesuis or Typhisuis | 68 | CP007639.1 |
| Antsalova | 4407 | CP019116.1 | IIIa 63:g,z51:- | 1425 | CP029991.1 | Paratyphi C or Choleraesuis or Typhisuis | 90 | CP043773.1 |
| Apapa | | CP019403.1 | IIIb 47:k:z35 | 1195 | CP053583.1 | Paratyphi C or Choleraesuis or Typhisuis | 114 | CP000857.1 |
| Bareilly | 203 | CP063684.2 | IIIb 48:i:z | 574 | CP029989.1 | Paratyphi C or Choleraesuis or Typhisuis | 139 | CP012344.2 |
| Bareilly | 909 | CP006053.1 | IIIb 50:k:z | 430 | CP059886.1 | Paratyphi C or Choleraesuis or Typhisuis | 145 | CP051366.1 |
| Bareilly | 5146 | CP034721.1 | IIIb 60:r:z | 3457 | CP011289.1 | Pomona | 451 | CP019186.1 |
| Bergen | 1356 | CP019405.1 | IIIb 60:z52:z53 | 2830 | CP030180.1 | Poona | 308 | CP046279.1 |
| Berta | 435 | CP030005.1 | IIIb 61:i:z | 57 | LS483474.1 | Poona | 447 | CP037891.1 |
| Birkenhead | 424 | CP045958.1 | IIIb 65:c:z | 1260 | CP022135.1 | Poona | 812 | LS483489.1 |
| Bispebjerg | 251 | CP043027.1 | Indiana | 17 | CP028131.1 | Poona | 964 | CP019189.1 |
| Blockley | 52 | CP043662.1 | Infantis | 32 | CP047881.1 | Quebec | 4409 | CP022019.1 |
| Blukwa | 367 | LR134148.1 | Inverness | 1384 | CP019181.1 | Reading | 1628 | CP051307.1 |
| Bovismorbificans | 142 | CP060517.1 | Irumu | | LR134144.1 | Rissen | 469 | CP030190.1 |
| Bovismorbificans | 1499 | CP069297.1 | Isangi | 216 | CP030225.1 | Rubislaw | 94 | CP019192.1 |
| Braenderup | 22 | CP022490.1 | IV -:z4,z23:- | 963 | LS483478.1 | Saintpaul | 27 | CP017723.1 |
| Brancaster | 2133 | CP036166.1 | IV -:z4,z23:- | 3942 | CP051368.1 | Saintpaul | 49 | CP053055.1 |
| Brandenburg | 65 | CP025280.1 | IV [1],40:g,z51:- | 2265 | CP053582.1 | Saintpaul | 50 | CP045954.1 |
| Bredeney | 241 | CP043222.1 | IV 16:z4,z32:- | 596 | CP045761.1 | Saintpaul | 95 | CP023512.1 |
| Bredeney | 897 | CP007533.1 | IV 41:z52:- | 3924 | CP054715.1 | Saintpaul | 680 | CP022491.1 |
| Butantan | 600 | CP046278.1 | IV 45:g,z51:- | 107 | CP030194.1 | Saintpaul | 3602 | CP023166.1 |
| Carmel | 2123 | LS483455.1 | IV 50:g,z51:- | 2882 | LR134159.1 | Sanjuan | 785 | LR134142.1 |
| Cerro | 367 | CP008925.1 | IV 50:z4,z23:- | 2053 | CP053579.1 | Schoeneberg | | LR134153.1 |
| Chester | | CP019178.1 | Javiana | 24 | CP004027.1 | Schwarzengrund | 96 | CP045447.1 |
| Coeln | | LR134190.1 | Johannesburg | 471 | CP019411.1 | Schwarzengrund | 322 | CP001127.1 |
| Concord | 534 | CP044177.1 | Kentucky | 152 | CP022500.1 | Senftenberg | 14 | CP038591.1 |
| Concord | 599 | CP028196.1 | Kentucky | 198 | CP043667.1 | Senftenberg | 185 | CP016837.1 |
| Corvallis | 1541 | CP027677.1 | Kisarawe | 906 | CP030203.1 | Senftenberg | 210 | AP020332.1 |
| Cubana | 286 | CP006055.1 | Kottbus | 212 | CP062220.1 | Senftenberg | 290 | CP034233.1 |
| Dakar | 5734 | CP046280.1 | Kottbus | 808 | CP030211.1 | Sloterdijk | 3179 | CP012349.1 |
| Daytona | | LR133909.1 | Krefeld | 1799 | CP019413.1 | Stanley | 29 | CP036167.1 |
| Derby | 40 | CP028900.1 | Litchfield | 214 | CP030202.1 | Stanley | 1027 | LS483434.1 |
| Derby | 71 | CP026609.1 | Litchfield | 491 | CP019414.1 | Stanleyville | 97 | CP017727.1 |
| Derby | 72 | CP022494.1 | Livingstone | 2247 | CP030233.1 | Stanleyville | 1986 | CP034716.1 |
| Djakarta | | CP019409.1 | Llandoff | | CP060585.1 | Stanleyville | 4762 | CP034700.1 |
| Dublin | 10 | CP032393.1 | London | 155 | CP061159.1 | Sundsvall | 5323 | LS483457.1 |
| Dublin | 4406 | CP019179.1 | London | 504 | CP064709.1 | Taksony | 2204 | LR134146.1 |
| Enteritidis | 11 | CP063700.1 | Lubbock | 413 | CP032814.1 | Telelkebir | 450 | CP030217.1 |
| Enteritidis | 3175 | CP008928.1 | Macclesfield | 4976 | CP022117.1 | Tennessee | 319 | CP014994.1 |
| Florida | 931 | LS483454.1 | Manhattan | 18 | CP019418.1 | Thompson | 26 | CP012514.1 |
| Fresno | 649 | CP032444.1 | Mbandaka | 413 | CP022489.1 | Typhi | 1 | CP003278.1 |
| Gallinarum or Enteritidis | 78 | CP019035.1 | Mbandaka | 3016 | CP019183.1 | Typhi | 2 | AL513382.1 |
| Gallinarum or Enteritidis | 92 | CP022963.1 | Menston | | LS483490.1 | Typhi | 8 | LT904887.1 |
| Gallinarum or Enteritidis | 136 | CP018633.1 | Miami | 85 | CP023470.1 | Typhi | 2138 | LT905088.1 |
| Gallinarum or Enteritidis | 331 | AM933173.1 | Miami | 129 | CP009559.1 | Typhi | 2209 | CP029918.1 |
| Gallinarum or Enteritidis | 1972 | CP045955.1 | Miami | 140 | CP023468.1 | Typhimurium | 19 | AE006468.2 |
| Gallinarum or Enteritidis | 3304 | CP045956.1 | Mikawasima | 5372 | CP034713.1 | Typhimurium | 34 | CP045952.1 |
| Gaminara | 2439 | CP024165.1 | Milwaukee | 1245 | CP030175.1 | Typhimurium | 36 | CP036168.1 |
| Gaminara | 2440 | CP030288.1 | Minnesota | 548 | CP060508.1 | Typhimurium | 99 | CP020922.1 |
| Gateshead | 6131 | CP046291.1 | Montevideo | 4 | CP069518.1 | Typhimurium | 128 | HG326213.1 |
| Give | 516 | CP046277.1 | Montevideo | 81 | CP037893.1 | Typhimurium | 213 | CP035547.1 |
| Give | 654 | CP019174.1 | Montevideo | 138 | CP040380.1 | Typhimurium | 302 | CP014356.1 |
| Goldcoast or Brikama | 358 | CP062223.1 | Montevideo | 316 | CP029336.1 | Typhimurium | 313 | CP060169.1 |
| Goldcoast or Brikama | 2529 | LR134158.1 | Muenchen | 83 | CP016014.1 | Typhimurium | 328 | CP025736.1 |
| Grumpensis | 751 | CP030223.1 | Muenchen | 112 | CP045056.1 | Typhimurium | 568 | CP064919.1 |
| Hadar | 33 | CP022069.2 | Muenchen | 112 | CP045063.1 | Typhimurium | 568 | LR862421.1 |
| Havana | 1237 | LR134187.1 | Muenster | 321 | CP019198.1 | Typhimurium | 2066 | CP009102.1 |
| Heidelberg | 15 | CP005995.1 | Muenster | | CP045038.1 | Typhimurium | 2210 | CP040562.1 |
| Hidalgo or Cocody | | CP022663.1 | Napoli | 2095 | CP063140.1 | Typhimurium | 3631 | CP039854.1 |
| Hillingdon | | CP019410.1 | Newport | 5 | CP015923.1 | Typhimurium | 5036 | CP029840.1 |
| Hvittingfoss | 434 | CP045831.1 | Newport | 31 | CP007559.2 | Typhimurium | 5401 | CP033226.2 |
| Hvittingfoss | 446 | CP022503.1 | Newport | 45 | CP012598.1 | Uganda | 684 | CP051398.1 |
| I 4,[5],12:i:- | 2379 | CP039610.1 | Newport | 118 | CP015924.1 | Virchow | 16 | CP045945.1 |
| I 9:g,m,q:- | 2912 | CP019406.1 | Newport | 132 | CP025232.1 | Wandsworth | 1498 | CP019417.1 |
| I 9:g,p,s:- | 10 | CP030207.1 | Newport | 166 | CP012144.1 | Waycross | 2460 | CP034707.1 |
| II -:z:e,n,x,z15 | 3706 | LS483495.1 | Newport | 350 | CP016010.1 | Weltevreden | 365 | CP014996.1 |
| II 40:z4,z24:z39 | 4415 | LS483456.1 | Newport | 4157 | CP039436.1 | Weltevreden | 2384 | LN890524.1 |
| II 42:r:- | 1208 | CP034717.1 | Newport | 4166 | CP039437.1 | Weslaco | 1088 | LR134143.1 |
| II 47:b:e,n,x,z15 | 3910 | CP053585.1 | Ohio | 329 | CP030181.1 | Worthington | 592 | CP029041.1 |
| II 50:z:e,n,x | 1110 | LS483475.1 | Onderstepoort | 3102 | CP022034.1 | Yoruba | 1316 | CP030209.1 |
| II 55:z39:k | 1121 | CP022139.1 | Oranienburg | 23 | CP019197.1 | | | |