Mui Fern Tan, Cheuk Chuen Siow, Avirup Dutta, Naresh Vr Mutha, Wei Yee Wee, Hamed Heydari, Shi Yang Tan, Mia Yang Ang, Guat Jah Wong, Siew Woh Choo.
Abstract
BACKGROUND: Listeria consists of both pathogenic and non-pathogenic species. Reports of similarities in genomic content between some pathogenic and non-pathogenic species necessitate the investigation of these species at the genomic level to understand the evolution of virulence-associated genes. With Listeria genome data growing exponentially, comparative genomic analysis may give better insights into the evolution, genetics and phylogeny of Listeria spp., leading to better management of the diseases they cause.

DESCRIPTION: With this motivation, we have developed ListeriaBase, a web-based Listeria genomic resource and analysis platform to facilitate comparative analysis of Listeria spp. ListeriaBase currently houses 850,402 protein-coding genes, 18,113 RNAs and 15,576 tRNAs from 285 genome sequences of different Listeria strains. An AJAX-based real-time search system implemented in ListeriaBase facilitates searching of this large genomic dataset. Our in-house comparative analysis tools, such as the Pairwise Genome Comparison (PGC) tool for comparing two genomes, the Pathogenomics Profiling Tool (PathoProT) for comparing virulence genes, and ListeriaTree for phylogenetic classification, were customized and incorporated into ListeriaBase to facilitate comparative genomic analysis of Listeria spp. Interestingly, we identified a unique genomic feature in the L. monocytogenes genomes in our analysis. The Auto protein sequences of the serotype 4 and the non-serotype 4 strains of L. monocytogenes possessed unique sequence signatures that can differentiate the two groups. We propose that the aut gene may be a potential gene marker for differentiating the serotype 4 strains from other serotypes of L. monocytogenes.
Year: 2015 PMID: 26444974 PMCID: PMC4595109 DOI: 10.1186/s12864-015-1959-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
List of Listeria species and the number of genomes in ListeriaBase (as of 29 August 2014)
| No. | Species | # Draft genomes | # Complete genomes |
|---|---|---|---|
| 1 | | 2 | 0 |
| 2 | | 3 | 1 |
| 3 | | 1 | 2 |
| 4 | | 1 | 0 |
| 5 | | 223 | 44 |
| 6 | | 2 | 1 |
| 7 | | 0 | 1 |
| 8 | | 2 | 0 |
| 9 | | 1 | 0 |
| 10 | | 1 | 0 |
Fig. 1 Overview of ListeriaBase architecture
Summary of the 44 L. monocytogenes genome annotations
| Lineage | Strain | Serotype | Size (bp) | # ORFs | # tRNAs | GC (%) | Isolated from | Country | Year of isolation |
|---|---|---|---|---|---|---|---|---|---|
| Lineage I | 07PF0776 | 4b | 2,901,562 | 2942 | 67 | 38.04 | Human myocardial abscess | USA | - |
| Lineage I | ATCC 19117 | 4d | 2,951,805 | 2957 | 67 | 37.99 | Sheep | USA | - |
| Lineage I | CLIP 80459 | 4b | 2,912,690 | 2915 | 67 | 38.06 | Clinical outbreak of listeriosis | France | - |
| Lineage I | L312 | 4b | 2,912,346 | 3045 | 67 | 38.06 | Cheese | - | - |
| Lineage I | F2365 | 4b | 2,905,187 | 2920 | 67 | 38.04 | Cheese | USA | 1985 |
| Lineage I | LL195 | 4b | 2,936,689 | 2920 | 67 | 38.01 | - | Switzerland | 1983–1987 |
| Lineage I | SLCC2482 | 7 | 2,972,810 | 2968 | 67 | 37.95 | Human | - | 1966 |
| Lineage I | SLCC2378 | 4e | 2,972,172 | 2968 | 66 | 37.95 | Poultry | - | - |
| Lineage I | SLCC2540 | 3b | 2,966,146 | 2994 | 67 | 38.08 | Human | USA | 1956 |
| Lineage I | SLCC2755 | 1/2b | 2,907,142 | 2972 | 67 | 38.01 | Chinchilla | - | 1967 |
| Lineage I | J1816 | 4b | 2,947,460 | 3060 | 58 | 37.97 | Turkey deli meat | USA | 2002 |
| Lineage I | J1-220 | 4b | 3,032,271 | 3088 | 67 | 37.94 | Vegetable | USA | 1979 |
| Lineage I | CFSAN006122 | - | 2,906,670 | 2922 | 67 | 38 | Cheese | USA | 2013 |
| Lineage I | J2-064 | 1/2b | 2,943,218 | 2945 | 58 | 38 | Cow | - | - |
| Lineage I | NE dc2014 | - | 2,904,662 | 2920 | 67 | 38 | Cheese | - | - |
| Lineage I | J2-1091 | - | 2,981,886 | 3025 | 67 | 38 | Animal | USA | 1995 |
| Lineage I | J1776 | 4b | 2,953,719 | 2995 | 67 | 37.9 | Turkey deli | USA | 2002 |
| Lineage I | J1817 | 4b | 2,953,716 | 2999 | 67 | 37.9 | Turkey deli | USA | 2002 |
| Lineage I | J1926 | 4b | 2,953,708 | 2996 | 67 | 37.9 | Turkey deli | USA | 2002 |
| Lineage I | N1-011A | - | 3,094,342 | 3169 | 67 | 38 | - | - | - |
| Lineage I | R2-502 | 1/2b | 3,034,043 | 3079 | 67 | 37.9 | - | - | 1994 |
| Lineage I | WSLC1042 | 4b | 2,942,168 | 2974 | 67 | 38 | - | Germany | - |
| Lineage II | 08-5578 | 1/2a | 3,032,288 | 3112 | 58 | 37.96 | Human blood specimen | Canada | 2008 |
| Lineage II | 08-5923 | 1/2a | 2,999,054 | 3063 | 58 | 37.96 | Human | Canada | 2008 |
| Lineage II | 10403S | 1/2a | 2,903,106 | 2944 | 67 | 38.03 | Human skin lesion | USA | 1968 |
| Lineage II | EGD-e | 1/2a | 2,944,528 | 2996 | 67 | 37.98 | Rabbit | UK | 1926 |
| Lineage II | Finland 1998 | 3a | 2,874,431 | 2904 | 67 | 38.05 | - | Finland | 1998 |
| Lineage II | FSL R2-561 | 1/2c | 2,973,801 | 3051 | 67 | 37.96 | - | - | - |
| Lineage II | J0161 | 1/2a | 3,000,464 | 3060 | 58 | 37.86 | Human listeriosis outbreak | - | - |
| Lineage II | SLCC2372 | 1/2c | 2,840,185 | 3037 | 67 | 38.26 | Human | UK | 1935 |
| Lineage II | SLCC2479 | 3c | 2,976,958 | 3031 | 65 | 37.93 | - | - | 1966 |
| Lineage II | SLCC5850 | 1/2a | 2,882,234 | 2976 | 67 | 38.04 | Rabbit | UK | 1924 |
| Lineage II | SLCC7179 | 3a | 2,972,254 | 2927 | 67 | 37.95 | Cheese | Austria | 1986 |
| Lineage II | NCCP No. 15743 | 1/2a | 2,803,433 | 2868 | 67 | 38.1 | - | - | - |
| Lineage II | 6179 | 1/2a | 3,010,620 | 3071 | 49 | 37.9 | Cheese | - | - |
| Lineage II | C1-387 | 1/2a | 2,988,947 | 3043 | 67 | 38 | Turkey breast | New York | 1999 |
| Lineage II | EGD | 1/2a | 2,907,193 | 2969 | 67 | 38 | Animal | - | 1926 |
| Lineage II | J2-031 | 1/2a | 2,958,908 | 3024 | 67 | 38 | Cow | - | 1996 |
| Lineage II | R479a | 1/2a | 2,944,998 | 3008 | 58 | 37.9 | Smoked Salmon | - | - |
| Lineage II | WSLC1001 | 1/2a | 2,951,235 | 3031 | 67 | 38 | - | Germany | - |
| Lineage III | HCC23 | 4a | 2,976,212 | 3048 | 67 | 38.19 | Catfish brain | USA | - |
| Lineage III | L99 | 4a | 2,979,198 | 2911 | 67 | 38.19 | Cheese | Netherlands | 1950 |
| Lineage III | M7 | 4a | 2,976,163 | 3049 | 67 | 38.19 | Cow’s milk | China | - |
| Lineage III | SLCC2376 | 4c | 2,941,360 | 2839 | 67 | 37.99 | Poultry | - | - |
(All genomes listed in the table are complete genomes.)
Fig. 2 Genome comparison and visualization of multiple L. monocytogenes strains. a A tRNA island (TI1) containing 9 tRNA genes, located between two rRNA operons, was absent from the genomes of J1816, J0161, 08-5578, 08-5923, R479a, 6179 and FSL J2-064, resulting in the lower number of tRNAs observed in these strains compared to other L. monocytogenes strains. b tRNA Island 3 (TI3) was absent from the complete genome of L. monocytogenes 6179, but present in the rest of the strains
Fig. 3 Pairwise genome comparison between L. monocytogenes SLCC2376 and L. monocytogenes HCC23, both from lineage III. Three noticeable gaps and insertions, labelled 1, 2 and 3 in circles, can be observed in the genome sequences of HCC23 and SLCC2376; these were predicted by PHAST to be putative prophage regions. Two are intact prophages, whereas the third is a questionable (close to complete) prophage. The green track shows histogram bars, one for each 10 Kbp window in the diagram. The height of each bar represents the total number of bases of the opposite genome aligned to that 10 Kbp window. The upper border of the grey area marks a height of 10 Kbp. A bar higher than 10 Kbp may indicate that the genomic region is non-specific or contains repetitive sequences. A gap may indicate an unmapped region, which could be an insertion such as a prophage
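The histogram track described in the Fig. 3 caption can be illustrated with a small sketch. This is not the PGC tool's code: the input format (a list of half-open `(start, end)` alignment intervals on one genome) and the names `window_coverage` and `classify_windows` are assumptions made for illustration. Only the 10 Kbp window size and the interpretation of bar heights come from the caption.

```python
from typing import List, Tuple

WINDOW = 10_000  # 10 Kbp window, as stated in the figure caption


def window_coverage(genome_length: int,
                    aligned: List[Tuple[int, int]],
                    window: int = WINDOW) -> List[int]:
    """Sum, per window, the bases of the opposite genome aligned to it.

    `aligned` holds half-open (start, end) intervals on this genome, one per
    alignment block (hypothetical input format). Overlapping blocks are
    counted more than once, so a bar can exceed the window size.
    """
    n_windows = (genome_length + window - 1) // window
    heights = [0] * n_windows
    for start, end in aligned:
        pos = max(start, 0)
        end = min(end, genome_length)
        while pos < end:
            idx = pos // window
            # Stop at the window boundary or the end of the block.
            chunk_end = min((idx + 1) * window, end)
            heights[idx] += chunk_end - pos
            pos = chunk_end
    return heights


def classify_windows(heights: List[int], window: int = WINDOW) -> List[str]:
    """Label each bar following the caption's reading of bar heights."""
    labels = []
    for h in heights:
        if h == 0:
            labels.append("gap")          # unmapped, possibly an insertion
        elif h > window:
            labels.append("repetitive")   # non-specific or repeated region
        else:
            labels.append("aligned")
    return labels
```

With toy intervals, a window covered by two overlapping blocks rises above 10 Kbp, while a window with no alignment shows up as a gap, which is how a prophage insertion would appear in the track.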
Mathematical function for determining pan-genome of the lineages
| Lineage | Formula | Pan-genome |
|---|---|---|
| I, II, III | $Y = 585.7852\,X^{0.448} + 2242.6997$ | Open |
| I | $Y = 353.6843\,X^{0.532} + 2453.6727$ | Open |
| II | $Y = 584.9790\,X^{0.3580} + 2245.4068$ | Open |
| III | $Y = -538.9837\,X^{-0.402} + 3364.3139$ | Closed |
Y represents the pan-genome size and X represents the number of sequenced genomes. For the open pan-genomes, the pan-genome size tends to infinity as X → ∞. The negative exponent of X in the lineage III formula indicates that lineage III has a closed pan-genome, meaning that no new genes are expected to be found when a new genome is sequenced
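The fitted functions in the table can be evaluated directly. The sketch below reads the number that follows X in each formula as its exponent and uses the coefficients from the table. The names `PowerLawFit`, `pan_genome_size` and `pan_genome_state` are illustrative, and classifying by the sign of the exponent simply mirrors the table footnote rather than the authors' fitting procedure.

```python
from typing import NamedTuple


class PowerLawFit(NamedTuple):
    a: float  # coefficient of X**b
    b: float  # exponent of X
    c: float  # constant term


# Coefficients copied from the table above.
FITS = {
    "I, II, III": PowerLawFit(585.7852, 0.448, 2242.6997),
    "I": PowerLawFit(353.6843, 0.532, 2453.6727),
    "II": PowerLawFit(584.9790, 0.3580, 2245.4068),
    "III": PowerLawFit(-538.9837, -0.402, 3364.3139),
}


def pan_genome_size(fit: PowerLawFit, x: float) -> float:
    """Evaluate Y = a * X**b + c for X sequenced genomes."""
    return fit.a * x ** fit.b + fit.c


def pan_genome_state(fit: PowerLawFit) -> str:
    """Positive exponent: Y grows without bound (open); otherwise closed."""
    return "Open" if fit.b > 0 else "Closed"


if __name__ == "__main__":
    for lineage, fit in FITS.items():
        print(lineage, pan_genome_state(fit), round(pan_genome_size(fit, 44)))
```

For lineage III the power term shrinks toward zero as X grows, so Y approaches the constant term 3364.3139, which is why the footnote describes that pan-genome as closed.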
Fig. 4 Prediction of the pan-genome and core genome sizes of L. monocytogenes. a Extrapolation of the pan-genome and core genome sizes gives two separate leaves: the upper leaf represents the pan-genome size and the lower leaf represents the core genome size. b Curve of the number of new genes expected with each subsequent addition of an L. monocytogenes genome. 33 new genes are predicted for each additional genome
Fig. 5 Virulence genes present in different strains, clustered and shown as a heat map. A total of 92 virulence genes are found in Listeria species, and 78 of these are conserved in all 44 L. monocytogenes strains. Lineages I and II of L. monocytogenes contain more virulence genes than lineage III, whereas the majority of the virulence genes vital for pathogenicity are absent in L. innocua and L. marthii
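The notion of virulence genes "conserved in all strains" behind the Fig. 5 heat map amounts to a set intersection over per-strain gene sets. The sketch below is a minimal illustration, not PathoProT itself: the function names are hypothetical, and the strain and gene labels in `TOY_PRESENCE` are placeholders rather than data from ListeriaBase.

```python
from typing import Dict, List, Set, Tuple

# Placeholder data: strain label -> virulence genes detected in that strain.
TOY_PRESENCE: Dict[str, Set[str]] = {
    "strain_1": {"gene_A", "gene_B", "gene_C"},
    "strain_2": {"gene_A", "gene_B"},
    "strain_3": {"gene_A", "gene_C"},
}


def conserved_genes(presence: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes present in every strain."""
    gene_sets = list(presence.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def presence_matrix(
    presence: Dict[str, Set[str]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build a gene-by-strain 0/1 matrix, the raw input for a heat map."""
    genes = sorted(set().union(*presence.values()))
    strains = sorted(presence)
    matrix = [[1 if g in presence[s] else 0 for s in strains] for g in genes]
    return genes, strains, matrix
```

Rows of the matrix that contain only ones correspond to conserved genes; clustering the rows and columns, which is not shown here, would then group strains with similar virulence profiles.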
Fig. 6 Differences between the Auto protein sequences of serotype 4 and non-serotype 4 strains of L. monocytogenes. a The domains present in the Auto protein sequence (FlgJ domain and 7 SH3_8 domains) in L. monocytogenes serotype 4 strains. b The domains present in the Auto protein sequence (FlgJ domain and 4 SH3_8 domains) in L. monocytogenes non-serotype 4 strains. c The multiple sequence alignment of the FlgJ domain of the Auto protein sequences of the serotype 4 and the non-serotype 4 strains of L. monocytogenes with FlgJ domain sequence COG1705 as the reference sequence
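As a purely illustrative reading of panels a and b, the SH3_8 domain counts given in the caption could serve as a simple lookup for the two groups. This is an assumption-laden sketch and not the authors' procedure: the function name `auto_group` is hypothetical, the domain count is assumed to come from a separate domain annotation step that is not shown, and any other count is left undetermined.

```python
def auto_group(sh3_8_domains: int) -> str:
    """Map the number of SH3_8 domains in an Auto protein to a group.

    The counts 7 and 4 are taken from the Fig. 6 caption; every other
    value is reported as undetermined rather than guessed.
    """
    if sh3_8_domains == 7:
        return "serotype 4"
    if sh3_8_domains == 4:
        return "non-serotype 4"
    return "undetermined"
```

A sequence-level check of the FlgJ domain signatures shown in panel c would be needed before treating any such lookup as a marker.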