Li-Fang Chou, Ting-Wen Chen, Yi-Ching Ko, Ming-Jeng Pan, Ya-Chung Tian, Cheng-Hsun Chiu, Petrus Tang, Cheng-Chieh Hung, Chih-Wei Yang.
Abstract
Leptospira santarosai serovar Shermani is the most frequently encountered serovar in Taiwan, where it causes leptospirosis and tubulointerstitial nephritis. This study aims to complete the genome sequence of L. santarosai serovar Shermani and to analyze its transcriptional responses to renal tubular cells. To assemble this highly repetitive genome, we combined reads from four next-generation sequencing platforms using hybrid assembly approaches and validated the result with optical restriction maps and Sanger sequencing, yielding gap-free contiguous sequences for both chromosomes. Whole-genome comparison revealed a 28-kb region in L. santarosai serovar Shermani that contains genes encoding transposases and hypothetical proteins; this region is absent from other pathogenic Leptospira spp. We found that lipoprotein gene expression in both L. santarosai serovar Shermani and L. interrogans serovar Copenhageni was upregulated upon interaction with renal tubular cells, and that LSS19962, an L. santarosai serovar Shermani-specific gene located within the 28-kb region encoding hypothetical proteins, was upregulated in L. santarosai serovar Shermani-infected renal tubular cells. Lipoprotein expression during leptospiral infection might facilitate the interactions of leptospires within kidneys. The whole-genome sequence of L. santarosai serovar Shermani would be the first completed sequence of this species, and its comparison with those of other Leptospira spp. may provide invaluable information for further studies of leptospiral pathogenesis.
Keywords: Leptospira santarosai; hypothetical proteins; leptospirosis; repetitive genome; whole-genome sequencing
Year: 2014 PMID: 26038504 PMCID: PMC4274889 DOI: 10.1038/emi.2014.78
Source DB: PubMed Journal: Emerg Microbes Infect ISSN: 2222-1751 Impact factor: 7.163
Genome features of Leptospira spp.
| Features | Pathogenic | | | | | | Saprophytic |
|---|---|---|---|---|---|---|---|
| Genomic structures | CI, CII | CI, CII | CI, CII | CI, CII | CI, CII | CI, CII | CI, CII, p74 |
| Size (Mb) | 3.98 | 4.63 | 4.7 | 4.71 | 3.93 | 3.88 | 3.96 |
| GC (%) | 41.82 | 35 | 35 | 35 | 40.2 | 40.2 | 38.9 |
| Gene | 4191 | 3762 | 3741 | 3759 | 3273 | 3242 | 3675 |
| Coding sequences | 4079 | 3667 | 3683 | 3711 | 2945 | 2880 | 3600 |
| Ribosomal RNAs | 5 | 5 | 5 | 5 | 5 | 5 | 6 |
| GenBank accession number | CP006694; CP006695 | AE016823.1; AE016824.1 | AE010300.2; AE010301.2 | CP001221.1; CP001222.1 | CP000348.1; CP000349.1 | CP000350.1; CP000351.1 | CP000777.1; CP000778.1; CP000779.1 |
Abbreviations: bp, base pair; CI, chromosome I; CII, chromosome II; Mb, mega base pair.
Figure 1. Workflow for the de novo assembly of the L. santarosai serovar Shermani genome. CI, chromosome I; CII, chromosome II; OM, optical mapping.
Statistics for de novo assembly of the L. santarosai serovar Shermani strain LT821 (ATCC number 43286) genome.
Sequencing statistics for the genome mapped reads
| Technology | Number of reads | Coverage | Mean read length (bp) |
|---|---|---|---|
| Illumina 2×75 bp paired-end (500 bp library size) | 7 987 144 | CI: 142X; CII: 143X | 75 |
| Illumina 2×75 bp mate-pair (3000 bp library size) | 27 550 738 | CI: 561X; CII: 548X | 100 |
| Roche 454 | 597 201 | CI: 60X; CII: 60X | 410 |
| PacBio (filtered subreads) | 40 203 | CI: 4.65X; CII: 4.17X | 3514 |
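Assembly statistics by technology (assembler)
| Technology (assembler) | Number of contigs | Min contig length (bp) | Max contig length (bp) | Mean contig length (bp) | N50 (bp) | Total length (bp) |
|---|---|---|---|---|---|---|
| Illumina (paired-end+mate-pair) | 79 | 1055 | 316 426 | 49 532.05 | 101 468 | 3 972 582 |
| Roche 454 | 243 | 101 | 81 757 | 15 956.01 | 26 460 | 3 877 310 |
| PacBio (HGAP) | 13 | 8973 | 2 075 818 | 309 448.2 | 2 075 818 | 4 022 826 |
| Illumina (paired-end+mate-pair)+PacBio (AHA) | 66 | 169 | 758 047 | 60 811.83 | 366 311 | 4 013 581 |
| Roche 454+PacBio (AHA) | 41 | 1072 | 519 368 | 98 637.02 | 238 493 | 4 044 118 |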
Abbreviations: bp, base pair; CI, chromosome I; CII, chromosome II; kb, kilobase.
Values in parentheses for the Illumina libraries are library sizes.
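Assembly comparisons like the one tabulated above are usually summarised by contig count, contig length extremes, mean contig length, N50 and total assembled length. The following is a minimal Python sketch of how such a summary can be computed from a list of contig lengths. The function names `n50` and `assembly_summary` are illustrative assumptions and are not taken from the study, which does not state how its statistics were calculated.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty list")


def assembly_summary(lengths: List[int]) -> Dict[str, float]:
    """Summarise an assembly given its contig lengths (hypothetical helper)."""
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": total / len(lengths),
        "n50": n50(lengths),
        "total": total,
    }


if __name__ == "__main__":
    # Toy contig lengths, not data from the study.
    toy = [2, 2, 2, 3, 3, 4, 8, 8]
    print(assembly_summary(toy))
```

One consistency check follows directly from these definitions: the mean contig length multiplied by the number of contigs should reproduce the total length. If the Roche 454 row is read that way, 243 × 15 956.01 ≈ 3 877 310, which agrees with the last value in that row.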
Figure 2. A circular representation of the L. santarosai serovar Shermani genome, with predicted CDSs. (A) Chromosome I; (B) chromosome II. The inner scale is shown in kb. Circles range from 1 (outer circle) to 6 (inner circle). Circles 1 and 3, genes on forward and reverse strands of CDSs; circles 2 and 4, genes on forward and reverse strands of Clusters of Orthologous Group categories. All genes are color-coded according to their functions: red for lipid transport and metabolism (I), lime for carbohydrate metabolism (G), tan for coenzyme transport and metabolism (H), maroon for translation, ribosomal structure and biogenesis (J), blue for cell motility (N), goldenrod for inorganic ion transport and metabolism (P), cyan for post-translational modification, protein turnover and chaperones (O), plum for signal transduction mechanisms (T), yellow for secondary metabolites biosynthesis (Q), green for amino acid transport and metabolism (E), olive for energy production and conversion (C), dark khaki for cell division/chromosome partitioning (D), magenta for nucleotide transport and metabolism (F), indigo for transcription (K), purple for replication, recombination and repair (L), dark cyan for cell wall/membrane/envelope biogenesis (M), dull gray for general function prediction only (R), silver for unknown functions (S). Circle 5, GC content; circle 6, GC bias ((G−C)/(G+C)). This figure was prepared in CGView.
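The two innermost circles plot GC content and GC bias, the latter commonly called GC skew and defined as (G−C)/(G+C). As a rough illustration of how such tracks are derived, the Python sketch below computes both quantities in sliding windows along a sequence. The function names, window size and step are assumptions made for this example; the window settings used by CGView for the figure are not given in the source.

```python
from typing import Iterator, Tuple


def gc_stats(seq: str) -> Tuple[float, float]:
    """Return (GC content, GC skew) for one sequence window.

    GC content = (G + C) / (A + C + G + T)
    GC skew    = (G - C) / (G + C)
    Ambiguous bases such as N are ignored; empty denominators give 0.0.
    """
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    acgt = g + c + seq.count("A") + seq.count("T")
    gc_content = (g + c) / acgt if acgt else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return gc_content, gc_skew


def sliding_gc(seq: str, window: int, step: int) -> Iterator[Tuple[int, float, float]]:
    """Yield (window start, GC content, GC skew) for each full window."""
    for start in range(0, len(seq) - window + 1, step):
        content, skew = gc_stats(seq[start:start + window])
        yield start, content, skew


if __name__ == "__main__":
    # Toy sequence, not genome data from the study.
    for start, content, skew in sliding_gc("GGGCATATGCCC", window=4, step=4):
        print(start, round(content, 2), round(skew, 2))
```

On a bacterial chromosome the sign of the skew typically flips near the replication origin and terminus, which is why it is often plotted alongside GC content on circular maps.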
Unique regions belonging to GIs and containing unique genes in L. santarosai serovar Shermani
| Region (bp) | Unique gene | Pfam description |
|---|---|---|
| Chromosome I | | |
| 571632–596112 | LSS03699; hypothetical protein | Protein of unknown function (DUF1018) |
| | LSS03734; hypothetical protein | - |
| | LSS03744; hypothetical protein | - |
| | LSS03784; hypothetical protein | - |
| | LSS03789; hypothetical protein | - |
| | LSS03839; hypothetical protein | - |
| 901306–925137 | LSS16066; hypothetical protein | Serine dehydrogenase proteinase |
| | LSS16071; hypothetical protein | HNH endonuclease |
| | LSS16076; hypothetical protein | - |
| | LSS16081; hypothetical protein | - |
| | LSS20376; hypothetical protein | - |
| 1241453–1270000 | LSS19962; hypothetical protein | Peptidase C39-like family |
| | LSS20690; hypothetical protein | Domain of unknown function (DUF3368) |
| 1637637–1664268 | LSS18234; hypothetical protein | - |
| | LSS18249; hypothetical protein | - |
| | LSS18254; hypothetical protein | - |
| | LSS18259; hypothetical protein | - |
| | LSS18294; hypothetical protein | - |
| | LSS20077; hypothetical protein | - |
| 2062735–2081773 | LSS09598; hypothetical protein | - |
| | LSS09603; hypothetical protein | - |
| | LSS22020; hypothetical protein | Reverse transcriptase (RNA-dependent DNA polymerase); group II intron, maturase-specific domain |
| | LSS22035; hypothetical protein | - |
| | LSS22040; hypothetical protein | Reverse transcriptase (RNA-dependent DNA polymerase); group II intron, maturase-specific domain |
| 2239208–2254431 | LSS16416; hypothetical protein | Aminoglycoside 3- |
| | LSS16441; hypothetical protein | - |
| | LSS22130; hypothetical protein | Macrocin-O-methyltransferase (TylF) |
| 2663206–2678060 | LSS15106; hypothetical protein | - |
| | LSS15111; hypothetical protein | - |
| | LSS15146; hypothetical protein | - |
| | LSS15156; hypothetical protein | - |
| | LSS15191; hypothetical protein | - |
Figure 3. A circular genome map of L. santarosai serovar Shermani compared with other pathogenic Leptospira spp. BLASTp comparisons of chromosome I (A) and chromosome II (B) against pathogenic Leptospira spp., using L. santarosai serovar Shermani as the reference, were performed with the CGView Comparison Tool. The circles are colored according to the percent identity of matches (black to light red, 100%–50% identity; blue to light blue, 50%–10% identity; colorless, 0% identity). From the inner to the outer circle in A: GC skew and GC content of L. santarosai serovar Shermani; L. borgpetersenii serovar Hardjo-bovis strains JB197 and L550; L. interrogans serovar Copenhageni strain Fiocruz L1-130; L. interrogans serovar Lai strains 56601 and IPAV; forward- and reverse-strand CDSs of L. santarosai serovar Shermani. From the inner to the outer circle in B: GC skew and GC content of L. santarosai serovar Shermani; L. interrogans serovar Copenhageni strain Fiocruz L1-130; L. borgpetersenii serovar Hardjo-bovis strain JB197; L. interrogans serovar Lai strains 56601 and IPAV; L. borgpetersenii serovar Hardjo-bovis strain L550; forward- and reverse-strand CDSs of L. santarosai serovar Shermani.
Genes for leptospiral gene expression analysis in cell-based infection models.
| LSS locus | Product | LIC locus |
|---|---|---|
| LSS19962 | Hypothetical protein | Not found |
| LSS13769 | Hypothetical protein | Not found |
| LSS12422 | Hypothetical protein | Not found |
| LSS12447 | Hypothetical protein | Not found |
| LSS08624 | Hypothetical protein | Not found |
| LSS01089 | Hypothetical protein | LIC13050 |
| LSS08269 | Hypothetical protein | LIC13236 |
| LSS02919 | Hypothetical protein | LIC12708 |
| LSS01907 | Hypothetical protein | LIC11052 |
| LSS00500 | Hypothetical protein | LIC10376 |
| LSS03359 | Hypothetical protein | LIC12339 |
| LSS14871 | Hypothetical protein | LIC10639 |
| LSS18953 | LipL32 | LIC11352 |
| LSS15341 | LipL21 | LIC10011 |
| LSS00320 | LipL36 | LIC13060 |
| LSS16716 | FlaB | LIC11890 |
| LSS16476 | PseA | Not found |
| LSS22895 | Hypothetical protein | LIC12676 |
| LSS16296 | Lsa24 | LIC12906 |
| LSS14677 | OmpL37 | LIC12263 |
| LSS21190 | Imelysin | LIC10711 |
The LSS locus tag corresponds to the L. santarosai serovar Shermani genome; the LIC locus tag corresponds to the L. interrogans serovar Copenhageni genome.
Figure 4. A comparative analysis of differential leptospiral gene expression in L. santarosai serovar Shermani-infected HK-2 cells (A) and L. interrogans serovar Copenhageni-infected HK-2 cells (B). Data are represented as the means±SD of three independent experiments.