Henk Bolhuis, Ana Belén Martín-Cuadrado, Riccardo Rosselli, Lejla Pašić, Francisco Rodriguez-Valera.
Abstract
BACKGROUND: Haloquadratum walsbyi dominates saturated thalassic lakes worldwide, where it can constitute up to 80-90% of the total prokaryotic community. Despite the abundance of these enigmatic square, flattened cells, only 7 isolates are currently known, and only 2 genomes have been fully sequenced and annotated, because the organism is difficult to grow under laboratory conditions. We performed a transcriptomic analysis of one of these isolates, the Spanish strain HBSQ001, to investigate gene transcription under light and dark conditions.
Keywords: Archaea; Bacteriorhodopsin; Glycoprotein; Halophile; Haloquadratum; Transcriptome
Year: 2017 PMID: 28673248 PMCID: PMC5496347 DOI: 10.1186/s12864-017-3892-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
H. walsbyi HBSQ001 general RNA sequencing statistics under dark and light growth conditions
| Statistic | Dark | Light |
|---|---|---|
| Chromosome length (nt) | 3,132,794 | |
| Total number of PE100 reads mapped to chromosome | 111,441,003 | 115,537,266 |
| Reads mapped to ribosomal operons | 88,414,038 | 90,353,332 |
| Reads mapped to CDS | 3,272,328 | 3,925,912 |
| Percent reads mapped to CDS | 2.9% | 3.4% |
| Percent reads mapped to ribosomal operons | 79.3% | 78.2% |
| Percent reads mapped to Intergenic regions | 16.5% | 17.2% |
| Percent reads mapped to UTRs | 15.1% | 15.2% |
| Percent reads mapped to non-UTR intergenic regions | 1.5% | 2.0% |
| Percent reads mapped to SRP RNA | 0.31% | 0.28% |
| Percent reads mapped to RNase P | 0.88% | 0.95% |
| Average coverage per nucleotide | 199× | 245× |
| Median number of reads per CDS | 370 | 430 |
| Average number of reads per CDS | 1155 | 1386 |
| Median TPM per CDS | 8.8 | 9.9 |
| Average TPM per CDS | 21.3 | 23.4 |
| Percentage of CDS expressed (>Percentile 10 (less than 37 and 43 reads) | 17.6% | 17.2% |
| Percentage of CDS expressed (>0 reads) (6 CDS) | 94.6% | 94.9% |
| GC | 55% | 54% |
CDS, coding DNA sequence; IR, intergenic region
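
The percentage rows in the table above can be re-derived from the read counts. The following is a minimal Python sketch, assuming each percentage is the mapped count divided by the total number of reads mapped to the chromosome for that condition; the dictionary layout and the `percent` helper are illustrative names, not part of the original analysis.

```python
# Read counts copied from the RNA sequencing statistics table above.
counts = {
    "dark": {"total": 111_441_003, "rrna": 88_414_038, "cds": 3_272_328},
    "light": {"total": 115_537_266, "rrna": 90_353_332, "cds": 3_925_912},
}


def percent(part, total):
    """Return `part` as a percentage of `total`, rounded to one decimal."""
    return round(100.0 * part / total, 1)


# Assumption: percentages are relative to all reads mapped to the chromosome.
summary = {
    cond: {
        "pct_cds": percent(c["cds"], c["total"]),
        "pct_rrna": percent(c["rrna"], c["total"]),
    }
    for cond, c in counts.items()
}

for cond, stats in summary.items():
    print(cond, stats)
```

Under that assumption the computed values match the tabulated 2.9% and 3.4% for CDS and 79.3% and 78.2% for the ribosomal operons.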
Fig. 1 Comparison of gene expression between the light and dark samples for the protein-coding sequences (a) and intergenic regions (b). The expression levels in TPM were log transformed
Fig. 2 Levels of transcription (TPM) of the coding (a) and non-coding regions (b) of H. walsbyi HBSQ001. Positions of the rRNA operons, cdc6 genes, IS elements and genes encoding ribosomal proteins are indicated (see legend). Only the products (protein or RNA) encoded by genes with TPM > 300 are indicated. c Codon adaptation index (CAI) for each of the genes. d Percentage of hypothetical proteins in a window of 100 and e regions with a GC-content deviation greater than 2.5 × SD. The blue bands indicate the positions of the ribosomal RNA gene clusters. The two pink bands indicate the two regions of low expression in the genome
Highest expressed genes with expression levels (TPM) more than 10 times the average per coding sequence, sorted in order of expression level
| Gene designation | Gene | TPM | Function |
|---|---|---|---|
| HQ1207A | | 1843 | Cell surface glycoprotein |
| HQ1014A | | 1833 | Energy conversion |
| HQ1205A | HQ1205A | 1050 | Cell surface glycoprotein/adhesin |
| HQ2461A | | 671 | Superoxide dismutase |
| HQ1415A | | 615 | Cell division protein |
| HQ3385A | | 610 | Elongation factor 1-alpha |
| HQ1782A | | 542 | Gas vesicle production |
| HQ2545A | | 456 | Iron transport |
| HQ1276A | | 447 | Putative thiamine biosynthetic enzyme |
| HQ1706A | | 432 | Fe-S cluster assembly ATPase |
| HQ3729A | | 374 | Sulfurtransferase |
| HQ1253A | | 370 | 50S ribosomal protein L44e |
| HQ3408A | | 369 | Transcription initiation factor IIB |
| HQ1501A | HQ1501A | 363 | Uncharacterized protein |
| HQ3391A | | 346 | 30S ribosomal protein S12 |
| HQ2843A | HQ2843A | 346 | Uncharacterized protein, DUF171 family |
| HQ1707A | | 331 | Fe-S cluster assembly |
| HQ1172A | HQ1172A | 318 | RNA binding / TRAM domain protein |
| HQ1100A | HQ1100A | 279 | Uncharacterized protein |
| HQ1133A | HQ1133A | 276 | Uncharacterized protein |
| HQ3117A | | 274 | Aconitate hydratase (TCA cycle enzyme) |
| HQ3366A | HQ3366A | 269 | Uncharacterized protein |
| HQ3457A | | 258 | Ferredoxin (2Fe-2S), electron transfer |
| HQ2649A | HQ2649A | 247 | Uncharacterized protein, DUF293 domain |
| HQ3356A | | 243 | Pyruvate:ferredoxin oxidoreductase |
| HQ2902A | | 243 | 50S ribosomal protein L10 |
| HQ1283A | | 235 | 50S ribosomal protein L43 |
| HQ3242A | | 234 | V-type ATP synthase subunit D |
| HQ3309A | | 232 | 30S ribosomal protein S8e |
| HQ3020A | | 223 | Peroxiredoxin |
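
The selection rule in the caption (TPM more than 10 times the average per coding sequence) can be sketched in Python as follows. The record does not state which average was used for the cutoff, so the sketch takes the mean TPM as a parameter and tries both the dark (21.3) and light (23.4) averages from the statistics table; the function name `highly_expressed` and the four-gene subset are illustrative assumptions, not the authors' pipeline.

```python
# A small subset of (gene designation, TPM) pairs copied from the table above.
genes = [
    ("HQ1207A", 1843),
    ("HQ1014A", 1833),
    ("HQ3309A", 232),
    ("HQ3020A", 223),
]


def highly_expressed(tpm_by_gene, mean_tpm, fold=10.0):
    """Return genes whose TPM exceeds fold * mean_tpm, highest first."""
    cutoff = fold * mean_tpm
    hits = [(gene, tpm) for gene, tpm in tpm_by_gene if tpm > cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Assumption: the cutoff is based on one of the per-condition averages.
dark_hits = highly_expressed(genes, mean_tpm=21.3)   # cutoff near 213 TPM
light_hits = highly_expressed(genes, mean_tpm=23.4)  # cutoff near 234 TPM

print([gene for gene, _ in dark_hits])
print([gene for gene, _ in light_hits])
```

With the dark average all four example genes pass, while with the light average the two lowest entries fall below the cutoff, so the tabulated list is sensitive to which average defines the threshold.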
Fig. 3 Expression of regions of interest. (a) Genomic island 1. Arrows indicate the beginning and the end of GI1 according to Martin-Cuadrado et al. 2015 [26]. (b) Bacterioopsin genes bopI and bopII. (c) ATP synthase gene cluster and (d) the gas vesicle synthesis operon
Fig. 4 Median gene expression ranked by COG functional groups of the total protein-coding dataset (a) and of the top 200 highest expressed genes (b)