Ben Sidders, Mike Withers, Sharon L Kendall, Joanna Bacon, Simon J Waddell, Jason Hinds, Paul Golby, Farahnaz Movahedzadeh, Robert A Cox, Rosangela Frita, Annemieke M C Ten Bokum, Lorenz Wernisch, Neil G Stoker.
Abstract
We describe an analysis, applicable to any spotted microarray dataset produced using genomic DNA as a reference, that quantifies prokaryotic levels of mRNA on a genome-wide scale. Applying this to Mycobacterium tuberculosis, we validate the technique, show a correlation between level of expression and biological importance, define the complement of invariant genes and analyze absolute levels of expression by functional class to develop ways of understanding an organism's biology without comparison to another growth condition.
Year: 2007 PMID: 18078514 PMCID: PMC2246267 DOI: 10.1186/gb-2007-8-12-r265
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Probe length normalization and quantified gene expression levels in M. tuberculosis. (a) We found that longer probes correlated with increased fluorescent intensities, which then biased the ppm values we obtained. (b) We were able to remove this bias using a linear regression model. The three distinct groupings visible in the figure are an artifact of the probe lengths targeted during their synthesis by PCR. (c) The level of expression for each gene in the genome, as determined by our analysis of chemostat-grown wild-type M. tuberculosis H37Rv, is shown with genes ordered as they appear on the chromosome. (d) The log frequency distribution of mRNA abundances from (c). A clear skew to the right, containing a subset of very highly expressed genes, is typical of the distributions we have found.
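The probe length correction described in Figure 1(a) and (b) can be illustrated with a minimal sketch. It assumes that abundance is estimated from the ratio of the RNA channel to the genomic DNA reference channel, that the length bias is linear on the log scale, and that values are finally rescaled to parts per million (ppm). The function names `fit_line` and `length_corrected_ppm` are hypothetical, and the exact model used in the paper may differ.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def length_corrected_ppm(rna, gdna, probe_len):
    """Remove a linear probe-length trend from log(RNA/gDNA), then scale to ppm.

    Assumption (not from the source): the bias is modelled on the natural-log
    ratio of the two channels and removed by subtracting the fitted trend.
    """
    log_ratio = [math.log(r / g) for r, g in zip(rna, gdna)]
    _, slope = fit_line(probe_len, log_ratio)
    mean_len = sum(probe_len) / len(probe_len)
    # Subtract the fitted length effect, centred on the mean probe length
    corrected = [lr - slope * (length - mean_len)
                 for lr, length in zip(log_ratio, probe_len)]
    linear = [math.exp(c) for c in corrected]
    total = sum(linear)
    # Express each gene as a share of the total signal, in parts per million
    return [1e6 * value / total for value in linear]
```

After this correction, a regression of log ppm on probe length has a slope of zero by construction, and the ppm values sum to one million.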
The 50 most highly expressed genes in vitro
| Rank | Rv | Name | ppm | Function [40] |
| 1 | Rv0009 | ppiA | 13,634 | Protein translation and modification |
| 2 | Rv2527 | Rv2527 | 10,020 | Conserved hypotheticals |
| 3 | Rv3418c | groES | 8,651 | Chaperones-heat shock* |
| 4 | Rv3615c | Rv3615c | 8,531 | Conserved hypotheticals* |
| 5 | Rv0440 | groEL2 | 8,300 | Chaperones-heat shock* |
| 6 | Rv3258c | Rv3258c | 5,430 | Unknown* |
| 7 | Rv3616c | Rv3616c | 5,370 | Conserved hypotheticals* |
| 8 | Rv3477 | PE31 | 5,138 | PE subfamily |
| 9 | Rv2244 | acpM | 4,935 | Synthesis of fatty and mycolic acids* |
| 10 | Rv0060 | Rv0060 | 4,616 | Unknown* |
| 11 | Rv3874 | Rv3874 | 4,490 | Conserved hypotheticals |
| 12 | Rv3875 | esat6 | 4,115 | SP, L, P and A† |
| 13 | Rv3648c | cspA | 4,101 | Adaptations and atypical conditions* |
| 14 | Rv3053c | nrdH | 3,934 | 2'-Deoxyribonucleotide metabolism |
| 15 | Rv3614c | Rv3614c | 3,748 | Conserved hypotheticals* |
| 16 | Rv1078 | pra | 3,582 | Conserved hypotheticals |
| 17 | Rv2780 | ald | 3,577 | Amino acids and amines |
| 18 | Rv1388 | mIHF | 3,499 | Nucleoproteins* |
| 19 | Rv0003 | recF | 3,371 | DNA R, R R and R‡ |
| 20 | Rv3786c | Rv3786c | 3,164 | Unknown |
| 21 | Rv1641 | infC | 2,993 | Protein translation and modification* |
| 22 | Rv3269 | Rv3269 | 2,778 | Chaperones-heat shock |
| 23 | Rv1398c | Rv1398c | 2,777 | Conserved hypotheticals |
| 24 | Rv3407 | Rv3407 | 2,732 | Conserved hypotheticals |
| 25 | Rv2245 | kasA | 2,693 | Synthesis of fatty and mycolic acids* |
| 26 | Rv0685 | tuf | 2,688 | Protein translation and modification* |
| 27 | Rv2145c | wag31 | 2,652 | SP, L, P and A*† |
| 28 | Rv1306 | atpF | 2,620 | ATP-proton motive force* |
| 29 | Rv0005 | gyrB | 2,466 | DNA R, R R and R*‡ |
| 30 | Rv1305 | atpE | 2,455 | ATP-proton motive force* |
| 31 | Rv0016c | pbpA | 2,423 | Murein sacculus and peptidoglycan |
| 32 | Rv1622c | cydB | 2,418 | Electron transport* |
| 33 | Rv3219 | whiB1 | 2,373 | Repressors-activators |
| 34 | Rv3583c | Rv3583c | 2,345 | Repressors-activators |
| 35 | Rv1980c | mpt64 | 2,254 | SP, L, P and A† |
| 36 | Rv1072 | Rv1072 | 2,205 | Other membrane proteins |
| 37 | Rv2457c | clpX | 2,200 | Proteins, peptides and glycopeptides* |
| 38 | Rv1958c | Rv1958c | 2,195 | Unknown |
| 39 | Rv3763 | lpqH | 2,180 | Lipoproteins (lppA-lpr0) |
| 40 | Rv1872c | lldD2 | 2,149 | Aerobic respiration |
| 41 | Rv3461c | rpmJ | 2,133 | Ribosomal protein synthesis* |
| 42 | Rv1361c | PPE19 | 2,127 | PPE family |
| 43 | Rv0097 | Rv0097 | 2,123 | Conserved hypotheticals |
| 44 | Rv1971 | Rv1971 | 2,034 | Virulence |
| 45 | Rv3051c | nrdE | 1,984 | 2'-Deoxyribonucleotide metabolism* |
| 46 | Rv2346c | Rv2346c | 1,937 | Conserved hypotheticals |
| 47 | Rv3679 | Rv3679 | 1,931 | Anions* |
| 48 | Rv1298 | rpmE | 1,883 | Ribosomal protein synthesis* |
| 49 | Rv0108c | Rv0108c | 1,837 | Unknown |
| 50 | Rv2193 | ctaE | 1,793 | Aerobic respiration* |
*Essential genes (TraSH) [26,27]. †Surface polysaccharides, lipopolysaccharides, proteins and antigens. ‡DNA replication, repair, recombination and restriction-modification.
Figure 2. Microarray analysis validation. There is a strong correlation (0.86, Spearman's rank, p < 0.0001) between mRNA levels as predicted by our microarray analysis and mRNA copy number as determined by RTq-PCR.
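Spearman's rank correlation, the statistic quoted in Figure 2, is the Pearson correlation of the ranks of the two variables. The sketch below is a plain-Python illustration with tied values given their average rank. The function names are hypothetical, and it does not reproduce the study's data or its p-value calculation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only ranks are used, the statistic is unaffected by the different scales of the two measurements (ppm from the microarray, copy number from RTq-PCR), which is one reason a rank-based measure suits this kind of comparison.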
Figure 3. Correlations between mRNA and biological importance. (a) Proteins identified by two-dimensional PAGE/MS [24] correlate with the most highly expressed genes (Chi-squared test for trend in proportions = 251.9, df = 1, p value < 0.0001). (b) Similarly, there is a significant relationship between expression level and essentiality as determined by TraSH [7,26,27] (Chi-squared test for trend in proportions = 161.2, df = 1, p value < 0.0001).
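A chi-squared test for trend in proportions, as named in Figure 3, asks whether the proportion of items with some property (for example, essential genes) rises or falls steadily across ordered groups (for example, bins of increasing expression). The sketch below implements one common form of this test, often called the Cochran-Armitage trend test, with one degree of freedom. How the genes were binned in the study is not stated here, so the grouping, the equally spaced scores and the function name `chi2_trend` are assumptions.

```python
import math

def chi2_trend(successes, totals, scores=None):
    """Chi-squared test for trend in proportions (1 degree of freedom).

    successes[i] of the totals[i] items in ordered group i have the property.
    Returns (chi-squared statistic, p-value).
    """
    if scores is None:
        # Assumption: equally spaced scores 1, 2, 3, ... for the ordered groups
        scores = list(range(1, len(totals) + 1))
    n_total = sum(totals)
    pooled = sum(successes) / n_total
    mean_score = sum(n * s for n, s in zip(totals, scores)) / n_total
    # Trend statistic: successes weighted by centred group score
    trend = sum(x * (s - mean_score) for x, s in zip(successes, scores))
    variance = pooled * (1 - pooled) * sum(
        n * (s - mean_score) ** 2 for n, s in zip(totals, scores))
    chi2 = trend * trend / variance
    # Upper tail of a chi-squared distribution with df = 1
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With invented counts of 10, 20 and 30 successes in three groups of 100, the statistic is 12.5. When the proportions are equal across groups the statistic is 0 and the p-value is 1.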
The 133 genes of the 'abundant invariome'
| Rank | Rv | Name | Avg ppm | Stdev | Essential |
| 1 | Rv3874 | lhp | 5,414 | 3,950 | - |
| 2 | Rv3418c | groES | 5,189 | 2,593 | |
| 3 | Rv0440 | groEL2 | 4,438 | 2,385 | |
| 4 | Rv3615c | Rv3615c | 3,887 | 2,539 | |
| 5 | Rv0009 | ppiA | 3,460 | 4,587 | - |
| 6 | Rv3616c | Rv3616c | 2,619 | 1,457 | |
| 7 | Rv3477 | PE31 | 2,537 | 1,553 | - |
| 8 | Rv2244 | acpM | 2,475 | 1,304 | |
| 9 | Rv3875 | esat6 | 2,472 | 1,229 | - |
| 10 | Rv1398c | Rv1398c | 2,449 | 1,311 | - |
| 11 | Rv3648c | cspA | 2,372 | 1,149 | |
| 12 | Rv2245 | kasA | 2,236 | 481 | |
| 13 | Rv3614c | Rv3614c | 2,232 | 847 | |
| 14 | Rv1307 | atpH | 2,151 | 1,195 | |
| 15 | Rv0667 | rpoB | 2,105 | 563 | |
| 16 | Rv1388 | mihF | 2,103 | 1,013 | |
| 17 | Rv0685 | tuf | 2,100 | 792 | |
| 18 | Rv3583c | Rv3583c | 2,096 | 2,027 | - |
| 19 | Rv3053c | nrdH | 1,930 | 1,339 | - |
| 20 | Rv1133c | metE | 1,915 | 323 | |
| 21 | Rv1072 | Rv1072 | 1,897 | 560 | - |
| 22 | Rv1872c | lldD2 | 1,817 | 574 | - |
| 23 | Rv3461c | rpmJ | 1,814 | 795 | |
| 24 | Rv2457c | clpX | 1,790 | 343 | |
| 25 | Rv0700 | rpsJ | 1,693 | 1,148 | |
| 26 | Rv1078 | pra | 1,643 | 971 | - |
| 27 | Rv1298 | rpmE | 1,529 | 543 | |
| 28 | Rv2840c | Rv2840c | 1,495 | 566 | - |
| 29 | Rv1630 | rpsA | 1,491 | 439 | |
| 30 | Rv0046c | ino1 | 1,488 | 620 | - |
| 31 | Rv1886c | fbpB | 1,464 | 1,168 | - |
| 32 | Rv2196 | qcrB | 1,455 | 472 | |
| 33 | Rv3443c | rplM | 1,447 | 300 | |
| 34 | Rv0701 | rplC | 1,421 | 880 | |
| 35 | Rv0682 | rpsL | 1,419 | 576 | |
| 36 | Rv3219 | whiB1 | 1,384 | 795 | - |
| 37 | Rv0702 | rplD | 1,364 | 619 | |
| 38 | Rv0289 | Rv0289 | 1,351 | 845 | |
| 39 | Rv2200c | ctaC | 1,332 | 1,406 | |
| 40 | Rv1980c | mpt64 | 1,316 | 629 | - |
| 41 | Rv1306 | atpF | 1,246 | 695 | |
| 42 | Rv2193 | ctaE | 1,217 | 334 | |
| 43 | Rv1310 | atpD | 1,184 | 412 | |
| 44 | Rv1174c | Rv1174c | 1,165 | 424 | - |
| 45 | Rv1308 | atpA | 1,148 | 349 | |
| 46 | Rv3051c | nrdE | 1,123 | 578 | |
| 47 | Rv1305 | atpE | 1,086 | 696 | |
| 48 | Rv0683 | rpsG | 1,080 | 541 | |
| 49 | Rv1297 | rho | 1,074 | 281 | |
| 50 | Rv2461c | clpP1 | 1,028 | 346 | - |
| 51 | Rv0655 | mkl | 1,024 | 385 | |
| 52 | Rv3052c | nrdI | 1,018 | 551 | - |
| 53 | Rv3801c | fadD32 | 1,015 | 209 | |
| 54 | Rv0005 | gyrB | 1,011 | 684 | |
| 55 | Rv0704 | rplB | 1,011 | 528 | |
| 56 | Rv3412 | Rv3412 | 1,002 | 141 | - |
| 57 | Rv0250c | Rv0250c | 995 | 439 | - |
| 58 | Rv2460c | clpP2 | 991 | 393 | |
| 59 | Rv2204c | Rv2204c | 963 | 277 | - |
| 60 | Rv3478 | PPE60 | 957 | 360 | - |
| 61 | Rv0703 | rplW | 956 | 556 | |
| 62 | Rv2094c | tatA | 949 | 241 | - |
| 63 | Rv1303 | Rv1303 | 939 | 637 | |
| 64 | Rv3456c | rplQ | 937 | 297 | - |
| 65 | Rv0719 | rplF | 925 | 406 | |
| 66 | Rv0684 | fusA1 | 923 | 367 | |
| 67 | Rv2347c | Rv2347c | 920 | 447 | - |
| 68 | Rv0715 | rplX | 906 | 342 | |
| 69 | Rv1197 | Rv1197 | 906 | 440 | - |
| 70 | Rv1479 | moxR1 | 898 | 162 | |
| 71 | Rv0718 | rpsH | 886 | 243 | |
| 72 | Rv3460c | rpsM | 882 | 373 | - |
| 73 | Rv2194 | qcrC | 870 | 194 | |
| 74 | Rv2195 | qcrA | 868 | 189 | |
| 75 | Rv0860 | fadB | 863 | 420 | - |
| 76 | Rv1309 | atpG | 861 | 401 | |
| 77 | Rv0243 | fadA2 | 859 | 231 | - |
| 78 | Rv3248c | sahH | 850 | 243 | |
| 79 | Rv0020c | TB39.8 | 850 | 224 | - |
| 80 | Rv3584 | lpqE | 845 | 246 | - |
| 81 | Rv1793 | Rv1793 | 836 | 309 | - |
| 82 | Rv3620c | Rv3620c | 827 | 375 | - |
| 83 | Rv1410c | Rv1410c | 815 | 188 | |
| 84 | Rv3459c | rpsK | 809 | 206 | |
| 85 | Rv0483 | lprQ | 805 | 341 | - |
| 86 | Rv3043c | ctaD | 803 | 221 | |
| 87 | Rv3029c | fixA | 801 | 232 | |
| 88 | Rv2868c | gcpE | 799 | 384 | - |
| 89 | Rv1304 | atpB | 796 | 203 | |
| 90 | Rv1642 | rpmI | 784 | 388 | - |
| 91 | Rv1794 | Rv1794 | 781 | 254 | - |
| 92 | Rv0288 | cfp7 | 781 | 286 | - |
| 93 | Rv3810 | pirG | 778 | 131 | |
| 94 | Rv1543 | Rv1543 | 770 | 222 | - |
| 95 | Rv3680 | Rv3680 | 765 | 294 | - |
| 96 | Rv3457c | rpoA | 760 | 293 | |
| 97 | Rv3045 | adhC | 756 | 272 | - |
| 98 | Rv1792 | Rv1792 | 753 | 326 | - |
| 99 | Rv2969c | Rv2969c | 738 | 141 | |
| 100 | Rv1177 | fdxC | 736 | 303 | |
| 101 | Rv3867 | Rv3867 | 735 | 200 | - |
| 102 | Rv1038c | Rv1038c | 724 | 391 | - |
| 103 | Rv2890c | rpsB | 715 | 120 | |
| 104 | Rv3224 | Rv3224 | 709 | 303 | - |
| 105 | Rv3458c | rpsD | 707 | 208 | |
| 106 | Rv2785c | rpsO | 690 | 316 | - |
| 107 | Rv2986c | hupB | 687 | 255 | |
| 108 | Rv0174 | mce1F | 683 | 211 | - |
| 109 | Rv3211 | rhlE | 676 | 251 | - |
| 110 | Rv1436 | gap | 673 | 234 | |
| 111 | Rv0351 | grpE | 672 | 282 | |
| 112 | Rv2764c | thyA | 667 | 239 | - |
| 113 | Rv1311 | atpC | 660 | 160 | |
| 114 | Rv0432 | sodC | 657 | 177 | - |
| 115 | Rv1791 | PE19 | 655 | 232 | - |
| 116 | Rv0932c | pstS2 | 652 | 249 | - |
| 117 | Rv2971 | Rv2971 | 645 | 188 | |
| 118 | Rv1300 | hemK | 644 | 220 | |
| 119 | Rv2703 | sigA | 643 | 114 | |
| 120 | Rv2203 | Rv2203 | 633 | 196 | - |
| 121 | Rv0423c | thiC | 614 | 149 | |
| 122 | Rv2587c | secD | 602 | 267 | - |
| 123 | Rv1887 | Rv1887 | 601 | 148 | - |
| 124 | Rv0313 | Rv0313 | 588 | 136 | - |
| 125 | Rv0502 | Rv0502 | 559 | 147 | - |
| 126 | Rv3841 | bfrB | 542 | 107 | - |
| 127 | Rv2115c | Rv2115c | 529 | 106 | - |
| 128 | Rv3587c | Rv3587c | 526 | 68 | - |
| 129 | Rv2110c | prcB | 516 | 151 | |
| 130 | Rv1987 | Rv1987 | 495 | 136 | - |
| 131 | Rv2454c | Rv2454c | 489 | 55 | - |
| 132 | Rv0125 | pepA | 483 | 80 | - |
| 133 | Rv1324 | Rv1324 | 474 | 152 | - |
Using our measure of absolute mRNA levels, we have been able to identify the genes whose level of expression remains consistently high across a variety of growth conditions. These genes remain amongst the top 15% most highly expressed genes across all of the conditions tested. We have termed the genes whose level of expression does not vary greatly as invariant, and, therefore, the subset of genes included in this table is dubbed the 'abundant invariome'.
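The selection rule described above, genes that stay within the top 15% of expression in every condition tested, can be sketched as a set intersection. This is an illustration only: the input layout (one dictionary of gene to ppm per condition), the handling of ties at the cutoff and the function names are assumptions, and any further variance filter the authors may have applied is not modelled.

```python
def top_fraction(ppm_by_gene, fraction=0.15):
    """Return the set of genes in the top `fraction` by ppm for one condition."""
    ranked = sorted(ppm_by_gene, key=ppm_by_gene.get, reverse=True)
    # Simplification: ties at the cutoff are broken by sort order
    cutoff = max(1, round(len(ranked) * fraction))
    return set(ranked[:cutoff])

def abundant_invariome(conditions, fraction=0.15):
    """Genes that fall in the top `fraction` in every condition supplied."""
    top_sets = [top_fraction(condition, fraction) for condition in conditions]
    return set.intersection(*top_sets)
```

A gene that is very highly expressed in one dataset but drops below the cutoff in any other is excluded, which is what restricts the list to consistently abundant genes.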
Microarray datasets used in this study
| Dataset | Description | Origin | Reference | Data storage |
| 1 | Wild-type Mtb H37Rv aerobic chemostat | CAMR, UK | [20] | BμG@Sbase: E-BUGS-60 |
| 2 | Wild-type Mtb H37Rv low oxygen chemostat - 0.2% DOT | CAMR, UK | [20] | BμG@Sbase: E-BUGS-60 |
| 3 | Wild-type Mtb H37Rv aerobic rolling batch culture | RVC, UK | Unpublished | BμG@Sbase: E-BUGS-60 |
| 4 | Wild-type M. bovis AF2122/97 aerobic chemostat | VLA, UK | Unpublished | BμG@Sbase: E-BUGS-60 |
| 5 | Wild-type M. bovis AF2122/97 aerobic rolling batch culture | VLA, UK | Unpublished | BμG@Sbase: E-BUGS-60 |
| 6 | Wild-type Mtb H37Rv harvested from macrophages | SGUL, UK | Unpublished | BμG@Sbase: E-BUGS-60 |