Xiao Liu, Baojin Wang, Luo Xu.
Abstract
Methods for identifying essential genes currently depend predominantly on biochemical experiments. However, there is demand for improved computational methods for determining gene essentiality. In this study, we used the Hurst exponent, a characteristic parameter that describes long-range correlation in DNA, and analyzed its distribution in 33 bacterial genomes. In most genomes (31 out of 33), the significance levels of the Hurst exponents of the essential genes were substantially higher than those of the corresponding full gene set, whereas the significance levels for the nonessential genes remained unchanged or increased only slightly. With one exception, the Hurst exponents of the essential genes followed a normal distribution. We therefore propose that the distribution of the Hurst exponents of essential genes can be used as a classification index for essential gene prediction in bacteria. For computer-aided design in synthetic biology, this feature can serve as a constraint for pre- or post-design checking of bacterial essential genes. Moreover, considering the relationship between gene essentiality and evolution, the Hurst exponent could be used as a descriptive parameter related to evolutionary level, or be added to the annotation of each gene.
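As a rough illustration of how a Hurst exponent can be estimated for a single gene sequence, the sketch below uses classical rescaled-range (R/S) analysis. This is a substitute chosen for simplicity: the table caption in this record names a spectral method (hurstSpec), which is not reproduced here. The purine/pyrimidine encoding, the function names, and the window sizes are all assumptions made for the example.

```python
import math


def dna_to_series(seq):
    """Encode purines (A, G) as +1 and pyrimidines (C, T) as -1.

    The encoding is an assumption for illustration; other symbols are skipped.
    """
    code = {"A": 1.0, "G": 1.0, "C": -1.0, "T": -1.0}
    return [code[b] for b in seq.upper() if b in code]


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window has zero variance."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = lo = hi = sq = 0.0
    for x in chunk:
        dev = x - mean
        cum += dev                      # cumulative deviation from the mean
        lo, hi = min(lo, cum), max(hi, cum)
        sq += dev * dev
    std = math.sqrt(sq / n)
    return (hi - lo) / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    n = len(series)
    xs, ys = [], []
    size = min_window
    while size <= n // 2:               # window sizes double: 8, 16, 32, ...
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(size))
            ys.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(xs) < 2:
        raise ValueError("series too short for an R/S estimate")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    # Ordinary least-squares slope of the log-log points.
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

Calling `hurst_rs(dna_to_series(gene_sequence))` for every gene in a set yields one exponent per gene, and it is the distribution of those values that the abstract discusses. R/S estimates are known to be biased for short series, so values from this sketch should not be compared directly with the published ones.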
Year: 2015 PMID: 26067107 PMCID: PMC4466317 DOI: 10.1371/journal.pone.0129716
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Information on the analyzed objects.
| Analysis organisms | NCBI RefSeq accession number | Gene number (full gene set) | Gene number (essential) | Gene number (nonessential) |
|---|---|---|---|---|
| | NC_005966 | 3307 | 499 | 2594 |
| | NC_000964 | 4175 | 271 | 3904 |
| | NC_016776 | 4290 | 547 | 3743 |
| | NC_004663 | 4778 | 325 | 4453 |
| | NC_006350/006351 | 3398+2329 | 505 | 5222 |
| | NC_007650/007651 | 3276+2356 | 406 | 5226 |
| | NC_002163 | 1576 | 228 | 1395 |
| | NC_011916 | 3885 | 480 | 3224 |
| | NC_000913 | 4140 | 609 | 2923 |
| | NC_000913 | 4140 | 296 | 4077 |
| | NC_008601 | 1719 | 392 | 1329 |
| | NC_000907 | 1610 | 642 | 512 |
| | NC_000915 | 1469 | 323 | 1135 |
| | NC_000962 | 3906 | 614 | 2552 |
| | NC_000962 | 3906 | 771 | 3171 |
| | NC_000962 | 3906 | 687 | 3070 |
| | NC_000908 | 475 | 381 | 94 |
| | NC_002771 | 782 | 310 | 322 |
| | NC_010729 | 2089 | 463 | 1627 |
| | NC_002516 | 5572 | 117 | 5454 |
| | NC_008463 | 5892 | 335 | 960 |
| | NC_004631 | 4352 | 353 | 4005 |
| | NC_004631 | 4352 | 358 | 3906 |
| | NC_016810 | 4446 | 353 | 4035 |
| | NC_016856 | 5315 | 105 | 5210 |
| | NC_003197 | 4451 | 230 | 4228 |
| | NC_004347 | 4065 | 403 | 1103 |
| | NC_009511 | 4850 | 535 | 4315 |
| | NC_002745 | 2582 | 302 | 2281 |
| | NC_007795 | 2767 | 351 | 2541 |
| | NC_003098 | 1813 | 244 | NULL |
| | NC_009009 | 2270 | 218 | 2052 |
| | NC_002505/002506 | 2534+970 | 779 | 2943 |
a No nonessential gene information is provided in DEG.
Significance levels of the normal-distribution test for the 33 objects, based on the hurstSpec method in smoothed mode.
| Analysis organisms | NCBI RefSeq accession number | Full gene set | Essential genes | Nonessential genes |
|---|---|---|---|---|
| | NC_005966 | 0.052 | 0.604 | 0.093 |
| | NC_000964 | 0.004 | 0.439 | 0.004 |
| | NC_016776 | 0.002 | 0.175 | 0.015 |
| | NC_004663 | 0.000 | 0.688 | 0.000 |
| | NC_006350/006351 | 0.000 | 0.645 | 0.001 |
| | NC_007650/007651 | 0.000 | 0.408 | 0.000 |
| | NC_002163 | 0.018 | 0.192 | 0.074 |
| | NC_011916 | 0.000 | 0.757 | 0.000 |
| | NC_000913 | 0.000 | 0.807 | 0.075 |
| | NC_000913 | 0.000 | 0.639 | 0.000 |
| | NC_008601 | 0.045 | 0.258 | 0.089 |
| | NC_000907 | 0.037 | 0.711 | 0.291 |
| | NC_000915 | 0.289 | 0.324 | 0.394 |
| | NC_000962 | 0.000 | 0.717 | 0.009 |
| | NC_000962 | 0.000 | 0.431 | 0.001 |
| | NC_000962 | 0.000 | 0.845 | 0.004 |
| | NC_000908 | 0.996 | 0.993 | 0.662 |
| | NC_002771 | 0.131 | 0.894 | 0.133 |
| | NC_010729 | 0.000 | 0.343 | 0.000 |
| | NC_002516 | 0.001 | 0.978 | 0.001 |
| | NC_008463 | 0.001 | 0.289 | 0.181 |
| | NC_004631 | 0.001 | 0.183 | 0.002 |
| | NC_004631 | 0.001 | 0.503 | 0.024 |
| | NC_016810 | 0.006 | 0.421 | 0.015 |
| | NC_016856 | 0.000 | 0.904 | 0.000 |
| | NC_003197 | 0.003 | 0.516 | 0.004 |
| | NC_004347 | 0.014 | 0.784 | 0.212 |
| | NC_009511 | 0.002 | 0.167 | 0.005 |
| | NC_002745 | 0.004 | 0.437 | 0.013 |
| | NC_007795 | 0.000 | 0.124 | 0.002 |
| | NC_003098 | 0.220 | 0.177 | NULL |
| | NC_009009 | 0.009 | 0.127 | 0.020 |
| | NC_002505/002506 | 0.002 | 0.000 | 0.053 |
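The significance levels in this table come from a test of normality applied to the Hurst exponents of each gene set. This record does not name the test, so the sketch below assumes a one-sample Kolmogorov–Smirnov test against a normal distribution fitted to the sample, with an asymptotic p-value. The function names and the 0.05 threshold are likewise assumptions. Because the mean and standard deviation are estimated from the same data, the p-values from this approach tend to be conservative.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_normal_pvalue(values):
    """Return (D, p) for a K-S test against a normal fitted to `values`."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    if sigma == 0:
        raise ValueError("values have zero variance")
    d = 0.0
    for i, v in enumerate(sorted(values)):
        cdf = normal_cdf(v, mu, sigma)
        # Largest gap between the empirical and the fitted CDF.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return d, 1.0                   # the series is ~1 in this range
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def looks_normal(values, alpha=0.05):
    """True when normality is not rejected at the assumed level `alpha`."""
    return ks_normal_pvalue(values)[1] >= alpha
```

Under these assumptions, a gene set whose Hurst exponents give `looks_normal(...) == True` would correspond to a table entry of 0.05 or above, which is the pattern reported for the essential genes of most objects.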
Fig 1. Q–Q plots of the Hurst exponents.
A, B, and C: Escherichia coli MG1655 I; D, E, and F: Salmonella enterica subsp. enterica serovar Typhimurium str. 14028S. A and D show Q–Q plots of the Hurst exponents of the full-gene-sets of the two objects from respective organisms. B and E show Q–Q plots of the Hurst exponents of the essential genes of the two objects from respective organisms. C and F show Q–Q plots of the Hurst exponents of the nonessential genes of the two objects from respective organisms.
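For readers who want to reproduce this kind of panel, the following sketch computes the points of a normal Q–Q plot from a list of Hurst exponents. It is an illustration under stated assumptions rather than the procedure used for the figure: the plotting positions (i + 0.5)/n, the use of the sample mean and standard deviation for the reference distribution, and the function names are all choices made for the example.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each ordered value with the matching quantile of a fitted normal.

    Returns a list of (theoretical_quantile, observed_value) tuples.
    """
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    # Plotting positions (i + 0.5) / n keep the probabilities inside (0, 1).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


def qq_correlation(points):
    """Pearson correlation of the Q-Q points; near 1 suggests normality."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den
```

Plotting the returned pairs against the line y = x gives the Q–Q panel. Points that stay close to the line indicate an approximately normal distribution, while systematic curvature at the ends indicates skew or heavy tails.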