Li-Yeh Chuang, Cheng-Huei Yang, Ming-Cheng Lin, Cheng-Hong Yang.
Abstract
BACKGROUND: Genomic islands play an important role in medical, methylation and biological studies. To explore the region, we propose a CpG islands prediction analysis platform for genome sequence exploration (CpGPAP).
Year: 2012 PMID: 22385986 PMCID: PMC3313849 DOI: 10.1186/1471-2156-13-13
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Figure 1. CpGPAP platform flowchart. A: Selection of the optimization algorithm for predicting CpG islands. B: Parameter settings for the optimization algorithm, CpG island-related parameters, and input sequence.
Figure 2. Prediction results showing CpG island-related information, such as the number of CpG islands, CpG island length, start and end positions, input parameters, O/E ratio, and GC content.
Figure 3. Visualization of the CpG island prediction results. A: The GC% chart shows the GC content distribution in the input sequence. B: The O/E ratio chart shows the O/E ratio distribution in the input sequence. C: The TSS chart shows the probability of the predicted CpG island overlapping with a transcription start site. D: The CpG chart shows the distribution of CpG dinucleotides and CpG islands.
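The GC content and O/E ratio reported in these figures can be illustrated with a short sketch. The record does not state the exact formulas CpGPAP uses, so the code below assumes the commonly used definitions: GC% is the share of G and C bases, and the observed/expected CpG ratio is (number of CpG × sequence length) / (number of C × number of G). The function names `gc_content` and `cpg_oe_ratio` are hypothetical and not part of CpGPAP.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_oe_ratio(seq: str) -> float:
    """Return the observed/expected CpG ratio under the assumed definition.

    Assumption: O/E = (#CpG * N) / (#C * #G), where N is the sequence length.
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)
```

Applied to a sliding window along the input sequence, these two values would give the kind of per-position distributions plotted in panels A and B, although the window size and step used by the platform are not given here.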
Comparison of different software functions for CpG island prediction
| Year of publication | 2011 | 2006 | 2003 | 2002 | 2000 | |
|---|---|---|---|---|---|---|
| Type | Web-based | Web-based | Web-based | Web- | Web- | |
| Parameters | √ | √ | √ | √ | √ | |
| CpG island result | √ | √ | √ | √ | √ | |
| CpG dinucleotides | √ | √ | ||||
| TSS | √ | √ | ||||
| O/E ratio bar | √ | √ | ||||
| GC% bar | √ | |||||
| Upload sequence | √ | √ | √ | |||
| Data integrator | √ | |||||
| Standalone | √ | |||||
| version | √ | √ | √ | √ | ||
| Method | CPSO and | cluster | Sliding | Sliding | Sliding | |
TSS: transcription start site. CPSO: complementary particle swarm optimization. CGA: complementary genetic algorithm
Comparison of the number of CpG islands identified in the human genome with different methods (NCBI.36)
| Chromosome Length (bp) | 46,944,329 | |||||||
| Total length of CpG islands | 347,334 | 639,161 | 1,072,192 | 1,280,505 | 1,564,596 | 1,607,472 | 1,262,449 | 1,589,629 |
| Number of islands predicted | 973 | 2,703 | 1,091 | 3,704 | 2,648 | 2,813 | 2,513 | 3,304 |
| Island coverage (%) | 0.73 | 1.36 | 2.28 | 2.73 | 3.3 | 3.4 | 2.68 | 3.39 |
| Island length (bp) | ||||||||
| Average | 357 | 237 | 983 | 346 | 591 | 571 | 502 | 482 |
| Minimum | 101 | 8 | 500 | 200 | 202 | 202 | 201 | 201 |
| Maximum | 3,047 | 3,028 | 6,732 | 1,948 | 4,020 | 4,035 | 6,126 | 10,687 |
| GC-content ± SD (%) | 62.17 ± 0.07 | 65.49 ± 0.07 | 54.49 ± 0.06 | 57.98 ± 0.04 | 53.73 ± 0.05 | 53.72 ± 0.05 | 54.24 ± 0.05 | 55.07 ± 0.05 |
| CpG island O/E ratio ± SD | 0.84 ± 0.1 | 0.87 ± 0.3 | 0.63 ± 0.1 | 0.68 ± 0.1 | 0.64 ± 0.08 | 0.65 ± 0.08 | 0.68 ± 0.1 | 0.71 ± 0.1 |
| Chromosome Length (bp) | 49,691,432 | |||||||
| Total length of CpG islands | 679,803 | 522,748 | 2,067,653 | 2,842,255 | 2,802,675 | 2,907,983 | 2,251,454 | 3,085,715 |
| Number of islands predicted | 1,642 | 2,186 | 1,903 | 6,875 | 4,571 | 4,882 | 3,902 | 4,985 |
| Island coverage (%) | 1.36 | 1.05 | 4.16 | 5.71 | 5.64 | 5.85 | 4.53 | 6.20 |
| Island length (bp) | ||||||||
| Average | 414 | 239 | 1,087 | 413 | 613 | 596 | 577 | 619 |
| Minimum | 200 | 8 | 500 | 200 | 198 | 202 | 201 | 201 |
| Maximum | 7,902 | 7,774 | 8,363 | 3,339 | 4,076 | 4,076 | 5,340 | 5,905 |
| GC-content ± SD (%) | 63.70 ± 0.08 | 70.23 ± 0.08 | 55.84 ± 0.07 | 55.12 ± 0.06 | 54.50 ± 0.07 | 54.46 ± 0.07 | 55.21 ± 0.05 | 56.15 ± 0.06 |
| CpG island O/E ratio ± SD | 0.84 ± 0.1 | 0.95 ± 0.3 | 0.62 ± 0.1 | 0.68 ± 0.1 | 0.63 ± 0.05 | 0.63 ± 0.05 | 0.64 ± 0.1 | 0.68 ± 0.1 |
SD: standard deviation. Island coverage (%): proportion of the chromosome sequence covered by the CpG islands that each method predicts.
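The derived rows of the table follow from the raw counts, which a minimal sketch can make explicit. It assumes that island coverage is total island length divided by chromosome length, and that average island length is total island length divided by the number of islands. The function name `island_summary` is hypothetical.

```python
def island_summary(total_island_bp: int, n_islands: int, chromosome_bp: int):
    """Return (coverage in percent, average island length in bp).

    Assumptions: coverage = total island length / chromosome length * 100,
    average length = total island length / number of islands.
    """
    coverage_pct = 100.0 * total_island_bp / chromosome_bp
    average_bp = total_island_bp / n_islands
    return coverage_pct, average_bp


# Third method column of the first chromosome block in the table above.
coverage_pct, average_bp = island_summary(1_072_192, 1_091, 46_944_329)
```

For that column the sketch gives about 2.28% coverage and an average length of about 983 bp, matching the table. Other columns may differ in the last digit depending on whether the published values were rounded or truncated.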
Figure 4. CpGPAP system flowchart.