Taesung Park, Youngchul Kim, Stefan Bekiranov, Jae K Lee.
Abstract
Statistical analysis of tiling array data is extremely challenging due to the astronomically large number of sequence probes, the high noise levels of individual probes and the limited number of replicates in these data. To overcome these difficulties, we first developed statistical error estimation and weighted ANOVA modeling approaches for high-density tiling array data; the former is based on an advanced error-pooling method that accurately estimates the heterogeneous technical error of small-sample tiling array data. Based on these approaches, we analyzed high-density tiling array data on the temporal replication patterns during cell-cycle S phase of synchronized HeLa cells on human chromosomes 21 and 22. We found many novel temporal replication patterns, identifying about 26% of over 1 million tiling array sequence probes with significant differential replication during the four 2-h time periods of S phase. Among these differentially replicated probes, 126 941 sequence probes were matched to 417 known genes. The majority of these genes were found to be replicated within one or two consecutive time periods, while the others were replicated at two non-consecutive time periods. Also, coding regions were found to be more differentially replicated in particular time periods than noncoding regions in the gene-poor chromosome 21 (25% differentially replicated among genic probes versus 18.6% among intergenic probes), while this phenomenon was less prominent in the gene-rich chromosome 22. Rigorous statistical testing for local proximity of differentially replicated genic and intergenic probes was performed to identify significant stretches of differentially replicated sequence regions. From this analysis, we found that adjacent genes were frequently replicated at different time periods, potentially implying the existence of quite dense replication origins.
Evaluating the conditional probability significance of identified gene ontology terms on chromosomes 21 and 22, we detected some over-represented molecular functions and biological processes among these differentially replicated genes, such as the ones relevant to hydrolase, transferase and receptor-binding activities. Some of these results were confirmed, showing >70% consistency with cDNA microarray data that were independently generated in parallel with the tiling arrays. Thus, our improved analysis approaches, specifically designed for high-density tiling array data, enabled us to reliably and sensitively identify many novel temporal replication patterns on human chromosomes.
Year: 2007 PMID: 17430969 PMCID: PMC1888820 DOI: 10.1093/nar/gkm130
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Estimated local pooled error (LPE) baseline distributions for the four time periods of replication in S phase. The LPE variance estimates of the replicated tiling arrays were found to be a non-increasing function of probe intensity. The left-hand sides of the LPE estimates were thresholded because of artificially reduced variability at low intensities, which is easily seen in the AM plots (see Supplementary Figure 1). The LPE baseline distributions differed significantly in magnitude between time conditions.
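As a rough illustration of the error-pooling idea behind this figure, the sketch below estimates a per-probe error variance by pooling probes of similar intensity. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lpe_baseline`, the equal-count binning, the two-replicate setup and the simulated data are all hypothetical, and flattening the low-intensity side at the peak variance is only one simple way to mimic the thresholding mentioned in the caption.

```python
import random
from statistics import mean


def lpe_baseline(rep1, rep2, n_bins=20):
    """Return [(mean intensity A, pooled error variance)] for each intensity bin."""
    # A = average log-intensity of a probe, M = difference between its replicates.
    pairs = sorted(((x + y) / 2, x - y) for x, y in zip(rep1, rep2))
    size = len(pairs) // n_bins
    baseline = []
    for k in range(n_bins):
        stop = (k + 1) * size if k < n_bins - 1 else len(pairs)
        chunk = pairs[k * size:stop]
        # Assuming E[M] = 0 and independent replicates, Var(M) = 2 * sigma^2.
        sigma2 = mean(m * m for _, m in chunk) / 2
        baseline.append((mean(a for a, _ in chunk), sigma2))
    # Assumed thresholding: bins left of the peak variance are set to the peak.
    peak = max(range(n_bins), key=lambda k: baseline[k][1])
    peak_var = baseline[peak][1]
    return [(a, peak_var if k < peak else v) for k, (a, v) in enumerate(baseline)]


# Simulated duplicate arrays (hypothetical data, not taken from the study).
random.seed(0)
truth = [random.uniform(6, 14) for _ in range(4000)]
noise_sd = [2.0 / t for t in truth]  # noisier at low intensity
rep1 = [t + random.gauss(0, s) for t, s in zip(truth, noise_sd)]
rep2 = [t + random.gauss(0, s) for t, s in zip(truth, noise_sd)]

for a, v in lpe_baseline(rep1, rep2, n_bins=8):
    print(f"A = {a:5.2f}  variance = {v:.4f}")
```

Pooling within intensity bins borrows strength across many probes, which is what makes a variance estimate usable when each probe has only a few replicates. The thresholding step does not by itself smooth the bins to the right of the peak; a production method would typically add a smoother there.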
Figure 2. Frequency of differentially replicated probes on each 500-kb interval of chromosomes 21 and 22. The start and end parts of both chromosomes have a much higher concentration of differential replication, and the number of these probes is larger near the centromere and telomere of the q-arms than at the remaining chromosome positions.
Distribution of differentially replicated probes
| FDR cutoff | Chromosome 21: Total | Chromosome 21: Number of coding probes | Chromosome 21: Number of genes | Chromosome 22: Total | Chromosome 22: Number of coding probes | Chromosome 22: Number of genes |
|---|---|---|---|---|---|---|
| 5E−2 | 113 841 | 50 929 | 154 | 157 899 | 81 887 | 256 |
| 5E−3 | 67 114 | 31 559 | 109 | 101 121 | 53 111 | 181 |
| 5E−4 | 46 950 | 21 468 | 85 | 72 100 | 37 414 | 137 |
| 5E−5 | 34 764 | 15 651 | 65 | 60 910 | 58 677 | 30 260 |

Frequencies of differentially replicated probes in coding and noncoding regions on chromosomes 21 and 22.

| Chromosome | Region | Significant | Non-significant | Total |
|---|---|---|---|---|
| Chromosome 21 | Genic | 50 929 (24.97%) (154 genes) | 153 049 (75.03%) | 203 978 (336 genes) |
| | Intergenic | 62 912 (18.61%) | 275 065 (81.39%) | 337 977 |
| | Total | 113 841 (21.02%) | 427 583 (78.98%) | 541 424 |
| Chromosome 22 | Genic | 81 887 (29.63%) (256 genes) | 194 398 (70.37%) | 276 285 (688 genes) |
| | Intergenic | 76 012 (32.20%) | 159 995 (67.80%) | 236 007 |
| | Total | 157 889 (30.85%) | 353 871 (69.15%) | 511 760 |
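The genic and intergenic percentages quoted in the abstract (about 25% versus 18.6% on chromosome 21) can be recomputed from the probe counts in the table above. The sketch below does that and adds a pooled two-proportion z statistic as a rough measure of the genic versus intergenic difference. The z-test is an illustrative assumption, not the test used in the paper, and it treats probes as independent, which neighboring tiling probes are unlikely to be; the function names are hypothetical.

```python
from math import erfc, sqrt

# Probe counts copied from the table above: (significant, total) per region.
counts = {
    ("chr21", "genic"): (50_929, 203_978),
    ("chr21", "intergenic"): (62_912, 337_977),
    ("chr22", "genic"): (81_887, 276_285),
    ("chr22", "intergenic"): (76_012, 236_007),
}


def percent_significant(significant, total):
    """Percentage of probes called differentially replicated."""
    return 100.0 * significant / total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided normal p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2))


for chrom in ("chr21", "chr22"):
    genic = counts[(chrom, "genic")]
    intergenic = counts[(chrom, "intergenic")]
    z, p = two_proportion_z(*genic, *intergenic)
    print(
        chrom,
        f"genic {percent_significant(*genic):.2f}%",
        f"intergenic {percent_significant(*intergenic):.2f}%",
        f"z = {z:.1f}",
    )
```

A positive z means the genic fraction is the larger one. With probe counts this large, almost any difference is nominally significant, so the size and sign of the difference are more informative than the p-value.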
Figure 3. Replication patterns for four genes, HASF2BP, COL6A2, PCNT2 and ANKRD21, with differential replication over time. These genes are classified as early (0–2 h), middle (2–6 h) or late (6–8 h) replicating genes. For example, Figure 3A shows that HASF2BP peaks at the early replication time; each line in the figure represents a sequence probe for this gene.
Figure 4. Replication patterns of the gene BAGE3, divided into six consecutive regions with tightly homogeneous expression patterns across the replication times. Sequence probes of BAGE3 seem to have homogeneous replication patterns, with minor variation according to their physical positions.
Figure 5. Replication timing and exon density of differentially replicated probes on (A) chromosome 21 and (B) chromosome 22 for the entire time period of S phase. The replication period (y-axis), averaged over the multiple probes of each gene with differential temporal expression, is shown along the chromosomal position (x-axis) and compared with the frequency of exon probes at the same position. Even though some genes are close together, their replication times seem to be quite different, and exon-dense regions exhibited little or no difference compared to other regions.
Overrepresented GO terms of the genes with temporal differential replication on chromosomes 21 and 22 from 700-bp window analysis
| Best GOs | Gene symbols | Number of annotated genes/total genes | FDR |
|---|---|---|---|
| Lipid transport | ABCG1 APOL4 APOL6 APOL2 APOL5 OSBP2 APOL1 APOL3 | 8/84 | 0.0611 |
| Glutathione biosynthesis | GGT1 GGTLA1 GGT2 | 3/8 | 0.0611 |
| Cyanate metabolism | TST MPST | 2/2 | 0.0611 |
| Cyanate catabolism | TST MPST | 2/2 | 0.0611 |
| One-carbon-compound catabolism | TST MPST | 2/2 | 0.0611 |

| Best GOs | Gene symbols | Number of annotated genes/total genes | GOstat FDR |
|---|---|---|---|
| Hydrolase activity, acting on carbon–nitrogen (but not peptide) bonds, in cyclic amidines | APOBEC3G ARP10 APOBEC3F Q5IFJ4 Q8NFD1 APOBEC3C APOBEC3A | 7/41 | 0.00604 |
| Structural constituent of eye lens | CRYAA CRYBB3 CRYBA4 CRYBB1 CRYBB2 | 5/19 | 0.00618 |
| Gamma-glutamyl transferase activity | GGT1 GGTLA1 Q6ISH0 Q5NV76 GGT2 | 5/22 | 0.00889 |
| Glucocorticoid receptor binding | YWHAH NRIP1 | 2/2 | 0.0317 |
| Oncostatin-M receptor binding | LIF OSM | 2/2 | 0.0317 |
| Transferase activity, transferring amino-acyl groups | GGT1 GGTLA1 Q6ISH0 Q5NV76 GGT2 | 5/33 | 0.0317 |
| Oxidoreductase activity, acting on superoxide radicals as acceptor | SOD1 KIAA0179 D21S2056E | 3/9 | 0.0317 |
| Superoxide dismutase activity | SOD1 KIAA0179 D21S2056E | 3/9 | 0.0317 |
| Hydrolase activity, acting on carbon–nitrogen (but not peptide) bonds | APOBEC3G Q8NFD1 ARP10 APOBEC3F Q5IFJ4 UPB1 APOBEC3C HDAC10 APOBEC3A | 9/132 | 0.0361 |
| Protein carrier activity | Q6ICM2 SEC14L2 SEC14L3 SEC14L4 | 4/24 | 0.0361 |
| Carbonyl reductase (NADPH) activity | CBR3 CBR1 | 2/3 | 0.0361 |
| Thiosulfate sulfurtransferase activity | TST MPST | 2/3 | 0.0361 |
| Hematopoietin/interferon-class (D200-domain) cytokine receptor activity | IFNAR1 CSF2RB IL2RB ENSP00000343289 IL17R IFNGR2 IL10RB | 7/85 | 0.038 |
Figure 6. Concordant replication timing between tiling and cDNA array data. The majority of matched pairs of tiling array probe and cDNA clone showed concordant replication times: 29 of 41 pairs (15 on chromosome 21 and 14 on chromosome 22) had exact or adjacent time periods, and 19 pairs (10 on chromosome 21 and 9 on chromosome 22) had exactly the same replication times.