Ang Shan, Fang Zhang, Yihui Luan.
Abstract
Biological time series data play an important role in exploring the dynamic changes of biological systems, and identifying patterns of association between biological factors can further deepen the understanding of how those systems function and interact. Local trend analysis (LTA) is now commonly used in many biological fields, where the time series can be measured at the level of gene expression, OTU abundance, and so on. A local trend score is obtained by taking the degree of similarity between the upward, constant, or downward trends of two time series as an indicator of the correlation between the corresponding biological factors. A major limitation of local trend analysis, however, is that the permutation test used to calculate its statistical significance is time-consuming. Developing a fast and effective method for evaluating the statistical significance of local trend scores has therefore attracted much attention from bioinformatics researchers. In this paper, a new approach is proposed to efficiently approximate the statistical significance in the local trend analysis of dependent time series, and its effectiveness is demonstrated through simulations and real data set analysis.
Keywords: Markov chain model; dependent time series; local trend analysis; spectral decomposition theory; statistical significance
Year: 2022 PMID: 35559007 PMCID: PMC9086404 DOI: 10.3389/fgene.2022.729011
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
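The abstract describes the local trend score and its permutation test only in words, so the following is a minimal Python sketch of the idea rather than the authors' implementation. It rests on several assumptions that this record does not state: a step counts as upward or downward when its relative change exceeds a threshold `t` in magnitude, the score is the longest net run of agreeing (or opposing) trends with no time delay, and the permutation test shuffles one of the original series. The function names are hypothetical.

```python
import random


def trend_series(x, t=0.0):
    """Map a series to +1 (up), -1 (down) or 0 (constant) steps.

    Assumed rule: a step is up/down when its relative change exceeds t in
    magnitude; if the previous value is 0, the raw difference is used.
    """
    out = []
    for prev, curr in zip(x, x[1:]):
        diff = curr - prev
        change = diff / abs(prev) if prev != 0 else diff
        if change > t:
            out.append(1)
        elif change < -t:
            out.append(-1)
        else:
            out.append(0)
    return out


def local_trend_score(dx, dy):
    """Largest |sum of dx[i] * dy[i]| over any contiguous window (zero delay)."""
    best = pos = neg = 0
    for a, b in zip(dx, dy):
        p = a * b
        pos = max(0, pos + p)  # best run of agreeing trends ending here
        neg = max(0, neg - p)  # best run of opposing trends ending here
        best = max(best, pos, neg)
    return best


def permutation_p_value(x, y, t=0.0, n_perm=1000, seed=0):
    """Return the observed score and a permutation p-value (x is shuffled)."""
    rng = random.Random(seed)
    dy = trend_series(y, t)
    observed = local_trend_score(trend_series(x, t), dy)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if local_trend_score(trend_series(shuffled, t), dy) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


x = [1.0, 2.0, 3.0, 2.5, 2.0, 3.0, 4.0, 5.0]
y = [0.5, 1.0, 2.0, 1.5, 1.0, 1.5, 2.5, 3.0]
score, p_value = permutation_p_value(x, y, n_perm=200)
print(score, p_value)
```

Every permutation recomputes the score, so the cost grows with the number of permutations times the series length for each pair of factors. That is the burden the abstract says an approximation is meant to avoid.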
Type I error rate for different methods (the third to fifth columns) in the AR(1) model when t = 0. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.1413 | 0.0470 | 0.0040 |
| | 40 | 0.1444 | 0.0764 | 0.0128 |
| | 60 | 0.1378 | 0.0880 | 0.0169 |
| | 80 | 0.1472 | 0.1040 | 0.0213 |
| | 100 | 0.1380 | 0.1046 | 0.0238 |
| | 200 | 0.1465 | 0.1059 | 0.0283 |
| 0, 0 | 20 | 0.0610 | 0.0170 | 0.0119 |
| | 40 | 0.0613 | 0.0270 | 0.0209 |
| | 60 | 0.0605 | 0.0311 | 0.0257 |
| | 80 | 0.0545 | 0.0363 | 0.0282 |
| | 100 | 0.0551 | 0.0360 | 0.0300 |
| | 200 | 0.0581 | 0.0367 | 0.0357 |
| 0.3, 0.3 | 20 | 0.0518 | 0.0109 | 0.0136 |
| | 40 | 0.0451 | 0.0177 | 0.0272 |
| | 60 | 0.0475 | 0.0179 | 0.0285 |
| | 80 | 0.0408 | 0.0238 | 0.0310 |
| | 100 | 0.0435 | 0.0260 | 0.0349 |
| | 200 | 0.0428 | 0.0254 | 0.0371 |
| 0.3, 0.5 | 20 | 0.0459 | 0.0092 | 0.0135 |
| | 40 | 0.0397 | 0.0165 | 0.0288 |
| | 60 | 0.0379 | 0.0181 | 0.0314 |
| | 80 | 0.0407 | 0.0233 | 0.0334 |
| | 100 | 0.0359 | 0.0237 | 0.0354 |
| | 200 | 0.0345 | 0.0221 | 0.0424 |
| 0.5, 0.5 | 20 | 0.0398 | 0.0091 | 0.0159 |
| | 40 | 0.0414 | 0.0159 | 0.0284 |
| | 60 | 0.0365 | 0.0176 | 0.0314 |
| | 80 | 0.0369 | 0.0199 | 0.0343 |
| | 100 | 0.0355 | 0.0213 | 0.0374 |
| | 200 | 0.0344 | 0.0215 | 0.0428 |
| 0.5, 0.8 | 20 | 0.0412 | 0.0088 | 0.0161 |
| | 40 | 0.0388 | 0.0134 | 0.0277 |
| | 60 | 0.0338 | 0.0145 | 0.0342 |
| | 80 | 0.0319 | 0.0165 | 0.0357 |
| | 100 | 0.0337 | 0.0214 | 0.0411 |
| | 200 | 0.0314 | 0.0170 | 0.0402 |
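The simulation design behind a table like this one can be sketched as follows: generate two independent AR(1) series, test them for association, and record how often the test rejects at α = 0.05. This is an illustrative sketch under stated assumptions, not the authors' code. It assumes standard normal innovations, an arbitrary burn-in of 50 steps, a relative-change trend rule with threshold `t`, a zero-delay local trend score, and a permutation test that shuffles one series. All function names are hypothetical, and the repetition counts are far smaller than the 1,000 permutations and 10,000 simulations in the caption, so the estimate is noisy and is not expected to reproduce the tabulated values.

```python
import random


def ar1(n, rho, rng, burn_in=50):
    """Simulate n points of x[k] = rho * x[k-1] + e[k], with e ~ N(0, 1)."""
    x, out = 0.0, []
    for k in range(n + burn_in):
        x = rho * x + rng.gauss(0.0, 1.0)
        if k >= burn_in:
            out.append(x)
    return out


def trends(x, t=0.0):
    """+1 / -1 / 0 steps; assumed rule: relative change beyond +/- t."""
    out = []
    for prev, curr in zip(x, x[1:]):
        change = (curr - prev) / abs(prev) if prev != 0 else curr - prev
        out.append(1 if change > t else -1 if change < -t else 0)
    return out


def score(dx, dy):
    """Zero-delay local trend score: best contiguous run of (dis)agreement."""
    best = pos = neg = 0
    for a, b in zip(dx, dy):
        pos, neg = max(0, pos + a * b), max(0, neg - a * b)
        best = max(best, pos, neg)
    return best


def perm_p_value(x, y, t, n_perm, rng):
    """Permutation p-value obtained by shuffling x and rescoring against y."""
    dy = trends(y, t)
    observed = score(trends(x, t), dy)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += score(trends(shuffled, t), dy) >= observed
    return (hits + 1) / (n_perm + 1)


def type1_error(rho_x, rho_y, n, t=0.0, n_sim=200, n_perm=100,
                alpha=0.05, seed=1):
    """Fraction of independent series pairs declared significant at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x, y = ar1(n, rho_x, rng), ar1(n, rho_y, rng)
        rejections += perm_p_value(x, y, t, n_perm, rng) <= alpha
    return rejections / n_sim


rate = type1_error(-0.5, -0.5, n=20)
print(f"estimated Type I error rate: {rate:.3f}")
```

Because the two series are generated independently, every rejection is a false positive. Shuffling a series also removes its autocorrelation, which is one plausible reason a permutation test can drift away from the nominal level when the data are dependent.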
Type I error rate for different methods (the third to fifth columns) in the ARMA(1,1) model when t = 0. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.0617 | 0.0166 | 0.0112 |
| | 40 | 0.0609 | 0.0262 | 0.0219 |
| | 60 | 0.0557 | 0.0323 | 0.0289 |
| | 80 | 0.0562 | 0.0333 | 0.0267 |
| | 100 | 0.0538 | 0.0354 | 0.0311 |
| | 200 | 0.0572 | 0.0338 | 0.0329 |
| 0, 0 | 20 | 0.0444 | 0.0109 | 0.0210 |
| | 40 | 0.0463 | 0.0170 | 0.0380 |
| | 60 | 0.0455 | 0.0213 | 0.0404 |
| | 80 | 0.0422 | 0.0270 | 0.0464 |
| | 100 | 0.0397 | 0.0242 | 0.0444 |
| | 200 | 0.0428 | 0.0260 | 0.0539 |
| 0.3, 0.3 | 20 | 0.0472 | 0.0109 | 0.0240 |
| | 40 | 0.0497 | 0.0168 | 0.0426 |
| | 60 | 0.0413 | 0.0187 | 0.0404 |
| | 80 | 0.0395 | 0.0222 | 0.0421 |
| | 100 | 0.0421 | 0.0261 | 0.0545 |
| | 200 | 0.0418 | 0.0250 | 0.0559 |
| 0.3, 0.5 | 20 | 0.0483 | 0.0095 | 0.0218 |
| | 40 | 0.0447 | 0.0172 | 0.0410 |
| | 60 | 0.0438 | 0.0198 | 0.0427 |
| | 80 | 0.0453 | 0.0230 | 0.0432 |
| | 100 | 0.0399 | 0.0240 | 0.0515 |
| | 200 | 0.0420 | 0.0231 | 0.0505 |
| 0.5, 0.5 | 20 | 0.0503 | 0.0097 | 0.0220 |
| | 40 | 0.0409 | 0.0186 | 0.0403 |
| | 60 | 0.0455 | 0.0191 | 0.0417 |
| | 80 | 0.0445 | 0.0235 | 0.0460 |
| | 100 | 0.0399 | 0.0271 | 0.0509 |
| | 200 | 0.0342 | 0.0257 | 0.0591 |
| 0.5, 0.8 | 20 | 0.0492 | 0.0093 | 0.0202 |
| | 40 | 0.0430 | 0.0158 | 0.0337 |
| | 60 | 0.0399 | 0.0193 | 0.0372 |
| | 80 | 0.0435 | 0.0206 | 0.0366 |
| | 100 | 0.0359 | 0.0204 | 0.0418 |
| | 200 | 0.0381 | 0.0199 | 0.0462 |
Type I error rate for different methods (the third to fifth columns) in the ARMA(1,1)-TAR(1) model when t = 0. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.0563 | 0.0127 | 0.0119 |
| | 40 | 0.0527 | 0.0194 | 0.0220 |
| | 60 | 0.0463 | 0.0247 | 0.0282 |
| | 80 | 0.0481 | 0.0279 | 0.0285 |
| | 100 | 0.0481 | 0.0264 | 0.0291 |
| | 200 | 0.0437 | 0.0277 | 0.0341 |
| 0, 0 | 20 | 0.0437 | 0.0083 | 0.0147 |
| | 40 | 0.0436 | 0.0150 | 0.0270 |
| | 60 | 0.0393 | 0.0177 | 0.0350 |
| | 80 | 0.0412 | 0.0212 | 0.0377 |
| | 100 | 0.0354 | 0.0210 | 0.0382 |
| | 200 | 0.0362 | 0.0221 | 0.0435 |
| 0.3, 0.3 | 20 | 0.0395 | 0.0076 | 0.0172 |
| | 40 | 0.0382 | 0.0126 | 0.0332 |
| | 60 | 0.0393 | 0.0136 | 0.0349 |
| | 80 | 0.0363 | 0.0183 | 0.0385 |
| | 100 | 0.0353 | 0.0195 | 0.0411 |
| | 200 | 0.0296 | 0.0186 | 0.0470 |
| 0.3, 0.5 | 20 | 0.0372 | 0.0068 | 0.0199 |
| | 40 | 0.0345 | 0.0128 | 0.0328 |
| | 60 | 0.0356 | 0.0137 | 0.0336 |
| | 80 | 0.0328 | 0.0174 | 0.0382 |
| | 100 | 0.0315 | 0.0208 | 0.0437 |
| | 200 | 0.0354 | 0.0184 | 0.0448 |
| 0.5, 0.5 | 20 | 0.0343 | 0.0067 | 0.0170 |
| | 40 | 0.0338 | 0.0130 | 0.0337 |
| | 60 | 0.0305 | 0.0130 | 0.0367 |
| | 80 | 0.0319 | 0.0196 | 0.0400 |
| | 100 | 0.0309 | 0.0160 | 0.0399 |
| | 200 | 0.0251 | 0.0163 | 0.0463 |
| 0.5, 0.8 | 20 | 0.0410 | 0.0061 | 0.0176 |
| | 40 | 0.0316 | 0.0127 | 0.0322 |
| | 60 | 0.0330 | 0.0142 | 0.0354 |
| | 80 | 0.0323 | 0.0170 | 0.0377 |
| | 100 | 0.0273 | 0.0181 | 0.0414 |
| | 200 | 0.0294 | 0.0189 | 0.0466 |
Type I error rate for different methods (the third to fifth columns) in the AR(1) model when t = 0.5. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.2236 | 0.0275 | 0.0400 |
| | 40 | 0.2155 | 0.0520 | 0.0134 |
| | 60 | 0.2210 | 0.0508 | 0.0119 |
| | 80 | 0.2158 | 0.0665 | 0.0166 |
| | 100 | 0.2159 | 0.0682 | 0.0178 |
| | 200 | 0.2213 | 0.0702 | 0.0226 |
| 0, 0 | 20 | 0.0737 | 0.0039 | 0.0263 |
| | 40 | 0.0628 | 0.0059 | 0.0188 |
| | 60 | 0.0594 | 0.0075 | 0.0220 |
| | 80 | 0.0572 | 0.0089 | 0.0247 |
| | 100 | 0.0552 | 0.0084 | 0.0246 |
| | 200 | 0.0580 | 0.0107 | 0.0325 |
| 0.3, 0.3 | 20 | 0.0379 | 0.0009 | 0.0276 |
| | 40 | 0.0296 | 0.0012 | 0.0216 |
| | 60 | 0.0296 | 0.0011 | 0.0277 |
| | 80 | 0.0229 | 0.0025 | 0.0304 |
| | 100 | 0.0270 | 0.0017 | 0.0324 |
| | 200 | 0.0241 | 0.0021 | 0.0398 |
| 0.3, 0.5 | 20 | 0.0243 | 0.0006 | 0.0229 |
| | 40 | 0.0174 | 0.0010 | 0.0246 |
| | 60 | 0.0170 | 0.0013 | 0.0263 |
| | 80 | 0.0184 | 0.0018 | 0.0337 |
| | 100 | 0.0184 | 0.0012 | 0.0334 |
| | 200 | 0.0152 | 0.0011 | 0.0355 |
| 0.5, 0.5 | 20 | 0.0196 | 0.0002 | 0.0175 |
| | 40 | 0.0149 | 0.0005 | 0.0221 |
| | 60 | 0.0102 | 0.0006 | 0.0282 |
| | 80 | 0.0105 | 0.0003 | 0.0311 |
| | 100 | 0.0124 | 0.0005 | 0.0350 |
| | 200 | 0.0104 | 0.0003 | 0.0430 |
| 0.5, 0.8 | 20 | 0.0099 | 0.0001 | 0.0159 |
| | 40 | 0.0052 | 0.0001 | 0.0194 |
| | 60 | 0.0036 | 0.0002 | 0.0286 |
| | 80 | 0.0032 | 0.0001 | 0.0303 |
| | 100 | 0.0033 | 0.0000 | 0.0325 |
| | 200 | 0.0017 | 0.0000 | 0.0377 |
Type I error rate for different methods (the third to fifth columns) in the ARMA(1,1) model when t = 0.5. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.0767 | 0.0033 | 0.0269 |
| | 40 | 0.0609 | 0.0047 | 0.0166 |
| | 60 | 0.0595 | 0.0070 | 0.0212 |
| | 80 | 0.0566 | 0.0082 | 0.0229 |
| | 100 | 0.0542 | 0.0094 | 0.0284 |
| | 200 | 0.0552 | 0.0104 | 0.0343 |
| 0, 0 | 20 | 0.0300 | 0.0008 | 0.0251 |
| | 40 | 0.0211 | 0.0008 | 0.0354 |
| | 60 | 0.0187 | 0.0013 | 0.0429 |
| | 80 | 0.0201 | 0.0012 | 0.0442 |
| | 100 | 0.0185 | 0.0018 | 0.0456 |
| | 200 | 0.0190 | 0.0016 | 0.0533 |
| 0.3, 0.3 | 20 | 0.0137 | 0.0001 | 0.0239 |
| | 40 | 0.0112 | 0.0004 | 0.0395 |
| | 60 | 0.0115 | 0.0008 | 0.0424 |
| | 80 | 0.0083 | 0.0004 | 0.0453 |
| | 100 | 0.0100 | 0.0003 | 0.0489 |
| | 200 | 0.0073 | 0.0007 | 0.0579 |
| 0.3, 0.5 | 20 | 0.0109 | 0.0001 | 0.0208 |
| | 40 | 0.0073 | 0.0002 | 0.0306 |
| | 60 | 0.0044 | 0.0001 | 0.0431 |
| | 80 | 0.0044 | 0.0003 | 0.0456 |
| | 100 | 0.0048 | 0.0004 | 0.0473 |
| | 200 | 0.0037 | 0.0003 | 0.0565 |
| 0.5, 0.5 | 20 | 0.0076 | 0.0000 | 0.0206 |
| | 40 | 0.0050 | 0.0000 | 0.0360 |
| | 60 | 0.0052 | 0.0002 | 0.0406 |
| | 80 | 0.0041 | 0.0000 | 0.0442 |
| | 100 | 0.0041 | 0.0002 | 0.0511 |
| | 200 | 0.0028 | 0.0001 | 0.0509 |
| 0.5, 0.8 | 20 | 0.0020 | 0.0000 | 0.0148 |
| | 40 | 0.0010 | 0.0000 | 0.0249 |
| | 60 | 0.0011 | 0.0000 | 0.0288 |
| | 80 | 0.0008 | 0.0000 | 0.0333 |
| | 100 | 0.0007 | 0.0000 | 0.0333 |
| | 200 | 0.0003 | 0.0000 | 0.0470 |
Type I error rate for different methods (the third to fifth columns) in the ARMA(1,1)-TAR(1) model when t = 0.5. The first and second columns represent different combinations of autoregressive coefficients and sample sizes. The number of permutation tests is 1,000, the number of repeated simulations is 10,000, and the significance level is α = 0.05.
| Autoregressive coefficients | Sample size | Permutation test | TLTA | STLTA |
|---|---|---|---|---|
| −0.5, −0.5 | 20 | 0.0521 | 0.0013 | 0.0241 |
| | 40 | 0.0421 | 0.0034 | 0.0201 |
| | 60 | 0.0375 | 0.0040 | 0.0257 |
| | 80 | 0.0364 | 0.0049 | 0.0264 |
| | 100 | 0.0370 | 0.0049 | 0.0282 |
| | 200 | 0.0330 | 0.0049 | 0.0338 |
| 0, 0 | 20 | 0.0276 | 0.0005 | 0.0234 |
| | 40 | 0.0189 | 0.0009 | 0.0245 |
| | 60 | 0.0186 | 0.0009 | 0.0311 |
| | 80 | 0.0188 | 0.0009 | 0.0360 |
| | 100 | 0.0174 | 0.0011 | 0.0340 |
| | 200 | 0.0150 | 0.0016 | 0.0440 |
| 0.3, 0.3 | 20 | 0.0169 | 0.0003 | 0.0207 |
| | 40 | 0.0113 | 0.0005 | 0.0294 |
| | 60 | 0.0097 | 0.0007 | 0.0301 |
| | 80 | 0.0108 | 0.0006 | 0.0351 |
| | 100 | 0.0091 | 0.0007 | 0.0386 |
| | 200 | 0.0072 | 0.0004 | 0.0440 |
| 0.3, 0.5 | 20 | 0.0140 | 0.0000 | 0.0209 |
| | 40 | 0.0089 | 0.0005 | 0.0283 |
| | 60 | 0.0077 | 0.0000 | 0.0317 |
| | 80 | 0.0072 | 0.0006 | 0.0340 |
| | 100 | 0.0079 | 0.0003 | 0.0375 |
| | 200 | 0.0067 | 0.0004 | 0.0439 |
| 0.5, 0.5 | 20 | 0.0090 | 0.0001 | 0.0198 |
| | 40 | 0.0047 | 0.0001 | 0.0271 |
| | 60 | 0.0054 | 0.0000 | 0.0296 |
| | 80 | 0.0039 | 0.0002 | 0.0360 |
| | 100 | 0.0038 | 0.0002 | 0.0370 |
| | 200 | 0.0045 | 0.0000 | 0.0450 |
| 0.5, 0.8 | 20 | 0.0072 | 0.0000 | 0.0184 |
| | 40 | 0.0045 | 0.0001 | 0.0251 |
| | 60 | 0.0024 | 0.0001 | 0.0328 |
| | 80 | 0.0024 | 0.0001 | 0.0323 |
| | 100 | 0.0016 | 0.0000 | 0.0338 |
| | 200 | 0.0013 | 0.0000 | 0.0440 |
FIGURE 1. Standardized abundance of Parabacteroides (A) and Bifidobacterium (B) in the MPHM “M3” fecal sample data set. The autocorrelation plots (C,D) show the autocorrelation coefficient of each time series at different lags.
The numbers of significant correlations between OTUs found by permutation tests, TLTA, STLTA and DDLSA for different data sets and significance levels.
| Significance level | Method | MPHM (t = 0.5) | PML (t = 0.5) | MPHM (t = 0) | PML (t = 0) |
|---|---|---|---|---|---|
| | # of factors | 59 | 75 | 59 | 75 |
| | Permutation | 589 | 87 | 727 | 29 |
| | TLTA | 165 | 75 | 532 | 13 |
| | STLTA | 739 | 50 | 667 | 13 |
| | DDLSA | 685 | 371 | 685 | 371 |
| | Permutation | 489 | 84 | 549 | 29 |
| | TLTA | 86 | 74 | 436 | 11 |
| | STLTA | 621 | 16 | 514 | 4 |
| | DDLSA | 549 | 227 | 549 | 227 |
FIGURE 2. Venn diagram of the significant relationships found by the permutation test, TLTA, and STLTA for the MPHM “M3” fecal data set. Green, blue, and red indicate the number of significant relationships found by the permutation test, TLTA, and STLTA, respectively.
FIGURE 3. Venn diagram of the significant relationships found by the permutation test, TLTA, and STLTA for the PML data set. Green, blue, and red indicate the number of significant relationships found by the permutation test, TLTA, and STLTA, respectively.