Jaime Pinilla, Beatriz G López-Valcárcel, Christian González-Martel, Salvador Peiro.
Abstract
OBJECTIVE: Newcomb-Benford's Law (NBL) proposes a regular distribution for first digits, second digits and digit combinations that applies to many naturally occurring sources of data. Testing for deviations from NBL is used on many datasets as a screening tool for identifying data trustworthiness problems. This study aims to compare publicly available waiting list (WL) data from Finland and Spain, testing NBL as an instrument to flag potential manipulation in WLs.
Keywords: benford-newcomb distribution; fabricated data; waiting list data
Year: 2018 PMID: 29743333 PMCID: PMC5942457 DOI: 10.1136/bmjopen-2018-022079
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Figure 1. Theoretical (line) and observed (columns) distributions of the first digit for Finnish and Spanish waiting list data.
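The theoretical line in Figure 1 corresponds to the standard first-digit form of NBL, P(d) = log10(1 + 1/d) for d = 1, ..., 9. The following is a minimal Python sketch of that formula; the function name `benford_first_digit` is illustrative and does not come from the paper.

```python
import math

def benford_first_digit():
    """Expected first-digit probabilities under Newcomb-Benford's Law."""
    # P(d) = log10(1 + 1/d) for leading digits 1..9
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_first_digit()
for digit, prob in expected.items():
    print(digit, round(prob, 5))
```

Rounded to five decimals, these values should agree with the "Frequency expected" column of the tables below (for example 0.30103 for digit 1).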
Test statistics for the first digits of Finnish data
| Value | Count | Frequency observed | Frequency expected | Diff. (MAD) | P values of Z-test for each digit |
| 1 | 175 | 0.29461 | 0.30103 | −0.00642 | 0.7544 |
| 2 | 106 | 0.17845 | 0.17609 | 0.00236 | 0.8717 |
| 3 | 67 | 0.11279 | 0.12494 | −0.01214 | 0.4196 |
| 4 | 64 | 0.10774 | 0.09691 | 0.01083 | 0.3671 |
| 5 | 51 | 0.08586 | 0.07918 | 0.00668 | 0.5429 |
| 6 | 41 | 0.06902 | 0.06695 | 0.00208 | 0.8055 |
| 7 | 43 | 0.07239 | 0.05799 | 0.01440 | 0.1352 |
| 8 | 25 | 0.04209 | 0.05115 | −0.00906 | 0.3521 |
| 9 | 22 | 0.03704 | 0.04576 | −0.00872 | 0.3757 |
| Total | 594 |
Pearson’s χ2 test: 5.9584 (p=0.6519); mean test (absolute value): 0.8077; Kuiper test: 0.8338. All p values are non-significant at the 1% level.
The respective critical test values for the 5% and 1% significance levels are: Pearson’s χ2 test (8 df): 15.51 and 20.09; mean test: 1.96 and 2.58; Kuiper test: 1.75 and 2.00.
MAD, mean absolute deviation.
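As a rough check on the Pearson χ2 figure reported for the Finnish data, the statistic can be recomputed from the counts in the table. The sketch below is not the authors' code: the function names are hypothetical, and the p-value uses a closed-form upper tail of the χ2 distribution that is valid only for an even number of degrees of freedom (here 8).

```python
import math

# First-digit counts copied from the Finnish table (digits 1..9).
finnish_counts = [175, 106, 67, 64, 51, 41, 43, 25, 22]

def benford_chi_square(counts):
    """Pearson chi-square statistic of first-digit counts against Benford's Law."""
    n = sum(counts)
    stat = 0.0
    for digit, observed in enumerate(counts, start=1):
        expected = n * math.log10(1 + 1 / digit)  # expected count for this digit
        stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

chi2 = benford_chi_square(finnish_counts)
p_value = chi_square_sf_even_df(chi2, df=8)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

The result should land close to the reported 5.9584 (p=0.6519), well below the 5% critical value of 15.51.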
Test statistics for the first digits of Spanish data
| Value | Count | Frequency observed | Frequency expected | Diff. (MAD) | P values of Z-test for each digit |
| 1 | 312 | 0.40838 | 0.30103 | 0.10735 | 0.0000** |
| 2 | 117 | 0.15314 | 0.17609 | −0.02295 | 0.0966 |
| 3 | 47 | 0.06152 | 0.12494 | −0.06342 | 0.0000** |
| 4 | 45 | 0.05890 | 0.09691 | −0.03801 | 0.0002** |
| 5 | 50 | 0.06545 | 0.07918 | −0.01374 | 0.1798 |
| 6 | 31 | 0.04058 | 0.06695 | −0.02637 | 0.0023** |
| 7 | 55 | 0.07199 | 0.05799 | 0.01400 | 0.1035 |
| 8 | 41 | 0.05366 | 0.05115 | 0.00251 | 0.7422 |
| 9 | 66 | 0.08639 | 0.04576 | 0.04063 | 0.0000** |
| Total | 764 |
Pearson’s χ2 test: 107.511** (p<0.0001); mean test (absolute value): 3.6553**; Kuiper test (absolute value): 4.5732**. **Significant test value at the 1% level.
The respective critical test values for the 5% and 1% significance levels are: Pearson’s χ2 test (8 df): 15.51 and 20.09; mean test: 1.96 and 2.58; Kuiper test: 1.75 and 2.00.
MAD, mean absolute deviation.
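The difference column and the mean absolute deviation (MAD) named in the footnote can be recomputed from the Spanish counts. This is a minimal sketch under the assumption that each difference is the observed frequency minus the Benford expected frequency and that MAD is the plain average of the absolute differences over the nine digits; the function name is illustrative.

```python
import math

# First-digit counts copied from the Spanish table (digits 1..9).
spanish_counts = [312, 117, 47, 45, 50, 31, 55, 41, 66]

def benford_differences(counts):
    """Observed minus expected first-digit frequency for digits 1..9."""
    n = sum(counts)
    return [
        observed / n - math.log10(1 + 1 / digit)
        for digit, observed in enumerate(counts, start=1)
    ]

diffs = benford_differences(spanish_counts)
# Mean absolute deviation: average of |observed - expected| across the digits.
mad = sum(abs(d) for d in diffs) / len(diffs)

for digit, d in enumerate(diffs, start=1):
    print(digit, f"{d:+.5f}")
print(f"MAD = {mad:.5f}")
```

The largest gaps appear at digits 1, 3 and 9, which matches the digits flagged as significant in the table.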