Augusto Afonso Guerra Junior1, Ramon Gonçalves Pereira2, Eli Iola Gurgel3, Mariangela Cherchiglia3, Leonardo Vinicius Dias4, Juliano D Ávila5, Núbia Santos5, Afonso Reis5, Francisco Assis Acurcio1, Wagner Meira Junior2.
Abstract
INTRODUCTION: In Brazil, the National Health System (SUS) provides healthcare to the public. The system has multiple administrative databases; the major ones record hospital (SIH) and outpatient (SIA) procedures. Epidemiological information is collected for the whole population in subsystems covering mortality (SIM), live births (SINASC) and diseases of compulsory notification (SINAN). Each subsystem has its own information system, which can provide information about consultations, clinical data and medicines dispensed. However, these systems are not linked, which prevents individual-centred analysis.
Keywords: Brazilian health database; Data linkage; SUS deduplication; record linkage
Year: 2018 PMID: 34095519 PMCID: PMC8142958 DOI: 10.23889/ijpds.v3i1.446
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
Figure 1: Model proposed to create the Brazilian National Database of Health centred on the individual
(*) 2008-2015
| Databases | Number of records |
|---|---|
| Hospitalization Information System (SIH) | 188,512,557 |
| Outpatient Information System (SIA) (APAC; BPAI) | 869,709,353 |
| Mortality Information System (SIM) | 16,635,462 |
| Diseases and conditions of mandatory reporting and control data (SINAN)* | 10,994,195 |
| TOTAL | 1,085,851,567 |
(*) The range of the agreement weights for these attributes was calculated from frequency tables.
| Attribute | M | U | Agreement Weight | Disagreement Weight | Missing Weight |
|---|---|---|---|---|---|
| Patient’s name | 0.8523 | 0.0046 | 7.5377 - 14.2942* | -2.7528 | 2.3925 |
| Mother’s name | 0.8448 | 0.8222 | 0.0391 - 14.2880* | -0.1959 | -0.0784 |
| Father’s name | 1.0000 | 1.0000 | 0.0000 - 14.2907* | 0.0000 | 0.0000 |
| Sex | 0.9868 | 0.9312 | 0.0836 | -2.3779 | -1.1472 |
| CPF | 0.9999 | 0.9102 | 0.1356 | -0.1215 | 0.0000 |
| CNS | 0.9389 | 0.9414 | -0.0038 | 0.0590 | 0.0276 |
| Date of Birth | 0.2132 | 0.1842 | 0.2109 - 14.2880* | -0.0522 | 0.0794 |
| State | 0.9872 | 0.9694 | 0.0263 - 10.3754* | -1.2610 | -0.6173 |
| City | 0.9760 | 0.8896 | 0.1338 - 14.2886* | -2.2041 | -1.0352 |
| Zip code | 0.1979 | 0.0008 | -2.3428 - 14.2880* | -0.3170 | 3.8546 |
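The M and U columns read like the match and non-match probabilities used in probabilistic record linkage. The sketch below assumes the usual base-2 log-ratio weights, which is not stated in this record. Under that assumption it reproduces the fixed agreement and disagreement weights for patient's name and sex to within rounding of the published M and U values. It does not attempt to reproduce the frequency-based ranges or the missing weights, and other rows may have been computed differently. The function names are hypothetical.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Weight added when a field agrees: log2(m / u).

    Assumption: standard log-ratio weighting; not confirmed by the source.
    """
    if not (0 < m <= 1 and 0 < u <= 1):
        raise ValueError("m and u must lie in (0, 1]")
    return math.log2(m / u)


def disagreement_weight(m: float, u: float) -> float:
    """Weight added when a field disagrees: log2((1 - m) / (1 - u))."""
    if not (0 <= m < 1 and 0 <= u < 1):
        raise ValueError("m and u must lie in [0, 1)")
    return math.log2((1 - m) / (1 - u))


# M and U values copied from the table above.
name_agree = agreement_weight(0.8523, 0.0046)
name_disagree = disagreement_weight(0.8523, 0.0046)
sex_agree = agreement_weight(0.9868, 0.9312)
sex_disagree = disagreement_weight(0.9868, 0.9312)

print(round(name_agree, 2), round(name_disagree, 2))
print(round(sex_agree, 2), round(sex_disagree, 2))
```

A large positive agreement weight, as for patient's name, means that agreement on that field is strong evidence of a true match, while a weight near zero, as for sex, means agreement carries little information.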
| ID | Blocking Strategy | Pairs |
|---|---|---|
| 1 | CPF | 12,480,286 |
| 2 | CNS | 151,730,408 |
| 3 | patient’s first name, patient’s middle name, patient’s last name, date | 486,325,426 |
| 4 | mother’s first name, mother’s last name, date | 1,215,062,093 |
| 5 | patient’s name, patient’s last name, mother’s last name, month, year, sex | 1,722,402,253 |
| 6 | patient’s name, patient’s last name, mother’s first name, mother’s last name, zip code | 1,506,043,390 |
| 7 | patient’s name, patient’s last name, state, day, month, sex | 2,535,059,431 |
| 8 | patient’s name, patient’s last name, state, day, year, sex | 2,830,748,051 |
| 9 | patient’s name, patient’s last name, state, month, year, sex | 6,304,252,194 |
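To illustrate how a blocking strategy such as those in the table limits the number of comparisons, here is a minimal sketch: records are grouped by the values of the chosen key fields, and only records within the same group become candidate pairs. The record layout, field names and the `candidate_pairs` helper are hypothetical. Skipping records with a missing key value is an assumption, not something the source describes.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, key_fields):
    """Return the set of record-id pairs that share the same blocking key.

    records: dict mapping record id -> dict of attributes (hypothetical layout).
    key_fields: attribute names that together form the blocking key.
    """
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        key = tuple(rec.get(field) for field in key_fields)
        # Assumption: a record with any missing key component is not blocked.
        if any(value is None or value == "" for value in key):
            continue
        blocks[key].append(rec_id)

    pairs = set()
    for ids in blocks.values():
        # Every unordered pair inside a block is a candidate for comparison.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Toy records, invented for illustration only.
records = {
    1: {"cpf": "111", "first_name": "ana", "last_name": "silva", "state": "MG"},
    2: {"cpf": "111", "first_name": "ana", "last_name": "silva", "state": "MG"},
    3: {"cpf": None, "first_name": "ana", "last_name": "silva", "state": "MG"},
    4: {"cpf": "222", "first_name": "joao", "last_name": "souza", "state": "SP"},
}

by_cpf = candidate_pairs(records, ["cpf"])
by_name_state = candidate_pairs(records, ["first_name", "last_name", "state"])

# One possible way to combine strategies: the union of their candidate pairs.
all_pairs = by_cpf | by_name_state
print(sorted(all_pairs))
```

A strict key such as a personal identifier yields few pairs but misses records where the identifier is absent, whereas a looser key on names and location recovers them at the cost of more comparisons. This trade-off is in line with the growing pair counts down the table.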
Figure 2: Score distribution
Figure 3: True pairs x full ranges
Figure 4: True pair x expanded ranges