E S Hall, K Marsolo, J M Greenberg.
Abstract
OBJECTIVE: To better address barriers arising from missing and unreliable identifiers in neonatal medical records, we evaluated agreement and discordance among traditional and non-traditional linkage fields within a linked neonatal data set.
Year: 2017 PMID: 28492523 PMCID: PMC5578885 DOI: 10.1038/jp.2017.70
Source DB: PubMed Journal: J Perinatol ISSN: 0743-8346 Impact factor: 2.521
Figure 1. Data linkage flow diagram.
Count of distinct values and selectivity of each identifier field within the set of 7,293 linked records.
| Identifier Field | Distinct Values in the Physician Billing Record Set | Selectivity in the Physician Billing Record Set | Distinct Values in the Newborn Medical Record Set | Selectivity in the Newborn Medical Record Set |
|---|---|---|---|---|
| Infant Sex | 2 | 0.0% | 2 | 0.0% |
| Zip Code | 270 | 3.7% | 277 | 3.8% |
| Birth Weight (Nearest 10 Grams) | 430 | 5.9% | 429 | 5.9% |
| Mother First Name (Soundex-Encoded) | 1,022 | 14.0% | 902 | 12.4% |
| Date of Birth | 1,092 | 15.0% | 1,092 | 15.0% |
| Infant First Name (Soundex-Encoded) | 1,103 | 15.1% | 889 | 12.2% |
| Father Surname (Soundex-Encoded) | 1,494 | 20.5% | 0 | 0.0% |
| Street Name (Soundex-Encoded) | 1,765 | 24.2% | 1,730 | 23.7% |
| Birth Weight (Exact) | 2,089 | 28.6% | 1,709 | 23.4% |
| Mother Surname (Soundex-Encoded) | 2,367 | 32.5% | 2,408 | 33.0% |
| Father Surname | 2,380 | 32.6% | 0 | 0.0% |
| Infant Surname (Soundex-Encoded) | 2,429 | 33.3% | 2,471 | 33.9% |
| Street Name | 2,653 | 36.4% | 2,533 | 34.7% |
| Mother First Name | 3,120 | 42.8% | 2,808 | 38.5% |
| Infant First Name | 3,164 | 43.4% | 2,305 | 31.6% |
| Mother Surname | 3,828 | 52.5% | 3,834 | 52.6% |
| Street Number | 3,950 | 54.1% | 3,912 | 53.6% |
| Infant Surname | 3,956 | 54.2% | 3,977 | 54.5% |
| Street Address | 6,976 | 95.7% | 6,687 | 91.7% |
Father’s surname was not available in the newborn medical record.
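
The selectivity values in the table appear consistent with the number of distinct values divided by the 7,293 linked records (for example, 270 / 7,293 ≈ 3.7% for zip code). The source does not state the formula, so the following Python sketch rests on that assumption; the function name `selectivity` and the toy data are hypothetical.

```python
def selectivity(values, total_records=None):
    """Return (distinct_count, selectivity) for one identifier field.

    Assumption (inferred from the table, not stated in the source):
    selectivity = distinct non-missing values / total records.
    Missing values (None or empty string) do not count as a distinct value.
    """
    present = {v for v in values if v is not None and v != ""}
    total = len(values) if total_records is None else total_records
    if total <= 0:
        raise ValueError("total_records must be positive")
    return len(present), len(present) / total


# Toy example, not study data: two distinct sexes across four records.
distinct, sel = selectivity(["M", "F", "M", "F"])
print(distinct, f"{sel:.1%}")  # prints: 2 50.0%
```

Under this reading, a field that is entirely missing in one source, such as father's surname in the newborn medical record, yields zero distinct values and a selectivity of 0.0%.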
Number of matching, non-matching, and missing values for each identifier field within the set of 7,293 linked neonatal records.
| Identifier Field | Matching N, % | Non-Matching N, % | Missing from Only Physician Billing Records N, % | Missing from Only Newborn Medical Records N, % | Missing from Both Physician Billing and Newborn Medical Records N, % |
|---|---|---|---|---|---|
| Infant Sex | 7,281, 99.8% | 11, 0.2% | 0, 0.0% | 1, 0.0% | 0, 0.0% |
| Date of Birth | 7,280, 99.8% | 13, 0.2% | 0, 0.0% | 0, 0.0% | 0, 0.0% |
| Mother Surname (Soundex-Encoded) | 6,945, 95.2% | 270, 3.7% | 72, 1.0% | 5, 0.1% | 1, 0.0% |
| Mother Surname | 6,865, 94.1% | 350, 4.8% | 72, 1.0% | 5, 0.1% | 1, 0.0% |
| Mother First Name (Soundex-Encoded) | 6,376, 87.4% | 100, 1.4% | 71, 1.0% | 745, 10.2% | 1, 0.0% |
| Mother First Name | 6,276, 86.1% | 200, 2.7% | 71, 1.0% | 745, 10.2% | 1, 0.0% |
| Zip Code | 6,265, 85.9% | 867, 11.9% | 0, 0.0% | 161, 2.2% | 0, 0.0% |
| Street Number | 6,023, 82.6% | 1,109, 15.2% | 0, 0.0% | 161, 2.2% | 0, 0.0% |
| Birth Weight (Nearest 10 Grams) | 5,963, 81.8% | 971, 13.3% | 228, 3.1% | 120, 1.6% | 11, 0.2% |
| Street Name (Soundex-Encoded) | 5,793, 79.4% | 1,300, 17.8% | 16, 0.2% | 182, 2.5% | 2, 0.0% |
| Street Name | 5,642, 77.4% | 1,451, 19.9% | 16, 0.2% | 182, 2.5% | 2, 0.0% |
| Mother or Father Surname (Physician Billing) to Infant Surname (Newborn Medical) (Soundex-Encoded) | 5,211, 71.5% | 2,020, 27.7% | 62, 0.9% | 0, 0.0% | 0, 0.0% |
| Mother or Father Surname (Physician Billing) to Infant Surname (Newborn Medical) | 5,146, 70.6% | 2,085, 28.6% | 62, 0.9% | 0, 0.0% | 0, 0.0% |
| Infant Surname (Soundex-Encoded) | 5,047, 69.2% | 2,246, 30.8% | 0, 0.0% | 0, 0.0% | 0, 0.0% |
| Infant Surname | 5,003, 68.6% | 2,290, 31.4% | 0, 0.0% | 0, 0.0% | 0, 0.0% |
| Mother Surname to Infant Surname (Soundex-Encoded) | 4,212, 57.7% | 3,008, 41.2% | 73, 1.0% | 0, 0.0% | 0, 0.0% |
| Mother Surname to Infant Surname | 4,173, 57.2% | 3,047, 41.8% | 73, 1.0% | 0, 0.0% | 0, 0.0% |
| Birth Weight (Exact) | 3,516, 48.2% | 3,418, 46.9% | 228, 3.1% | 120, 1.6% | 11, 0.2% |
| Street Address | 3,437, 47.1% | 3,695, 50.7% | 0, 0.0% | 161, 2.2% | 0, 0.0% |
| Infant First Name (Soundex-Encoded) | 2,898, 39.7% | 91, 1.2% | 806, 11.1% | 2,700, 37.0% | 798, 10.9% |
| Infant First Name | 2,803, 38.4% | 186, 2.6% | 806, 11.1% | 2,700, 37.0% | 798, 10.9% |
| Father Surname (Physician Billing) to Infant Surname (Newborn Medical) (Soundex-Encoded) | 2,210, 30.3% | 1,585, 21.7% | 3,498, 48.0% | 0, 0.0% | 0, 0.0% |
| Father Surname (Physician Billing) to Infant Surname (Newborn Medical) | 2,159, 29.6% | 1,636, 22.4% | 3,498, 48.0% | 0, 0.0% | 0, 0.0% |
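
The five outcome columns in the table above can be illustrated with a small Python sketch that compares one field from a physician billing record with the same field from a newborn medical record, optionally after Soundex encoding. This assumes standard American Soundex and case-insensitive exact comparison; the source does not specify the Soundex variant or normalization used, and the names `soundex`, `compare_field`, and the status labels are hypothetical.

```python
# Standard American Soundex digit groups (assumed variant).
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(name):
    """Encode a name as a letter followed by three digits, e.g. R163."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # vowels reset prev, so a repeated code is kept after one
    return (first + "".join(digits) + "000")[:4]


def _is_missing(value):
    return value is None or str(value).strip() == ""


def compare_field(billing_value, newborn_value, encode=None):
    """Classify one identifier pair into one of five outcome labels."""
    b_missing = _is_missing(billing_value)
    n_missing = _is_missing(newborn_value)
    if b_missing and n_missing:
        return "missing_both"
    if b_missing:
        return "missing_billing_only"
    if n_missing:
        return "missing_newborn_only"
    b = str(billing_value).strip().upper()
    n = str(newborn_value).strip().upper()
    if encode is not None:
        b, n = encode(b), encode(n)
    return "match" if b == n else "non_match"


# Toy values, not study data: a spelling variant agrees only after encoding.
print(compare_field("Smith", "Smyth"))                  # non_match
print(compare_field("Smith", "Smyth", encode=soundex))  # match
```

Phonetic encoding of this kind turns some spelling disagreements into agreements, which is the direction of the differences between the exact and Soundex-encoded rows in the table.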
The number of matching as well as non-missing, non-matching identifier pairs within the set of 7,293 linked neonatal records.*
| Count of Matching Pairs | Number of Linked Records (N) | Absolute Percent (%) | Cumulative Percent (%) | Count of Non-Matching Pairs | Number of Linked Records (N) | Absolute Percent (%) | Cumulative Percent (%) |
|---|---|---|---|---|---|---|---|
| 12 | 347 | 4.8% | 4.8% | 12 | 0 | 0.0% | 0.0% |
| 11 | 1,310 | 18.0% | 22.7% | 11 | 0 | 0.0% | 0.0% |
| 10 | 1,738 | 23.8% | 46.6% | 10 | 0 | 0.0% | 0.0% |
| 9 | 1,382 | 18.9% | 65.5% | 9 | 0 | 0.0% | 0.0% |
| 8 | 1,228 | 16.8% | 82.3% | 8 | 2 | 0.0% | 0.0% |
| 7 | 714 | 9.8% | 92.1% | 7 | 3 | 0.0% | 0.1% |
| 6 | 308 | 4.2% | 96.4% | 6 | 67 | 0.9% | 1.0% |
| 5 | 189 | 2.6% | 98.9% | 5 | 272 | 3.7% | 4.7% |
| 4 | 60 | 0.8% | 99.8% | 4 | 532 | 7.3% | 12.0% |
| 3 | 16 | 0.2% | 100.0% | 3 | 748 | 10.3% | 22.3% |
| 2 | 1 | 0.0% | 100.0% | 2 | 1,724 | 23.6% | 45.9% |
| 1 | 0 | 0.0% | 100.0% | 1 | 1,954 | 26.8% | 72.7% |
| 0 | 0 | 0.0% | 100.0% | 0 | 1,991 | 27.3% | 100.0% |
*The 12 identifier pairs included in the evaluation were: infant date of birth, infant sex, Soundex-encoded infant first name, Soundex-encoded infant surname, street number, Soundex-encoded street name, zip code, infant birth weight rounded to the nearest 10 grams, Soundex-encoded maternal first name, Soundex-encoded maternal surname, comparison of the Soundex-encoded infant surname to the Soundex-encoded maternal surname, and comparison of the Soundex-encoded infant surname to the Soundex-encoded paternal surname.
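
The absolute and cumulative percentages in the table above can be reproduced from a per-record count of matching (or non-matching) pairs. The following Python sketch assumes each linked record has already been reduced to a list of 12 outcome labels; the label strings and the function names `count_status` and `tabulate` are hypothetical, not taken from the source.

```python
from collections import Counter


def count_status(statuses, target="match"):
    """Count how many of a record's identifier pairs carry the target label."""
    return sum(1 for s in statuses if s == target)


def tabulate(per_record_counts, max_pairs=12):
    """Return rows of (count, n_records, absolute_pct, cumulative_pct).

    Rows run from max_pairs down to 0, matching the table's layout.
    """
    total = len(per_record_counts)
    if total == 0:
        raise ValueError("no records to tabulate")
    freq = Counter(per_record_counts)
    rows = []
    cumulative = 0
    for k in range(max_pairs, -1, -1):
        n = freq.get(k, 0)
        cumulative += n
        rows.append((k, n, 100 * n / total, 100 * cumulative / total))
    return rows


# Toy data, not study data: three linked records with 12 pair outcomes each.
records = [
    ["match"] * 12,
    ["match"] * 10 + ["non_match", "missing_both"],
    ["match"] * 10 + ["non_match"] * 2,
]
match_counts = [count_status(r, "match") for r in records]
for k, n, pct, cum in tabulate(match_counts):
    if n:
        print(k, n, f"{pct:.1f}%", f"{cum:.1f}%")
```

Because missing values count as neither matching nor non-matching, the two halves of the table are tabulated separately, and a record's matching and non-matching counts need not sum to 12.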
Qualitative summary of causes for discordance between identifier field pairs within linked records.
| Identifier Field | Qualitative Assessment |
|---|---|
| Birth Weight (Exact) | Birth weights disagreed by one or two grams |
| Date of Birth | “Day” component of date disagreed by one or two days |
| Infant First Name | Transposed first and surnames in one record |
| Infant Surname | Inconsistent name spelling |
| Infant Sex | Incorrectly coded value in one record; in several cases, a child with a first name like “Infant Boy” was assigned “Female” sex within the same record. |
| Mother First Name | Inconsistent name spelling |
| Mother Surname | Inconsistent name spelling |
| Street Address | Completely different addresses listed |
| Zip Code | Miskeyed or transposed digits in one record |
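
Several of the discordance patterns listed above (weights off by a gram or two, dates off by a day or two, transposed zip code digits) lend themselves to tolerance-based checks. The Python sketch below is illustrative only: the thresholds are borrowed from the qualitative descriptions in the table, the half-up rounding rule is an assumption, and none of these helper functions come from the source.

```python
from datetime import date


def round_to_nearest_10(grams):
    """Round an integer weight to the nearest 10 g, halves rounding up (assumed rule)."""
    return (grams + 5) // 10 * 10


def weights_close(a, b, tolerance=2):
    """True when two birth weights in grams differ by at most `tolerance`."""
    return abs(a - b) <= tolerance


def dates_close(a, b, max_days=2):
    """True when two dates differ by at most `max_days` days."""
    return abs((a - b).days) <= max_days


def is_adjacent_transposition(a, b):
    """True when b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


# Toy values, not study data.
print(weights_close(3454, 3456))                       # True
print(round_to_nearest_10(3454), round_to_nearest_10(3456))  # 3450 3460
print(dates_close(date(2015, 3, 1), date(2015, 3, 2)))  # True
print(is_adjacent_transposition("45229", "45292"))      # True
```

The second line of the example shows that two weights within two grams of each other can still fall on opposite sides of a rounding boundary, so rounding to the nearest 10 grams does not by itself guarantee agreement.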