Alison J Gibberd (1), Judy M Simpson (2), Sandra J Eades (3).
(1) Sydney School of Public Health, The University of Sydney, Edward Ford Building, Sydney, New South Wales 2006, Australia. Electronic address: a.gibberd@unswalumni.com.
(2) Sydney School of Public Health, The University of Sydney, Edward Ford Building, Sydney, New South Wales 2006, Australia.
(3) Aboriginal Health Domain, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne, Victoria 3004, Australia.
Abstract
BACKGROUND AND OBJECTIVES: Algorithms are often used to improve identification of Aboriginal Australians in linked data sets with inconsistent and incomplete recording of Aboriginal status. We compared how consistently some common algorithms identified family members, developed a new algorithm incorporating relatives' information, and assessed the effects of these algorithms on health estimates.

METHODS: The sample comprised people born 1980-2011 who were recorded as Aboriginal at least once (or who had a relative so recorded) in four Western Australian data sets, together with their relatives (N = 156,407). A very inclusive approach, ever-Aboriginal (EA/EA+, where + denotes that children's records were incorporated), and two more specific approaches, multistage median (MSM/MSM+) and last record (LR/LR+), were chosen, along with the new algorithm (MSM+Family).

RESULTS: Ever-Aboriginal (EA) categorized relatives the least consistently; 25% of parent-child triads had incongruent Aboriginal statuses with EA+, compared with only 9% with MSM+. With EA+, 14% of full siblings had different statuses, compared with 8% for MSM+. EA produced the lowest estimates of the proportion of Aboriginal people with poor health outcomes. Using relatives' records reduced the number of uncategorized people and categorized as Aboriginal more people who had few records (e.g., no hospital admissions).

CONCLUSION: When many data sets are linked, more specific algorithms select more representative Aboriginal samples and identify Aboriginality of relatives more consistently.
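To make the contrast between an inclusive and a more specific approach concrete, the following is a minimal Python sketch of how such rules might be applied to one person's linked records. It is illustrative only and is not the authors' implementation. The record layout (date, data set name, Aboriginal flag), the function names, and the example data are all hypothetical. In particular, `two_stage_median` is a simplified stand-in that assumes a median is taken within each data set and then across data sets; the abstract does not define the multistage median (MSM) procedure, so the actual algorithm may differ.

```python
from datetime import date
from statistics import median

# Hypothetical record layout: (record_date, dataset_name, recorded_as_aboriginal)

def ever_aboriginal(records):
    """Inclusive rule: Aboriginal if any record says so. None if no records."""
    if not records:
        return None
    return any(flag for _, _, flag in records)

def last_record(records):
    """Specific rule: take the status on the most recent record."""
    if not records:
        return None
    return max(records, key=lambda r: r[0])[2]

def two_stage_median(records):
    """Simplified assumption, not the published MSM definition:
    median of 0/1 flags within each data set, then median across data sets.
    A result of exactly 0.5 is treated as uncategorized (None)."""
    by_dataset = {}
    for _, dataset, flag in records:
        by_dataset.setdefault(dataset, []).append(1 if flag else 0)
    if not by_dataset:
        return None
    overall = median(median(flags) for flags in by_dataset.values())
    if overall == 0.5:
        return None
    return overall > 0.5

# Made-up example: one Aboriginal flag among four records in two data sets.
example = [
    (date(2001, 3, 1), "hospital", True),
    (date(2005, 7, 9), "hospital", False),
    (date(2010, 1, 20), "hospital", False),
    (date(2003, 5, 5), "births", False),
]

print(ever_aboriginal(example), last_record(example), two_stage_median(example))
```

In this made-up case a single Aboriginal flag is enough for the inclusive rule to categorize the person as Aboriginal, while the two more specific rules do not, which is the kind of divergence that can make relatives' statuses inconsistent under an ever-Aboriginal approach.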