BACKGROUND: This study assessed the usefulness and accuracy of a computer program for identifying the south Asian population by classifying names on a disease register. METHODS: The computer program, Nam Pehchan, was used to classify names as either south Asian or non-south Asian. The results were compared with a reference standard that combined use of the program with visual inspection. Visual inspection was facilitated by a computer-generated dictionary of common non-south Asian names. The data set consisted of 356,555 cases of incident cancer (ICD9: 140-208) registered between 1990 and 1992 by the Thames, Trent, West Midlands and Yorkshire cancer registries. RESULTS: Nam Pehchan classified 5506 cases as south Asian. Visual inspection identified 2024 false positives (36.8 per cent of all cases identified as south Asian by Nam Pehchan) and 363 false negatives (9.5 per cent of those identified by the reference standard). Compared with the reference standard, Nam Pehchan had a sensitivity of 90.5 per cent and a positive predictive value of 63.2 per cent. CONCLUSION: Nam Pehchan quickly identified a high proportion of the names classified as south Asian by the reference standard, but the high proportion of false positives means that the program is not adequate as a single strategy. For large data sets, the time-consuming inspection of program negatives can be substantially reduced by comparison with dictionaries of common non-south Asian names.
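The sensitivity and positive predictive value reported in the RESULTS can be approximately reproduced from the three counts given there. The sketch below is illustrative only: the function name `classification_metrics` and its parameter names are hypothetical, and it assumes that true positives equal program positives minus false positives, and that the reference-standard total equals true positives plus false negatives. Under those assumptions the recomputed sensitivity is about 90.6 per cent, against the reported 90.5 per cent; the small difference may reflect rounding or truncation in the original figures.

```python
def classification_metrics(program_positives, false_positives, false_negatives):
    """Derive sensitivity and PPV from counts (hypothetical helper, not from the study)."""
    # Assumption: every program positive is either a true or a false positive.
    true_positives = program_positives - false_positives
    # Assumption: the reference standard total is true positives plus missed cases.
    reference_positives = true_positives + false_negatives
    return {
        "true_positives": true_positives,
        "reference_positives": reference_positives,
        # Share of reference-standard cases that the program found.
        "sensitivity": true_positives / reference_positives,
        # Share of program positives confirmed by the reference standard.
        "ppv": true_positives / program_positives,
    }


# Counts taken from the RESULTS section of the abstract.
metrics = classification_metrics(
    program_positives=5506,
    false_positives=2024,
    false_negatives=363,
)

print(f"True positives:      {metrics['true_positives']}")
print(f"Reference positives: {metrics['reference_positives']}")
print(f"Sensitivity:         {metrics['sensitivity']:.1%}")
print(f"PPV:                 {metrics['ppv']:.1%}")
```

Working from counts rather than percentages keeps the denominators explicit: sensitivity is measured against the reference-standard total, whereas positive predictive value is measured against the program's own positives.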
Authors: Hude Quan; Nadia Khan; Bing Li; Karin H Humphries; Peter Faris; P Diane Galbraith; Michelle Graham; Merril L Knudtson; William A Ghali Journal: Can J Cardiol Date: 2010 Aug-Sep Impact factor: 5.223
Authors: D R Webb; K Khunti; B Srinivasan; L J Gray; N Taub; S Campbell; J Barnett; J Henson; S Hiles; A Farooqi; S J Griffin; N J Wareham; M J Davies Journal: Trials Date: 2010-02-19 Impact factor: 2.279
Authors: Baiju R Shah; Maria Chiu; Shubarna Amin; Meera Ramani; Sharon Sadry; Jack V Tu Journal: BMC Med Res Methodol Date: 2010-05-15 Impact factor: 4.615
Authors: M van Laar; P A McKinney; R C Parslow; A Glaser; S E Kinsey; I J Lewis; S V Picton; M Richards; G Shenton; D Stark; P Norman; R G Feltbower Journal: Br J Cancer Date: 2010-09-14 Impact factor: 7.640
Authors: Aman P K Nijjar; Hong Wang; Kaberi Dasgupta; Doreen M Rabi; Hude Quan; Nadia A Khan Journal: Cardiovasc Diabetol Date: 2010-01-22 Impact factor: 9.951