Understanding the limits of large datasets.

Catherine M Sanders, Sidney L Saltzstein, Matthew M Schultzel, Duy H Nguyen, Helen Shi Stafford, Georgia Robins Sadler.

Abstract

Many health professionals use large datasets to answer behavioral, translational, or clinical questions. Understanding the impact of missing data in large databases, such as disease registries, can help avoid erroneous interpretations of these data. Using the California Cancer Registry, the authors selected seven common cancers, seven sociodemographic and clinical variables, and the top three reporting sources as examples of the type of data that would be deemed critical to most studies. The gender variable had no missing data, followed by age (<0.1% missing), ethnicity (1.7%), stage (9.8%), differentiation (39.1%), and birthplace (41.1%). Reports from hospitals and clinics had the lowest percentages of missing data. Users of large datasets should anticipate the limitations of missing data to prevent methodological flaws and misinterpretations of research findings. Knowing which data are missing, and how much, can help prevent errors in research conclusions and better guide treatment modalities and public health policies and programs.
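The per-variable missingness figures reported in the abstract are simple proportions, and a reader auditing their own dataset could compute the same kind of summary along the following lines. This is only a minimal sketch, not the authors' method: the function names (`is_missing`, `percent_missing`), the field names, the set of codes treated as missing, and the four toy records are all hypothetical and are not drawn from the California Cancer Registry.

```python
# Minimal sketch: percent of records with a missing value, per variable.
# All names, codes, and records here are illustrative assumptions.

# Assumption: blank strings and the literal "unknown" count as missing.
MISSING_CODES = frozenset({"", "unknown"})


def is_missing(value, missing_codes=MISSING_CODES):
    """Return True if a field value should be counted as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip().lower() in missing_codes
    return False


def percent_missing(records, variables, missing_codes=MISSING_CODES):
    """Return {variable: percent of records whose value is missing}."""
    records = list(records)
    if not records:
        raise ValueError("at least one record is required")
    report = {}
    for var in variables:
        # A record that lacks the key entirely is also counted as missing.
        n_missing = sum(
            1 for rec in records if is_missing(rec.get(var), missing_codes)
        )
        report[var] = 100.0 * n_missing / len(records)
    return report


# Toy records (hypothetical, for illustration only).
toy_records = [
    {"sex": "F", "age": 61, "birthplace": "California"},
    {"sex": "M", "age": 70, "birthplace": ""},
    {"sex": "F", "age": None, "birthplace": "Unknown"},
    {"sex": "M", "age": 55, "birthplace": None},
]

report = percent_missing(toy_records, ["sex", "age", "birthplace"])
for name, pct in report.items():
    print(f"{name}: {pct:.1f}% missing")
```

Running such a summary before any analysis shows which variables can support a study and which, like birthplace in the abstract, may be missing too often to use without caution.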

Year: 2012 PMID: 22729362 PMCID: PMC4153382 DOI: 10.1007/s13187-012-0383-7

Source DB: PubMed Journal: J Cancer Educ ISSN: 0885-8195 Impact factor: 2.037
