Jalpa A Doshi, Franklin B Hendrick, Jennifer S Graff, Bruce C Stuart.
Abstract
INTRODUCTION: High-quality research regarding treatment effectiveness, quality, and value is critical for improving the U.S. health care system. Recognition of this has led federal and state officials to better leverage existing data sources such as medical claims and survey data, but access must be balanced with privacy concerns.
Keywords: Data access policies; all-payer claims datasets; privacy; public datasets; research
Year: 2016 PMID: 27141517 PMCID: PMC4827788 DOI: 10.13063/2327-9214.1204
Source DB: PubMed Journal: EGEMS (Wash DC) ISSN: 2327-9214
Characteristics and Access for Selected Federal Data Sets Containing Protected Health Information
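| Data owner | Data set | Data contents | Type of file | Identifiers included | Restrictions on purpose | Restrictions on requestor | Restrictions on funding source |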
|---|---|---|---|---|---|---|---|
| Centers for Medicare and Medicaid Services (CMS) | Chronic Condition Data Warehouse (CCW) files | Enrollment information and claims data for Medicare Part A, B, and D including service dates, diagnoses, procedures, charges, and payments, plus functional assessment data (MDS, OASIS), Part D plan characteristics and formulary data, prescriber and dispenser characteristics. | RIF | 5-digit ZIP code, service dates, birth dates, ages >89, death dates | Only for research that advances CMS mission | No explicit restrictions | No explicit restrictions, but requestor must be deemed independent of commercial funding source |
| Centers for Medicare and Medicaid Services (CMS) | Medicaid Analytic eXtract (MAX) files | Enrollment information and inpatient, long-term care, prescription, and other claims and encounter records for all Medicaid recipients. Includes service dates, diagnoses, procedures, charges, and payments. | RIF | 5-digit ZIP code, service dates, birth dates, ages >89, death dates | Only for research that advances CMS mission | No explicit restrictions | No explicit restrictions, but requestor must be deemed independent of commercial funding source |
| Centers for Medicare and Medicaid Services (CMS) | Medicare Standard Analytical Files (SAF) | Enrollment information and claims data for Medicare Part A and B including service dates, diagnoses, procedures, charges, and payments. | LDS | County, service dates, birth dates, ages >89, death dates | Only for research that advances CMS mission | No explicit restrictions | No explicit restrictions |
| Centers for Medicare and Medicaid Services (CMS) | Medicare Current Beneficiary Survey (MCBS) | Panel survey of a nationally representative sample of the Medicare population. Survey data on socioeconomic and demographic characteristics, health status and functioning, health care use and expenditures, health insurance coverage, and access to care. MCBS files come linked to Part A, B, and D claims records including service dates, diagnoses, procedures, charges, and payments. | LDS | 5-digit ZIP code, service dates, birth dates, ages >89, death dates | Only for research that advances CMS mission | No explicit restrictions | No explicit restrictions |
| National Institutes of Health (NIH) National Cancer Institute (NCI) | Surveillance, Epidemiology, and End Results (SEER)–Medicare Linked Database | Data from cancer registries linked with Medicare enrollment and claims files. | LDS | County, service dates, birth (year and month), death dates | Only for research purposes | No explicit restrictions | Commercial funders must provide letter indicating researcher has freedom to publish. |
| National Institutes of Health (NIH) National Institute on Aging (NIA) | Health and Retirement Study (HRS) linked with Medicare enrollment and claims | Biennial panel survey of Americans over the age of 50. Survey data include demographic and socioeconomic characteristics, labor force participation, retirement, health status, and functioning. Can be linked with restricted data including interview and birth date, Medicare enrollment, and Part A, B, and D claims files. | HRS-Restricted Data (NIA), Medicare enrollment and claims-RIF (CMS) | 5-digit ZIP code, service dates, birth dates, ages > 89, death dates | Only for research and statistical purposes (NIA). Only for research that advances CMS’s mission (CMS) | Only if affiliated with an institution with a DHHS certified Human Subjects review process | The project must be funded by U.S. government grant or contract or foundation. |
| Centers for Disease Control and Prevention (CDC) | National Health and Nutrition Examination Survey (NHANES) linked with Medicare enrollment and claims | NHANES data can be linked with Medicare enrollment and Part A, B, and D claims files. | NHANES-PUF (CDC), Medicare enrollment and claims-RIF (CMS) | 5-digit ZIP code, service dates, birth dates, ages > 89, death dates | Research must have public health benefit. | No explicit restrictions | No explicit restrictions |
| U.S. Food and Drug Administration (FDA) | FDA Mini-Sentinel distributed data network | Mini-Sentinel is a pilot project sponsored by the FDA to create an active surveillance system (the Sentinel System) to monitor the safety of FDA-regulated medical products. Mini-Sentinel uses preexisting electronic health care data from multiple sources. Collaborating institutions provide access to data, as well as scientific and organizational expertise. Data include enrollment, demographic, prescription drug, and medical procedure information. | Not specified | 5-digit ZIP code, service dates, birth dates, ages >89, death dates | Limited to Mini-Sentinel’s public health purposes (e.g., active surveillance, assessment of the impact of FDA actions). Expanded Sentinel Initiative will include broader research component. | N/A for Mini-Sentinel in the pilot phase. Only FDA and Mini-Sentinel Collaborators have access to data. | N/A for Mini-Sentinel in the pilot phase |
| Veterans Health Administration (VHA) | Veterans Health Information System and Technology Architecture (VistA) | VA-wide information system built around electronic medical record data relating to veterans’ health care, with nearly 160 integrated software modules for clinical care, financial functions, and infrastructure. | Not specified | 5-digit ZIP code, service dates, birth dates, death dates | No explicit restrictions | Only VA employees. | No explicit restrictions |
Notes: (DHHS) Department of Health & Human Services; (FFS) fee-for-service; (LDS) limited data set; (MDS) Minimum Data Set; (OASIS) Outcome and Assessment Information Set; (PUF) public use file; (RIF) research identifiable file.
Data are based on information available as of November 2014.
“Type of file” is based on the terminology used by the data owner.
Data Request Process, Cost, and Timelines; Data Storage and Access Policies; and Latest Years of Data for Selected Federal Data Sets
| Chronic Condition Data Warehouse (CCW) files | Fixed fee per file type per year of data depending on number of beneficiary lives requested ($$$); data reuse fee ($$); for virtual access fixed fee per user ($$$) | Yes | Yes | Yes | Yes (for Part D data and functional assessment files) | Yes | Yes | ∼6 to 18 weeks | (1) Physical data files mailed to researcher (rigorous policies on securing research environment), and (2) Remote access | Medicare files (2012) |
| Medicaid Analytic eXtract (MAX) files | Fixed fee per file type per year of data depending on number of beneficiary lives requested ($$$); data reuse fee ($$); for virtual access fixed fee per user ($$$) | Yes | Yes | Yes | Yes (for functional assessment files) | Yes | Yes | ∼6 to 18 weeks | (1) Physical data files mailed to researcher (rigorous policies on securing research environment), and (2) Remote access | MAX files (2010) |
| Medicare Standard Analytical Files (SAF) | Fixed fee per year of data ($$); data reuse fees ($) | Yes | Yes | Yes | No | No | No | ∼3 to 4 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | All files (2011) |
| Medicare Current Beneficiary Survey (MCBS) | Fixed fee per year of data ($) | Yes | Yes | Yes | No | No | No | ∼3 to 4 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | Access to Care (2012), Cost and Use (2010) |
| Surveillance, Epidemiology, and End Results (SEER)–Medicare Linked Database | Fixed fee per file (per year depending on type of file) by type and number of cancer sites ($$) | Yes | Yes | Yes | No | Yes | No | ∼4 to 6 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | SEER cases (2011) with Medicare enrollment and claims (2012) |
| Restricted version of the Health and Retirement Study (HRS) linked with Medicare enrollment and claims | Fixed fee per year ($$). | Yes | Yes | Yes | Yes (for Part D data and assessment files) | Yes | Yes | See Table Note | Physical data files mailed to researcher (rigorous data security policies apply) | Linked with CMS Medicare enrollment and claims files (2012) |
| Restricted version of the National Health and Nutrition Examination Survey (NHANES) linked with Medicare enrollment and claims | Fixed fee per day for on-site access ($); fixed fee per day for staff-assisted research ($). | Yes | N/A | Yes | Yes | Yes | Yes | ∼6 to 8 weeks | (1) On-site access, (2) remote access, or (3) staff-assisted research option | 1999–2004 NHANES linked through 2007 Medicare enrollment and claims files |
| FDA Mini-Sentinel distributed data network | No costs. | N/A for public health operations | | | | | | N/A | Data Partners run queries on their own data and provide aggregate results to the Coordinating Center. Data held outside of the Data Partner’s environment must meet rigorous data security policies | 2014 |
| Veterans Health Information System and Technology Architecture (VistA) | No fee for local VHA data; fixed fee per file per year for national files ($$) | Yes | N/A | Yes | No | Yes | Yes | At least 4 weeks | (1) On-site access, (2) Remote access | 2014 |
Notes: (CMS) Centers for Medicare & Medicaid Services; (DMP) data management plan; (DUA) data use agreement; (FFS) fee-for-service; (IRB) institutional review board.
Data are based on information available as of November 2014.
$ denotes typically < $1000; $$ denotes typically >=$1000 and < $10,000; $$$ denotes typically > $10,000.
The requestor must first obtain approval, which can take from under 3 months to more than a year, depending on the research environment (e.g., a computer without internet access locked in a room versus a networked computer). The request must then obtain CMS approval, which can take 6 to 18 weeks.
Characteristics and Restrictions on Access to Selected State Funded All Payer Claims Data Sets (APCDs)
| Data owner | Data set | Data contents | Type of file | Identifiers included | Restrictions on purpose | Restrictions on requestor | Restrictions on funding source |
|---|---|---|---|---|---|---|---|
| Vermont Green Mountain Board | Vermont Healthcare Claims Uniform Reporting and Evaluation System ( | Enrollment information; medical and pharmacy claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicare (FFS under DUA with CMS, Medicare Advantage, Medicare Part D); Medicaid (FFS and managed care) | Not specified | Street address, service dates, birth dates, ages > 89, death dates | N/A | Only state govt. agencies or contractors | N/A |
| Massachusetts Center for Health Information and Analysis | Massachusetts All-Payer Claims Database ( | Enrollment information; medical, pharmacy, and dental claims; and provider information from the following: commercial payers; third party administrators; Medicare (FFS under DUA with CMS, Medicare Advantage, Medicare Part D); Medicaid and MassHealth (FFS and managed care) | Level 2 File | City name, service dates, birth dates (month and year only), ages > 89 | Only if purpose serves the public interest. Purpose of Medicare FFS data must fall under the DUA with CMS. Use of Medicaid data must be connected with Medicaid program | Only state govt. agencies, researchers, providers, and qualified individuals. Only state govt. agencies can use Medicare FFS data. | No explicit restrictions |
| Maine Health Data Organization | Maine All-Payer Claims Database ( | Enrollment information; medical, dental, and pharmacy claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicare (FFS, Medicare Advantage, Medicare Part D); Medicaid (FFS and managed care) | Not specified | City name, service dates, birth dates, ages > 89 | Only for research or statistical purposes. Medicare FFS data must fall under the DUA with CMS | No explicit restrictions | No explicit restrictions |
| New Hampshire Dept. of Health and Human Services | New Hampshire Comprehensive Health Care Information System ( | Enrollment information; medical, pharmacy, and dental claims; and provider information from the following: commercial payers; Medicaid (managed care, no FFS Medicaid in state). There are plans to include FFS and Part D from the Medicare program in the future. | Commercial Limited Use Data Set | City name, service dates, birth dates, ages > 89 | Only for research purposes. | No explicit restrictions | No explicit restrictions |
| Minnesota Dept. of Health | Minnesota’s All-Payer Claims Database ( | Enrollment information; medical and pharmacy claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicare (FFS under DUA with CMS, although the Minnesota Department of Health is also a Qualified Entity; Medicare Advantage; Medicare Part D); Medicaid (FFS and managed care) | Not specified | City name, service dates, birth dates, ages > 89 | N/A | Only state govt. agencies or contractors | N/A |
| Kansas Dept. of Health and Environment (KDHE) Division of Health Care Finance (DHCF) | Kansas Data Analytic Interface ( | Enrollment information; medical, dental, and pharmacy claims from the following: State Employee Health Plan; Medicaid (managed care, no FFS Medicaid in state) | Not specified | Street address, service dates, birth dates, ages > 89, death dates | N/A | Only state govt. agency and select entities working with the state | N/A |
| Tennessee Division of Health Planning | Tennessee All-Payer Claims Database ( | Enrollment information; medical and pharmacy claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicaid (managed care, no FFS Medicaid in state). There are plans to include data from the Medicare program in the future. | Not specified | 5-digit ZIP code, service dates, birth dates (year only), ages > 89 | N/A | State govt. agency | N/A |
| Colorado Dept. of Health Care Policy and Financing (HCPF) and Center for Improving Value in Health Care | Colorado All-Payer Claims Database ( | Enrollment information; medical and pharmacy claims; and provider information from the following: commercial payers (there are plans to include self-funded and third party administrators); Medicare (Medicare Advantage, Medicare Part D); Medicaid (managed care, almost no FFS in state). There are plans to include FFS data from the Medicare program in the future. | Limited Data Set | 5-digit ZIP code, service dates, birth dates, ages > 89 | Only research supporting the Colorado Triple Aim of better health, better care, and lower costs. | No explicit restrictions | No explicit restrictions |
| | | | Identifiable Information Data Set | Street address, service dates, birth dates, ages > 89 | Comparative effectiveness research | Researchers and health care providers | No explicit restrictions |
| Maryland Health Care Commission (MHCC) | Maryland Medical Care Database ( | Enrollment information; medical, pharmacy, and dental claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicare (FFS under DUA with CMS, Medicare Advantage, Medicare Part D); Medicaid | Not specified | 5-digit ZIP code, service dates, birth dates (year and month only), ages > 89 | Purpose of Medicare FFS data must fall under the DUA with CMS. No explicit restrictions for other data | Only the MHCC can use Medicare FFS data. No explicit restrictions for other data | No explicit restrictions |
| Office for Oregon Health Policy and Research | Oregon All Payer All Claims Database Limited Data Set ( | Enrollment information; medical and pharmacy claims; and provider information from the following: commercial payers; self-funded and third party administrators; Medicare (Medicare Advantage, Medicare Part D); Medicaid (FFS and managed care) | Limited Data Set | 5-digit ZIP code, service dates, birth dates (year only), ages > 89 | Only if purpose serves the public interest and supports the mission and aims of the Oregon Health Authority | No explicit restrictions | No explicit restrictions |
Notes: (CMS) Centers for Medicare & Medicaid Services; (DUA) data use agreement; (FFS) fee-for-service.
Data are based on information available as of November 2014.
States in most cases do not use the research identifiable file and limited data set terminology. The current listings come from the language found on state websites.
Data Cost, Request Process and Timelines, Data Storage, and Recency for State All Payer Claims Data Sets (APCDs) Available to External Users
| Massachusetts All-Payer Claims Database | Level 2 File | Fixed fee per file type of data depending on requestor: academic, $$$; and others, $$$ | Yes | Yes | Yes | Yes | No | No | ∼12 to 20 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | 2012 |
| Maine All-Payer Claims Database | | Fixed fee per file type per year of data depending on requestor: commercial, $$$; assessed (e.g., provider, health insurer, etc.), $$$; nonprofit and educational, $$; redistributor, $$$ | No | No | Yes | No | No | No | At least 6 weeks for nonpractitioner-identifiable data; at least 8 weeks for practitioner-identifiable data | Physical data files mailed to researcher (no specific policies on securing research environment) | Commercial & Medicaid (2014) |
| New Hampshire Comprehensive Health Care Information System | Commercial Limited Use Data Set | No cost | Yes | No | Yes | Yes | No | No | ∼12 weeks | Physical data files mailed to researcher (no specific policies on securing research environment) | 2014 |
| Colorado All-Payer Claims Database | Limited Data Set | Licensing fee based on requestor: nonprofit with revenues $3–$5 mil, $$$; nonprofit with revenues of $1–$3 mil, $$$; nonprofit with revenues <$1 mil, $$$; academic institutions, $$$; state agencies, $$; other organizations, $$$ | Yes | Yes | Yes | Yes | No | No | ∼4 to 11 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | 2012 |
| | Identifiable Information Data Set | | Yes | Yes | Yes | Yes | IRB approval, Privacy Board Review Approval, or proof of patient authorization | | | Physical data files mailed to researcher (rigorous policies on securing research environment) | |
| Maryland Medical Care Database | | No cost | Yes | Yes | Yes | Yes | Yes | No | At least 4 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | 2014 |
| Oregon All Payer All Claims Database Limited Data Set | Limited Data Set | Fixed fee per file type per year of data ($$) | Yes | Yes | Yes | Yes | Yes | No | At least 6 weeks | Physical data files mailed to researcher (rigorous policies on securing research environment) | 2011 |
Notes: (DMP) data management plan; (DUA) data use agreement; (FFS) fee-for-service; (IRB) institutional review board.
Data are based on information available as of November 2014.
$ denotes typically < $1000; $$ denotes typically >=$1000 and < $10,000; $$$ denotes typically > $10,000.
Academic institutions, state agencies, and nonprofits with revenue less than $5 million are eligible for external APCD scholarship support that covers almost two-thirds of the cost, on average.
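The cost symbols in the table notes ($, $$, $$$) and the scholarship note can be read as simple arithmetic. The following is a minimal sketch of that reading, not part of the source: the function names `cost_tier` and `net_cost_after_scholarship` are hypothetical, and the two-thirds coverage figure is taken as a rough average from the note above rather than a fixed rule. The legend as written does not assign a symbol to exactly $10,000, so the sketch returns `None` for that value instead of guessing.

```python
from typing import Optional


def cost_tier(amount_usd: float) -> Optional[str]:
    """Map a dollar amount to the legend's symbols ($, $$, $$$).

    Thresholds follow the table notes: < $1000, >= $1000 and < $10,000,
    and > $10,000. Exactly $10,000 is not covered by the legend, so
    None is returned for it.
    """
    if amount_usd < 0:
        raise ValueError("amount_usd must be non-negative")
    if amount_usd < 1_000:
        return "$"
    if amount_usd < 10_000:
        return "$$"
    if amount_usd > 10_000:
        return "$$$"
    return None  # exactly $10,000: legend is silent


def net_cost_after_scholarship(list_price_usd: float,
                               covered_fraction: float = 2 / 3) -> float:
    """Estimate the out-of-pocket cost after scholarship support.

    covered_fraction defaults to two-thirds, an ASSUMPTION based on the
    note that support covers "almost two-thirds of the cost, on average".
    """
    if not 0 <= covered_fraction <= 1:
        raise ValueError("covered_fraction must be between 0 and 1")
    return list_price_usd * (1 - covered_fraction)


if __name__ == "__main__":
    for amount in (500, 1_000, 9_999, 10_000, 25_000):
        print(amount, cost_tier(amount))
    print(round(net_cost_after_scholarship(15_000), 2))
```

Because the scholarship figure is an average, the second function gives only a planning estimate; actual support for a given requestor may differ.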