Synthetic Healthcare Database for Research (SyH-DR)
The Synthetic Healthcare Database for Research (SyH-DR) is an all-payer, nationally representative claims database. The database consists of a sample of inpatient, outpatient, and prescription drug claims, including utilization, payment, and enrollment data, for people insured by Medicare, Medicaid, or commercial health insurance in 2016. AHRQ created SyH-DR, in part, as a resource to facilitate improvements to price and quality transparency in healthcare.
SyH-DR is a synthetic database that replicates the structure and statistical properties of the original claims data while protecting privacy and confidentiality of people and institutions. Synthetic data are created by statistically modeling or changing original data so that new values or data elements are generated while maintaining the original data's statistical properties. Additional steps, such as masking, are taken to reduce the risk of identifying people and institutions so that the data may be made publicly available to a broad community of researchers.
An approved application and data use agreement are required for access to SyH-DR.
Overview of SyH-DR
- The Agency for Healthcare Research and Quality (AHRQ) created SyH-DR from eligibility and claims files for Medicare, Medicaid, and commercial insurance plans in calendar year 2016.
- SyH-DR contains data from a nationally representative sample of insured individuals for the 2016 calendar year.
- Claim-level data elements in SyH-DR are synthetic and are generated to resemble the marginal distributions of the original data elements.
- SyH-DR person-level data elements are not synthetic, but identifying information is aggregated or masked. See the data documentation for a complete list of synthetic, masked, and retained data elements.
- Although SyH-DR was designed to be analytically valid, researchers should be aware of the recommendations and limitations described in the data documentation.
More information is available in Introduction to Synthetic Healthcare Database for Research (PDF, 447 KB).
Data Files in SyH-DR
SyH-DR consists of 14 files. Each of the three insurance categories (Medicare, Medicaid, and Commercial) has three claims files (inpatient, outpatient, and pharmacy) and a person file. In addition, two provider files provide a limited set of hospital characteristics and are linkable to the claims files by the facility ID. All variables were harmonized across payers so that the files have the same structure, variable names, and definitions to allow for ease of analysis across payers.
Claims and Person-Level Files
- Commercial Inpatient File.
- Commercial Outpatient File.
- Commercial Person-Level File.
- Commercial Pharmacy File.
- Medicaid Inpatient File.
- Medicaid Outpatient File.
- Medicaid Person-Level File.
- Medicaid Pharmacy File.
- Medicare Inpatient File.
- Medicare Outpatient File.
- Medicare Person-Level File.
- Medicare Pharmacy File.
Provider Files
- Medicaid Provider File.
- Medicare Provider File.
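The file layout described in this section can be sketched in code. The following is a minimal Python illustration, not AHRQ tooling: it enumerates the 14 files by payer and file type and shows a left join of claim rows to provider rows on a facility identifier. The column names `facility_id`, `claim_id`, and `bed_size`, and the toy rows, are hypothetical placeholders; the actual variable names are listed in the SyH-DR Data Dictionary.

```python
from itertools import product

PAYERS = ["Commercial", "Medicaid", "Medicare"]
CLAIM_TYPES = ["Inpatient", "Outpatient", "Pharmacy"]
# Provider files are listed only for Medicaid and Medicare.
PROVIDER_PAYERS = ["Medicaid", "Medicare"]


def list_files():
    """Return the names of all SyH-DR files as described in this section."""
    claims = [f"{p} {c} File" for p, c in product(PAYERS, CLAIM_TYPES)]
    persons = [f"{p} Person-Level File" for p in PAYERS]
    providers = [f"{p} Provider File" for p in PROVIDER_PAYERS]
    return claims + persons + providers


def link_claims_to_providers(claims, providers, key="facility_id"):
    """Left-join claim rows to provider rows on `key` (hypothetical name).

    Claims with no matching provider row are kept unchanged.
    """
    by_id = {row[key]: row for row in providers}
    linked = []
    for claim in claims:
        provider = by_id.get(claim.get(key), {})
        # Claim values take precedence if a column appears in both rows.
        linked.append({**provider, **claim})
    return linked


if __name__ == "__main__":
    print(len(list_files()))  # 9 claims + 3 person + 2 provider files

    # Toy rows for illustration only; not real SyH-DR data.
    toy_claims = [
        {"claim_id": 1, "facility_id": "A"},
        {"claim_id": 2, "facility_id": "Z"},
    ]
    toy_providers = [{"facility_id": "A", "bed_size": "large"}]
    for row in link_claims_to_providers(toy_claims, toy_providers):
        print(row)
```

A left join is used here so that claims without a matching provider record are retained rather than silently dropped. Because the files are harmonized across payers, the same function could in principle be applied to each payer's claims without renaming columns.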
SyH-DR Requests and Documentation
Data Request
AHRQ approval is required for access to SyH-DR. To request access to SyH-DR, follow the steps included in the Data Request Guide (PDF, 640 KB) and submit the required application form (PDF, 391 KB) and data use agreement (PDF, 875 KB). Completed applications will be reviewed by AHRQ.
Documentation
- SyH-DR Sampling, Weighting, and Synthetization Methodologies (PDF, 868 KB)
- SyH-DR Codebook (PDF, 746 KB)
- SyH-DR Data Dictionary:
  - Data Dictionary Contents (CSV, 2.3 MB)
  - Person-Level Variables (PDF, 115 KB)
  - Claims-Level Variables (PDF, 183 KB)
  - Pharmacy Variables (PDF, 116 KB)
- Introduction to Synthetic Healthcare Database for Research (PDF, 447 KB)
Known Data Issues
- Maryland Medicaid Diagnosis Codes, January 2024 (PDF, 212 KB)
- Primary Diagnosis Imputed Flag, January 2024 (PDF, 233 KB)
- Removal of Synthetic Drug ID, January 2024 (PDF, 261 KB)
- State Code Label, January 2024 (PDF, 252 KB)
Summary Statistics
All files are public use files and are weighted. The summary statistics were updated to reflect changes made to the SyH-DR database in January 2024.
- Commercial Inpatient File (PDF, 640 KB)
- Commercial Outpatient File (PDF, 741 KB)
- Commercial Person-Level File (PDF, 157 KB)
- Commercial Pharmacy File (PDF, 86 KB)
- Medicaid Inpatient File (PDF, 482 KB)
- Medicaid Outpatient File (PDF, 626 KB)
- Medicaid Person-Level File (PDF, 217 KB)
- Medicaid Pharmacy File (PDF, 86 KB)
- Medicaid Provider File (PDF, 80 KB)
- Medicare Inpatient File (PDF, 841 KB)
- Medicare Outpatient File (PDF, 877 KB)
- Medicare Person-Level File (PDF, 234 KB)
- Medicare Pharmacy File (PDF, 87 KB)
- Medicare Provider File (PDF, 80 KB)
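Because the files are weighted, estimates computed from them would typically apply the weights. As a rough sketch only, assuming each record carries a numeric weight, a weighted mean can be computed as shown below. The function name and its `values` and `weights` arguments are hypothetical and are not taken from the SyH-DR documentation; the actual weight variables and recommended estimation approach are described in the Sampling, Weighting, and Synthetization Methodologies document.

```python
def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


if __name__ == "__main__":
    # Toy numbers for illustration only; not real SyH-DR data.
    print(weighted_mean([100.0, 300.0], [1.0, 3.0]))  # (100 + 900) / 4 = 250.0
```

This covers point estimates only; it does not address variance estimation.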