This appendix briefly describes some analytical methods that can help you estimate program impacts: using a difference-in-difference approach, conducting multivariate analysis that controls for patient- and practice-level variables, and adjusting standard errors for clustering and multiple comparisons. These are complex topics, and the summaries here are intended only as an introduction to the general concepts underlying them. Again, it is often most efficient to consult with an experienced evaluator to explore the analytic methods that are most appropriate for your evaluation questions and design.
Estimate Effects Using a Difference-in-Difference Approach
We recommend that evaluations calculate difference-in-difference estimates of program impacts: subtract the difference in a given outcome between the intervention and comparison groups before the intervention began from the difference in that same outcome during the intervention. This approach assumes that, had the intervention not occurred, any pre-intervention differences between intervention and comparison practices in both the levels and trends of their outcomes would have persisted. Thus, for example, in the case of improved access through email described in Figure 2, the impact of the intervention is the change in access over time for patients in intervention practices after netting out the change in access over time experienced by patients in comparison practices.
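For readers who work with patient-level data directly, the sketch below illustrates this calculation. It is a minimal example rather than a prescribed implementation: the Python/pandas tooling and the column names (outcome, treated, post) are assumptions made for illustration only.

```python
import pandas as pd

def diff_in_diff(df: pd.DataFrame) -> float:
    """Difference-in-difference estimate from a hypothetical patient-level
    DataFrame with columns:
      outcome - the measured outcome (for example, an access score)
      treated - 1 for intervention practices, 0 for comparison practices
      post    - 1 for observations during the intervention, 0 for before
    """
    means = df.groupby(["treated", "post"])["outcome"].mean()
    pre_gap = means[1, 0] - means[0, 0]    # intervention minus comparison, before
    post_gap = means[1, 1] - means[0, 1]   # intervention minus comparison, during
    return post_gap - pre_gap              # change not explained by shared trends
```

The same estimate can also be obtained as the coefficient on the interaction of intervention status and the post-intervention period in a regression, which is the form used in the sketches that follow.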
Control for Differences in Patient and Practice Characteristics
Because pre-existing differences between intervention and comparison practices can affect outcomes, your analyses should use multivariate regressions that adjust estimates for differences in important patient- and practice-level variables (described previously) or that control for practice fixed effects (that is, practice-level characteristics that do not change over time).
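Continuing the hypothetical data set from the sketch above, the example below shows both options using the statsmodels formula interface; the patient covariates (age, risk_score) and the practice identifier (practice_id) are illustrative assumptions, not required variable names.

```python
import statsmodels.formula.api as smf

# df is the hypothetical patient-level DataFrame from the earlier sketch.

# Option 1: adjust for observed patient-level characteristics (hypothetical
# covariates age and risk_score); the coefficient on treated:post is the
# adjusted difference-in-difference estimate.
adjusted = smf.ols("outcome ~ treated * post + age + risk_score", data=df).fit()

# Option 2: replace observed practice characteristics with practice fixed
# effects, which absorb all practice-level traits that do not change over time.
fixed_effects = smf.ols("outcome ~ post + treated:post + C(practice_id)",
                        data=df).fit()

print(adjusted.params["treated:post"])
print(fixed_effects.params["treated:post"])
```

In the fixed-effects version, the intervention indicator itself is omitted because it does not vary within a practice and would be collinear with the practice fixed effects.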
Adjust Standard Errors for Clustering and Multiple Comparisons
You must account for clustering (for example, patients clustered within practices) when determining the statistical significance of the estimates of program effects. If clustering is ignored, a test of statistical significance might show a difference in outcomes between intervention and comparison practices to be statistically significant when it is not. In other words, ignoring the clustered nature of the data can lead to a false positive: finding an effect that does not exist.7 Similarly, if you test the effect of the intervention on numerous outcomes, you risk finding some effects just by chance. Therefore, you should assess whether you are obtaining more statistically significant results than would be expected by chance given the number of tests you are conducting. There are also more formal ways to adjust significance tests for multiple comparisons.20
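As one concrete illustration, again using the hypothetical data set and column names from the earlier sketches, standard errors can be clustered at the practice level and p-values adjusted for multiple comparisons with statsmodels; the Holm method shown here is one of several available adjustments, and the p-values listed are placeholders for illustration only.

```python
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

# df is the hypothetical patient-level DataFrame from the earlier sketches.

# Cluster standard errors at the practice level so that significance tests
# reflect the fact that patients in the same practice are not independent.
model = smf.ols("outcome ~ treated * post + age + risk_score", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["practice_id"]}
)
print(model.pvalues["treated:post"])

# When the intervention is tested on several outcomes, adjust the resulting
# p-values for multiple comparisons (Holm's method is one common choice).
p_values = [0.01, 0.04, 0.20, 0.03]  # one p-value per outcome, for illustration
rejected, adjusted_p, _, _ = multipletests(p_values, alpha=0.05, method="holm")
```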