A Visual Diagnostic Tool for Causal Inference
Rosenbaum and Rubin (1983) suggested a visual representation that can be used as a diagnostic tool for examining whether the relationships between confounders and outcomes are sufficiently controlled, or whether there is a more complex relationship that requires further adjustment. This short commentary highlights this simple tool, providing an example of its utility along with relevant R code.
causal inference, data visualization, regression diagnostics
Although numerous analytical methods are available, the most common method for confounding control is covariate adjustment; straightforward, user-friendly diagnostics for determining the adequacy of this adjustment are therefore valuable. The final section of Rosenbaum and Rubin (1983) suggests a visual representation that can be used as a diagnostic tool for such an adjustment, one that we have not seen receive focus in research workflows or pedagogical methods. Specifically, Rosenbaum and Rubin (1983) point out that when a study implements propensity score methods with a continuous outcome and a propensity score (êi) estimated using linear discriminant analysis, there are many useful properties for evaluating the fit of the final outcome model conditional on the propensity score. In this situation, the point estimate for the treatment effect in an ordinary least squares model fit with the adjustment covariates will be exactly equal to the estimate from a model adjusting for the propensity score, assuming the same functional form of all covariates is used in the fully adjusted model and the propensity score model. Therefore, a plot of the outcome (y0i and y1i) or residuals (yti − ŷti) versus the estimated propensity score (êi) will provide a two-dimensional display of the multivariate adjustment. In the case of an ordinary least squares model, the latter is a variation of the “residuals versus fits” plot, with the key distinction that the points are visually distinguished by treatment arm (for example, with a different color or shape, or even faceted separately). A simple plot such as this could be a useful diagnostic tool to assess whether the relationships between confounders and the outcomes are sufficiently controlled, or whether there is a more complex relationship that requires further adjustment, for example non-linear relationships or heterogeneous treatment effects.
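The equality of the two point estimates can be checked numerically. The following is a minimal sketch in base R under stated assumptions: the data are a hypothetical simulation (not the data behind the figures), all variable names are illustrative, and the discriminant score is built by hand from the pooled within-group covariance matrix. The score is kept on the linear discriminant scale rather than transformed to a probability.

```r
# Hypothetical simulated data: two confounders, a binary treatment,
# and a continuous outcome (all names and values are illustrative).
set.seed(1)
n   <- 500
x1  <- rnorm(n)
x2  <- rnorm(n)
trt <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))
y   <- 1 + 2 * trt + x1 + 0.5 * x2 + rnorm(n)

# Sample linear discriminant from the pooled within-group covariance matrix.
X  <- cbind(x1, x2)
X1 <- X[trt == 1, , drop = FALSE]
X0 <- X[trt == 0, , drop = FALSE]
S_pooled <- ((nrow(X1) - 1) * cov(X1) + (nrow(X0) - 1) * cov(X0)) / (n - 2)
a        <- solve(S_pooled, colMeans(X1) - colMeans(X0))
e_hat    <- drop(X %*% a)  # discriminant score (linear scale, not a probability)

# Treatment coefficient under full covariate adjustment versus
# adjustment for the single discriminant score.
b_full  <- unname(coef(lm(y ~ trt + x1 + x2))["trt"])
b_score <- unname(coef(lm(y ~ trt + e_hat))["trt"])
all.equal(b_full, b_score)
```

The agreement relies on both models using the same covariates in the same functional form; adding a squared term to one model but not the other would break it.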
Much attention in the literature has been given to the selection of confounders (VanderWeele 2019; VanderWeele and Shpitser 2011; Shrier 2008; Rubin 2008, 2009, 2007; Pearl 2000; Sjölander 2009; D’Amour et al. 2021; Häggström 2018; Greenland and Pearl 2011); once they are selected, however, the functional form of the covariates appears to be less emphasized. Even under the assumption that all confounders have been adjusted for in some capacity, substantial bias can arise in the estimated treatment effect if the correct functional form is not used. Similarly, the presence of non-parallel response surfaces for outcomes within treatment groups, or heterogeneous treatment effects, can also bias a result if not accounted for. The propensity score residual plots recommended by Rosenbaum and Rubin (1983) provide one method for evaluating whether the estimated treatment effect is distorted by the presence of non-linear or non-parallel response surfaces. Of note, if the variance of the covariates differs by treatment, covariate adjustment (either directly or via adjustment of the propensity score) without properly accounting for non-linearity will yield substantially biased results (Rubin 1973).
This variation of the “residuals versus fits” plot maps the residuals (yti − ŷti) to the y-axis and the estimated propensity score (êi) to the x-axis. Alternatively, one could plot the “standard” residuals versus fits plot (residuals on the y-axis and fitted values on the x-axis) stratified by treatment such that any differences are more clearly visible in the facets. Below is a simple example of such a plot along with the R code used to create it. The data used for the plot were simulated such that there is a heterogeneous relationship between a summary measure of the confounders (such as the true propensity score), X, and the outcome, Y, based on treatment, T, as follows:
As an illustration, we fit the following ordinary least squares model, and then created the standard residuals versus fits plot (Figure 1). We added a loess line to aid in visualizing any potential relationships.
Notice that when examining this standard plot alone, there is no obvious violation of assumptions; if anything, there appears to be a non-linear relationship between X and Y, rather than a heterogeneous effect. Using this plot alone could lead the investigator to erroneously add a non-linear term rather than the appropriate interaction.
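The way a pooled residuals versus fits plot can suggest curvature when the true problem is an interaction can be reproduced with a small sketch. Everything here is an assumption made for illustration: the data-generating process is hypothetical and is not the simulation used for Figure 1, the variable names are invented, and base R's lowess smoother stands in for a loess line.

```r
# Hypothetical simulation: treatment assignment depends on x, and the
# slope of y on x differs by arm (1 in control, 3 in treated).
set.seed(2)
n   <- 1000
x   <- rnorm(n)
trt <- rbinom(n, 1, plogis(x))
y   <- x + 2 * trt * x + rnorm(n)

# Misspecified model: main effects only, no treatment-by-x interaction.
fit_mis  <- lm(y ~ trt + x)
res      <- resid(fit_mis)
fit_vals <- fitted(fit_mis)

# Standard (pooled) residuals versus fits plot with a smoother.
plot(fit_vals, res, col = "grey40",
     xlab = "Fitted values", ylab = "Residuals")
lines(lowess(fit_vals, res), lwd = 2)
abline(h = 0, lty = 2)

# Numeric counterpart of the visual impression: a quadratic term in the
# fitted values picks up apparent curvature in the pooled residuals.
curv <- unname(coef(lm(res ~ fit_vals + I(fit_vals^2)))[3])
curv
```

In this setup the pooled smoother bends upward at both ends because low values of x are dominated by one arm and high values by the other, which is easy to misread as a non-linear effect of x.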
Based on the recommendation in Rosenbaum and Rubin (1983), we will create this same figure, stratified by treatment assignment. We can do this either by selecting a different color for each treatment arm and overlaying the plots (Figure 2) or by faceting the plots, creating a separate one for each treatment group (Figure 3).
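Both stratified displays can be sketched in base R graphics. As before, this is an illustrative example under assumptions: the simulation is hypothetical rather than the one behind Figures 2 and 3, the colors and plotting symbols are arbitrary choices, and lowess is used in place of loess.

```r
# Hypothetical simulation with a treatment-by-x interaction.
set.seed(2)
n   <- 1000
x   <- rnorm(n)
trt <- rbinom(n, 1, plogis(x))
y   <- x + 2 * trt * x + rnorm(n)

fit_mis <- lm(y ~ trt + x)  # misspecified: omits the interaction
d <- data.frame(fit_vals = fitted(fit_mis), res = resid(fit_mis), trt = trt)
cols <- c("0" = "steelblue", "1" = "firebrick")

# Option 1: overlay both arms, distinguished by color and symbol.
plot(d$fit_vals, d$res, col = cols[as.character(d$trt)],
     pch = ifelse(d$trt == 1, 17, 1),
     xlab = "Fitted values", ylab = "Residuals")
for (g in c(0, 1)) {
  arm <- d[d$trt == g, ]
  lines(lowess(arm$fit_vals, arm$res), col = cols[as.character(g)], lwd = 2)
}
legend("top", legend = c("Control", "Treated"), col = cols, pch = c(1, 17))

# Option 2: one panel (facet) per treatment arm.
op <- par(mfrow = c(1, 2))
for (g in c(0, 1)) {
  arm <- d[d$trt == g, ]
  plot(arm$fit_vals, arm$res, col = cols[as.character(g)],
       main = if (g == 1) "Treated" else "Control",
       xlab = "Fitted values", ylab = "Residuals")
  lines(lowess(arm$fit_vals, arm$res), lwd = 2)
  abline(h = 0, lty = 2)
}
par(op)

# Within-arm correlation between residuals and fitted values.
arm_cor <- sapply(split(d, d$trt), function(arm) cor(arm$fit_vals, arm$res))
arm_cor
```

Opposite-signed trends in the two arms are the signature of an omitted interaction; a shared curved trend in both arms would instead point toward a missing non-linear term.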
In these figures, the lack of fit is much more obvious and would likely lead the researcher to re-fit the model accounting for the heterogeneous relationship. Additionally, the estimated causal effect from the misspecified model (β̂1) is 0, a biased estimate given that the “true” causal effect depends on the value of X. We could now fit the following correct model and recreate the stratified residuals versus fits plot (Figure 4).
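A sketch of re-fitting with an interaction term, again on a hypothetical simulation whose coefficients are assumptions rather than the values used for Figure 4. In this assumed setup the effect of treatment at a given x is 2x, so no single number summarizes it.

```r
# Hypothetical simulation with a treatment-by-x interaction.
set.seed(2)
n   <- 1000
x   <- rnorm(n)
trt <- rbinom(n, 1, plogis(x))
y   <- x + 2 * trt * x + rnorm(n)

fit_mis <- lm(y ~ trt + x)  # main effects only
fit_int <- lm(y ~ trt * x)  # adds the treatment-by-x interaction

# A single treatment coefficient versus an effect that varies with x.
coef(fit_mis)["trt"]
x_grid      <- c(-1, 0, 1)
effect_at_x <- coef(fit_int)[["trt"]] + coef(fit_int)[["trt:x"]] * x_grid
effect_at_x

# After re-fitting, the within-arm trend in the residuals disappears.
res_int <- resid(fit_int)
arm_slopes <- sapply(c(control = 0, treated = 1), function(g) {
  unname(coef(lm(res_int[trt == g] ~ x[trt == g]))[2])
})
arm_slopes
```

Because the interaction model fits a separate line in each arm, the within-arm residual slopes are zero up to floating-point error, so the stratified plot for this model should show flat smoothers.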
Summary
When estimating causal effects using covariate adjustment, a sensible and straightforward diagnostic plot to use is the residuals versus fits plot stratified by treatment assignment. Ideally, these plots would be generated during the exploratory phase of the modeling process, so that by the confirmatory phase the correct relationship between the treatment, confounders, and outcome would be well understood, allowing the correct model to be pre-specified. The use of this graphical tool could also be incorporated into routine diagnostics that are used when assessing the performance of propensity score models in statistical inference.
Code
Below is the R code used to simulate the scenarios described in the paper as well as create the figures.
Wake Forest University
Winston-Salem, NC, 27109, USA
mcgowald@wfu.edu
Wake Forest University School of Medicine
Winston-Salem, NC, 27109, USA
rdagosti@wakehealth.edu