Evaluating uses of deep learning methods for causal inference

Whata, Albert

URI: https://doi.org/10.1109/ACCESS.2021.3140189
http://hdl.handle.net/20.500.12821/518

Date: 2022

Abstract:

Logistic regression (LR) is a popular method for estimating propensity scores, which are used to estimate causal effects in observational studies. We examine the use of deep learning models such as the deep neural network (DNN), PropensityNet (PN), convolutional neural network (CNN), and convolutional neural network-long short-term memory network (CNN-LSTM) to estimate propensity scores and evaluate causal inference. We conducted studies using simulated data with different sample sizes (N = 500, N = 1000, N = 2000), 15 covariates, a continuous outcome, and a binary exposure. These data were used in seven scenarios that differed in the degree of nonlinearity and nonadditivity in the associations between the exposure and the covariates. Estimation of propensity scores was treated as a classification task, and performance metrics that included classification accuracy, area under the receiver operating characteristic curve (AUCROC), covariate balance, standard error, absolute bias, and 95% confidence interval coverage were evaluated for each model. Our simulation results show that deep learning models (CNN, DNN, and CNN-LSTM) outperformed LR in the estimation of the propensity score. CNN and CNN-LSTM achieved good results for covariate balance, classification accuracy, AUCROC, and Cohen’s Kappa. Although LR provided substantially better bias reduction, it performed worse than the deep learning models on classification accuracy, AUCROC, Cohen’s Kappa, and 95% confidence interval coverage. The results suggest that deep learning methods, especially CNN, may be useful for estimating propensity scores that are used to estimate causal effects.
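
As a rough illustration of the LR baseline described in the abstract, the following Python sketch simulates data with 15 covariates, a binary exposure, and a continuous outcome, fits a logistic regression to estimate propensity scores, and uses them in an inverse-probability-weighted effect estimate. It is not the authors' code: the data-generating coefficients, the true effect of 2.0, the gradient-descent settings, the clipping bounds, and all function names are assumptions made for this example, and the weighting estimator is one common way to use propensity scores, which the abstract does not specify.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, n_cov=15, true_effect=2.0, seed=0):
    """Simulate covariates X, binary exposure A, continuous outcome Y.

    Hypothetical linear, additive scenario: only the first five
    covariates affect both exposure and outcome (the confounders).
    """
    rng = random.Random(seed)
    alpha = [0.3 if j < 5 else 0.0 for j in range(n_cov)]  # exposure model
    beta = [0.5 if j < 5 else 0.0 for j in range(n_cov)]   # outcome model
    X, A, Y = [], [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_cov)]
        p = sigmoid(sum(c * xj for c, xj in zip(alpha, x)))
        a = 1 if rng.random() < p else 0
        y = true_effect * a + sum(c * xj for c, xj in zip(beta, x))
        y += rng.gauss(0.0, 1.0)
        X.append(x)
        A.append(a)
        Y.append(y)
    return X, A, Y


def fit_logistic(X, A, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent (assumed settings)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - a
                for x, a in zip(X, A)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * x[j] for e, x in zip(errs, X)) / n for j in range(d)]
        b -= lr * grad_b
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
    return w, b


def propensity_scores(X, w, b):
    """Estimated probability of exposure given the covariates."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]


def ipw_ate(A, Y, e, lo=0.01, hi=0.99):
    """Normalized inverse-probability-weighted average treatment effect.

    Scores are clipped to [lo, hi] to avoid extreme weights (an assumption).
    """
    e = [min(max(p, lo), hi) for p in e]
    w1 = [a / p for a, p in zip(A, e)]
    w0 = [(1 - a) / (1 - p) for a, p in zip(A, e)]
    mean1 = sum(w * y for w, y in zip(w1, Y)) / sum(w1)
    mean0 = sum(w * y for w, y in zip(w0, Y)) / sum(w0)
    return mean1 - mean0
```

A run would call `simulate(1000)`, pass the result through `fit_logistic` and `propensity_scores`, and compare `ipw_ate(A, Y, e)` with the simulated true effect. Replacing `fit_logistic` with a neural network classifier is where the deep learning models in the study would slot in; the nonlinear and nonadditive scenarios would need extra terms in the exposure model.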
