Abstract:
Logistic regression (LR) is a popular method for estimating the propensity scores used to estimate causal effects in
observational studies. We examine the use of deep learning models, namely the deep neural network (DNN), PropensityNet (PN), convolutional neural network (CNN), and convolutional neural network-long short-term memory network (CNN-LSTM), to estimate propensity scores and to evaluate the resulting causal effect estimates. We conducted simulation studies with three sample sizes (N = 500, N = 1000, N = 2000), 15 covariates, a continuous outcome, and a binary exposure. These data were used in seven scenarios that differed in the degree of nonlinearity and nonadditivity in the associations between the exposure and the covariates. Estimation of propensity scores was treated as a classification task, and each model was evaluated on classification accuracy, area under the receiver operating characteristic curve (AUCROC), covariate balance, standard error, absolute bias, and 95% confidence interval coverage. Our simulation results show that deep learning models (CNN, DNN, and
CNN-LSTM) outperformed LR in the estimation of the propensity score. CNN and CNN-LSTM achieved good results for covariate balance, classification accuracy, AUCROC, and Cohen’s Kappa. Although LR provided substantially better bias reduction, it performed worse than the deep learning models on classification accuracy, AUCROC, Cohen’s Kappa, and 95% confidence interval coverage. The results suggest that deep learning methods, especially CNN, may be useful for estimating the propensity scores used to estimate causal effects.