Description
In observational studies, matching methods are often applied to produce treatment and control groups that are balanced on all background variables, so that unbiased inference can be drawn. The propensity score has been a key component in this research area. However, the propensity-score-based matching methods in the literature have several limitations, such as difficulties in handling missing data, categorical variables with more than two levels, and nonlinear relationships. Random forest, which averages the outcomes of many decision trees, is straightforward to use and capable of handling these issues. More importantly, the precision afforded by random forest may provide a better and less model-dependent estimate of the propensity score. In addition, the proximity matrix, a by-product of the random forest, may naturally serve as a distance measure between observations that can be used in matching. Our results show that the proposed methods can produce well-balanced treatment and control groups. An illustration shows that the methods can also deal effectively with missing data in covariates.
In randomized clinical trials, because of randomization, the confounding effects of covariates are seldom an issue. However, randomized clinical trials are designed to demonstrate the average treatment effect at the group level, and usually only a subgroup of patients truly shows the treatment effect. Therefore, it can be very misleading to apply the average treatment effect to individual patients with different backgrounds in clinical practice. A novel method, random forest of interaction trees (IT), is proposed to predict the individual treatment effect (ITE) based on data from randomized clinical trials. The interaction tree procedure automatically produces a number of objectively defined subgroups, in some of which the treatment effect is prominent while in others the treatment has a negligible or even negative effect.
The basic idea is to obtain a predicted treatment effect for any individual via the random IT forest, given his or her covariate information. Besides predicting the ITE, two add-on methods are developed based on advanced features of random forest: a "grouping" method that forms better groups to guide stratified medicine and treatment, and "variable importance rankings" that identify important treatment effect moderators or modifiers.
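As a rough illustration of the proximity-based matching idea described for observational studies, the sketch below computes a proximity matrix from per-tree leaf assignments and pairs each treated observation with its closest unused control. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names `proximity_matrix` and `greedy_match`, the toy leaf assignments, and the greedy one-to-one matching rule are all hypothetical. Proximity is taken here as the fraction of trees in which two observations fall in the same terminal node, the usual definition for random forests; in practice the leaf assignments would come from a fitted forest.

```python
from itertools import combinations


def proximity_matrix(leaves):
    """Return the random-forest proximity matrix.

    leaves[i][t] is the terminal-node id of observation i in tree t.
    Proximity of i and j = fraction of trees where they share a leaf.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        prox[i][i] = 1.0  # an observation always shares a leaf with itself
    for i, j in combinations(range(n), 2):
        shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
        prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def greedy_match(prox, treated):
    """Pair each treated unit with the unused control of highest proximity.

    Assumption: greedy 1:1 matching without replacement, processed in
    index order; ties go to the control with the smallest index.
    """
    treated_idx = [i for i, t in enumerate(treated) if t]
    available = {i for i, t in enumerate(treated) if not t}
    pairs = []
    for i in treated_idx:
        if not available:
            break  # no controls left to match
        best = max(sorted(available), key=lambda j: prox[i][j])
        pairs.append((i, best))
        available.remove(best)
    return pairs


# Hypothetical leaf assignments: 5 observations across 4 trees.
leaves = [
    [1, 2, 1, 3],  # observation 0 (treated)
    [1, 2, 2, 3],  # observation 1 (control)
    [4, 5, 6, 7],  # observation 2 (control)
    [4, 5, 6, 3],  # observation 3 (treated)
    [1, 5, 1, 7],  # observation 4 (control)
]
treated = [True, False, False, True, False]

prox = proximity_matrix(leaves)
pairs = greedy_match(prox, treated)
print(pairs)  # [(0, 1), (3, 2)]
```

In this toy example, observation 0 shares a leaf with observation 1 in three of four trees (proximity 0.75), so the two are paired; observation 3 is paired with observation 2 for the same reason. Observation 4 remains unmatched.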