Higher education institutions are often interested in examining performance discrepancies among specific subgroups, such as students from underrepresented minority and first-generation backgrounds. Growth in educational technology and computational power has spurred researchers’ interest in using data mining tools to help identify these at-risk groups. Institutions can then make data-driven decisions to promote student access, increase retention and graduation rates, and implement appropriate intervention programs. We introduce an ensemble of Latent Class Analysis (LCA) and random forests that recursively partitions observations into groups to help identify at-risk students. The procedure is a form of model-based hierarchical clustering that relies on latent class trees to optimally identify subgroups. For observational studies, causal inference can also be embedded within the latent class forest. We apply the algorithm to data from three semesters of Psychology 101 at San Diego State University to identify at-risk groups of students before they enroll in the class. A post hoc analysis is then conducted to identify students who benefit from Supplemental Instruction (SI), a peer-led academic assistance program. In doing so, we classify students by their demographic and academic characteristics and identify traits that could be important factors in their academic success.