While Twitter is a rich source of temporally dynamic data, the nature of Twitter language and Twitter use poses problems for machine learning. More specifically, the constant shifting of language, the wavering attention to trending topics, and the dominance of select opinions make modeling real language on Twitter particularly difficult. Although these characteristics can make successful machine learning an arduous exercise, the underlying problems are common to many classification experiments. In fact, extreme variance, small datasets, and large class imbalance plague a majority of supervised learning tasks, including the present study. Here, we explore the use of supervised learning methods for the automatic classification of tweets concerning California Senate Bill 277 (SB277), the bill prohibiting schools from admitting children who have not been vaccinated. Tweets containing ‘SB277’ were collected for three events: when the bill was signed into law, when the attempt to overturn the bill failed, and when the law went into effect. Various classification algorithms were tested with the goal of classifying tweets into those that supported SB277 and those that opposed it. Results from experimentation and further investigation demonstrated that variation, low sample size, and class imbalance are in fact the bane of successful classification for the SB277 dataset, with the extreme class imbalance being the most damaging. We conclude that semi-supervised and active learning approaches are the next line of experimentation for future research.