Hashtags are a feature of tweets, written in the form #thisisahashtag, that are sometimes used to identify a tweet's topic or its author's sentiment. Concatenating words in this way is a logical response to Twitter's 140-character limit per tweet, but it poses problems for machine learning techniques that attempt to derive a tweet's topic or sentiment: because the text of a tweet is so short, one naturally wishes to exploit it to the fullest extent possible, yet one well-known method for segmenting strings into words does not necessarily work on hashtags. This potential underperformance stems from the method's reliance on a static training corpus when generating n-gram probabilities; hashtags, which are ephemeral and respond to current events and trends, instead call for a dynamically updated training corpus. This thesis proposes a modification of that method which retains its probabilistic power while allowing the slang and other neologisms typical of hashtags to bubble up through the training data quickly enough that terms which would otherwise go unrecognized are segmented properly. I begin with a review of tweet structure and of the algorithm to be modified, and then discuss certain operational and methodological issues inherent in working with Twitter data. I then describe my algorithm and the techniques it uses to avoid those issues. I conclude with a discussion of the new algorithm's efficacy and of ways in which it might be improved.
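For concreteness, the sketch below illustrates the kind of segmentation method the abstract alludes to: maximum-likelihood segmentation by dynamic programming over word probabilities estimated from a training corpus, here simplified to unigram (1-gram) probabilities, together with a crude dynamic-update step that folds newly observed tokens back into the corpus. The Segmenter class, the length-penalty heuristic for unseen words, and the update rule are illustrative assumptions, not the algorithm this thesis modifies or proposes.

```python
import math
from collections import Counter

class Segmenter:
    """Unigram maximum-likelihood word segmentation over a mutable corpus."""

    def __init__(self, corpus_tokens, max_word_len=20):
        self.counts = Counter(corpus_tokens)   # word -> frequency
        self.total = sum(self.counts.values())
        self.max_word_len = max_word_len       # bound on candidate word length

    def log_prob(self, word):
        if word in self.counts:
            return math.log(self.counts[word] / self.total)
        # Unseen words get a small probability that shrinks with length,
        # so the segmenter prefers readings built from known words.
        return math.log(10.0 / (self.total * 10.0 ** len(word)))

    def segment(self, text):
        # best[i] = highest log-probability of any segmentation of text[:i];
        # back[i] = start index of the final word in that segmentation.
        n = len(text)
        best = [0.0] + [float("-inf")] * n
        back = [0] * (n + 1)
        for i in range(1, n + 1):
            for j in range(max(0, i - self.max_word_len), i):
                score = best[j] + self.log_prob(text[j:i])
                if score > best[i]:
                    best[i], back[i] = score, j
        words, i = [], n
        while i > 0:                 # recover the words from the backpointers
            words.append(text[back[i]:i])
            i = back[i]
        return list(reversed(words))

    def update(self, tokens):
        # The dynamic-corpus step: fold tokens from a live stream into the
        # counts so that neologisms eventually become segmentable.
        self.counts.update(tokens)
        self.total += len(tokens)

seg = Segmenter("the cat sat on the mat".split())
print(seg.segment("thecatsat"))    # -> ['the', 'cat', 'sat']
seg.update(["selfie"] * 50)        # a neologism enters the training data
print(seg.segment("catselfie"))    # -> ['cat', 'selfie']
```

Before the update step, the same segmenter fragments "catselfie" incorrectly, since "selfie" never appears in the static corpus; folding new tokens into the counts is what lets such terms bubble up, which is the behavior the proposed modification is designed to produce at a controlled rate.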