Description
Web texts have undergone an interesting transformation in the past decade, as people increasingly interact with one another through social media. Depending on the site and the intended audience, users deploy a vocabulary and signal their affiliation with other groups in ways that reinforce or transgress typical understandings of social identity. Because of these subtle shifts in the way users perform group membership, existing methods of user classification may fail to capture the peculiarities of human identity. This project explores methods for user classification on a gay dating site. User classification refers to the task of sorting users into categories based on the language found in their natural-language texts. Social networking and dating sites pose particular challenges for this task: the language encountered on these sites is conversational, colloquial, unedited, and informal. Further, social identity is often articulated in ways that seem contradictory on the surface. This project looks at one gay networking site where users choose a scene (leather, jock, or trendy, to name a few) when constructing their profiles. A machine-learning algorithm was applied to the self-descriptions (performances) in these profiles to automatically classify each user's scene affiliation. The algorithm returned better-than-chance results both for scene-by-scene comparisons and when classifying across all scenes. Because the meaning behind these kinds of human-produced texts is bound up in a symbolic space that is not easily tractable, this research looks at how social identity theory can be incorporated to improve the accuracy of future user classification tasks of this kind.