Most social networks implement some form of groups or communities that users can
voluntarily join. Twitter has no such functionality, and therefore no notion of a formal group or community.
We propose a method for identifying communities and assigning semantic meaning to the discussion
topics of the resulting communities. Using this method and a sample of roughly a month's worth of
Tweets from Twitter's "gardenhose" feed, we demonstrate the discovery of meaningful user communities on
We examine Twitter data streaming in real time and treat it as a sensor. Twitter is a social network that
pioneered microblogging with messages short enough to fit in an SMS. Individuals, businesses, media outlets, and even
devices all over the world post status updates from a variety of clients, browsers, smartphones, and PDAs.
An aggregate trend in such statuses often reflects an important development in the world, as
demonstrated by the elections in Iran and Moldova and the anniversary of the Tiananmen Square protests in China.
We propose using Twitter as a sensor, tracking individuals and communities of interest, and characterizing
individual roles and the dynamics of their communications. We developed a novel algorithm for community identification
in social networks that is based on direct communication, as opposed to linking. We show ways to find communities
of interest and then browse their neighborhoods by either the similarity or the diversity of the individuals and groups
adjacent to the one of interest. We use frequent collocations and statistically improbable phrases to summarize the
focus of each community, giving a quick overview of its main topics.
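
The two steps described above, grouping users by direct communication and summarizing each group by its characteristic phrases, can be illustrated with a minimal sketch. It is not the algorithm developed in this work, whose details are not given here. It assumes that "direct communication" means reciprocated @-mentions, that connected components of the resulting graph can stand in for communities, and that a frequency-weighted log ratio of bigram rates (inside the community versus all tweets) can stand in for statistically improbable phrases. All function names and the sample tweets are hypothetical.

```python
import re
from collections import Counter, defaultdict
from math import log

MENTION = re.compile(r"@(\w+)")
WORD = re.compile(r"[a-z']+")

def mention_graph(tweets):
    """Undirected graph that keeps only reciprocated @-mentions (an assumption)."""
    directed = set()
    for author, text in tweets:
        for target in MENTION.findall(text):
            if target.lower() != author.lower():
                directed.add((author.lower(), target.lower()))
    graph = defaultdict(set)
    for a, b in directed:
        if (b, a) in directed:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def communities(graph):
    """Connected components, used here as a simple stand-in for communities."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        result.append(members)
    return result

def bigrams(text):
    """Adjacent word pairs, with @-mentions removed and text lowercased."""
    words = WORD.findall(MENTION.sub(" ", text.lower()))
    return list(zip(words, words[1:]))

def distinctive_bigrams(tweets, members, top=3):
    """Rank a community's bigrams by count times log ratio of inside vs. overall rate."""
    inside, overall = Counter(), Counter()
    for author, text in tweets:
        grams = bigrams(text)
        overall.update(grams)
        if author.lower() in members:
            inside.update(grams)
    n_in = sum(inside.values()) or 1
    n_all = sum(overall.values()) or 1

    def score(gram):
        return inside[gram] * log((inside[gram] / n_in) / (overall[gram] / n_all))

    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(inside, key=lambda gram: (-score(gram), gram))[:top]

# Hypothetical sample data: (author, text) pairs.
tweets = [
    ("alice", "@bob election results are coming in"),
    ("bob", "@alice election results look close"),
    ("carol", "@dave new phone camera is great"),
    ("dave", "@carol phone camera beats my old one"),
    ("erin", "@alice hello"),
]

groups = communities(mention_graph(tweets))
for members in groups:
    print(sorted(members), distinctive_bigrams(tweets, members, top=1))
```

Requiring reciprocity is one way to separate conversation from one-way broadcast. A real system would likely weight edges by message volume and use a stronger community detection method than connected components.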
Our methods provide insight into the largest social sensor network in the world and constitute a platform for