Mining Twitter Data with Ruby – Visualizing User Mentions

In my previous post on mining twitter data with ruby, we laid our foundation for collecting and analyzing Twitter updates. We stored these updates in MongoDB and used map-reduce to implement a simple counting of tweets. In this post, we’ll show relationships between users based on mentions inside the tweet. Fortunately for us, there is no need to parse each tweet just to get a list of users mentioned in the tweet because Twitter provides the “entities.mentions” field that contains what we need. After we collected the “who mentions who”, we then construct a directed graph to represent these relationships and convert them to an image so we can actually see it.

First, we start with the aggregation of mentions per user. We will use the same code base as last time. So if this is your first time, I recommend reading my previous related post or you can follow the changes in Github. Note to self: Convert this to an actual gem in the next post.

# user_mention.rb

module UserMention

  def mentions_by_user
    map_command = %q{
      function() {
        var mentions = this.entities.user_mentions,
            users = [];

        if (mentions.length > 0) {
          for(i in mentions) {
            users.push(mentions[i].id_str)
          }

          emit(this.user.id_str, { mentions: users });
        }
      }
    }

    reduce_command = %q{
      function(key, values) {
        var users = [];

        for(i in values) {
          users = users.concat(values[i].mentions);
        }

        return { mentions: users };
      }
    }

    options = {:out => {:inline => 1}, :raw => true, :limit => 50 }
    statuses.map_reduce(map_command, reduce_command, options)
  end

end

We then again use map-reduce in MongoDB to implement our aggregation. Of course, this sort of thing can be done in Ruby directly but it would be way more efficient if we do it in MongoDB especially if you have a big collection to process. Note that we limit the number of documents to process because we don’t want our graph to look unrecognizable when we display it.

Now that we have our aggregation working, we construct a directed graph of user mentions using the rgl library.

require "bundler"
Bundler.require

require File.expand_path("../tweetminer", __FILE__)

settings = YAML.load_file File.expand_path("../mongo.yml", __FILE__)
miner = TweetMiner.new(settings)

require "rgl/adjacency"
require "rgl/dot"

graph = RGL::DirectedAdjacencyGraph.new

miner.mentions_by_user.fetch("results").each do |user|
  user.fetch("value").fetch("mentions").each do |mention|
    graph.add_edge(user.fetch("_id"), mention)
  end
end

# creates graph.dot, graph.png
graph.write_to_graphic_file

Once you have the user-mentions relationships in a graph, you can do interesting things like who is connected to somebody and the degrees of separation. But for now, we are just interested in showing who mentioned whom. Our sample program saves the graph to the file graph.dot (using the DOT language) and PNG output. But the default PNG output is not laid out nicely. Instead, we will use the “neato” program to convert our graph.dot into a nice looking PNG file.

$ neato -Tpng graph.dot -o mentions.png

When you view “mentions.png”, you should see something similar as the one below. The labels are user IDs and the arrows show the mentioned users.

It would be cool to modify our program to use the users’ avatars and also make it interactive. Or, use Twitter’s streaming API and create an auto-update graph. I haven’t done any research yet but I’m sure there is some Javascript library out there that can help us display graph relationships.

Tagged , , ,

4 thoughts on “Mining Twitter Data with Ruby – Visualizing User Mentions

  1. hgs says:

    You can just tweak the dot code for the images, see

    http://www.karakas-online.de/forum/viewtopic.php?t=2647

    Interactivity and updates is another matter. SVG and cmapx will allow you to trigger URL responses from clicks, just from the dot code with the URL attribute. Hope this helps.

  2. [...] Mining Twitter Data with Ruby – Visualizing User Mentions by Greg Moreno. [...]

  3. operations. When a new named object is created, the HDF5 library executes the link operation in the background immediately after creating the object (i.e., a new object is added as a member of the group in which it is created without further user intervention).

  4. kazino39 says:

    интернет казино на реальные деньги без вложений украина [url=http://master19.kolombus-slot.ru/infa170.html]игровые автоматы оплата visa яндекс[/url] интернет казино эстонии или игровые аппараты пополнение мобильного nokia [url=http://master19.kolombus-slot.ru]Игровые Автоматы 777[/url] интернет покер на деньги.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: