Tag Archives: ruby

Mining Twitter Data with Ruby – Visualizing User Mentions

In my previous post on mining Twitter data with Ruby, we laid the foundation for collecting and analyzing Twitter updates. We stored these updates in MongoDB and used map-reduce to implement a simple count of tweets. In this post, we’ll show relationships between users based on mentions inside the tweet. Fortunately for us, there is no need to parse each tweet just to get a list of the users it mentions because Twitter provides the “entities.user_mentions” field that contains what we need. After we collect the “who mentions whom” data, we construct a directed graph to represent these relationships and convert it to an image so we can actually see it.

First, we start with the aggregation of mentions per user. We will use the same code base as last time. So if this is your first time, I recommend reading my previous related post, or you can follow the changes on GitHub. Note to self: Convert this to an actual gem in the next post.

# user_mention.rb

module UserMention

  def mentions_by_user
    map_command = %q{
      function() {
        var mentions = this.entities.user_mentions,
            users = [];

        if (mentions.length > 0) {
          for(i in mentions) {
            users.push(mentions[i].id_str)
          }

          emit(this.user.id_str, { mentions: users });
        }
      }
    }

    reduce_command = %q{
      function(key, values) {
        var users = [];

        for(i in values) {
          users = users.concat(values[i].mentions);
        }

        return { mentions: users };
      }
    }

    options = {:out => {:inline => 1}, :raw => true, :limit => 50 }
    statuses.map_reduce(map_command, reduce_command, options)
  end

end

We again use map-reduce in MongoDB to implement our aggregation. Of course, this sort of thing can be done in Ruby directly, but doing it in MongoDB is more efficient, especially if you have a big collection to process. Note that we limit the number of documents to process (the :limit option) because we don’t want our graph to look unrecognizable when we display it.
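For comparison, here is a minimal sketch of the same aggregation done directly in Ruby. It assumes the statuses have already been loaded into memory as an array of hashes shaped like the Twitter documents; the method name and the sample data are made up for illustration.

```ruby
# Pure-Ruby counterpart of the map-reduce above, suitable only for small
# collections. `statuses` is a hypothetical in-memory array of tweet hashes.
def mentions_by_user_in_ruby(statuses)
  result = Hash.new { |hash, key| hash[key] = [] }

  statuses.each do |status|
    mentioned = status["entities"]["user_mentions"].map { |m| m["id_str"] }
    next if mentioned.empty? # same guard as the JavaScript map function

    result[status["user"]["id_str"]].concat(mentioned)
  end

  result
end

# Made-up sample data: user 1 mentions 2 and 3, user 2 mentions 1,
# user 3 mentions nobody.
SAMPLE_STATUSES = [
  { "user" => { "id_str" => "1" }, "entities" => { "user_mentions" => [{ "id_str" => "2" }] } },
  { "user" => { "id_str" => "2" }, "entities" => { "user_mentions" => [{ "id_str" => "1" }] } },
  { "user" => { "id_str" => "1" }, "entities" => { "user_mentions" => [{ "id_str" => "3" }] } },
  { "user" => { "id_str" => "3" }, "entities" => { "user_mentions" => [] } }
]

p mentions_by_user_in_ruby(SAMPLE_STATUSES)
```

This works, but every document has to travel from MongoDB to the Ruby process first, which is the cost the map-reduce version avoids.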

Now that we have our aggregation working, we construct a directed graph of user mentions using the rgl library.

require "bundler"
Bundler.require

require File.expand_path("../tweetminer", __FILE__)

settings = YAML.load_file File.expand_path("../mongo.yml", __FILE__)
miner = TweetMiner.new(settings)

require "rgl/adjacency"
require "rgl/dot"

graph = RGL::DirectedAdjacencyGraph.new

miner.mentions_by_user.fetch("results").each do |user|
  user.fetch("value").fetch("mentions").each do |mention|
    graph.add_edge(user.fetch("_id"), mention)
  end
end

# creates graph.dot, graph.png
graph.write_to_graphic_file

Once you have the user-mention relationships in a graph, you can do interesting things like finding who is connected to whom and their degrees of separation. But for now, we are just interested in showing who mentioned whom. Our sample program saves the graph to the file graph.dot (using the DOT language) and to a PNG file. But the default PNG output is not laid out nicely. Instead, we will use the “neato” program to convert our graph.dot into a nice-looking PNG file.

$ neato -Tpng graph.dot -o mentions.png

When you view “mentions.png”, you should see something similar to the image below. The labels are user IDs and the arrows point to the mentioned users.

It would be cool to modify our program to use the users’ avatars and also make it interactive. Or use Twitter’s streaming API and create an auto-updating graph. I haven’t done any research yet, but I’m sure there is some JavaScript library out there that can help us display graph relationships.
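As a taste of the “degrees of separation” idea mentioned earlier, here is a small sketch using a breadth-first search. To keep it runnable without any gems, it works on a plain hash of user ID to mentioned user IDs instead of the rgl graph; the method name and the sample edges are assumptions for illustration.

```ruby
# Breadth-first search over a plain adjacency hash (user id => mentioned ids).
# Returns the number of "mention hops" from one user to another, or nil
# when there is no path.
def degrees_of_separation(edges, from, to)
  return 0 if from == to

  distance = { from => 0 }
  queue    = [from]

  until queue.empty?
    current = queue.shift

    edges.fetch(current, []).each do |neighbor|
      next if distance.key?(neighbor) # already visited

      distance[neighbor] = distance[current] + 1
      return distance[neighbor] if neighbor == to

      queue << neighbor
    end
  end

  nil
end

# Made-up mentions: a mentions b, b mentions c.
SAMPLE_EDGES = { "a" => ["b"], "b" => ["c"], "c" => [] }

p degrees_of_separation(SAMPLE_EDGES, "a", "c")
p degrees_of_separation(SAMPLE_EDGES, "c", "a")
```

Because the graph is directed, the distance from a to c is not necessarily the same as from c to a.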


Mining Twitter data with Ruby, MongoDB and Map-Reduce

When is the best time to tweet? If you care about reaching a lot of users, the best time probably is when your followers are also tweeting. In this exercise, we will try to figure out the day and time users are the most active. Since there is no way for us to do this for all users in the twitterverse, we will only use the users we follow as our sample.

What do we need

  • mongodb
  • tweetstream gem
  • awesome_print gem for awesome printing of Ruby objects
  • oauth credentials

Visit http://dev.twitter.com to get your OAuth credentials. You just need to log in, create an app, and the OAuth credentials you need will be there. Copy the OAuth settings to the twitter.yml file because that is where our sample code will be looking.

Collect status updates

We use the TweetStream gem to access the Twitter Streaming APIs, which allow our program to receive updates as they occur without the need to regularly poll Twitter.

# Collects user tweets and saves them to a mongodb
require "bundler"
require File.dirname(__FILE__) + "/tweetminer"

Bundler.require

# We use the TweetStream gem to access Twitter's Streaming API.
# https://github.com/intridea/tweetstream

TweetStream.configure do |config|
  settings = YAML.load_file File.dirname(__FILE__) + '/twitter.yml'

  config.consumer_key       = settings['consumer_key']
  config.consumer_secret    = settings['consumer_secret']
  config.oauth_token        = settings['oauth_token']
  config.oauth_token_secret = settings['oauth_token_secret']
end

settings = YAML.load_file File.dirname(__FILE__) + '/mongo.yml'
miner = TweetMiner.new(settings)

stream = TweetStream::Client.new

stream.on_error do |msg|
  puts msg
end

stream.on_timeline_status do |status|
  miner.insert_status status
  print "."
end

# Do not forget this to trigger the collection of tweets
stream.userstream

The code above handles the collection of status updates. The actual saving to MongoDB is handled by the TweetMiner class.

# tweetminer.rb

require "mongo"

class TweetMiner
  attr_writer :db_connector
  attr_reader :options

  def initialize(options)
    @options = options
  end

  def db
    @db ||= connect_to_db
  end

  def insert_status(status)
    statuses.insert status
  end

  def statuses
    @statuses ||= db["statuses"]
  end

  private

  def connect_to_db
    db_connector.call(options["host"], options["port"]).db(options["database"])
  end

  def db_connector
    @db_connector ||= Mongo::Connection.public_method :new
  end

end

We will be modifying our code along the way, and if you want to follow each step, you can view this commit on GitHub.

Depending on how active the people you follow are, it may take a while before you get a good sample of tweets. Actually, it would be interesting if you could run the collection for several days.

Assuming we have several days’ worth of data, let us proceed with the “data mining” part. Data mining would not be fun without a mention of map reduce – a strategy for data mining popularized by Google. The key innovation with map reduce is its ability to take a query over a data set, divide it, and run it in parallel over many nodes. “Counting”, for example, is a task that fits nicely with the map reduce framework. Imagine you and your friends are counting the number of people in a football stadium. First, you divide yourselves into 2 groups – group A counts the people in the lower deck while group B does the upper deck. Group A in turn divides the task into north, south, and endzones. When group A is done counting, they tally all their results. After group B is done, they combine the results with group A for which the total gives us the number of people in the stadium. Dividing your friends is the “map” part while the tallying of results is the “reduce” part.
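The stadium analogy can be sketched in a few lines of plain Ruby. This is only an illustration of the two phases, not how MongoDB runs them internally; the data and method names are made up.

```ruby
# Each record is one spectator tagged with the deck they sit in (made-up data).
SPECTATORS = [
  { name: "Ann",  deck: "lower" },
  { name: "Bob",  deck: "upper" },
  { name: "Cara", deck: "lower" },
  { name: "Dan",  deck: "lower" },
  { name: "Eve",  deck: "upper" }
]

# "Map": emit a [key, value] pair for every record.
def map_phase(records)
  records.map { |record| [record[:deck], 1] }
end

# "Reduce": combine all the values that share a key into one total.
def reduce_phase(pairs)
  pairs.each_with_object(Hash.new(0)) do |(key, value), totals|
    totals[key] += value
  end
end

per_deck = reduce_phase(map_phase(SPECTATORS))
p per_deck
p per_deck.values.inject(0, :+) # everyone in the stadium
```

The map step looks at one record at a time, so it can be split across many workers. The reduce step only needs the emitted pairs, not the original records.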

Updates per user

First, let us do a simple task. Let us count the number of updates per user. We introduce a new module ‘StatusCounter’ which we include in our TweetMiner class. We also add a new program to execute the map-reduce task.

# counter.rb

require "bundler"
Bundler.require

require File.dirname(__FILE__) + "/tweetminer"

settings = YAML.load_file File.dirname(__FILE__) + '/mongo.yml'
miner = TweetMiner.new(settings)

results = miner.status_count_by_user 
ap results

Map-reduce commands in MongoDB are written in JavaScript. When embedding JavaScript in Ruby, be conscious of how Ruby treats the string: to MongoDB it is just a bunch of characters, but Ruby processes it first. For the example below, we use a here document, which behaves like a double-quoted string and therefore interprets backslashes and interpolation. In our later examples, we switch to single quotes when we use regular expressions within our JavaScript so the backslashes reach MongoDB intact.
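To see why the quoting style matters, here is a small sketch comparing three ways of holding the same JavaScript snippet in Ruby. The method names are made up for illustration.

```ruby
# Heredoc with a bare terminator: behaves like a double-quoted string,
# so Ruby consumes the backslash in "\d" before MongoDB ever sees it.
def heredoc_js
  <<-EOS
    var re = /\d{2}/;
  EOS
end

# Single-quoted string: the backslash is kept as-is.
def single_quoted_js
  'var re = /\d{2}/;'
end

# Heredoc with a quoted terminator: also keeps backslashes untouched.
def raw_heredoc_js
  <<-'EOS'
    var re = /\d{2}/;
  EOS
end

puts heredoc_js       # the backslash is gone, so the regex is broken
puts single_quoted_js # the regex is intact
puts raw_heredoc_js   # the regex is intact
```

A heredoc is fine for JavaScript without backslashes or `#{}` sequences. Once a regular expression is involved, single quotes or a quoted heredoc terminator are the safer choice.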

module StatusCounter

  class UserCounter
    def map_command
      <<-EOS
        function() {
          emit(this.user.id_str, 1);
        }
      EOS
    end

    def reduce_command
      <<-EOS
        function(key, values) {
          var count = 0;

          for(i in values) {
            count += values[i]
          }

          return count;
        }
      EOS
    end

  end

  def status_count_by_user
    counter = UserCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, default_mr_options)
  end

  def default_mr_options
    {:out => {:inline => 1}, :raw => true }
  end
end

Follow this commit to view the changes from our previous examples.

When you run ‘ruby counter.rb’, you should see output similar to the screenshot below:

Tweets per Hour

Now, let’s do something a little bit harder than the previous example. This time, we want to know how many tweets are posted per hour. Every tweet has a created_at field of type String. We then use a regular expression to extract the hour component.

created_at:  "Tue Sep 04 22:04:40 +0000 2012"
regex:  (\d{2,2}):\d{2,2}:\d{2,2}
match: 22
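The same extraction can be tried in Ruby before embedding the regular expression in JavaScript. This is a quick sketch; the constant and method names are made up.

```ruby
# Same pattern as above, written as a Ruby regex. {2} is equivalent to {2,2}.
HOUR_PATTERN = /(\d{2}):\d{2}:\d{2}/

# Returns the hour as a two-character string, or nil if nothing matches.
def hour_of(created_at)
  match = HOUR_PATTERN.match(created_at)
  match && match[1]
end

p hour_of("Tue Sep 04 22:04:40 +0000 2012")
```

Note that the hour stays a string, which is fine for our purposes because it is only used as a grouping key.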

The only significant change is the addition of a new map command. Note the reduce command did not change from the previous example. See the commit.

  class HourOfDayCounter
    def map_command
      'function() {
        var re = /(\d{2,2}):\d{2,2}:\d{2,2}/;
        var hour = re.exec(this.created_at)[1];

        emit(hour, 1);
      }'
    end

    def reduce_command
      <<-EOS
        function(key, values) {
          var count = 0;

          for(i in values) {
            count += values[i]
          }

          return count;
        }
      EOS
    end

  end

  def status_count_by_hday
    counter = HourOfDayCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, default_mr_options)   
  end

Now run ‘ruby counter.rb’ in the console with the new method and the result should be something like the one below.

Filtering records

Our examples so far include every status since the beginning of time, which is pretty much useless. What we want is to apply the counting tasks to statuses posted in the past 7 days, for example. MongoDB allows you to pass a query to your map-reduce so you can filter the data to which the map-reduce is applied. One problem though: the created_at field is a string. To get around this, we introduce a new field, created_at_dt, which is of type Date. You can hook it up in the insert_status method, but since we already have our data, we instead run a script (in the MongoDB shell) to update our records. Please note the collection we are using is statuses and the new field is created_at_dt.

var cursor = db.statuses.find({ created_at_dt: { $exists: false } });
while (cursor.hasNext()) {
  var doc = cursor.next();
  db.statuses.update({ _id : doc._id }, { $set : { created_at_dt : new Date(doc.created_at) } } )
}
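For new records, the conversion could happen on the Ruby side before the insert. Here is a minimal sketch of that idea, assuming the status is available as a hash and that the driver stores a Ruby Time as a MongoDB date; the helper name and the format constant are made up, and wiring it into insert_status is left out.

```ruby
require "time"

# Format of Twitter's created_at string, e.g. "Tue Sep 04 22:04:40 +0000 2012".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Returns a copy of the status with an extra created_at_dt field holding
# a UTC Time parsed from the created_at string.
def with_created_at_dt(status)
  created_at = Time.strptime(status["created_at"], TWITTER_TIME_FORMAT).utc
  status.merge("created_at_dt" => created_at)
end

status = { "created_at" => "Tue Sep 04 22:04:40 +0000 2012", "text" => "hello" }
p with_created_at_dt(status)["created_at_dt"]
```

Using an explicit format with strptime means a change in Twitter’s date format raises an error instead of silently producing a wrong date.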

Now that we have a Date field, let’s modify our method to include a days_ago parameter and a query in our map-reduce.

  def status_count_by_hday(days_ago = 7)
    date     = Date.today - days_ago
    days_ago = Time.utc(date.year, date.month, date.day)
    query = { "created_at_dt" => { "$gte" => days_ago } }

    options = default_mr_options.merge(:query => query)

    counter = HourOfDayCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, options)   
  end
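The date arithmetic in the method above can be tried on its own. The sketch below pulls it into two small helpers (made-up names) and accepts “today” as a parameter so the result is predictable.

```ruby
require "date"

# Midnight UTC of the day that is `days_ago` days before `today`.
def cutoff_time(days_ago, today = Date.today)
  date = today - days_ago
  Time.utc(date.year, date.month, date.day)
end

# The query hash passed to map_reduce through the :query option.
def recent_statuses_query(days_ago = 7, today = Date.today)
  { "created_at_dt" => { "$gte" => cutoff_time(days_ago, today) } }
end

p recent_statuses_query(7, Date.new(2012, 9, 4))
```

Subtracting an integer from a Date moves it back by that many days, so month boundaries are handled for us.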

Since we’re now getting the hang of it, why don’t we add some more complexity? This time, let us count by day of the week and include a breakdown per hour. Luckily for us, the day of the week is also included in the created_at field and it is just a matter of extracting it. Of course, if Twitter decides to change the format, this will break. Let’s visit rubular.com and try our regular expression.
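If you prefer the console over rubular.com, the expression can be tried in Ruby as well. This is a sketch with made-up names; the pattern captures the three-letter day at the start of the string and the hour of the time component.

```ruby
# Group 1: three word characters at the start (the day of the week).
# Group 2: the two digits in front of ":mm:ss" (the hour).
WDAY_HOUR_PATTERN = /(^\w{3}).+(\d{2}):\d{2}:\d{2}/

# Returns [day_of_week, hour] or nil when the string does not match.
def wday_and_hour(created_at)
  match = WDAY_HOUR_PATTERN.match(created_at)
  match ? [match[1], match[2]] : nil
end

p wday_and_hour("Tue Sep 04 22:04:40 +0000 2012")
```

The greedy `.+` in the middle is safe here because there is only one place in the string where two digits are followed by `:mm:ss`.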

Now that we have our regex working, let’s include this in our new map command.

    def map_command
      'function() {
        var re = /(^\w{3,3}).+(\d{2,2}):\d{2,2}:\d{2,2}/;
        var matches = re.exec(this.created_at);

        var wday = matches[1],
            hday = matches[2];

        emit(wday, { count: 1, hdayBreakdown: [{ hday: hday, count: 1 }] });
      }'     
    end

Note the difference in the emit function from our previous examples. Before, we emitted only a single numeric value, which is why our reduce command was a simple array loop. This time, our reduce command requires more work.

    def reduce_command
      'function(key, values) {
         var total = 0,
             hdays = {},
             hdayBreakdown;

         for(i in values) {
           total += values[i].count

           hdayBreakdown = values[i].hdayBreakdown;

           for(j in hdayBreakdown) {
             hday  = hdayBreakdown[j].hday;
             count = hdayBreakdown[j].count;

             if( hdays[hday] == undefined ) {
               hdays[hday] = count;
             } else {
               hdays[hday] += count;
             }
           }
         }

         hdayBreakdown = [];
         for(k in hdays) {
           hdayBreakdown.push({ hday: k, count: hdays[k] })
         }

         return { count: total, hdayBreakdown: hdayBreakdown }
       }'
    end

In our previous examples, the values parameter was a simple array of numeric values. Now, it becomes an array of objects. On top of that, one of the properties (i.e. hdayBreakdown) is also an array. If everything works according to plan, you should see something like the image below when you run collect.rb.
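To make the shape of the data easier to follow, here is the same reduce logic sketched in plain Ruby. The names are made up and snake_cased. The point to notice is that the returned hash has the same shape as each input value, so a partial result can be fed back into another reduce.

```ruby
# Ruby rendering of the reduce step: sums the counts and merges the
# per-hour breakdowns of all values emitted for one day of the week.
def reduce_wday(values)
  total = 0
  hours = Hash.new(0)

  values.each do |value|
    total += value[:count]
    value[:hday_breakdown].each do |entry|
      hours[entry[:hday]] += entry[:count]
    end
  end

  breakdown = hours.map { |hday, count| { hday: hday, count: count } }
  { count: total, hday_breakdown: breakdown }
end

# Three made-up values as the map step would emit them for "Tue".
EMITTED = [
  { count: 1, hday_breakdown: [{ hday: "22", count: 1 }] },
  { count: 1, hday_breakdown: [{ hday: "22", count: 1 }] },
  { count: 1, hday_breakdown: [{ hday: "09", count: 1 }] }
]

p reduce_wday(EMITTED)
```

Reducing the first two values and then reducing that result together with the third gives the same totals as reducing all three at once.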

Did you have fun? I hope so :)


Adding keyboard shortcuts in web pages

Adding keyboard shortcuts to interact with your web pages seems like a useless feature when the rest of the world is using a mouse. But for a programmer who wants everything to be a few keystrokes away, keyboard shortcuts are very handy.

In this tutorial, we will add simple scrolling shortcuts to our web page. This is just to illustrate what is possible, so please do not copy and paste this into your production code.

What do we need?

  1. jquery
  2. sinatra
  3. coffeescript

Actually, the only critical pieces we need are jQuery and knowledge of JavaScript. However, since I am more of a Ruby guy, we will use Sinatra to build the page and CoffeeScript to write the JavaScript.

Build the pages

The screenshot below (left side) shows what our directory structure looks like. It is pretty much a standard Sinatra structure.

Our HTML page displays 10 entries, each wrapped in a “div” element with an “entry” class and an ID. We also add some styling to our page to distinguish each entry.

<!DOCTYPE HTML>
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Index</title>
    <link rel="stylesheet" href="css/style.css"/>

    <script type="text/javascript" charset="utf-8" src="http://code.jquery.com/jquery-1.7.1.min.js"></script>
    <script type="text/javascript" charset="utf-8" src="js/app.js">
    </script>
  </head>

  <body>
<% 1.upto(10) do |i| %>
  <div id="<%= "entry_#{i}" %>" class="entry">
    <%= "Title #{i}" %>
    <p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>
  </div>
<% end %>
  </body>
</html>

If everything is set up correctly, you should be able to run the app and see 10 entries.

greg@gokou ~/dev/projects/wdr/keyboard-shortcuts $ ruby app.rb
[2012-08-30 13:48:44] INFO  WEBrick 1.3.1
[2012-08-30 13:48:44] INFO  ruby 1.9.2 (2012-04-20) [x86_64-darwin12.1.0]
== Sinatra/1.3.3 has taken the stage on 4567 for development with backup from WEBrick
[2012-08-30 13:48:44] INFO  WEBrick::HTTPServer#start: pid=12415 port=4567

Now for the juicy part. When the user presses ‘j’, we will scroll to the next entry while ‘k’ scrolls to the previous. If you are a Vim user, you know why.

current_entry = -1

$(document).keydown (e) ->
  switch(e.keyCode)
    when 74 then scroll_to_next() # j
    when 75 then scroll_to_previous() # k

scroll_to_next = ->
  # Stop at the last entry so we never index past the end of the list
  if current_entry < $(".entry").length - 1
    current_entry++
    scroll_to_entry(current_entry)

scroll_to_previous = ->
  if current_entry > 0
    current_entry--
    scroll_to_entry(current_entry)

scroll_to_entry = (entry) ->
  # Get the element we need to scroll to
  id = $(".entry")[entry].id
  $("html, body").animate { scrollTop: $("##{id}").offset().top }, "slow"

That’s it! As I’ve mentioned before, this is not production ready. For example, the shortcut should not interfere with other actions in your page like when the user is interacting with an input field. This also assumes the current visible entry is the first one.

This post is based on the book Web Development Recipes. If you are looking for a quick reference on how to improve your project, I suggest reading the book.


How to create a wrapper gem for service APIs – part 1

APIs are getting more and more popular as apps and services move to the cloud. Whenever you need to integrate a popular web service API into your Ruby app, 99.99% of the time there already exists a gem ready for use. That is a testament to how active Ruby developers are in supporting the community.

Even if integrating with the popular APIs is not on your radar, you may still have a need to create an API wrapper for internal use. For example, if your Rails application has grown tremendously (congratulations!), you may eventually need to adopt a services architecture to support upcoming features and make things manageable.

I created the gem for the Open Amplify API as part of my exploration into data mining. When I first created it, my primary goal was simply to wrap the API. Though I didn’t write spaghetti code, it wasn’t a good example of structured code either. Two years later (yep, that’s how long I let the code rot), I decided to rewrite the gem and adopt the architecture of the Twitter gem. It was a good exercise because not only did I update the gem for the newest API version, I also learned a great deal about how to write a gem.

Setup the project

We will create a wrapper for the fictitious Awesome API and thus call our gem ‘awesome’. To get things started, let’s use bundler to set up our initial code.

$> bundle gem awesome
      create  awesome/Gemfile
      create  awesome/Rakefile
      create  awesome/LICENSE
      create  awesome/README.md
      create  awesome/.gitignore
      create  awesome/awesome.gemspec
      create  awesome/lib/awesome.rb
      create  awesome/lib/awesome/version.rb
Initializating git repo in /Users/greg/dev/code/awesome

This is the standard directory structure and naming convention of Ruby gems. The files Gemfile, Rakefile, and .gitignore are not necessary but they would be very useful while developing your gem.

Gem dependencies. All gem dependencies should go into awesome.gemspec and not in the Gemfile. Inside your Gemfile, the line ‘gemspec’ tells Bundler to read the dependencies from the .gemspec file, so the gems you need are installed in your local environment.

$> more Gemfile
source 'https://rubygems.org'

# Specify your gem's dependencies in awesome.gemspec
gemspec

Versioning. You specify the version of your gem inside lib/awesome/version.rb

$> more lib/awesome/version.rb
module Awesome
  VERSION = "0.0.1"
end

You may be wondering how this is used by the gem. Take a peek at awesome.gemspec and you’ll see that Awesome::VERSION is used by the .gemspec file.

$> more awesome.gemspec
# -*- encoding: utf-8 -*-
require File.expand_path('../lib/awesome/version', __FILE__)

Gem::Specification.new do |gem|
  gem.authors       = ["Greg Moreno"]
  gem.email         = ["greg.moreno@gmail.com"]
  gem.description   = %q{TODO: Write a gem description}
  gem.summary       = %q{TODO: Write a gem summary}
  gem.homepage      = ""

  gem.files         = `git ls-files`.split($\)
  gem.executables   = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
  gem.test_files    = gem.files.grep(%r{^(test|spec|features)/})
  gem.name          = "awesome"
  gem.require_paths = ["lib"]
  gem.version       = Awesome::VERSION
end

Additional modules and classes. You can write all your code in the file lib/awesome.rb and it would still work. However, in the spirit of making code maintainable, it is highly recommended that you put your classes and modules under the directory lib/awesome just like what we did with lib/awesome/version.rb

Testing the gem

We will use minitest but you can use any test framework you prefer. For our test setup, we need to do the following:

  1. Set up our test directory manually since bundler didn’t do this for us.
  2. Create a rake task to run our tests.
  3. Specify gem dependencies in our common test helper file.

$> mkdir test
$> touch test/helper.rb
$> mkdir test/awesome
$> touch test/awesome/awesome_test.rb

# Rakefile
require 'bundler/gem_tasks'

require 'rake/testtask'
Rake::TestTask.new do |test|
  test.libs << 'lib' << 'test'
  test.ruby_opts << "-rubygems"
  test.pattern = 'test/**/*_test.rb'
  test.verbose = true
end

# test/helper.rb
require 'awesome'
require 'minitest/spec'
require 'minitest/autorun'

Now that we have our testing in place, let’s write a simple test and see if everything works.

# test/awesome/awesome_test.rb
require 'helper'

describe Awesome do
  it 'should have a version' do
    Awesome::VERSION.wont_be_nil
  end
end

# Then, let's run the test
$> rake test
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb"
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
.
Finished in 0.000695 seconds.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

Test run options: --seed 55984

Perfect! Now, let’s start working on the juicy parts of our gem.

Configuration

Every web service API requires some per-user configuration and your gem should be able to support that. For example, in the Twitter gem, some methods require authentication and you set up the default configuration with this:

    Twitter.configure do |config|
      config.consumer_key = YOUR_CONSUMER_KEY
      config.consumer_secret = YOUR_CONSUMER_SECRET
      config.oauth_token = YOUR_OAUTH_TOKEN
      config.oauth_token_secret = YOUR_OAUTH_TOKEN_SECRET     
    end

Every API has different options but if you are wrapping a web service, the options often fall into two categories: connection options and functional options. For example, connection-related options include the endpoint, user agent, and authentication keys, while functional options include the request format (e.g. json), the number of pages to return, and other parameters required by specific API functions. In some APIs, the API key is passed as a parameter to GET calls, so while it may be connection-related, it is better to group it with the parameter options so you can easily encode all parameters in a single call.

Our Awesome API is simple and will not deal with OAuth like the Twitter gem does. For the configuration, we should be able to do this:

Awesome.api_key = 'YOUR_API_KEY'
Awesome.format = :json
# Other options are: user_agent, method

Now, let’s write some tests. Of course, these should fail at first :)

# test/awesome/configuration_test.rb
require 'helper'

describe 'configuration' do

  describe '.api_key' do
    it 'should return default key' do
      Awesome.api_key.must_equal Awesome::Configuration::DEFAULT_API_KEY
    end
  end

  describe '.format' do
    it 'should return default format' do
      Awesome.format.must_equal Awesome::Configuration::DEFAULT_FORMAT
    end
  end

  describe '.user_agent' do
    it 'should return default user agent' do
      Awesome.user_agent.must_equal Awesome::Configuration::DEFAULT_USER_AGENT
    end
  end

  describe '.method' do
    it 'should return default http method' do
      Awesome.method.must_equal Awesome::Configuration::DEFAULT_METHOD
    end
  end

end

As I mentioned before, the best way to write your gem (or any program for that matter) is to clearly separate the functionality into modules and classes. In our case, we will put all configuration defaults inside a module (i.e. lib/awesome/configuration.rb). We also want to provide class methods for the module Awesome, which we can easily do using Ruby’s ‘extend’.

# lib/awesome/configuration.rb

module Awesome
  module Configuration
    VALID_CONNECTION_KEYS = [:endpoint, :user_agent, :method].freeze
    VALID_OPTIONS_KEYS    = [:api_key, :format].freeze
    VALID_CONFIG_KEYS     = VALID_CONNECTION_KEYS + VALID_OPTIONS_KEYS

    DEFAULT_ENDPOINT    = 'http://awesome.dev/api'
    DEFAULT_METHOD      = :get
    DEFAULT_USER_AGENT  = "Awesome API Ruby Gem #{Awesome::VERSION}".freeze

    DEFAULT_API_KEY      = nil
    DEFAULT_FORMAT       = :json

    # Build accessor methods for every config options so we can do this, for example:
    #   Awesome.format = :xml
    attr_accessor *VALID_CONFIG_KEYS

    # Make sure we have the default values set when we get 'extended'
    def self.extended(base)
      base.reset
    end

    def reset
      self.endpoint   = DEFAULT_ENDPOINT
      self.method     = DEFAULT_METHOD
      self.user_agent = DEFAULT_USER_AGENT

      self.api_key    = DEFAULT_API_KEY
      self.format     = DEFAULT_FORMAT
    end

  end # Configuration
end

# lib/awesome.rb              
require 'awesome/version'
require 'awesome/configuration'

module Awesome
  extend Configuration
end


$> rake test
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb" "test/awesome/configuration_test.rb" 
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
.....
Finished in 0.001600 seconds.

5 tests, 5 assertions, 0 failures, 0 errors, 0 skips
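If the ‘extend’ plus ‘self.extended’ combination is new to you, here it is in isolation. The module names are made up and the sketch is independent of our gem.

```ruby
# A mixin that gives the extending module an accessor and a default value.
module Defaults
  attr_accessor :format

  # Ruby calls this hook right after `extend Defaults`,
  # passing in the object that was extended.
  def self.extended(base)
    base.reset
  end

  def reset
    self.format = :json
  end
end

module Service
  extend Defaults # methods of Defaults become Service.format, Service.reset, ...
end

p Service.format      # the default is already set by the hook
Service.format = :xml
p Service.format
Service.reset
p Service.format      # back to the default
```

Without the hook, the accessor would return nil until someone remembered to call reset.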

Our gem will not be awesome if we don’t support a ‘configure’ block like the Twitter gem does. We want to set up the configuration like this:

  Awesome.configure do |config|
    config.api_key = 'YOUR_API_KEY'
    config.method  = :post
    config.format  = :json
  end

Fortunately, it’s an easy fix. We just need to add a ‘configure’ method to the Configuration module. We also update our tests to make sure this new method works.

# lib/awesome/configuration.rb
    def configure
      yield self
    end

# test/awesome/configuration_test.rb  
  after do
    Awesome.reset
  end

  describe '.configure' do
    Awesome::Configuration::VALID_CONFIG_KEYS.each do |key|
      it "should set the #{key}" do 
        Awesome.configure do |config|
          config.send("#{key}=", key)
          Awesome.send(key).must_equal key
        end
      end
    end
  end

Before we move on, let’s take a second look at our configuration tests. We have tests for checking default values and for setting up new ones. What if we added a new configuration key to our gem? The ‘configure’ tests will be able to handle the new key but we still have to add another test for checking the default value. And we don’t want to write another test, right? More importantly, we don’t want our tests to yield false positives. If we fail to add the ‘default value’ check, our tests will still pass even though we forgot to set a default value.

Let us remove all our default value tests and replace them with code that relies on VALID_CONFIG_KEYS instead.

# test/awesome/configuration_test.rb  
  Awesome::Configuration::VALID_CONFIG_KEYS.each do |key|
    describe ".#{key}" do
      it 'should return the default value' do
        Awesome.send(key).must_equal Awesome::Configuration.const_get("DEFAULT_#{key.upcase}")
      end
    end
  end

$> rake test 
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb" "test/awesome/configuration_test.rb" 
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
...........
Finished in 0.002935 seconds.

11 tests, 11 assertions, 0 failures, 0 errors, 0 skips

Test run options: --seed 21540
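The reason the loop-based test protects us is that const_get raises a NameError when a DEFAULT_ constant is missing. Here is a small sketch with a made-up module where one default was forgotten on purpose.

```ruby
module Prefs
  VALID_KEYS     = [:format, :timeout].freeze
  DEFAULT_FORMAT = :json
  # DEFAULT_TIMEOUT is deliberately missing.

  # Looks up the constant named after the key, e.g. :format => DEFAULT_FORMAT.
  def self.default_for(key)
    const_get("DEFAULT_#{key.upcase}")
  end
end

# Lists the keys of a module that have no matching DEFAULT_ constant.
def missing_defaults(mod)
  mod::VALID_KEYS.reject { |key| mod.const_defined?("DEFAULT_#{key.upcase}") }
end

p Prefs.default_for(:format)
p missing_defaults(Prefs)

begin
  Prefs.default_for(:timeout)
rescue NameError => e
  puts "forgotten default detected: #{e.class}"
end
```

A test that iterates over the valid keys therefore fails loudly as soon as a key is added without its default.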

Configuring clients

Our end goal is to wrap API calls in a way that fits nicely into our application, and the common approach is to wrap the API calls under a ‘Client’ class. Depending on the size of the API you want to support, the Client class may be delegating the method calls to other classes and modules, but from the point of view of your program, the action happens inside the Client class. There are two ways to configure the Client class:

  1. It inherits the configuration values defined in the Awesome module;
  2. It overrides the configuration values per client.

# Use the values defined in the Awesome module
client = Awesome::Client.new
client.make_me_awesome('gregmoreno')

# Override the values per client
client_xml = Awesome::Client.new :format => :xml
client_json = Awesome::Client.new :format => :json

We are not going to show our tests here, but if you are interested, you can view the test code in the GitHub repository. Instead, we show the code that handles the two scenarios for client configuration.

# lib/awesome/client.rb

module Awesome

  class Client

    # Define the same set of accessors as the Awesome module
    attr_accessor *Configuration::VALID_CONFIG_KEYS

    def initialize(options={})
      # Merge the config values from the module and those passed
      # to the client.
      merged_options = Awesome.options.merge(options)

      # Copy the merged values to this client and ignore those
      # not part of our configuration
      Configuration::VALID_CONFIG_KEYS.each do |key|
        send("#{key}=", merged_options[key])
      end
    end

  end # Client

end

We also need to update our Awesome module. First, we need to require the new file awesome/client.rb so it will be loaded when we require the gem. Second, we need to implement a method that returns all the configuration values inside the Awesome module. Since this is still about configuration, our new method should go inside the Configuration module.

# lib/awesome/configuration.rb
    def options
      # Build the hash from [key, value] pairs; no flatten, so array values stay intact
      Hash[VALID_CONFIG_KEYS.map { |key| [key, send(key)] }]
    end
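The merge-then-filter step in the Client initializer can be tried on its own with plain hashes. This sketch uses made-up constants in place of Awesome.options and VALID_CONFIG_KEYS.

```ruby
# Stand-ins for the module-level configuration and its list of valid keys.
MODULE_OPTIONS   = { api_key: nil, format: :json, method: :get }.freeze
VALID_CLIENT_KEYS = MODULE_OPTIONS.keys.freeze

# Module values first, per-client overrides on top, unknown keys dropped.
def client_options(overrides = {})
  merged = MODULE_OPTIONS.merge(overrides)

  VALID_CLIENT_KEYS.each_with_object({}) do |key, result|
    result[key] = merged[key]
  end
end

p client_options
p client_options(format: :xml, bogus: 1)
```

Iterating over the valid keys instead of the merged hash is what keeps stray options such as :bogus out of the client.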

We’re finally done with the configuration part of our gem. I know it’s a lot of work for a simple task but we managed to put a good structure in our code. Plus, we learned how to make our tests less brittle, and use Ruby’s awesome power to make our code better. In our next installment, we’ll discuss requests and error handling.


Create your own Rails 3 engine

Engine is an interesting feature of Rails. Engines are miniature applications that live inside your application and they have structure that you would normally find in a typical Rails application. If you have used the Devise gem, which itself is an engine, you know the benefits of being able to add functionality to your application with just a few lines of code. Another great benefit of engines is when you or your team are maintaining a number of applications the common functionalities can be extracted into engines.

Engines were already available prior to Rails 3 but they were not a core feature of the framework. As such, engine developers resorted to monkey-patching which often led to engines breaking when Rails got updated. In Rails 3.1, engines are supported by the framework and there is a clearly defined place to hook your engines into Rails.

Now, let us go through the steps of building a simple engine. We will be working on an authentication engine (like Devise) that allows users of your application to sign in with their Twitter or Facebook credentials.

# This is the app that will use our engine
$> rails new social_app

# This is our engine
$> rails plugin new undevise --mountable

The --mountable option tells Rails you want to generate a mountable plugin, commonly known as an engine. When you look at the directory structure of your engine, it is not much different from that of a Rails app. The engine has controllers, models, views, mailers, lib, and even its own config/routes.rb (didn’t we just say it is a miniature Rails app?).

Include the engine in your app

Just like any gem, you should update your app’s Gemfile to use the engine we created.

# social_app/Gemfile
gem 'undevise', :path => '../undevise'

Of course, you can set :path to any location or if it is in a git repository, you can use the :git option. If you are developing your engine alongside your app, a better approach is to use gem groups in your Gemfile. For example:

group :development do
  gem 'undevise', :path => '../undevise'
end

group :production do
  gem 'undevise', :git => 'git://github.com/yourname/undevise.git'
end

After adding the engine to your Gemfile, let’s make sure all dependencies are available for the application. If everything works, you should be able to see a reference to undevise inside Gemfile.lock.

$> cd social_app
$> bundle install
$> more Gemfile.lock
PATH
  remote: ../undevise
  specs:
    undevise (0.0.1)
      rails (~> 3.2.1)

Mount the engine

Next, we will mount the engine and see if we can route requests to it. What this does is make sure requests starting with /auth will be passed to our engine.

# social_app/config/routes.rb
SocialApp::Application.routes.draw do
  mount Undevise::Engine, :at => '/auth'
end

# Run the social app. Make sure you are in the social_app directory.
$> rails s

# Then visit http://localhost:3000/auth


When you visit ‘/auth‘, you will get a routing error because you haven’t defined any routes in your engine yet.

# undevise/config/routes.rb
Undevise::Engine.routes.draw do
  root :to => 'auth#index'
end

Remember even though your engine is mounted at ‘/auth’, what your engine sees is the path after the ‘/auth’. Routes in engines are namespaced to avoid conflicts with your app. You can change the mounted path in your Rails app anytime and your engine wouldn’t care. Let’s try again and see what Rails would tell us.

$> cd social_app
$> rails s

Perfect! Now we know the request is being passed to our engine. We now just have to define our controller.

$ cd undevise
$ rails g controller auth

# undevise/app/controllers/undevise/auth_controller.rb
module Undevise
  class AuthController < ApplicationController

    def index
      render :text => 'Hello world'
    end

  end
end

# Now, visit http://localhost:3000/auth

Cool! We have the obligatory hello world program working. At the risk of sounding like a broken record, remember that your engine code should be namespaced. If you forget this, strange things will happen to your application and Rails will usually not complain about it.

Gem dependencies

I’m sure your idea for an engine is very far from what we have shown so far. When you generate an engine, it also creates a .gemspec file. While in your Rails app you list the gems in Gemfile, in your engine you list the gems inside the .gemspec file. This can be confusing because the engine also contains a Gemfile.

$> cd undevise
$> more Gemfile

source "http://rubygems.org"

# Declare your gem's dependencies in undevise.gemspec.
# Bundler will treat runtime dependencies like base dependencies, and
# development dependencies will be added by default to the :development group.
gemspec

As you can see, there is no need to list the gems your engine needs in the Gemfile. The line ‘gemspec’ makes sure the gems you listed in your .gemspec file are installed when you run bundle install.

Now, let’s add some gems in our engine and see how it will affect our Rails app.

# undevise/undevise.gemspec
  s.add_dependency "rails", "~> 3.2.1"
  s.add_dependency "omniauth"
  s.add_dependency "omniauth-twitter"
  s.add_dependency "omniauth-facebook"


$ cd social_app
$ bundle install
$ more Gemfile.lock
PATH
  remote: ../undevise
  specs:
    undevise (0.0.1)
      omniauth
      omniauth-facebook
      omniauth-twitter
      rails (~> 3.2.1)

Here we can see that the gems we specified in undevise/undevise.gemspec are also included in the main Rails app.

Configure OmniAuth

If you are using omniauth directly in your app, your configuration will most likely be in the file config/initializers/omniauth.rb. Since our engine is pretty much just another Rails app, it will also have its own config/initializers/omniauth.rb file. The only consideration with regard to the configuration is where the Twitter or Facebook credentials should be located. You definitely don’t want to embed them in your engine.

Our solution is to store the credentials inside a config/twitter.yml file (or config/facebook.yml) inside your main Rails app. Then have our engine pull the values out of these files to configure omniauth.

$> cd undevise/config
$> mkdir initializers
$> touch initializers/omniauth.rb

# We have to create the initializers directory because it is not created by default.

# undevise/config/initializers/omniauth.rb
providers = %w(twitter facebook).inject([]) do |providers, provider|
  fpath = Rails.root.join('config', "#{provider}.yml")

  if File.exist?(fpath) # File.exists? is deprecated and gone in Ruby 3.2
    config = YAML.load_file(fpath)
    providers << [ provider, config['consumer_key'], config['consumer_secret'] ]
  end

  providers
end

raise 'You have not created config/twitter.yml or config/facebook.yml' if providers.empty?

Rails.application.config.middleware.use OmniAuth::Builder do
  providers.each do |p|
    provider *p
  end
end
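
If you want to try the provider lookup without booting Rails, the same idea can be pulled into a plain method. This is only a sketch: load_providers is a name I made up, it takes the config directory as an argument instead of using Rails.root, and it reads the same consumer_key and consumer_secret entries as the initializer above.

```ruby
require 'yaml'
require 'tmpdir'

# Collect [provider, key, secret] triples for every provider that has
# a YAML file in config_dir. Providers without a file are skipped.
def load_providers(config_dir, names = %w(twitter facebook))
  names.each_with_object([]) do |name, found|
    fpath = File.join(config_dir, "#{name}.yml")
    next unless File.exist?(fpath)

    config = YAML.load_file(fpath)
    found << [name, config['consumer_key'], config['consumer_secret']]
  end
end

# Try it against a throwaway directory that only contains twitter.yml
providers = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'twitter.yml'),
             "consumer_key: 'KEY'\nconsumer_secret: 'SECRET'\n")
  load_providers(dir)
end
```

Only the twitter entry comes back because there is no facebook.yml in that directory.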

Now, let’s go back to our main Rails app and start the server.

$> cd social_app
$> rails s
=> Booting WEBrick
=> Rails 3.2.1 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
Exiting
/Users/greg/dev/tmp/ruby/engine-tutorial/undevise/config/initializers/omniauth.rb:12:in `<top (required)>': You have not created config/twitter.yml or config/facebook.yml (RuntimeError)

Oops! We forgot to create our Twitter or Facebook configuration file. In your main Rails app, go ahead and create config/twitter.yml. If you are not familiar with Twitter apps, visit their developer site at https://dev.twitter.com/

Gemfile dependencies and sub-dependencies

# social_app/config/twitter.yml
consumer_key:  'APP_CONSUMER_KEY'
consumer_secret: 'APP_CONSUMER_SECRET'

$> rails s
=> Booting WEBrick
=> Rails 3.2.1 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
Exiting
/Users/greg/dev/tmp/ruby/engine-tutorial/undevise/config/initializers/omniauth.rb:12:in `<top (required)>': uninitialized constant OmniAuth (NameError)
     from /Users/greg/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/railties-3.2.1/lib/rails/engine.rb:588:in `block (2 levels) in <class:Engine>'

The NameError occurs because, when the main Rails app boots, Bundler only requires the dependencies listed in the Gemfile and not their sub-dependencies. As you can see from Gemfile.lock, omniauth is not a direct dependency. You could list the gems in your main app’s Gemfile but that defeats the purpose of isolating the gem dependencies in your engine.

The solution right now is to require your dependencies inside your engine, and the place to do that is lib/undevise/engine.rb.

# undevise/lib/undevise/engine.rb
require 'omniauth'
require 'omniauth-twitter'
require 'omniauth-facebook'

module Undevise
  class Engine < ::Rails::Engine
    isolate_namespace Undevise
  end
end

After listing required dependencies inside your engine, restart your main Rails app, then visit http://localhost:3000/auth/twitter/

When you visit http://localhost:3000/auth/twitter/, you should see an error about the callback URL. The callback URL is part of OmniAuth’s behaviour, and the error can be fixed by adding a route in your engine and a method to handle it in your controller.

# undevise/config/routes.rb
Undevise::Engine.routes.draw do
  root :to => 'auth#index'
  match ':provider/callback' => 'auth#callback'
end

# undevise/app/controllers/undevise/auth_controller.rb
module Undevise
  class AuthController < ApplicationController

    def index
      render :text => 'Hello world'
    end

    def callback
      render :text => "Hello from #{params[:provider]}"
    end

  end
end

If everything works fine, you should see a message from Twitter.

We only scratched the surface of Rails 3 engines. Your engine, much like any normal Rails app, can have models and migrations, javascripts, css, specs, etc. If you want to dig deeper into engines, I recommend Rails 3 in Action by Ryan Bigg and Yehuda Katz. It includes a whole chapter about engines, a discussion of middleware, and how to test your engine.


More Ruby tips and tricks

String to number conversion gotcha

>> Float('3.14159')
=> 3.14159 
>> '3.14159'.to_f
=> 3.14159 

# However, the Float() method will raise an exception if given
# bad input, while to_f() will ignore everything from the
# offending character onward.

>> Float('3.x14159')
ArgumentError: invalid value for Float(): "3.x14159"
	from (irb):4:in 'Float'
	from (irb):4

>> '3.x14159'.to_f
=> 3.0


# Similar case with to_i() and Integer().

>> Integer('19x69')
ArgumentError: invalid value for Integer(): "19x69"
	from (irb):15:in 'Integer'
	from (irb):15
	from /Users/greg/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in '<main>'

>> '19x69'.to_i
=> 19 
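
If you want the strictness of Integer() and Float() without an exception escaping, one option is to wrap them in small helpers that return nil on bad input. The names parse_int and parse_float are my own, not part of Ruby.

```ruby
# Return an Integer, or nil when the whole string is not a valid number.
def parse_int(str)
  Integer(str)
rescue ArgumentError, TypeError # TypeError covers nil input
  nil
end

# Same idea for floating point numbers.
def parse_float(str)
  Float(str)
rescue ArgumentError, TypeError
  nil
end

parse_int('1969')       # => 1969
parse_int('19x69')      # => nil
parse_float('3.14159')  # => 3.14159
parse_float('3.x14159') # => nil
```

This way a caller can tell "not a number" apart from a real 0, which to_i and to_f cannot do.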

Case insensitive regular expression

# Regex is case sensitive by default.
# Add the 'i' flag for a case-insensitive match.
puts 'matches' if /AM/i =~ 'am'

Hash is ordered in 1.9

# new syntax in 1.9
h = {first: 'a', second: 'b', third: 'c'}

# hashes in 1.9 are ordered, so the pairs print in insertion order
require 'pp' # needed before Ruby 2.5
h.each do |e|
  pp e
end

Filter a list using several conditions

conditions = [
    proc { |i| i > 5 },
    proc { |i| (i % 2).zero? },
    proc { |i| (i % 3).zero? }
  ]

matches = (1..100).select do |i|
  conditions.all? { |c| c[i] }
end
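
Because the conditions live in a plain array, you can also add to them at runtime. Here is a small sketch of that idea; the extra "less than 50" condition is just an example I picked. Note that c[i] is shorthand for c.call(i).

```ruby
conditions = [
  proc { |i| i > 5 },
  proc { |i| (i % 2).zero? },
  proc { |i| (i % 3).zero? }
]

# Keep only the numbers that satisfy every condition
matches = (1..100).select { |i| conditions.all? { |c| c.call(i) } }

matches.first(3) # => [6, 12, 18]
matches.size     # => 16

# Append one more condition and filter again
conditions << proc { |i| i < 50 }
fewer = (1..100).select { |i| conditions.all? { |c| c.call(i) } }

fewer.last # => 48
```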

Randomly pick an element from an array

>> [1,2,3,4,5].sample
=> 2 
>> [1,2,3,4,5].sample
=> 1 

# pick 2 random elements

>> [1,2,3,4,5].sample(2)
=> [1, 5]

List methods unique to a class

# List all instance methods that start with 're',
# including those inherited by String.

>> String.instance_methods.grep /^re/
=> [:replace, :reverse, :reverse!, :respond_to?, :respond_to_missing?] 

# List methods unique to String, i.e. excluding
# those defined by its ancestors.

>> String.instance_methods(false).grep /^re/
=> [:replace, :reverse, :reverse!]

Globbing key-value pairs

>> h = Hash['a', 1, 'b', 2]
=> {"a"=>1, "b"=>2}

>> h = Hash[ [ ['a', 1], ['b', 2] ] ] 
=> {"a"=>1, "b"=>2}

>> h = Hash[ 'a' => 1, 'b' => 2 ]
=> {"a"=>1, "b"=>2}

# The first form is very useful for globbing key-value pairs in Rails’ routes. For example, if you have the following:

# route definition in Rails 3
match 'items/*specs' => 'items#specs'

# sample url

http://localhost:3000/items/year/1969/month/7/day/21

# params[:specs] will be set

>> params[:specs]
=> "year/1969/month/7/day/21"

>> h = Hash[*params[:specs].split('/')]
=> {"year"=>"1969", "month"=>"7", "day"=>"21"}
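
One thing to watch out for: Hash[*list] raises an ArgumentError when the list has an odd number of elements, which can happen if someone visits a URL with a dangling key. A possible way around it is to pair the segments up first and drop an incomplete last pair. The helper name specs_to_hash is my own.

```ruby
# Turn "key/value/key/value" into a hash, ignoring a trailing key
# that has no value.
def specs_to_hash(specs)
  pairs = specs.split('/').each_slice(2).select { |pair| pair.size == 2 }
  Hash[pairs]
end

specs_to_hash('year/1969/month/7/day/21')
# => {"year"=>"1969", "month"=>"7", "day"=>"21"}

specs_to_hash('year/1969/month')
# => {"year"=>"1969"}
```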

24 Ruby tips and tricks

Peter Cooper will share more tips in his book to be released later this year. Stay tuned and don’t forget to leave your email address to get updates at http://rubyreloaded.com/trickshots/

Here are some of the tips in the video.

Generate random numbers within a given range

irb(main):019:0> rand(10..20)
=> 12
irb(main):020:0> rand(10...20) # works with exclusive range
=> 16

Dump your object using awesome_print

# Install the gem first
gem install awesome_print

irb(main):001:0> require 'ap'
=> true
irb(main):002:0> ap :a => 1, :b => 'greg', :c => [1,2,3]
{
    :a => 1,
    :b => "greg",
    :c => [
        [0] 1,
        [1] 2,
        [2] 3
    ]
}
=> {:a=>1, :b=>"greg", :c=>[1, 2, 3]}

Concatenating strings

irb(main):005:0> "abc" + "def"
=> "abcdef"
irb(main):006:0> "abc".concat("def")
=> "abcdef"
irb(main):007:0> x = "abc" "def"
=> "abcdef"

Include modules in a single line

class MyClass
  include Module1, Module2, Module3
  # However, the modules are included in reverse order (Module3 first),
  # so Module1 ends up first in the method lookup. Confusing eh!
end
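
You can see the effect by looking at the ancestors of the class. In this sketch each module defines a method called who (a name I made up) so we can tell which one wins, and a second class shows how the result differs from three separate include calls.

```ruby
module Module1; def who; 'Module1'; end; end
module Module2; def who; 'Module2'; end; end
module Module3; def who; 'Module3'; end; end

class OneLine
  include Module1, Module2, Module3
end

class Separate
  include Module1
  include Module2
  include Module3
end

OneLine.ancestors.first(4)  # => [OneLine, Module1, Module2, Module3]
OneLine.new.who             # => "Module1"

Separate.ancestors.first(4) # => [Separate, Module3, Module2, Module1]
Separate.new.who            # => "Module3"
```

So the one-line form keeps the lookup order the same as the order you wrote, while separate calls put the last included module in front.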

Instance variable interpolation

irb(main):008:0> @name = "greg"
=> "greg"
irb(main):009:0> "my name is #{@name}"
=> "my name is greg"
irb(main):010:0> "my name is #@name"
=> "my name is greg"

I still prefer to use curly braces.

Syntax checking

➜  ruby -c facu.rb 
facu.rb:12: syntax error, unexpected keyword_end, expecting $end

Zipping arrays

irb(main):027:0> names = %w(fred jess john)
=> ["fred", "jess", "john"]
irb(main):028:0> ages = [38, 47,91]
=> [38, 47, 91]
irb(main):029:0> locations = %w(spain france usa)
=> ["spain", "france", "usa"]
irb(main):030:0> names.zip(ages)
=> [["fred", 38], ["jess", 47], ["john", 91]]
irb(main):031:0> names.zip(ages, locations)
=> [["fred", 38, "spain"], ["jess", 47, "france"], ["john", 91, "usa"]]

Range into arrays

irb(main):034:0> (10..20).to_a  # what I used to do
=> [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
irb(main):035:0> [*10..20]
=> [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

Using parameter as default value

irb(main):047:0> def method(a, b=a); "#{a} #{b}"; end
=> nil
irb(main):048:0> method 1
=> "1 1"
irb(main):049:0> method 1, 2
=> "1 2"

Put regex match in a variable

irb(main):058:0> s = "Greg Moreno"
=> "Greg Moreno"
irb(main):059:0> /(?<first>\w+) (?<second>\w+)/ =~ s
=> 0
irb(main):060:0> first
=> "Greg"
irb(main):061:0> second
=> "Moreno"
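
Keep in mind that the local variables are only created when a regex literal is on the left side of =~. If the pattern is stored in a variable, one alternative is to read the named groups from the MatchData instead. A small sketch, with variable names of my own choosing:

```ruby
s = 'Greg Moreno'
pattern = /(?<first>\w+) (?<second>\w+)/

# pattern =~ s would not define `first` and `second` here,
# so ask the MatchData for the named groups.
if (m = pattern.match(s))
  first_name = m[:first]
  last_name  = m[:second]
end

first_name # => "Greg"
last_name  # => "Moreno"
```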

Ruby 101: method_missing gotchas

Forgetting ‘super’ with ‘method_missing’

method_missing is a hallmark of Ruby metaprogramming. It is one of those coding techniques that you need to master if you want to move from white belt to black belt. Aside from that, it is also fun to use.

  class RadioActive
    def initialize(path)
      ...
    end

    def to_format(format)
      ...
    end
  end
  
  
  d = RadioActive.new('/path/to/uranium')
  d.to_format('xml')

It works but you want explicit methods for each format. So you tap into your Ruby skills and implement your own method_missing.

  class RadioActive
    def method_missing(name, *args)
      if name.to_s =~ /^to_(\w+)$/
        to_format($1)
      end
    end
  end
  
  
  d = RadioActive.new('/path/to/uranium')
  d.to_xml            # WORKS
  d.to_format('xml')  # WORKS
  d.undefined_method  # FAILS

Unfortunately, the last call to ‘undefined_method’ fails. Actually, you would not know it fails because Ruby will not raise any exception. Let us see how Ruby normally handles an undefined method.

  >> s = 'uranium'
   => "uranium" 
  >> s.to_xml
  NoMethodError: undefined method `to_xml' for "uranium":String

There you go. But there is no need to raise the ‘NoMethodError’ in your code. Instead, simply call ‘super’ if you are not handling the method. Whether you have your own class or are inheriting from another, do not forget to call ‘super’ in your ‘method_missing’.

  class RadioActive
    def method_missing(name, *args)
      if name.to_s =~ /^to_(\w+)$/
        to_format($1)
      else
        super
      end
    end
  end
  
  d = RadioActive.new('/path/to/uranium')
  d.undefined_method
  # => in `method_missing': undefined method `undefined_method' for #<RadioActive:0x...> (NoMethodError)

Calling ‘super’ is not just for ‘method_missing’. You also need to do the same for the other hook methods like ‘const_missing’, ‘append_features’, and ‘method_added’.

Forgetting respond_to?

If you modify ‘method_missing’, it will also affect the behavior of ‘respond_to?’ because what you are adding in as methods do not actually exist — they are ghost methods. If you check the list of instance methods for our class, it will only show 2.

  >> RadioActive.instance_methods(false)
  => ["method_missing", "to_format"]
 
  >> d.respond_to?('to_format')
  => true
  >> d.respond_to?('to_xml')
  => false
  

Every time you modify ‘method_missing’, you also need to update ‘respond_to?’

  class RadioActive
    def respond_to?(name)
      !!(name.to_s =~ /^to_/ || super)
    end
  end
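
Since Ruby 1.9.2 there is also a dedicated hook for this, respond_to_missing?, which respond_to? calls for you and which makes method() work with ghost methods too. Below is a sketch of the whole class using it. The body of to_format is a placeholder of mine because the real conversion is not shown here.

```ruby
class RadioActive
  def initialize(path)
    @path = path
  end

  # Placeholder body, for illustration only
  def to_format(format)
    "#{@path} as #{format}"
  end

  def method_missing(name, *args, &block)
    if name.to_s =~ /\Ato_(\w+)\z/
      to_format($1)
    else
      super
    end
  end

  # Called by respond_to? when the method is not actually defined
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('to_') || super
  end
end

d = RadioActive.new('/path/to/uranium')
d.respond_to?(:to_xml)  # => true
d.respond_to?(:explode) # => false
d.method(:to_xml).call  # => "/path/to/uranium as xml"
```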
  

Ruby 101: Improving your code by defining methods dynamically

Let’s say you have a user and you want to check its role.


  class User
    attr_accessor :role
  end
  
  u = User.new
  u.role = 'admin'
  
  # somewhere in your code you check the role
  
  if u.role == 'admin'
    puts 'admin'
  elsif u.role == 'moderator'
    puts 'moderator'
  elsif u.role == 'guest'
    puts 'guest'
  end

  

Comparing against a string value is bad code, and you can improve it by using constants instead. But this is still bad code because it exposes implementation details of your User class.

For our first improvement, we define methods that check the user’s role and hide the implementation of the role checking inside the User class.


  class User
  
    attr_accessor :role
  
    def is_admin?
      self.role == 'admin'
    end
  
    def is_moderator?
      self.role == 'moderator'
    end
  
    def is_guest?
      self.role == 'guest'
    end
  
  end
  
  u = User.new
  u.role = 'guest'
  
  
  if u.is_admin?
    puts 'admin'
  elsif u.is_moderator?
    puts 'moderator'
  elsif u.is_guest?
    puts 'guest'
  end

  

Our first improvement is definitely better than the original but there is duplicate code in the role checking. You can eliminate the duplicate code by delegating the role checking to a single method.

  class User
  
    attr_accessor :role
  
    def is_admin?
      is_role? 'admin'
    end
  
    def is_moderator?
      is_role? 'moderator'
    end
  
    def is_guest?
      is_role? 'guest'
    end
  
  protected
  
    def is_role?(name)
      self.role == name
    end
  
  end

  

Our second improvement is a classic refactoring technique and common in any modern programming language. In other words, there is nothing “Ruby” about it. Before you get bored, I will now show the Ruby version.

The Ruby version uses ‘define_method()’ to further eliminate duplicate code.


  class User
  
    attr_accessor :role
  
    def self.has_role(name)
      define_method("is_#{name}?") do
        self.role == name.to_s
      end
    end
  
    has_role :admin
    has_role :moderator
    has_role :guest
  
  end

  

By using ‘define_method()’, we were able to add instance methods to our class User. You can check the new instance methods via irb.


  ruby-1.9.2-p0 > User.instance_methods.grep /^is/
  => [:is_admin?, :is_moderator?, :is_guest?, :is_a?] 

  

Note that ‘has_role()’ is just another method and as such you can modify it to accept several parameters, an array, or another class. For example, we can make ‘has_role’ accept a list of roles.

  class User
  
    attr_accessor :role
  
    def self.has_roles(*names)
      names.each do |name|
        define_method("is_#{name}?") do
          self.role == name.to_s
        end
      end
    end
  
    has_roles :admin, :moderator, :guest
  
  end
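
Here is a quick sketch of the class in use. I added two things that are my own assumptions: a ROLES constant to hold the list, and a call to flatten so that ‘has_roles’ accepts either a list of names or an array.

```ruby
class User
  attr_accessor :role

  def self.has_roles(*names)
    # flatten lets callers pass has_roles :a, :b or has_roles [:a, :b]
    names.flatten.each do |name|
      define_method("is_#{name}?") do
        role == name.to_s
      end
    end
  end

  # Hypothetical constant, for illustration only
  ROLES = [:admin, :moderator, :guest]

  has_roles ROLES
end

u = User.new
u.role = 'guest'

u.is_guest? # => true
u.is_admin? # => false
```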

  