#hacking Ansible

Before you continue, let me congratulate myself. It’s been 2 years since I’ve written a post. Hooray!
Also, thank you to Rad for guiding me on my Ansible adventure. The ansible code I reference here are available in github.

There are tons of posts about Ansible already so this is more about gotchas I learned along the way.

Variables

Just like good coding, you should isolate things that will change. Store variables in vars/defaults.yml for things like initial database password, deployment directory, ruby version, and whatnot. I started out with using snake format but I later learned you can have a hierarchy for your variables as well, for example:

app:
name: google.app

Inside your tasks, you can reference it with {{ app.name }}. Of course, nothing prevents you from just simply naming your variable as app_name and use it as {{ app_name }}

Hosts and groups

You can use ansible to work on 1 or more servers at a time. These servers are specified in the hosts file (see hosts in the example). You can also group servers by using [groupname]. In the example below, you can have ansible target qa only (using the —limit option) and it will update the 2 servers you listed under the [qa] group.

[local]
localhost ansible_python_interpreter=/usr/local/bin/python

[www]
http://www.yourapp.com

[staging]
staging.yourapp.com

[qa]
qa1.yourapp.com
qa2.yourapp.com

Use roles

Roles allow you to separate common tasks into sort of like modules giving you flexibility. Some use roles to group common tasks like ‘db’, ‘web’, etc. For starters like me and if you are playing with combining different software, I used roles to define specific software. I have a role named ‘mysql’, a role ‘nginx-puma’, or a role ‘nginx-passenger’. Over time, you may split the roles into functional distinctions like web, db, etc.

Creating an EC2 instance

In my example, just update the variables below (included in vars/defaults.yml) to suit your requirements.

ec2_keypair: yourapp-www
ec2_instance_type: t1.micro
ec2_security_group: web-security-group
ec2_region: us-west-1
ec2_zone: us-west-1c
ec2_image: ami-f1fdfeb4 # Ubuntu Server 14.04 LTS (PV) 64-bit
ec2_instance_count: 1

You need your AWS_ACCESS_KEY, AWS_SECRET_KEY, and PEM file to create an instance. Creating the AWS access keys requires access to the IAM (Identity and Access Management).

When you’re ready, run the command:

AWS_ACCESS_KEY=”access” AWS_SECRET_KEY=”secret” ansible-playbook -i hosts ec2.yml

I have set ec2.yml to run against ‘local’ group (specified in the ’hosts’ file) because you don’t have a host yet. The Ansible EC2 module is smart enough to know that you are still creating an instance at this point.

After running the playbook, notice that it creates a new entry in the [launched] group in your hosts file. The new entry points to the EC2 instance you just created. At this point, you now have a server to run the rest of the playbooks.

Creating the deploy user

When the instance is created, it would create a new user. In my project where I use Ubuntu, it creates an ‘ubuntu’ user. In other cases, it might only create ‘root’ user. Regardless, your next step is to create a ‘deploy’ user because it is not a good idea to keep using root. You can change the name of the deploy user (see vars/defaults.yml) but I prefer using ‘deploy’ because it is also the default user in capistrano and I’m a Rails guy.

ansible-playbook -i hosts create-user.yml –user root –limit launched –private-key ~/.ssh/yourapp.pem

This playbook does several things:

  • Creates the ‘deploy’ user.
  • Adds your ssh key to /home/deploy/.ssh/authorized_keys
  • Gives deploy sudo rights.
  • Disables ssh and password access for root.

Note that I specified in the initial user ‘root’. Depending on the instance you just created it might be ‘ubuntu’ or some other initial user.

Creating an instance in other providers

In Digital Ocean (and other similar providers), you can create an instance using their admin interface. The initial user is ‘root’ and you can specify a password and/or public key. If you forgot the public key, you must add it before you continue.

scp ~/.ssh/id_rsa.pub root@yourapp.com:~/uploaded_key.pub
ssh root@staging.app.com
mkdir -m og-rwx .ssh
cat ~/uploaded_key.pub >> ~/.ssh/authorized_keys

Bootstrapping

Now that you have your deploy user, just run the command below, take a coffee, and come back after 30 mins. That’s how awesome ansible is.

ansible-playbook -i hosts bootstrap.yml –limit launched

You may need to tweak the roles or update a specific server. For example, you’re testing an upgrade on a staging server. In that case, make sure you specify the correct host group in the —limit parameter

A few more things I learned with this exercise:

  • I can set the file mode with mkdir.
  • Use `bash —login` to load login profile. When I ssh using ‘deploy@domain‘, my ruby gets loaded correctly. However, when using ansible (which uses ssh), my ruby is not loaded and thus, you will get errors like ‘bundle not found’.
  • I’m torn between using system ruby and chruby. On one hand, I feel like there is no need for chruby because this is a server and no need to switch rubies from time-to-time. On the other hand, why not try it and see how it works.

Mining Twitter Data with Ruby – Visualizing User Mentions

In my previous post on mining twitter data with ruby, we laid our foundation for collecting and analyzing Twitter updates. We stored these updates in MongoDB and used map-reduce to implement a simple counting of tweets. In this post, we’ll show relationships between users based on mentions inside the tweet. Fortunately for us, there is no need to parse each tweet just to get a list of users mentioned in the tweet because Twitter provides the “entities.mentions” field that contains what we need. After we collected the “who mentions who”, we then construct a directed graph to represent these relationships and convert them to an image so we can actually see it.

First, we start with the aggregation of mentions per user. We will use the same code base as last time. So if this is your first time, I recommend reading my previous related post or you can follow the changes in Github. Note to self: Convert this to an actual gem in the next post.

# user_mention.rb

module UserMention

  def mentions_by_user
    map_command = %q{
      function() {
        var mentions = this.entities.user_mentions,
            users = [];

        if (mentions.length > 0) {
          for(i in mentions) {
            users.push(mentions[i].id_str)
          }

          emit(this.user.id_str, { mentions: users });
        }
      }
    }

    reduce_command = %q{
      function(key, values) {
        var users = [];

        for(i in values) {
          users = users.concat(values[i].mentions);
        }

        return { mentions: users };
      }
    }

    options = {:out => {:inline => 1}, :raw => true, :limit => 50 }
    statuses.map_reduce(map_command, reduce_command, options)
  end

end

We then again use map-reduce in MongoDB to implement our aggregation. Of course, this sort of thing can be done in Ruby directly but it would be way more efficient if we do it in MongoDB especially if you have a big collection to process. Note that we limit the number of documents to process because we don’t want our graph to look unrecognizable when we display it.

Now that we have our aggregation working, we construct a directed graph of user mentions using the rgl library.

require "bundler"
Bundler.require

require File.expand_path("../tweetminer", __FILE__)

settings = YAML.load_file File.expand_path("../mongo.yml", __FILE__)
miner = TweetMiner.new(settings)

require "rgl/adjacency"
require "rgl/dot"

graph = RGL::DirectedAdjacencyGraph.new

miner.mentions_by_user.fetch("results").each do |user|
  user.fetch("value").fetch("mentions").each do |mention|
    graph.add_edge(user.fetch("_id"), mention)
  end
end

# creates graph.dot, graph.png
graph.write_to_graphic_file

Once you have the user-mentions relationships in a graph, you can do interesting things like who is connected to somebody and the degrees of separation. But for now, we are just interested in showing who mentioned whom. Our sample program saves the graph to the file graph.dot (using the DOT language) and PNG output. But the default PNG output is not laid out nicely. Instead, we will use the “neato” program to convert our graph.dot into a nice looking PNG file.

$ neato -Tpng graph.dot -o mentions.png

When you view “mentions.png”, you should see something similar as the one below. The labels are user IDs and the arrows show the mentioned users.

It would be cool to modify our program to use the users’ avatars and also make it interactive. Or, use Twitter’s streaming API and create an auto-update graph. I haven’t done any research yet but I’m sure there is some Javascript library out there that can help us display graph relationships.

Mining Twitter data with Ruby, MongoDB and Map-Reduce

When is the best time to tweet? If you care about reaching a lot of users, the best time probably is when your followers are also tweeting. In this exercise,we will try to figure out the day and time users are the most active. Since there is no way for us to do this for all users in the twitterverse, we will only use the users we follow as our sample.

What do we need

  • mongodb
  • tweetstream gem
  • awesome_print gem for awesome printing of Ruby objects
  • oauth credentials

Visit http://dev.twitter.com to get your oauth credentials. You just need to login, create an app, and the oauth credentials you need will be there. Copy the oauth settings to the twitter.yml file because that is where our sample code will be looking.

Collect status updates

We use the Tweetstream gem to access the Twitter Streaming APIs which allows our program to receive updates as they occur without the need to regularly poll Twitter.

# Collects user tweets and saves them to a mongodb
require "bundler"
require File.dirname(__FILE__) + "/tweetminer"

Bundler.require

# We use the TweetStream gem to access Twitter's Streaming API.
# https://github.com/intridea/tweetstream

TweetStream.configure do |config|
  settings = YAML.load_file File.dirname(__FILE__) + '/twitter.yml'

  config.consumer_key       = settings['consumer_key']
  config.consumer_secret    = settings['consumer_secret']
  config.oauth_token        = settings['oauth_token']
  config.oauth_token_secret = settings['oauth_token_secret']
end

settings = YAML.load_file File.dirname(__FILE__) + '/mongo.yml'
miner = TweetMiner.new(settings)

stream = TweetStream::Client.new

stream.on_error do |msg|
  puts msg
end

stream.on_timeline_status do |status|
  miner.insert_status status
  print "."
end

# Do not forget this to trigger the collection of tweets
stream.userstream

The code above handles the collection of status updates. The actual saving to mongodb is handled by the TweetMiner module.

# tweetminer.rb

require "mongo"

class TweetMiner
  attr_writer :db_connector
  attr_reader :options

  def initialize(options)
    @options = options
  end

  def db
    @db ||= connect_to_db
  end

  def insert_status(status)
    statuses.insert status
  end

  def statuses
    @statuses ||= db["statuses"]
  end

  private

  def connect_to_db
    db_connector.call(options["host"], options["port"]).db(options["database"])
  end

  def db_connector
    @db_connector ||= Mongo::Connection.public_method :new
  end

 end

We will be modifying our code along the way and if you want follow each step, you can view this commit at github.

Depending on how active the people you follow, it may take a while before you get a good sample of tweets. Actually, it would be interesting if you could run the collection for several days.

Assuming we have several days’ worth of data, let us proceed with the “data mining” part. Data mining would not be fun without a mention of map reduce – a strategy for data mining popularized by Google. The key innovation with map reduce is its ability to take a query over a data set, divide it, and run it in parallel over many nodes. “Counting”, for example, is a task that fits nicely with the map reduce framework. Imagine you and your friends are counting the number of people in a football stadium. First, you divide yourselves into 2 groups – group A counts the people in the lower deck while group B does the upper deck. Group A in turn divides the task into north, south, and endzones. When group A is done counting, they tally all their results. After group B is done, they combine the results with group A for which the total gives us the number of people in the stadium. Dividing your friends is the “map” part while the tallying of results is the “reduce” part.

Updates per user

First, let us do a simple task. Let us count the number of updates per user. We introduce a new module ‘StatusCounter’ which we include in our TweetMiner module. We also add a new program to execute the map reduce task.

# counter.rb

require "bundler"
Bundler.require

require File.dirname(__FILE__) + "/tweetminer"

settings = YAML.load_file File.dirname(__FILE__) + '/mongo.yml'
miner = TweetMiner.new(settings)

results = miner.status_count_by_user 
ap results

Map reduce commands in mongodb are written in Javascript. When writing Javascript, just be conscious about string interpolation because Ruby sees it as a bunch of characters and nothing else. For the example below, we use the here document which interprets backslashes. In our later examples, we switch to single quotes when we use regular expressions within our Javascript.

module StatusCounter

  class UserCounter
    def map_command
      <<-EOS
        function() {
          emit(this.user.id_str, 1);
        }
      EOS
    end

    def reduce_command
      <<-EOS
        function(key, values) {
          var count = 0;

          for(i in values) {
            count += values[i]
          }

          return count;
        }
      EOS
    end

  end

  def status_count_by_user
    counter = UserCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, default_mr_options)
  end

  def default_mr_options
    {:out => {:inline => 1}, :raw => true }
  end
 end

Follow this commit to view the changes from our previous examples.

When you run ‘ruby counter.rb’, you should see a similar screenshot as the one below:

Tweets per Hour

Now, let’s do something a little bit harder than the previous example. This time, we want to know how many tweets are posted per hour. Every tweet has a created_at field of type String. We then use a regular expression to extract the hour component.

created_at:  "Tue Sep 04 22:04:40 +0000 2012"
regex:  (\d{2,2}):\d{2,2}:\d{2,2}
match: 22

The only significant change is the addition of a new map command. Note the reduce command did not change from the previous example. See the commit.

  class HourOfDayCounter
    def map_command
      'function() {
        var re = /(\d{2,2}):\d{2,2}:\d{2,2}/;
        var hour = re.exec(this.created_at)[1];

        emit(hour, 1);
      }'
    end

    def reduce_command
      <<-EOS
        function(key, values) {
          var count = 0;

          for(i in values) {
            count += values[i]
          }

          return count;
        }
      EOS
    end

  end

  def status_count_by_hday
    counter = HourOfDayCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, default_mr_options)   
  end

Now run ‘ruby counter.rb’ in the console with the new method and the result should be something like the one below.

Filtering records

Our examples so far include every status since the beginning of time, which is pretty much useless. What we want is to apply the counting tasks to statuses posted the past 7 days, for example. MongoDB allows you to pass a query to your map-reduce so you can filter the data where the map-reduce is applied. One problem though: created_at field is a string. To get around this, we introduce a new field created_at_dt which is of type Date. You can hook it up in the insert_status method but since we already have our data, we instead run a query (using MongoDB console) to update our records. Please note the collection we are using is statuses and the new field is created_at_dt.

var cursor = db.statuses.find({ created_at_dt: { $exists: false } });
while (cursor.hasNext()) {
  var doc = cursor.next();
  db.statuses.update({ _id : doc._id }, { $set : { created_at_dt : new Date(doc.created_at) } } )
}

Now, that we have a Date field, let’s modify our method to include a days_ago parameter and a query in our map reduce.

  def status_count_by_hday(days_ago = 7)
    date     = Date.today - days_ago
    days_ago = Time.utc(date.year, date.month, date.day)
    query = { "created_at_dt" => { "$gte" => days_ago } }

    options = default_mr_options.merge(:query => query)

    counter = HourOfDayCounter.new
    statuses.map_reduce(counter.map_command, counter.reduce_command, options)   
end

Since we’re now getting the hang of it, why don’t we add another complexity. This time, let us count by day of the week and include a breakdown per hour. Luckily for us, the day of the week is also included in the created_at field and it is just a matter of extracting it. Of course, if Twitter decides to change the format, this will break. Let’s visit rubular.com and try our regular expression.

Now that we have our regex working, let’s include this in our new map command.

    def map_command
      'function() {
        var re = /(^\w{3,3}).+(\d{2,2}):\d{2,2}:\d{2,2}/;
        var matches = re.exec(this.created_at);

        var wday = matches[1],
            hday = matches[2];

        emit(wday, { count: 1, hdayBreakdown: [{ hday: hday, count: 1 }] });
      }'     
    end

Note the difference in the emit function from our previous examples. Before, we only emit a single numeric value that is why our reduce command is simple array loop. This time, our reduce command requires more work.

    def reduce_command
      'function(key, values) {
         var total = 0,
             hdays = {},
             hdayBreakdown;

         for(i in values) {
           total += values[i].count

           hdayBreakdown = values[i].hdayBreakdown;

           for(j in hdayBreakdown) {
             hday  = hdayBreakdown[j].hday;
             count = hdayBreakdown[j].count;

             if( hdays[hday] == undefined ) {
               hdays[hday] = count;
             } else {
               hdays[hday] += count;
             }
           }
         }

         hdayBreakdown = [];
         for(k in hdays) {
           hdayBreakdown.push({ hday: k, count: hdays[k] })
         }

         return { count: total, hdayBreakdown: hdayBreakdown }
       }'
    end

In our previous examples, the values parameter is a simple array of numeric values. Now, it becomes an an array of properties. On top of that, one of the properties (i.e. hdayBreakdown) is also an array. If everything works according to plan, you should see something like the image below when you run collect.rb.

Did you have fun? I hope so 🙂

Adding keyboard shortcuts in web pages

Adding keyboard shortcuts to interact with your web pages seems like a useless feature when the rest of the world is using a mouse. But for a programmer who wants everything to be a few keystrokes away, keyboard shortcuts are very handy.

In this tutorial, we will add a simple scrolling shortcuts to our webpage. This is just to illustrate what is possible. So please, do not copy-and-paste this to your production code.

What do we need?

  1. jquery
  2. sinatra
  3. coffeescript

Actually, the only critical piece we need is jQuery and knowledge of Javascript. However, since I am more of a Ruby guy, we will use Sinatra to build the page and CoffeeScript to write the Javascript.

Build the pages

The screenshot below (left side) shows how our directory structure would look like. It is pretty much a standard Sinatra structure.

Our HTML page displays 10 entries where each is grouped under a “div” element with an “.entry” class and an ID. We also add in some styling in our page to distinguish each entry.

<!DOCTYPE HTML>
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Index</title>
    <link rel="stylesheet" href="css/style.css"/>

    <script type="text/javascript" charset="utf-8" src="http://code.jquery.com/jquery-1.7.1.min.js"></script>
    <script type="text/javascript" charset="utf-8" src="js/app.js">
    </script>
  </head>

<% 1.upto(10) do |i| %>
  <div id="<%= "entry_#{i}" %>"class="entry">
    <%= "Title #{i}" %>
    <p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>
  </div>
<% end %>
</html>

If everything is setup correctly, you should be able to run the app and see 10 entries.

greg@gokou ~/dev/projects/wdr/keyboard-shortcuts $ ruby app.rb
[2012-08-30 13:48:44] INFO  WEBrick 1.3.1
[2012-08-30 13:48:44] INFO  ruby 1.9.2 (2012-04-20) [x86_64-darwin12.1.0]
== Sinatra/1.3.3 has taken the stage on 4567 for development with backup from WEBrick
[2012-08-30 13:48:44] INFO  WEBrick::HTTPServer#start: pid=12415 port=4567

Now for the juicy part. When the user presses ‘j’, we will scroll to the next entry while ‘k’ scrolls to the previous. If you are a Vim user, you know why.

current_entry = -1

$(document).keydown (e) ->
  switch(e.keyCode)
    when 74 then scroll_to_next() # j
    when 75 then scroll_to_previous() # k

scroll_to_next = ->
  #alert "scroll to next"
  current_entry++
  scroll_to_entry(current_entry)


scroll_to_previous = ->
  if current_entry > 0
    current_entry--
    scroll_to_entry(current_entry)

scroll_to_entry = (entry) ->
  # Get the element we need to scroll to
  id = $(".entry")[entry].id
  $("html, body").animate { scrollTop: $("##{id}").offset().top }, "slow"

That’s it! As I’ve mentioned before, this is not production ready. For example, the shortcut should not interfere with other actions in your page like when the user is interacting with an input field. This also assumes the current visible entry is the first one.

This post is based from the book Web Development Recipes. If you are looking for quick reference on how to improve your project, I suggest reading the book.

How to create a wrapper gem for service APIs – part 1

APIs are getting more and more popular as apps and services move to the cloud. Whenever you need to integrate a popular web service API into your Ruby app, 99.99% of the time there already exists a gem ready for use. That is a testament to how active Ruby developers are in supporting the community.

Even if integrating with the popular APIs is not in your radar, you may still have a need to create an API wrapper for internal use. For example, if your Rails application has grown tremendously (congratulations!), you may eventually need to adopt a services architecture to support upcoming features and make things manageable.

I created the gem for the Open Amplify API as part of my exploration to data mining. When I first created it, my primary goal was simply to wrap the API. Though I still didn’t write spaghetti code, it wasn’t a good example of structured code either. Two years later (yep, that’s how long I let the code rot), I decided to re-write the gem and adopt the architecture from the Twitter gem. It was a good exercise because not only I updated the gem for the newest API version, I also learned a great deal on how to write a gem.

Setup the project

We will create a wrapper for the fictitious Awesome API and thus call our gem ‘awesome’. To get things started, let’s use bundler to set up our initial code.

$> bundle gem awesome
      create  awesome/Gemfile
      create  awesome/Rakefile
      create  awesome/LICENSE
      create  awesome/README.md
      create  awesome/.gitignore
      create  awesome/awesome.gemspec
      create  awesome/lib/awesome.rb
      create  awesome/lib/awesome/version.rb
Initializating git repo in /Users/greg/dev/code/awesome

This is the standard directory structure and naming convention of Ruby gems. The files Gemfile, Rakefile, and .gitignore are not necessary but they would be very useful while developing your gem.

Gem dependencies. All gem dependencies should go into awesome.gemspec and not in Gemfile. Inside your Gemfile, the line ‘gemspec’ takes care of identifying the gems you needed in your local.

$> more Gemfile
source 'https://rubygems.org'

# Specify your gem's dependencies in awesome.gemspec
gemspec

Versioning. You specify the version of your gem inside lib/awesome/version.rb

$> more lib/awesome/version.rb
module Awesome
  VERSION = "0.0.1"
end

You may be wondering how is this used by the gem. Take a peek at awesome.gemspec and you’ll see that Awesome::VERSION is used by the .gemspec file.

$> more awesome.gemspec
# -*- encoding: utf-8 -*-
require File.expand_path('../lib/awesome/version', __FILE__)

Gem::Specification.new do |gem|
  gem.authors       = ["Greg Moreno"]
  gem.email         = ["greg.moreno@gmail.com"]
  gem.description   = %q{TODO: Write a gem description}
  gem.summary       = %q{TODO: Write a gem summary}
  gem.homepage      = ""

  gem.files         = `git ls-files`.split($\)
  gem.executables   = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
  gem.test_files    = gem.files.grep(%r{^(test|spec|features)/})
  gem.name          = "awesome"
  gem.require_paths = ["lib"]
  gem.version       = Awesome::VERSION
end

Additional modules and classes. You can write all your code in the file lib/awesome.rb and it would still work. However, in the spirit of making code maintainable, it is highly recommended that you put your classes and modules under the directory lib/awesome just like what we did with lib/awesome/version.rb

Testing the gem

We will use minitest but you can always use any test framework you prefer. For our test setup, we need to do the ff:

  1. Setup our test directory manually since bundler didn’t do this for us.
  2. Create a rake task to run our tests.
  3. Specify gem dependencies in our common test helper file.
$> mkdir test
$> touch test/helper.rb
$> mkdir test/awesome
$> touch test/awesome/awesome_test.rb

# Rakefile
require 'bundler/gem_tasks'

require 'rake/testtask'
Rake::TestTask.new do |test|
  test.libs << 'lib' << 'test'
  test.ruby_opts << "-rubygems"
  test.pattern = 'test/**/*_test.rb'
  test.verbose = true
end

# test/helper.rb
require 'awesome'
require 'minitest/spec'
require 'minitest/autorun'

Now that we have our testing in place, let’s write a simple test and see if everything works.

# test/awesome/awesome_test.rb
require 'helper'

describe Awesome do
  it 'should have a version' do
    Awesome::VERSION.wont_be_nil
  end
end

# Then, let's run the test
$> rake test
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb"
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
.
Finished in 0.000695 seconds.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

Test run options: --seed 55984

Perfect! Now, let’s start working on the juicy parts of our gem.

Configuration

Every webservice API would definitely require a per-user configuration and your gem should be able to support that. For example in the Twitter gem, some methods require authentication and you setup the default configuration with this:

    Twitter.configure do |config|
      config.consumer_key = YOUR_CONSUMER_KEY
      config.consumer_secret = YOUR_CONSUMER_SECRET
      config.oauth_token = YOUR_OAUTH_TOKEN
      config.oauth_token_secret = YOUR_OAUTH_TOKEN_SECRET     
    end

Every API has different options but if you are wrapping a webservice, the options often fall into two categories – connections and functional options. For example, connection-related options include the endpoint, user agent, and authentication keys while functional options include request format (e.g. json), number of pages to return and other parameters required by specific API functions. In some APIs, the api key is passed as a parameter to GET calls so while it may be connection-related, it is better to group it with parameter options so you can easily encode all parameters in a single call.Our Awesome API is simple and will not deal with OAuth like the Twitter gem does. For the configuration, we should be able to do this:

Awesome.api_key = 'YOUR_API_KEY'
Awesome.format = :json
# Other options are: user_agent, method

Now, let’s write some tests. Of course, these should fail at first 🙂

# test/awesome/configuration_test.rbrequire 'helper'

describe 'configuration' do

  describe '.api_key' do
    it 'should return default key' do
      Awesome.api_key.must_equal Awesome::Configuration::DEFAULT_API_KEY
    end
  end

  describe '.format' do
    it 'should return default format' do
      Awesome.format.must_equal Awesome::Configuration::DEFAULT_FORMAT
    end
  end

  describe '.user_agent' do
    it 'should return default user agent' do
      Awesome.user_agent.must_equal Awesome::Configuration::DEFAULT_USER_AGENT
    end
  end

  describe '.method' do
    it 'should return default http method' do
      Awesome.method.must_equal Awesome::Configuration::DEFAULT_METHOD
    end
  end

end

As I mentioned before, the best way to write your gem (or any program for that matter) is to cleary separate the functionalities into modules and classes. In our case, we will put all configuration defaults inside a module (i.e. lib/awesome/configuration.rb). We also want to provide class methods for the module Awesome which we can easily do using Ruby’s ‘extend’.

# lib/awesome/configuration.rb

module Awesome
  module Configuration
    VALID_CONNECTION_KEYS = [:endpoint, :user_agent, :method].freeze
    VALID_OPTIONS_KEYS    = [:api_key, :format].freeze
    VALID_CONFIG_KEYS     = VALID_CONNECTION_KEYS + VALID_OPTIONS_KEYS

    DEFAULT_ENDPOINT    = 'http://awesome.dev/api'
    DEFAULT_METHOD      = :get
    DEFAULT_USER_AGENT  = "Awesome API Ruby Gem #{Awesome::VERSION}".freeze

    DEFAULT_API_KEY      = nil
    DEFAULT_FORMAT       = :json

    # Build accessor methods for every config options so we can do this, for example:
    #   Awesome.format = :xml
    attr_accessor *VALID_CONFIG_KEYS

    # Make sure we have the default values set when we get 'extended'
    def self.extended(base)
      base.reset
    end

    def reset
      self.endpoint   = DEFAULT_ENDPOINT
      self.method     = DEFAULT_METHOD
      self.user_agent = DEFAULT_USER_AGENT

      self.api_key    = DEFAULT_API_KEY
      self.format     = DEFAULT_FORMAT
    end

  end # Configuration
end

# lib/awesome.rb              
require 'awesome/version'
require 'awesome/configuration'

module Awesome
  extend Configuration
end


$> rake test
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb" "test/awesome/configuration_test.rb" 
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
.....
Finished in 0.001600 seconds.

5 tests, 5 assertions, 0 failures, 0 errors, 0 skips

Our gem will not be awesome if we don’t support a ‘configure’ block like what the Twitter gem does. We want to setup the configuration like this:

  Awesome.configure do |config|
    config.api_key = 'YOUR_API_KEY'
    config.method  = :post
    config.format  = :json
  end

Fortunately, it’s an easy fix. We just need to add a ‘configure’ method to the Configuration module. We also update our tests to make sure this new method works.

# lib/awesome/configuration.rb
    def configure
      yield self
    end

# test/awesome/configuration_test.rb  
  after do
    Awesome.reset
  end

  describe '.configure' do
    Awesome::Configuration::VALID_CONFIG_KEYS.each do |key|
      it "should set the #{key}" do 
        Awesome.configure do |config|
          config.send("#{key}=", key)
          Awesome.send(key).must_equal key
        end
      end
    end
  end

Before we move on, let’s take a second look at our configuration tests. We have tests for checking default values and setting-up new ones. What if we added a new configuration key for our gem? The ‘configure’ tests will be able to handle the new key but we still have to add another test for checking the default value. And we don’t want to right another test code, right? More importantly, we don’t want our tests to yield false positives. If we fail to add the ‘default value’ check, our tests will still pass even though we forgot to set a default value.Let us remove all our default value tests and replace it with code that relies on VALID_CONFIG_KEYS instead.

# test/awesome/configuration_test.rb  
  Awesome::Configuration::VALID_CONFIG_KEYS.each do |key|
    describe ".#{key}" do
      it 'should return the default value' do
        Awesome.send(key).must_equal Awesome::Configuration.const_get("DEFAULT_#{key.upcase}")
      end
    end
  end

$> rake test 
(in /Users/greg/dev/code/awesome)
/Users/greg/.rbenv/versions/1.9.2-p290/bin/ruby -I"lib:lib:test" -rubygems "/Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader.rb" "test/awesome/awesome_test.rb" "test/awesome/configuration_test.rb" 
Loaded suite /Users/greg/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/rake/rake_test_loader
Started
...........
Finished in 0.002935 seconds.

11 tests, 11 assertions, 0 failures, 0 errors, 0 skips

Test run options: --seed 21540

Configuring clients

Our end goal is to wrap API calls that fits nicely into our application and the common approach to do that is to wrap the API calls under a ‘Client’ class. Depending on the size of the API you want to support, the Client class maybe delegating the method calls to other classes and modules but from the point of view your program, the action happens inside the Client class. There are two ways to configure the Client class:

  1. It inherits the configuration values defined in the Awesome module;
  2. It overrides the configuration values per client.
# Use the values defined in the Awesome module
client = Awesome::Client.new
client.make_me_awesome('gregmoreno')

client_xml = Awesome::Client.new :format => :xml
client_json = Awesome::Client.new :format => :json

We are not going to show our tests in here but if you are interested, you can view the test code from the github repository. Instead, we show the code that handles the two scenarios for client configuration.

# lib/awesome/client.rb

module Awesome

  class Client

    # Define the same set of accessors as the Awesome module
    attr_accessor *Configuration::VALID_CONFIG_KEYS

    def initialize(options={})
      # Merge the config values from the module and those passed
      # to the client.
      merged_options = Awesome.options.merge(options)

      # Copy the merged values to this client and ignore those
      # not part of our configuration
      Configuration::VALID_CONFIG_KEYS.each do |key|
        send("#{key}=", merged_options[key])
      end
    end

  end # Client

end

We also need to update our Awesome module. First, we need to require the new file awesome/client.rb so it will be loaded when we require the gem. Second, we need to implement a method that returns all the configuration values inside the Awesome module. Since this is still about configuration, our new method should go inside the Configuration module.

# lib/awesome/configuration.rb
    def options
      Hash[ * VALID_CONFIG_KEYS.map { |key| [key, send(key)] }.flatten ]
    end

We’re finally done with the configuration part of our gem. I know it’s a lot of work for a simple task but we managed to put a good structure in our code. Plus, we learned how to make our tests less brittle, and use Ruby’s awesome power to make our code better. In our next installment, we’ll discuss requests and error handling.

Create your own Rails 3 engine

Engine is an interesting feature of Rails. Engines are miniature applications that live inside your application and they have structure that you would normally find in a typical Rails application. If you have used the Devise gem, which itself is an engine, you know the benefits of being able to add functionality to your application with just a few lines of code. Another great benefit of engines is when you or your team are maintaining a number of applications the common functionalities can be extracted into engines.

Engines are already available prior to Rails 3 but it is not a core feature of the framework. As such, engine developers resorted to monkey-patching which, oftentimes, lead to engines breaking when Rails gets updated. In Rails 3.1, engines are now supported by the framework and there is now a clealy defined place where to hook your engines into Rails.

Now, let us go through the steps of building a simple engine. We will be working on authentication engine (like Devise) that allows users of your application to use their Twitter or Facebook credentials.

# This is the app that will use our engine
$> rails new social_app

# This is our engine
$> rails plugin new undevise --mountable

The –mountable option tells Rails you want to generate a mountable plugin, commonly known as engine. When you look at the directory structure of your engine, it is much different from your Rails app. The engine has controllers, models, views, mailers, lib, and even its own config/routes.rb (didn’t we just said it is a miniature Rails app).

Include the engine in your app

Just like any gem, you should update your app’s Gemfile to use the engine we created.

# social_app/Gemfile
gem 'undevise', :path => '../undevise'

Of course, you can set :path to any location or if it is in a git repository, you can use the :git option. If you are developing your engine alongside your app, a better approach is to use gem groups in your Gemfile. For example:

group :development do
  gem 'undevise', :path => '../undevise'
end

group :production do
  gem 'undevise', :git => 'git://github.com/yourname/undevise.git'
end

After adding the engine in your Gemfile, let’s make sure all dependencies are available for the application. If everything works, you should be able to see a reference to undevise inside Gemfile.lock

$> cd social_app
$> bundle install
$> more Gemfile.lock
PATH
  remote: ../undevise
  specs:
    undevise (0.0.1)
      rails (~> 3.2.1)

Mount the engine

Next, we will mount the engine and see if we can route requests to it. What this does is make sure requests starting with /auth will be passed to our engine.

# social_app/config/routes.rb
SocialApp::Application.routes.draw do
  mount Undevise::Engine, :at => '/auth'
end

# Run the social app. Make sure you are in the social_app directory.
$> rails s

# Then visit http://localhost:3000/auth

Rails 3 engine

When you visit ‘/auth‘, you will get a routing error because you haven’t defined any routes in your engine yet.

# undevise/config/routes.rb
Undevise::Engine.routes.draw do
  root :to => 'auth#index'
end

Remember even though your engine is mounted at ‘/auth’, what your engine sees is the path after the ‘/auth’. Routes in engines are namespaced to avoid conflicts with your app. You can change the mounted path in your Rails app anytime and your engine wouldn’t care. Let’s try again and see what Rails would tell us.

$> cd social_app
$> rails s

Perfect! Now we know the request is being passed to our engine. We now just have to define our controller.

$ cd undevise
$ rails g controller auth

# undevise/app/controllers/undevise/auth_controller.rb
module Undevise
  class AuthController < ApplicationController

    def index
      render :text => 'Hello world'
    end

  end
end

# Now, visit http://localhost:3000/auth

Cool! We have the obligatory hello world program working. At the the risk of sounding like a broken record, remember your engine code should be namespaced. If you forget this, strange things will happen to your application and Rails will not usually complain about it.

Gem dependencies

I’m sure your idea for an engine is very far from what we have shown so far. When you generate an engine, it also creates a .gemspec file. While in your Rails app you list the gems in Gemfile, in your engine you list the gems inside the .gemspec file. This can be confusing because the engine also contains a Gemfile.

$> cd undevise
$> more Gemfile

source "http://rubygems.org"

# Declare your gem's dependencies in undevise.gemspec.
# Bundler will treat runtime dependencies like base dependencies, and
# development dependencies will be added by default to the :development group.
gemspec

As you can see, there is no need to list the gems your engine needs in the Gemfile. The line ‘gemspec’ makes sure the gems you listed in your .gemspec file are installed when you run bundle install.

Now, let’s add some gems in our engine and see how it will affect our Rails app.

# undevise/undevise.gemspec
  s.add_dependency "rails", "~> 3.2.1"
  s.add_dependency "omniauth"
  s.add_dependency "omniauth-twitter"
  s.add_dependency "omniauth-facebook"


$ cd social_app
$ bundle install
$ more Gemfile.lock
PATH
  remote: ../undevise
  specs:
    undevise (0.0.1)
      omniauth
      omniauth-facebook
      omniauth-twitter
      rails (~> 3.2.1)

Here we can see the gems we specifed in undevise/undevise.gemspec are also included in the main Rails app.

Configure OmniAuth

If you are using omniauth directly in your app, your will configuration will definitely be in the file config/initializers/omniauth.rb. Since our engine is pretty much just another Rails app, it will also have its own config/initializers/omniauth.rb file. The only consideration with regards to the configuration is where would the Twitter or Facebook credentials be located. You definitely don’t want to embed it in your engine.

Our solution is to store the credentials inside a config/twitter.yml file (or config/facebook.yml) inside your main Rails app. Then have our engine pull the values out of these files to configure omniauth.

$> cd undevise/config
$> mkdir initializers
$> touch initializers/omniauth.rb

# We have to create the initializers directory because it not created by default.

# undevise/config/initializers/omniauth.rb
providers = %w(twitter facebook).inject([]) do |providers, provider|
  fpath = Rails.root.join('config', "#{provider}.yml")

  if File.exists?(fpath)
    config = YAML.load_file(fpath)
    providers << [ provider, config['consumer_key'], config['consumer_secret'] ]
  end

  providers
end

raise 'You have not created config/twitter.yml or config/facebook.yml' if providers.empty?

Rails.application.config.middleware.use OmniAuth::Builder do
  providers.each do |p|
    provider *p
  end
end

Now, let’s go back to our main Rails app and start the server.

$> cd social_app
$> rails s
=> Booting WEBrick
=> Rails 3.2.1 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
Exiting
/Users/greg/dev/tmp/ruby/engine-tutorial/undevise/config/initializers/omniauth.rb:12:in `<top (required)>': You have not created config/twitter.yml or config/facebook.yml (RuntimeError)

Oops! We forgot to create our Twitter or Facebook configuration file. In your main Rails app, go ahead and create config/twitter.yml. If you are not familiar with Twitter apps, visit their developer site at https://dev.twitter.com/

Gemfile dependencies and sub-depencies

# social_app/config/twitter.yml
consumer_key:  'APP_CONSUMER_KEY'
consumer_secret: 'APP_CONSUMER_SECRET'

$> rails s
=> Booting WEBrick
=> Rails 3.2.1 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
Exiting
/Users/greg/dev/tmp/ruby/engine-tutorial/undevise/config/initializers/omniauth.rb:12:in `<top (required)>': uninitialized constant OmniAuth (NameError)
     from /Users/greg/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/railties-3.2.1/lib/rails/engine.rb:588:in `block (2 levels) in <class:Engine>'

The NameError occurs because during the main Rails’ app boot up, Bundler will only require dependencies listed in the Gemfile but not the sub-dependencies. As you can see from Gemfile.lock, omniauth is not a direct dependency. You could list the gems in your main app’s Gemfile but that’s defeating the purpose of isolating the gem dependencies through your engine.

The solution right now is to require your dependencies inside your engine and the place to do that is inside lib/undevise/engine.rb

# undevise/lib/undevise/engine.rb
require 'omniauth'
require 'omniauth-twitter'
require 'omniauth-facebook'

module Undevise
  class Engine < ::Rails::Engine
    isolate_namespace Undevise
  end
end

After listing required dependencies inside your engine, restart your main Rails app, then visit http://localhost:3000/auth/twitter/

When you visit http://localhost:3000/auth/twitter/, you should see the error above. The callback url is part of OmniAuth’s behaviour and should be fixed by adding a route in your engine and adding the method to handle it in your controller.

# undevise/config/routes.rb
Undevise::Engine.routes.draw do
  root :to => 'auth#index'
  match ':provider/callback' => 'auth#callback'
end

# undevise/app/controllers/undevise/auth_controller.rb
module Undevise
  class AuthController < ApplicationController

    def index
      render :text => 'Hello world'
    end

    def callback
      render :text => "Hello from #{params[:provider]}"
    end

  end
end

If everything work fine, you should see a message from Twitter.

We only scratched the surface with Rails 3 engine. Your engine, much like any normal Rails app, can have models and migrations, javascripts, css, specs, etc. If you want to dig deeper into engines, I recommend Rails 3 in Action by Ryan Bigg and Yehuda Katz. It includes a whole chapter about engines, discussion of middleware, and how tests your engine.

More Ruby tips and tricks

String to number conversion gotcha

>> Float('3.14159')
=> 3.14159 
>> '3.14159'.to_f
=> 3.14159 

# However, Float() method will return an exception if given
# a bad input while to_f() will ignore everything from the 
# offending character.

>> Float('3.x14159')
ArgumentError: invalid value for Float(): "3.x14159"
	from (irb):4:in 'Float'
	from (irb):4

>> '3.x14159'.to_f
=> 3.0


# Similar case with to_i() and Integer().

>> Integer('19x69')
ArgumentError: invalid value for Integer(): "19x69"
	from (irb):15:in 'Integer'
	from (irb):15
	from /Users/greg/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in '<main>'

>> '19x69'.to_i
=> 19 

Case insensitive regular expression

# Regex is case sensitive by default. 
# Adding 'i' for insensitive match
puts 'matches' if  /AM/i =~ 'am'

Hash is ordered in 1.9

# new syntax in 1.9
h = {first: 'a', second: 'b', third: 'c'}

# hashes in 1.9 are ordered
h.each do |e|
  pp e
end

Filter a list using several conditions

conditions = [
    proc { |i| i > 5 },
    proc { |i| (i % 2).zero? },
    proc { |i| (i % 3).zero? }
  ]

matches = (1..100).select do |i|
  conditions.all? { |c| c[i] }
end

Randomly pick an element from an array

>> [1,2,3,4,5].sample
=> 2 
>> [1,2,3,4,5].sample
=> 1 

# pick 2 random elements

>> [1,2,3,4,5].sample(2)
=> [1, 5]

List methods unique to a class

# List all instance methods that starts with 're' 
# including those inherited by String.

>> String.instance_methods.grep /^re/
=> [:replace, :reverse, :reverse!, :respond_to?, :respond_to_missing?] 

# List methods unique to String, i.e. not include 
# those defined by its ancestors.

>> String.instance_methods(false).grep /^re/
=> [:replace, :reverse, :reverse!]

Globbing key-value pairs

>> h = Hash['a', 1, 'b', 2]
=> {"a"=>1, "b"=>2}

>> h = Hash[ [ ['a', 1], ['b', 2] ] ] 
=> {"a"=>1, "b"=>2}

>> h = Hash[ 'a' => 1, 'b' => 2 ]
=> {"a"=>1, "b"=>2}

# The first form is very useful for globbing key-value pairs in Rails’ routes. For example, if you have the following:

# route definition in Rails 3
match 'items/*specs' => 'items#specs'

# sample url
http://localhost:3000/items/year/1969/month/7/day/21

# params[:specs] will be set

>> params[:specs]
=> "year/1969/month/7/day/21"

>> h = Hash[*params[:specs].split('/')]
=> {"year"=>"1969", "month"=>"7", "day"=>"21"}

Scala loops

Today, I finally got to play with Scala. I’ve been using mainly Ruby (and loving it) for most of my projects but I promised myself (in one of my new year’s resolution) that I will learn a new programming language this year. So from time to time, I will sprinkle this blog with some Scala love.

Now, for some mandatory hello world program. Nah. The code below simply prints the arguments to your Scala program.

var i = 0
while (i < args.length) {
  println("hello " + args(i) + "!")
  i += 1
}

// Assuming you have saved the code in hello.scala, do the following to run the code:
// scala hello.scala arg1 arg2 arg3

The code above is an imperative style used normally in languages like C or C++. Most new programmers are exposed to this style.

Being a Ruby developer, I quickly wondered if there is something similar to “.each” method? Yes, there is.

args.foreach(arg => println(arg))
args.foreach(println)

Above is a functional style and is something you will be comfortable with as you improve your programming skill. In Scala, the second line is possible if the function accepts a single argument.

Since ‘println’ is just a function, you could just replace it with your own function.

def printer(x: String) = {
  println(x.toUpperCase)
}

args.foreach(arg => printer(arg))

Yes, you guessed right. You can also define anonymous function.

args.foreach(arg => { println(arg.toUpperCase) })

When I first heard that Scala is somewhat related to Java, my first thought was it is going to be another verbose language. Well, I was wrong 🙂 It appears to be a fun and powerful language worth learning.