rune.hammersland.net: Using ruby.h to extend Ruby with C

This page is in English so that readers other than Norwegians can benefit from the writeup. I’m writing this mostly for myself, but also so that others interested in the subject have one more introduction to choose from. While looking for information, I read the ONLamp article Extending Ruby with C by Garrett Rooney, the Extending Ruby chapter in the Pickaxe, and README.EXT (located at /usr/share/doc/ruby1.8-dev/README.EXT.gz on my Ubuntu system), and got some help from Kjetil.

The resulting file can be found here: wtmplist.c.

Disclaimer: I’m not responsible if you should destroy something with your newfound skills, nor am I responsible if you should suddenly get the urge to leave your wife/loved ones in favor of Ruby.

What to Extend?

Obviously you start by figuring out what you want to extend Ruby with. Some library missing, or better implemented in C? Some library you wrote yourself and want to be able to use from Ruby scripts?

I have a project coming up where I’ll have to look into /var/log/wtmp to get login entries, and I didn’t find any classes or modules that made this possible in Ruby. There are C functions, documented in the man pages, for accessing the structures in the file, so I started to look at the introductions to extending Ruby (regrettably I somehow forgot about the Pickaxe at this point, but Kjetil reminded me after I had been cursing for a while).

Moving on to the next step …

How Do I Want the Interface to Be?

The point here is to think how you would like the information and methods represented in Ruby. In my case, I want to read in all entries in wtmp, and have them available in a list of some sort. In Ruby that means using the array. So an Array of entries doesn’t sound too bad. To make things cleaner, I’ll also put it in a module. This maps to this Ruby code (roughly):

module Wtmp

    class List < Array

        def initialize
            # open wtmp
            # wtmp.each_struct do
            #     self.push Entry.new
            # end
            # close wtmp
        end

    end

end

So our job is to make the pseudo code prefixed with hashes happen in C. To store the structures, Kjetil suggested we use a hash, which is an excellent idea. Thus we get:

module Wtmp

    class Entry < Hash

        def initialize
            # Maybe something here?
        end

    end

end

Get to it! #include <ruby.h>

Groundwork.

So let’s code this in C, using ruby.h to create the module, the classes, and even some of the array and hash methods. First we have to create a function that initializes our library, and creates the module and classes. This has to be named Init_<basename of C file>. Since my file is named wtmplist.c, my function will be called Init_wtmplist:

#include <utmp.h>
#include <ruby.h>

static VALUE mWtmp;
static VALUE cList;
static VALUE cEntry;

static VALUE
cList_initialize (VALUE self)
{
    // Empty for now, but a function declared to return VALUE must return one.
    return self;
}

void
Init_wtmplist ()
{
    mWtmp = rb_define_module("Wtmp");
    cList = rb_define_class_under (mWtmp, "List", rb_cArray);
    cEntry = rb_define_class_under (mWtmp, "Entry", rb_cHash);

    rb_define_method (cList, "initialize", cList_initialize, 0);
}

OK. Now we have a basic skeleton to work with. Let’s start at the top. utmp.h is included to get the functions we’ll use later on to access the structures in the wtmp file. In ruby.h lies all the goodness we’ll need to use to make this available to our Ruby scripts.

So … What are those VALUE thingies? Since Ruby is a dynamically typed language, all variables are references to objects, and our methods will therefore need to return references. Such a reference is a VALUE: an integer type wide enough to hold a pointer, which normally points to the object (a few things, like small integers, nil, true and false, are encoded directly in the VALUE instead). This makes things transparent when you use it in Ruby. When we need to get data from a VALUE later on, we’ll need to convert it to a primitive C datatype (but more on that later). VALUE mWtmp is the reference to the module we’ll be making. The cList_initialize function will be List’s initialize method, but for now it’s empty.
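To make the idea of a VALUE a bit more concrete, here is a small plain C sketch of how a small integer can be packed directly into a pointer-sized word instead of pointing at an object. It mirrors the scheme MRI uses for Fixnums (shift left by one bit, set the lowest bit), but the type and function names (sketch_value, sketch_from_long and so on) are made up for illustration. The real conversions are the macros in ruby.h, and this sketch does no range checking.

```c
#include <stdint.h>

/* Hypothetical stand-in for ruby.h's VALUE: an unsigned integer
 * wide enough to hold a pointer. */
typedef uintptr_t sketch_value;

/* Pack a small integer: shift left one bit and set the low bit as a tag.
 * The shift is done on the unsigned type to avoid undefined behaviour
 * for negative numbers. No range check: very large values lose a bit. */
sketch_value
sketch_from_long (long n)
{
    return ((sketch_value) n << 1) | 1u;
}

/* A packed integer always has its lowest bit set. Pointers to heap
 * objects are aligned, so their lowest bit is 0. */
int
sketch_is_small_int (sketch_value v)
{
    return (int) (v & 1u);
}

/* Undo the packing. Read as a signed number the word is 2n + 1,
 * so subtracting 1 and halving gives back n exactly. */
long
sketch_to_long (sketch_value v)
{
    return (long) (((intptr_t) v - 1) / 2);
}
```

This is why converting between small integers and VALUEs is cheap: no object is allocated, the number simply travels inside the reference.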

Then the function to initialize our library. First we define the module Wtmp, using rb_define_module(). We’ll store it in our mWtmp variable. Not much magic here. Then we create our classes using rb_define_class_under() to put them under our newly created module. As you can see, this function takes three arguments: the module to put the class under, the name of the class, and the class to inherit from. If we were to create a generic class and not inherit from any class in particular, we’d use rb_cObject as the last argument. And lastly we define List’s initialize method using rb_define_method(). You should read more about this function in README.EXT.

Make it Compile.

Now we’ll make it compile so it becomes possible to test that things actually work. There are a couple of ways to compile it: by hand, by writing a Makefile, or by getting Ruby to write a Makefile. I’ll cover the last, as you’ll be able to figure out the other two after doing this.

Edit ./extconf.rb, and add these two lines:

require 'mkmf'
create_makefile('wtmplist')

… where wtmplist is the name of the library. Then run: ruby extconf.rb, and behold: a Makefile! Quickly: make, and you’ll have a library file available as wtmplist.so. If you require this in Ruby, you should now be able to make a Wtmp::List and a Wtmp::Entry, using their new methods: Wtmp::List.new.

Collect the Data to Make Available.

So now we need to get the contents of the wtmp file, and find a way to make it available through the Entry class. We’ll do this in the initialize method of the List class, so creating a new list will automatically populate it with Entries.

static VALUE
cList_initialize (VALUE self)
{
    struct utmp *result;
    utmpname (_PATH_WTMP);
    setutent ();

    while ((result = getutent ()) != NULL) {
        // Empty for now.
    }

    endutent ();
    return (self);
}

First we create a pointer to a utmp struct, which will point at one record at a time. Then we specify which file to read from (_PATH_WTMP is a constant that holds the path to the wtmp file, usually /var/log/wtmp), and call setutent() to open the file and set the file pointer at the start of the file. Then we create a while loop to get one structure at a time, until getutent() returns NULL. The pointer refers to storage owned by the C library, so we must not free() it. When we’re finished, we close the file with endutent() and return the new object.
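The same open/read/close cycle can be tried in plain C before wiring it into Ruby. The sketch below counts the login records in a utmp-format file. count_login_records is a hypothetical helper name, and filtering on ut_type == USER_PROCESS is my assumption about which records are interesting (the record types are described in the utmp man page). It assumes a glibc-style utmp.h.

```c
#include <stddef.h>
#include <utmp.h>

/* Count the USER_PROCESS records in a utmp-format file such as
 * /var/log/wtmp. Returns -1 if the file name could not be set.
 * On glibc, a missing or unreadable file simply yields 0 records. */
long
count_login_records (const char *path)
{
    struct utmp *rec;
    long count = 0;

    if (utmpname (path) != 0)
        return -1;

    setutent ();                        /* rewind to the first record */
    while ((rec = getutent ()) != NULL) {
        /* rec points at storage owned by the C library: never free() it */
        if (rec->ut_type == USER_PROCESS)
            count++;
    }
    endutent ();                        /* close the file again */

    return count;
}
```

Calling count_login_records (_PATH_WTMP) reads the same file the extension uses.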

But how do we make the data available through the Entry class? Since the List class inherits Array, we can push objects to self (put them in the array). We also want to store the values as a hash (which the Entry class inherits), so creating an instance of the Entry class and setting the elements should do the trick. We’ll also want to use symbols as keys instead of strings. More code (put this inside the while loop):

        VALUE entry = rb_class_new_instance (0, 0, cEntry);

        rb_hash_aset (entry,
                      ID2SYM(rb_intern("name")),
                      rb_str_new2(result->ut_name));

        rb_hash_aset (entry,
                      ID2SYM(rb_intern("pid")),
                      INT2FIX(result->ut_pid));

        rb_hash_aset (entry,
                      ID2SYM(rb_intern("time")),
                      INT2NUM(result->ut_time));

        rb_ary_push (self, entry);

To create a symbol, we need to use the ID2SYM macro, which converts an ID (Ruby’s internal integer handle for a name) to a Symbol. To get the ID for a string, we use rb_intern(), so ID2SYM(rb_intern("name")) creates the symbol :name. To set the value of the key, we need to convert the C types to Ruby VALUEs. The function rb_str_new2() takes a C string (char *str) and makes a Ruby String object. The macro INT2FIX creates a Fixnum from an integer without checking its size, so it is only safe for values known to be small, while INT2NUM creates a Fixnum or a Bignum, depending on how big the integer is. At the end of the while loop, we push the new Entry onto the array (self), using rb_ary_push(), which takes an array and an object to push as arguments.
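One caveat with rb_str_new2(): it expects a NUL-terminated string, while the string fields in struct utmp are fixed-size arrays that are only terminated when the value is shorter than the field (see the utmp man page). A defensive approach is to measure the field with a bounded scan first. The helpers below are a plain C sketch with hypothetical names. In the extension itself, the same length could presumably be passed to rb_str_new(), which takes a pointer and a length.

```c
#include <stddef.h>
#include <string.h>

/* Length of a fixed-size field that may or may not be NUL-terminated:
 * stop at the first NUL byte, or at the end of the field. */
size_t
fixed_field_len (const char *field, size_t field_size)
{
    const char *nul = memchr (field, '\0', field_size);

    return nul != NULL ? (size_t) (nul - field) : field_size;
}

/* Copy such a field into dst as a proper C string, truncating if dst
 * is too small. Returns the number of characters copied. */
size_t
copy_fixed_field (char *dst, size_t dst_size,
                  const char *field, size_t field_size)
{
    size_t len;

    if (dst_size == 0)
        return 0;

    len = fixed_field_len (field, field_size);
    if (len > dst_size - 1)
        len = dst_size - 1;

    memcpy (dst, field, len);
    dst[len] = '\0';
    return len;
}
```

With a struct utmp *result, this would be called as copy_fixed_field (buf, sizeof buf, result->ut_name, sizeof result->ut_name).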

The C code we just wrote maps to this Ruby code (roughly):

entry = Entry.new

entry[:name] = # some value we get with C
entry[:pid]  = # some value we get with C
entry[:time] = # some value we get with C

self.push entry

Adding More Functionality.

I want to add a method to find out if a process is still running (so it’s possible to find out who is logged on to the system). To accommodate this, I’ll add a method to the Entry class, Entry#alive?, and a function to the module, Wtmp::alive?. Since both will share some code, I’ll put that part in a separate function that is not available from Ruby.

static int
_alive (pid_t pid)
{
    if (pid <= 0) { return 0; }

    // Straight from last.c
    if (kill (pid, 0) != 0 && errno == ESRCH)
        return 0;
    else
        return 1;
}

To get this to compile, you need to include <signal.h>, <sys/types.h> and <errno.h>. The code is copied from last.c and calls kill() with signal number 0. This sends no signal at all (so nothing is killed), but the existence and permission checks are still performed, and errno tells us whether the process exists. To hook this up in the class and the module, we need to create three new functions:
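To make the signal 0 trick more explicit, here is a slightly more verbose sketch that separates the possible outcomes. The enum and function names are hypothetical, not taken from last.c. kill() with signal 0 delivers nothing but still reports errors: ESRCH means there is no such process, while EPERM means the process exists but we may not signal it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

enum pid_state { PID_INVALID, PID_GONE, PID_ALIVE };

/* Classify a PID without disturbing the process. */
enum pid_state
probe_pid (pid_t pid)
{
    /* kill() gives 0 and negative PIDs a special meaning (process
     * groups, or every process we may signal), so reject them here. */
    if (pid <= 0)
        return PID_INVALID;

    if (kill (pid, 0) == 0)
        return PID_ALIVE;               /* exists, and we may signal it */

    /* ESRCH: no such process. Anything else (typically EPERM) means
     * the process exists but we are not allowed to signal it. */
    return errno == ESRCH ? PID_GONE : PID_ALIVE;
}
```

The guard on pid <= 0 is also why an Entry with a PID of 0 or -1 can never be reported as alive.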

static VALUE
mWtmp_alive (VALUE self, VALUE arg)
{
    Check_Type (arg, T_FIXNUM);
    pid_t pid = FIX2INT (arg);

    return _alive (pid) ? Qtrue : Qfalse;
}

static VALUE
cEntry_initialize (VALUE self)
{
    rb_hash_aset (self,
                  ID2SYM(rb_intern("pid")),
                  INT2FIX(-1));
    return self;
}

static VALUE
cEntry_alive (VALUE self)
{
    pid_t pid = FIX2INT (rb_hash_aref (self, ID2SYM(rb_intern("pid"))));

    return _alive (pid) ? Qtrue : Qfalse;
}

… and define them as methods of the module and the Entry class:

    // Add this to Init_wtmplist():
    rb_define_module_function (mWtmp, "alive?", mWtmp_alive, 1);
    rb_define_method (cEntry, "initialize", cEntry_initialize, 0);
    rb_define_method (cEntry, "alive?", cEntry_alive, 0);

Time for some more explaining. Firstly, you’ll notice I added an initialize method to Entry as well. This is to set the PID to an invalid value, so if you just create an empty Entry object (with no PID set), there is no way our alive? method will return true. The code should be recognizable.

The mWtmp_alive() function takes one argument (in addition to VALUE self, which every Ruby method implemented in C takes), and checks that this is a Fixnum using the Check_Type macro. This macro raises a TypeError exception if the wrong kind of argument is supplied. Then we just convert the Fixnum VALUE to a pid_t (which is an integer on most systems), and call the C function we just wrote with this PID. We then return Qtrue or Qfalse (true and false in Ruby) depending on the result.

In the cEntry_alive() function we do something similar, but instead of getting the PID as an argument to the method, we look it up in the hash (which is self) with rb_hash_aref(), using self and the key as arguments, and convert the answer to an integer.

To hook it up to the module and the class, we add a couple of lines to Init_wtmplist() to define the module function and the two instance methods of Entry. The module function takes one parameter (remember the PID argument), and that’s why the last argument is 1 (again: read README.EXT to get an explanation for the rb_define_method() function and the rb_define_module_function() function).

Test Your New Library.

OK. You should of course do this during development, but I wrote it up last anyway. Of course Kjetil and I had to test the functions with Ruby scripts and printf’s in the C code while developing, to iron out bugs and find out how to get things to work. When we “finished”, we (and by we I mean Kjetil) wrote a unit test:

#!/usr/bin/env ruby
# test.rb

require 'wtmplist.so'
require 'test/unit'

class TestWtmpList < Test::Unit::TestCase
  def test_add
    ignore = %w(LOGIN reboot runlevel shutdown)

    list = Wtmp::List.new
    legit = list.delete_if {|x| ignore.include? x[:name]}
    test = legit.last.dup
    test[:pid] = 0
    assert_equal(false, test.alive?)
    assert_equal(false, Wtmp::Entry.new.alive?)
    assert_equal(true, Wtmp::alive?(1))
    assert_equal(false, Wtmp::alive?(-1))
  end
end

This basically creates a new List, removes entries we’re not interested in, and checks some assertions on the methods. For instance, we know that a PID of 0 should not count as alive, and neither should an Entry without a PID specified. PID 1 will always exist (as this is init), so it’s safe to assert that it is alive.

Running this script gives:

$ ruby test.rb 
Loaded suite unittest
Started
.
Finished in 0.030911 seconds.

1 tests, 5 assertions, 0 failures, 0 errors

Looks like it’s all good. :)
