Redis Lua scripting is badass

Roughly a year ago Salvatore Sanfilippo, the author of Redis, wrote a blog post discussing the inclusion of Lua as a scripting language. I finally decided to try this out, and let’s just say it’s pretty badass.

Lua is a great fit for Redis: they share similar philosophies, being simple, small, and fast. Suppose, for example, you have 200,000 jobs, each represented in Redis as a hash, and you want to map/reduce the job duration. The new scripting capabilities make this really easy!

Here’s the node setup script to generate these jobs:

  var redis = require('redis')
    , db = redis.createClient();

  var n = 500000
    , pending = n
    , ms;

  while (n--) {
    ms = Math.random() * 200 | 0;
    db.hset('job:' + n, 'duration', ms, function(){
      --pending || process.exit();
    })
  }

Next, here is what you might write in your host language without the new Redis scripting feature, manually reducing the values:

  var redis = require('redis')
    , db = redis.createClient();

  var n = 200000
    , start = new Date
    , pending = n
    , ms = 0;

  while (n--) {
    db.hget('job:' + n, 'duration', function(err, n){
      if (err) throw err;
      ms += ~~n;
      --pending || (function(){
        console.log('%d minutes spent processing jobs', ms / (1000 * 60) | 0);
        console.log('took %ds', (new Date - start) / 1000 | 0);
        process.exit();
      })();
    })
  }

On my Air this took roughly 7s, which is not too great. Keep in mind that there is no throttling here; I’m just plastering it with 200k commands. Now let’s try it with Lua! The following script is ad-hoc, but it’ll do the trick. To signal an error, all you have to do is return a table with the err slot. redis.call() is effectively the public Redis API exposed to your Redis script, so you can use it just like you would your host language Redis bindings or redis-cli(1).

   local sum = 0
   for i = 0, 199999, 1 do
     local key = "job:" .. i
     local ms = tonumber(redis.call("hget", key, "duration"))
     if ms == nil then return { err = key .. " is not an integer" } end
     sum = sum + ms
   end
   return sum

Here I’ve embedded it in the JS script, but you could of course generate these, load them from files, etc. (beware of redis-injection?).

   var redis = require('redis')
     , db = redis.createClient();

   var script = '\
   local sum = 0 \
   for i = 0, 199999, 1 do \
     local key = "job:" .. i \
     local ms = tonumber(redis.call("hget", key, "duration")) \
     if ms == nil then return { err = key .. " is not an integer" } end \
     sum = sum + ms \
   end \
   return sum';

   var start = new Date;

   db.eval(script, 0, function(err, ms){
     if (err) throw err;
     console.log('%d minutes spent processing jobs', ms / (1000 * 60) | 0);
     console.log('took %dms', new Date - start | 0);
     process.exit();
   });

After running the script with the new EVAL command, what previously took several seconds dropped to ~850ms. Much better. EVAL and EVALSHA are actually a lot more flexible than I’ve explained here, accepting keys and arbitrary arguments.
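
As a rough sketch of the keys-and-arguments form, the script below receives the job keys through KEYS and the hash field name through ARGV[1] instead of hard-coding them. The helper name sumField is made up for illustration, and db is assumed to be the same callback-style node client used above, passed in by the caller.

```javascript
// Lua: sum one hash field across every key passed in KEYS.
// ARGV[1] is the field name (e.g. "duration").
var script = [
  'local sum = 0',
  'for _, key in ipairs(KEYS) do',
  '  local ms = tonumber(redis.call("hget", key, ARGV[1]))',
  '  if ms == nil then return { err = key .. " is not an integer" } end',
  '  sum = sum + ms',
  'end',
  'return sum'
].join('\n');

// Hypothetical helper: `db` is assumed to be a callback-style client
// exposing eval(script, numkeys, key..., arg..., callback).
function sumField(db, keys, field, fn) {
  var args = [script, keys.length].concat(keys, [field, fn]);
  db.eval.apply(db, args);
}

// Usage (needs a running Redis with scripting support):
//
//   var db = require('redis').createClient();
//   sumField(db, ['job:0', 'job:1', 'job:2'], 'duration', function(err, ms){
//     if (err) throw err;
//     console.log('total: %dms', ms);
//   });
```

This suits a handful of keys. For the 200k-job case above you would probably still loop inside the script rather than ship every key name over the wire.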

Check out antirez.com/post/an-update-on-redis-and-lua.html for more.

Mocha string diffs

Mocha 0.14.0 adds a single feature: string diffs! This is a small but very handy feature for some. When the strings are small, Mocha uses a character diff. When they span several lines, a line-numbered “gutter” is added and a word diff is used, as shown in the following image:

mocha string diff example

This is very useful when authoring things like template engines, transpilers, and other string-based libraries. For example, the Stylus test suite consists of nothing but acceptance tests: the input file is compiled, and the resulting CSS is checked using actual.trim().should.equal(css);. In combination with Mocha’s BDD interface I can now simply iterate through the files and define test-cases as shown here:

  var stylus = require('../')
    , fs = require('fs');

  // test cases

  var cases = fs.readdirSync('test/cases').filter(function(file){
    return ~file.indexOf('.styl');
  }).map(function(file){
    return file.replace('.styl', '');
  });

  describe('integration', function(){
    cases.forEach(function(test){
      var name = test.replace(/[-.]/g, ' ');
      it(name, function(){
        var path = 'test/cases/' + test + '.styl';
        var styl = fs.readFileSync(path, 'utf8');
        var css = fs.readFileSync('test/cases/' + test + '.css', 'utf8');

        var style = stylus(styl)
          .set('filename', path)
          .include(__dirname + '/images')
          .include(__dirname + '/cases/import.basic')
          .define('url', stylus.url());

        if (~test.indexOf('compress')) style.set('compress', true);

        style.render(function(err, actual){
          if (err) throw err;
          actual.trim().should.equal(css);
        });
      })
    });
  })

Now if something were to go wrong, you get a nice diff!

stylus diff example

Assertion library authors

To support this feature, all you have to do is populate err.expected and err.actual with their respective values; Mocha will take care of the presentation.
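
A minimal sketch of what that could look like in a tiny assertion helper. The function name equal and the error message format are made up for illustration; only the err.actual and err.expected properties come from the description above.

```javascript
// Hypothetical minimal assertion helper; the name `equal` is made up here.
function equal(actual, expected) {
  if (actual === expected) return;
  var err = new Error('expected ' + JSON.stringify(actual)
    + ' to equal ' + JSON.stringify(expected));
  // The two properties Mocha is described as reading for the diff.
  err.actual = actual;
  err.expected = expected;
  throw err;
}

// Usage inside a test:
//
//   it('should render css', function(){
//     equal(actual.trim(), css);
//   });
```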

NOTE: I just toggled the colors, green is now the expected color

commander.js - nodejs command-line interfaces made easy

Commander.js is a small node.js module allowing you to define options and interact with the user’s terminal in a simple and natural way, inspired by the Ruby library of the same name.

Features

  • self-documenting code
  • auto-generated help
  • combined short flags (“-abc” == “-a -b -c”)
  • option defaults
  • option coercion
  • command parsing
  • prompts

Example

A basic commander program looks something like the following (taken from serve). It’s extremely easy to see what’s going on: all the options provided by the executable are laid out in front of you.

program
  .version('0.0.1')
  .option('-p, --port <port>', 'specify the port [3000]', Number, 3000)
  .option('-H, --hidden', 'enable hidden file serving')
  .option('-I, --no-icons', 'disable file icons')
  .option('-L, --no-logs', 'disable request logging')
  .parse(process.argv);

In the previous example only --port accepts an argument, and the value of program.port defaults to 3000. The options --no-icons and --no-logs default their properties to true; only when --no-icons is specified will program.icons be false.
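
The negated-flag rule can be sketched in a few lines. The function parseBooleanFlags is hypothetical and is not how commander is implemented; it covers boolean flags only and leaves plain flags such as --hidden unset until they are passed, which is an assumption of this sketch.

```javascript
// Hypothetical sketch (not commander's implementation) of how
// "--no-*" flags default their property to true.
function parseBooleanFlags(longFlags, argv) {
  var opts = {};
  longFlags.forEach(function(flag){
    var negated = flag.indexOf('--no-') === 0;
    var name = flag.replace(/^--(no-)?/, '');
    // negated flags start out true
    if (negated) opts[name] = true;
    // passing the flag sets true, or false for the "--no-" form
    if (~argv.indexOf(flag)) opts[name] = !negated;
  });
  return opts;
}

var flags = ['--hidden', '--no-icons', '--no-logs'];
console.log(parseBooleanFlags(flags, []));
// { icons: true, logs: true }
console.log(parseBooleanFlags(flags, ['--no-icons', '--hidden']));
// { hidden: true, icons: false, logs: true }
```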

The usage information is free!:

$ serve --help

  Usage: serve [options]

  Options:

    -v, --version      output the version number
    -p, --port <port>  specify the port [3000]
    -H, --hidden       enable hidden file serving
    -I, --no-icons     disable file icons
    -L, --no-logs      disable request logging
    -h, --help         output usage information

Utilities

Commander is bundled with some utilities for prompting user input, confirmations, passwords, lists of choices etc. Most of these utilities will ask for input again if the user simply hits enter when they should respond.

Below is an example of asking for a name using a single-line input prompt:

program.prompt('name: ', function(name){
  console.log('hi %s', name);
});

Multi-line input is easy too, just leave out the trailing space in the message:

program.prompt('description:', function(desc){
  console.log('description was "%s"', desc.trim());
});

Coercion is useful for dates, numbers etc:

program.prompt('Age: ', Number, function(age){
  console.log('age: %j', age);
});

Password prompts masking off input:

program.password('Password: ', function(pass){
  console.log('got "%s"', pass);
});

Or providing a mask char:

program.password('Password: ', '*', function(pass){
  console.log('got "%s"', pass);
});

Confirmations require “yes” or “y” to result in true:

program.confirm('continue? ', function(ok){
  console.log(' got %j', ok);
});

There’s also choice support, so users can select from a list:

var list = ['tobi', 'loki', 'jane', 'manny', 'luna'];

console.log('Choose the coolest pet:');
program.choose(list, function(i){
  console.log('you chose %d "%s"', i, list[i]);
});

presenting:

Choose the coolest pet:
  1) tobi
  2) loki
  3) jane
  4) manny
  5) luna

Commands

Though I haven’t had time to polish them up yet, commander supports the idea of… well… “commands”. The “root” executable is an instanceof Command, and you can recursively define these to create a rich interface. Git is a great example of this: many larger utilities use sub-commands such as git remote to accept arguments, and each may then have its own options etc, using the same API as the root command. The following is a simple example:

#!/usr/bin/env node

var program = require('../');

program
  .version('0.0.1')
  .option('-C, --chdir <path>', 'change the working directory')
  .option('-c, --config <path>', 'set config path [./deploy.conf]')
  .option('-T, --no-tests', 'ignore test hook');

// $ deploy setup stage
// $ deploy setup
program
  .command('setup [env]')
  .description('run setup commands for all envs')
  .action(function(env){
    env = env || 'all';
    console.log('setup for %s env(s)', env);
  });

// $ deploy stage
// $ deploy production
program
  .command('*')
  .action(function(env){
    console.log('deploying "%s"', env);
  });

program.parse(process.argv);

Moar Libraries!

Don’t forget to check out these other great CLI-related libraries:

Reds - light-weight full text search for nodejs backed by redis

Reds (red-s) is a very small (~300LOC) light-weight full text search library for node.js.

I wrote reds with Kue in mind, a priority job queue for node. I wanted to add search capabilities so you can easily find jobs by any of the arbitrary data provided, names, emails, anything.

API

You can use reds for multiple isolated search indexes, which is why you must pass a key to reds.createSearch(), as it’s used for namespacing.

 var reds = require('reds');
 var search = reds.createSearch('pets');

As mentioned, this library could be used with anything. To illustrate this we can even use it with a regular JavaScript array by indexing the value indices as shown below, where the first value passed to search.index() is the text, and the second is the id.

 var strs = [];
 strs.push('Tobi wants four dollars');
 strs.push('Tobi only wants $4');
 strs.push('Loki is really fat');
 strs.push('Loki, Jane, and Tobi are ferrets');
 strs.push('Manny is a cat');
 strs.push('Luna is a cat');
 strs.push('Mustachio is a cat');

 strs.forEach(function(str, i){ search.index(str, i); });

Within another process, or the same one, we can then invoke search.query() with a string and a callback, which is invoked with a possible error and an array of ids. With these ids we can then determine their original values, be it an array, object in another data store etc.

 search.query('luna cat', function(err, ids){
   if (err) throw err;
   console.log('Search results:');
   ids.forEach(function(id){
     console.log('  - %s', strs[id]);
   });
 });

Producing:

  Search results:
     - Luna is a cat

You may also remove an id from the index:

  search.remove(id[, callback]);

Implementation

While reds is backed by Redis you can easily use reds with any other data store, as it simply indexes by arbitrary numeric or string ids. This means you could create an index of files on disk, mongodb documents, urls, anything.

The indexing process works like this:

- tokenize words
- strip stop words ("about", "after", "am", "an", ...)
- stem words
- apply metaphone
- add id to metaphone constant set 

The process of stemming the words and applying the metaphone algorithm provides leeway so the user does not need exact matches. For example, thanks to metaphone the names “steven” and “stephen” both resolve to the constant STFN. The process of stemming reduces variants to their stem, for example “counting” becomes “count”, and “waits” becomes “wait”.

When a query is performed the same sequence is applied, resulting in an array of metaphone constants, providing us with the keys necessary to perform the Redis union or intersection to fetch our ids.
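
To make the pipeline concrete, here is an in-memory sketch of the same steps, with a Map of Sets standing in for the Redis sets. Everything here is illustrative and not reds’ actual code: the function names are made up, the stop-word list contains only the four words quoted above, and stem() and phoneticKey() are deliberately crude placeholders rather than the Porter stemmer or a real metaphone implementation.

```javascript
// Only the stop words quoted in the list above; the real list is longer.
var STOP_WORDS = new Set(['about', 'after', 'am', 'an']);

function tokenize(str) {
  return str.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Placeholder stemmer: strips a trailing "ing" or "s" from longer words.
function stem(word) {
  return word.length > 3 ? word.replace(/(ing|s)$/, '') : word;
}

// Placeholder phonetic key (NOT real metaphone): fold "ph" and "v"
// into "f", drop vowels, uppercase.
function phoneticKey(word) {
  return word
    .replace(/ph/g, 'f')
    .replace(/v/g, 'f')
    .replace(/[aeiou]/g, '')
    .toUpperCase();
}

function keysFor(str) {
  return tokenize(str)
    .filter(function(w){ return !STOP_WORDS.has(w); })
    .map(stem)
    .map(phoneticKey)
    .filter(Boolean);
}

// key -> Set of ids, standing in for one Redis set per constant
var index = new Map();

function indexText(str, id) {
  keysFor(str).forEach(function(key){
    if (!index.has(key)) index.set(key, new Set());
    index.get(key).add(id);
  });
}

// Intersection: keep the ids present in every key's set.
function query(str) {
  var sets = keysFor(str).map(function(key){
    return index.get(key) || new Set();
  });
  if (!sets.length) return [];
  return Array.from(sets[0]).filter(function(id){
    return sets.every(function(set){ return set.has(id); });
  });
}

indexText('Luna is a cat', 0);
indexText('Manny is a cat', 1);
indexText('Loki is really fat', 2);
console.log(query('luna cat')); // [ 0 ]
```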

Performance

Preliminary benchmarks show that a small 1.6kb body of text is currently indexed in ~6ms, or 163 ops/s. Medium bodies such as 40kb operate around 6 ops/s, or 166ms.

Querying with a multi-word phrase, and an index containing ~3500 words operates around 5300 ops/s. Not too bad.

Indexing performance was decreased by nearly 100% by applying the Porter stemmer algorithm in Chris Umbel’s natural library; however, reds is still reasonably quick at indexing text. If you have huge documents, you may want to consider allowing users to specify a description instead.

Future

  • use sorted sets for ordering and priority
  • ranges
  • sorting
  • perf optimization if necessary

Jade Mixins & Includes

The latest release of Jade, 0.13.0, adds mixins and includes. Jade users have longed for some method of static include, so it’s (finally) here, and complements mixins nicely, allowing you to store mixins in one or more separate files.

Mixins

A mixin definition takes the form mixin <name> [( params )] block, where params is identical to JavaScript function params, simply a list, as it’s converted to a JavaScript function within Jade to become part of the output function.

Using a mixin looks identical to defining one, but omits the block. Mixins allow you to encapsulate repeated chunks of a template, such as form fields as shown in the following snippet. This example has a local variable user passed to the template with the properties { name: '...', email: '...' }, which is then accessible throughout any mixins defined without explicitly passing it (though you may if you like).

mixin field(type, name, label)
  .field(class='field-' + type)
    label #{label}:
    input(type=type, name='user[#{name}]', value=user[name])

form
  mixin field('text', 'name', 'Username')
  mixin field('text', 'email', 'Email')
  mixin field('password', 'pass', 'Password')
  input(type='submit', value='Sign Up')

Outputting:

<form>
  <div class="field field-text">
    <label>Username:</label>
    <input type="text" name="user[name]" value="TJ Holowaychuk"/>
  </div>
  <div class="field field-text">
    <label>Email:</label>
    <input type="text" name="user[email]" value="tj@learnboost.com"/>
  </div>
  <div class="field field-password">
    <label>Password:</label>
    <input type="password" name="user[pass]"/>
  </div>
  <input type="submit" value="Sign Up"/>
</form>
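
Since a mixin becomes a JavaScript function, one rough way to picture the example above is a plain function that closes over the template locals. This is only an analogy and not the code Jade generates: render() and the buf array are made up for illustration, the output is unindented, and there is no escaping.

```javascript
// Rough analogy only: this is NOT the code Jade generates.
function render(locals) {
  var user = locals.user;
  var buf = [];

  // plays the role of `mixin field(type, name, label)`;
  // `user` is visible here without being passed in
  function field(type, name, label) {
    var value = user[name];
    buf.push('<div class="field field-' + type + '">');
    buf.push('<label>' + label + ':</label>');
    buf.push('<input type="' + type + '" name="user[' + name + ']"'
      + (value == null ? '' : ' value="' + value + '"') + '/>');
    buf.push('</div>');
  }

  buf.push('<form>');
  field('text', 'name', 'Username');
  field('text', 'email', 'Email');
  field('password', 'pass', 'Password');
  buf.push('<input type="submit" value="Sign Up"/>');
  buf.push('</form>');
  return buf.join('');
}

console.log(render({
  user: { name: 'TJ Holowaychuk', email: 'tj@learnboost.com' }
}));
```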

Includes

The include <path> directive signals Jade to read and parse the file, then return its root node (a block), injecting it into the calling template, as if it were written in the same file. The <path> given is relative to the dirname of the calling template, which is exposed to Jade via the filename option. This option is populated by jade.renderFile() and Express; otherwise you need to supply it yourself.
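
The lookup rule can be sketched with node’s path module. The helper resolveInclude is hypothetical and is not Jade’s internal code; appending the “.jade” extension and the error message wording are assumptions of this sketch.

```javascript
var path = require('path');

// Hypothetical helper illustrating the lookup rule described above;
// the ".jade" extension handling is an assumption of this sketch.
function resolveInclude(includePath, options) {
  if (!options || !options.filename) {
    throw new Error('the "filename" option is required to use includes');
  }
  // includes are resolved relative to the calling template's directory
  var dir = path.dirname(options.filename);
  return path.join(dir, includePath + '.jade');
}

console.log(resolveInclude('includes/head', { filename: 'views/index.jade' }));
```

With filename set to views/index.jade, the include would be looked up next to it, under views/includes/.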

For example our mixin(s) could live in ./mixins, and our previous example could look something like below:

include mixins/form-helpers

form
  mixin field('text', 'name', 'Username')
  mixin field('text', 'email', 'Email')
  mixin field('password', 'pass', 'Password')
  input(type='submit', value='Sign Up')

Or the classic header / footer example, first index.jade:

html
  include includes/head  
  body
    h1 My Site
    p Welcome to my super lame site.
    include includes/foot

includes/foot.jade:

#footer
  p Copyright (c) foobar

includes/head.jade:

head
  title My Site
  script(src='/javascripts/jquery.js')
  script(src='/javascripts/app.js')

That’s all for now, hopefully I’ll have some time in the near future to clean things up and get 1.0 (finally again :)) out the door.