Redis Lua scripting is badass
Roughly a year ago Salvatore Sanfilippo, the author of Redis,
wrote a blog post
discussing the inclusion of Lua as a scripting language. I
finally decided to try this out, and let’s just say it’s pretty badass.
Lua is a great fit for Redis: they share a similar philosophy of being simple,
small, and fast. Suppose, for example, you have 200,000 jobs, each represented
in Redis as a hash, and you want to map/reduce the job duration. The new scripting
capabilities make this really easy!
Here’s the node setup script to generate these jobs:
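var redis = require('redis')
  , db = redis.createClient();
// number of jobs to generate: job:0 through job:199999
var n = 200000
  , pending = n
  , ms;
while (n--) {
  // random duration between 0 and 199 ms
  ms = Math.random() * 200 | 0;
  db.hset('job:' + n, 'duration', ms, function(){
    // exit once every HSET has been acknowledged
    --pending || process.exit();
  });
}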
Next, here is what you might write in your host language
without the new Redis scripting feature, manually reducing the value:
var redis = require('redis')
  , db = redis.createClient();
var n = 200000
  , start = new Date
  , pending = n
  , ms = 0;
while (n--) {
  db.hget('job:' + n, 'duration', function(err, duration){
    if (err) throw err;
    // HGET replies with a string, so coerce it to an integer
    ms += ~~duration;
    // after the final reply, report the totals and exit
    --pending || (function(){
      console.log('%d minutes spent processing jobs', ms / (1000 * 60) | 0);
      console.log('took %ds', (new Date - start) / 1000 | 0);
      process.exit();
    })();
  });
}
On my Air this took roughly 7s, which is not too great. Keep in mind that there is no throttling here; I’m just plastering it with 200k commands. Now let’s try it with Lua! The
following script is ad-hoc, but it’ll do the trick. To signal an error all you
have to do is return a table with the err slot. redis.call() is effectively
the public Redis API exposed to your Redis script, so you can use it just like you
would your host language Redis bindings or redis-cli(1).
local sum = 0
-- jobs are numbered 0 through 199999; a Lua numeric for loop includes its upper bound
for i = 0, 199999 do
  local key = "job:" .. i
  local ms = tonumber(redis.call("hget", key, "duration"))
  if ms == nil then return { err = key .. " is not an integer" } end
  sum = sum + ms
end
return sum
Here I’ve embedded it in the JS script, but you could of course generate these scripts,
load them from files, etc. (beware of redis-injection?).
var redis = require('redis')
  , db = redis.createClient();
// the trailing backslashes join the Lua source into a single line,
// so avoid Lua "--" comments inside the string
var script = '\
  local sum = 0 \
  for i = 0, 199999 do \
    local key = "job:" .. i \
    local ms = tonumber(redis.call("hget", key, "duration")) \
    if ms == nil then return { err = key .. " is not an integer" } end \
    sum = sum + ms \
  end \
  return sum';
var start = new Date;
db.eval(script, 0, function(err, ms){
  if (err) throw err;
  console.log('%d minutes spent processing jobs', ms / (1000 * 60) | 0);
  console.log('took %dms', new Date - start | 0);
  process.exit();
});
After running the script with the new EVAL command, what previously took
roughly 7s dropped to ~850ms. Much better. EVAL and EVALSHA
are actually a lot more flexible than I’ve explained here, accepting keys
and arbitrary arguments.
Check out antirez.com/post/an-update-on-redis-and-lua.html for more.
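As a rough sketch of that flexibility: EVAL takes the script, the number of keys, the key names, and then any extra arguments, which the script reads from the KEYS and ARGV tables. EVALSHA takes the SHA1 hex digest of the script in place of its source. The helper names evalArgs and scriptSha below are made up for illustration, and the parameterized script is just a variation on the one above, so treat this as a sketch rather than the recommended way to do it.

```javascript
var crypto = require('crypto');

// Variation of the script above: the job keys arrive in KEYS and the
// hash field name in ARGV[1] (an illustrative choice, not from the post).
var script = [
  'local sum = 0',
  'for _, key in ipairs(KEYS) do',
  '  local ms = tonumber(redis.call("hget", key, ARGV[1]))',
  '  if ms == nil then return { err = key .. " is not an integer" } end',
  '  sum = sum + ms',
  'end',
  'return sum'
].join('\n');

// Hypothetical helper: lays the arguments out in the order EVAL expects,
// script, numkeys, key [key ...], arg [arg ...].
function evalArgs(script, keys, args) {
  return [script, keys.length].concat(keys, args);
}

// Hypothetical helper: EVALSHA identifies a script by the SHA1 hex digest of its source.
function scriptSha(script) {
  return crypto.createHash('sha1').update(script).digest('hex');
}

var args = evalArgs(script, ['job:0', 'job:1', 'job:2'], ['duration']);
console.log(args.slice(1));
console.log(scriptSha(script));
```

With the client used above, the array could presumably be handed over with something like db.eval.apply(db, args.concat(callback)), since db.eval() already accepts the script and key count as separate arguments.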
Introducing Texty & Super Agent screencast
In this 20-minute screencast we dive into reds, the full-text Redis search library for nodejs; introduce Texty, the canvas-only text editing library, and discuss how you can style canvas drawings with CSS; and look at superagent, the “ajax with less suck” library.
Reds - light-weight full text search for nodejs backed by redis
Reds (red-s) is a very small (~300LOC) light-weight full text search library for node.js.
I wrote reds with Kue, a priority job queue for node, in mind. I wanted to add search capabilities so you can easily find jobs by any of the arbitrary data provided: names, emails, anything.
API
You can use reds for multiple isolated search indexes, which is why you must pass a key to reds.createSearch(), as it’s used for namespacing.
var search = reds.createSearch('pets');
Reds can be used with any data source. To illustrate this, we can even use it with a regular JavaScript array, using each value’s array index as its id as shown below, where the first value passed to search.index() is the text, and the second is the id.
var strs = [];
strs.push('Tobi wants four dollars');
strs.push('Tobi only wants $4');
strs.push('Loki is really fat');
strs.push('Loki, Jane, and Tobi are ferrets');
strs.push('Manny is a cat');
strs.push('Luna is a cat');
strs.push('Mustachio is a cat');
strs.forEach(function(str, i){ search.index(str, i); });
Within another process, or the same one, we can then invoke search.query() with a string and a callback, which is invoked with a possible error
and an array of ids. With these ids we can then look up the original values, be they in an array, an object in another data store, etc.
search.query('luna cat', function(err, ids){
if (err) throw err;
console.log('Search results:');
ids.forEach(function(id){
console.log(' - %s', strs[id]);
});
});
Producing:
Search results:
- Luna is a cat
You may also remove an id from the index:
search.remove(id[, callback]);
Implementation
While reds is backed by Redis you can easily use reds with any other data store, as it simply indexes by arbitrary numeric or string ids. This means you could create an index of files on disk, mongodb documents, urls, anything.
The indexing process works like this:
- tokenize words
- strip stop words ("about", "after", "am", "an", ...)
- stem words
- apply metaphone
- add id to metaphone constant set
Stemming the words and applying the metaphone algorithm provides leeway so the user does not need exact matches. For example, thanks to metaphone the names “steven” and “stephen” both resolve to the constant STFN. Stemming reduces variants of a word to its stem; for example “counting” becomes “count”, and “waits” becomes “wait”.
When a query is performed the same sequence is applied, resulting in an array of metaphone constants, providing us with the keys necessary to perform the Redis union or intersection to fetch our ids.
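The indexing and query steps described above can be sketched in memory, with plain JavaScript Sets standing in for Redis sets. Everything here is a simplification made up for illustration and is not how reds is actually written: the stop-word list is a tiny sample, stem() is a naive suffix stripper rather than a real stemmer, and phonetic() is a crude stand-in for metaphone, so its constants only happen to agree with real metaphone codes for simple words.

```javascript
// Tiny sample stop-word list; a real list is much longer.
var stopWords = new Set(['about', 'after', 'am', 'an', 'a', 'is', 'and', 'are']);

// Naive stand-in for a stemmer: strips a few common suffixes,
// but leaves the word alone if fewer than three letters would remain.
function stem(word) {
  var stripped = word.replace(/(ing|ed|s)$/, '');
  return stripped.length >= 3 ? stripped : word;
}

// Crude stand-in for metaphone: folds "ph" and "v" into "f",
// drops every vowel after the first letter, and uppercases.
function phonetic(word) {
  var folded = word.replace(/ph/g, 'f').replace(/v/g, 'f');
  return (folded.charAt(0) + folded.slice(1).replace(/[aeiou]/g, '')).toUpperCase();
}

// tokenize -> strip stop words -> stem -> phonetic constant
function keysFor(text) {
  return (text.toLowerCase().match(/\w+/g) || [])
    .filter(function(word){ return !stopWords.has(word); })
    .map(stem)
    .map(phonetic);
}

// constant -> Set of ids, playing the role of one Redis set per constant
var index = new Map();

function indexText(text, id) {
  keysFor(text).forEach(function(key){
    if (!index.has(key)) index.set(key, new Set());
    index.get(key).add(id);
  });
}

// Intersection: only ids stored under every constant of the query.
function query(text) {
  var sets = keysFor(text).map(function(key){ return index.get(key) || new Set(); });
  if (!sets.length) return [];
  return Array.from(sets[0]).filter(function(id){
    return sets.every(function(set){ return set.has(id); });
  });
}

indexText('Luna is a cat', 0);
indexText('Manny is a cat', 1);
console.log(query('luna cat'));
console.log(keysFor('steven'), keysFor('stephen'));
```

Swapping the every() check for a union of the sets would give "any word matches" semantics instead of "all words match".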
Performance
Preliminary benchmarks show that a small 1.6kb body of text is currently indexed in ~6ms, or 163 ops/s. Medium bodies such as 40kb operate around 6 ops/s, or 166ms.
Querying with a multi-word phrase, and an index containing ~3500 words operates around 5300 ops/s. Not too bad.
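Each pair of figures above is the same measurement expressed two ways: operations per second is 1000 divided by the time per operation in milliseconds. A quick sketch of the conversion (the function names are mine, for illustration only):

```javascript
// ops/s from milliseconds per operation
function opsPerSecond(ms) { return 1000 / ms; }

// milliseconds per operation from ops/s
function msPerOp(ops) { return 1000 / ops; }

console.log(msPerOp(163).toFixed(2));      // 6.13 (the "~6ms" figure)
console.log(opsPerSecond(166).toFixed(2)); // 6.02 (the "6 ops/s" figure)
console.log(msPerOp(5300).toFixed(2));     // 0.19 ms per query
```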
Applying the Porter stemmer algorithm from Chris Umbel’s natural library slowed indexing by nearly 100%; however, reds is still reasonably quick at indexing text. If you have huge documents, you may want to consider allowing users to specify a description instead.
Future
- use sorted sets for ordering and priority
- ranges
- sorting
- perf optimization if necessary
Redis Implemented With Node
Nedis is a (partial) redis implementation written with node. It is primarily for fun; however, as our team grows larger and we add more non-technical team members over at LearnBoost, I figured it would be nice to help prevent the need for compiling development dependencies.
Nedis is an old side project I had going and is nowhere near complete, but it does work, so I figured I would open-source it. Currently Nedis implements the unified Redis protocol, which is a brilliantly simple binary-safe protocol that is human and machine friendly.
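To give a feel for that protocol, here is a minimal sketch of how a client request is framed: an argument count prefixed with *, then each argument as a $-prefixed byte length followed by the raw bytes, with every part terminated by CRLF. The encodeCommand name is made up for this example and is not part of Nedis.

```javascript
// Hypothetical helper: frames a command in the unified request protocol.
function encodeCommand(args) {
  var parts = [Buffer.from('*' + args.length + '\r\n')];
  args.forEach(function(arg){
    var body = Buffer.isBuffer(arg) ? arg : Buffer.from(String(arg));
    // the length is counted in bytes, not characters, which is what
    // lets arbitrary binary data pass through untouched
    parts.push(Buffer.from('$' + body.length + '\r\n'), body, Buffer.from('\r\n'));
  });
  return Buffer.concat(parts);
}

// JSON.stringify makes the CRLF terminators visible when printed
console.log(JSON.stringify(encodeCommand(['HSET', 'users:tj', 'age', 23]).toString()));
```

Because each argument announces its own length, a parser never has to scan for delimiters inside the payload, and the framing stays readable enough to type by hand.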
Using Existing Tools
Currently we use Redis for sessions in our app, so having a drop-in replacement is a great way to get session support for your app without booting up redis-server. For example the nodejs module connect-redis can utilize Matt Ranney’s fantastic redis client without change.
Another neat side-effect is that you can use existing redis tools such as redis-cli to interact with Nedis. First let’s start Nedis with nedis-server:
$ nedis-server
Now we can play with the cli, interacting with node:
$ redis-cli
redis> hmset users:tj email tj@vision-media.ca age 23
OK
redis> hgetall users:tj
1) "email"
2) "tj@vision-media.ca"
3) "age"
4) "23"
redis> keys users:*
1) "users:tj"
Note that nedis-server basically consists of no more than the line of js below, so it’s easy to boot from within your process if desired.
nedis.createServer(options).listen(port);
Supported Commands
Below is a list of the commands currently supported by Nedis
- PING
- ECHO
- QUIT
- SELECT
- HLEN
- HVALS
- HKEYS
- HSET
- HMSET
- HGET
- HGETALL
- HEXISTS
- TYPE
- EXISTS
- RANDOMKEY
- DEL
- RENAME
- KEYS
- FLUSHDB
- FLUSHALL
- DBSIZE
- INFO
- BGREWRITEAOF
- GET
- GETSET
- SET
- SETNX
- INCR
- INCRBY
- DECR
- DECRBY
- STRLEN
- APPEND
- SETRANGE
- GETRANGE
- MGET
- MSET
- MSETNX
I have yet to do any kind of profiling, heavy optimization, or stress testing. If nothing else, hopefully Nedis will help you guys explore Redis, or how you can prototype basic databases with node. Head over to the GitHub repo for installation instructions etc.