Using mruby as Alternative: Server-Side | ArangoDB Blog 2012

Note: We changed the name of the database in May 2012. AvocadoDB is now called ArangoDB.

Introduction

One of the design goals of AvocadoDB is:

Use AvocadoDB as an application server and fuse your application and database together for maximal throughput

In order to act as an application server, AvocadoDB has a built-in language, currently JavaScript. As mruby became publicly available a few hours ago, I started to see if it can be a viable alternative. This blog post summarizes my findings.

Status Quo

AvocadoDB acts as an HTTP server. Each request is parsed, routed, and passed to a small piece of JavaScript, which generates an appropriate response. This can be as simple as

actions.defineHttp({
  url : "hallo",
  context : "api",

  callback : function (req, res) {
    var result = "World";

    actions.resultOk(req, res, actions.HTTP_OK, result);
  }
});

Note that no database access is involved in the above example. A more complex example

actions.defineHttp({
  url : "world",
  context : "api",

  callback : function (req, res) {
    var collection = db._collection(req.parameter.collection);

    if (collection == null) {
      ...
    }
    else {
      ...
    }

    actions.resultOk(req, res, actions.HTTP_OK, result);
  }
});

also accesses the database.
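The register-then-dispatch flow described above can be modeled in a few lines. What follows is only a minimal C++ sketch of the idea, not AvocadoDB's implementation: the names `Request`, `Response`, `Router`, `defineHttp` and `dispatch`, as well as the 404 status for unknown routes, are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical request/response types; the server's real types are not shown in this post.
struct Request {
  std::string context;  // e.g. "api"
  std::string url;      // e.g. "hallo"
};

struct Response {
  int status = 0;
  std::string body;
};

using Callback = std::function<void(const Request&, Response&)>;

// Maps a (context, url) pair to the callback registered for it.
class Router {
 public:
  void defineHttp(const std::string& context, const std::string& url, Callback cb) {
    assert(static_cast<bool>(cb));  // registering an empty callback is a programming error
    routes_[std::make_pair(context, url)] = std::move(cb);
  }

  // Looks up the callback for the request and lets it fill in the response.
  Response dispatch(const Request& req) const {
    Response res;
    auto it = routes_.find(std::make_pair(req.context, req.url));
    if (it == routes_.end()) {
      res.status = 404;  // assumption: unknown routes yield "not found"
      res.body = "unknown action";
      return res;
    }
    it->second(req, res);
    return res;
  }

 private:
  std::map<std::pair<std::string, std::string>, Callback> routes_;
};
```

Registering a callback for context "api" and url "hallo" that sets the body to "World" and then dispatching a matching request mirrors the first JavaScript example; a request for an unregistered url falls through to the 404 branch.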

Using JavaScript for actions and transactions has the advantage that one language can be used from the front end down to the storage engine: for example, JavaScript with jQuery in the browser (Chrome), JavaScript with Node.js and Express on the web server, and JavaScript in AvocadoDB as storage engine and application server. This is an option we already support.

However, there are also drawbacks:

  • JavaScript is a “complicated” language: not necessarily for the programmer but for the JavaScript engines like V8.
  • V8 itself is quite complex: dealing with isolates, contexts, and handle scopes can lead to a considerable amount of C++ code, which makes the C++ code around V8 even harder to understand.
  • This makes it hard to use V8 in a multi-threaded environment, presumably one of the reasons why Node.js decided to stay single-threaded. AvocadoDB is not just a storage engine: it also needs to execute code that implements business logic right next to the data, avoiding any network traffic. As a consequence, it is multi-threaded.

Because of these complications I started looking for alternatives. Fortunately, just a few hours ago (2012-04-20 04:00) a first version of mruby was released. While the goal was to bring Ruby to mobile devices, mruby IMHO has some unique features that also make it very suitable for the above task (if anyone with deeper knowledge spots any mistakes, please let me know):

  • mruby is plain and simple C code
  • documentation is still sparse, but the C code itself is well-structured, so that it is easy to feel “at home”
  • in V8 it is very hard to create JavaScript wrappers for C++ objects. It seems that one of the design goals of mruby is to make this as easy as possible
  • mruby does not support threads, and yes, for me that is an advantage. I need to synchronize the executed code and the database access. Therefore, having a language which does not support threads is very helpful
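To make the last point concrete, here is one possible threading model, sketched in C++ under loud assumptions: each server thread owns a private interpreter, and only the shared database is guarded by a lock. `Interpreter`, `Database` and `runWorkers` are hypothetical stand-ins (for mruby, the interpreter role would be played by an `mrb_state`); nothing here is taken from the AvocadoDB sources.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Stand-in for a single-threaded interpreter state; hypothetical, for illustration only.
struct Interpreter {
  int evaluated = 0;
  void eval(const std::string& /*code*/) { ++evaluated; }  // touches only its own state
};

// Stand-in for the shared database; every access is serialized by a mutex.
class Database {
 public:
  void write(int value) {
    std::lock_guard<std::mutex> guard(mutex_);
    values_.push_back(value);
  }
  std::size_t size() {
    std::lock_guard<std::mutex> guard(mutex_);
    return values_.size();
  }

 private:
  std::mutex mutex_;
  std::vector<int> values_;
};

// Starts `threads` workers; each owns one interpreter and handles `requests` scripts.
// Returns the total number of evaluations across all interpreters.
int runWorkers(Database& db, int threads, int requests) {
  assert(threads >= 0 && requests >= 0);
  std::vector<Interpreter> interpreters(static_cast<std::size_t>(threads));
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&db, &interpreters, t, requests] {
      Interpreter& mine = interpreters[static_cast<std::size_t>(t)];  // never shared
      for (int i = 0; i < requests; ++i) {
        mine.eval("1 + 2");
        db.write(i);  // the only shared resource, protected inside Database
      }
    });
  }
  for (auto& w : workers) w.join();
  int total = 0;
  for (const auto& in : interpreters) total += in.evaluated;
  return total;
}
```

Because the scripting language cannot start threads of its own, all concurrency stays under the control of the host, which is what makes a scheme like this easy to reason about.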

Minimalistic Ruby

When mruby came out a few hours ago, I decided to try a proof of concept. I wanted to implement at least the following features modeled after the JavaScript interface:

  • create a variable holding the current database “$db”
  • create a method “_collection” for the database to access a named collection
  • create a method “document” for the collection to access a document

A similar task in V8 took me quite some time to implement; with mruby it was much faster.

DataMapper / ActiveRecords

Note that this proof of concept is different from the idea of an AvocadoDB API (aka adapter) currently discussed at Ashikawa. I’m not planning to implement something similar to DataMapper or ActiveRecords. The idea here is to implement the basics. It might be possible and useful to work with these basics to build a full ODM, or it might not.

Crossing the bridge between C and mruby

(1) Download the Source

Download the code from github: https://github.com/mruby/mruby

(2) Compile

Compile everything: I’ve tried the current version under Linux and MacOS

(3) REPL

Create a basic read-eval-print loop

extern "C" {
#include "mruby.h"
#include "mruby/proc.h"
#include "mruby/data.h"
#include "compile.h"           // not available in "include"
}

int AvocadoServer::executeRubyShell () {
  struct mrb_parser_state *p;
  mrb_state *mrb;
  int n;
  string str;

  // create a new ruby shell
  mrb = mrb_open();

  while (true) {
    getline(cin, str);

    if (cin.eof()) {
      cout << "Bye Bye! Auf Wiedersehen! さようなら\n";
      break;
    }

    cout << "INPUT: " << str << "\n";

    p = mrb_parse_string(mrb, const_cast<char*>(str.c_str()));

    if (p == 0 || p->tree == 0 || 0 < p->nerr) {
      cout << "UPPS!\n";
      continue;
    }

    n = mrb_generate_code(mrb, p->tree);

    if (n < 0) {
      cout << "UPPS 2: " << n << "\n";
      continue;
    }

    mrb_run(mrb, mrb_proc_new(mrb, mrb->irep[n]), mrb_nil_value());

    if (mrb->exc) {
      cout << "OUTPUT\n";
      mrb_funcall(mrb, mrb_nil_value(), "p", 1, mrb_obj_value(mrb->exc));
      cout << "\nOUTPUT END\n";
    }
  }

  return EXIT_SUCCESS;
}

Note that “compile.h” does not live in “include”. This code gives you a very simple shell: you must enter each program as a one-line program 🙂

fceller@huerth:~/AvocadoDB> ./avocado /tmp/vocbase --ruby-console -l warn
1 + 2
INPUT: 1 + 2
p 1 + 2
INPUT: p 1 + 2
3
$i = 1
INPUT: $i = 1
p $i
INPUT: p $i
1
$i += 2
INPUT: $i += 2
p $i
INPUT: p $i
3
Bye Bye! Auf Wiedersehen! さようなら

(4) Wrap C classes

Wrap the internal database variable into an mruby class. There was a post by Yukihiro Matsumoto on how to do it. First, you need a pointer to the instance you want to store. That is easy in my case: the pointer lives in “_vocbase”. Second, you need a structure describing what to do when the variable is garbage collected (yes, compared to V8 it is THAT easy). In my case this is also trivial, because “_vocbase” points to a singleton.

You also need a class. So I looked at the code and found the following way to create a class:

mrb_value MR_AvocadoDatabase_Inialize (mrb_state* mrb, mrb_value exc) {
  printf("initializer of AvocadoDatabase called\n");
  return exc;
}

static void MR_AvocadoDatabase_Free (mrb_state* mrb, void* p) {
  printf("free of AvocadoDatabase called\n");
}

static const struct mrb_data_type MR_AvocadoDatabase_Type = {
  "AvocadoDatabase", MR_AvocadoDatabase_Free
};


int AvocadoServer::executeRubyShell () {
  ...

  // define the Ruby class "AvocadoDatabase"
  struct RClass* AvocadoDatabaseClass = mrb_define_class(mrb, "AvocadoDatabase", mrb->object_class);

  mrb_define_method(mrb, AvocadoDatabaseClass, "initialize", MR_AvocadoDatabase_Inialize, ARGS_ANY());
  ...
}

Testing the code gives:

fceller@huerth:~/AvocadoDB> ./avocado /tmp/vocbase --ruby-console -l warn
p $db
INPUT: p $db
#
a = AvocadoDatabase.new
INPUT: a = AvocadoDatabase.new
initializer of AvocadoDatabase called

So, I got my variable “$db” holding my database instance. Calling “new” should produce a singleton, but that is for later.

(5) Unwrap

Create a method returning a collection

mrb_value MR_AvocadoDatabase_Collection (mrb_state* mrb, mrb_value exc) {
  char* name;
  TRI_vocbase_t* vocbase;
  TRI_vocbase_col_t* collection;
  struct RData* rdata;

  // check "class.c" to see how to specify the arguments
  mrb_get_args(mrb, "s", &name);

  if (name == 0) {
    return exc;
  }

  // check 

  printf("using collection '%s'\n", name);

  // looking at "mruby.h" I assume that is the way to unwrap the pointer
  rdata = (struct RData*) mrb_object(exc);
  vocbase = (TRI_vocbase_t*) rdata->data;
  collection = TRI_FindCollectionByNameVocBase(vocbase, name, false);

  if (collection == NULL) {
    printf("unknown collection (TODO raise error)\n");
    return exc;
  }
  
  return mrb_obj_value(Data_Wrap_Struct(mrb, AvocadoCollectionClass, &MR_AvocadoCollection_Type, (void*) collection));
}


int AvocadoServer::executeRubyShell () {
  ...
  // add a method to extract the collection
  mrb_define_method(mrb, AvocadoDatabaseClass, "_collection", MR_AvocadoDatabase_Collection, ARGS_ANY());
  ...
}

Unwrapping the pointer uses the macro “mrb_object”. Extracting the arguments of the function uses “mrb_get_args”. That finishes my proof of concept. Accessing a document is identical to accessing a collection.
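One caveat with the unwrapping above: the cast trusts that the receiver really wraps a “TRI_vocbase_t”, which is presumably not the case for an instance created with “AvocadoDatabase.new”. A common remedy is to compare the type descriptor stored with the wrapper before handing out the pointer. The following is a simplified, self-contained C++ model of that pattern; `DataType`, `Wrapped`, `unwrap` and `collect` are hypothetical names and deliberately not the mruby types or API.

```cpp
#include <cassert>

// Simplified stand-in for a type descriptor: a name plus a release callback.
struct DataType {
  const char* name;        // e.g. "AvocadoDatabase"
  void (*release)(void*);  // called when the wrapper is collected; may be null
};

// Simplified stand-in for a wrapper object holding a C pointer and its descriptor.
struct Wrapped {
  const DataType* type;
  void* data;
};

// Returns the wrapped pointer only if the wrapper carries the expected descriptor.
template <typename T>
T* unwrap(const Wrapped& object, const DataType& expected) {
  if (object.type != &expected) {
    return nullptr;  // wrong kind of object: the caller should raise an error
  }
  return static_cast<T*>(object.data);
}

// Models what a garbage collector would do with the descriptor on collection.
void collect(Wrapped& object) {
  assert(object.type != nullptr);
  if (object.type->release != nullptr && object.data != nullptr) {
    object.type->release(object.data);
  }
  object.data = nullptr;
}
```

With such a check in place, a method like “_collection” could bail out (or raise) when `unwrap` returns a null pointer instead of dereferencing an unrelated object.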

fceller@opensuse114:~/AvocadoDB> ./avocado /tmp/vocbase -l warn --ruby-console
p $db
INPUT: p $db
#<AvocadoDatabase:0xfb5de8>
p $db._collection("example") 
INPUT: p $db._collection("example")
using collection 'example'
#<AvocadoCollection:0xfb5c28>

Open Todos

  • try proper error handling
  • check if the GC works as expected
  • find out how to implement a real console

Conclusion

It is much easier to implement the basic functionality with mruby than with V8. So, I will continue my experiments. mruby is still in an early phase and changes quickly, but it already seems usable.

Frank Celler

Frank is both an entrepreneur and a backend developer who has been developing mostly memory databases for two decades. He is the CTO and co-founder of ArangoDB. Try to challenge Frank by asking him questions on C, C++ and MRuby. Besides that, Frank organizes Cologne’s NoSQL group and is an active member of the NoSQL community.

3 Comments

  1. Roco on April 21, 2012 at 12:11 am

    Great post. Looking forward to using AvocadoDB more.

    * I thought the m in mruby was for minimal not mobile.

    • Frank Celler on April 21, 2012 at 11:52 am

      Fixed. Too many abbreviations, I got lost.

  2. Coder on April 24, 2012 at 8:32 pm

    Have you look at google’s new language – Dart? Looks like it would be a good fit for you. http://www.dartlang.org/
    Full disclosure – I’m not in any way connected with google or the dart team.
