Asynchronous C++ Addon for Node.js with N-API and node-addon-api

The ability to use a C/C++ codebase in your Node.js application could be a game changer in the long run. A number of scenarios exist where the use of C/C++ can be very rewarding:

  • placing long-running tasks on their own thread
  • significantly improving performance of CPU-bound tasks
  • harnessing the vast amount of mature and time-tested C/C++ code

Node.js (and V8 for that matter) is actually implemented in C/C++, so it seems quite natural that third-party C/C++ code could be invoked from within the Javascript that Node.js executes. Nonetheless, it is not a fully automated process. A number of caveats exist that, if not handled properly, can significantly diminish the benefits. The biggest one is that the Node.js and underlying V8 APIs exhibit major changes on every new release, which means a C/C++ addon written against them has to be updated and rebuilt with every new release of Node.js. That is not very efficient.

Node.js Native Addon Abstraction Layer Options

Given the lack of Node.js C++ API stability, creating a stable abstraction layer that spares addons from being modified with each new release of Node.js seems natural. Not surprisingly, there are multiple abstraction layer options available. Two of the most popular at present are N-API and NAN.

While NAN has been around for a very long time and is battle tested, it is a third-party module, and addons built with it still need to be recompiled for each new Node.js version. N-API, on the other hand, is part of Node.js and is maintained by the Node.js team. It also guarantees ABI compatibility with future Node.js versions. In other words, a C/C++ Node.js addon built with N-API should never have to be rebuilt when updating Node.js; it will just keep working out of the box.

Thus N-API is very promising technologically, even though it is still somewhat underdocumented. So in this blog post we are going to focus on N-API.

N-API Experimental Stage Disambiguation

N-API was introduced as an experimental feature in Node.js 8. It left the experimental stage in Node.js 10, and at that point it also did so retroactively for subsequent Node.js 8 releases. That is the root of a lot of confusion, since conflicting evidence can be found on whether N-API is experimental in Node.js 8 or not.

In short: N-API is experimental in all Node.js 8 releases prior to 8.12. In 8.12 and later it is considered stable and no experimental warnings are issued.

N-API and node-addon-api

While very powerful, N-API is a plain C API, which is not the most convenient option when creating a C++ addon. The node-addon-api module steps in to fill this gap. It is a thin C++ wrapper around the plain C N-API, and it is provided and maintained by the Node.js team, same as N-API itself. The N-API ABI compatibility is unaffected by the use of the C++ node-addon-api, since node-addon-api interacts with Node.js only via N-API.
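
To get a feeling for what node-addon-api saves us, here is a rough sketch of how registering a single processData function might look in plain C N-API (error handling omitted, the function body left empty). This is just an illustration for comparison and is not part of the sample code used later in the article.

#include <node_api.h>

// Plain C N-API: every operation is a napi_* call with explicit
// napi_env/napi_value plumbing and napi_status error checking.
static napi_value ProcessData(napi_env env, napi_callback_info info) {
    // ... argument extraction and work queueing would go here ...
    return NULL;
}

static napi_value Init(napi_env env, napi_value exports) {
    napi_value fn;
    napi_create_function(env, "processData", NAPI_AUTO_LENGTH, ProcessData, NULL, &fn);
    napi_set_named_property(env, exports, "processData", fn);
    return exports;
}

NAPI_MODULE(NODE_GYP_MODULE_NAME, Init)

The C++ classes we will use below wrap exactly this kind of plumbing, which keeps the addon code considerably shorter and easier to follow.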

The Sample Task

We will be showcasing a ubiquitous scenario - a Node.js application receives a big chunk of data (from an HTTP request, a WebSocket, etc.), manipulates that data and sends the result back. For the sake of simplicity the data will be read from and written to a file (~90MB), and the only manipulation will be doubling the value of every byte. However, our native addon will do the data manipulation asynchronously, thus leaving Node.js capable of handling other requests in the meantime.

Tools

We are going to use Node.js' latest take on writing native addons - N-API. Since, for the sake of convenience, we would like to implement our native addon in C++ instead of plain C, we are also going to need the node-addon-api module described above.

Of course, we are also going to need Node.js itself and node-gyp in order to build our native addon.

Implementation

For the sake of comparison we are going to implement our sample task both in plain Javascript and as a native Node.js addon. This will allow us to compare performance and complexity.

Plain Javascript

plain-javascript-app.js

console.time('Program runtime');

const fs = require('fs');

const buf = fs.readFileSync('test-data');

console.time('Data manipulation');
for (let i = 0; i < buf.length; i++) {
    buf[i] *= 2;
}
console.timeEnd('Data manipulation');

fs.writeFileSync('test-data-modified', buf);

console.timeEnd('Program runtime');

Evidently the Javascript codebase is very small and simple. We just read a file called "test-data" into a Buffer, double every single byte and write the result back to "test-data-modified".

N-API/node-addon-api C++ Node.js Native Addon

Using the Addon in Javascript

This is the JS code needed to implement the same functionality as above, but using an N-API C++ addon to do the actual data manipulation.

native-addon-app.js

console.time('Program runtime');

const fs = require('fs');

const addon = require('./build/Release/addon.node');

const buf = fs.readFileSync('test-data');

console.time('Time spent by native addon on main event loop thread');
console.time('Data manipulation');

addon.processData(buf, () => {
    console.timeEnd('Data manipulation');

    fs.writeFileSync('test-data-modified', buf);

    console.timeEnd('Program runtime');
});

console.timeEnd('Time spent by native addon on main event loop thread');

The line const addon = require('./build/Release/addon.node'); loads our C++ N-API native addon, which goes by the name addon.node. There is absolutely no difference in syntax from requiring a JS module. Just like a JS module, our native module exports some functions, and all of its exports are saved to the addon variable. We can then use these exports as regular JS functions, no matter that they are implemented in C++ - e.g. addon.processData(buf, () => {...}).

We have also added a new performance measurement - the time spent on the main Node.js event loop thread. This is an attempt at quantifying one of the advantages of our native addon - the ability to offload a CPU-intensive task to another thread.

The C++ Code

The C++ native addon implementation introduces quite a bit of infrastructural overhead. We have intentionally kept the actual data manipulation logic very simple so that we can focus on the N-API infrastructural code.

The setup of our native addon is straightforward.

Addon.cc

#include <napi.h>

#include <DataProcessingAsyncWorker.h>

using namespace Napi;

void ProcessData(const CallbackInfo& info) {
    Buffer<uint8_t> data = info[0].As<Buffer<uint8_t>>();
    Function cb = info[1].As<Function>();

    DataProcessingAsyncWorker *worker = new DataProcessingAsyncWorker(data, cb);
    worker->Queue();
}

Object Init(Env env, Object exports) {
    exports.Set(String::New(env, "processData"),
                Function::New(env, ProcessData));
    return exports;
}

NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)

The NODE_API_MODULE macro defined in napi.h is the way to register a module with N-API and define the module's initialization function. At compile time node-gyp will define NODE_GYP_MODULE_NAME to be the addon name as specified in binding.gyp. The second argument of the NODE_API_MODULE macro is the function N-API will call when initializing our native module - in our case the Init function. This is where we export a JS function under the name processData and define its behavior to be calling the C++ ProcessData function defined above.

The ProcessData C++ function is the other important part of our native addon. This is where the asynchronous work is kicked off by creating and queueing a DataProcessingAsyncWorker instance. It is worth mentioning that the CallbackInfo type is the node-addon-api way to encapsulate the variable number of arguments every Javascript function takes. CallbackInfo has an indexing operator which makes retrieving the argument at position n as simple as info[n]. Another interesting observation is that we expect the second argument of ProcessData to be of type Function - that is the type node-addon-api uses to encapsulate a JS function. In our example this function is the JS callback to be called when the async task is done.
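
The sample keeps ProcessData minimal and assumes the arguments are always a Buffer and a Function. In a real addon one would typically validate them first. Here is a minimal sketch of how that could look - a hypothetical drop-in variant of ProcessData in Addon.cc, not code from the sample repo:

void ProcessData(const CallbackInfo& info) {
    Env env = info.Env();

    // Check argument count and types before using them.
    if (info.Length() < 2 || !info[0].IsBuffer() || !info[1].IsFunction()) {
        TypeError::New(env, "Expected arguments: (Buffer, Function)")
            .ThrowAsJavaScriptException();
        return;
    }

    Buffer<uint8_t> data = info[0].As<Buffer<uint8_t>>();
    Function cb = info[1].As<Function>();

    DataProcessingAsyncWorker *worker = new DataProcessingAsyncWorker(data, cb);
    worker->Queue();
}

Throwing a TypeError back into Javascript this way keeps the failure on the main thread, before any worker is queued.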

The gist of our native addon is the async worker implementation. Before we delve into that, though, we should consider a problem that Node.js, being inherently single-threaded, never exhibits - namely, how to synchronize access to memory shared between the Node.js event loop and the worker threads introduced by the async addon.

Calling methods on objects exposed from the V8/Node.js C++ API (e.g. Function, Buffer, ObjectReference, etc.) while on a worker thread will result in crashes; those should be accessed only from the main thread. In C/C++ it is of course possible to get a pointer to the memory occupied by such objects and manipulate that memory directly. The consequences of doing that, however, are unpredictable (for one thing, the Javascript garbage collector moves things around). In general, neither N-API nor node-addon-api seems to offer thread-safe objects or APIs - i.e. the scenario where Javascript objects or functions are accessed by C++ code running on a worker thread in a native addon is not really covered and seems to be discouraged.

In theory, though, one can expose thread synchronization primitives from C/C++ to Javascript via the native addon and deal with such a scenario. One has to be careful with Javascript-created objects nonetheless, because of the garbage collector and heap compaction. A Node.js Buffer seems to be a good candidate for multi-threaded access since its memory lives outside the V8 heap and thus is not subject to being moved by heap compaction. It is still managed by the V8 garbage collector, though, so one has to control its lifetime when using it in a native addon.

Leaving the concurrent access scenario aside and assuming sequential access, the fact that a Node.js Buffer is allocated outside of the V8 heap ("Instances of the Buffer class are similar to arrays of integers but correspond to fixed-sized, raw memory allocations outside the V8 heap.") makes it an inherently good candidate for passing data to a native addon, since a pointer to its memory can be safely obtained and manipulated.

DataProcessingAsyncWorker.h

#include <napi.h>

using namespace Napi;

class DataProcessingAsyncWorker : public AsyncWorker
{
    public:
        DataProcessingAsyncWorker(Buffer<uint8_t> &data,
                                  Function &callback);

        void Execute();

        void OnOK();

    private:
        ObjectReference dataRef;
        uint8_t *dataPtr;
        size_t dataLength;
};

DataProcessingAsyncWorker.cc

#include <DataProcessingAsyncWorker.h>

DataProcessingAsyncWorker::DataProcessingAsyncWorker(Buffer<uint8_t> &data,
                                                        Function &callback) : AsyncWorker(callback),
                                                                              dataRef(ObjectReference::New(data, 1)),
                                                                              dataPtr(data.Data()),
                                                                              dataLength(data.Length())
{
}

void DataProcessingAsyncWorker::Execute()
{
    for (size_t i = 0; i < dataLength; i++)
    {
        uint8_t value = *(dataPtr + i);
        *(dataPtr + i) = value * 2;
    }
}

void DataProcessingAsyncWorker::OnOK()
{
    Callback().Call({});

    dataRef.Unref();
}

Inheriting from the AsyncWorker class provided by node-addon-api allows DataProcessingAsyncWorker to execute code on a worker thread different from the Node.js main event loop thread.

Let's start from the top - the constructor. It accepts a reference to a Buffer<uint8_t>, which contains the data to be manipulated, and a reference to a Function, which is the JS callback that should be fired when the async task is done. The callback argument is passed directly to the parent class. From the data argument we create an ObjectReference, get a pointer to the underlying data and the length of the buffer, and save these as fields for later use.

There is another important benefit of creating an ObjectReference to the Buffer we get from the JS side. The Buffer is created inside JS code and is thus eligible for garbage collection. By creating an ObjectReference to the Buffer we increase its reference count, so even if there are no more references to this Buffer in JS, the GC will not collect it until we call Unref(). We do have to call Unref() for each ObjectReference we create, though, or the referenced object will never get collected, which introduces a memory leak.

The Execute method is where all the business logic goes - in our case the byte doubling. This method overrides Napi::AsyncWorker::Execute and as such will be executed on a worker thread different from the Node.js event loop. As already noted, we cannot call methods of the Buffer class on the worker thread, i.e. in the Execute method. That's why we get a pointer to the data using the Data() method and the length using the Length() method in the constructor of DataProcessingAsyncWorker - the constructor is called on the main thread, i.e. the Node.js event loop.

The OnOK method is an override of Napi::AsyncWorker::OnOK and will be called if Napi::AsyncWorker::SetError is not called in the Execute method. There is also a Napi::AsyncWorker::OnError method, which can be overridden to handle the error reported by Napi::AsyncWorker::SetError. The important thing here is that both OnOK and OnError are called on the Node.js main event loop thread and not on the worker thread Execute runs on. Thus these methods can make use of Node.js/V8 objects, e.g. fire a JS callback. So in OnOK we call the callback function passed as an argument to the constructor and Unref() the ObjectReference we are holding to the Buffer. This tells the garbage collector that we are no longer using this object and it can be collected.
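
For completeness, here is a hypothetical sketch of what an error path could look like in this worker (it is not part of the sample code, and the OnError override would also have to be declared in DataProcessingAsyncWorker.h). Execute reports a problem via SetError, which skips OnOK, and OnError - running on the main event loop thread - forwards the error to the JS callback:

void DataProcessingAsyncWorker::Execute()
{
    // Hypothetical error condition reported from the worker thread.
    if (dataLength == 0)
    {
        SetError("Received an empty buffer");
        return;
    }

    for (size_t i = 0; i < dataLength; i++)
    {
        *(dataPtr + i) *= 2;
    }
}

void DataProcessingAsyncWorker::OnError(const Error &e)
{
    // Runs on the main event loop thread, so calling into JS is safe here.
    Callback().Call({e.Value()});

    dataRef.Unref();
}

Note that the Unref() call has to happen on both the success and the error path, otherwise the Buffer would be kept alive forever.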

Performance

A performance comparison between the native addon and the plain JS implementation is of significant interest, as it allows us to judge whether the extra complexity of developing the native addon is worth it.

The execution times were measured on an i7 CPU laptop running Ubuntu 18.04.

The Asynchronous Native Addon

Average of 100 runs:

Time spent in addon on main event loop thread: 0.2666ms
Data manipulation: 47.0888ms
Program runtime: 168.2024ms

The Plain JS

Average of 100 runs:

Data manipulation: 4123.7252ms
Program runtime: 4241.3536ms

Evidently the overall performance improvement of the native addon over the Javascript implementation is 25+ times. Even more importantly, with the native addon the time during which the event loop is blocked is under a third of a millisecond, compared to 4+ seconds in pure JS.

Native Addon Distribution

Node.js addons are distributed like every other npm package. On install, node-gyp is invoked and the native C/C++ code is compiled. There is one extra step that needs to be taken in order to make your package consumable with require(): your package's main JS file should export the native addon like so - module.exports = require('./build/Release/addon.node');. The only problem with that is the relative path. Throughout the years the node-gyp build output path has changed several times, causing trouble for native addon developers. This naturally led to the development of several packages resolving that problem. A popular example is bindings, which takes care of the relative path for you - module.exports = require('bindings')('addon.node');.

It should also be noted that alternatives to node-gyp exist, such as Node CMake.

The code

All the code in this article can be found in CodeMerx's code-samples repo on GitHub.
