node-api: use c-based api for libnode embedding #54660

Draft
wants to merge 26 commits into
base: main

Member

@vmoroz vmoroz commented Aug 30, 2024

Note: this is an active work in progress and there is still a lot of code churn. You are welcome to comment on the code and share your thoughts, but please be aware that the code is not final yet.

This is a temporary spin-off from PR #43542.
This separate PR is created to simplify merging and rebasing with the latest code while we discuss the new API design.
When the code is ready it should be merged back to PR #43542.

The goal of the original PR is to enable a C API and Node-API for embedding scenarios.
The C API allows using the shared libnode from runtimes that do not interop with C++ such as WASM, C#, Java, etc.
This PR works towards the same goal with some changes to the original code.

This is the related issue #23265.

The API design principles

  • Follow the best practices of the Node-API design and provide a way to interop with it.
  • Prefix the new API constructs with node_embedding_.
  • Design the API to be ABI-safe and future-proof for new requirements.
    • Follow the Builder pattern for the API design.
    • The typical use is to create an object, configure it, initialize it based on the configuration, use it, and then delete it. The configuration changes are prohibited after the object is initialized.
    • What if the initialization sequence must be customized? It means that we add a new configuration function and insert a customization hook into the initialization sequence. Thus, we can evolve the API by adding new configuration functions, and occasionally deprecating the old functions.
    • All behavior changes must be associated with a new API version number.
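
As a rough illustration of the create, configure, initialize, use, delete lifecycle described above, here is a minimal self-contained sketch. The `demo_*` names and status codes are hypothetical and are not part of the proposed `node_embedding_` API; the only point is that configuration calls are rejected once the object has been initialized.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical status codes, used by this sketch only.
typedef enum {
  demo_status_ok = 0,
  demo_status_invalid_state = 1,
  demo_status_out_of_memory = 2,
} demo_status;

// Hypothetical opaque object that follows the lifecycle described above.
typedef struct demo_platform__ {
  unsigned flags;
  bool initialized;
} * demo_platform;

// Creates an unconfigured, uninitialized object.
demo_status demo_create_platform(demo_platform* result) {
  demo_platform p = calloc(1, sizeof(*p));
  if (p == NULL) return demo_status_out_of_memory;
  *result = p;
  return demo_status_ok;
}

// Configuration is allowed only before initialization.
demo_status demo_platform_set_flags(demo_platform p, unsigned flags) {
  if (p->initialized) return demo_status_invalid_state;
  p->flags = flags;
  return demo_status_ok;
}

// Freezes the configuration; a second call is rejected.
demo_status demo_platform_initialize(demo_platform p) {
  if (p->initialized) return demo_status_invalid_state;
  p->initialized = true;
  return demo_status_ok;
}

// Releases the object.
demo_status demo_delete_platform(demo_platform p) {
  free(p);
  return demo_status_ok;
}
```

Under this shape, a new customization point becomes one more `set_*` function that must be called before `initialize`, which is how the list above suggests the API could evolve.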

The API usage

  • To use the C embedding API, we must create, configure, and initialize the global node_embedding_platform. It initializes Node.js and the V8 JS engine once per process and parses the CLI arguments.
  • Then, we create, configure, and initialize one or more node_embedding_runtimes. A runtime is responsible for running JavaScript code.
  • The runtime CLI arguments are initialized by default with the args and exec_args from the result of the platform initialization. They can be overridden while configuring the runtime.
  • A runtime can run in its own thread, several runtimes can share the same thread, or the same runtime can be run from multiple threads.
  • The runtime event loop APIs provide control over the runtime execution. These functions can be called many times because they do not destroy the runtime in the end.
  • The runtime lets the embedder specify the Node-API version and retrieve the associated napi_env instance. Any Node-API code that uses the napi_env must run in the runtime scope controlled by the node_embedding_runtime_open_scope and node_embedding_runtime_close_scope functions.
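
To make the argument inheritance described above concrete, here is a small sketch in which a runtime configuration falls back to the platform's parsed arguments unless they are overridden. All `demo_*` types and functions are hypothetical stand-ins and do not reflect how the PR actually stores arguments.

```c
#include <stddef.h>

// Hypothetical argument list; not the PR's actual representation.
typedef struct {
  int argc;
  const char* const* argv;
} demo_args;

// Hypothetical platform: holds the result of parsing the CLI arguments.
typedef struct {
  demo_args parsed_args;
} demo_platform;

// Hypothetical runtime configuration.
typedef struct {
  const demo_platform* platform;
  demo_args override_args;  // argv == NULL means "not overridden"
} demo_runtime_config;

// Overrides the arguments inherited from the platform.
void demo_runtime_set_args(demo_runtime_config* config,
                           int argc,
                           const char* const* argv) {
  config->override_args.argc = argc;
  config->override_args.argv = argv;
}

// Returns the arguments the runtime would start with: the override if one
// was set, otherwise the arguments parsed by the platform.
demo_args demo_runtime_effective_args(const demo_runtime_config* config) {
  if (config->override_args.argv != NULL) return config->override_args;
  return config->platform->parsed_args;
}
```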

The API overview

Based on the use scenarios, the API can be split up into the following groups.

Error handling API

  • node_embedding_on_error sets the global error handling hook.

Global platform API

  • node_embedding_set_api_version
  • node_embedding_run_main
  • node_embedding_create_platform
  • node_embedding_delete_platform
  • node_embedding_platform_set_flags
  • node_embedding_platform_get_parsed_args

Runtime API

  • node_embedding_run_runtime
  • node_embedding_create_runtime
  • node_embedding_delete_runtime
  • node_embedding_runtime_set_flags
  • node_embedding_runtime_set_args
  • node_embedding_runtime_on_preload
  • node_embedding_runtime_on_start_execution
  • node_embedding_runtime_add_module
  • add API to handle unhandled exceptions

Runtime API to run event loops

  • node_embedding_runtime_set_task_runner
  • node_embedding_run_event_loop
  • node_embedding_complete_event_loop
  • node_embedding_terminate_event_loop
  • add API for emitting beforeExit event
  • add API for emitting exit event

Runtime API to interop with Node-API

  • node_embedding_run_node_api
  • node_embedding_open_node_api_scope
  • node_embedding_close_node_api_scope

Documentation

  • The new C embedding API is added to the existing embedding.md file after the C++ embedding API description.
  • The index.md is changed to indicate that the embedding.md has docs for C++ and C APIs.
  • TODO: complete the examples section.

Tests

  • The new C embedding API tests pass the same scenarios as the C++ embedding API tests.
  • The embedtest executable can be run in several modes controlled by the first CLI argument. It effectively contains several main functions for different test scenarios.
  • The JS test code is changed to provide the test mode argument based on the scenario.
  • Added several new test scenarios:
    • run several Node.js runtimes each in its own thread;
    • run several Node.js runtimes all in the same thread;
    • run Node.js runtime from different threads.
    • test that preload callback is called for the main and worker threads.

The PR status

The code is not 100% complete yet. There are still a few TODO items, but I would like to start a discussion with the Node-API team about the new API.

  • Address outstanding TODOs
    • Allow running Node.js uv_loop from UI loop. Follow the Electron
      implementation. - Complete implementation for non-Windows.
    • Can we use some kind of waiter concept instead of the
      observer thread?
    • Generate the main script based on the runtime settings.
    • Set the global Inspector for the main runtime.
    • Start workers from C++.
    • Worker to inherit parent Inspector.
    • Cancel pending event loop tasks on runtime deletion.
    • Can we initialize platform again if it returns early?
    • Test passing the V8 thread pool size.
    • Add a way to terminate the runtime.
    • Allow the app to provide a custom thread pool.
    • Consider adding a v-table for the API functions to simplify
      binding with other languages.
    • We must not exit the process on node::Environment errors.
    • Be explicit about the recoverable errors.
    • Store IsolateScope in TLS.
  • Review the API design
  • Write docs

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/node-api

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Aug 30, 2024
@vmoroz vmoroz marked this pull request as draft August 30, 2024 14:58
@legendecas legendecas added the node-api Issues and PRs related to the Node-API. label Aug 30, 2024
// Skip printing output for --help, --version, --v8-options.
node_api_platform_no_print_help_or_version_output = 1 << 12,
// Initialize the process for predictable snapshot generation.
node_api_platform_generate_predictable_snapshot = 1 << 14,
Member

I think we should have an option which is something like

node_api_platform_nodejs_binary_default

which gives you the same configuration that is present for the node.js binary

typedef struct node_api_env_options__* node_api_env_options;

typedef enum {
node_api_platform_no_flags = 0,
Member

since a bunch of them seem to disable specific flags should there be an all_flags, or are they all on by default and then there are no/disable flags only?

Member Author

@vmoroz vmoroz Sep 3, 2024

I have changed the approach since our last Node-API meeting. These flags map one-to-one to the flags defined in node.h. The default is the no_flags configuration. Then, embedders can disable some default Node.js features.
We can add an alias for the no_flags as a default_flags.
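
For illustration, the flag scheme discussed in this thread could look roughly like the sketch below. The `demo_` names are hypothetical; only the two bit positions (12 and 14) are taken from the snippet under review, and the `default_flags` alias is the suggestion from the comment above, not something the PR is known to define.

```c
#include <stdbool.h>

// Hypothetical flag set modeled on the snippet under review.
typedef enum {
  // No flags set: the default configuration.
  demo_platform_no_flags = 0,
  // Alias suggested in the discussion above.
  demo_platform_default_flags = demo_platform_no_flags,
  // Skip printing output for --help, --version, --v8-options.
  demo_platform_no_print_help_or_version_output = 1 << 12,
  // Initialize the process for predictable snapshot generation.
  demo_platform_generate_predictable_snapshot = 1 << 14,
} demo_platform_flags;

// Returns true if `flag` is set in `flags`.
bool demo_has_flag(demo_platform_flags flags, demo_platform_flags flag) {
  return ((unsigned)flags & (unsigned)flag) != 0;
}

// Returns `flags` with `flag` additionally set.
demo_platform_flags demo_add_flag(demo_platform_flags flags,
                                  demo_platform_flags flag) {
  return (demo_platform_flags)((unsigned)flags | (unsigned)flag);
}
```

An embedder would start from `demo_platform_default_flags` and add only the flags for the behavior it wants to change.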

src/node_api_embedding.cc (outdated)
return napi_ok;
}

napi_status NAPI_CDECL
Member

If an engine does not support snapshots, can it just do nothing in the snapshot functions?

Member Author

I guess so. Maybe we can change it so that the snapshot can be just JS text. In JSI they use the term "prepared JavaScript" for the same purpose. The only question is whether we want this API to be Node-specific or rather runtime/engine independent. E.g. I use this API with the jsr_ prefix across the V8 and Hermes JS engines (it is also based on the Node-API): https://github.com/microsoft/v8-jsi/blob/master/src/node-api/js_runtime_api.h

Member

Since you are already using it, being cross-runtime might make sense. We just need to make sure it's easy for a platform to not support it and still have the same code run.

return std::move(env_setup_);
}

napi_status OpenScope() {
Member

I assume that a scope is something different than a handle_scope - https://nodejs.org/api/n-api.html#napi_handle_scope, just wondering if there might be confusion between the concepts?

Member Author

@vmoroz vmoroz Sep 3, 2024

Yes, it is different. When we are inside of a module we already have some current v8::Isolate and v8::Context. We do not have them when we are outside and operating with the environment. So, we must establish them to use any V8/Node API. In the standalone v8-jsi project I used a function jsr_run_task that opens/closes the scope internally. (edit: I see that the v8-jsi also has the open/close scope. It is convenient to use when we do not want to create a lot of lambdas.)

doc/api/embedding.md (outdated)
doc/api/n-api.md (outdated)
src/node_api_embedding.h (outdated)
return napi_ok;
}

napi_status NAPI_CDECL node_api_open_env_scope(napi_env env) {
Member

@mhdawson mhdawson Sep 3, 2024

I'm wondering if all functions which are only to be called as part of embedding, rather than in an add-on implementation, should have some extra bit in the name. For example, this method could be node_api_embed_open_env_scope.

Member Author

This open/close scope API gives us a lot of flexibility, but it is difficult to use and like you said it is quite confusing.
I am currently considering replacing it with a function that receives a lambda (a C function plus a void* state), so that the napi_env is available only to that lambda. Other APIs will change from using napi_env to something like node_embedding_env or node_embedded_env.
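
A minimal sketch of the "C function plus void state" idea from this comment follows. Every name here (`demo_env`, `demo_runtime`, `demo_run_in_scope`, and so on) is a hypothetical stand-in, and the scope is modeled as a simple depth counter rather than real V8 scopes. It only shows the shape: the wrapper opens the scope, calls the callback, and closes the scope, so the env is handed out only while the scope is open.

```c
// Hypothetical stand-in for an environment; the depth counter models
// whether a scope is currently open (> 0 means open).
typedef struct demo_env__ {
  int scope_depth;
} * demo_env;

// Hypothetical runtime that owns one environment.
typedef struct demo_runtime__ {
  struct demo_env__ env;
} * demo_runtime;

// Callback type: a C function plus an opaque state pointer.
typedef void (*demo_run_callback)(void* state, demo_env env);

static void demo_open_scope(demo_runtime runtime) {
  runtime->env.scope_depth++;
}

static void demo_close_scope(demo_runtime runtime) {
  runtime->env.scope_depth--;
}

// Opens the scope, invokes the callback with the env, then closes the scope.
void demo_run_in_scope(demo_runtime runtime,
                       demo_run_callback callback,
                       void* state) {
  demo_open_scope(runtime);
  callback(state, &runtime->env);
  demo_close_scope(runtime);
}

// Example callback: stores the scope depth it observed into *state.
void demo_record_depth(void* state, demo_env env) {
  *(int*)state = env->scope_depth;
}
```

Compared with a separate open/close pair, the caller cannot forget to close the scope, at the cost of writing one callback per unit of work.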

return napi_ok;
}

napi_status NAPI_CDECL node_api_await_promise(napi_env env,
Member

At first I thought this might be an extension to the promise support we already have - https://nodejs.org/api/n-api.html#promises

This is a good example where I think we need embed or something else in the name, as otherwise people might get confused and think it could be called from an addon.

Member

or maybe the prefix should be node_embedding_api_XXX

Member Author

What if we shorten it to the node_embedding_?

Member

I'm good with node_embedding_

Member Author

All APIs are changed to use the node_embedding_ prefix.


// Runs the Node.js runtime event loop.
NAPI_EXTERN node_embedding_exit_code NAPI_CDECL
node_embedding_runtime_run_event_loop(
Member

Even though libuv doesn't provide ABI guarantees, I think we should expose uv in embedder C API instead of wrapping it, allowing more flexibility of this C API since embedders have more control over libnode.

Member Author

While I agree with you that it can provide more flexibility, my concern is that non-C/C++ users have to bind to a much broader set of APIs which like you said may not be ABI safe.
Also, while we offer functions that seem to do the same as uv_run, in practice they may be doing some more Node.js work such as draining the v8::Isolate work items.

Currently node_api.h exposes the uv_loop_t associated with the napi_env. Technically, users can get it today, though we were discussing deprecating it.

I also would like to explore a scenario where the UV loop is not used for processing the task queue, but we rather offer something like the V8 foreground task runner where the embedder API is responsible for pumping the task queue.

Thus, my proposal is to wait for a scenario where exposing the raw uv_loop_t is required before we add it.
Adding new API is easy, deprecating is more difficult.
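
To illustrate the "foreground task runner" idea mentioned above, here is a small sketch in which the runtime posts tasks and the embedder pumps them from its own loop. The `demo_*` names are hypothetical, the queue is single-threaded for brevity, and nothing here is the V8 or Node.js task runner API.

```c
#include <stddef.h>

// Hypothetical task: a function plus its opaque data.
typedef struct {
  void (*run)(void* data);
  void* data;
} demo_task;

#define DEMO_QUEUE_CAPACITY 16

// Fixed-size FIFO owned by the embedder. A real task runner would need
// synchronization because tasks may be posted from other threads.
typedef struct {
  demo_task tasks[DEMO_QUEUE_CAPACITY];
  size_t head;
  size_t count;
} demo_task_queue;

// Called by the runtime to hand a task to the embedder.
// Returns 0 on success or -1 if the queue is full.
int demo_post_task(demo_task_queue* queue, demo_task task) {
  if (queue->count == DEMO_QUEUE_CAPACITY) return -1;
  queue->tasks[(queue->head + queue->count) % DEMO_QUEUE_CAPACITY] = task;
  queue->count++;
  return 0;
}

// Called by the embedder from its own loop. Runs all queued tasks,
// including tasks posted while pumping, and returns how many ran.
size_t demo_pump_tasks(demo_task_queue* queue) {
  size_t ran = 0;
  while (queue->count > 0) {
    demo_task task = queue->tasks[queue->head];
    queue->head = (queue->head + 1) % DEMO_QUEUE_CAPACITY;
    queue->count--;
    task.run(task.data);
    ran++;
  }
  return ran;
}

// Example task body: increments an int counter.
void demo_increment(void* data) {
  ++*(int*)data;
}
```

In this model the embedder decides when and on which thread the work runs, which is the control that exposing a raw uv_loop_t would otherwise be used for.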

Member Author

vmoroz commented Sep 20, 2024

We discussed the API today, 09/20/2024, with @mhdawson. The key takeaways:

  • It is not clear how to use the new node_embedding_on_wake_up_event_loop. Its goal is to enable running UV loop tasks in the app's UI event loop. After the discussion and after replying to @legendecas's feedback, I started to consider replacing it with a V8-like "foreground task runner" concept, which is currently used for the V8 ABI-safe API based on Node-API.
  • It would be great to provide the key scenarios that this API aims to address.
  • An API function that is supposed to aggregate other functions must use them in its implementation rather than calling existing Node.js aggregating implementations. E.g. node_embedding_complete_event_loop must use node_embedding_run_event_loop and other currently missing functions that raise the Node.js beforeExit and exit events. This way we can validate that we expose the right APIs, and developers can use the low-level functions without hitting a wall.
  • We also discussed an idea to replace the "callback+data" pairs with small structs. Hopefully it can make the API easier to use from C++ and other languages, and reduce the number of parameters in some cases.
  • The API is still churning. It is probably worth getting another review pass in a couple of weeks.
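
The takeaway about aggregate functions can be sketched as follows: a "complete" function built only from lower-level pieces. The runtime is modeled as a few counters, and all `demo_*` functions are hypothetical. The functions for emitting beforeExit and exit are exactly the ones the list above describes as currently missing, so their shape here is an assumption.

```c
#include <stdbool.h>

// Hypothetical runtime model: counts pending work and records events.
typedef struct {
  int pending_tasks;        // work left in the event loop
  int before_exit_emitted;  // number of beforeExit emissions
  int exit_emitted;         // number of exit emissions
} demo_runtime;

// Low-level: runs the loop until no work is pending.
void demo_run_event_loop(demo_runtime* runtime) {
  while (runtime->pending_tasks > 0) runtime->pending_tasks--;
}

// Low-level: emits beforeExit. Returns true if work is pending afterwards,
// which would happen if a listener scheduled more work.
bool demo_emit_before_exit(demo_runtime* runtime) {
  runtime->before_exit_emitted++;
  return runtime->pending_tasks > 0;
}

// Low-level: emits exit.
void demo_emit_exit(demo_runtime* runtime) {
  runtime->exit_emitted++;
}

// Aggregate: implemented only in terms of the low-level functions above,
// so an embedder could write the same sequence by hand.
void demo_complete_event_loop(demo_runtime* runtime) {
  do {
    demo_run_event_loop(runtime);
  } while (demo_emit_before_exit(runtime));
  demo_emit_exit(runtime);
}
```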

Member Author

vmoroz commented Sep 23, 2024

Does the new C-based embedding API have a goal to do the same as the C++ embedding API?

I think this question could be better addressed with an approach for embedders to opt into the "bleeding-edge" C++ API, like mentioned in #43542 (comment). An embedder can highly customize the behavior of V8/Node.js, e.g. Inspectors. If such advanced needs arise in an embedder that already adopted the C embedding API, I believe it would not be trivial for them to migrate to C++ based APIs. Allowing conversion between C/C++ API types would reduce the gaps for embedders using the two variant interfaces.

I just do not see how it can be done in practice.
The C API is targeting languages that cannot do the C++ interop. E.g. C#, Python, or a C++ compiler that does not understand the libnode C++ mangled/decorated names.
If they cannot interop with C++, then converting between C and C++ cannot help.
On the other hand, if the embedder code can work with the C++ API, then I do not see a point in using the C API.

The only real "escape hatch" is to add the missing functionality to the C API and compile the libnode privately until the PR is accepted by Node.js. Thus, the C API is designed to be extensible from the beginning.

While Node.js provides an extensive C++ embedding API that can be used from C++
applications, the C-based API is useful when Node.js is embedded as a shared
libnode library into C++ or non-C++ applications.

Member

@mhdawson mhdawson Feb 3, 2025

Might make sense to say up front that the API is experimental at this point and is not subject to SemVer in respect to changes.


One of the goals for the C based embedder API is to be ABI stable. It means that
applications must be able to use newer libnode versions without recompilation.
The following design principles are targeting to achieve that goal.
Member

This is good, but for the first version I think we need to add some text to clarify that while the goal is that we end up with an ABI stable API, that is not part of what is being promised yet.

} node_embedding_platform_flags;
```

These flags match to the C++ `node::ProcessInitializationFlags` and control the
Member

Does it make sense to have a link to the definition of the node::ProcessInitializationFlags


#### Callback types

##### `node_embedding_handle_error_callback`
Member

This does not seem to match what is covered in this section?


```c
const char* NAPI_CDECL
node_embedding_last_error_message_set(const char* message);
Member

I wonder if this should be used by user code?

The last error message can be removed by passing null to this function.
##### `node_embedding_main_run`
Member

I'm not sure this function belongs in this section. It's not really a platform API but one which uses all of the apis to provide an easy way to run like the Node.js binary.

} node_embedding_runtime_flags;
```

These flags match to the C++ `node::EnvironmentFlags` and control the
Member

Similar comment about making node::EnvironmentFlags a link?

Returns `node_embedding_status_ok` if there were no issues.
The function opens up V8 Isolate, V8 handle, and V8 context scopes where it is
Member

This is quite specific to V8. I assume there would be an equivalent in other runtimes, and maybe we should describe it more generally and then say "for example with V8 that would include..."

The function closes the V8 Isolate, V8 handle, and V8 context scopes.
Then, it triggers the uncaught exception handler if there were any
unhandled errors.

Member

Just before the examples, it might be useful to show the steps to get JavaScript running.
I assume something like

  • create platform
  • create runtime
    ..etc

napi_env env,
napi_value load_result);

// The error is to be handled by using napi_env.
Member

I'm not quite sure I understand what the comment is telling me.


// Runs Node.js main function.
// By default it is the same as running Node.js from CLI.
NAPI_EXTERN node_embedding_status NAPI_CDECL node_embedding_main_run(
Member

Same comment as in the docs: I'm not sure this is a platform function rather than a helper function that avoids having to call methods from the different categories.

@@ -0,0 +1,124 @@
# C embedding API

This file is an overview for C embedding API.
Member

Does this comment still make sense?

Member

or maybe it is more that the whole README.md needs an update, given that there is documentation elsewhere?


static napi_status CallMe(node_embedding_runtime runtime, napi_env env);
static napi_status WaitMe(node_embedding_runtime runtime, napi_env env);
static napi_status WaitMeWithCheese(node_embedding_runtime runtime,
Member

Some comments up front on what each function does might help.

@@ -0,0 +1,12 @@
#include "embedtest_c_api_common.h"

// The simplest Node.js embedding scenario where the Node.js main function is
Member

Is there an equivalent to this where all of the callbacks that can be used to configure something before the script runs are used? That might be useful as something that people could start with and remove the parts they don't need.

Maybe that is in one of the other tests but it was not obvious to me as I went through the test files.

Labels
c++ Issues and PRs that require attention from people who are familiar with C++. embedding Issues and PRs related to embedding Node.js in another project. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. node-api Issues and PRs related to the Node-API.
Projects
Status: In Progress
4 participants