
Support PluggableDevices #8040

Open
cromefire opened this issue Oct 28, 2023 · 1 comment · May be fixed by #8524

Comments


System information

  • TensorFlow.js version (you are using): 4.6.0
  • Are you willing to contribute it (Yes/No): Probably not; I definitely lack the expertise for that unless it's something really simple

Describe the feature and the current behavior/state.

Both TensorFlow Python and libtensorflow provide the option to load a pluggable device driver, such as Intel's Extension for TensorFlow. TensorFlow.js on Node.js, by contrast, currently appears to be vendor-locked to NVIDIA GPUs, making it less of a tfjs-node-gpu and more of a tfjs-node-nvidia.

Will this change the current api? How?

There would probably be an additional API to load the driver .so, and there would potentially be more kinds of devices available afterwards.

Who will benefit with this feature?

Basically anyone who uses tfjs-node and has hardware that is not NVIDIA but supports pluggable devices — notably Intel GPUs (Intel also ships optimized code for their CPUs via a plugin) and Apple devices. AMD still doesn't have a pluggable device driver as far as I know, but uses a custom fork of TensorFlow instead.

Any other info.

Since tfjs-node seems to be built on libtensorflow (although I can't find explicit confirmation, other than the log format indicating it), it looks like you'd "just" need to call the libtensorflow C API function TF_LoadPluggableDeviceLibrary(<lib_path>, status); to load the driver, and there you go.
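For illustration, a minimal sketch of what that call could look like against the libtensorflow C API — the plugin path is a placeholder, and TF_LoadPluggableDeviceLibrary / TF_DeletePluggableDeviceLibraryHandle come from tensorflow/c/c_api_experimental.h:

```c
#include <stdio.h>

#include "tensorflow/c/c_api.h"
#include "tensorflow/c/c_api_experimental.h"

int main(void) {
  TF_Status* status = TF_NewStatus();

  // Load a PluggableDevice plugin; the path here is a placeholder.
  TF_Library* lib =
      TF_LoadPluggableDeviceLibrary("/path/to/plugin.so", status);
  if (TF_GetCode(status) != TF_OK) {
    fprintf(stderr, "Failed to load plugin: %s\n", TF_Message(status));
  }
  // Devices registered by the plugin should now show up when a newly
  // created session enumerates devices.

  if (lib != NULL) TF_DeletePluggableDeviceLibraryHandle(lib);
  TF_DeleteStatus(status);
  return 0;
}
```

This sketch assumes the build links against libtensorflow with the experimental C API header available; it is not something tfjs-node exposes today.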

cromefire added the type:feature (New feature or request) label on Oct 28, 2023

cromefire commented Oct 28, 2023

You'd probably do it somewhere in here:

```cpp
static napi_value InitTFNodeJSBinding(napi_env env, napi_value exports) {
  napi_status nstatus;

  TFJSBackend *const backend = TFJSBackend::Create(env);
  ENSURE_VALUE_IS_NOT_NULL_RETVAL(env, backend, nullptr);

  // Store the backend in node's instance data for this addon
  nstatus = napi_set_instance_data(env, backend, &FinalizeTFNodeJSBinding,
                                   nullptr);
  ENSURE_NAPI_OK_RETVAL(env, nstatus, exports);

  // TF version
  napi_value tf_version;
  nstatus = napi_create_string_latin1(env, TF_Version(), -1, &tf_version);
  ENSURE_NAPI_OK_RETVAL(env, nstatus, exports);
  // ...
```

But I lack the knowledge of how to pass an argument into that function and how to do proper error handling there. Alternatively, the library could be loaded via a new function on the binding. Either way, I don't know how to add new headers to the build (tensorflow/c/c_api_experimental.h is needed) or how to properly write all that C code (I'm not a C dev, especially not with node-gyp...).
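If someone did add a dedicated binding function, a rough sketch of what it could look like follows. This is purely illustrative, not actual tfjs-node code; the function name LoadPluggableDeviceLibrary, the fixed-size path buffer, and the error handling are all assumptions:

```cpp
#include <node_api.h>

#include "tensorflow/c/c_api.h"
#include "tensorflow/c/c_api_experimental.h"

// Hypothetical binding: takes a JS string (path to the plugin .so)
// and forwards it to TF_LoadPluggableDeviceLibrary.
static napi_value LoadPluggableDeviceLibrary(napi_env env,
                                             napi_callback_info info) {
  size_t argc = 1;
  napi_value args[1];
  napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);

  // Copy the JS string argument into a C buffer.
  char lib_path[1024];
  size_t copied;
  napi_get_value_string_utf8(env, args[0], lib_path, sizeof(lib_path),
                             &copied);

  TF_Status *status = TF_NewStatus();
  TF_LoadPluggableDeviceLibrary(lib_path, status);
  if (TF_GetCode(status) != TF_OK) {
    napi_throw_error(env, nullptr, TF_Message(status));
  }
  TF_DeleteStatus(status);
  return nullptr;
}
```

The function would then have to be registered on `exports` alongside the existing bindings and called from JS before the backend enumerates devices.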

You'd probably also want to handle XPU devices here, since Intel calls their devices XPUs for some reason, and add another flag like isXpuDevice:

```cpp
// TODO(kreeger): Add better support for this in the future through the JS
// API. https://github.com/tensorflow/tfjs/issues/320
std::string cpu_device_name;
const int num_devices = TF_DeviceListCount(device_list);
for (int i = 0; i < num_devices; i++) {
  const char *device_type =
      TF_DeviceListType(device_list, i, tf_status.status);
  ENSURE_TF_OK(env, tf_status);
  // Keep a reference to the host CPU device:
  if (strcmp(device_type, "CPU") == 0) {
    cpu_device_name =
        std::string(TF_DeviceListName(device_list, i, tf_status.status));
    ENSURE_TF_OK(env, tf_status);
  } else if (strcmp(device_type, "GPU") == 0) {
    device_name =
        std::string(TF_DeviceListName(device_list, i, tf_status.status));
    ENSURE_TF_OK(env, tf_status);
  }
}

// If no GPU devices found, fallback to host CPU:
if (device_name.empty()) {
  device_name = cpu_device_name;
  is_gpu_device = false;
} else {
  is_gpu_device = true;
}
```
