Support for the new CUDA virtual memory management functions for shared memory. #4538
Comments
I believe you are correct that system shared memory is the way to pass input tensors. I'm not sure that it's on our roadmap yet, though we could file a feature request. @CoderHam may be able to provide additional information.
Thanks for your feature request. There is a ticket on our backlog for the same, but it has not been prioritized yet.
Hi, @Tabrizian @dyastremsky
Thanks for checking. Yes, it is.
Hi, @Tabrizian @dyastremsky
We have not yet announced a public release date.
Hi,
I'm trying to use the Triton server on a Jetson platform (JetPack 5).
Previously, before moving to Jetson, we used the Triton server through the gRPC client and passed a CUDA shared memory handle allocated with the CUDA IPC API.
As I understand it, the CUDA IPC functions are not supported on Jetson, and I have to use the new CUDA virtual memory management functions instead:
cuMemExportToShareableHandle
As described here
Currently, CUDA shared memory registration on the Triton server is implemented only for the CUDA IPC memory handle
(in the RegisterCUDASharedMemory method in shared_memory_manager.cpp).
Does this mean that, on the Jetson platform, the only current option for passing input tensors to the Triton server is system shared memory?
Is support for CUDA shared memory through the new memory management API on your roadmap?
For example, an implementation of RegisterCUDASharedMemory that uses the cuMemImportFromShareableHandle function, along with gRPC client support for it. If so, when do you plan on releasing it?
Thanks very much
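To illustrate what the gRPC client side of this request would involve, here is a minimal sketch of the handle transport only. It assumes Linux and that the handle is exported with the CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR handle type, in which case cuMemExportToShareableHandle yields a file descriptor. A file descriptor number is only meaningful inside the process that owns it, so it cannot simply be copied into a request message the way a CUDA IPC handle can; it is normally passed over a Unix domain socket with SCM_RIGHTS. The helper names send_fd and recv_fd are hypothetical, no CUDA calls are shown, and nothing here reflects an existing Triton API.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a connected
 * Unix domain socket. Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    assert(sock >= 0 && fd >= 0);

    /* One byte of ordinary data accompanies the control message. */
    char payload = 'F';
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };

    /* The union guarantees the buffer is aligned for struct cmsghdr. */
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* SCM_RIGHTS asks the kernel to duplicate fd into the receiver. */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor sent by send_fd.
 * Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };

    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    /* Accept only a single SCM_RIGHTS message carrying one descriptor. */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In such a design, the client would call send_fd with the exported descriptor, and the server would pass the descriptor returned by recv_fd to cuMemImportFromShareableHandle. This implies a local side channel next to gRPC, which is part of why it needs dedicated client support.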