Add options for growable memory and single state buffers #104
Conversation
protobuf/model_config.proto (Outdated)

//@@ Currently, this option only applies for implicit state that uses CUDA and
//@@ use_single_buffer must be enabled.
IIUC, we force the implicit state memory used in the growable block manager to be GPU memory for now. Maybe the wording here could be clearer that, regardless of what the user does, this will use GPU memory for now, even if they request other memory types.
Why wouldn't we error if they try different types? Just double checking - we don't silently switch preferences, correct?
Why wouldn't we error if they try different types?
What the user specifies is just a "preference" (i.e., there is no guarantee that Triton will satisfy the request). The same pattern exists in TRITONBACKEND_OutputBuffer.
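For reference, a minimal sketch of that pattern, assuming the TRITONBACKEND_OutputBuffer signature from tritonbackend.h (the helper name and surrounding variables are hypothetical): the backend passes in its preferred memory type, and on return the arguments describe what Triton actually allocated, which the backend must honor.

```c
#include "triton/core/tritonbackend.h"

// Hypothetical helper inside a backend's response handling; `output` and
// `byte_size` are assumed to come from the caller.
TRITONSERVER_Error*
GetOutputBuffer(TRITONBACKEND_Output* output, uint64_t byte_size, void** buffer)
{
  // Ask for CPU memory, but treat it only as a preference.
  TRITONSERVER_MemoryType memory_type = TRITONSERVER_MEMORY_CPU;
  int64_t memory_type_id = 0;

  TRITONSERVER_Error* err = TRITONBACKEND_OutputBuffer(
      output, buffer, byte_size, &memory_type, &memory_type_id);
  if (err != NULL) {
    return err;
  }

  // On return, memory_type/memory_type_id describe the buffer Triton actually
  // provided; it may be GPU memory even though CPU was requested, so branch
  // on the actual type rather than the preference.
  if (memory_type == TRITONSERVER_MEMORY_GPU) {
    // ... write the output with cudaMemcpy / a device kernel ...
  } else {
    // ... plain host memcpy ...
  }

  return NULL;  // success
}
```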
I guess the question is which one takes precedence - the model config or the API call that gets the buffer. Will discuss offline.
Force-pushed from 3416aca to 9246bbe.
protobuf/model_config.proto (Outdated)

//@@ .. cpp:var:: bool use_growable_memory
//@@
//@@ The optional field to allow an implicit state buffer to grow the
Does this need to be a user-visible option? Wondering if we could always grow memory when using CUDA memory in the implementation.
protobuf/model_config.proto (Outdated)

//@@ This option is only available for CUDA memory and requires enabling
//@@ use_same_buffer_for_input_output. When using this option,
//@@ StateBuffer call will always return CUDA memory even if CPU memory
//@@ is provided.
Suggested change:
- //@@ is provided.
+ //@@ is requested.
Good catch, fixed!
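To make the documented behavior concrete, here is a hedged sketch from the backend's point of view, assuming the TRITONBACKEND_StateBuffer signature from tritonbackend.h (the helper name and variables are hypothetical): with use_growable_memory enabled, the buffer that comes back is CUDA memory even when CPU memory was requested, so the backend must check the out-parameters before using it.

```c
#include "triton/core/tritonbackend.h"

// Hypothetical snippet from a backend's implicit-state handling; `state` and
// `byte_size` are assumed to come from the surrounding request code.
TRITONSERVER_Error*
GetStateBuffer(TRITONBACKEND_State* state, uint64_t byte_size, void** buffer)
{
  // Request CPU memory; with use_growable_memory this is only a preference.
  TRITONSERVER_MemoryType memory_type = TRITONSERVER_MEMORY_CPU;
  int64_t memory_type_id = 0;

  TRITONSERVER_Error* err = TRITONBACKEND_StateBuffer(
      state, buffer, byte_size, &memory_type, &memory_type_id);
  if (err != NULL) {
    return err;
  }

  // With growable state the returned buffer is GPU memory regardless of the
  // requested type, so check the actual type before touching it from the host.
  if (memory_type == TRITONSERVER_MEMORY_GPU) {
    // ... fill the state with cudaMemcpy / a device kernel ...
  }

  return NULL;  // success
}
```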
* Add same input/output state buffer option
* Add an option for using GrowableMemory
* Review comments
* Format
* Review comments
* Review comment
* Fix description