Description
In ONNXRuntime, the OpenVINO EP accepts configuration options for the number of threads (num_of_threads) and the number of streams (num_streams), documented here. These options are ignored when they are passed to the EP through the Triton model config.
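As a rough sketch of the kind of model config this refers to, the helper below renders a config.pbtxt fragment that enables the OpenVINO accelerator and attaches the two options as accelerator parameters. The helper name `render_openvino_config` and the values 4 and 1 are hypothetical and chosen only for illustration. Whether the backend forwards these parameters to the EP is exactly what this issue is about.

```python
def render_openvino_config(num_of_threads: int, num_streams: int) -> str:
    """Render a config.pbtxt fragment that enables the OpenVINO accelerator.

    Hypothetical helper for illustration; the thread and stream values
    are assumptions, not settings taken from a real deployment.
    """
    return "\n".join([
        "optimization { execution_accelerators {",
        "  cpu_execution_accelerator : [ {",
        '    name : "openvino"',
        # Accelerator parameters are string-to-string pairs in the model config.
        f'    parameters {{ key: "num_of_threads" value: "{num_of_threads}" }}',
        f'    parameters {{ key: "num_streams" value: "{num_streams}" }}',
        "  } ]",
        "}}",
    ])


if __name__ == "__main__":
    # Print the fragment so it can be pasted into a model's config.pbtxt.
    print(render_openvino_config(num_of_threads=4, num_streams=1))
```

Rendering the fragment from a function makes it easy to regenerate the config with different values when checking whether any of them change the observed thread usage.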
Triton Information
Last tested with the Triton container 24.05.
To Reproduce
When serving an ONNX model, we observe:
The intra_op_thread_count / inter_op_thread_count settings affect the number of inference threads used when OpenVINO is disabled
With OpenVINO optimizations enabled, CPU usage jumps to the default/max number of CPU threads
Setting num_of_threads or num_streams has no effect
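One way to compare these cases is to read the thread count of the server process before and after changing the config. The sketch below assumes a Linux host, where `/proc/<pid>/status` exposes a `Threads:` field. The function names are hypothetical, and the number covers every thread in the process, not only inference threads, so it is a coarse signal at best.

```python
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Extract the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Return the total number of threads of a running process (Linux only)."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())
```

Calling `thread_count(pid)` with the PID of the server process, once per config variant and under load, gives numbers that can be compared across the three observations above.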
Expected behavior
The OpenVINO EP should ignore intra_op_thread_count and inter_op_thread_count but obey num_of_threads and num_streams.
Or have I missed something, and the ORT backend with OpenVINO optimizations reads the OpenVINO backend parameters instead?
Note: with OpenVINO enabled, the threading configuration for the ONNXRuntime backend is also ignored, which is expected.