OpenVINO EP doesn't respect threading parameters #260

Open
mbahri opened this issue Jul 6, 2024 · 0 comments

Description
In ONNX Runtime, the OpenVINO EP accepts configuration options that set the number of threads and the number of streams (documented here). These options are ignored when passed to the EP through the Triton model config, for example:

optimization { execution_accelerators {
  cpu_execution_accelerator : [ {
    name : "openvino"
    parameters { key: "num_of_threads" value: "4" }
    parameters { key: "num_streams" value: "4" }
  } ]
}}
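
For comparison, the same two options can be handed straight to ONNX Runtime, outside Triton, to check whether the EP itself honours them. The following is a minimal sketch, assuming the `onnxruntime-openvino` Python package is installed and that the caller supplies a path to an ONNX model; the helper names `openvino_provider_options` and `make_session` are illustrative and not part of either project.

```python
def openvino_provider_options(num_threads, num_streams):
    """Build the provider options dict, mirroring the Triton config keys."""
    # Values are passed as strings, as they are in the model config.
    return {
        "num_of_threads": str(num_threads),
        "num_streams": str(num_streams),
    }


def make_session(model_path, num_threads=4, num_streams=4):
    """Create an ORT session that requests the OpenVINO EP with these options."""
    # Imported lazily so the helper above works without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=["OpenVINOExecutionProvider"],
        provider_options=[openvino_provider_options(num_threads, num_streams)],
    )
```

If CPU usage is capped when a session is created this way but not under Triton, that would suggest the options are dropped before they reach the EP rather than ignored by the EP itself.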

The threading configuration for the ONNX Runtime backend is also ignored when the OpenVINO EP is enabled, which is expected:

parameters { key: "intra_op_thread_count" value: { string_value: "4" } }
parameters { key: "inter_op_thread_count" value: { string_value: "2" } }

Triton Information
Last tested with the Triton container 24.05.

To Reproduce
When serving an ONNX model, we observe:

  • intra_op_thread_count and inter_op_thread_count affect the number of inference threads used when OpenVINO is disabled
  • With OpenVINO optimizations enabled, CPU usage jumps to the default/maximum number of CPU threads
  • Setting num_of_threads or num_streams has no effect
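
One way to make these observations reproducible is to count the OS threads of the server process before and after changing the config. This is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` is available; the function names are hypothetical, and the total thread count is only a rough proxy because it includes non-inference threads.

```python
def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Return the number of OS threads of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())
```

Calling `thread_count` with the PID of the `tritonserver` process under each configuration gives a number to compare, instead of relying on CPU usage alone.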

Expected behavior
The OpenVINO EP should ignore intra_op_thread_count and inter_op_thread_count but obey num_of_threads and num_streams.

Or did I miss something, and the ORT backend with OpenVINO optimizations reads the OpenVINO backend parameters instead?
