Reshape feature implementation #573


Open
wants to merge 1 commit into base: ovep-develop

Conversation

jatinwadhwa921

@jatinwadhwa921 jatinwadhwa921 commented Feb 11, 2025

Reshape feature implementation. This feature lets you set a lower and upper bound for OV tensors, for NPU only. Command used to run the feature:

onnxruntime_perf_test.exe -v -e openvino -m times -r 1 -i "device_type|NPU reshape_input|data[1,3,60,80..120]" <model_path>
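For reference, the `reshape_input` value encodes an input name followed by a bracketed shape, where each dimension is either a fixed value or a `lower..upper` bound. A minimal, self-contained sketch of how such a string could be parsed (hypothetical names and structure for illustration only, not the PR's actual parser):

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One dimension bound: fixed when lower == upper, bounded otherwise.
struct DimBound {
  int64_t lower;
  int64_t upper;
};

// Parse e.g. "data[1,3,60,80..120]" into {"data", {{1,1},{3,3},{60,60},{80,120}}}.
// Illustrative sketch only; no error handling for malformed specs.
std::pair<std::string, std::vector<DimBound>> ParseReshapeInput(const std::string& spec) {
  auto open = spec.find('[');
  auto close = spec.find(']');
  std::string name = spec.substr(0, open);
  std::string dims = spec.substr(open + 1, close - open - 1);
  std::vector<DimBound> bounds;
  std::stringstream ss(dims);
  std::string tok;
  while (std::getline(ss, tok, ',')) {
    auto range = tok.find("..");
    if (range == std::string::npos) {
      int64_t v = std::stoll(tok);        // fixed dimension
      bounds.push_back({v, v});
    } else {                              // "lower..upper" bounded dimension
      bounds.push_back({std::stoll(tok.substr(0, range)),
                        std::stoll(tok.substr(range + 2))});
    }
  }
  return {name, bounds};
}
```

In the command above, `data[1,3,60,80..120]` would fix the first three dimensions and bound the last one to [80, 120].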

@ankitm3k ankitm3k force-pushed the ovep-develop branch 3 times, most recently from b66301b to e93f0b0 Compare February 17, 2025 09:28
@jatinwadhwa921 jatinwadhwa921 force-pushed the jatin_latest_reshape_refactor branch 2 times, most recently from 53278fe to be37fd9 Compare February 17, 2025 17:23
@jatinwadhwa921 jatinwadhwa921 marked this pull request as ready for review February 17, 2025 17:28
@ankitm3k ankitm3k force-pushed the ovep-develop branch 2 times, most recently from 42d6f14 to e85411a Compare February 24, 2025 13:21
@sfatimar

@jatinwadhwa921 please update this branch

@jatinwadhwa921
Author

> @jatinwadhwa921 please update this branch

Sure, I will rebase this branch again on the latest ovep-develop.

@jatinwadhwa921 jatinwadhwa921 force-pushed the jatin_latest_reshape_refactor branch from be37fd9 to ca2bc91 Compare March 3, 2025 10:13
@@ -236,6 +236,97 @@ struct OpenVINO_Provider : Provider {

pi.precision = ParsePrecision(provider_options, pi.device_type, "precision");

if (provider_options.contains("reshape_input") && pi.device_type == "NPU") {

@jatinwadhwa921 exactly what are we trying to do here?


I am not comfortable leaving so much parsing in the main functions. Can we create a file parse_utils.cc and move all the parsing functions there?

Author

Will move the parser functions at the time of rebasing.

@sfatimar

sfatimar commented Mar 4, 2025

@preetha-intel @ankitm3k can you please review this PR

@sfatimar

I would expect all parsing functions inside openvino_provider_factory to move to parse_utils.

@jatinwadhwa921
Author

Design Document for reshape_input.docx
Attaching the design document for this feature

// Save the indexes of graph inputs among fused_node's inputDefs
// (which also contains initializers).
if (!session_context_.shape.empty()) {
ValidateInputShapes(session_context_.shape, subgraph.GetInputs());


Why is this added here?
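For context, a `ValidateInputShapes` call at this point would presumably verify that every name in the user-supplied shape map matches an actual graph input. A self-contained sketch of that kind of check (the `ShapeMap` type and function body are hypothetical stand-ins, not the PR's code):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for shape_t and the subgraph input-name list.
using ShapeMap = std::map<std::string, std::vector<int64_t>>;

// Returns true iff every input named in the reshape option is a real graph input.
bool ValidateInputShapes(const ShapeMap& shapes, const std::set<std::string>& graph_inputs) {
  for (const auto& entry : shapes) {
    if (graph_inputs.count(entry.first) == 0) return false;  // unknown input name
  }
  return true;
}
```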

for (uint32_t index = 0; const auto& node : subgraph.GetInputs()) {
if(subgraph.GetGraph().GetConsumerNodes(node->Name()).size()==0)


Add a comment on why this is required. Are there dangling inputs?

for (uint32_t index = 0; const auto& node : subgraph.GetInputs()) {
if(subgraph.GetGraph().GetConsumerNodes(node->Name()).size()==0)


What is this part of the code doing?

@@ -100,7 +108,7 @@ BackendManager::BackendManager(SessionContext& session_context,
}
}

if (ModelHasSymbolicInputDims(subgraph)) {
if (ModelHasSymbolicInputDims(subgraph) && session_context_.shape.empty()) {


Is shape.empty() checking for the upper bound, the lower bound?


I think this portion of code needs to be rewritten... to be device agnostic and to have one approach for dynamism

@@ -39,6 +39,8 @@ class BackendManager {

bool ModelHasSymbolicInputDims(const onnxruntime::GraphViewer& subgraph) const;
bool ModelHasBatchedInputs(const ONNX_NAMESPACE::ModelProto& model_proto) const;
void ValidateInputShapes(const shape_t& shape,


ValidateInputShapes does not have a string argument in the declaration, but the definition has one. Is this by design?

@@ -146,6 +146,11 @@ CreateOVModel(const std::string model,
try {
auto ov_model = OVCore::Get()->ReadModel(model, session_context.onnx_model_path_name.string());

if (!session_context.shape.empty()) {
LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";
ov_model->reshape(session_context.shape);


This converts the model to static shape?

@@ -146,6 +146,11 @@ CreateOVModel(const std::string model,
try {
auto ov_model = OVCore::Get()->ReadModel(model, session_context.onnx_model_path_name.string());

if (!session_context.shape.empty()) {
LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";


" Reshape the model inputs to specified shape "

@@ -96,6 +97,7 @@ BasicBackend::BasicBackend(std::unique_ptr<ONNX_NAMESPACE::ModelProto>& model_pr
} else if (!session_context_.has_external_weights &&
!subgraph_context_.has_dynamic_input_shape &&
!session_context_.so_context_enable &&
session_context.shape.empty() &&


As I see it, the model still has a dynamic shape, so this should not be here.


And by keeping this here we are saying that we won't use the unified compile-model API for upper-bound and lower-bound models... which will impact FIL.

ov_tensor_data.tensor_ptr = std::make_shared<ov::Tensor>(input.get_element_type(), input.get_shape(),
const_cast<void*>(tensor.GetTensorRawData()));

if (!session_context_.shape.empty()) {


I think we should change the logic in the topmost braces:

if (subgraph_context_.has_dynamic_input_shape &&
!session_context_.disable_dynamic_shapes ) {

const_cast<void*>(tensor.GetTensorRawData()));
} else {
ov_tensor_data.tensor_ptr = std::make_shared<ov::Tensor>(input.get_element_type(), input.get_shape(),
const_cast<void*>(tensor.GetTensorRawData()));


I think the whole logic of StartAsyncInference needs to be rewritten.

@@ -434,6 +447,10 @@ void BasicBackend::StartAsyncInference(Ort::KernelContext& context, OVInferReque
}
} // Loop subgraph original input names

if (!session_context_.shape.empty()) {


This infer should not be here.


Infer should only be called from the concrete backend or the dynamic backend... this is breaking the existing design.

@sfatimar

I think this should be removed:

@sfatimar

// Always true for NPU plugin or when passed .
if (pi.device_type.find("NPU") != std::string::npos) {
pi.disable_dynamic_shapes = true;
}

@sfatimar

I would like the logic for dynamic shapes to be correctly documented rather than putting if/else conditions in randomly. Instead of adding shape information everywhere and checking for CPU, GPU, please just rely on the flags disable_dynamic_shapes and has_dynamic_input_shape for dynamic shapes. Currently, since NPU does not support dynamic shapes, the logic should be:
First, logic for provider-options flags based on device:

  1. If the device type is NPU and reshape is not present, enable the dynamic backend (disable_dynamic_shapes = true).
  2. If the device type is NPU and reshape is present, enable dynamic inputs (disable_dynamic_shapes = false).
    Uniform logic in subsequent code:
  3. If the model input is dynamic and disable_dynamic_shapes is true (dynamic backend is enabled), go for the dynamic backend. Reshape parameters have no effect.
  4. If the model input is dynamic and disable_dynamic_shapes is false (dynamic backend is disabled), go for the concrete backend and dynamic inputs.
  5. If shape parameters are present, apply partial shape/reshape to the model and inputs (Read Model/Compile Model or Unified Compile Model -- Preetha/Jatin to give feedback here).

ExportCompiledBlobAsEPCtxNode:
If the dynamic backend is disabled and the model is dynamic, go for exporting the EP context model...
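The flag logic proposed above could be summarized in code roughly as follows. This is a sketch of the reviewer's proposal, not existing OVEP code; the function names and the `Backend` enum are illustrative:

```cpp
#include <cassert>
#include <string>

enum class Backend { Dynamic, Concrete };

// Steps 1-2: provider-option flag based on device and the reshape option.
// NPU without reshape keeps disable_dynamic_shapes = true (dynamic backend path);
// NPU with reshape allows dynamic inputs; CPU/GPU support dynamic shapes natively.
bool DisableDynamicShapes(const std::string& device_type, bool reshape_present) {
  if (device_type.find("NPU") != std::string::npos) {
    return !reshape_present;
  }
  return false;
}

// Steps 3-4: backend choice for a model with dynamic inputs.
Backend ChooseBackend(bool model_is_dynamic, bool disable_dynamic_shapes) {
  if (model_is_dynamic && disable_dynamic_shapes) {
    return Backend::Dynamic;  // reshape parameters have no effect here
  }
  return Backend::Concrete;   // concrete backend with dynamic inputs
}
```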

@@ -79,6 +80,7 @@ struct ProviderInfo {
uint32_t num_of_threads{0}; // [num_of_threads]: Overrides the accelerator default value of
// number of threads with this value at runtime.
config_t load_config{}; // JSON config map to load custom OV parameters.
shape_t shape{}; // Used for reshaping ov tensors to a particular lower and upper bound


Would it be possible to name this variable after the EP-specific option, "reshape_input", or at least mention that name in the comment?
As this is the only place where OVEP-specific options are listed in a comprehensible format, we refer to these names/comments.

Could also the description of this be added to https://github.com/intel/onnxruntime/blob/master/onnxruntime/test/perftest/command_args_parser.cc please?

Btw is there a documentation dedicated to OVEP-specific runtime options?


there's also the "valid_provider_keys" field (added recently, probably)

const std::unordered_set<std::string> valid_provider_keys = {"device_type", "device_id", "device_luid", "cache_dir", "precision",

could this be added there too?
