Reshape feature implementation #573
base: ovep-develop
Conversation
Force-pushed 059f7c1 to f84614c
Force-pushed b66301b to e93f0b0
Force-pushed 53278fe to be37fd9
Force-pushed 42d6f14 to e85411a
@jatinwadhwa921 please update this branch.
Sure, I will rebase this branch again with the latest ovep-develop.
Force-pushed be37fd9 to ca2bc91
@@ -236,6 +236,97 @@ struct OpenVINO_Provider : Provider {

    pi.precision = ParsePrecision(provider_options, pi.device_type, "precision");

+   if (provider_options.contains("reshape_input") && pi.device_type == "NPU") {
@jatinwadhwa921 exactly what are we trying to do here?
I am not comfortable leaving so much parsing in the main functions. Can we create a file parse_utils.cc and move all parsing functions there?
Will move the parser functions at the time of rebasing.
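For reference, a parse_utils-style helper could look roughly like this, assuming the `reshape_input` syntax shown in the PR description (`data[1,3,60,80..120]`, where `..` marks a lower..upper bound). The function name, the `DimBound` struct, and the error handling are illustrative sketches, not code from this PR:

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One dimension of a reshape request: fixed (lower == upper) or a
// bounded range written as "lo..hi" in the option string.
struct DimBound {
  int64_t lower;
  int64_t upper;
};

// Parse one reshape_input entry such as "data[1,3,60,80..120]" into the
// input name and its per-dimension bounds. (Hypothetical helper for a
// parse_utils.cc file.)
std::pair<std::string, std::vector<DimBound>> ParseReshapeEntry(const std::string& entry) {
  const auto open = entry.find('[');
  const auto close = entry.rfind(']');
  if (open == std::string::npos || close == std::string::npos || close < open) {
    throw std::invalid_argument("expected name[d1,d2,...]: " + entry);
  }
  std::string name = entry.substr(0, open);
  std::vector<DimBound> dims;
  std::stringstream dims_stream(entry.substr(open + 1, close - open - 1));
  std::string token;
  while (std::getline(dims_stream, token, ',')) {
    const auto range = token.find("..");
    if (range == std::string::npos) {
      const int64_t v = std::stoll(token);      // fixed dimension
      dims.push_back({v, v});
    } else {
      dims.push_back({std::stoll(token.substr(0, range)),   // lower bound
                      std::stoll(token.substr(range + 2))}); // upper bound
    }
  }
  return {std::move(name), std::move(dims)};
}
```

Keeping this in its own translation unit would also make the parsing unit-testable without touching the provider factory.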
@preetha-intel @ankitm3k can you please review this PR?
I would expect all parsing functions inside openvino_provider_factory to move to parse utils.
Design Document for reshape_input.docx
// Save the indexes of graph inputs among fused_node's inputDefs
// (which also contains initializers).
if (!session_context_.shape.empty()) {
  ValidateInputShapes(session_context_.shape, subgraph.GetInputs());
Why is this added here?
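For context, a validation step of this kind usually checks that the user-supplied shapes actually refer to graph inputs before any reshape is attempted. A minimal sketch of what `ValidateInputShapes` might do, using simplified stand-in types (the PR's real `shape_t` and graph input list differ):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Sketch: every input named in the user's reshape_input option must exist
// on the graph, and must carry a non-empty shape. Types are illustrative
// stand-ins, not the PR's actual signatures.
void ValidateInputShapes(const std::map<std::string, std::vector<int64_t>>& shapes,
                         const std::vector<std::string>& graph_input_names) {
  const std::set<std::string> known(graph_input_names.begin(), graph_input_names.end());
  for (const auto& entry : shapes) {
    if (known.count(entry.first) == 0) {
      throw std::invalid_argument("reshape_input names unknown graph input: " + entry.first);
    }
    if (entry.second.empty()) {
      throw std::invalid_argument("reshape_input has empty shape for: " + entry.first);
    }
  }
}
```

Failing fast here, before model compilation, gives the user a clearer error than a late OpenVINO reshape failure would.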
for (uint32_t index = 0; const auto& node : subgraph.GetInputs()) {
  if (subgraph.GetGraph().GetConsumerNodes(node->Name()).size() == 0)
Add a comment on why this is required. Are there dangling inputs?
for (uint32_t index = 0; const auto& node : subgraph.GetInputs()) {
  if (subgraph.GetGraph().GetConsumerNodes(node->Name()).size() == 0)
What is this part of the code doing?
@@ -100,7 +108,7 @@ BackendManager::BackendManager(SessionContext& session_context,
      }
    }

-   if (ModelHasSymbolicInputDims(subgraph)) {
+   if (ModelHasSymbolicInputDims(subgraph) && session_context_.shape.empty()) {
Is shape.empty() checking for the upper bound or the lower bound?
I think this portion of the code needs to be rewritten to be device agnostic and to have one approach for dynamism.
@@ -39,6 +39,8 @@ class BackendManager {

    bool ModelHasSymbolicInputDims(const onnxruntime::GraphViewer& subgraph) const;
+   bool ModelHasBatchedInputs(const ONNX_NAMESPACE::ModelProto& model_proto) const;
+   void ValidateInputShapes(const shape_t& shape,
ValidateInputShapes does not have a string argument in the declaration, but the definition has one. Is this by design?
@@ -146,6 +146,11 @@ CreateOVModel(const std::string model,
    try {
      auto ov_model = OVCore::Get()->ReadModel(model, session_context.onnx_model_path_name.string());

+     if (!session_context.shape.empty()) {
+       LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";
+       ov_model->reshape(session_context.shape);
This converts the model to a static shape?
@@ -146,6 +146,11 @@ CreateOVModel(const std::string model,
    try {
      auto ov_model = OVCore::Get()->ReadModel(model, session_context.onnx_model_path_name.string());

+     if (!session_context.shape.empty()) {
+       LOGS_DEFAULT(INFO) << log_tag << "Reshaping the ov tensor to specified shape";
" Reshape the model inputs to specified shape "
@@ -96,6 +97,7 @@ BasicBackend::BasicBackend(std::unique_ptr<ONNX_NAMESPACE::ModelProto>& model_pr
    } else if (!session_context_.has_external_weights &&
               !subgraph_context_.has_dynamic_input_shape &&
               !session_context_.so_context_enable &&
+              session_context.shape.empty() &&
As I see it, the model still has a dynamic shape, so this should not be here.
And by keeping this here, we are saying that we won't use the unified compile-model API for upper- and lower-bound models, which will impact FIL.
ov_tensor_data.tensor_ptr = std::make_shared<ov::Tensor>(input.get_element_type(), input.get_shape(),
                                                         const_cast<void*>(tensor.GetTensorRawData()));

if (!session_context_.shape.empty()) {
I think we should change the logic in the topmost braces:

if (subgraph_context_.has_dynamic_input_shape &&
    !session_context_.disable_dynamic_shapes) {
                                                           const_cast<void*>(tensor.GetTensorRawData()));
} else {
  ov_tensor_data.tensor_ptr = std::make_shared<ov::Tensor>(input.get_element_type(), input.get_shape(),
                                                           const_cast<void*>(tensor.GetTensorRawData()));
I think the whole logic of StartAsyncInference needs to be rewritten.
@@ -434,6 +447,10 @@ void BasicBackend::StartAsyncInference(Ort::KernelContext& context, OVInferReque
      }
    }  // Loop subgraph original input names

+   if (!session_context_.shape.empty()) {
This infer should not be here.
Infer should only be called from the concrete backend or the dynamic backend; this is breaking the existing design.
I think this should be removed:

// Always true for NPU plugin or when passed.
I would prefer the logic for dynamic shapes to be correctly documented rather than adding if/else conditions at random. Instead of adding shape information everywhere and checking for CPU and GPU, please rely on the flags disable_dynamic_shapes and has_dynamic_input_shape for dynamic shapes. Currently, since NPU does not support dynamic shapes, the logic should be:
ExportCompiledBlobAsEPCtxNode
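The flag-based dispatch the reviewer describes could be sketched as one pure decision function. The enum values, function name, and `reshape_requested` parameter below are illustrative, not from this PR:

```cpp
#include <cassert>

// Possible shape-handling paths for a subgraph (names are hypothetical).
enum class ShapeStrategy {
  kStatic,           // model is already static: compile as-is
  kDynamicBackend,   // keep dynamic shapes and use the dynamic backend
  kReshapeToBounds,  // rewrite to bounded/static shapes before compiling
};

// Decide the path from the two existing flags plus whether the user passed
// reshape_input, instead of scattering per-device (CPU/GPU/NPU) checks.
ShapeStrategy PickShapeStrategy(bool has_dynamic_input_shape,
                                bool disable_dynamic_shapes,
                                bool reshape_requested) {
  if (!has_dynamic_input_shape && !reshape_requested) {
    return ShapeStrategy::kStatic;
  }
  // NPU sets disable_dynamic_shapes, so dynamic models must be rewritten
  // to bounds; an explicit reshape_input request also forces the rewrite.
  if (disable_dynamic_shapes || reshape_requested) {
    return ShapeStrategy::kReshapeToBounds;
  }
  return ShapeStrategy::kDynamicBackend;
}
```

Centralizing the decision this way would keep device specifics out of the backend and document the dynamism policy in a single place.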
@@ -79,6 +80,7 @@ struct ProviderInfo {
    uint32_t num_of_threads{0};  // [num_of_threads]: Overrides the accelerator default value of
                                 // number of threads with this value at runtime.
    config_t load_config{};      // JSON config map to load custom OV parameters.
+   shape_t shape{};             // Used for reshaping ov tensors to a particular lower and upper bound
Would it be possible to name this variable after the EP-specific option, "reshape_input", or at least mention that name in the comment?
As this is the only place where the OVEP-specific options are listed in a comprehensible format, we refer to these names/comments.
Could the description of this also be added to https://github.com/intel/onnxruntime/blob/master/onnxruntime/test/perftest/command_args_parser.cc, please?
By the way, is there documentation dedicated to the OVEP-specific runtime options?
There's also the "valid_provider_keys" field (added recently, probably):

const std::unordered_set<std::string> valid_provider_keys = {"device_type", "device_id", "device_luid", "cache_dir", "precision",

Could this be added there too?
Reshape feature implementation. This feature lets you set a lower and upper bound for OV tensors, only for NPU. Command used to run the feature:

onnxruntime_perf_test.exe -v -e openvino -m times -r 1 -i "device_type|NPU reshape_input|data[1,3,60,80..120]" <model_path>