[Backend] Support onnxruntime DirectML inference. (#1304)
* Fix links in readme
* Fix links in readme
* Update PPOCRv2/v3 examples
* Update auto compression configs
* Add new quantization support for paddleclas model
* Update quantized Yolov6s model download link
* Improve PPOCR comments
* Add English doc for quantization
* Fix PPOCR rec model bug
* Add new paddleseg quantization support
* Add new paddleseg quantization support
* Add new paddleseg quantization support
* Add new paddleseg quantization support
* Add Ascend model list
* Add ascend model list
* Add ascend model list
* Add ascend model list
* Add ascend model list
* Add ascend model list
* Add ascend model list
* Support DirectML in onnxruntime
* Support onnxruntime DirectML
* Support onnxruntime DirectML
* Support onnxruntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Support OnnxRuntime DirectML
* Remove DirectML vision model example
* Improve OnnxRuntime DirectML
* Improve OnnxRuntime DirectML
* fix opencv cmake in Windows
* recheck codestyle
Showing 22 changed files with 393 additions and 60 deletions.
@@ -0,0 +1,59 @@
[English](../../en/build_and_install/directml.md) | 简体中文

# How to Build the DirectML Deployment Library
Direct Machine Learning (DirectML) is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows.
FastDeploy's ONNX Runtime backend now integrates DirectML, so users can deploy models on AMD/Intel/Nvidia/Qualcomm GPUs that support DirectX 12.

More details:
- [ONNX Runtime DirectML Execution Provider](https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html)

# DirectML Requirements
- Build requirements: Visual Studio 2017 toolchain or newer.
- Operating system: Windows 10, version 1903 or newer. (DirectML ships as part of the operating system and does not need to be installed separately.)
- Hardware requirements: a graphics card that supports DirectX 12, e.g., AMD GCN 1st generation or newer / Intel Haswell HD integrated graphics or newer / Nvidia Kepler architecture or newer / Qualcomm Adreno 600 or newer.

# Building the DirectML Deployment Library
DirectML is integrated through the ONNX Runtime backend, so using DirectML requires enabling the ONNX Runtime build option as well. FastDeploy's DirectML support can be built for both x64 and x86 (Win32) targets.

x64 example: from the Windows menu, open `x64 Native Tools Command Prompt for VS 2019` and run the following commands
```bat
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
cmake .. -G "Visual Studio 16 2019" -A x64 ^
-DWITH_DIRECTML=ON ^
-DENABLE_ORT_BACKEND=ON ^
-DENABLE_VISION=ON ^
-DCMAKE_INSTALL_PREFIX="D:\Paddle\compiled_fastdeploy"
msbuild fastdeploy.sln /m /p:Configuration=Release /p:Platform=x64
msbuild INSTALL.vcxproj /m /p:Configuration=Release /p:Platform=x64
```
Once the build finishes, the C++ inference library is generated in the directory specified by `CMAKE_INSTALL_PREFIX`.
If you use the CMake GUI, refer to [How to Compile with CMakeGUI + Visual Studio 2019 IDE on Windows](../faq/build_on_win_with_gui.md).

x86 (Win32) example: from the Windows menu, open `x86 Native Tools Command Prompt for VS 2019` and run the following commands
```bat
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
cmake .. -G "Visual Studio 16 2019" -A Win32 ^
-DWITH_DIRECTML=ON ^
-DENABLE_ORT_BACKEND=ON ^
-DENABLE_VISION=ON ^
-DCMAKE_INSTALL_PREFIX="D:\Paddle\compiled_fastdeploy"
msbuild fastdeploy.sln /m /p:Configuration=Release /p:Platform=Win32
msbuild INSTALL.vcxproj /m /p:Configuration=Release /p:Platform=Win32
```
Once the build finishes, the C++ inference library is generated in the directory specified by `CMAKE_INSTALL_PREFIX`.
If you use the CMake GUI, refer to [How to Compile with CMakeGUI + Visual Studio 2019 IDE on Windows](../faq/build_on_win_with_gui.md).

# Using the DirectML Library
The DirectML build of the library is used on Windows in the same way as the builds for other hardware; see the links below and the short runtime sketch that follows.
- [Multiple ways to use the FastDeploy C++ library on Windows](../faq/use_sdk_on_windows_build.md)
- [Using the FastDeploy C++ SDK on Windows](../faq/use_sdk_on_windows.md)
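For orientation, here is a minimal C++ sketch of selecting DirectML at runtime. It is adapted from the runtime example added in this commit; the model and parameter paths are placeholders for your own exported Paddle model.

```c++
#include "fastdeploy/runtime.h"

namespace fd = fastdeploy;

int main() {
  fd::RuntimeOption option;
  // Placeholder paths: substitute your own exported Paddle model files.
  option.SetModelPath("mobilenetv2/inference.pdmodel",
                      "mobilenetv2/inference.pdiparams",
                      fd::ModelFormat::PADDLE);
  // DirectML is exposed through the ONNX Runtime backend, so enable both.
  option.UseOrtBackend();
  option.UseDirectML();
  fd::Runtime runtime;
  return runtime.Init(option) ? 0 : -1;
}
```

The full example added in this commit additionally shows loading the model from an in-memory buffer and running inference on a randomly filled input tensor.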
@@ -0,0 +1,57 @@
English | [中文](../../cn/build_and_install/directml.md)

# How to Build the DirectML Deployment Environment
Direct Machine Learning (DirectML) is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows systems.
Currently, FastDeploy's ONNX Runtime backend has DirectML integrated, allowing users to deploy models on AMD/Intel/Nvidia/Qualcomm GPUs with DirectX 12 support.

More details:
- [ONNX Runtime DirectML Execution Provider](https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html)

# DirectML requirements
- Compilation requirements: Visual Studio 2017 toolchain and above.
- Operating system: Windows 10, version 1903, and newer. (DirectML is part of the operating system and does not need to be installed separately)
- Hardware requirements: DirectX 12 supported graphics cards, e.g., AMD GCN 1st generation and above / Intel Haswell HD integrated graphics and above / Nvidia Kepler architecture and above / Qualcomm Adreno 600 and above.

# How to Build and Install the DirectML C++ SDK
DirectML is integrated through the ONNX Runtime backend, so to use DirectML, users need to turn on the option that compiles ONNX Runtime. FastDeploy's DirectML support can be built for both x64 and x86 (Win32) architectures.

For the x64 build, in the Windows menu, find `x64 Native Tools Command Prompt for VS 2019`, open it, and execute the following commands
```bat
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
cmake .. -G "Visual Studio 16 2019" -A x64 ^
-DWITH_DIRECTML=ON ^
-DENABLE_ORT_BACKEND=ON ^
-DENABLE_VISION=ON ^
-DCMAKE_INSTALL_PREFIX="D:\Paddle\compiled_fastdeploy"
msbuild fastdeploy.sln /m /p:Configuration=Release /p:Platform=x64
msbuild INSTALL.vcxproj /m /p:Configuration=Release /p:Platform=x64
```
Once compiled, the C++ inference library is generated in the directory specified by `CMAKE_INSTALL_PREFIX`.
If you use the CMake GUI, please refer to [How to Compile with CMakeGUI + Visual Studio 2019 IDE on Windows](../faq/build_on_win_with_gui.md).

For the x86 (Win32) build, in the Windows menu, find `x86 Native Tools Command Prompt for VS 2019`, open it, and execute the following commands
```bat
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
cmake .. -G "Visual Studio 16 2019" -A Win32 ^
-DWITH_DIRECTML=ON ^
-DENABLE_ORT_BACKEND=ON ^
-DENABLE_VISION=ON ^
-DCMAKE_INSTALL_PREFIX="D:\Paddle\compiled_fastdeploy"
msbuild fastdeploy.sln /m /p:Configuration=Release /p:Platform=Win32
msbuild INSTALL.vcxproj /m /p:Configuration=Release /p:Platform=Win32
```
Once compiled, the C++ inference library is generated in the directory specified by `CMAKE_INSTALL_PREFIX`.
If you use the CMake GUI, please refer to [How to Compile with CMakeGUI + Visual Studio 2019 IDE on Windows](../faq/build_on_win_with_gui.md).

# How to Use the Compiled DirectML SDK
The DirectML build of the library is used on Windows in the same way as the builds for other hardware; see the following link and the short sketch after it.
- [Using the FastDeploy C++ SDK on Windows Platform](../faq/use_sdk_on_windows.md)
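As a quick reference, the following is a minimal sketch of turning DirectML on for a FastDeploy `Runtime`; it is distilled from the runtime example added in this commit, and the model/parameter file names are placeholders.

```c++
#include "fastdeploy/runtime.h"

namespace fd = fastdeploy;

int main() {
  fd::RuntimeOption option;
  // Placeholder model files; replace them with your own exported Paddle model.
  option.SetModelPath("mobilenetv2/inference.pdmodel",
                      "mobilenetv2/inference.pdiparams",
                      fd::ModelFormat::PADDLE);
  option.UseOrtBackend();  // DirectML rides on the ONNX Runtime backend
  option.UseDirectML();    // target a DirectX 12 capable GPU
  fd::Runtime runtime;
  if (!runtime.Init(option)) {
    return -1;
  }
  return 0;
}
```

The complete, runnable version of this flow, including loading the model from an in-memory buffer and inference on a random input tensor, is the C++ example added in this commit.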
@@ -0,0 +1,77 @@
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "fastdeploy/runtime.h"

namespace fd = fastdeploy;

int main(int argc, char* argv[]) {
  // create option
  fd::RuntimeOption runtime_option;

  // model and param files
  std::string model_file = "mobilenetv2/inference.pdmodel";
  std::string params_file = "mobilenetv2/inference.pdiparams";

  // read model from disk
  // runtime_option.SetModelPath(model_file, params_file,
  //                             fd::ModelFormat::PADDLE);

  // read model from buffer
  std::string model_buffer, params_buffer;
  fd::ReadBinaryFromFile(model_file, &model_buffer);
  fd::ReadBinaryFromFile(params_file, &params_buffer);
  runtime_option.SetModelBuffer(model_buffer, params_buffer,
                                fd::ModelFormat::PADDLE);

  // setup other option
  runtime_option.SetCpuThreadNum(12);
  // use ONNX Runtime DirectML
  runtime_option.UseOrtBackend();
  runtime_option.UseDirectML();

  // init runtime
  std::unique_ptr<fd::Runtime> runtime =
      std::unique_ptr<fd::Runtime>(new fd::Runtime());
  if (!runtime->Init(runtime_option)) {
    std::cerr << "--- Init FastDeploy Runtime Failed! "
              << "\n--- Model: " << model_file << std::endl;
    return -1;
  } else {
    std::cout << "--- Init FastDeploy Runtime Done! "
              << "\n--- Model: " << model_file << std::endl;
  }
  // init input tensor shape
  fd::TensorInfo info = runtime->GetInputInfo(0);
  info.shape = {1, 3, 224, 224};

  std::vector<fd::FDTensor> input_tensors(1);
  std::vector<fd::FDTensor> output_tensors(1);

  std::vector<float> inputs_data;
  inputs_data.resize(1 * 3 * 224 * 224);
  for (size_t i = 0; i < inputs_data.size(); ++i) {
    inputs_data[i] = std::rand() % 1000 / 1000.0f;
  }
  input_tensors[0].SetExternalData({1, 3, 224, 224}, fd::FDDataType::FP32,
                                   inputs_data.data());

  // get input name
  input_tensors[0].name = info.name;

  runtime->Infer(input_tensors, &output_tensors);

  output_tensors[0].PrintInfo();
  return 0;
}