[Doc] Add doc for vdl serving (#1110)
* add doc for vdl serving

* fix link

* fix gif size

* add english version

* fix links

* update format

* update docs

---------

Co-authored-by: heliqi <[email protected]>
rainyfly and heliqi authored Jan 30, 2023
1 parent 8c651f9 commit 595ca69
Showing 19 changed files with 689 additions and 29 deletions.
15 changes: 15 additions & 0 deletions examples/text/ernie-3.0/serving/README.md
@@ -175,3 +175,18 @@ entity: 华夏 label: LOC pos: [14, 15]

## Configuration Modification
The current classification task (ernie_seqcls_model/config.pbtxt) is configured by default to run the OpenVINO engine on CPU; the sequence labelling task is configured by default to run the Paddle engine on GPU. If you want to run on CPU/GPU or other inference engines, modify the configuration; please refer to the [configuration document](../../../../serving/docs/EN/model_configuration-en.md).

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../serving/docs/EN/vdl_management-en.md). The model preparation, deployment, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying ERNIE 3.0 as a service through VisualDL takes only three steps:
```text
1. Load the model repository: ./text/ernie-3.0/serving/models
2. Download the model resource files: click the ernie_seqcls_model model, click version number 1 to add a pre-trained model, and select the text classification model ernie_3.0_ernie_seqcls_model to download. Click the ernie_tokencls_model model, click version number 1 to add a pre-trained model, and select the token classification model ernie_tokencls_model to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
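For reference, the "launch server" step roughly corresponds to starting fastdeployserver against the loaded repository, as the manual instructions elsewhere in these docs do. A minimal sketch, assuming the repository path from step 1; the launch is guarded so the sketch is safe to run where fastdeployserver is not installed:

```shell
# Model repository from step 1 (path assumed relative to the serving
# image's working directory)
MODEL_REPO=./text/ernie-3.0/serving/models

# The "launch server" button roughly issues a launch like this; only run
# it when the binary is actually available
if command -v fastdeployserver >/dev/null 2>&1; then
  fastdeployserver --model-repository="$MODEL_REPO"
else
  echo "would run: fastdeployserver --model-repository=$MODEL_REPO"
fi
```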

<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211708353-507d6038-b754-4520-884b-1156703a44c6.gif" width="100%"/>
</p>
14 changes: 14 additions & 0 deletions examples/text/ernie-3.0/serving/README_CN.md
@@ -175,3 +175,17 @@ entity: 华夏 label: LOC pos: [14, 15]
## Configuration Modification

The current classification task (ernie_seqcls_model/config.pbtxt) is configured by default to run the OpenVINO engine on CPU; the sequence labelling task is configured by default to run the Paddle engine on GPU. To run on CPU/GPU or other inference engines, modify the configuration; see the [configuration document](../../../../serving/docs/zh_CN/model_configuration.md) for details.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../serving/docs/zh_CN/vdl_management.md). The service launch, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying ERNIE 3.0 as a service through VisualDL's interface takes only three steps:
```text
1. Load the model repository: ./text/ernie-3.0/serving/models
2. Download the model resource files: click the ernie_seqcls_model model, click version number 1 to add a pre-trained model, and select the text classification model ernie_3.0_ernie_seqcls_model to download. Click the ernie_tokencls_model model, click version number 1 to add a pre-trained model, and select the token classification model ernie_tokencls_model to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211708353-507d6038-b754-4520-884b-1156703a44c6.gif" width="100%"/>
</p>
17 changes: 16 additions & 1 deletion examples/text/uie/serving/README.md
@@ -1,4 +1,4 @@
English | [简体中文](README_CN.md)

# Example of UIE Serving Deployment

@@ -144,3 +144,18 @@ results:
## Configuration Modification

The current configuration runs the Paddle engine on CPU by default. If you want to run on CPU/GPU or other inference engines, modify the configuration; please refer to the [Configuration Document](../../../../serving/docs/EN/model_configuration-en.md).

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../serving/docs/EN/vdl_management-en.md). The model preparation, deployment, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying UIE as a service through VisualDL takes only three steps:
```text
1. Load the model repository: ./text/uie/serving/models
2. Download the model resource file: click the uie model, click version number 1 to add a pre-trained model, and select the information extraction model uie-base to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
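Once the service is launched, its readiness can be probed over the Triton-compatible HTTP endpoint. This is a sketch under the assumption that the server exposes the default KServe health route on port 8000; it prints a status either way, so it is safe to run even when no server is listening:

```shell
# Probe the KServe/Triton health endpoint; -f makes curl fail on
# non-2xx responses so the branch reflects actual readiness
if curl -sf http://localhost:8000/v2/health/ready >/dev/null 2>&1; then
  echo "server ready"
else
  echo "server not ready"
fi
```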

<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211708353-507d6038-b754-4520-884b-1156703a44c6.gif" width="100%"/>
</p>
14 changes: 14 additions & 0 deletions examples/text/uie/serving/README_CN.md
@@ -143,3 +143,17 @@ results:
## Configuration Modification

The current default configuration runs the Paddle engine on GPU. To run on CPU/GPU or other inference engines, modify the configuration; see the [configuration document](../../../../serving/docs/zh_CN/model_configuration.md) for details.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../serving/docs/zh_CN/vdl_management.md). The service launch, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying UIE as a service through VisualDL's interface takes only three steps:
```text
1. Load the model repository: ./text/uie/serving/models
2. Download the model resource file: click the uie model, click version number 1 to add a pre-trained model, and select the information extraction model uie-base to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211709329-3261758e-69af-4efd-9711-693f5f031131.gif" width="100%"/>
</p>
36 changes: 25 additions & 11 deletions examples/vision/classification/paddleclas/serving/README.md
@@ -1,7 +1,7 @@
English | [简体中文](README_CN.md)
# PaddleClas Service Deployment Example

Before the service deployment, please confirm

- 1. Refer to [FastDeploy Service Deployment](../../../../../serving/README.md) for software and hardware environment requirements and image pull commands.

@@ -13,25 +13,25 @@ Before the service deployment
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/classification/paddleclas/serving

# Download ResNet50_vd model files and test images
wget https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz
tar -xvf ResNet50_vd_infer.tgz
wget https://gitee.com/paddlepaddle/PaddleClas/raw/release/2.4/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg

# Put the configuration file into the preprocessing directory
mv ResNet50_vd_infer/inference_cls.yaml models/preprocess/1/inference_cls.yaml

# Place the model files under models/runtime/1 and rename them to model.pdmodel and model.pdiparams
mv ResNet50_vd_infer/inference.pdmodel models/runtime/1/model.pdmodel
mv ResNet50_vd_infer/inference.pdiparams models/runtime/1/model.pdiparams

# Pull the fastdeploy image (x.y.z represents the image version; refer to the serving document to replace it with numbers)
# GPU image
docker pull registry.baidubce.com/paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10
# CPU image
docker pull registry.baidubce.com/paddlepaddle/fastdeploy:x.y.z-cpu-only-21.10

# Run the container named fd_serving and mount it in the /serving directory of the container
nvidia-docker run -it --net=host --name fd_serving -v `pwd`/:/serving registry.baidubce.com/paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10 bash

# Start the service (if the CUDA_VISIBLE_DEVICES environment variable is not set, the service is granted scheduling authority over all GPU cards)
@@ -43,7 +43,7 @@ CUDA_VISIBLE_DEVICES=0 fastdeployserver --model-repository=/serving/models --bac
>> If "Address already in use" appears when running fastdeployserver to start the service, use `--grpc-port` to specify the port number and change the request port number in the client demo.
>> Other startup parameters can be checked by fastdeployserver --help
A successful service start produces the following output:
```
@@ -54,17 +54,17 @@ I0928 04:51:15.826578 206 http_server.cc:167] Started Metrics Service at 0.0.0.0
```


## Client Request

Execute the following commands on the physical machine to send the gRPC request and print the result
```
# Download test images
wget https://gitee.com/paddlepaddle/PaddleClas/raw/release/2.4/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg
# Install client dependencies
python3 -m pip install tritonclient\[all\]
# Send the request
python3 paddlecls_grpc_client.py
```

@@ -77,3 +77,17 @@ output_name: CLAS_RESULT
## Configuration Change

The current default configuration runs the TensorRT engine on GPU. If you want to run it on CPU or other inference engines, please modify the configuration in `models/runtime/config.pbtxt`. Refer to [Configuration Document](../../../../../serving/docs/EN/model_configuration-en.md) for more information.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/EN/vdl_management-en.md). The model preparation, deployment, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying PaddleClas as a service through VisualDL takes only three steps:
```text
1. Load the model repository: ./vision/classification/paddleclas/serving/models
2. Download the model resource file: click the runtime model, click version number 1 to add a pre-trained model, and select the image classification model ResNet50_vd to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
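The download in step 2 populates the same layout that the manual preparation earlier in this README builds with `mv`. A sketch of that expected layout, with the file names taken from the manual steps and created empty in a scratch directory purely for illustration:

```shell
# Work in a scratch directory so nothing in a real repository is touched
cd "$(mktemp -d)"

# Recreate the layout the manual `mv` commands produce
mkdir -p models/preprocess/1 models/runtime/1
: > models/preprocess/1/inference_cls.yaml
: > models/runtime/1/model.pdmodel
: > models/runtime/1/model.pdiparams

# Show the resulting tree
find models -type f | sort
```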
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211708702-828d8ad8-4e85-457f-9c62-12f53fc81853.gif" width="100%"/>
</p>
14 changes: 14 additions & 0 deletions examples/vision/classification/paddleclas/serving/README_CN.md
@@ -77,3 +77,17 @@ output_name: CLAS_RESULT
## Configuration Modification

The current default configuration runs the TensorRT engine on GPU. To run on CPU or other inference engines, modify the configuration in `models/runtime/config.pbtxt`; see the [configuration document](../../../../../serving/docs/zh_CN/model_configuration.md) for details.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/zh_CN/vdl_management.md). The service launch, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying PaddleClas as a service through VisualDL's interface takes only three steps:
```text
1. Load the model repository: ./vision/classification/paddleclas/serving/models
2. Download the model resource file: click the runtime model, click version number 1 to add a pre-trained model, and select the image classification model ResNet50_vd to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211708702-828d8ad8-4e85-457f-9c62-12f53fc81853.gif" width="100%"/>
</p>
29 changes: 22 additions & 7 deletions examples/vision/detection/paddledetection/serving/README.md
@@ -17,12 +17,12 @@ Confirm before the serving deployment
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/detection/paddledetection/serving

# Download PPYOLOE model files and test images
wget https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_crn_l_300e_coco.tgz
wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
tar xvf ppyoloe_crn_l_300e_coco.tgz

# Put the configuration file into the preprocessing directory
mv ppyoloe_crn_l_300e_coco/infer_cfg.yml models/preprocess/1/

# Place the model under models/runtime/1 and rename them to model.pdmodel and model.pdiparams
@@ -36,14 +36,14 @@ cp models/runtime/ppyoloe_runtime_config.pbtxt models/runtime/config.pbtxt

# Attention: Given that the mask_rcnn model has one more output, we need to rename mask_config.pbtxt to config.pbtxt in the postprocess directory (models/postprocess)

# Pull the FastDeploy image (x.y.z represents the image version; users need to replace it with numbers)
# GPU image
docker pull registry.baidubce.com/paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10
# CPU image
docker pull paddlepaddle/fastdeploy:x.y.z-cpu-only-21.10


# Run the container named fd_serving and mount it in the /serving directory of the container
nvidia-docker run -it --net=host --name fd_serving --shm-size="1g" -v `pwd`/:/serving registry.baidubce.com/paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10 bash

# Start the service (if the CUDA_VISIBLE_DEVICES environment variable is not set, the service is granted scheduling authority over all GPU cards)
@@ -68,14 +68,14 @@ I0928 04:51:15.826578 206 http_server.cc:167] Started Metrics Service at 0.0.0.0
```


## Client Request

Execute the following commands on the physical machine to send the gRPC request and print the results
```
# Download test images
wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
# Install client dependencies
python3 -m pip install tritonclient[all]
# Send requests
@@ -93,3 +93,18 @@ output_name: DET_RESULT
## Configuration Change

The current default configuration runs on GPU. If you want to run it on CPU or other inference engines, please modify the configuration in `models/runtime/config.pbtxt`. Refer to [Configuration Document](../../../../../serving/docs/EN/model_configuration-en.md) for more information.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/EN/vdl_management-en.md). The model preparation, deployment, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying PaddleDetection as a service through VisualDL takes only four steps:
```text
1. Load the model repository: ./vision/detection/paddledetection/serving/models
2. Download the model resource files: click the preprocess model, click version number 1 to add a pre-trained model, and select the detection model ppyoloe_crn_l_300e_coco to download. Click the runtime model, click version number 1 to add a pre-trained model, and select the detection model ppyoloe_crn_l_300e_coco to download.
3. Set the startup config files: click the "ensemble configuration" button, choose the configuration file ppyoloe_config.pbtxt, then click the "set as startup config" button. Click the runtime model, choose the configuration file ppyoloe_runtime_config.pbtxt, then click the "set as startup config" button.
4. Start the service: click the "launch server" button and enter the launch parameters.
```
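The "set as startup config" step mirrors the manual copy shown earlier in this README (`cp models/runtime/ppyoloe_runtime_config.pbtxt models/runtime/config.pbtxt`). A sketch in a scratch directory; the placeholder file contents here are assumptions for illustration only:

```shell
# Scratch directory so no real model repository is modified
cd "$(mktemp -d)"
mkdir -p models/runtime

# Placeholder config file (contents are illustrative, not the real config)
printf 'backend: "fastdeploy"\n' > models/runtime/ppyoloe_runtime_config.pbtxt

# Setting ppyoloe_runtime_config.pbtxt as startup config corresponds to
# copying it over config.pbtxt, exactly as the manual instructions do
cp models/runtime/ppyoloe_runtime_config.pbtxt models/runtime/config.pbtxt

cat models/runtime/config.pbtxt
```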
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211710983-2d1f1427-6738-409d-903b-2b4e4ab6cbfc.gif" width="100%"/>
</p>
16 changes: 16 additions & 0 deletions examples/vision/detection/paddledetection/serving/README_CN.md
@@ -93,3 +93,19 @@ output_name: DET_RESULT
## Configuration Modification

The current default configuration runs the Paddle engine on GPU. To run on CPU or other inference engines, modify the configuration in `models/runtime/config.pbtxt`; see the [configuration document](../../../../../serving/docs/zh_CN/model_configuration.md) for details.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/zh_CN/vdl_management.md). The service launch, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying PaddleDetection as a service through VisualDL's interface takes only four steps:
```text
1. Load the model repository: ./vision/detection/paddledetection/serving/models
2. Download the model resource files: click the preprocess model, click version number 1 to add a pre-trained model, and select the detection model ppyoloe_crn_l_300e_coco to download; preprocess will then contain the resource file infer_cfg.yml. Click the runtime model, click version number 1 to add a pre-trained model, and select the detection model ppyoloe_crn_l_300e_coco to download; runtime will then contain the resource files model.pdmodel and model.pdiparams.
3. Set the startup config files: click the ensemble configuration button, choose the configuration file ppyoloe_config.pbtxt, and set it as the startup config file. Click the runtime model, choose the configuration file ppyoloe_runtime_config.pbtxt, and set it as the startup config file.
4. Start the service: click the launch server button and enter the launch parameters.
```
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211710983-2d1f1427-6738-409d-903b-2b4e4ab6cbfc.gif" width="100%"/>
</p>


14 changes: 14 additions & 0 deletions examples/vision/detection/yolov5/serving/README.md
@@ -56,3 +56,17 @@ output_name: detction_result


The default configuration runs the ONNXRuntime engine on CPU. If developers need to run it on GPU or other inference engines, please see the [Configs File](../../../../../serving/docs/EN/model_configuration-en.md) and modify the configs in `models/runtime/config.pbtxt`.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/EN/vdl_management-en.md). The model preparation, deployment, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying yolov5 as a service through VisualDL takes only three steps:
```text
1. Load the model repository: ./vision/detection/yolov5/serving/models
2. Download the model resource file: click the runtime model, click version number 1 to add a pre-trained model, and select the detection model yolov5s to download.
3. Start the service: click the "launch server" button and enter the launch parameters.
```
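If "Address already in use" appears at launch, an alternative gRPC port can be passed among the launch parameters, using the `--grpc-port` flag mentioned in the serving docs. A sketch, where the repository path comes from step 1, the port number is an arbitrary example, and the launch is guarded so it is safe to run where fastdeployserver is not installed:

```shell
# Model repository from step 1 (path assumed)
MODEL_REPO=./vision/detection/yolov5/serving/models

# Launch with an explicit gRPC port to avoid clashing with a server that
# already holds the default port; only run when the binary exists
if command -v fastdeployserver >/dev/null 2>&1; then
  fastdeployserver --model-repository="$MODEL_REPO" --grpc-port=8801
else
  echo "would run: fastdeployserver --model-repository=$MODEL_REPO --grpc-port=8801"
fi
```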
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211709339-023fef22-3ffc-4b3d-bce5-ea4202bb9c61.gif" width="100%"/>
</p>
15 changes: 15 additions & 0 deletions examples/vision/detection/yolov5/serving/README_CN.md
@@ -65,3 +65,18 @@ output_name: detction_result
## Configuration Modification

The current default configuration runs the ONNXRuntime engine on CPU. To run on GPU or other inference engines, modify the configuration in `models/runtime/config.pbtxt`; see the [configuration document](../../../../../serving/docs/zh_CN/model_configuration.md) for details.

## Use VisualDL for serving deployment visualization

You can use VisualDL for [serving deployment visualization](../../../../../serving/docs/zh_CN/vdl_management.md). The service launch, configuration modification, and client request operations described above can all be performed through VisualDL.

Deploying yolov5 as a service through VisualDL's interface takes only three steps:
```text
1. Load the model repository: ./vision/detection/yolov5/serving/models
2. Download the model resource file: click the runtime model, click version number 1 to add a pre-trained model, and select the detection model yolov5s to download.
3. Start the service: click the launch server button and enter the launch parameters.
```
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/211709339-023fef22-3ffc-4b3d-bce5-ea4202bb9c61.gif" width="100%"/>
</p>