A simple Triton backend that copies input tensors to the corresponding output tensors. This backend is used primarily for testing. To learn more about writing your own Triton backend, including simple examples, see the documentation included in the backend repo.
Ask questions or report problems with the Identity backend on the main Triton issues page.
Use cmake to build and install in a local directory.
$ mkdir build
$ cd build
$ cmake -DTRITON_ENABLE_GPU=ON -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
$ make install
The following required Triton repositories will be pulled and used in the build. By default, the "main" branch/tag is used for each repo, but the listed CMake argument can be used to override it.
- triton-inference-server/backend: -DTRITON_BACKEND_REPO_TAG=[tag]
- triton-inference-server/core: -DTRITON_CORE_REPO_TAG=[tag]
- triton-inference-server/common: -DTRITON_COMMON_REPO_TAG=[tag]
If you are building on a release branch (or on a development branch that is based on a release branch), then you must set these CMake arguments to point to that release branch as well. For example, if you are building the r23.04 identity_backend branch, then you need to use the following additional CMake flags:
-DTRITON_BACKEND_REPO_TAG=r23.04
-DTRITON_CORE_REPO_TAG=r23.04
-DTRITON_COMMON_REPO_TAG=r23.04
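As a rough sketch, the three release flags can be assembled from a single variable so that they cannot drift apart. RELEASE_TAG and TAG_FLAGS are hypothetical helper variables introduced here for illustration; the build itself does not read them. The snippet only prints the resulting command so it can be checked before it is run.

```shell
# RELEASE_TAG and TAG_FLAGS are hypothetical helper variables (assumptions),
# not names the identity backend build reads on its own.
RELEASE_TAG=r23.04
TAG_FLAGS="-DTRITON_BACKEND_REPO_TAG=${RELEASE_TAG} -DTRITON_CORE_REPO_TAG=${RELEASE_TAG} -DTRITON_COMMON_REPO_TAG=${RELEASE_TAG}"

# Print the full cmake command instead of running it, so it can be reviewed first.
echo "cmake -DTRITON_ENABLE_GPU=ON -DCMAKE_INSTALL_PREFIX:PATH=$(pwd)/install ${TAG_FLAGS} .."
```

Run the printed command from the build directory, then run make install as in the steps above.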
When TRITON_ENABLE_METRICS is enabled, this backend implements an example of registering a custom metric to Triton's existing metrics endpoint via the Metrics API. This metric will track the cumulative input_byte_size of all requests to this backend per-model. Here's an example output of the custom metric from Triton's metrics endpoint after a few requests to each model:
# HELP input_byte_size_counter Cumulative input byte size of all requests received by the model
# TYPE input_byte_size_counter counter
input_byte_size_counter{model="identity_uint32",version="1"} 64.000000
input_byte_size_counter{model="identity_fp32",version="1"} 32.000000
This example can be referenced to implement custom metrics for various use cases.
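One possible way to consume this counter from a script is to filter the metrics text. The sketch below assumes the metrics output is supplied on standard input; sum_input_bytes is a hypothetical helper name, not part of Triton. It is fed the sample output shown above and adds up the counter across all models.

```shell
# sum_input_bytes is a hypothetical helper (an assumption, not a Triton tool).
# It reads metrics text on stdin and prints the sum of every
# input_byte_size_counter sample, ignoring the "# HELP" and "# TYPE" lines.
sum_input_bytes() {
  awk '/^input_byte_size_counter[{]/ { total += $NF } END { printf "%d\n", total }'
}

# Sample metrics text copied from the example output in this document.
sample='# HELP input_byte_size_counter Cumulative input byte size of all requests received by the model
# TYPE input_byte_size_counter counter
input_byte_size_counter{model="identity_uint32",version="1"} 64.000000
input_byte_size_counter{model="identity_fp32",version="1"} 32.000000'

total=$(printf '%s\n' "$sample" | sum_input_bytes)
echo "$total"   # prints 96 for the sample above (64 + 32)
```

Against a running server, the same function could be fed from the metrics endpoint, for example curl -s localhost:8002/metrics | sum_input_bytes, assuming the server exposes metrics on port 8002 of the local host; adjust the address to match your deployment.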