feat: TensorRT AOT Plugin #3504
Conversation
There are some changes that do not conform to Python style guidelines:
--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/aot_plugin.py 2025-05-05 05:52:23.878918+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/aot_plugin.py 2025-05-05 05:52:44.176344+00:00
@@ -23,13 +23,11 @@
output = x + 1
tl.store(y_ptr + offsets, output, mask=mask)
@torch.library.custom_op("my::add_one", mutates_args=()) # type: ignore[misc]
-def add_one(
- X: torch.Tensor
-) -> torch.Tensor:
+def add_one(X: torch.Tensor) -> torch.Tensor:
# Ensure the tensors are on the GPU
assert X.is_cuda
# Create output tensor
Y = torch.empty_like(X)
@@ -53,19 +51,22 @@
# torch_tensorrt.dynamo.conversion.plugins.generate_plugin(
# "my::add_one"
# )
+
@trtp.register("my::add_one")
def add_plugin_desc(X: trtp.TensorDesc) -> Tuple[trtp.TensorDesc]:
return X.like()
+
@trtp.aot_impl("my::add_one")
def add_plugin_aot_impl(
X: trtp.TensorDesc, outputs: Tuple[trtp.TensorDesc], tactic: int
-) -> Tuple[Union[str, bytes], Union[str, bytes], trtp.KernelLaunchParams, trtp.SymExprs]:
-
+) -> Tuple[
+ Union[str, bytes], Union[str, bytes], trtp.KernelLaunchParams, trtp.SymExprs
+]:
type_str = "fp32" if X.dtype == trt.float32 else "fp16"
block_size = 256
src = triton.compiler.ASTSource(
@@ -101,10 +102,11 @@
compiled_kernel.asm["ptx"],
launch_params,
extra_args,
)
+
torch_tensorrt.dynamo.conversion.plugins.generate_plugin_converter(
"my::add_one",
supports_dynamic_shapes=False,
requires_output_allocator=False,
aot=True,
@@ -127,18 +129,15 @@
parser.add_argument(
"--aot", action="store_true", help="Try to use AOT compilation", default=False
)
args = parser.parse_args()
-
-
my_model = MyModel().to("cuda")
m = torch.full((64, 64), 2, device="cuda", dtype=torch.float)
# This works!
assert my_model(X=m)[0][0] == 3.0
-
with torch_tensorrt.logging.debug():
trt_inputs = [m]
model_trt = torch_tensorrt.compile(
my_model,
@@ -151,6 +150,6 @@
for i in range(10):
res = model_trt(m)
assert torch.allclose(res, my_model(m)), "Results do not match!"
print("Inference successful!")
- print(res)
\ No newline at end of file
+ print(res)
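For readers skimming the lint diff: the first hunk starts mid-kernel. The full Triton kernel in the example is a standard elementwise kernel along the lines of the reconstruction below; the visible tail (`output = x + 1`, `tl.store(y_ptr + offsets, output, mask=mask)`) is from the diff, while the signature and the parameter names other than `y_ptr`, `offsets`, and `mask` are assumptions.

```python
import triton
import triton.language as tl


@triton.jit
def add_one_kernel(x_ptr, n_elements, y_ptr, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    # Mask out-of-range lanes so the last block is safe.
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    output = x + 1
    tl.store(y_ptr + offsets, output, mask=mask)
```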
py/torch_tensorrt/dynamo/conversion/plugins/_generate_plugin_converter.py
@@ -31,7 +31,7 @@ def _generate_plugin_converter(
     priority: ConverterPriority = ConverterPriority.STANDARD,
     supports_dynamic_shapes: bool = False,
     requires_output_allocator: bool = False,
-    aot: bool = False,
+    use_aot_if_available: bool = False,
Default to true
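Applying the suggested rename (and, per the comment above, a `True` default) to the example's registration call would give something like the sketch below; the fall-back-to-JIT behavior implied by the new name is an assumption, not confirmed in this thread.

```python
torch_tensorrt.dynamo.conversion.plugins.generate_plugin_converter(
    "my::add_one",
    supports_dynamic_shapes=False,
    requires_output_allocator=False,
    # Renamed from `aot`; assumed to fall back to the JIT plugin path
    # when no AOT implementation is registered for the op.
    use_aot_if_available=True,
)
```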
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
This PR demonstrates how to use an AOT (ahead-of-time compiled) TensorRT plugin in Torch-TensorRT.
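Concretely, the example wires up the AOT plugin in four steps. The condensed sketch below mirrors examples/dynamo/aot_plugin.py from this PR, with the kernel-launch and PTX-compilation details elided (see the diff above for the full bodies).

```python
from typing import Tuple

import tensorrt.plugin as trtp
import torch
import torch_tensorrt


# 1. A PyTorch custom op backed by the Triton kernel (launch code elided).
@torch.library.custom_op("my::add_one", mutates_args=())
def add_one(X: torch.Tensor) -> torch.Tensor: ...


# 2. Shape/dtype propagation for the TensorRT plugin: output matches input.
@trtp.register("my::add_one")
def add_plugin_desc(X: trtp.TensorDesc) -> Tuple[trtp.TensorDesc]:
    return X.like()


# 3. AOT implementation: compiles the Triton kernel to PTX and returns it
#    together with kernel launch parameters (body elided).
@trtp.aot_impl("my::add_one")
def add_plugin_aot_impl(
    X: trtp.TensorDesc, outputs: Tuple[trtp.TensorDesc], tactic: int
): ...


# 4. Auto-generate a Torch-TensorRT converter that uses the AOT plugin.
torch_tensorrt.dynamo.conversion.plugins.generate_plugin_converter(
    "my::add_one",
    supports_dynamic_shapes=False,
    requires_output_allocator=False,
    aot=True,  # renamed to use_aot_if_available during review
)
```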
Fixes # (issue)