
Cuda synchronize alternative for profiling #304

Open
@aimilefth


Greetings,

I am currently using TF-TRT and I want to measure the performance of my models (latency, throughput).

The TensorRT C++ API provides CUDA synchronization via the CUDA events API: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#cuda-events

On top of that, PyTorch provides the torch.cuda.synchronize() alternative:
https://pytorch.org/docs/stable/generated/torch.cuda.synchronize.html
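For illustration, the pattern I would like to reproduce with TF-TRT looks roughly like this in PyTorch (a minimal sketch; the model path and input shape are placeholders):

```python
import time
import torch

model = torch.jit.load("model.pt").cuda().eval()  # placeholder model
x = torch.randn(1, 3, 224, 224, device="cuda")    # placeholder input

with torch.no_grad():
    for _ in range(10):          # warm-up
        model(x)
    torch.cuda.synchronize()     # make sure warm-up work has finished

    start = time.perf_counter()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()     # block until all queued GPU work is done
    elapsed = time.perf_counter() - start

print(f"Average latency: {elapsed / 100 * 1000:.2f} ms")
```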

However, in the TF-TRT docs I can't find anything similar, which in my opinion is essential for correctly measuring performance metrics.
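The closest workaround I have found is to force a device-to-host copy of the outputs (e.g. via `.numpy()`) before stopping the timer, so the measurement only ends once the GPU work has actually completed. A minimal sketch, assuming a TF-TRT converted SavedModel at a placeholder path and a placeholder input shape:

```python
import time
import numpy as np
import tensorflow as tf

# Placeholder path to a TF-TRT converted SavedModel.
model = tf.saved_model.load("tftrt_saved_model")
infer = model.signatures["serving_default"]

# Placeholder input; shape/dtype depend on the actual model.
x = tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32))

for _ in range(10):                            # warm-up (engine build, lazy init)
    outputs = infer(x)
_ = [t.numpy() for t in outputs.values()]      # drain warm-up work before timing

runs = 100
start = time.perf_counter()
for _ in range(runs):
    outputs = infer(x)
# Copying the outputs to the host blocks until the GPU has finished,
# standing in for an explicit synchronize before stopping the timer.
_ = [t.numpy() for t in outputs.values()]
elapsed = time.perf_counter() - start

print(f"Average latency: {elapsed / runs * 1000:.2f} ms")
print(f"Throughput: {runs / elapsed:.2f} inferences/s")
```

This relies on the device-to-host copy as an implicit barrier; an explicit synchronization primitive in TF-TRT would make such measurements less fragile.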

Have I missed anything, or are there plans to integrate such functionality?

Thank you
