CUDA synchronize alternative for profiling #304
Description
Greetings,
I am currently using TF-TRT and I want to measure the performance of my models (latency, throughput).
The TensorRT C++ API offers CUDA synchronization via the CUDA events API: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#cuda-events
On top of that, PyTorch provides torch.cuda.synchronize() as an alternative:
https://pytorch.org/docs/stable/generated/torch.cuda.synchronize.html
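For context, the kind of measurement I have in mind looks like this in PyTorch (a minimal sketch with a placeholder model, not my actual workload):

```python
import time
import torch

# Placeholder model and input -- any GPU workload illustrates the pattern.
model = torch.nn.Linear(1024, 1024).cuda().eval()
x = torch.randn(64, 1024, device="cuda")

torch.cuda.synchronize()            # make sure pending GPU work is done
start = time.perf_counter()
with torch.no_grad():
    y = model(x)
torch.cuda.synchronize()            # block until the forward pass completes
elapsed = time.perf_counter() - start
print(f"Latency: {elapsed * 1000:.2f} ms")
```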
However, in the TF-TRT docs I can't find anything similar, which in my opinion is essential in order to correctly measure performance metrics.
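As far as I can tell, the closest workaround is to force a device-to-host copy of the output so the timer only stops once the GPU has actually finished. A minimal sketch, assuming a converted TF-TRT SavedModel (the path, input shape, and signature name below are placeholders):

```python
import time
import numpy as np
import tensorflow as tf

# Placeholders -- adjust to your converted model and input shape.
saved_model_dir = "my_trt_saved_model"
batch = np.random.rand(1, 224, 224, 3).astype(np.float32)

model = tf.saved_model.load(saved_model_dir)
infer = model.signatures["serving_default"]
inp = tf.constant(batch)

# Warm-up so engine building / memory allocation stays out of the timing.
for _ in range(10):
    infer(inp)

# The .numpy() call copies the result to the host and blocks until the
# GPU has finished, acting as an implicit synchronization point.
n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    out = infer(inp)
    _ = next(iter(out.values())).numpy()
elapsed = time.perf_counter() - start

print(f"Mean latency: {elapsed / n_runs * 1000:.2f} ms")
print(f"Throughput:   {n_runs / elapsed:.1f} inferences/s")
```

But this measures the device-to-host copy as part of the latency, which is exactly why an explicit synchronize would be useful.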
Have I missed anything or are there plans to integrate such functionality?
Thank you