# Create an API for measuring the total runtime of an arbitrary ttnn op chain #16920
- Jan 20, 2025: arminaleTT changed the title from "Create an API for measuring the total runtime of an arbitrary tonne op chain" to "Create an API for measuring the total runtime of an arbitrary ttnn op chain".
- Jan 20-22, 2025: arminaleTT added six commits that referenced this issue, including one described "… call end_trace_capture and release_trace".
- Jan 22, 2025: arminaleTT added two further commits that referenced this issue, including "…n for use during forge compilation (#16921)", whose description is reproduced below.

### Ticket
#16920

### Problem description
Provide an API for the forge optimizer to run arbitrary ttnn ops **during** forge compilation and measure their runtime. These compile-time perf measurements are an alternative to offline perf models while those are being developed for each op.

The API should:
- take an arbitrary callable of ttnn ops and an arbitrary set of arguments
- return the runtime of the callable by actually running it on the device
- match the interface and nomenclature of the L1 constraints API (see PR #15046 and ticket #15291)

### What's changed
- Created a new `get_op_runtime()` API with an interface identical to `get_op_constraints()`.
- Used trace capture for the perf measurement: given an op chain, capture its trace, then execute the trace and report the trace's runtime as the measurement (an illustrative sketch of this flow appears after the activity entries below).
- Enables end-to-end perf measurement without a profiler-enabled build or any dependency on the device profiler.
- Added unit tests demonstrating the functionality for a single op and for a chain of ops. Note: the forge consumer for this API has not been built yet.

### Checklist
- [x] Post commit CI passes
- [ ] Blackhole Post commit (if applicable)
- [ ] Model regression CI testing passes (if applicable)
- [ ] Device performance regression CI testing passes (if applicable)
- [ ] **(For models and ops writers)** Full [new models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) tests pass
- [x] New/Existing tests provide coverage for changes
- Jan 25, 2025: patrickroberts pushed a commit that referenced this issue, carrying the same "…n for use during forge compilation (#16921)" description as above.
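The PR description above outlines the trace-capture approach without code, so here is a minimal, hypothetical sketch of how such a measurement can be structured. The helper name `measure_op_chain_runtime()`, the warm-up run, and the host-side wall-clock timing around a blocking trace replay are illustrative assumptions, not the actual `get_op_runtime()` implementation from PR #16921; the ttnn trace-capture calls are taken from the public ttnn Python API, but exact signatures may differ between releases.

```python
# Hedged sketch of trace-capture-based runtime measurement for a ttnn op chain.
# measure_op_chain_runtime() is a hypothetical helper, not the PR's get_op_runtime().
import time

import ttnn


def measure_op_chain_runtime(device, op_chain, *args, **kwargs):
    """Return (runtime_seconds, output) for one traced execution of op_chain(*args, **kwargs)."""
    # Warm-up run: compiles kernels and allocates buffers outside the timed region.
    op_chain(*args, **kwargs)
    ttnn.synchronize_device(device)

    # Capture the op chain into a trace on command queue 0.
    trace_id = ttnn.begin_trace_capture(device, cq_id=0)
    output = op_chain(*args, **kwargs)
    ttnn.end_trace_capture(device, trace_id, cq_id=0)

    # Replay the captured trace and time the blocking call on the host.
    start = time.perf_counter()
    ttnn.execute_trace(device, trace_id, cq_id=0, blocking=True)
    runtime_s = time.perf_counter() - start

    ttnn.release_trace(device, trace_id)
    return runtime_s, output


# Illustrative usage (device must be opened with a trace region, e.g.
# ttnn.open_device(device_id=0, trace_region_size=...)); tensors `a` and `b`
# are assumed to already live on the device in tile layout:
#
#   runtime_s, _ = measure_op_chain_runtime(
#       device, lambda x, y: ttnn.relu(ttnn.add(x, y)), a, b
#   )
```

Replaying a captured trace avoids per-op host dispatch, so a host-side timer around the blocking replay approximates the device runtime of the chain; this is consistent with the description's claim that the measurement needs neither a profiler-enabled build nor the device profiler.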
Original issue description: Prerequisite for compile-time perf measurements in tt-mlir.