Create an API for measuring the total runtime of an arbitrary ttnn op chain #16920

Open
arminaleTT opened this issue Jan 20, 2025 · 0 comments

@arminaleTT
Contributor

Prerequisite for compile-time perf measurements in tt-mlir

@arminaleTT arminaleTT self-assigned this Jan 20, 2025
@arminaleTT arminaleTT changed the title Create an API for measuring the total runtime of an arbitrary tonne op chain Create an API for measuring the total runtime of an arbitrary ttnn op chain Jan 20, 2025
arminaleTT added a commit that referenced this issue Jan 20, 2025
arminaleTT added a commit that referenced this issue Jan 21, 2025
arminaleTT added a commit that referenced this issue Jan 22, 2025
arminaleTT added a commit that referenced this issue Jan 22, 2025
arminaleTT added a commit that referenced this issue Jan 22, 2025
…n for use during forge compilation (#16921)

### Ticket
#16920 

### Problem description
Provide an API for the forge optimizer to run arbitrary ttnn ops
**during** forge compilation and measure their runtime. These
compile-time perf measurements are an alternative to offline perf models
while those are being developed for each op.

API should:
- take an arbitrary callable of ttnn ops and an arbitrary set of
arguments
- return the runtime of the callable by actually running it on the
device
- match the interface and nomenclature of the L1 constraints API
  - see PR #15046 and ticket #15291 
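
As a rough illustration of the first two requirements, the sketch below accepts an arbitrary callable plus arbitrary arguments and returns how long one invocation took. It is only a sketch under stated assumptions: `measure_runtime_ns` and `add_then_scale` are hypothetical names, the timing uses a plain host-side clock, and none of this is the actual `get_op_runtime()` signature.

```python
import time
from typing import Any, Callable


def measure_runtime_ns(op_chain: Callable[..., Any], *args: Any, **kwargs: Any) -> int:
    """Run `op_chain` once and return the elapsed wall-clock time in nanoseconds.

    Hypothetical helper for illustration; not the real get_op_runtime() interface.
    """
    start = time.perf_counter_ns()
    op_chain(*args, **kwargs)
    return time.perf_counter_ns() - start


# Stand-in for a chain of ops. Real usage would call ttnn ops on device tensors.
def add_then_scale(a, b, scale):
    return [(x + y) * scale for x, y in zip(a, b)]


elapsed_ns = measure_runtime_ns(add_then_scale, [1.0, 2.0], [3.0, 4.0], scale=2.0)
print(f"op chain took {elapsed_ns} ns")
```

Forwarding `*args` and `**kwargs` unchanged is what lets the same helper time a single op or a whole chain without knowing anything about its inputs.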

### What's changed
- Create a new `get_op_runtime()` API with identical interface to
`get_op_constraints()`
- Use trace capture for perf measurement
  - Given an op chain, capture the trace of the op chain, then execute the
    trace and report its runtime as the perf measurement
  - This enables end-to-end perf measurement without a profiler-enabled
    build or any dependency on the device profiler
- Add unit tests that demonstrate the functionality for a single op and for
  a chain of ops
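
The capture-then-replay idea described in this list can be sketched as follows. This is a minimal, device-free model, assuming a hypothetical `StubDevice` class that records ops while capturing and re-runs them on replay; `trace_runtime_ns` and `chain` are likewise invented for illustration and do not reflect the ttnn implementation. The point it shows is that only the replay is timed, so work done during capture is excluded from the measurement.

```python
import time
from typing import Any, Callable, List, Tuple


class StubDevice:
    """Hypothetical stand-in for a device that supports trace capture and replay."""

    def __init__(self) -> None:
        self._capturing = False
        self._trace: List[Tuple[Callable[..., Any], tuple]] = []

    def begin_capture(self) -> None:
        self._trace = []
        self._capturing = True

    def end_capture(self) -> None:
        self._capturing = False

    def run(self, op: Callable[..., Any], *args: Any) -> Any:
        # While capturing, remember the op and its inputs so it can be replayed.
        if self._capturing:
            self._trace.append((op, args))
        return op(*args)

    def replay(self) -> None:
        # Re-execute every recorded op in capture order.
        for op, args in self._trace:
            op(*args)

    @property
    def trace_length(self) -> int:
        return len(self._trace)


def trace_runtime_ns(device: StubDevice, op_chain: Callable[..., Any], *args: Any) -> int:
    """Capture `op_chain` once, then time a single replay of the captured trace."""
    device.begin_capture()
    try:
        op_chain(device, *args)
    finally:
        device.end_capture()
    start = time.perf_counter_ns()
    device.replay()
    return time.perf_counter_ns() - start


def chain(device: StubDevice, a: int, b: int) -> int:
    summed = device.run(lambda x, y: x + y, a, b)
    return device.run(lambda x: x * 2, summed)


device = StubDevice()
runtime_ns = trace_runtime_ns(device, chain, 3, 4)
print(f"captured {device.trace_length} ops, replay took {runtime_ns} ns")
```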

Note: the forge consumer for this API has not been built yet.

### Checklist
- [x] Post commit CI passes
- [ ] Blackhole Post commit (if applicable)
- [ ] Model regression CI testing passes (if applicable)
- [ ] Device performance regression CI testing passes (if applicable)
- [ ] **(For models and ops writers)** Full [new
models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml)
tests pass
- [x] New/Existing tests provide coverage for changes
patrickroberts pushed a commit that referenced this issue Jan 25, 2025
…n for use during forge compilation (#16921)