Single-GPU benchmark for preconditioners #171

namgyu-youn · 2025-05-15T19:52:58Z

Introduces single-GPU benchmarks for comparing various preconditioners: SGD, AdaGrad, Root Inverse Shampoo, Eigendecomposed Shampoo, and Eigenvalue-Corrected Shampoo.

In the rich Console output, developers can check the following:

- Total and average time taken for each preconditioner
- CPU/GPU usage
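As a rough illustration of how the total and average times above could be collected, here is a minimal standard-library sketch. The helper names `time_steps` and `summarize`, and the two placeholder workloads, are hypothetical and are not taken from this PR's benchmark script.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_steps(step: Callable[[], None], num_steps: int) -> List[float]:
    """Return the wall-clock duration in seconds of each call to `step`."""
    durations = []
    for _ in range(num_steps):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce per-step durations to the total and the average."""
    return {"total_s": sum(durations), "avg_s": mean(durations)}


# Hypothetical stand-ins for one optimizer step per preconditioner.
candidates = {
    "SGD": lambda: sum(i * i for i in range(1_000)),
    "AdaGrad": lambda: sum(i * i for i in range(2_000)),
}

results = {
    name: summarize(time_steps(step, num_steps=5))
    for name, step in candidates.items()
}
for name, stats in results.items():
    print(f"{name:<10} total={stats['total_s']:.6f}s  avg={stats['avg_s']:.6f}s")
```

When the timed step launches CUDA kernels, the timer should only be read after `torch.cuda.synchronize()`, because kernel launches are asynchronous and would otherwise appear faster than they are.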

In the PyTorch profiler output, developers can check the following:

- Top 5 most time-consuming operations
- Bottleneck analysis for each preconditioner
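For readers unfamiliar with `torch.profiler`, the sketch below shows one way a top-5 operator table can be produced. The tiny `Linear` model, the plain SGD optimizer, and the input shape are assumptions chosen only to keep the example small; the PR's benchmark uses its own model and preconditioner configurations.

```python
import torch
from torch.profiler import ProfilerActivity, profile

# Hypothetical tiny model and optimizer; not the setup used in this PR.
model = torch.nn.Linear(64, 64)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(32, 64)

# Profile CPU activity only so the sketch also runs without a GPU.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(3):
        optimizer.zero_grad()
        loss = model(inputs).sum()
        loss.backward()
        optimizer.step()

# Aggregate events by operator name and keep the five most expensive rows.
profiling_table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(profiling_table)
```

On a CUDA device, adding `ProfilerActivity.CUDA` to `activities` also records GPU kernels, which is what a single-GPU comparison would mainly be interested in.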

Co-authored-by: Tsung-Hsien Lee <[email protected]>

@tsunghsienlee

The benchmark compares the performance of various preconditioners (SGD, AdaGrad, Root Inverse Shampoo, Eigendecomposed Shampoo, and Eigenvalue-Corrected Shampoo) using the rich console and the PyTorch profiler.

In the rich console, you can check the following:

- Total time taken for each preconditioner
- Average time taken per epoch
- Memory usage in MB
- GPU utilization percentage (if applicable)

In the PyTorch profiler, you can check the following:

- Top 5 most time-consuming operations
- Bottleneck analysis for each preconditioner

Requested by @tsunghsienlee in facebookresearch#157 to improve the developer experience.

Co-authored-by: Tsung-Hsien Lee <[email protected]>

namgyu-youn · 2025-05-15T19:56:46Z

Sorry for the multiple PRs; I still have to learn more about VCS. Also, I will gladly wait for the review until July, based on #163 - Comment. But I truly believe this PR would be useful. An example is attached in #171 - README.md.

namgyu-youn · 2025-06-04T10:32:06Z

Benchmarks on NVIDIA RTX AX2000:

1. Hardcode the device to "cuda" and basic configurations for benchmarks.
2. Enhance sorting logic for profiling results.
3. Fix typo in rich Console output.

- top_ops is not a valid name for a variable, it should be profiling_table

namgyu-youn · 2025-06-30T09:23:08Z

cc @runame @tsunghsienlee

tsunghsienlee · 2025-07-07T15:05:41Z

> cc @runame @tsunghsienlee

Hi @namgyu-youn, sorry for my late reply. I have been too busy with work, so I might not be able to review this. Sorry that I brought this idea to you earlier.

namgyu-youn · 2025-07-07T15:30:39Z

> Hi @namgyu-youn, sorry for my late reply. I have been too busy with work, so I might not be able to review this. Sorry that I brought this idea to you earlier.

No worries. Learning torch.profiler was a valuable experience for me, so I appreciate your suggestion.

Lastly, I would like to ask whether this PR could be triaged, because I believe this update would be helpful for your team. I will wait for @runame, but it seems the review might be delayed (or neglected). The resulting log output is here, and I hope this update is helpful for the project. Please consider reviewing it.

facebook-github-bot added the CLA Signed label May 15, 2025

namgyu-youn marked this pull request as draft June 4, 2025 06:46

namgyu-youn added 2 commits June 4, 2025 20:48

Update preconditioner benchmarks

590bb9f

1. Hardcode the device to "cuda" and basic configurations for benchmarks.
2. Enhance sorting logic for profiling results.
3. Fix typo in rich Console output.

Fix mypy type[annotation-unchecked] complain

418fda6

- top_ops is not a valid name for a variable, it should be profiling_table

namgyu-youn marked this pull request as ready for review June 4, 2025 12:01

namgyu-youn changed the title ~~Build single-GPU benchmark for preconditioners~~ Single-GPU benchmark for preconditioners Jul 25, 2025
