Single-GPU benchmark for preconditioners #171
The benchmark compares the performance of several preconditioners (SGD, AdaGrad, Root Inverse Shampoo, Eigendecomposed Shampoo, and Eigenvalue-Corrected Shampoo) using the rich console and the PyTorch profiler.

In the rich console, you can check the following:
- Total time taken for each preconditioner
- Average time taken per epoch
- Memory usage in MB
- GPU utilization percentage (if applicable)

In the PyTorch profiler, you can check the following:
- The most time-consuming operations (top 5)
- Bottleneck analysis for each preconditioner

Requested by @tsunghsienlee in facebookresearch#157 to improve the developer experience.

Co-authored-by: Tsung-Hsien Lee <[email protected]>
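To make the console metrics above concrete, here is a minimal sketch of how total time and average time per epoch could be collected for each preconditioner. It uses only the Python standard library; `BenchmarkResult`, `run_benchmark`, `format_report`, and the `train_one_epoch` callable are hypothetical names for illustration and are not taken from this PR, which renders its output with rich and also reports memory and GPU utilization.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BenchmarkResult:
    """Timing summary for one preconditioner (hypothetical container)."""

    name: str
    total_seconds: float
    epochs: int

    @property
    def seconds_per_epoch(self) -> float:
        # Average wall-clock time per epoch.
        return self.total_seconds / self.epochs


def run_benchmark(
    name: str,
    train_one_epoch: Callable[[], None],
    epochs: int,
    clock: Callable[[], float] = time.perf_counter,
) -> BenchmarkResult:
    """Run `train_one_epoch` `epochs` times and record the total wall-clock time.

    `clock` is injectable so the function can be checked with a fake timer.
    """
    start = clock()
    for _ in range(epochs):
        train_one_epoch()
    return BenchmarkResult(name, clock() - start, epochs)


def format_report(results: List[BenchmarkResult]) -> str:
    """Build a plain-text table, fastest preconditioner first."""
    lines = [f"{'optimizer':<12}{'total (s)':>12}{'per epoch (s)':>16}"]
    for result in sorted(results, key=lambda r: r.total_seconds):
        lines.append(
            f"{result.name:<12}"
            f"{result.total_seconds:>12.4f}"
            f"{result.seconds_per_epoch:>16.4f}"
        )
    return "\n".join(lines)
```

When timing GPU work this way, the epoch callable would normally need to synchronize the device (for example with `torch.cuda.synchronize()`) before returning, because CUDA kernels launch asynchronously and the wall clock would otherwise under-report.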
Sorry for the multiple PRs; I still have more to learn about version control. I will gladly wait for the review until July, based on the comment in #163, but I truly believe this PR would be useful. An example is attached in the README.md of #171.
1. Hardcode the device to "cuda" and set basic configurations for the benchmarks.
2. Enhance the sorting logic for profiling results.
3. Fix a typo in the rich Console output.
- `top_ops` is not an appropriate name for the variable; it should be `profiling_table`
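As an illustration of the sorting step mentioned in these commits, the sketch below picks the most expensive operations from a list of profiler rows. The row layout (dicts with `name` and `self_time_us` keys) and the function name `build_profiling_table` are assumptions made for this example and are not the PR's actual code.

```python
from typing import Dict, List, Union

# Hypothetical row layout: one dict per profiled operation.
Row = Dict[str, Union[str, float]]


def build_profiling_table(
    rows: List[Row],
    sort_key: str = "self_time_us",
    limit: int = 5,
) -> List[Row]:
    """Return the `limit` most expensive rows, most expensive first."""
    # sorted() is stable, so rows with equal cost keep their input order.
    return sorted(rows, key=lambda row: row[sort_key], reverse=True)[:limit]
```

With the PyTorch profiler itself, a comparable summary is available through `prof.key_averages().table(sort_by=..., row_limit=5)`, so a helper like this is only needed when the rows are post-processed before display.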
Hi @namgyu-youn, sorry for my late reply. I have been too busy with work, so I might not be able to review this. I am sorry for having suggested this idea to you earlier.
Never mind. Since learning But lastly, I want to ask whether this PR could be triaged, because I believe this update would be helpful for your team. I will wait for @runame, but it seems the review might be delayed (or neglected). The result log message is here, and I hope this update will be helpful for this project. Please consider reviewing it.