
Unexpected Higher UNet Compute Buffer Sizes in Winograd vs. im2col Implementations #3

Open
rmatif opened this issue Feb 4, 2025 · 0 comments


I'm developing a Flutter wrapper around the sdcpp implementation and have been testing the performance. I'm seeing a 1.15-1.2x speedup on Android compared to the original sdcpp repo. However, I've noticed an unexpected pattern in memory usage: the Winograd implementation shows larger UNet compute buffer sizes than im2col. This seems counterintuitive, given that im2col typically requires more memory for its intermediate matrices (typically, like here).
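To make the expectation concrete, here is a rough back-of-the-envelope sketch of the intermediate element counts for a single 3x3 convolution. It is not taken from the sdcpp code: the function names are hypothetical, the Winograd variant is assumed to be F(2x2, 3x3), stride 1 with "same" padding is assumed, and the layer shape in the example (320 channels on a 64x64 latent) is only illustrative.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def im2col_elems(c_in, h, w, k=3):
    # im2col (stride 1, "same" padding): one column of c_in*k*k values
    # per output pixel.
    return c_in * k * k * h * w


def winograd_input_elems(c_in, h, w, m=2, r=3):
    # Winograd F(m x m, r x r): each m x m output tile needs one
    # transformed input tile of (m + r - 1)^2 values per input channel.
    t = m + r - 1
    tiles = ceil_div(h, m) * ceil_div(w, m)
    return c_in * t * t * tiles


def winograd_weight_elems(c_in, c_out, m=2, r=3):
    # Each r x r filter becomes a (m + r - 1)^2 transformed filter.
    t = m + r - 1
    return c_in * c_out * t * t


if __name__ == "__main__":
    # Illustrative layer shape only (assumption, not measured).
    c_in, c_out, h, w = 320, 320, 64, 64
    print("im2col columns        :", im2col_elems(c_in, h, w))
    print("Winograd input tiles  :", winograd_input_elems(c_in, h, w))
    print("Winograd weights (U)  :", winograd_weight_elems(c_in, c_out))
```

Under these assumptions the im2col matrix is 9x the input, while the transformed Winograd input is 4x the input, which is why I expected Winograd to need less. The transformed-weight term does not depend on resolution, and a real allocator may also keep other buffers (for example the transform-domain output), so this is only a first-order estimate.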

Here are some results I compiled during my testing:

Stable Diffusion 1.5 (SD1.5)

| Resolution | Winograd UNet buffer size | im2col UNet buffer size | Difference (Winograd - im2col) |
|---|---|---|---|
| 512x512 | 611.79 MB | 559.71 MB | +52.08 MB |
| 384x384 | 244.16 MB | 192.08 MB | +52.08 MB |
| 256x256 | 100.41 MB | 49.43 MB | +50.98 MB |

Stable Diffusion XL (SDXL)

| Resolution | Winograd UNet buffer size | im2col UNet buffer size | Difference (Winograd - im2col) |
|---|---|---|---|
| 1024x1024 | 864.29 MB | 830.19 MB | +34.10 MB |
| 512x512 | 156.29 MB | 131.85 MB | +24.44 MB |
| 384x384 | 113.32 MB | 95.47 MB | +17.85 MB |
| 256x256 | 96.29 MB | 60.31 MB | +35.98 MB |
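For anyone who wants to recheck the difference column or view the gap in relative terms, a small sketch like the following recomputes it from the numbers reported above. The `REPORTED` list and `overhead` helper are just names made up for this example.

```python
# Buffer sizes in MB, copied from the results above:
# (model, resolution, winograd_mb, im2col_mb)
REPORTED = [
    ("SD1.5", "512x512", 611.79, 559.71),
    ("SD1.5", "384x384", 244.16, 192.08),
    ("SD1.5", "256x256", 100.41, 49.43),
    ("SDXL", "1024x1024", 864.29, 830.19),
    ("SDXL", "512x512", 156.29, 131.85),
    ("SDXL", "384x384", 113.32, 95.47),
    ("SDXL", "256x256", 96.29, 60.31),
]


def overhead(rows):
    """Return (model, resolution, diff_mb, diff_pct) for each row."""
    out = []
    for model, res, wino, im2col in rows:
        diff = wino - im2col
        # Percentage is relative to the im2col buffer size.
        out.append((model, res, round(diff, 2), round(100.0 * diff / im2col, 1)))
    return out


if __name__ == "__main__":
    for model, res, diff, pct in overhead(REPORTED):
        print(f"{model:6s} {res:>9s}  +{diff:.2f} MB  (+{pct:.1f}% over im2col)")
```

For SD1.5 the absolute gap stays at roughly 51-52 MB across resolutions, so in relative terms it grows from about 9% at 512x512 to about 103% at 256x256.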

I also tried compiling with OpenCL, since your report mentions that it is supported on Android. However, I ran into compilation errors and crashes. I would appreciate it if you could confirm that it is supported and add a build guide to the README.
