perf(autograd): optimize grey_dilation with striding #2589
base: develop
LGTM
3 files reviewed, no comments
Diff Coverage
Diff: origin/develop...HEAD, staged and unstaged changes
Summary: tidy3d/plugins/autograd/functions.py
266d6a0 to be0b9b0
Thanks for this implementation, the speedup looks awesome, especially for a function that will be used in a lot of robust optimizations!
Left some comments/questions, some just for my own understanding!
The previous implementation of `grey_dilation` was based on convolution, which was slow for both the forward and backward passes. This commit replaces it with a high-performance implementation that uses NumPy's `as_strided` to create sliding window views of the input array. This avoids redundant computations and memory allocations, leading to significant speedups. The VJP (gradient) for the primitive is also updated to use the same striding technique, ensuring the backward pass is also much faster. Benchmarks show speedups of 10-100x depending on the array and kernel size.
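To illustrate the sliding-window idea described in this commit message, here is a minimal sketch of a flat grey dilation in plain NumPy. It is not the code from this PR: the name `grey_dilation_sketch`, the square odd-sized window, and the `-inf` padding are all assumptions made for the example. It uses `sliding_window_view`, which NumPy builds on top of `as_strided`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def grey_dilation_sketch(array, size):
    """Flat grey dilation of a 2D array with a square window of odd side `size`.

    Hypothetical helper for illustration only; not the tidy3d function.
    """
    pad = size // 2
    # Assumption: pad with -inf so border windows only ever pick real values.
    padded = np.pad(
        array.astype(float), pad, mode="constant", constant_values=-np.inf
    )
    # Shape (H, W, size, size). This is a strided view, so no window is copied.
    windows = sliding_window_view(padded, (size, size))
    # Dilation with a flat structuring element is the maximum over each window.
    return windows.max(axis=(-2, -1))


x = np.zeros((5, 5))
x[2, 2] = 1.0
out = grey_dilation_sketch(x, 3)  # the single bright pixel grows to a 3x3 block
```

Because the windows are views into the padded array, the only large allocation is the padded copy itself, which is the kind of saving the commit message points to.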
be0b9b0 to 21486df
The previous implementation of `grey_dilation` was based on convolution, which was slow for both the forward and backward passes.
This PR replaces it with a high-performance implementation that uses NumPy's `sliding_window_view` to create sliding window views of the input array. I also wrote a custom VJP that uses the same striding technique to make the backward pass faster too.
I also simplified the implementation of `grey_erosion` so that `grey_dilation` is now the only function that does the heavy lifting.
Benchmarks show speedups of 10-100x depending on the array and kernel size.
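For readers unfamiliar with what a custom VJP for a windowed maximum has to compute, the following is a rough sketch in plain NumPy, not the implementation from this PR. The function name `dilation_vjp_sketch`, the square window, the `-inf` padding, and the tie-breaking rule (first maximum in each window wins) are assumptions for illustration. Each output pixel sends its upstream gradient back to the input position that supplied the maximum.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilation_vjp_sketch(array, size, upstream):
    """Gradient of a flat, square-window grey dilation w.r.t. its 2D input.

    Hypothetical helper for illustration only; not the tidy3d VJP.
    """
    h, w = array.shape
    pad = size // 2
    padded = np.pad(
        array.astype(float), pad, mode="constant", constant_values=-np.inf
    )
    windows = sliding_window_view(padded, (size, size))
    # Flatten each window (this reshape copies) and find where its max sits.
    arg = windows.reshape(h, w, size * size).argmax(axis=-1)
    di, dj = np.unravel_index(arg, (size, size))
    # Window (i, j) starts at padded[i, j], so its max is at (i + di, j + dj).
    ii, jj = np.indices((h, w))
    grad_padded = np.zeros(padded.shape)
    # Several windows can share one argmax position, so accumulate unbuffered.
    np.add.at(grad_padded, (ii + di, jj + dj), upstream)
    # Drop the padding to return a gradient with the input's shape.
    return grad_padded[pad:pad + h, pad:pad + w]


x = np.zeros((5, 5))
x[2, 2] = 1.0
grad = dilation_vjp_sketch(x, 3, np.ones((5, 5)))
```

In this example the nine windows that contain the bright pixel all route their gradient to it, and the total gradient is conserved because every window has exactly one selected input.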
This should make these ops much more usable in topopt @groberts-flex
Greptile Summary
Significant performance optimization of the `grey_dilation` morphological operation by replacing the convolution-based implementation with NumPy's `sliding_window_view` for strided array operations.
The change is in `tidy3d/plugins/autograd/functions.py`, achieving a 10-100x speedup.
`grey_erosion` is simplified by expressing it through duality with `grey_dilation`.
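The duality mentioned here can be sketched as follows, assuming a flat, symmetric (square) window: eroding an array is the same as negating it, dilating, and negating the result. The helpers `dilate_sketch` and `erode_sketch` are hypothetical names for this example and are not the tidy3d functions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate_sketch(array, size):
    """Flat grey dilation with a square window (illustrative helper)."""
    pad = size // 2
    # Assumption: -inf padding so the border never wins the maximum.
    padded = np.pad(
        array.astype(float), pad, mode="constant", constant_values=-np.inf
    )
    return sliding_window_view(padded, (size, size)).max(axis=(-2, -1))


def erode_sketch(array, size):
    """Grey erosion via duality: min(x) over a window equals -max(-x)."""
    return -dilate_sketch(-array, size)


x = np.arange(25, dtype=float).reshape(5, 5)
eroded = erode_sketch(x, 3)

# Direct windowed minimum for comparison (+inf padding mirrors the -inf above).
padded = np.pad(x, 1, mode="constant", constant_values=np.inf)
direct = sliding_window_view(padded, (3, 3)).min(axis=(-2, -1))
```

With this arrangement only the dilation needs an optimized forward pass and gradient; the erosion inherits both through the two negations.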