Performance degradation for higher batch size on PPC64LE #5085
Comments
That's bad, but my impression was that this is a question of correct versus fast. It is possible, of course, that only a specific corner case needed fixing rather than the algorithm in general, or that #4920 papers over an actual issue elsewhere.
Hi @martin-frbg, thank you for your response. We haven't observed the zgemm/cgemm test failure on PPC64LE. Would it be possible to add a conditional check for PPC so that we can continue using the previous code there? If not, could you advise on the best way to proceed?
PPC64LE is seeing as much as a 2x slowdown with the above patch. I suspect there may be an issue with C/ZGEMM on x86 (a stride issue, overlapping data, or something else), and although this fix prevents it, it has the adverse effect of slowing some cases significantly. We most likely do not see the issue because we specialize these cases.
@martin-frbg Hi Martin, we would like to make the change below. If you have any other suggestions, please let me know.
#if defined(POWER)
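The snippet quoted above is cut off in this thread, so the actual change is not shown. For illustration only, here is a minimal sketch of the kind of compile-time guard being discussed, assuming the `POWER` macro is defined for Power builds as the quoted `#if defined(POWER)` suggests. The enum and the function `select_gemm_path` are hypothetical names made up for this sketch and are not OpenBLAS identifiers.

```c
/* Hypothetical labels for the two code paths discussed in the thread.
 * These names are illustrative and do not exist in OpenBLAS. */
enum gemm_path {
    GEMM_PATH_CURRENT = 0,  /* behaviour after #4920 */
    GEMM_PATH_PREVIOUS = 1  /* behaviour before #4920 */
};

/* Choose the code path at compile time.
 * Assumption: POWER is defined by the build system on Power targets. */
static enum gemm_path select_gemm_path(void)
{
#if defined(POWER)
    /* Power builds keep the earlier, faster path. */
    return GEMM_PATH_PREVIOUS;
#else
    /* All other architectures keep the fix from #4920. */
    return GEMM_PATH_CURRENT;
#endif
}
```

Because the choice is made by the preprocessor, other architectures compile exactly the same code as before, which is why a guard like this would leave the x86 fix untouched.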
Hi,
While running a model with a large batch size, we observed a performance degradation. We traced it back to #4920.