feat: vectorized implementation of the cross product #155

Merged

Conversation

niermann999
Contributor

@niermann999 niermann999 commented Mar 17, 2025

Use shuffle operations to implement the cross product in the Vc AoS plugin. Because the vectors carry a trailing zero element to reach a fixed alignment, that element has to be initialized to a specific value from the input vectors, so that the shuffles produce the correct operands for the multiplication and subtraction.

Edit: For some reason the shuffles don't compile into efficient code on AVX registers, so I kept the old FMA approach for double precision.
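
For readers unfamiliar with the technique, the shuffle formulation of the cross product can be sketched in portable C++ roughly as follows. This is only an illustration and not the code from this PR: `vec4`, `shuffle`, `mul_sub` and `cross` are hypothetical names, the lanes are emulated with `std::array` rather than Vc types, and the handling of the padding lane (mapped onto itself here) is an assumption that may differ from what the plugin does.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane storage: x, y, z plus one padding lane.
// This stands in for a SIMD register; it is not the Vc vector type.
using vec4 = std::array<float, 4>;

// Emulates a compile-time lane permutation, i.e. what a SIMD shuffle does.
template <std::size_t I0, std::size_t I1, std::size_t I2, std::size_t I3>
inline vec4 shuffle(const vec4& v) {
    return {v[I0], v[I1], v[I2], v[I3]};
}

// Lane-wise a * b - c * d.
inline vec4 mul_sub(const vec4& a, const vec4& b, const vec4& c,
                    const vec4& d) {
    vec4 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = a[i] * b[i] - c[i] * d[i];
    }
    return r;
}

// cross(a, b) = a.yzx * b.zxy - a.zxy * b.yzx
// The padding lane (index 3) is shuffled onto itself, so for finite
// padding values it evaluates to a[3] * b[3] - a[3] * b[3] == 0.
inline vec4 cross(const vec4& a, const vec4& b) {
    const vec4 a_yzx = shuffle<1, 2, 0, 3>(a);
    const vec4 a_zxy = shuffle<2, 0, 1, 3>(a);
    const vec4 b_yzx = shuffle<1, 2, 0, 3>(b);
    const vec4 b_zxy = shuffle<2, 0, 1, 3>(b);
    return mul_sub(a_yzx, b_zxy, a_zxy, b_yzx);
}
```

In this sketch, `cross({1, 0, 0, 0}, {0, 1, 0, 0})` gives the unit z vector with the padding lane still zero. On real hardware each `shuffle` would ideally become a single permute instruction, which is the part that reportedly did not compile efficiently for double precision on AVX.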

@niermann999 niermann999 added the performance Improves performance label Mar 17, 2025
@niermann999

This comment was marked as outdated.

@niermann999 niermann999 force-pushed the feat-optimize-vectorization branch from e35c809 to 586b28f Compare March 17, 2025 13:09
@niermann999 niermann999 marked this pull request as draft March 17, 2025 13:09
@niermann999 niermann999 force-pushed the feat-optimize-vectorization branch 2 times, most recently from b9cf02d to 53a78ec Compare March 18, 2025 12:58
@niermann999 niermann999 marked this pull request as ready for review March 18, 2025 12:58
@niermann999 niermann999 force-pushed the feat-optimize-vectorization branch 2 times, most recently from d1c727c to f877287 Compare March 18, 2025 14:03
@niermann999 niermann999 force-pushed the feat-optimize-vectorization branch 2 times, most recently from be68160 to ce4417f Compare March 18, 2025 14:31
@niermann999 niermann999 force-pushed the feat-optimize-vectorization branch from ce4417f to 64cfdd7 Compare March 18, 2025 14:42
@niermann999 niermann999 merged commit 01cf491 into acts-project:main Mar 18, 2025
28 checks passed
Labels
performance Improves performance
2 participants