
Port Rotary Positional Embedding from NeuralAttentionlib.jl #2524

Closed
wants to merge 4 commits

Conversation


@mashu mashu commented Nov 14, 2024

Adding rotary positional embedding is pretty much standard practice with a MultiHeadAttention layer these days, so I think we should have it as part of Flux.
If anyone can review it, I would be happy. I miss this feature too often.
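
For context, rotary positional embedding (RoPE) rotates each pair of query/key features by an angle proportional to the token position, so relative offsets end up encoded directly in the attention dot product. Below is a minimal sketch of such a function, not the code from this PR or from NeuralAttentionlib.jl; the function name, the half-split pairing convention, and the base of 10,000 are assumptions.

```julia
# Sketch: apply rotary embedding to x of shape (d, seq_len, batch...),
# rotating each pair (x[i], x[i + d÷2]) at position p by an angle p * θᵢ,
# where θᵢ = base^(-2(i-1)/d).
function rotary_embedding(x::AbstractArray{T}; base = T(10_000)) where {T<:AbstractFloat}
    d, seqlen = size(x, 1), size(x, 2)
    half = d ÷ 2
    freqs = base .^ (-(T.(0:half-1)) .* 2 ./ d)   # (half,) per-pair frequencies
    angles = freqs * T.(0:seqlen-1)'              # (half, seq_len) rotation angles
    c, s = cos.(angles), sin.(angles)
    x1, x2 = x[1:half, :, :], x[half+1:end, :, :]
    vcat(x1 .* c .- x2 .* s,                      # rotated first half
         x1 .* s .+ x2 .* c)                      # rotated second half
end
```

In an attention layer this would be applied to the queries and keys of each head before the scaled dot product.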

@mashu mashu closed this Nov 15, 2024
@CarloLucibello (Member)

Why did you close this? It would make sense to have it. Actually, you could file a PR to NNlib with the function that generates the rotary embedding.

@mashu mashu (Author) commented Nov 15, 2024

Because it might not work on the GPU, and I figured I want to rewrite it to cache the rotations instead of computing them on the fly.
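
To illustrate the caching idea, here is a sketch (not this PR's implementation, nor the one that ended up in PositionalEmbeddings.jl; the type and field names are made up). The cos/sin tables depend only on the feature size and a maximum length, so they can be built once, sliced on every forward pass, and moved to the GPU (e.g. with `CUDA.cu`) together with the rest of the model instead of being recomputed on the fly.

```julia
# Sketch of caching the rotation tables up front for a maximum sequence length.
struct RoPECache{M<:AbstractMatrix}
    cosθ::M   # (d ÷ 2, max_len)
    sinθ::M   # (d ÷ 2, max_len)
end

function RoPECache(d::Int, max_len::Int; base = Float32(10_000))
    half = d ÷ 2
    freqs = base .^ (-(Float32.(0:half-1)) .* 2 ./ d)
    angles = freqs * Float32.(0:max_len-1)'
    RoPECache(cos.(angles), sin.(angles))
end

function apply_rope(cache::RoPECache, x)          # x :: (d, seq_len, batch...)
    half, seqlen = size(cache.cosθ, 1), size(x, 2)
    c = @view cache.cosθ[:, 1:seqlen]             # reuse the precomputed tables
    s = @view cache.sinθ[:, 1:seqlen]
    x1, x2 = x[1:half, :, :], x[half+1:end, :, :]
    vcat(x1 .* c .- x2 .* s, x1 .* s .+ x2 .* c)
end
```

Usage sketch: build `cache = RoPECache(64, 1024)` once at layer construction, then call `apply_rope(cache, q)` and `apply_rope(cache, k)` inside the forward pass.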

@CarloLucibello (Member)

This is in a package now:
https://github.com/mashu/PositionalEmbeddings.jl
