Add training and inference support for RWKV LSTMs #88

Open
TodayAI opened this issue Jun 20, 2024 · 1 comment

Comments

TodayAI commented Jun 20, 2024

Is your feature request related to a problem? Please describe.
Your Seq2SeqSharp project already supports LSTMs. Please consider implementing the "linear attention" idea from the RWKV large language model in your C# solution. RWKV's linear attention performs very well at inference.
See: https://www.rwkv.com/

Describe the solution you'd like
It may only take a few additions to Seq2SeqSharp's LSTM functionality, such as "token shift" or "time decay". Perhaps you have another idea for improving LSTM performance in Seq2SeqSharp.
I would like to integrate your solution into the Godot game engine for training, fine-tuning, and inference in pure C# code.

Describe alternatives you've considered
Use "https://github.com/imxcstar/CSharp-RWKV" for inference only.

@zhongkaifu (Owner)

Hi @TodayAI
Thanks for your suggestions and sharing.

RWKV is closer to a "linear attention" version of the Transformer than to an LSTM, although it also uses gating. I tried an early version of RWKV, and it did not perform well in practice; in particular, it was very sensitive to prompts. Newer versions may have improved, but I haven't had a chance to try them.
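
For readers unfamiliar with the terms used in this thread, here is a minimal sketch of "token shift" and of the "time decay" recurrence behind RWKV's linear attention, following the RWKV-4 formulation as I understand it. It is written in Python for brevity and handles a single channel. The function names (`token_shift`, `wkv_recurrent`, `wkv_naive`) and parameter names (`mu`, `w`, `u`) are illustrative assumptions, not part of Seq2SeqSharp or any RWKV library, and the sketch omits the numerical stabilization a practical implementation would need for large exponents.

```python
import math

def token_shift(x_prev, x_curr, mu):
    """Blend the current input with the previous token's input, per channel."""
    return [m * c + (1.0 - m) * p for p, c, m in zip(x_prev, x_curr, mu)]

def wkv_recurrent(ks, vs, w, u):
    """Single-channel WKV in O(T) time, with decay w >= 0 and current-token bonus u."""
    a = 0.0  # decayed sum of exp(k_i) * v_i over past tokens
    b = 0.0  # decayed sum of exp(k_i) over past tokens
    out = []
    for k, v in zip(ks, vs):
        e_cur = math.exp(u + k)
        out.append((a + e_cur * v) / (b + e_cur))
        # Fold the current token into the state, decaying older contributions.
        decay = math.exp(-w)
        a = decay * a + math.exp(k) * v
        b = decay * b + math.exp(k)
    return out

def wkv_naive(ks, vs, w, u):
    """The same quantity by direct O(T^2) summation, for comparison only."""
    out = []
    for t in range(len(ks)):
        num = math.exp(u + ks[t]) * vs[t]
        den = math.exp(u + ks[t])
        for i in range(t):
            weight = math.exp(-(t - 1 - i) * w + ks[i])
            num += weight * vs[i]
            den += weight
        out.append(num / den)
    return out
```

The recurrent form carries only two scalars of state per channel from one token to the next, which is why inference cost grows linearly with sequence length. The output is still a normalized weighted average of past values, as in attention.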

To my knowledge, it would be easier to implement RWKV starting from the existing Transformer code than from the existing LSTM code. I have no plans to do so for now, but contributions are always welcome.

Thanks
Zhongkai Fu
