Add training and inference support for RWKV LSTMs #88

TodayAI · 2024-06-20T11:39:35Z

Is your feature request related to a problem? Please describe.
Your Seq2SeqSharp project already supports LSTMs. Please consider implementing the "linear attention" idea from the RWKV large language model in your C# solution. RWKV's linear attention model performs very well at inference.
See: https://www.rwkv.com/

Describe the solution you'd like
Maybe it only requires adding a few functions, such as "token shift" or "time decay", to the Seq2SeqSharp LSTM functionality. Or maybe you have another idea for improving LSTM performance in Seq2SeqSharp.
I would like to integrate your solution into the Godot game engine for training, fine-tuning, and inference in pure C# code.
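
For illustration, here is a rough sketch of what "token shift" and "time decay" could look like. It is written in Python for brevity and is not Seq2SeqSharp or RWKV code: the function names `token_shift` and `decayed_average` are hypothetical, and the decay recurrence is a simplified form that leaves out RWKV's extra bonus weight for the current token and any numerical stabilization.

```python
import math

def token_shift(xs, mu):
    """Mix each token vector with the previous one: mu * x_t + (1 - mu) * x_{t-1}.

    xs is a list of token vectors, mu is a per-channel mixing weight.
    Assumption: a zero vector stands in for the token before the first one.
    """
    prev = [0.0] * len(xs[0])
    out = []
    for x in xs:
        out.append([m * a + (1.0 - m) * b for m, a, b in zip(mu, x, prev)])
        prev = x
    return out

def decayed_average(keys, values, w):
    """Single-channel weighted average of past values with exponential time decay.

    Each step multiplies the running sums by exp(-w), so older tokens count
    less. Only two numbers (num, den) are carried between steps.
    """
    decay = math.exp(-w)
    num, den = 0.0, 0.0
    out = []
    for k, v in zip(keys, values):
        weight = math.exp(k)
        num = decay * num + weight * v
        den = decay * den + weight
        out.append(num / den)
    return out
```

Because the state carried from one token to the next has a fixed size, the cost per generated token does not grow with sequence length, which is where the inference advantage comes from.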

Describe alternatives you've considered
Use "https://github.com/imxcstar/CSharp-RWKV" for inference only.

zhongkaifu · 2024-06-20T17:10:34Z

Hi @TodayAI
Thanks for your suggestions and sharing.

RWKV is closer to a "linear attention" version of the Transformer than to an LSTM, although it also uses gating. I tried an early version of RWKV, and its performance was not good in practice; in particular, it was very sensitive to prompts. The newer versions may have improved, but I haven't had a chance to try them.

To my knowledge, it would be easier to implement RWKV starting from the existing Transformer code rather than the existing LSTM code. For now, I have no plans to do so, but contributions from anyone are always welcome.

Thanks
Zhongkai Fu
