Is your feature request related to a problem? Please describe.
Your Seq2SeqSharp project already supports LSTMs. Please consider implementing the "linear attention" idea from the RWKV large language model in your C# solution. RWKV's linear attention performs very well at inference.
See: https://www.rwkv.com/
Describe the solution you'd like
Maybe it only needs a few functions, such as "token shift" or "time decay", added to the Seq2SeqSharp LSTM functionality. Or maybe you have another idea for improving LSTM performance in Seq2SeqSharp.
I would like to use your solution in the Godot Game Engine for training, fine-tuning, and inference in pure C# code.
Hi @TodayAI
Thanks for your suggestions and sharing.
RWKV is closer to a "linear attention" version of the Transformer than to an LSTM, although it also uses gating. I tried an early version of RWKV, and its performance was not good in practice; in particular, it was very sensitive to prompts. The newer versions may have improved, but I haven't had a chance to try them.
As far as I know, it would be easier to implement RWKV starting from the existing Transformer code than from the existing LSTM code. For now I don't plan to do so, but contributions from anyone are always welcome.
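For anyone who wants to pick this up, here is a rough sketch of the two pieces named in the request, "token shift" and "time decay", as they appear in a simplified single-channel form of the RWKV-4 style WKV recurrence. It is written in Python only for brevity and is not Seq2SeqSharp code; the function names `token_shift` and `wkv_recurrent` are made up for illustration, and the version below omits the numerical stabilization that a real implementation would need for large keys.

```python
import math


def token_shift(xs, mu):
    """Mix each input with the previous one: mu * x_t + (1 - mu) * x_{t-1}."""
    prev = 0.0  # assumption: the "token" before the first one is zero
    out = []
    for x in xs:
        out.append(mu * x + (1.0 - mu) * prev)
        prev = x
    return out


def wkv_recurrent(ks, vs, w, u):
    """Single-channel WKV with constant-size state per step.

    w > 0 is the time-decay rate, u is the bonus for the current token.
    Simplified: no max-subtraction trick, so large k values can overflow.
    """
    a = 0.0  # decayed sum of exp(k_i) * v_i over past tokens
    b = 0.0  # decayed sum of exp(k_i) over past tokens
    decay = math.exp(-w)
    out = []
    for k, v in zip(ks, vs):
        bonus = math.exp(u + k)  # weight of the current token
        out.append((a + bonus * v) / (b + bonus))
        # Fold the current token into the state and decay older tokens.
        a = decay * a + math.exp(k) * v
        b = decay * b + math.exp(k)
    return out
```

The state carried between steps is just the pair `(a, b)`, which is why inference cost per token does not grow with sequence length. Each output is a weighted average of the values seen so far, so the structure is closer to attention than to an LSTM cell.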
Describe alternatives you've considered
Use "https://github.com/imxcstar/CSharp-RWKV" for inference only.