Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLM Post Processing #81

Open
elebumm opened this issue Dec 1, 2024 · 0 comments
Open

LLM Post Processing #81

elebumm opened this issue Dec 1, 2024 · 0 comments

Comments

@elebumm
Copy link

elebumm commented Dec 1, 2024

I've started working on a feature that I think would be a life saver for me.

DemoWhisper.mp4

A post processing method that will take your transcription and run it through an OpenAI compatible API of your choosing. In my demo:

  • Understands when to do a new line
  • Understands when I made a mistake and to remove it from the final output
  • Converts "smiley face" into an emoji.
    (Used Gemini Flash with Open Router)

image

I wanted to take some feedback before I develop this into a PR. Currently I have it so that the user customizes the system prompt as well as the beginning of the user prompt that the transcription will be inserted in.

For my use case above, I also needed to change up pynput to support "new line" or emojis.

This is for a video for my channel, so interested to hear everyones thoughts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant