- Clone this repo.
- Create a .env.local file with BANANA_API_KEY=your_api_key and BANANA_MODEL_KEY=your_model_key.
- Install dependencies: npm i
- Run the development server: npm run dev
- Open http://localhost:3000 with your browser to see your project!
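
The two keys from the .env.local step can be checked up front, so a missing key fails with a clear message instead of a confusing API error later. The sketch below is only an illustration: `loadBananaConfig` and `BananaConfig` are hypothetical names that do not come from this repo, and it assumes the dev server exposes the .env.local values as environment variables on the server side.

```typescript
// Hypothetical helper (not part of this repo): validates the two variables
// that the setup steps above place in .env.local.
type BananaConfig = { apiKey: string; modelKey: string };

function loadBananaConfig(env: Record<string, string | undefined>): BananaConfig {
  // Trim so that stray whitespace around a pasted key does not slip through.
  const apiKey = env["BANANA_API_KEY"]?.trim();
  const modelKey = env["BANANA_MODEL_KEY"]?.trim();

  // Collect every missing name so the error reports all of them at once.
  const missing: string[] = [];
  if (!apiKey) missing.push("BANANA_API_KEY");
  if (!modelKey) missing.push("BANANA_MODEL_KEY");

  if (!apiKey || !modelKey) {
    throw new Error(`Missing in .env.local: ${missing.join(", ")}`);
  }
  return { apiKey, modelKey };
}
```

In server-side code this would typically be called as `loadBananaConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test without touching real keys.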
- Outline of the approach I used.
I went through a lot of API docs and found that there are many providers for OpenAI's Whisper.
So I checked them all and tried each one out:
OpenAI - https://platform.openai.com/docs/guides/speech-to-text
AssemblyAI - https://www.assemblyai.com/docs/walkthroughs#realtime-streaming-transcription
Banana.dev - https://www.banana.dev/ (PS: this one stood out in terms of performance and cost.)
Images - UI
I tried all of them, and the only one that gave reliable output was Banana.dev's, as its models are hosted on heavy Tesla V100 GPUs, absolute beasts.
The others sometimes returned Korean text when the audio was noisy, so meh.
Banana.dev >>>>> the rest, lol