This repository adds the `OpenAIRealtimeKani`. Function calling works right out of the box!
*(Demo video: `demo.mp4`)*
This package is considered provisional and maintained on a best-effort basis. As such, it will not be released on PyPI.
To install this package, install it from the git source:
```shell
$ pip install git+https://github.com/zhudotexe/kani-ext-realtime.git@main
```
See https://platform.openai.com/docs/guides/realtime for more information on the OpenAI Realtime API.
```python
import asyncio

from kani.ext.realtime import OpenAIRealtimeKani, chat_in_terminal_audio_async


async def main():
    ai = OpenAIRealtimeKani()  # note - the OpenAIRealtimeKani does *not* take an engine!
    await ai.connect()  # additional step needed to connect to the Realtime API
    await chat_in_terminal_audio_async(ai, mode="full_duplex")


if __name__ == "__main__":
    asyncio.run(main())
```
```python
import asyncio

from kani.ext.realtime import OpenAIRealtimeKani


async def handle_stream(stream):
    # do processing for a single message's stream here...
    # this example code does NOT account for multiple simultaneous messages
    async for token in stream:
        print(token, end="")
    msg = await stream.message()  # retrieve the completed message once the stream finishes


async def main():
    ai = OpenAIRealtimeKani()  # note - the OpenAIRealtimeKani does *not* take an engine!
    await ai.connect()  # additional step needed to connect to the Realtime API

    # audio_stream should be an AsyncIterable[bytes] yielding audio from your input source (e.g. a microphone)
    stream_tasks = set()
    async for stream in ai.full_duplex(audio_stream):
        task = asyncio.create_task(handle_stream(stream))
        # keep a live reference to the task
        # see https://docs.python.org/3/library/asyncio-task.html#creating-tasks
        stream_tasks.add(task)
        task.add_done_callback(stream_tasks.discard)


if __name__ == "__main__":
    asyncio.run(main())
```
The `OpenAIRealtimeKani` is compatible with most standard kani interfaces -- you can, for example:

- Define `@ai_function`s which the realtime model will call (by subclassing `OpenAIRealtimeKani`; see the sketch below)
- Supply a `system_prompt` or fewshot examples in `chat_history`
  - (note: you likely want to supply a system prompt through `SessionConfig.instructions` instead, see below)
- Get text/audio completions like a normal text-text model with `.full_round`
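
For example, a subclass that exposes a function to the realtime model might look like the following sketch (the `WeatherRealtimeKani` class and its stubbed `get_weather` function are hypothetical; `ai_function` and `AIParam` come from core kani):

```python
from typing import Annotated

from kani import AIParam, ai_function
from kani.ext.realtime import OpenAIRealtimeKani


class WeatherRealtimeKani(OpenAIRealtimeKani):
    # hypothetical example subclass: the realtime model can call this function mid-conversation
    @ai_function()
    def get_weather(self, city: Annotated[str, AIParam(desc="The city to get the weather for.")]):
        """Get the current weather in a given city."""
        # stubbed result for illustration - call a real weather API here
        return f"It is currently sunny in {city}."
```

An instance of such a subclass can then be connected and used exactly like the examples above (e.g. passed to `chat_in_terminal_audio_async`).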
The new methods provided in the `OpenAIRealtimeKani` are:

- `connect(config: SessionConfig)`
- `full_duplex(input_audio_stream: AsyncIterable[bytes], output_audio_callback: AsyncCallable[[bytes], Any])` (see the sketch below)
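
As a rough sketch of the second signature: the output audio callback is an async callable that receives raw output audio bytes as they arrive. The silent placeholder input stream, the file-writing callback, and the `output_audio.raw` filename below are all hypothetical stand-ins for a real audio pipeline:

```python
import asyncio

from kani.ext.realtime import OpenAIRealtimeKani


async def silent_audio_stream():
    # hypothetical placeholder input: chunks of silence instead of a real microphone source
    while True:
        yield b"\x00" * 4800
        await asyncio.sleep(0.1)


async def write_audio(chunk: bytes):
    # hypothetical output callback: append the model's raw output audio bytes to a file
    with open("output_audio.raw", "ab") as f:
        f.write(chunk)


async def main():
    ai = OpenAIRealtimeKani()
    await ai.connect()
    async for stream in ai.full_duplex(silent_audio_stream(), output_audio_callback=write_audio):
        async for token in stream:
            print(token, end="")


if __name__ == "__main__":
    asyncio.run(main())
```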
It also supports the configuration options provided by the Realtime API. When you call `.connect()`, you can supply a `SessionConfig` object, which supports all of the options listed at https://platform.openai.com/docs/api-reference/realtime-client-events/session/update.
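
A minimal sketch of connecting with a custom session config, assuming `SessionConfig` is importable from `kani.ext.realtime` (adjust the import to wherever it lives in this package) and using `instructions` as the session's system prompt, as noted above:

```python
import asyncio

from kani.ext.realtime import OpenAIRealtimeKani, SessionConfig  # SessionConfig import path is an assumption


async def main():
    config = SessionConfig(instructions="You are a helpful assistant. Keep spoken replies brief.")
    ai = OpenAIRealtimeKani()
    await ai.connect(config)  # the session is configured at connect time


if __name__ == "__main__":
    asyncio.run(main())
```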
For more information: