Commit df31b9f

Minh/s2s context summary (#1813)
1 parent 5ab3df4 commit df31b9f

9 files changed

+739 -4 lines changed

.gitignore

+1
@@ -144,3 +144,4 @@ examples/fine-tuned_qa/local_cache/*
 
 # VS Code files
 .vscode/
+.cursorignore

examples/Context_summarization_with_realtime_api.ipynb

+724
Large diffs are not rendered by default.

examples/Speech_transcription_methods.ipynb

+4 -4
@@ -142,7 +142,7 @@
 "### How it works\n",
 "\n",
 "\n",
-"![STT Not Streaming Transcription flow](./imgs/speech-to-text-not-streaming.png)\n",
+"![STT Not Streaming Transcription flow](../images/speech-to-text-not-streaming.png)\n",
 "\n",
 "#### Benefits\n",
 "\n",
@@ -250,7 +250,7 @@
 "- You need immediate transcription results (partial or final) as they arrive. \n",
 "- Scenarios where partial feedback improves UX, e.g., uploading a long voice memo.\n",
 "\n",
-"![STT Streaming Transcription flow](./imgs/speech-to-text-streaming.png)\n",
+"![STT Streaming Transcription flow](../images/speech-to-text-streaming.png)\n",
 "\n",
 "#### Benefits\n",
 "- **Real-time feel:** Users see transcription updates almost immediately. \n",
@@ -321,7 +321,7 @@
 "source": [
 "### How it works\n",
 "\n",
-"![Realtime Transcription flow](./imgs/realtime_api_transcription.png)\n",
+"![Realtime Transcription flow](../images/realtime_api_transcription.png)\n",
 "\n",
 "#### Benefits\n",
 "- **Ultra-low latency:** Typically 300–800 ms, enabling near-instant transcription. \n",
@@ -496,7 +496,7 @@
 "source": [
 "### How it works\n",
 "\n",
-"![Agents Transcription flow](./imgs/agents_sdk_transcription.png)\n",
+"![Agents Transcription flow](../images/agents_sdk_transcription.png)\n",
 "\n",
 "**Benefits**\n",
 "\n",
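
The four hunks above make the same substitution: markdown image links move from `./imgs/` to `../images/`. The diff does not show how the author made the edit (it may well have been by hand), so the following is only a minimal sketch of how such a rewrite could be automated. It assumes a notebook already loaded as a Python dict in the standard `.ipynb` JSON layout. The helper name `rewrite_image_paths` and the constants are hypothetical and are not part of the commit.

```python
import re

# Prefixes taken from the diff above; the helper itself is hypothetical.
OLD_PREFIX = "./imgs/"
NEW_PREFIX = "../images/"

# Matches the opening of a markdown image link whose target starts with
# OLD_PREFIX, e.g. "![alt text](./imgs/".
IMAGE_LINK = re.compile(r"(!\[[^\]]*\]\()" + re.escape(OLD_PREFIX))


def rewrite_image_paths(notebook: dict) -> int:
    """Rewrite image links in markdown cells in place.

    Returns the number of links that were changed. Code cells are skipped.
    """
    changed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", [])
        # A cell's "source" may be a single string or a list of line strings.
        lines = [source] if isinstance(source, str) else source
        new_lines = []
        for line in lines:
            new_line, count = IMAGE_LINK.subn(
                lambda m: m.group(1) + NEW_PREFIX, line
            )
            changed += count
            new_lines.append(new_line)
        cell["source"] = new_lines[0] if isinstance(source, str) else new_lines
    return changed
```

To apply it to a file on disk, one would load the notebook with `json.load`, call the helper, and write the result back with `json.dump`. Only links inside markdown cells are touched, which matches the scope of the changes shown in this diff.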

images/text-vs-audio-tokens.png

2.26 MB

registry.yaml

+10
@@ -4,6 +4,16 @@
 # should build pages for, and indicates metadata such as tags, creation date and
 # authors for each page.
 
+- title: Context Summarization with Realtime API
+  path: examples/Context_summarization_with_realtime_api.ipynb
+  date: 2025-05-10
+  authors:
+    - minh-hoque
+  tags:
+    - audio
+    - speech
+    - tiktoken
+
 - title: Comparing Speech-to-Text Methods with the OpenAI API
   path: examples/Speech_transcription_methods.ipynb
   date: 2025-04-29
