Audio transcriber #25

damielulade · 2024-01-19T14:46:18Z

In src/lib/transcriber.ts:

transcribe() is the exported function to integrate into the next stage of the project.
extract_audio() uses fluent-ffmpeg to read an .mp4 video from the filesystem and write an .mp3 audio file back to the filesystem.
run_query() reads an audio file and queries the Whisper-large-v3 Inference API to transcribe it.
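
For reviewers unfamiliar with the Inference API, here is a rough sketch of what a query like run_query() might send and read back. It is not the code in this PR: the endpoint URL, the `{ text: ... }` response shape, and the names buildWhisperRequest and parseWhisperResponse are assumptions made for illustration.

```typescript
// Sketch only: the URL and every name below are assumptions, not this PR's code.
const WHISPER_URL =
  "https://api-inference.huggingface.co/models/openai/whisper-large-v3";

interface WhisperRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: Uint8Array;
}

// Build the HTTP request for one audio file (raw audio bytes go in the body).
function buildWhisperRequest(audio: Uint8Array, token: string): WhisperRequest {
  return {
    url: WHISPER_URL,
    method: "POST",
    headers: { Authorization: `Bearer ${token}` },
    body: audio,
  };
}

// Pull the transcript out of the JSON reply, assumed to look like { text: "..." }.
function parseWhisperResponse(json: unknown): string {
  if (typeof json === "object" && json !== null && "text" in json) {
    const text = (json as { text: unknown }).text;
    if (typeof text === "string") return text;
  }
  throw new Error("Unexpected response shape from the transcription API");
}
```

Keeping the request construction and response parsing separate from the network call makes both easy to unit test without hitting the API.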

We need to find out the file size/audio length limit of the fast-prototyping API so that we can send longer recordings to it in chunks (this slows the code down significantly). The other options are paying for the development API to handle longer videos, or using Whisper as a Python module (ask @VrishYT for more details on that).
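
If we go with chunking, the bookkeeping could look roughly like the sketch below. The actual limit is still unknown, so maxChunkSec is a placeholder parameter, and planChunks and joinTranscripts are hypothetical helpers rather than anything in transcriber.ts.

```typescript
// Sketch only: hypothetical helpers, not part of transcriber.ts.
interface Chunk {
  index: number;
  startSec: number;
  endSec: number;
}

// Split a recording of durationSec seconds into consecutive time ranges,
// each at most maxChunkSec long. The last range may be shorter.
function planChunks(durationSec: number, maxChunkSec: number): Chunk[] {
  if (durationSec < 0 || maxChunkSec <= 0) {
    throw new RangeError("durationSec must be >= 0 and maxChunkSec must be > 0");
  }
  const chunks: Chunk[] = [];
  for (let start = 0, i = 0; start < durationSec; start += maxChunkSec, i++) {
    chunks.push({
      index: i,
      startSec: start,
      endSec: Math.min(start + maxChunkSec, durationSec),
    });
  }
  return chunks;
}

// Join per-chunk transcripts in order, skipping chunks that came back empty.
function joinTranscripts(parts: string[]): string {
  return parts
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .join(" ");
}
```

Each planned range would then be cut from the audio and sent as its own request. Cutting at fixed offsets can split a word across two chunks, so overlapping ranges or cutting on silence may be worth considering.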

vercel · 2024-01-19T14:46:24Z

The latest updates on your projects.

Name	Status	Preview	Comments	Updated (UTC)
summentia	❌ Failed (Inspect)			Jan 22, 2024 1:37pm

rishi-khiroya and others added 30 commits January 15, 2024 15:51

build: Fix Flowbite installation

d8ac179

style: Remove dark bg

c04ce80

feat: Add file upload box.

cbc39fa

feat: Roughly implement file uploading.

77fca83

style: Modify style of file upload.

f874e7f

refactor: Remove unused Navbar component

a4e4f81

feat: Extract lecture uploads to its own component

ac49aa2

feat: Roughly implement file uploads

ee047f1

feat: Add input features for testing transcript summary

3782d86

Add a textarea for transcript input and a summarise button that extracts the transcript. It currently only raises an alert; the summary is not yet implemented.

feat: Create initial internal representation for a transcript and a transcript timestamped by slides

00d4810

build: Remove lib config setting in tsconfig

c107a65

feat: Add openai-test.js to lib for summary requests to GPT 3.5

bd07435

Connect GPT 3.5 API in order to make summary requests. Not yet linked to frontend.

feat: Modify background colour

9838541

feat: Add navbar and footbar components and add to default layout

1eab60e

style: Comment out unused footer links for now.

303695e

fix: Fix home page styling

1e99145

fix: Modify input box to have type URL

fd2e5db

feat: Implement submitting a URL for a lecture

d3cfce9

feat: Summariser script takes transcript as argument

bf346e6

openai-test.js can now be run with node openai-test.js "your-transcript-here" to output the summary of the argument given.

feat: Summariser now extracts text only from chatGPT response

30f67a7

Extract the message content from the completion returned by the GPT API rather than the entire response
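
As a rough illustration of the extraction this commit describes: a chat completion carries its reply under choices[0].message.content. The ChatCompletionLike type and extractSummary function below are hypothetical stand-ins, not the code in this commit.

```typescript
// Sketch only: a minimal stand-in for the parts of the completion we read.
interface ChatCompletionLike {
  choices: Array<{ message: { role: string; content: string | null } }>;
}

// Return just the assistant's text instead of the whole response object.
function extractSummary(completion: ChatCompletionLike): string {
  const content = completion.choices[0]?.message.content;
  if (content == null) {
    throw new Error("Completion contained no message content");
  }
  return content.trim();
}
```

Throwing on a missing message keeps an empty or malformed response from being passed along as if it were a summary.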

feat: Restructure internal representation

790f4a9

Added an abstract class that is extended by the classes that contain transcripts of the whole lecture and transcripts split by slides

feat: Add file for document format internal representation

6570d14

build: Install vercel blob and postgres sdks

57e251c

feat: Create initial plan for document_format IR

bc0d4f4

Define plan for document format based on classes in the IR branch. These classes (Transcript, Slide, etc.) are not yet present on this branch.

Added a test case for the summariser functionality

0b1bef9

Merge remote-tracking branch 'origin/IR' into transcript_summariser

61e98fe

Merge remote-tracking branch 'origin/summariser_tests' into transcript_summariser

627e0ad

feat: Implement Summariser class

0c2fb3c

Move openai-test.js code into Summariser class to use with summariser tests

fix: Fix Summariser argument type

df1ff73

Change Summariser argument type from string to String

fix: Fix Promise<String> type error in summariser_tests.ts

a00d6b7

Resolve summary in summariser_tests.js and set to new variable if it isn't null (initialised to empty string).

refactor: Export transcribe function from transcriber.ts

3338665

damielulade requested a review from preesha-gehlot January 19, 2024 14:46

damielulade self-assigned this Jan 19, 2024

feat: updated test suite to use vitest

45fd4eb

This was linked to issues Jan 19, 2024

Implement audio transcriber #8

Open

Split video into audio and video #4

Closed

Khiroya Rishi and others added 16 commits January 19, 2024 17:01

fix: Fix LectureUpload to allow click and upload

cff957b

fix: Update home page to use updated LectureUpload

71eda13

feat: Implement uploading uploaded files to BLOB.

4b2acb5

Merge "audio_transcriber" into "db_blob"

2a654b0

build: Install ffmpeg

f45afa7

fix: Fix switch case statement error in summariser

b0e85d2

Replace switch statement with if else to fix summariser customisations

refactor: Add 'any' types for event params in Svelte event handlers

947a9ee

build: Install ffmpeg-static

a17a96d

refactor: Change auto type to any

c9f14e7

feat: Integrate file uploads and transcriber

0562818

Merge remote-tracking branch 'origin/transcript_summariser' into pipeline

4985bc2

refactor: Delete empty lib/index.ts file

aa20454

style: Fix code style, typing and immutability of variables.

8e642b9

build: Modify Summariser to read OpenAI API key from env

1ddb4b3

feat: Integrate summariser into the pipeline

6203aa9

Merge branch 'pipeline' into audio_transcriber

674ad35

vercel bot had a problem deploying to Preview January 22, 2024 11:43 Failure

chore: Delete test video mp4s

766e2c6

vercel bot had a problem deploying to Preview January 22, 2024 11:43 Failure

rishi-khiroya and others added 3 commits January 22, 2024 11:43

Merge remote-tracking branch 'origin/Document_Format_IR' into pipeline

1ec2577

Merge branch 'pipeline' into audio_transcriber

44a88eb

test: Add tests for transcribing audio from video files

1a5fb87

FFmpeg does not exit when a video with no audio is the input file
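
One possible guard for this, untested against the hang itself: check for an audio stream before starting the extraction. The sketch assumes the file's metadata has already been read with ffprobe, which reports a codec_type for each stream; ProbeLike and hasAudioStream are hypothetical names.

```typescript
// Sketch only: ProbeLike mirrors the part of ffprobe's output we inspect.
interface ProbeLike {
  streams: Array<{ codec_type?: string }>;
}

// True if at least one stream in the probed file is an audio stream.
function hasAudioStream(probe: ProbeLike): boolean {
  return probe.streams.some((s) => s.codec_type === "audio");
}
```

With a check like this, a video-only input could be rejected with a clear error before FFmpeg is started.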

vercel bot had a problem deploying to Preview January 22, 2024 13:37 Failure
