Concurrent requests #406
-
Hi, see the section "Running multiple transcriptions in parallel" in #100 (comment), which shows how to initialize multiple model workers that can process requests in parallel.
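For reference, here is a minimal sketch of that pattern, assuming two GPUs and two workers per replica as discussed below; the model size, file names, and pool size are illustrative assumptions, not values from the linked comment:

```python
from concurrent.futures import ThreadPoolExecutor

from faster_whisper import WhisperModel

# device_index=[0, 1] replicates the model on both GPUs;
# num_workers=2 allows multiple concurrent transcriptions per replica.
model = WhisperModel(
    "large-v2",
    device="cuda",
    device_index=[0, 1],
    compute_type="float16",
    num_workers=2,
)

def transcribe(path):
    segments, info = model.transcribe(path)
    # segments is a lazy generator: it must be consumed in this thread
    # for the transcription to actually run.
    return [segment.text for segment in segments]

# Placeholder inputs; four threads to match the four worker slots
# (2 GPUs x 2 workers) described in the reply below.
audio_files = ["audio1.wav", "audio2.wav", "audio3.wav", "audio4.wav"]
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(transcribe, audio_files))
```

Each call to `transcribe` must fully consume the segments generator inside its own thread; otherwise the work is deferred and nothing runs in parallel.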
-
@guillaumekln I set up an instance with 2 GPUs, using device_index=[0, 1] and num_workers=2, so it should handle 4 requests at a time, and in the best scenario it does. The problem is that the overall time taken by the app is the same as when I run it on two single-GPU instances with gunicorn (the API is a Flask app). I have observed that it initially takes time to respond to the first request, which is why there is no time difference. Please help me clarify this. Thanks and regards
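The slow first request described here is consistent with one-time startup costs (loading weights onto the GPUs, CUDA context creation), which a common mitigation is to pay before serving traffic. A minimal warm-up sketch, assuming the same model configuration as above and a short placeholder clip `warmup.wav` shipped with the app:

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v2", device="cuda", device_index=[0, 1],
                     compute_type="float16", num_workers=2)

# Run one dummy transcription at startup so the first real request
# does not pay the one-time initialization cost. "warmup.wav" is a
# placeholder for any short audio clip.
segments, _ = model.transcribe("warmup.wav")
list(segments)  # consume the lazy generator so the model actually runs once
```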
-
Hi all,
Is there any benchmark or literature (for me to understand) on the number of simultaneous (concurrent) requests faster-whisper can process, especially on GPUs? In my use case, I will typically be sending short sentences/phrases.
Which parameter will let me control simultaneous/concurrent use?
cheers,