Open
Description
In addition to --timeout, please add two timeout-related CLI arguments:
ttft_timeout_s - The maximum number of seconds to wait for the first token; the request fails if no token has been received by then.
itl_timeout_s - The maximum number of seconds allowed between two consecutive tokens once the first token has been generated; the request fails if a gap exceeds it.
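
To make the intended semantics concrete, here is a minimal sketch of how the two limits could be enforced on a streaming response. It is only an illustration under assumptions: `collect_with_timeouts` and `TokenTimeoutError` are hypothetical names, the stream is modeled as a plain async iterator of token strings, and none of this reflects the tool's actual internals.

```python
import asyncio
from typing import AsyncIterator, List, Optional


class TokenTimeoutError(Exception):
    """Hypothetical error: a token did not arrive within its deadline."""

    def __init__(self, phase: str, limit_s: float) -> None:
        super().__init__(f"{phase} timeout exceeded ({limit_s}s)")
        self.phase = phase  # "ttft" or "itl"


async def collect_with_timeouts(
    stream: AsyncIterator[str],
    ttft_timeout_s: Optional[float] = None,
    itl_timeout_s: Optional[float] = None,
) -> List[str]:
    """Collect tokens, failing if the first token or any later token is late.

    A limit of None means "no limit" for that phase.
    """
    tokens: List[str] = []
    iterator = stream.__aiter__()
    while True:
        waiting_for_first = not tokens
        # The first token is bounded by the TTFT limit, later ones by the ITL limit.
        limit = ttft_timeout_s if waiting_for_first else itl_timeout_s
        try:
            token = await asyncio.wait_for(iterator.__anext__(), timeout=limit)
        except StopAsyncIteration:
            return tokens  # stream finished normally
        except asyncio.TimeoutError:
            phase = "ttft" if waiting_for_first else "itl"
            raise TokenTimeoutError(phase, limit) from None
        tokens.append(token)
```

In this sketch the existing per-request limit is left out; a caller could still wrap the whole call in another `asyncio.wait_for` so that the overall timeout continues to apply alongside the two new ones.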
Motivation
Currently, there's a timeout setting for each request. However, additional timeouts such as ttft_timeout and itl_timeout could be useful, as they also indicate situations where we might want to consider the request a failure.
Metadata
Status
Ready