[Feature Request] TTFT & ITL Timeouts #107

@markVaykhansky

Description

In addition to --timeout, please add two timeout-related CLI arguments:
ttft_timeout_s - The maximum number of seconds to wait for the first token. The request fails if no token is received within this time.
itl_timeout_s - The maximum number of seconds allowed between two consecutive tokens once the first token has been generated. The request fails if this gap is exceeded.
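
A minimal sketch of the requested semantics, assuming the client receives tokens through an async iterator. The names stream_with_timeouts and TokenTimeoutError are hypothetical and not part of the existing tool; only ttft_timeout_s and itl_timeout_s come from this request.

```python
import asyncio
from typing import AsyncIterator, Optional


class TokenTimeoutError(Exception):
    """Hypothetical error raised when a TTFT or ITL limit is exceeded."""


async def stream_with_timeouts(
    tokens: AsyncIterator[str],
    ttft_timeout_s: Optional[float] = None,
    itl_timeout_s: Optional[float] = None,
) -> list[str]:
    """Collect tokens, failing if the first or any later token arrives late."""
    received: list[str] = []
    iterator = tokens.__aiter__()
    while True:
        # The first wait is bounded by TTFT, every later wait by ITL.
        # A limit of None means "no limit" for that phase.
        limit = ttft_timeout_s if not received else itl_timeout_s
        try:
            token = await asyncio.wait_for(iterator.__anext__(), timeout=limit)
        except StopAsyncIteration:
            # The stream ended normally.
            return received
        except asyncio.TimeoutError:
            kind = "TTFT" if not received else "ITL"
            raise TokenTimeoutError(f"{kind} timeout after {limit}s") from None
        received.append(token)
```

In this sketch the two limits are independent of the overall --timeout, so a request can fail early on a stalled stream even when the total time budget has not been used up.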

Motivation
Currently, there is a single timeout setting for each request. Additional timeouts such as ttft_timeout_s and itl_timeout_s would be useful, because a long wait for the first token or a long gap between tokens can also indicate that the request should be considered a failure.
