add EIS rerank default inference endpoint #129681

brendan-jugan-elastic · 2025-06-19T04:23:30Z

Overview

This PR adds the rerank default inference endpoint for the Elastic Inference Service. This change makes a few assumptions:

The future model ID contained in the EIS authorizations response is rerank-v1
The default inference endpoint ID follows existing conventions and is .rerank-v1-elastic
The EIS task type -> inference API task type mapping is rerank/text/text-similarity -> TaskType.RERANK
- This was pulled from this document outlining EIS task type mappings. This information might be outdated, so happy to modify the mapping if we want.
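
The naming and mapping assumptions above can be sketched in shell. This is illustrative only: `default_endpoint_id` and `map_eis_task_type` are hypothetical helper names, not functions from the Elasticsearch codebase, and the `.<model-id>-elastic` pattern is inferred from the endpoint IDs that appear in this PR. Only the one task type mapping stated here is included.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the assumptions stated in this PR.

# Build a default endpoint ID from an EIS model ID: ".<model-id>-elastic".
default_endpoint_id() {
  printf '.%s-elastic\n' "$1"
}

# Map an EIS task type to an inference API task type.
# Only the mapping named in this PR is listed; anything else is rejected.
map_eis_task_type() {
  case "$1" in
    rerank/text/text-similarity) echo "rerank" ;;
    *) echo "unmapped EIS task type: $1" >&2; return 1 ;;
  esac
}

default_endpoint_id "rerank-v1"
map_eis_task_type "rerank/text/text-similarity"
```

If the model ID in the EIS authorizations response ends up differing from rerank-v1, only the argument passed to `default_endpoint_id` would change in this sketch.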

Testing

My testing included basic chat completions and sparse embeddings requests using the existing default endpoints with eis-gateway and eis-ray running locally, to ensure existing functionality works as expected. I've also modified the authorized models in my local eis-gateway, restarted ES, and verified that the list of default endpoints includes the new one for rerank.

Chat Completions:

curl -k -N --location 'http://localhost:9200/_inference/chat_completion/.rainbow-sprinkles-elastic/_stream' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Basic ${ES_AUTH}" \
  --data '{
    "messages": [
        {
            "role": "user",
            "content": "In only two digits and nothing else, what is the meaning of life?"
        }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 300
}'

Sparse Embeddings:

curl -k --location --request POST 'http://localhost:9200/_inference/sparse_embedding/.elser-v2-elastic' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Basic ${ES_AUTH}" \
  --data '{
    "input": "A blue sky"
}'

Default Endpoints:

curl -k --location --request GET 'http://localhost:9200/_inference?pretty' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Basic ${ES_AUTH}"
{
  "endpoints" : [
    {
      "inference_id" : ".elser-v2-elastic",
      "task_type" : "sparse_embedding",
      "service" : "elastic",
      "service_settings" : {
        "model_id" : "elser-v2",
        "rate_limit" : {
          "requests_per_minute" : 1000
        }
      }
    },
    {
      "inference_id" : ".rainbow-sprinkles-elastic",
      "task_type" : "chat_completion",
      "service" : "elastic",
      "service_settings" : {
        "model_id" : "rainbow-sprinkles",
        "rate_limit" : {
          "requests_per_minute" : 720
        }
      }
    },
    {
      "inference_id" : ".rerank-v1-elastic",
      "task_type" : "rerank",
      "service" : "elastic",
      "service_settings" : {
        "model_id" : "rerank-v1",
        "rate_limit" : {
          "requests_per_minute" : 500
        }
      }
    },
    other default inference endpoints....
  ]
}
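
A request against the new endpoint itself was not part of the testing reported above. A minimal sketch of one follows, assuming the same local setup (`localhost:9200`, `ES_AUTH`) and the standard inference API rerank body of a `query` plus a list of `input` documents. The function names and sample texts are made up for illustration, and whether a local eis-gateway actually serves rerank-v1 is not confirmed here.

```shell
#!/usr/bin/env bash
# Sketch only: exercise the new default rerank endpoint on a local cluster.

# Print a minimal rerank request body: one query and the documents to score.
rerank_payload() {
  cat <<'EOF'
{
  "query": "What color is the sky?",
  "input": ["A blue sky", "A green field", "A red barn"]
}
EOF
}

# Send the body to the default rerank endpoint.
# Assumes ES on localhost:9200 and ES_AUTH exported, as in the commands above.
rerank_request() {
  rerank_payload | curl -k --location --request POST \
    'http://localhost:9200/_inference/rerank/.rerank-v1-elastic' \
    --header 'Content-Type: application/json' \
    --header "Authorization: Basic ${ES_AUTH}" \
    --data @-
}

# Uncomment to run against a local cluster:
# rerank_request
```

The payload is kept in its own function so it can be inspected or swapped without touching the curl invocation.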

elasticsearchmachine · 2025-06-19T04:23:55Z

Pinging @elastic/search-inference-team (Team:Search - Inference)

elasticsearchmachine · 2025-06-19T04:23:55Z

Pinging @elastic/search-eng (Team:SearchOrg)

add EIS rerank default inference endpoint

59e2c38

brendan-jugan-elastic requested review from timgrein, jonathan-buttner and a team June 19, 2025 04:23

brendan-jugan-elastic added >enhancement :SearchOrg/Inference Label for the Search Inference team v8.19.0 v9.1.0 labels Jun 19, 2025

elasticsearchmachine added Team:SearchOrg Meta label for the Search Org (Enterprise Search) Team:Search - Inference labels Jun 19, 2025

elasticsearchmachine and others added 4 commits June 19, 2025 04:32

[CI] Auto commit changes from spotless

ecf6908

Merge branch 'main' into eis-rerank-default-inference-endpoint

ef407a7

fix integ tests

121c68e

Merge branch 'main' into eis-rerank-default-inference-endpoint

e3e68d3
