Merge branch 'main' into fix-index-compatible-check
thecoop authored Jan 17, 2025
2 parents cc065c7 + 06e1621 commit 4fac20f
Showing 106 changed files with 2,634 additions and 587 deletions.
5 changes: 5 additions & 0 deletions docs/changelog/119536.yaml
@@ -0,0 +1,5 @@
pr: 119536
summary: Fix ROUND() with unsigned longs throwing in some edge cases
area: ES|QL
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/120271.yaml
@@ -0,0 +1,5 @@
pr: 120271
summary: Optimize indexing points with index and doc values set to true
area: Geo
type: enhancement
issues: []
90 changes: 90 additions & 0 deletions docs/reference/esql/functions/kibana/definition/round.json

Generated file; diff not rendered by default.

5 changes: 5 additions & 0 deletions docs/reference/esql/functions/types/round.asciidoc

Generated file; diff not rendered by default.

4 changes: 3 additions & 1 deletion docs/reference/mapping/types/dense-vector.asciidoc
@@ -121,11 +121,13 @@ The three following quantization strategies are supported:
* `bbq` - experimental:[] Better binary quantization which reduces each dimension to a single bit precision. This reduces the memory footprint by 96% (or 32x) at a larger cost of accuracy. Generally, oversampling during query time and reranking can help mitigate the accuracy loss.


-When using a quantized format, you may want to oversample and rescore the results to improve accuracy. See <<dense-vector-knn-search-reranking, oversampling and rescoring>> for more information.
+When using a quantized format, you may want to oversample and rescore the results to improve accuracy. See <<dense-vector-knn-search-rescoring, oversampling and rescoring>> for more information.

To use a quantized index, you can set your index type to `int8_hnsw`, `int4_hnsw`, or `bbq_hnsw`. When indexing `float` vectors, the current default
index type is `int8_hnsw`.
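
As a minimal sketch of how an index might opt into one of these quantized types (the index name `my-quantized-index`, field name `my_vector`, and dimension count below are illustrative placeholders, not part of this change):

[source,console]
----
PUT my-quantized-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "int8_hnsw" <1>
        }
      }
    }
  }
}
----
<1> Selects the quantized index type; `int4_hnsw` or `bbq_hnsw` could be set here instead.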

Quantized vectors can use <<dense-vector-knn-search-rescoring,oversampling and rescoring>> to improve accuracy on approximate kNN search results.

NOTE: Quantization will continue to keep the raw float vector values on disk for reranking, reindexing, and quantization improvements over the lifetime of the data.
This means disk usage will increase by ~25% for `int8`, ~12.5% for `int4`, and ~3.1% for `bbq` due to the overhead of storing the quantized and raw vectors.

3 changes: 3 additions & 0 deletions docs/reference/query-dsl/knn-query.asciidoc
@@ -137,6 +137,9 @@ documents are then scored according to <<dense-vector-similarity, `similarity`>>
and the provided `boost` is applied.
--

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-rescore-vector]


`boost`::
+
--
24 changes: 24 additions & 0 deletions docs/reference/rest-api/common-parms.asciidoc
@@ -1356,3 +1356,27 @@ tag::rrf-filter[]
Applies the specified <<query-dsl-bool-query, boolean query filter>> to all of the specified sub-retrievers,
according to each retriever's specifications.
end::rrf-filter[]

tag::knn-rescore-vector[]

`rescore_vector`::
+
--
(Optional, object) Functionality in preview:[]. Apply oversampling and rescoring to quantized vectors.

NOTE: Rescoring only makes sense for quantized vectors; when <<dense-vector-quantization,quantization>> is not used, the original vectors are used for scoring.
The rescore option is ignored for non-quantized `dense_vector` fields.

`oversample`::
(Required, float)
+
Applies the specified oversample factor to `k` on the approximate kNN search.
The approximate kNN search will:

* Retrieve `num_candidates` candidates per shard.
* From these candidates, the top `k * oversample` candidates per shard will be rescored using the original vectors.
* The top `k` rescored candidates will be returned.

See <<dense-vector-knn-search-rescoring,oversampling and rescoring quantized vectors>> for details.
--
end::knn-rescore-vector[]
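
For illustration, a hedged sketch of a kNN query that opts into the preview `rescore_vector` option defined above (the index name, field name, and vector values are placeholders):

[source,console]
----
POST my-index/_search
{
  "query": {
    "knn": {
      "field": "my_vector",
      "query_vector": [0.12, -0.45, 0.91],
      "num_candidates": 100,
      "rescore_vector": {
        "oversample": 2.0 <1>
      }
    }
  }
}
----
<1> Candidates retrieved by the approximate search are oversampled by a factor of 2 and rescored with the original (non-quantized) vectors.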
8 changes: 5 additions & 3 deletions docs/reference/search/retriever.asciidoc
@@ -233,6 +233,8 @@ include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-filter]
+
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-similarity]

include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=knn-rescore-vector]

===== Restrictions

The parameters `query_vector` and `query_vector_builder` cannot be used together.
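
Since the knn retriever now includes the `rescore_vector` option, a rough sketch of a request combining it with `k` and `num_candidates` might look like the following (index, field, and vector values are placeholders):

[source,console]
----
GET my-index/_search
{
  "retriever": {
    "knn": {
      "field": "my_vector",
      "query_vector": [0.12, -0.45, 0.91],
      "k": 10,
      "num_candidates": 100,
      "rescore_vector": {
        "oversample": 2.0 <1>
      }
    }
  }
}
----
<1> With `k: 10` and `oversample: 2.0`, the top 20 candidates per shard are rescored with the original vectors and the best 10 results are returned.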
@@ -576,15 +578,15 @@ This example demonstrates how to deploy the {ml-docs}/ml-nlp-rerank.html[Elastic

Follow these steps:

. Create an inference endpoint for the `rerank` task using the <<put-inference-api, Create {infer} API>>.
+
[source,console]
----
PUT _inference/rerank/my-elastic-rerank
{
"service": "elasticsearch",
"service_settings": {
"model_id": ".rerank-v1",
"model_id": ".rerank-v1",
"num_threads": 1,
"adaptive_allocations": { <1>
"enabled": true,
@@ -595,7 +597,7 @@
}
----
// TEST[skip:uses ML]
<1> {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[Adaptive allocations] will be enabled with a minimum of 1 and a maximum of 10 allocations.
+
. Define a `text_similarity_rerank` retriever:
+
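The retriever definition for this step is collapsed in the diff view above; purely as an illustrative sketch (index name, field, and query text are placeholders, and the shape follows the `text_similarity_reranker` retriever described earlier in this file), it might look like:

[source,console]
----
GET my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": { "text": "How often does the moon hide the sun?" }
          }
        }
      },
      "field": "text",
      "inference_id": "my-elastic-rerank", <1>
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}
----
<1> The inference endpoint created in the previous step.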
(Diffs for the remaining changed files are not shown.)
