Knn query vector is serialized as array of double, not float #1034

Open
@tmoschou

Java API client version

9.0.3

Java version

21

Elasticsearch Version

9.0.3

Problem description

KnnQuery models queryVector as List<Float> but serializes it as a list of doubles, adding roughly 9 bytes per value. For a dense_vector of a typical size, e.g. 1024 dimensions, this adds roughly 9 KB to the uncompressed request body.
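The size estimate above can be checked with a rough, JDK-only sketch. It assumes the JSON generator prints a widened float the way Double.toString does, which may not hold exactly for every generator. The class and method names (VectorOverhead, extraChars, randomVector) are hypothetical and not part of the client.

```java
import java.util.Random;

public class VectorOverhead {
    // Extra characters emitted when each float is printed after widening to double,
    // compared with printing the float's own shortest decimal form.
    static long extraChars(float[] vector) {
        long extra = 0;
        for (float f : vector) {
            extra += Double.toString((double) f).length() - Float.toString(f).length();
        }
        return extra;
    }

    // Hypothetical helper: a reproducible vector with components in [-1, 1).
    static float[] randomVector(int dims, long seed) {
        Random random = new Random(seed);
        float[] v = new float[dims];
        for (int i = 0; i < dims; i++) {
            v[i] = random.nextFloat() * 2f - 1f;
        }
        return v;
    }

    public static void main(String[] args) {
        float[] v = randomVector(1024, 42L);
        long extra = extraChars(v);
        System.out.printf("extra bytes: %d (%.1f per value)%n", extra, (double) extra / v.length);
    }
}
```

Values that are exactly representable with few digits, such as 100.0 or 0.5, cost nothing extra, so the real overhead depends on the vector's contents.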

This happens because the float is widened to a double when it is passed to jakarta.json.stream.JsonGenerator.write(double); JsonGenerator has no write(float) overload. One solution would be to write the value as a BigDecimal instead, converting it to a string first:

new java.math.BigDecimal(Float.toString(float))
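As a minimal, JDK-only illustration of the conversion proposed above, the following sketch compares the two paths for a single value. The class and method names (FloatWidening, viaDouble, viaBigDecimal) are hypothetical, and viaDouble only approximates what a generator does with write(double).

```java
import java.math.BigDecimal;

public class FloatWidening {
    // Roughly what write(double) sees: the float widened to a double,
    // which exposes digits beyond the float's own precision.
    static String viaDouble(float f) {
        return Double.toString((double) f);
    }

    // The proposed workaround: go through the float's shortest decimal string.
    // Note: this throws NumberFormatException for NaN and infinities.
    static BigDecimal viaBigDecimal(float f) {
        return new BigDecimal(Float.toString(f));
    }

    public static void main(String[] args) {
        float f = 0.501f;
        System.out.println("widened to double: " + viaDouble(f));
        System.out.println("via BigDecimal:    " + viaBigDecimal(f));
    }
}
```

Two caveats if this approach is adopted: NaN and infinite values would need separate handling, and BigDecimal.toString() may choose a different but numerically equal notation from Float.toString() for small magnitudes (for example plain decimal instead of 1.0E-5), so a string comparison like the one in the test below may need adjusting.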

A unit test which reproduces the issue

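// Imports assumed for this snippet: JUnit 5 and AssertJ are guesses, the original does not say which test libraries it uses.
import static org.assertj.core.api.Assertions.assertThat;

import java.util.List;

import org.junit.jupiter.api.Test;

import co.elastic.clients.elasticsearch._types.query_dsl.Query;
import co.elastic.clients.elasticsearch._types.query_dsl.QueryBuilders;
import co.elastic.clients.json.JsonpUtils;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;

public class EsTest {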
    @Test
    public void knnSerialization() {
        var expectedJson = """
            {"knn":{"field":"f1","query_vector":[3.1415927,2.7182817,0.501,100.0,1.0E-5]}}""";
        Query query1 = QueryBuilders.knn(k -> k
            .queryVector(List.of((float) Math.PI, (float) Math.E, 0.501f, 100f, 1e-5f))
            .field("f1")
        );
        String actualJson = JsonpUtils.toJsonString(query1, new JacksonJsonpMapper());
        assertThat(actualJson).isEqualTo(expectedJson);
        // Expected :"{"knn":{"field":"f1","query_vector":[3.1415927,2.7182817,0.501,100.0,1.0E-5]}}"
        // Actual   :"{"knn":{"field":"f1","query_vector":[3.1415927410125732,2.7182817459106445,0.5009999871253967,100.0,9.999999747378752E-6]}}"
    }
}
