diff --git a/standards/core/store.md b/standards/core/store.md index 0eedc62..a05b0e0 100644 --- a/standards/core/store.md +++ b/standards/core/store.md @@ -1,69 +1,41 @@ --- -title: WAKU2-STORE +slug: 13 +title: 13/WAKU2-STORE name: Waku Store Query editor: Hanno Cornelius contributors: - Dean Eigenmann - Oskar Thorén - - Aaryamann Challani + - Aaryamann Challani - Sanaz Taheri --- -> **Note:** This version of WAKU2-STORE is earmarked to replace RFC [13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/13/store.md) once it reaches draft status - ---- - -# Abstract - -This specification explains the `WAKU2-STORE` protocol which enables querying of messages received through the relay protocol and -stored by other nodes. -It also supports pagination for more efficient querying of historical messages. +## Abstract +This specification explains the `WAKU2-STORE` protocol, +which enables querying of [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md)s. + **Protocol identifier***: `/vac/waku/store-query/3.0.0` -## Terminology +### Terminology + The term PII, Personally Identifiable Information, refers to any piece of data that can be used to uniquely identify a user. For example, the signature verification key, and the hash of one's static IP address are unique for each user and hence count as PII. -# Design Requirements -The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, -“RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119](https://www.ietf.org/rfc/rfc2119.txt). - -Nodes willing to provide the storage service using `WAKU2-STORE` protocol, -SHOULD provide a complete and full view of message history. -As such, they are required to be *highly available* and -specifically have a *high uptime* to consistently receive and store network messages. 
-The high uptime requirement makes sure that no message is missed out hence a complete and -intact view of the message history is delivered to the querying nodes. -Nevertheless, in case storage provider nodes cannot afford high availability, -the querying nodes may retrieve the historical messages from multiple sources to achieve a full and intact view of the past. - -The concept of `ephemeral` messages introduced in [`WAKU2-MESSAGE`](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) affects `WAKU2-STORE` as well. -Nodes running `WAKU2-STORE` SHOULD support `ephemeral` messages as specified in [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md). -Nodes running `WAKU2-STORE` SHOULD NOT store messages with the `ephemeral` flag set to `true`. +## Wire Specification +The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, +“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and +“OPTIONAL” in this document are to be interpreted as described in [RFC2119](https://www.ietf.org/rfc/rfc2119.txt). -# Adversarial Model -Any peer running the `WAKU2-STORE` protocol, i.e. -both the querying node and the queried node, are considered as an adversary. -Furthermore, -we currently consider the adversary as a passive entity that attempts to collect information from other peers to conduct an attack but -it does so without violating protocol definitions and instructions. -As we evolve the protocol, -further adversarial models will be considered. -For example, under the passive adversarial model, -no malicious node hides or -lies about the history of messages as it is against the description of the `WAKU2-STORE` protocol. +### Design Requirements -The following are not considered as part of the adversarial model: -- An adversary with a global view of all the peers and their connections. 
-- An adversary that can eavesdrop on communication links between arbitrary pairs of peers (unless the adversary is one end of the communication). -In specific, the communication channels are assumed to be secure. - -# Wire Specification +The concept of `ephemeral` messages introduced in [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) affects `WAKU2-STORE` as well. +Nodes running `WAKU2-STORE` SHOULD support `ephemeral` messages as specified in [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md). +Nodes running `WAKU2-STORE` SHOULD NOT store messages with the `ephemeral` flag set to `true`. -## Payloads +### Payloads ```protobuf syntax = "proto3"; @@ -111,43 +83,52 @@ message StoreQueryResponse { optional bytes pagination_cursor = 51; } ``` -## General store query concepts -### Waku message key-value pairs +### General Store Query Concepts -The store query protocol operates as a query protocol for a key-value store of historical Waku messages, -with each entry having a [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) and associated pubsub topic as value, -and [deterministic message hash](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md#deterministic-message-hashing) as key. +#### Waku Message Key-Value Pairs + +The store query protocol operates as a query protocol for a key-value store of historical messages, +with each entry having a [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) +and associated `pubsub_topic` as the value, +and [deterministic message hash](/waku/standards/core/14/message.md#deterministic-message-hashing) as the key. The store can be queried to return either a set of keys or a set of key-value pairs. -Within the store query protocol, Waku message keys and values MUST be represented in a `WakuMessageKeyValue` message. 
-This message MUST contain the deterministic `message_hash` as key. -It MAY contain the full `WakuMessage` and associated pubsub topic as value in the `message` and `pubsub_topic` fields, -depending on the use case as set out below. + +Within the store query protocol, +the [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) keys and +values MUST be represented in a `WakuMessageKeyValue` message. +This message MUST contain the deterministic `message_hash` as the key. +It MAY contain the full [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) and +associated pubsub topic as the value in the `message` and +`pubsub_topic` fields, depending on the use case as set out below. + If the message contains a value entry in addition to the key, both the `message` and `pubsub_topic` fields MUST be populated. The message MUST NOT have either `message` or `pubsub_topic` populated with the other unset. Both fields MUST either be set or unset. -### Waku message store eligibility +#### Waku Message Store Eligibility -In order for a Waku message to be eligible for storage: -- it MUST be a _valid_ [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md). -- the `timestamp` field MUST be populated with the Unix epoch time at which the message was generated in nanoseconds. +In order for a message to be eligible for storage: + +- it MUST be a _valid_ [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md). +- the `timestamp` field MUST be populated with the Unix epoch time, +at which the message was generated in nanoseconds. If at the time of storage the `timestamp` deviates by more than 20 seconds either into the past or the future when compared to the store node’s internal clock, the store node MAY reject the message. - the `ephemeral` field MUST be set to `false`. 
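The eligibility rules above could be sketched as a single check on the store node. This is a minimal illustration, not the reference implementation: the `WakuMessage` dataclass is a hypothetical stand-in for the 14/WAKU2-MESSAGE protobuf type, `is_eligible_for_storage` and `MAX_TIMESTAMP_DRIFT_NS` are names invented here, and the sketch assumes the node chooses to exercise its MAY and reject messages outside the 20 second window. Full validity checking against 14/WAKU2-MESSAGE is out of scope.

```python
import time
from dataclasses import dataclass
from typing import Optional

# 20 seconds expressed in nanoseconds, matching the drift bound in the text.
MAX_TIMESTAMP_DRIFT_NS = 20 * 1_000_000_000


@dataclass
class WakuMessage:
    """Hypothetical stand-in for the 14/WAKU2-MESSAGE protobuf type."""
    payload: bytes
    content_topic: str
    timestamp: Optional[int] = None  # Unix epoch time in nanoseconds
    ephemeral: bool = False


def is_eligible_for_storage(msg: WakuMessage, now_ns: Optional[int] = None) -> bool:
    """Return True if the message passes the storage eligibility rules.

    Assumption: this node rejects messages whose timestamp deviates by
    more than 20 seconds from its clock (the specification only says MAY).
    """
    if msg.ephemeral:
        return False  # ephemeral messages are not stored
    if msg.timestamp is None:
        return False  # timestamp MUST be populated
    if now_ns is None:
        now_ns = time.time_ns()
    # Reject only if the deviation is strictly greater than the bound.
    return abs(now_ns - msg.timestamp) <= MAX_TIMESTAMP_DRIFT_NS
```

Passing `now_ns` explicitly keeps the check deterministic in tests; a node would normally let it default to its internal clock.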
-### Waku message sorting +#### Waku message sorting -The key-value entries in the store MUST be time-sorted by the `WakuMessage` `timestamp` attribute. -Where two or more key-value entries have identical `timestamps`, +The key-value entries in the store MUST be time-sorted by the [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) `timestamp` attribute. +Where two or more key-value entries have identical `timestamp` values, the entries MUST be further sorted by the natural order of their message hash keys. Within the context of traversing over key-value entries in the store, _"forward"_ indicates traversing the entries in ascending order, whereas _"backward"_ indicates traversing the entries in descending order. -### Pagination +#### Pagination If a large number of entries in the store service node match the query criteria provided in a `StoreQueryRequest`, the client MAY make use of pagination @@ -167,19 +148,20 @@ A `StoreQueryResponse` without a populated `pagination_cursor` indicates that there are no more matching entries in the store. The client MAY request the next page of entries from the store service node -by populating a subsequent `StoreQueryRequest` with the `pagination_cursor` received in the `StoreQueryResponse`. +by populating a subsequent `StoreQueryRequest` with the `pagination_cursor` +received in the `StoreQueryResponse`. All other fields and query criteria MUST be the same as in the preceding `StoreQueryRequest`. A `StoreQueryRequest` without a populated `pagination_cursor` indicates that the client wants to retrieve the "first page" of the stored entries matching the query. -## Store Query Request +### Store Query Request A client node MUST send all historical message queries within a `StoreQueryRequest` message. This request MUST contain a `request_id`. The `request_id` MUST be a uniquely generated string. 
-If the store query client requires the store service node to include Waku message values in the query response, +If the store query client requires the store service node to include [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) values in the query response, it MUST set `include_data` to `true`. If the store query client requires the store service node to return only message hash keys in the query response, it SHOULD set `include_data` to `false`. @@ -190,15 +172,18 @@ There are two types of filter use cases: 1. Content filtered queries and 2. Message hash lookup queries -### Content filtered queries +#### Content filtered queries A store query client MAY request the store service node to filter historical entries by a content filter. Such a client MAY create a filter on content topic, on time range or on both. -To filter on content topic, the client MUST populate _both_ the `pubsub_topic` _and_ `content_topics` field. -The client MUST NOT populate either `pubsub_topic` or `content_topics` and leave the other unset. +To filter on content topic, +the client MUST populate _both_ the `pubsub_topic` _and_ `content_topics` field. +The client MUST NOT populate either `pubsub_topic` or +`content_topics` and leave the other unset. Both fields MUST either be set or unset. -A mixed content topic filter with just one of either `pubsub_topic` or `content_topics` set, SHOULD be regarded as an invalid request. +A mixed content topic filter with just one of either `pubsub_topic` or +`content_topics` set, SHOULD be regarded as an invalid request. To filter on time range, the client MUST set `time_start`, `time_end` or both. Each `time_` field should contain a Unix epoch timestamp in nanoseconds. @@ -209,20 +194,24 @@ If any of the content filter fields are set, namely `pubsub_topic`, `content_topic`, `time_start`, or `time_end`, the client MUST NOT set the `message_hashes` field. 
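The filter rules above (topic fields set together or not at all, and content filters never combined with `message_hashes`) could be checked before a request is sent or served. The sketch below is illustrative only: the `StoreQueryRequest` dataclass is a hypothetical mirror of the protobuf message using the field names from the text, with assumed Python types, and `validate_filters` is a helper name invented here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoreQueryRequest:
    """Hypothetical mirror of the protobuf message; types are assumptions."""
    request_id: str
    include_data: bool = False
    pubsub_topic: Optional[str] = None
    content_topics: List[str] = field(default_factory=list)
    time_start: Optional[int] = None  # Unix epoch time in nanoseconds
    time_end: Optional[int] = None    # Unix epoch time in nanoseconds
    message_hashes: List[bytes] = field(default_factory=list)


def validate_filters(req: StoreQueryRequest) -> List[str]:
    """Return a list of rule violations; an empty list means the filters are valid."""
    errors: List[str] = []
    has_pubsub = req.pubsub_topic is not None
    has_content = bool(req.content_topics)
    # pubsub_topic and content_topics MUST both be set or both be unset.
    if has_pubsub != has_content:
        errors.append("pubsub_topic and content_topics must be set together")
    has_content_filter = (
        has_pubsub
        or has_content
        or req.time_start is not None
        or req.time_end is not None
    )
    # Content filter fields and message_hashes are mutually exclusive.
    if has_content_filter and req.message_hashes:
        errors.append("content filter fields must not be combined with message_hashes")
    return errors
```

Returning the violations as a list, rather than raising on the first one, lets a service node put all of them into a single status description.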
-### Message hash lookup queries +#### Message hash lookup queries -A store query client MAY request the store service node to filter historical entries by one or more matching message hash keys. -This type of query acts as a "lookup" against a message hash key or set of keys already known to the client. +A store query client MAY request the store service node to filter historical entries by one or +more matching message hash keys. +This type of query acts as a "lookup" against a message hash key or +set of keys already known to the client. -In order to perform a lookup query, the store query client MUST populate the `message_hashes` field with the list of message hash keys it wants to lookup in the store service node. +In order to perform a lookup query, +the store query client MUST populate the `message_hashes` field with the list of message hash keys it wants to lookup in the store service node. If the `message_hashes` field is set, the client MUST NOT set any of the content filter fields, namely `pubsub_topic`, `content_topic`, `time_start`, or `time_end`. -### Presence queries +#### Presence queries -A presence query is a special type of lookup query that allows a client to check for the presence of one or more messages in the store service node, +A presence query is a special type of lookup query that allows a client to check for the presence of one or +more messages in the store service node, without retrieving the full contents (values) of the messages. This can, for example, be used as part of a reliability mechanism, whereby store query clients verify that previously published messages have been successfully stored. @@ -232,17 +221,22 @@ the store query client MUST populate the `message_hashes` field in the `StoreQue for which it wants to verify presence in the store service node. The `include_data` property MUST be set to `false`. 
The client SHOULD interpret every `message_hash` returned in the `messages` field of the `StoreQueryResponse` as present in the store. -The client SHOULD assume that all other message hashes included in the original `StoreQueryRequest` but not in the `StoreQueryResponse` is not present in the store. +The client SHOULD assume that all other message hashes included in the original `StoreQueryRequest` but +not in the `StoreQueryResponse` are not present in the store. -### Pagination info +#### Pagination info The store query client MAY include a message hash as `pagination_cursor`, to indicate at which key-value entry a store service node SHOULD start the query. The `pagination_cursor` is treated as exclusive and the corresponding entry will not be included in subsequent store query responses. -For forward queries, only messages following (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` will be returned. -For backward queries, only messages preceding (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` will be returned. +For forward queries, +only messages following (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` +will be returned. +For backward queries, +only messages preceding (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` +will be returned. If the store query client requires the store service node to perform a forward query, it MUST set `pagination_forward` to `true`. @@ -258,49 +252,60 @@ or larger than the service node's internal page size limit. See [pagination](#pagination) for more on how the pagination info is used in store transactions. -## Store Query Response +### Store Query Response In response to any `StoreQueryRequest`, a store service node SHOULD respond with a `StoreQueryResponse` with a `request_id` matching that of the request. This response MUST contain a `status_code` indicating if the request was successful or not. 
Successful status codes are in the `2xx` range. -Client nodes SHOULD consider all other status codes as error codes and assume that the requested operation had failed. -In addition, the store service node MAY choose to provide a more detailed status description in the `status_desc` field. +A client node SHOULD consider all other status codes as error codes and +assume that the requested operation had failed. +In addition, +the store service node MAY choose to provide a more detailed status description in the `status_desc` field. + +#### Filter matching -### Filter matching +For [content filtered queries](#content-filtered-queries), +an entry in the store service node matches the filter criteria in a `StoreQueryRequest` if each of the following conditions are met: -For [content filtered queries](#content-filtered-queries), an entry in the store service node matches the filter criteria in a `StoreQueryRequest` if each of the following conditions are met: - its `content_topic` is in the request `content_topics` set -and it was published on a matching `pubsub_topic` OR the request `content_topics` and `pubsub_topic` fields are unset +and it was published on a matching `pubsub_topic` OR the request `content_topics` and +`pubsub_topic` fields are unset - its `timestamp` is _larger or equal_ than the request `start_time` OR the request `start_time` is unset - its `timestamp` is _smaller_ than the request `end_time` OR the request `end_time` is unset -Note that for content filtered queries, `start_time` is treated as _inclusive_ and `end_time` is treated as _exclusive_. +Note that for content filtered queries, `start_time` is treated as _inclusive_ and +`end_time` is treated as _exclusive_. -For [message hash lookup queries](#message-hash-lookup-queries), an entry in the store service node matches the filter criteria if its `message_hash` is in the request `message_hashes` set. 
+For [message hash lookup queries](#message-hash-lookup-queries), +an entry in the store service node matches the filter criteria if its `message_hash` is in the request `message_hashes` set. -The store service node SHOULD respond with an error code and discard the request -if the store query request contains both content filter criteria and message hashes. +The store service node SHOULD respond with an error code and +discard the request if the store query request contains both content filter criteria +and message hashes. -### Populating response messages +#### Populating response messages The store service node SHOULD populate the `messages` field in the response only with entries matching the filter criteria provided in the corresponding request. Regardless of whether the response is to a _forward_ or _backward_ query, -the `messages`field in the response MUST be ordered in a forward direction +the `messages` field in the response MUST be ordered in a forward direction according to the [message sorting rules](#waku-message-sorting). If the corresponding `StoreQueryRequest` has `include_data` set to true, -the service node SHOULD populate both the `message_hash` and `message` for each entry in the response. -In all other cases, the store service node SHOULD populate only the `message_hash` field for each entry in the response. +the service node SHOULD populate both the `message_hash` and +`message` for each entry in the response. +In all other cases, +the store service node SHOULD populate only the `message_hash` field for each entry in the response. -### Paginating the response +#### Paginating the response The response SHOULD NOT contain more `messages` than the `pagination_limit` provided in the corresponding `StoreQueryRequest`. It is RECOMMENDED that the store node defines its own maximum page size internally. 
If the `pagination_limit` in the request is unset, or exceeds this internal maximum page size, -the store service node SHOULD ignore the `pagination_limit` field and apply its own internal maximum page size. +the store service node SHOULD ignore the `pagination_limit` field and +apply its own internal maximum page size. In response to a _forward_ `StoreQueryRequest`: - if the `pagination_cursor` is set, @@ -326,30 +331,50 @@ In response to a _backward_ `StoreQueryRequest`: the store service node SHOULD populate the `pagination_cursor` in the `StoreQueryResponse` with the message hash key of the _first_ entry _included_ in the response. -# Security Consideration +### Security Consideration + +The main security consideration while using this protocol is that a querying node has to reveal its content filters of interest to the queried node, +hence potentially compromising their privacy. -The main security consideration to take into account while using this protocol is that a querying node have to reveal their content filters of interest to the queried node, hence potentially compromising their privacy. +#### Adversarial Model + +Any peer running the `WAKU2-STORE` protocol, i.e. +both the querying node and the queried node, are considered as an adversary. +Furthermore, +we currently consider the adversary as a passive entity that attempts to collect information from other peers to conduct an attack but +it does so without violating protocol definitions and instructions. +As we evolve the protocol, +further adversarial models will be considered. +For example, under the passive adversarial model, +no malicious node hides or +lies about the history of messages as it is against the description of the `WAKU2-STORE` protocol. + +The following are not considered as part of the adversarial model: +- An adversary with a global view of all the peers and their connections. 
+- An adversary that can eavesdrop on communication links between arbitrary pairs of peers (unless the adversary is one end of the communication). +Specifically, the communication channels are assumed to be secure. -# Future Work +### Future Work - **Anonymous query**: This feature guarantees that nodes can anonymously query historical messages from other nodes i.e., -without disclosing the exact topics of [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) they are interested in. +without disclosing the exact topics of [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) they are interested in. As such, no adversary in the `WAKU2-STORE` protocol would be able to learn which peer is interested in which content filters i.e., -content topics of [14/WAKU2-MESSAGE](/spec/14). +content topics of [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md). The current version of the `WAKU2-STORE` protocol does not provide anonymity for historical queries, as the querying node needs to directly connect to another node in the `WAKU2-STORE` protocol and explicitly disclose the content filters of its interest to retrieve the corresponding messages. -However, one can consider preserving anonymity through one of the following ways: - - By hiding the source of the request i.e., anonymous communication. - That is the querying node shall hide all its PII in its history request e.g., its IP address. - This can happen by the utilization of a proxy server or by using Tor. - Note that the current structure of historical requests does not embody any piece of PII, otherwise, - such data fields must be treated carefully to achieve query anonymity. - - - By deploying secure 2-party computations in which the querying node obtains the historical messages of a certain topic, - the queried node learns nothing about the query. - Examples of such 2PC protocols are secure one-way Private Set Intersections (PSI). 
- +However, one can consider preserving anonymity through one of the following ways: + +- By hiding the source of the request i.e., anonymous communication. +That is the querying node shall hide all its PII in its history request e.g., its IP address. +This can happen by the utilization of a proxy server or by using Tor. +Note that the current structure of historical requests does not embody any piece of PII, otherwise, +such data fields must be treated carefully to achieve query anonymity. + +- By deploying secure 2-party computations in which the querying node obtains the historical messages of a certain topic, +the queried node learns nothing about the query. +Examples of such 2PC protocols are secure one-way Private Set Intersections (PSI). + - **Robust and verifiable timestamps**: Messages timestamp is a way to show that the message existed prior to some point in time. @@ -380,14 +405,12 @@ That is, messages contain the most recent block height perceived by their sender This proves accuracy within a range of minutes (e.g., in Bitcoin blockchain) or seconds (e.g., in Ethereum 2.0) from the time of origination. -# Copyright +## Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). -# References -1. [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) +## References +1. [14/WAKU2-MESSAGE](/waku/standards/core/14/message.md) 2. [protocol buffers v3](https://developers.google.com/protocol-buffers/) -3. [11/WAKU2-RELAY](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/11/relay.md) -4. [Open timestamps](https://opentimestamps.org/) -5. [13/WAKU2-STORE v2 previous version](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/13/store.md) +3. [Open timestamps](https://opentimestamps.org/)