Artery Design Ideas (new remoting) #16
Artery Design

Aeron

The following sections assume that we are using Aeron as the underlying transport layer. We considered Aeron vs. using plain UDP, and here are the pros and cons of using Aeron.

Pros:
Cons:
Quarantine

After quarantine we must not accept messages from the quarantined node. That is important because we must not deliver messages from an actor after it has been declared as terminated, i.e. after the [...]

It is also important not to send messages to a quarantined node. Otherwise it will get gossip updates, etc. Quarantined nodes are explicitly notified of being quarantined.
Quarantine is performed when:
We still need the following existing configuration properties:
Handshake

When opening an association we need a handshake to exchange the uid. We need the uid of the destination system to be able to quarantine it and also to properly distinguish system message redelivery/ack information coming from different system incarnations.

When the destination system is restarted and we continue to send to that address, a new handshake should be initiated, i.e. the new inbound association must be detected by the restarted system. Such a handshake must trigger quarantine and [...]

We should avoid the gating feature, but it might be needed to protect against too frequent handshake attempts. If we can’t establish the handshake for an outgoing connection it might make sense to gate that destination for a while, if the handshake attempt is a costly process. Note that if the remote system initiates a handshake, the gating on the other side, if any, must be removed and the handshake should be accepted.

System Messages

We must not drop system messages on the way, and if buffers get full we must quarantine.

Aeron delivery is reliable as long as the session is alive, very similar to a TCP connection. For long network partitions the session will time out and be dropped, resulting in lost messages. This was confirmed and clarified by Martin Thompson.

Therefore we need acked delivery for system messages. It can take advantage of the fact that the messages are ordered, and a simple protocol would be:
We still need the following existing configuration properties:
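The acked delivery of system messages described above could be sketched roughly as follows. This is only an illustration and not the Artery implementation: the class names, the cumulative-ack scheme, the `buffer_size` parameter and the `QuarantineRequired` exception are all assumptions made for the example. It shows a sender that keeps unacknowledged messages in a bounded buffer, a receiver that delivers only the next expected sequence number, and the "buffer full means quarantine" rule.

```python
from collections import OrderedDict


class QuarantineRequired(Exception):
    """Hypothetical signal: the resend buffer is full, so the peer must be quarantined."""


class SystemMessageSender:
    def __init__(self, buffer_size):
        self.buffer_size = buffer_size  # assumed configuration value
        self.next_seq = 1
        self.unacked = OrderedDict()    # seq -> message, kept in send order

    def send(self, message):
        # System messages must never be dropped, so a full buffer escalates.
        if len(self.unacked) >= self.buffer_size:
            raise QuarantineRequired("system message buffer full")
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = message
        return seq, message

    def on_ack(self, acked_seq):
        # Cumulative ack: everything up to and including acked_seq was delivered.
        for seq in [s for s in self.unacked if s <= acked_seq]:
            del self.unacked[seq]

    def resend(self):
        # Would be called on a timer; returns everything still unacknowledged.
        return list(self.unacked.items())


class SystemMessageReceiver:
    def __init__(self):
        self.expected_seq = 1
        self.delivered = []

    def on_message(self, seq, message):
        # Deliver only the next expected message; duplicates and gaps are ignored.
        if seq == self.expected_seq:
            self.delivered.append(message)
            self.expected_seq += 1
        # Always reply with the highest sequence number delivered in order.
        return self.expected_seq - 1
```

Because the receiver only ever acknowledges the highest in-order sequence number, a lost message causes the sender to keep resending from that point, and ordering is preserved without any per-message negative acknowledgement.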
Security

There is an Aeron security discussion in ticket: aeron-io/aeron#203

One possible approach is to negotiate a key with the remote system as part of the handshake, over a side-channel using TCP+SSL and a Diffie-Hellman key exchange (ECDH), and then use symmetric encryption for each sent message. The drawback is that with Aeron we have no access to the lowest level (just before the packet hits the UDP layer), so we can only encrypt whole messages (this leaks more information than encrypting after the fragmenting/multiplexing layer).

We would like to avoid any kind of algorithm negotiation. If the algorithms used turn out to be weak, the remoting protocol should be updated instead (explicitly breaking compatibility). Key renegotiation is also not planned.

As a first approach, every encrypted message should have
One drawback of the above setup is that the UID must be used to select the proper session key before the HMAC has been verified (somewhat violating the Cryptographic Doom Principle). No other operation may be performed on the message before the HMAC has been verified.

Notes:
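To illustrate the rule that nothing may be done with a message before its HMAC has been verified, here is a minimal sketch. The frame layout (8-byte UID, opaque ciphertext, 32-byte HMAC-SHA256 tag), the function names and the `session_keys` table are assumptions made for this example and are not part of the design above. Decryption is deliberately left out; the ciphertext is treated as opaque bytes.

```python
import hashlib
import hmac
import struct

UID_LEN = 8    # assumed: uid encoded as a signed 64-bit big-endian integer
TAG_LEN = 32   # HMAC-SHA256 output size


def seal(uid, mac_key, ciphertext):
    """Build a frame: uid || ciphertext || HMAC(uid || ciphertext)."""
    header = struct.pack(">q", uid)
    tag = hmac.new(mac_key, header + ciphertext, hashlib.sha256).digest()
    return header + ciphertext + tag


def open_verified(frame, session_keys):
    """Return (uid, ciphertext) only if the HMAC checks out, else None."""
    if len(frame) < UID_LEN + TAG_LEN:
        return None
    # The one step performed before verification: read the uid to pick the key.
    (uid,) = struct.unpack(">q", frame[:UID_LEN])
    mac_key = session_keys.get(uid)
    if mac_key is None:
        return None
    body, tag = frame[:-TAG_LEN], frame[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison; nothing else touches the message before this.
    if not hmac.compare_digest(tag, expected):
        return None
    return uid, body[UID_LEN:]
```

Note that the tag covers the UID as well, so a forged or altered UID selects a key under which verification fails, which limits the impact of reading the UID before the check.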
Sub-channels

System messages / priority lanes

System messages should be delivered over a separate sub-channel, since they have different delivery semantics and should not be interfered with by ordinary messages.

Heartbeat messages for remote and cluster failure detection should be delivered over a separate sub-channel. They are ordinary messages, but should not be interfered with by other ordinary messages. This sub-channel is selected by message type ([...]

Large messages

It should be possible to send bulk data, i.e. large messages, over a separate sub-channel to avoid head-of-line blocking. This sub-channel is selected based on the configured destination path.

General

For ordinary messages not falling into the above categories we can use one or several sub-channels. If we use several, they would be selected by hashing on the destination ref to preserve send order per sender-receiver pair.

It should be possible to configure buffer sizes and max message size separately for each sub-channel type.

Pruning of associations

Inactive associations should be pruned. This is important because each channel/stream uses a considerable amount of memory. It could be a two-step pruning where in the first step a shallow placeholder for the association remains but other memory usage is released. In the second step the placeholder can be removed after a longer idle timeout.

Protocol Format

MessageEnvelope:
This will be encoded by hand or with SBE (needs investigation).

Handshake: All information is in the MessageEnvelope, so we only need an empty message type for the handshake. The same type can be used for both request and response. Another message type is needed for initiating a new handshake when receiving a message from an unknown uid.

ActorRef compression

We’ll find the most commonly communicated-with Actors (likely using an approximate data structure like Count-Min-Sketch), and the receiving end will advertise a mapping from that actor path to an identifier that it creates and stores locally in a table.

Wire format draft: allow encoding an ActorRef either as a number or as the path string. That field is prefixed with a marker indicating which mode is used, e.g. [0][full/path/to/actor] or [1][42].

If the receiving node crashes and then gets messages from someone, it must initiate a fresh handshake anyway. We piggyback onto the handshake to also mean “invalidate table”, as the new receiving end does not have a table prepared yet. We of course detect when a compressed ref comes in that does not match the table, so we could log warnings too.

Configuration

We should provide sane default values that work for most users without changes. We should try to have a few high-level settings for choosing between different trade-offs, e.g. low latency vs. CPU usage. Low-level settings should be derived from these high-level settings.

Stable Protocol

We need a stable protocol to support rolling upgrades involving Akka version updates. All Java serialization should be converted to protobuf serialization to make it easier to evolve the schema. We can include the new serializers in Akka 2.4.x, but not have them enabled by default. In Akka 3.0 they will be enabled by default. Regression tests are needed.

Performance and Scalability

We aim for a solution that can handle high throughput, low latency, and good enough scalability.

Goals (given a fast network and good servers such as EC2 m4.4xlarge):
Overview of Stream Stages

Out of Scope
A place to capture ideas for Artery, the new remoting.