Add Redis Cluster support #716

Merged on Jul 29, 2018 (1 commit)

Contributor

@supercaracal supercaracal commented Sep 26, 2017

We'd like to use Redis Cluster as a session store.

There are some gems such as:
https://github.com/redis-store/redis-rails
https://github.com/redis-store/redis-store

These gems depend on redis-rb, but redis-rb doesn't support Redis Cluster.
https://github.com/redis-store/redis-rails/issues/72
redis-store/redis-activesupport#89

So this PR adds a Redis Cluster client to redis-rb. You can try it with this sample:
https://github.com/supercaracal/redis-cluster-playground

nodes = (7000..7005).map { |port| "redis://127.0.0.1:#{port}" }
redis = Redis.new(cluster: nodes)
redis.set('hogehoge', 1)
redis.get('hogehoge')

# @see https://redis.io/commands/readonly
Redis.new(cluster: nodes, replica: true)

ref: #546
ref: antirez/redis-rb-cluster#6
ref: antirez/redis-rb-cluster#8
ref: redis/node-redis#574
ref: redis/redis-py#931
ref: redis/redis-py#604

ref: https://github.com/go-redis/redis
ref: https://github.com/luin/ioredis
ref: https://github.com/xetorthio/jedis
ref: https://github.com/redisson/redisson
ref: https://github.com/phpredis/phpredis
ref: https://github.com/nrk/predis

@supercaracal
Contributor Author

supercaracal commented Sep 27, 2017

CI failed on JRuby. Are these failures unrelated?

$ rvm use jruby-9 --install --binary --fuzzy
Unknown ruby string (do not know how to handle): jruby-9.1.13.0200.
jruby-9.1.13.0200 is not installed - installing.
Unknown ruby string (do not know how to handle): jruby-9.1.13.0200.
Searching for binary rubies, this might take some time.
Unknown ruby string (do not know how to handle): jruby-9.1.13.0200.
Requested binary installation but no rubies are available to download, consider skipping --binary flag.
Gemset '' does not exist, 'rvm jruby-9.1.13.0200 do rvm gemset create ' first, or append '--create'.
The command "rvm use jruby-9 --install --binary --fuzzy" failed and exited with 2 during .

@badboy
Contributor

badboy commented Sep 27, 2017

Yes, these test failures look unrelated. I currently don't have time to look at the PR so don't expect a decision any time soon

@supercaracal supercaracal changed the title from "Add redis cluster minimal support" to "Add Redis Cluster support" on Sep 29, 2017
end

def asking
try_cmd(find_node, :synchronize) { |client| client.call(%i[asking]) }
@antirez
Contributor

antirez commented Sep 30, 2017

Long term... I would love to see redis-rb get full support for Redis Cluster, perhaps starting from the POC I wrote here: https://github.com/antirez/redis-rb-cluster. I'm sure the same implementation, a bit polished and documented, would receive far more PRs/attention if part of redis-rb.

@samuelebistoletti

samuelebistoletti commented Oct 29, 2017

Hi,
any plans to merge this? Is it production ready? I would like to use this in our Redis cluster.

I also noticed that the redis-rb master branch has something called Redis::Distributed. Is that a Redis Cluster implementation? Can anyone explain what it is, please?

Thanks

@supercaracal
Contributor Author

supercaracal commented Oct 30, 2017

@samuelebistoletti I think Redis::Distributed is redis-rb's own implementation of client-side partitioning. It is different from both Redis Cluster and Sentinel.

https://redis.io/topics/partitioning#clients-supporting-consistent-hashing

@samuelebistoletti

Thanks. Do you think it's safe to use your implementation of Redis Cluster in production? Or are you still testing it?

@supercaracal
Contributor Author

@samuelebistoletti I think it's safe. I believe it follows the same implementation as the POC. But the owners seem busy.

@supercaracal
Contributor Author

supercaracal commented Nov 10, 2017

@badboy @antirez Could you start the review? Is this PR in good enough shape to be reviewed?

@badboy
Contributor

badboy commented Nov 10, 2017

I won't review it as I currently have neither the time nor the energy to review or maintain this.

@supercaracal
Contributor Author

supercaracal commented Nov 10, 2017

@badboy I understand. I am sorry.

@supercaracal
Contributor Author

@djanowski @pietern @soveran @yaauie Excuse me. Could any of you start a review when you're available? I am sorry to bother you while you are busy. Is there anything that I can do?

@synth

synth commented Nov 15, 2017

If Redis-rb maintainers are not down to support cluster (understandably), what about creating a fork? I'll put 5 on it. 😸
UPDATE: I just got this setup on an AWS EC2 instance with TLS Elasticache cluster and it works beautifully! Thank you @supercaracal !!!

@filiptepper

Perhaps this could be shipped as a separate gem?

@supercaracal
Contributor Author

would receive far more PRs/attention if part of Redis-rb.

I think it would be better as part of redis-rb than as a separate gem, for the reason quoted above.

@jrmhaig
Contributor

jrmhaig commented Jun 14, 2018

I understand that this doesn't exist as a separate gem and it is not likely to be merged in the short term. Can someone give an indication of the "best" way to use this?

I am trying to connect to a cluster that I can reach with the Redis CLI as:

redis-cli -h $REDIS_CLUSTER -c

Ideally I would like to find an equivalent of the -c switch, so that I do not need to maintain a list of the nodes but if I have to I can still work with that.

@supercaracal
Contributor Author

@byroot Excuse me. Do you have time or plans to review this PR?

@byroot
Collaborator

byroot commented Jun 15, 2018

I can manage some time for it. The problem is more that I don't have experience with Redis Cluster, so I'll have to gather lots of context before I can even start to review.

@byroot
Collaborator

byroot commented Jun 15, 2018

But yes, at first sight I'd like this merged.

Collaborator

@byroot byroot left a comment

Some minor style / technical comments.

I'll do my best to do a proper review either this weekend or early next week

ttl -= 1
node.send(command, *args, &block)
rescue TimeoutError, CannotConnectError, Errno::ECONNREFUSED, Errno::EACCES => err
raise err if ttl <= 0
Collaborator

Better to raise without argument here, so that the original backtrace is kept.
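
A minimal sketch of the suggested style, assuming a retry helper of our own (`with_retries` is a hypothetical name, not code from this PR). Inside `rescue`, a bare `raise` re-raises the exception currently being handled, so no exception variable is needed:

```ruby
# Hypothetical helper, for illustration only: retries the block a limited
# number of times, then lets the original exception propagate.
def with_retries(attempts)
  yield
rescue ArgumentError
  attempts -= 1
  retry if attempts > 0
  raise # bare raise: re-raises the exception being handled ($!), unchanged
end

calls = 0
begin
  with_retries(3) do
    calls += 1
    raise ArgumentError, "boom"
  end
rescue ArgumentError => e
  caught = e
end

puts calls               # number of times the block ran
puts caught.message      # message of the original exception
puts caught.backtrace.first
```

The exception object that reaches the caller is the one raised inside the block, with its message and backtrace untouched.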

if err.message.start_with?('MOVED')
redirection_node(err.message).send(command, *args, &block)
elsif err.message.start_with?('ASK')
raise err if ttl <= 0
Collaborator

Same raise issue here.

asking
retry
else
raise err
Collaborator

Same raise issue here.

#
# @raise [ArgumentError] if addr is not a `String` or `Hash`
def to_client_option(addr)
if addr.is_a?(String)
Collaborator

nitpick: You can use a case here:

case addr
when String
when Hash
else
end
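
Filled in, the `case` form might look like the sketch below. The branch bodies are assumptions made for illustration (the review comment only shows the skeleton), and the option shapes returned here are not necessarily what the PR produces:

```ruby
# Hypothetical version of to_client_option using `case`.
# The returned option hashes are illustrative assumptions.
def to_client_option(addr)
  case addr
  when String
    { url: addr }
  when Hash
    # Normalize keys to symbols so "host" and :host behave the same.
    addr.each_with_object({}) { |(key, value), opts| opts[key.to_sym] = value }
  else
    raise ArgumentError, "node address must be a String or Hash: #{addr.inspect}"
  end
end

p to_client_option("redis://127.0.0.1:7000")
p to_client_option("host" => "127.0.0.1", "port" => 7000)
```

`case` dispatches with `===`, so `when String` and `when Hash` behave like the `is_a?` checks they replace.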

response
.split(/[\r\n]+/)
.map { |str| str.split(':') }
.map { |arr| [arr.first.to_sym, arr[1]] }
Collaborator

What's the rationale for converting to a Symbol?

Contributor Author

@supercaracal supercaracal Jun 16, 2018

Because I thought we are used to using Symbols as Hash keys in Ruby. But I don't have a strong rationale. Should we remove it?

Collaborator

Not necessarily, symbol keys are more for internal data structures.

I think it should be strings here, to be consistent with Redis#info:

>> r.info
=> {"redis_version"=>"4.0.9", "redis_git_sha1"=>"00000000", ...

Collaborator

And it would be a good idea to use the same parsing code:

redis-rb/lib/redis.rb

Lines 277 to 279 in ddf058b

reply = Hash[reply.split("\r\n").map do |line|
line.split(":", 2) unless line =~ /^(#|$)/
end.compact]

We can probably extract it in some helper.
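
One way such a helper could look, as a sketch only. It wraps the parsing code quoted above in a lambda; the constant name `HashifyInfo` and the sample reply are illustrative, not taken from the merged code:

```ruby
# Sketch of a shared parser for INFO-style replies.
# Keys stay Strings, consistent with the Redis#info output shown above.
HashifyInfo = lambda do |reply|
  pairs = reply.split("\r\n").map do |line|
    # Skip section headers ("# Server") and blank lines.
    line.split(":", 2) unless line =~ /^(#|$)/
  end
  Hash[pairs.compact]
end

# Made-up sample reply for illustration.
sample = "# Server\r\nredis_version:4.0.9\r\n\r\n# Cluster\r\ncluster_enabled:1\r\n"
info = HashifyInfo.call(sample)
p info
```

Splitting with a limit of 2 keeps values that themselves contain a colon intact.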

@jrmhaig
Contributor

jrmhaig commented Jun 15, 2018

@supercaracal Having experimented with this further I found that we cannot use it as, for some reason, the commands for Streams (see https://redis.io/topics/streams-intro) do not work. As Streams are only appearing in the next version of Redis, which is still in beta, I do not think this counts as a problem with this PR, even though everything appears to work properly with the current master of the gem. I suspect there is some magic that automatically generates methods for Redis commands but this is not getting picked up properly by Redis::Cluster. I thought you might find this information useful.

We are going to use Sentinel for the time being but I would be interested in moving over to using this in the future.

@byroot
Collaborator

byroot commented Jun 15, 2018

@jrmhaig can you be more specific? What's the version of your Redis server and the actual code you are using?

@jrmhaig
Contributor

jrmhaig commented Jun 15, 2018

I am using the current release candidate of Redis 5.0 from https://redis.io/download (version 4.9.101).

I can do:

irb(main):001:0> require 'redis'
=> true
irb(main):002:0> r = Redis.new host: 'localhost'
=> #<Redis client v4.0.1 for redis://localhost:6379/0>
irb(main):003:0> r.xadd 'test', '*', 'key', 'value'
=> "1529082632840-0"

but if I try this with Redis::Cluster instead I get 'unknown command'.

@supercaracal
Contributor Author

@jrmhaig I'll try to learn the Redis Streams API over the weekend. Thank you for the information.

@supercaracal
Contributor Author

supercaracal commented Jun 17, 2018

@byroot Thank you for reviewing. I fixed the issues, but CI failed. The failures seem unrelated. Could you retry the build?

Collaborator

@byroot byroot left a comment

I spotted other issues, however I'll stop here because at this point I believe this PR is attacking the problem from the wrong side.

Right now this PR adds a Redis::Cluster class which holds a list of Redis instances, and delegates commands to them.

Several of the issues I reported come from this, and from reading the code it seems to me that the proper architecture should be for Redis::Cluster to be a replacement for Redis::Client and not Redis itself.

So that in the end you will have just a few methods like call, call_loop etc, and just have to delegate to the proper Redis::Client instance.

To paraphrase, I think the call order should be Redis -> Redis::Cluster -> Redis::Client whereas this PR currently implements Redis::Cluster -> Redis -> Redis::Client.
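
The proposed layering can be sketched with toy classes. Everything below is a hypothetical stand-in (`FakeRedis`, `FakeCluster`, `FakeClient` are not redis-rb classes), and the routing rule is deliberately trivial:

```ruby
# Plays the role of Redis::Client: talks to exactly one node.
class FakeClient
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def call(command)
    "#{name} handled #{command.join(' ')}"
  end
end

# Plays the role of Redis::Cluster: same small interface as a client,
# but picks the node first.
class FakeCluster
  def initialize(clients)
    @clients = clients
  end

  def call(command)
    pick_client(command).call(command)
  end

  private

  # Toy routing rule. A real cluster client routes by hash slot.
  def pick_client(command)
    key = command[1].to_s
    @clients[key.length % @clients.size]
  end
end

# Plays the role of Redis: defines the command methods once and does not
# care whether it talks to a single client or a cluster.
class FakeRedis
  def initialize(client)
    @client = client
  end

  def get(key)
    @client.call([:get, key])
  end
end

cluster = FakeCluster.new([FakeClient.new("node-a"), FakeClient.new("node-b")])
puts FakeRedis.new(cluster).get("ab")
puts FakeRedis.new(FakeClient.new("solo")).get("k")
```

With this order, command methods live in one place and the cluster layer only needs a handful of `call`-style entry points.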

class Redis
module Helpers
# Helper methods for common processing of the reply data.
class ReplyHelper
Collaborator

2 issues here:

  • ReplyHelper is never instantiated, so it shouldn't be class. When you need a namespace for some static method, just use a module.
  • I think that instead of introducing a new namespace etc, we should follow the existing pattern, so something like HashifyInfo = lambda { ...

def try_cmd(node, command, *args, ttl: RETRY_COUNT, &block)
ttl -= 1
node.send(command, *args, &block)
rescue TimeoutError, CannotConnectError, Errno::ECONNREFUSED, Errno::EACCES => err
Collaborator

Can you tell me the reasoning behind this list of exceptions?

Why CannotConnectError and not the more generic BaseConnectionError?

Why Errno::ECONNREFUSED, Errno::EACCES ? Aren't they already rescued and re-raised as more specific errors by Redis::Client?

Is it really sensible to retry a TimeoutError? Do we have the guarantee that the command was not processed on the server side? Otherwise we'd risk executing a non-idempotent command twice.
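
To make the concern concrete, here is a sketch of a retry helper that retries only when the connection could not be established. The error classes and `call_with_retry` are hypothetical stand-ins, not redis-rb's real classes:

```ruby
# Hypothetical error classes, for illustration only.
class SketchConnectionError < StandardError; end
class SketchTimeoutError < StandardError; end

# Retries only connection failures, where the command cannot have reached
# the server. Timeouts propagate immediately, because the server may already
# have executed the command (think INCR or LPUSH).
def call_with_retry(attempts: 3)
  yield
rescue SketchConnectionError
  attempts -= 1
  retry if attempts > 0
  raise
end

tries = 0
result = call_with_retry do
  tries += 1
  raise SketchConnectionError, "refused" if tries < 3
  :ok
end
puts "#{result} after #{tries} tries"

timeouts = 0
begin
  call_with_retry do
    timeouts += 1
    raise SketchTimeoutError, "no reply"
  end
rescue SketchTimeoutError
  puts "timeout surfaced after #{timeouts} attempt"
end
```

Whether a timeout is safe to retry depends on the command, which is why the sketch leaves that decision to the caller.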

Contributor Author

I ported this from the POC, but I hadn't noticed those problems. I would like to fix them.

# @return [Object] depends on the command
def try_cmd(node, command, *args, ttl: RETRY_COUNT, &block)
ttl -= 1
node.send(command, *args, &block)
Collaborator

I see 2 problems here:

  • First, commands are public methods, so we should use public_send here, so that we don't allow calling private methods.
  • We have no way to distinguish between commands and utility methods on Redis, so if I'm understanding that code correctly, it will allow non-sensical things like this:
>> cluster._client
=> #<Redis::Client:0x007f92609182b0 @options={:host=>"127.0.0.1", :port=>7000, 
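
The first point can be shown in isolation. `SketchNode` and its methods are hypothetical; the sketch only demonstrates the difference between `send` and `public_send`:

```ruby
# Hypothetical class with one public command and one private helper.
class SketchNode
  def get(key)
    "value-of-#{key}"
  end

  private

  def internal_state
    :internal
  end
end

node = SketchNode.new

p node.send(:internal_state)     # => :internal (send ignores visibility)
p node.public_send(:get, "a")    # => "value-of-a"

begin
  node.public_send(:internal_state)
rescue NoMethodError => e
  blocked = e.class
end
p blocked                        # => NoMethodError
```

This covers private methods only. Public utility methods such as the `_client` call shown above would still go through, which is what the second point is about.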

@supercaracal
Contributor Author

@byroot Thank you for reviewing. I fixed the call order and others.

Collaborator

@byroot byroot left a comment

I think the tests should cover more. We should probably do as distributed_test.rb does and test a wide array of commands using the Cluster client.

find_node.respond_to?(method_name, include_private)
end

def method_missing(method_name, *args, &block)
Collaborator

Can't we do without method_missing? It seems to me that there are relatively few methods that would need to be implemented. I can see call, call_loop, call_pipeline, maybe a couple of others.

Even if it's using a macro to do some standard delegation, I'd feel much more comfortable with a whitelist of delegated methods rather than method_missing.
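
A whitelist of delegated methods could be written with the standard library's `Forwardable`. The classes and the exact method list below are assumptions for illustration, not the code that was merged:

```ruby
require "forwardable"

# Hypothetical stand-in for a single-node client.
class SketchClient
  def call(command)
    [:called, command]
  end

  def call_pipeline(commands)
    commands.map { |command| call(command) }
  end

  def disconnect
    :disconnected
  end
end

# Hypothetical cluster wrapper that forwards an explicit list of methods.
class SketchCluster
  extend Forwardable

  # Only the methods named here are forwarded; nothing else leaks through.
  def_delegators :@node, :call, :call_pipeline

  def initialize(node)
    @node = node
  end
end

cluster = SketchCluster.new(SketchClient.new)
p cluster.call([:get, "k"])
p cluster.respond_to?(:disconnect)
```

Unlike `method_missing`, the forwarded surface is visible in one line and anything outside it raises `NoMethodError` as usual.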

@supercaracal
Contributor Author

In case of Node:

https://github.com/luin/ioredis/tree/622975d9d454eca6a9c495d35f5638d68e2662f2#transaction-and-pipeline-in-cluster-mode

You can't use multi without pipeline (aka cluster.multi({ pipeline: false }) ). This is because when you call cluster.multi({ pipeline: false }) , ioredis doesn't know which node the multi command should be sent to.

@supercaracal
Contributor Author

In case of PHP (extension):

https://github.com/phpredis/phpredis/blob/develop/cluster.markdown#transactions

When you call RedisCluster->multi(), the cluster is put into a MULTI state, but the MULTI command is not delivered to any nodes until a key is requested on that node.

@supercaracal
Contributor Author

supercaracal commented Jul 15, 2018

A transaction that executes across multiple nodes is not reliable. I think the cluster client should raise AmbiguousNodeError when MULTI, EXEC, or DISCARD is called without a Ruby block (pipelining). May I fix it?

@byroot
Collaborator

byroot commented Jul 15, 2018

May I fix it?

Of course. Don't feel like you should not change things because I approved.

@supercaracal supercaracal force-pushed the add_redis_cluster_support branch 3 times, most recently from afd521c to 9690c1e Compare July 16, 2018 02:24
@supercaracal
Contributor Author

supercaracal commented Jul 16, 2018

I fixed as below.

  • Raise Redis::Cluster::AmbiguousNodeError when cluster client can't select node by command
  • Send Pub/Sub command to random node (Add to keyless commands) spec
  • Use redis-cli instead of redis-trib.rb
$ make trib_cluster 
yes yes | bundle exec ruby tmp/cache/redis-unstable/src/redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.

All commands and features belonging to redis-trib.rb have been moved
to redis-cli.
In order to use them you should call redis-cli with the --cluster
option followed by the subcommand name, arguments and options.

Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]

Example:
redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1

To get help about all subcommands, type:
redis-cli --cluster help

makefile:67: recipe for target 'trib_cluster' failed
make: *** [trib_cluster] Error 1

@supercaracal
Contributor Author

@byroot Are there any other concerns?

@supercaracal
Contributor Author

supercaracal commented Jul 27, 2018

I fixed as below.

  • Use COMMAND instead of hard coding handling in key extraction.
  • Fix an issue so that the client sends write commands to master nodes and read-only commands to slave nodes correctly.
  • Raise Redis::Cluster::CrossSlotPipeliningError when commands in pipelining include cross slot keys.
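
For readers unfamiliar with the cross-slot restriction, the sketch below shows how a key maps to a hash slot (CRC16 modulo 16384, with `{hash tag}` support, as described in the Redis Cluster specification) and how a pipeline could be checked. `SketchCrossSlotError` and `ensure_same_slot!` are hypothetical names, not the PR's implementation:

```ruby
# CRC16, XMODEM variant (polynomial 0x1021, initial value 0).
def crc16(str)
  crc = 0
  str.each_byte do |byte|
    crc ^= byte << 8
    8.times do
      crc = (crc & 0x8000).zero? ? (crc << 1) : ((crc << 1) ^ 0x1021)
      crc &= 0xFFFF
    end
  end
  crc
end

# Hash slot of a key. If the key contains a non-empty {tag}, only the tag
# is hashed, which lets related keys share a slot.
def key_slot(key)
  key = key.to_s
  open = key.index("{")
  if open
    close = key.index("}", open + 1)
    key = key[(open + 1)...close] if close && close > open + 1
  end
  crc16(key) % 16_384
end

# Hypothetical error class and guard, for illustration only.
class SketchCrossSlotError < StandardError; end

def ensure_same_slot!(keys)
  slots = keys.map { |key| key_slot(key) }.uniq
  raise SketchCrossSlotError, "keys map to slots #{slots.inspect}" if slots.size > 1
  slots.first
end

p key_slot("key1")                          # => 9189
p ensure_same_slot!(%w[{user1}:a {user1}:b])
```

Keys that must travel in the same pipeline can be given a common hash tag so that they land in the same slot.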

@supercaracal supercaracal force-pushed the add_redis_cluster_support branch from b0e035b to f48af99 Compare July 28, 2018 03:53
@supercaracal
Contributor Author

I think our Redis Cluster client is ready for shipping.

@byroot
Collaborator

byroot commented Jul 28, 2018

Really looks good. I'd have another nitpick ;)

Redis::Cluster::CrossSlotPipeliningError could use a default message so that it's easier for users to understand what's wrong.
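
A default message can be supplied through the initializer's default argument. The class name and message text below are illustrative assumptions, not the wording that was merged:

```ruby
# Hypothetical error class with a default message.
class SketchCrossSlotPipeliningError < StandardError
  def initialize(msg = "All keys in a pipeline must map to the same hash slot")
    super
  end
end

begin
  raise SketchCrossSlotPipeliningError
rescue SketchCrossSlotPipeliningError => e
  default_message = e.message
end

begin
  raise SketchCrossSlotPipeliningError, "custom explanation"
rescue SketchCrossSlotPipeliningError => e
  custom_message = e.message
end

puts default_message
puts custom_message
```

Callers can still pass their own message, and `raise` with just the class falls back to the default.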

After that I'd like to merge and if it requires further improvements it might as well happen in followup PRs.

I'll also see if I can get publish access so that we can release a RC to get some feedback from users.

@supercaracal
Contributor Author

supercaracal commented Jul 29, 2018

  • Add default error message to Redis::Cluster::CrossSlotPipeliningError

@supercaracal supercaracal force-pushed the add_redis_cluster_support branch from f48af99 to 7f48c0b Compare July 29, 2018 03:28
@supercaracal
Contributor Author

@byroot I have fixed it. I am grateful for your support. Thank you so much.

@byroot byroot merged commit 8b01ab6 into redis:master Jul 29, 2018
@supercaracal supercaracal deleted the add_redis_cluster_support branch July 30, 2018 02:39
@byroot
Collaborator

byroot commented Aug 13, 2018

This was released as 4.1.0.beta1. If you are interested in this feature, please try it and report any issues you might spot.

@xfalcox

xfalcox commented Sep 13, 2018

Hey @byroot and @supercaracal, I'm trying to wrap my head around using this in production on AWS.

When we set up a new highly available Redis Cluster in AWS we get a single endpoint, which AWS calls configuration_endpoint_address, and is documented in Terraform here.

I can connect to this endpoint using redis-cli -h $configuration_endpoint_address -c and issue commands like SET and GET just fine, and I'll be routed to a shard.

That said, from reading this PR's code, I'm supposed to pass an array of Redis hosts instead of a single one when establishing a connection. Are the AWS and redis-rb approaches different or incompatible, or am I reading something wrong?

@supercaracal
Contributor Author

supercaracal commented Sep 14, 2018

We can specify a single node with this gem.

[1] pry(main)> cli = Redis.new(cluster: %w[redis://127.0.0.1:7000])
=> #<Redis client v4.1.0.beta1 for redis://127.0.0.1:7000/0 redis://127.0.0.1:7001/0 redis://127.0.0.1:7002/0>
[2] pry(main)> cli.get :key1
=> nil
[3] pry(main)> cli.get :key2
=> nil
[4] pry(main)> cli.get :key3
=> nil

It seems that redis-cli reconnects to the correct node internally.
https://github.com/antirez/redis/blob/5.0-rc5/src/redis-cli.c#L1025-L1053

$ docker exec -it af8ee4af0186 /bin/bash
root@af8ee4af0186:/data# redis-cli -c -h 127.0.0.1 -p 7000
127.0.0.1:7000> get key1
-> Redirected to slot [9189] located at 127.0.0.1:7001
(nil)
127.0.0.1:7001> get key1
(nil)
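
The redirect in the transcript corresponds to a `MOVED` error reply of the form `MOVED <slot> <host>:<port>` (an `ASK` reply has the same shape). A sketch of parsing it, using a hypothetical helper and struct that are not part of redis-rb:

```ruby
# Hypothetical value object describing where a command should be re-sent.
Redirection = Struct.new(:type, :slot, :host, :port)

# Parses "MOVED 9189 127.0.0.1:7001" or "ASK 9189 127.0.0.1:7001".
# Returns nil for any other error message.
def parse_redirection(message)
  type, slot, addr = message.split(" ", 3)
  return nil unless %w[MOVED ASK].include?(type) && slot && addr

  # Split on the last colon so the host part is kept whole.
  host, _, port = addr.rpartition(":")
  Redirection.new(type.downcase.to_sym, Integer(slot), host, Integer(port))
end

redirect = parse_redirection("MOVED 9189 127.0.0.1:7001")
p redirect.type   # => :moved
p redirect.slot   # => 9189
p redirect.host   # => "127.0.0.1"
p redirect.port   # => 7001
p parse_redirection("ERR unknown command")   # => nil
```

A client that starts from a single seed node can follow such replies, or query the cluster for its node list up front, to reach the node that owns the slot.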

@xfalcox

xfalcox commented Sep 14, 2018

That's amazing @supercaracal. I just tried pointing the new gem to an AWS ElastiCache configuration_endpoint_address and it looks like the redirect is handled automatically. Thanks.

@byroot
Collaborator

byroot commented Oct 1, 2018

It's already been cut over a month ago: https://rubygems.org/gems/redis/versions/4.1.0.beta1
