ERR Protocol error: too big inline request (Redis::CommandError) #1185
So this error is thrown by the server here: https://github.com/redis/redis/blob/2dd5c3a180a9892c339b66f76cc203f8b0f6fe5a/src/networking.c#L2112-L2115 and that constant is defined as 64kB.
So this seems to suggest that somehow the client sends more than 64kB of data without a single newline, which is very fishy. But somehow that error rings a bell; it feels like I researched this before. |
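For context (not from the thread): an "inline" request is the plain-text, newline-terminated form of a command that Redis accepts alongside RESP, and the 64kB limit applies to that parser. A minimal sketch of the difference, assuming a local redis-server on the default port:

```ruby
require "socket"

sock = TCPSocket.new("127.0.0.1", 6379) # assumes a local redis-server

# "Inline" request: the raw, newline-terminated text form (what typing over
# telnet/nc produces). The 64kB-without-a-newline limit applies to this parser.
sock.write("PING\r\n")
puts sock.gets # => "+PONG"

# The same command framed as RESP (an array of bulk strings), which is what
# redis-rb / redis-client normally send, so the inline limit shouldn't apply.
sock.write("*1\r\n$4\r\nPING\r\n")
puts sock.gets # => "+PONG"
```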
What would be sending inline, though? Looking at the ActionCable source, it's calling publish in |
Yeah, I don't know, it could be an overly long channel name or something like that. It will be hard to debug without a repro. I'm not sure what your setup looks like, but if you have that flexibility I'd try to add some logging in there: redis-rb/lib/redis/subscribe.rb, line 49 (at e4f5299).
|
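A hypothetical sketch of that kind of logging, without editing the gem file itself: wrap Redis#subscribe and log whatever channel names ActionCable passes in (the module name and log format are made up):

```ruby
# Hypothetical logging shim, e.g. in a Rails initializer: record the channel
# names (and their byte sizes) handed to Redis#subscribe, to see whether an
# oversized channel name could ever reach the server.
module SubscribeLogging
  def subscribe(*channels, &block)
    channels.each do |channel|
      Rails.logger.info("redis subscribe: #{channel.inspect} (#{channel.to_s.bytesize} bytes)")
    end
    super
  end
end

Redis.prepend(SubscribeLogging)
```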
So looking at the backtrace, the line in actioncable that it ends up at is explicitly sending a hardcoded channel name. I can try to add some logging there, but if that's the case, I don't see how it could be anything other than the hardcoded channel, let alone a 64k-long channel name |
Hum, you're right. |
actioncable sets the message callback of the subscription to:

```ruby
on.message do |chan, message|
  broadcast(chan, message)
end
```

If I'm reading it right, this block

```ruby
while event = @client.next_event(timeout)
  if event.is_a?(::RedisClient::CommandError)
    raise Client::ERROR_MAPPING.fetch(event.class), event.message
  end

  type, *rest = event
  if callback = sub.callbacks[type]
    callback.call(*rest)
  end
  break if type == stop && rest.last == 0
end
```

is a blocking while loop running in its own thread? So it's likely that it subscribes to the channel, then waits for incoming messages. If that's the case, though, we've attempted sending large messages through to trigger the exception with no luck |
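For reference, the blocking subscribe API in the redis gem looks roughly like this (a minimal sketch, not the ActionCable adapter itself; the channel name is made up):

```ruby
require "redis"

redis = Redis.new

# Redis#subscribe blocks the calling thread and dispatches each incoming
# event to the registered callbacks until unsubscribe is called, which is
# why ActionCable runs it in a dedicated listener thread.
Thread.new do
  redis.subscribe("some_channel") do |on|
    on.subscribe   { |channel, count| puts "subscribed to #{channel} (#{count} total)" }
    on.message     { |channel, message| puts "#{channel}: #{message}" }
    on.unsubscribe { |channel, count| puts "unsubscribed from #{channel}" }
  end
end
```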
Note that it could be a total red herring. When googling for that error, you can see lots of these being resolved in various ways on the server side. Given that you say the issue started when you upgraded, it makes sense to assume the client does something wrong, but that wrong thing may not be sending a packet that is too big.
Have you tried the latest 4.x? Mostly to try to pinpoint when the issue might have been introduced. Another good thing to try would be to enable the |
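That bisection is just a matter of pinning the gems back in the Gemfile, for example (versions here are illustrative):

```ruby
# Gemfile: pin back to the last 4.x series to see whether the error (or a
# swallowed equivalent) shows up there too. Versions are illustrative.
gem "redis", "~> 4.8"
gem "hiredis", "~> 0.6" # the pre-5.x C driver, if you also want to bisect the driver
```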
That is a concern: since the exception being raised is new in this version, it could be something that has been happening all along and was either being swallowed or surfaced as another error that we haven't noticed is no longer happening.

```ruby
[1] pry(main)> Redis::Connection.drivers
=> [RedisClient::HiredisConnection]
```

We're already using the hiredis-client gem. Is there a way to see if the AC redis instance is using inline? I thought RESP was what was always used? |
Then the opposite, try to disable it. It's really just a basic bisection to try to narrow down where the bug could be.
I'm not sure what you are referring to by |
Oh wait, hum, another thing: is there SSL in the mix? I can't remember nor find where I saw that. Just a wild guess though. |
It does look like the gem uses RESP2:

```ruby
def config(**kwargs)
  super(protocol: 2, **kwargs)
end
```

The error itself says "inline request", and the comments on the method in the redis source say [...]; however, it seems like every client in existence uses RESP. I'll have to check on the SSL issue, but if that was the case, I'd expect it to happen on every message, no? |
Oh wait, this is an intermittent issue? |
Yeah, I have no idea what "inline protocol" is supposed to be, never heard of it, and I can confirm that. Is there a proxy or something between the client and Redis? Or anything else that isn't just a raw standalone redis-server? |
Oh yes, intermittent and extremely elusive. We've had zero luck in recreating it on purpose. We see the error 6-8 times a day among thousands and thousands of requests. No other issues that we can tell of on straight redis calls for caching and resque use, just something with action cable. We use consul to manage the url, but ATM we only have a single redis server in production. |
Hum, that is extremely intriguing. I'll try to find some time to review if it's possible that on 4.x that error was swallowed or retried. That could explain it, e.g. a pre-existing issue with your server, and now we're just reporting it. |
It looks like in the previous version the errors were being raised in the connection:

```ruby
# Raised by the connection when a protocol error occurs.
class ProtocolError < BaseError
  def initialize(reply_type)
    super(<<-EOS.gsub(/(?:^|\n)\s*/, " "))
      Got '#{reply_type}' as initial reply byte.
      If you're in a forking environment, such as Unicorn, you need to
      connect to Redis after forking.
    EOS
  end
end
```
I'm also seeing this intermittently after upgrading |
Spent the time after my last attempt slow-rolling the upgrade: Redis gem up to 4.8.1 and resque up to 2.5.0, with no issues. We have since moved from our old redis servers to GCP Memorystore. In addition, there was a rails upgrade that fixed an ActionCable redis config with sentinel. The final upgrade was swapping out hiredis for hiredis-client and upgrading the redis gem to 5.0.6. Within an hour I had received the error from a very, very small action cable call |
hmm, in fact, I can't find us using ActionCable in the call that Airbrake captures as the source at all. Could be some sneaky metaprogramming somewhere, but I'm not finding it. |
@Matt-Yorkley, did you ever find any cause? I haven't been able to try again in a while, but I'm working on some more logging in the action cable process to see if I can track it down. |
No, Matt didn't solve the issue for us. I just ran CI again on our latest code and still saw this error popping up. Since Sam claimed that hiredis doesn't bring any performance improvement I decided to remove the gem from our code and just use the default redis gem. Let me know if you come to another conclusion. |
I added more logging around what gets passed to the redis gem from action cable, but no smoking guns. Still very intermittent and haven't been able to consistently reproduce it. Are you saying that removing hiredis eliminated the issue? |
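A hypothetical sketch of that kind of logging on the publish side (the module name and size threshold are invented): flag any unusually large channel or payload handed to Redis#publish before it goes over the wire.

```ruby
# Hypothetical shim: warn about large arguments to Redis#publish, since 64kB
# of data without a newline is what the server is complaining about.
module PublishSizeLogging
  THRESHOLD = 16_384 # bytes; arbitrary, just low enough to catch outliers

  def publish(channel, message)
    if channel.to_s.bytesize > THRESHOLD || message.to_s.bytesize > THRESHOLD
      Rails.logger.warn(
        "large redis publish: channel=#{channel.to_s.bytesize}B message=#{message.to_s.bytesize}B"
      )
    end
    super
  end
end

Redis.prepend(PublishSizeLogging)
```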
Redis bulk load commands need to be under 64 KB per line (more or less): redis/redis-rb#1185, https://github.com/redis/redis/blob/07b292af5e86459e435da5986f87bc4f7415d437/src/server.h#L178

If too many IDs are included in the batch file, the lines are too long, causing an error: ERR Protocol error: too big inline request. Fix by chunking the IDs written to lines.

What chunk size to use? 1000 IDs works, and leads to lines of approximately 23 KB. But Register 2 generates statement IDs of around 20 characters in length, whereas BODS 0.2 and BODS 0.3 say they must be between 32 and 64 chars (the fact that Register 2 doesn't comply with the minimum is a separate issue). Multiplying the 64 KB limit by 20/64 gives a target of around 20 KB, so reducing to 750 IDs should be comfortable, leading to lines of around 17 KB (as measured with `wc -c`).
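The chunking itself is straightforward. A minimal sketch, assuming `ids` is the array of statement IDs and each output line becomes one command in the bulk-load file (the command, key, and file name are placeholders):

```ruby
CHUNK_SIZE = 750 # keeps each line comfortably under the 64 KB inline limit

File.open("batch.txt", "w") do |file|
  ids.each_slice(CHUNK_SIZE) do |chunk|
    # One command per line; placeholder command/key, the point is the slicing.
    file.puts(["SADD", "statement_ids", *chunk].join(" "))
  end
end
```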
Yes, at least in our CI environment. We haven't tested production yet. |
Redis 5.3.3
Ruby 3.1.3
Rails 7.0.4.2
redis gem 5.0.6
resque gem 2.4.0
Just upgraded the redis/resque gems and their dependencies and am getting this intermittent error:

ERROR -- : ERR Protocol error: too big inline request (Redis::CommandError)

The backtrace is not super helpful; it seems to indicate that action_cable is involved, but we use it a lot and I'm having trouble tracking it down. I can't reproduce it, and it appears on calls that don't trigger any actioncable actions.
The error is generated in redis, which has not changed, but looking at the gem difference, the error being raised in subscribe.rb is new. Am I looking at something that was happening all along, but was being swallowed by earlier versions of the redis gem?