Replies: 5 comments 10 replies
-
@skry-dev thanks for using RabbitMQ and for providing a reasonably detailed set of information. Please note how I edited your question to make it easier to read. Another option is to attach files to your comments on GitHub. In this specific case, we know nothing about the environment in which you are running RabbitMQ.
-
Your PerfTest workload will result in attempting to send 150,000,000 bytes of data per second (500 * 100 * 3000). This is equivalent to 1.2 gigabits per second. Your environment probably can't keep up. My recommendation would be to scale back some combination of message rate, size and publisher count until you see acceptable latencies. @mkuratczyk will have insights here as well.
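The arithmetic above can be checked quickly in a shell. The numbers are the ones quoted in this comment (100 publishers, each sending 500 messages/s of 3,000 bytes); adjust them to whatever your actual PerfTest flags were:

```shell
# Throughput implied by the workload: publishers x rate x message size.
publishers=100
rate=500          # messages per second per publisher
size=3000         # bytes per message

bytes_per_sec=$(( publishers * rate * size ))
bits_per_sec=$(( bytes_per_sec * 8 ))

echo "${bytes_per_sec} bytes/s"   # 150000000 bytes/s
echo "${bits_per_sec} bits/s"     # 1200000000 bits/s, i.e. 1.2 gigabits/s
```

Note that this is application payload only; AMQP framing, quorum queue replication to the other two nodes, and disk writes multiply the real bandwidth and I/O cost.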
-
With this much load you should run your producer and consumer PerfTest instances on separate hosts. Since they are running on the same host they may be competing for resources.
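A sketch of how the split could look with two PerfTest instances. The URIs, queue name, and rates here are placeholders, and this assumes PerfTest's `--producers 0` / `--consumers 0` / `--predeclared` options for running publish-only and consume-only instances; check `bin/runjava com.rabbitmq.perf.PerfTest --help` on your version:

```shell
# Host A: publish-only instance (no consumers here).
bin/runjava com.rabbitmq.perf.PerfTest \
  --uri amqp://rabbit-node-1.example.com \
  --producers 100 --consumers 0 \
  --size 3000 --rate 500 \
  --queue qq.test --quorum-queue

# Host B: consume-only instance; the queue already exists,
# so tell PerfTest not to redeclare it.
bin/runjava com.rabbitmq.perf.PerfTest \
  --uri amqp://rabbit-node-1.example.com \
  --producers 0 --consumers 100 \
  --predeclared --queue qq.test
```

This way publisher-side CPU, consumer-side CPU, and the broker are not all contending for the same cores and NIC.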
-
RabbitMQ is completely overloaded in this scenario. First of all, you are not using publisher confirms, so you keep publishing even though RabbitMQ can't keep up. This leads to Erlang process message queues growing longer and longer, which is why you see high latency. I'd say the short answer is: you are demanding a lot from RabbitMQ and there's no simple change that'd guarantee this workload can be handled well. You can use observer to confirm the above, and you can record a flamegraph to see where CPU time is spent, but there likely won't be a single obvious thing that we can just improve. Some things you could try:
But as I said - I can't guarantee that any of this will be sufficient. You can also reconsider whether you really need a workload like this - perhaps you can use a stream, perhaps you can use classic non-mirrored queues, perhaps you can split the workload between multiple RabbitMQ clusters, etc. In the future, we are thinking about having multiple Ra subsystems with QQs assigned (probably randomly) to one of them. This way there wouldn't be a single Erlang process which needs to handle everything - it'd be split between a few such processes, so we could use multiple CPUs and use the hard disk better. And if a single hard disk would still not be sufficient (fsync latency is the main bottleneck), you'd be able to mount multiple disks, one per Ra system, so everything would parallelize better. I'm fairly sure we will make this change at some point, but no guarantees about when.
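On the publisher confirms point above: PerfTest can enable confirms and cap the number of unconfirmed in-flight messages, so publishers back off when the broker falls behind instead of flooding it. A sketch, with an illustrative URI, queue name, and window size (the `--confirm` value sets the max outstanding unconfirmed publishes; verify the flag on your PerfTest version):

```shell
# Publishers block once 50 messages are awaiting confirmation,
# which turns broker backpressure into lower publish rates
# instead of unbounded queue growth and latency.
bin/runjava com.rabbitmq.perf.PerfTest \
  --uri amqp://rabbit-node-1.example.com \
  --producers 100 --consumers 100 \
  --size 3000 --rate 500 \
  --confirm 50 \
  --queue qq.test --quorum-queue
```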
-
Your receiving rate is lower than your sending rate, so messages have to queue in the, erm, queues. That is why you see higher latency. Try a workload with fewer consumers, set a QoS (prefetch) value for each consumer in the region of 20-50, and experiment from there.
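The suggestion above could be tried with something like the following. The consumer count and prefetch value are starting points to experiment from, and the URI/queue name are placeholders (this assumes PerfTest's `--qos` flag, which sets per-consumer prefetch):

```shell
# Fewer consumers, each with a prefetch window of ~30 unacked messages,
# so the broker doesn't push more deliveries than consumers can process.
bin/runjava com.rabbitmq.perf.PerfTest \
  --uri amqp://rabbit-node-1.example.com \
  --producers 100 --consumers 20 \
  --qos 30 \
  --size 3000 --rate 500 \
  --queue qq.test --quorum-queue
```

Then vary `--consumers` and `--qos` while watching the end-to-end latency PerfTest reports.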
-
Hi,
I ran a performance test and saw high latencies in the message flow.
End-to-end latency in the test goes up to 15 seconds. It shouldn't take more than a few seconds for my use case.
I checked resources and the network but didn't find any anomaly.
How can I improve message latency?
My test command:
My cluster properties:
I have a 3-node RabbitMQ (version 3.12.12) cluster.
RabbitMQ runs as a Docker container on each node.
Each node has a 200 GB disk, 16 CPUs and 20 GB RAM.
I used quorum queues.
rabbitmq.conf
docker-compose.yml
Test results:
Thank you.
Expected behavior
I need to improve quorum queue performance.