Reproducible Performance Tests
The goal is to have automated tests that ensure that RIG is able to handle enterprise-grade traffic. We want to run them regularly and, ideally, in a fully automated way, so the setup should involve as few manual steps as possible.
Scenarios
Ramp-up for each run:
No connection between RIG and Kafka
Kafka messages prepared in topic
Clients connect
Establish connection between RIG and Kafka
Run 1: Time it takes to consume and drop 1M messages from Kafka
During the ramp-up, load 1M Kafka messages, each 1 kB in size, where the first and the last message have eventType set to to_be_delivered, while all others have eventType set to ignored (a producer sketch follows below).
Connect a client and subscribe to events of type to_be_delivered. This causes all except the first and the last messages to be dropped.
Measure time between the first and the last message as received at the client.
Hypothesis: this takes 1-2 seconds.
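To keep this ramp-up reproducible, the message loading can be scripted. The following is a minimal sketch, assuming Python with kafka-python, a broker on localhost:9092, a topic named rig-perf-test, and a CloudEvents-style JSON envelope; all of these names are placeholders and would have to match the actual RIG configuration.

```python
# Minimal sketch of the Run 1 ramp-up: load 1M ~1 kB messages into Kafka,
# with only the first and the last one marked as to_be_delivered.
# Assumptions: kafka-python is installed, broker at localhost:9092,
# topic "rig-perf-test", CloudEvents-style JSON envelope.
import json
from kafka import KafkaProducer

TOTAL = 1_000_000
PADDING = "x" * 900  # filler so each serialized message is roughly 1 kB

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for i in range(TOTAL):
    # Only the first and the last message should reach the client.
    event_type = "to_be_delivered" if i in (0, TOTAL - 1) else "ignored"
    producer.send("rig-perf-test", {
        "specversion": "0.2",          # CloudEvents version is an assumption
        "id": str(i),
        "source": "/perf-test/run-1",
        "type": event_type,
        "data": PADDING,
    })

producer.flush()
```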
Why we test this
One of the assumptions behind RIG's design is that many of the messages flowing through Kafka can actually be ignored from RIG's perspective, that is, are not subscribed to by any frontend that might be connected to it. This test validates that RIG is indeed capable of dropping large amounts of messages quickly.
Run 2: Resource consumption when clients all use the same configuration
Configurations: (a) 10k, (b) 20k, (c) 30k, (d) 40k clients
During the ramp-up, load 1M Kafka messages, each 1 kB in size, where all messages have eventType set to to_be_delivered.
Connect all clients with subscriptions for events of type to_be_delivered. This causes all clients to receive all messages.
Measure time as well as memory and CPU consumption over time until all clients have received 1M messages each (a client-side measurement sketch follows below).
Hypothesis: this takes <5 seconds.
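As a minimal sketch of the client-side measurement for a single simulated client: connect to RIG's SSE endpoint, count delivered events, and report the elapsed time. The endpoint URL is a placeholder, and how the to_be_delivered subscription is established depends on RIG's subscription API, so both are assumptions to be checked against the deployed version.

```python
# Minimal sketch of one simulated Run 2 client: connect via SSE, count
# delivered events, and report the time until all 1M have arrived.
# The endpoint URL is a placeholder; how the to_be_delivered subscription is
# established (query parameter, JWT claim, or a subscriptions call) depends
# on the RIG version and is not shown here.
import time
import requests

SSE_URL = "http://localhost:4000/_rig/v1/connection/sse"  # placeholder
EXPECTED = 1_000_000

def run_client() -> float:
    received = 0
    started = None
    with requests.get(SSE_URL, stream=True) as resp:
        for line in resp.iter_lines():
            if not line.startswith(b"data:"):
                continue  # ignore heartbeats, comments, and blank lines
            if started is None:
                started = time.monotonic()
            received += 1
            if received == EXPECTED:
                return time.monotonic() - started

print(f"all events received after {run_client():.2f} s")
```

For 10k-40k concurrent clients this would be driven by an asynchronous client or a load-testing tool rather than one blocking request per process, while RIG's memory and CPU consumption are sampled on its host in parallel (for example with docker stats or psutil).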
Why we test this
If RIG is run with a dedicated Kafka topic attached to it, all messages consumed from Kafka are potentially relevant to frontends. This tests the case where there is only one frontend, instantiated once for each client, where the subscriptions are not user-specific.
Run 3: Resource consumption when clients receive a presumably realistic share of the events
Configurations: (a) 10k, (b) 20k, (c) 30k, (d) 40k clients
Create 1M Kafka messages, each 1 kB in size. Use the numbers 1 to 5 as the messages' eventType, such that the five "partitions" are interleaved (1, 2, 3, 4, 5, 1, 2, ...). Load the messages during the ramp-up (see the sketch below).
Connect the clients. Make sure that a fifth of them is subscribed to eventType 1, a fifth to eventType 2, and so on. This causes each client to receive a fifth of all messages.
Measure time as well as memory and CPU consumption over time until all clients have received 200k messages each.
Hypothesis: this is 5 times faster than Run 2.
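On the producer side, Run 3 only differs from the previous runs in how the eventType is assigned; a minimal sketch, reusing the producer, topic, and payload assumptions from the Run 1 sketch above:

```python
# Minimal sketch of the Run 3 ramp-up: cycle the eventType through "1".."5"
# so the five streams are interleaved (1, 2, 3, 4, 5, 1, 2, ...).
# Reuses producer, TOTAL, and PADDING from the Run 1 sketch above.
for i in range(TOTAL):
    producer.send("rig-perf-test", {
        "specversion": "0.2",
        "id": str(i),
        "source": "/perf-test/run-3",
        "type": str(i % 5 + 1),  # "1", "2", "3", "4", "5", "1", ...
        "data": PADDING,
    })
producer.flush()
```

Accordingly, client number n would subscribe to eventType str(n % 5 + 1), so that each fifth of the clients receives exactly one of the five interleaved streams.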
Why we test this
If RIG is run with a dedicated Kafka topic attached to it, all messages consumed from Kafka are potentially relevant to frontends. This tests the case where there is only one frontend, instantiated once for each client, where the subscriptions are user-specific. In practice, the events would not be differentiated by their event type but by a dedicated "user" field. However, for the purpose of this benchmark, the eventType field is used to ease the comparison with Run 2.
Follow-up to #2.