Hi @jlazic, thanks for sharing these benchmark results! Given the data you shared, it seems likely that you are being limited by internal concurrency in transforms and sinks. It is something we'd like to improve, but it is somewhat of an ongoing effort.
Benchmark
AMD Epyc 9554 (128 core)
TCP inputs are fed a 50GB file containing 8M log events in syslog RFC 5424 format.
Each input is sent the same 50GB file, so 4 inputs process a total of 200GB.
The VRL remap is CPU intensive: 250 lines of code.
At first we assumed that CPU would be the bottleneck for our pipelines because of the
intensive VRL remapping, but it turned out that with a single source/sink we simply can't
saturate more than 20 CPU cores.
In production we pull data from Kafka and sink it into Elasticsearch. The issue is that with
one source and one sink per Vector server we can't utilize more than 20 cores.
What we do now is duplicate sources and sinks in order to utilize more CPU. This is working
fine, but we'd like to get to the bottom of this and hopefully be able to run a single source/sink
per server.
Issue #18164 prompted me to run these tests and see if we can avoid running multiple inputs/sinks.
Results
The 4in,4vrl,4s combination (4 inputs, 4 VRL remaps and 4 sinks) is the winning combination for throughput,
and seems to be the only one that can scale vertically to use most of the available CPU.
The sink seems to be the limiting factor, even though blackhole should not have any limitations.
This can be seen with 1in,1vrl,4s and 1in,4vrl,4s, which both utilize CPU much better than any
of the other combinations that have just a single sink.
Insights and comments from someone with more vector.dev experience would be really appreciated here.
Below are the configs used for testing:
4 TCP sources into 4 VRL into 4 sinks
4 TCP sources into 4 VRL into 1 sink
4 TCP sources into 1 VRL into 1 sink
1 TCP source into 1 VRL into 1 sink
1 TCP source into 1 VRL into 4 sinks
1 TCP source into 4 VRL into 4 sinks
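
For anyone not familiar with the layout, here is a minimal sketch of what a single source -> VRL -> sink chain (the "1 TCP source into 1 VRL into 1 sink" case) might look like. This is an illustration only, not the config used in the benchmark: the component names, the listen address, and the one-line VRL program are all placeholders, and the real remap is about 250 lines.

```toml
# Hypothetical sketch, NOT the benchmark config.
# Component names and the port are placeholders.

[sources.tcp_in_1]
type = "socket"
mode = "tcp"
address = "0.0.0.0:9001"   # placeholder listen address

[transforms.vrl_1]
type = "remap"
inputs = ["tcp_in_1"]
# Stand-in for the real ~250-line VRL program.
source = '''
. = parse_syslog!(string!(.message))
'''

[sinks.blackhole_1]
type = "blackhole"
inputs = ["vrl_1"]
```

Presumably the multi-component combinations repeat or fan out these blocks under distinct names (for example, four independent chains for 4in,4vrl,4s), but how the inputs are wired in each case is an assumption here.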