Add a benchmark to verify performance #53

Open
mariomastrodicasa opened this issue Apr 8, 2022 · 7 comments · Fixed by #59, #169, #178 or #383
Labels: enhancement (New feature or request), KNet (KNet related issue)

Comments

@mariomastrodicasa
Contributor

Is your feature request related to a problem? Please describe.
The project needs a benchmark to verify its performance.

Describe the solution you'd like
Add benchmarks for the produce/consume API calls to verify performance and identify possible bottlenecks.

Describe alternatives you've considered
Introduce counters or another timing mechanism, such as Stopwatch, to measure elapsed time.

Additional context
N/A

@masesdevelopers
Contributor

A benchmark cannot be built in absolute terms. What can be done is to write functions that perform the same work but rely on a different underlying mechanism: for example, KNet in one function and Confluent.Kafka in another.
Pairing the two executions makes it possible to compare both implementations.
However, the configurations must create roughly equivalent environments: if one side is optimized and the other is not, the comparison has no real meaning. This matters because of the number of options available when a consumer or producer is allocated, and because of how the underlying mechanism manages message batching in both receive mode (consume) and send mode (produce).
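
As a rough illustration of the paired approach described above, the following sketch times one workload through an arbitrary `send` delegate. The class name `PairedBenchmark` and its parameters are hypothetical and not taken from the KNet repository; the delegate stands in for whatever wraps a KNet or Confluent.Kafka producer.

```csharp
using System;
using System.Diagnostics;

public static class PairedBenchmark
{
    // Runs the same workload through one implementation and returns the
    // elapsed time of each repetition, in milliseconds.
    // 'send' is a hypothetical wrapper around a producer call.
    public static double[] Measure(Action<byte[]> send, int repetitions, int recordsPerRun, int recordSize)
    {
        if (send == null) throw new ArgumentNullException(nameof(send));
        if (repetitions <= 0) throw new ArgumentOutOfRangeException(nameof(repetitions));

        var payload = new byte[recordSize];      // identical payload for every implementation
        var timings = new double[repetitions];

        for (int r = 0; r < repetitions; r++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < recordsPerRun; i++)
            {
                send(payload);
            }
            sw.Stop();
            timings[r] = sw.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}
```

Calling `Measure` twice with the same `repetitions`, `recordsPerRun`, and `recordSize`, once per implementation, yields two series that can be compared, provided both producers are configured equivalently (batching, acknowledgements, and so on).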

masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 29, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 29, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 29, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 29, 2022
masesdevelopers added a commit that referenced this issue Apr 29, 2022
* Version upgrade

* #54 (comment)

* #53 (#53 (comment)): added comparing benchmark test

* Fix on classes

* #53: Added metrics classes

* Alignment to latest JNet

* #53: Fix file

* CLI improvements

* #53: Benchmark update

* Added command line arg for log path

* Added KNetTopicCopyBenchmark

* Added specialized classes to manage produce/consume

* Upgrade to JNet v1.4.1

* KNetConsumer is marked as evolving

* Fix issue when GC recover objects too early

* Aligned to JNetCore and fix memory management
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 1, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 1, 2022
masesdevelopers added a commit that referenced this issue May 2, 2022
* Set single point of definition for topics name

* #53: added more statistics on report

* #53: added first results in a specific site page

* Fixed #56 (comment)

* Fix for #56 (comment)
@mariomastrodicasa
Contributor Author

Maybe a new option should be added, or a value calculated, to report the average, standard deviation, and CV excluding the maximum and minimum values in the series. The max and min of a series can be affected by spurious conditions and can skew the other measurements: try removing them before calculating the values.
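
A minimal sketch of the calculation suggested above, assuming that exactly one minimum and one maximum sample are dropped and that the sample standard deviation (n - 1 denominator) is wanted. The thread does not say which SD definition the report uses, and the name `TrimmedStats` is hypothetical.

```csharp
using System;
using System.Linq;

public static class TrimmedStats
{
    // Returns mean, standard deviation, and coefficient of variation (SD / mean)
    // computed after dropping one minimum and one maximum sample.
    public static (double Mean, double SD, double CV) Compute(double[] samples)
    {
        if (samples == null || samples.Length < 4)
            throw new ArgumentException("At least four samples are needed.", nameof(samples));

        // Sort a copy, then skip the first (min) and the last (max) element.
        double[] trimmed = samples.OrderBy(x => x).Skip(1).Take(samples.Length - 2).ToArray();

        double mean = trimmed.Average();

        // Assumption: sample standard deviation, with n - 1 in the denominator.
        double sumOfSquares = trimmed.Sum(x => (x - mean) * (x - mean));
        double sd = Math.Sqrt(sumOfSquares / (trimmed.Length - 1));

        double cv = mean == 0.0 ? double.NaN : sd / mean;
        return (mean, sd, cv);
    }
}
```

For the series 100, 10, 12, 11, 1 the trimmed series is 10, 11, 12, which gives a mean of 11 and an SD of 1, whereas the untrimmed mean would be 26.8.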

masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 3, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 3, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 3, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 3, 2022
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue May 4, 2022
masesdevelopers added a commit that referenced this issue May 4, 2022
* Updates on KNetConsumer

* #53 (comment)

* Fixed some numbers

* #53: stdev becomes SD

* #53: updates on report

* #53: updates on report

* #53: extra tests
@mariomastrodicasa
Contributor Author

Try to integrate BenchmarkDotNet (https://github.com/dotnet/BenchmarkDotNet).

@masesdevelopers masesdevelopers added the KNet KNet related issue label Oct 25, 2022
masesdevelopers added a commit that referenced this issue Feb 8, 2023
* Updates on KNetConsumer

* #53 (comment)

* #53: stdev becomes SD

* #58: upgrade JNet and version

* Update copyright notice

* #92: update to latest JNetCore

* #92: set JCOBridge version to 2.5.2
@mariomastrodicasa
Contributor Author

@masesdevelopers: Add a new benchmark that measures the time elapsed between the moment a record is sent by the producer and the moment the same record is received by a consumer: in effect, a roundtrip. I think this benchmark can measure both producer and consumer performance; maybe it is possible to use #53 (comment)
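
The roundtrip idea above can be sketched as follows. This is only an illustration of the measurement technique: an in-memory `BlockingCollection` stands in for the Kafka topic, and the name `RoundtripSketch` is hypothetical. A real benchmark would carry the send timestamp inside the record instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading.Tasks;

public static class RoundtripSketch
{
    // Sends 'count' records through an in-memory queue (a stand-in for a topic)
    // and returns the send-to-receive latency of each record, in microseconds.
    public static double[] Run(int count)
    {
        var latencies = new double[count];

        using (var topic = new BlockingCollection<(int Id, long SentTicks)>())
        {
            // Consumer: computes the latency as soon as each record arrives.
            Task consumer = Task.Run(() =>
            {
                foreach (var record in topic.GetConsumingEnumerable())
                {
                    long elapsedTicks = Stopwatch.GetTimestamp() - record.SentTicks;
                    latencies[record.Id] = elapsedTicks * 1_000_000.0 / Stopwatch.Frequency;
                }
            });

            // Producer: stamps each record with the send time right before sending it.
            for (int id = 0; id < count; id++)
            {
                topic.Add((id, Stopwatch.GetTimestamp()));
            }

            topic.CompleteAdding();   // lets the consumer loop terminate
            consumer.Wait();
        }
        return latencies;
    }
}
```

Because `Stopwatch` timestamps are only meaningful on the machine that produced them, this style of measurement assumes the producer and the consumer run on the same host, ideally in the same process.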

masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Mar 13, 2023
@masesdevelopers masesdevelopers linked a pull request Mar 14, 2023 that will close this issue
9 tasks
masesdevelopers added a commit that referenced this issue Mar 14, 2023
* #92: upgrade to JNet 1.5.2 and JCOBridge 2.5.3

* #53: benchmark update

* #121: update to version 1.5.1

* #92: updates Java version in pom.xml to be aligned to JNet
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 16, 2023
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 21, 2023
masesdevelopers added a commit to masesdevelopers/KafkaBridge that referenced this issue Apr 21, 2023
masesdevelopers added a commit that referenced this issue Apr 21, 2023
* #53: added new roundtrip benchmark

* #53: update performance description file
masesdevelopers added a commit that referenced this issue Jul 6, 2023
* #168: first step: replace namespaces in all available classes

* #93: documentation alignment

* Added a benchmark to verify performance roundtrip (#178)

* #53: added new roundtrip benchmark

* #53: update performance description file

* #168: moved KNet specific classes into dedicated folder

* #179: fixed compacted topic creation (#180)

* #92: full class review due to breaking change in JCOBridge

* #168: first generation

* Added KNetCompactedReplicator, evolved KNetProducer and KNetConsumer, enhanced serializer/deserializer (#182)

* #92: full class review due to breaking change in JCOBridge

* #175: improvements in Consumer/Producer Builders

* #175: review classes to accept new KNetCompactedReplicator

* #175: added serialization projects

* #175: moved to Java SerDes due to error in C# compilation within container

* #175: review of KNetConsumer, KNetProducer and KNetSerDes; added specific test for KNetConsumer and KNetProducer

* #175: update serialization and added MessagePack type

* #175: updates on sync management

* #175: documentation update

* Update documentation after commit fb2bded

* Added missing SourceLink

* Configuration is now managed using a JSON file (masesgroup/JNet#179)

* Update configuration and files

* Temporary commit: many classes shall be removed because are old

* #185: fix .NET Framework PowerShell version (#186)

* #92: update to JNet 2.0.0.0

* #121: update to version 2.0.0.0

* Correction on namespace

* Update classes after JNetReflector update for masesgroup/JNet#195

* #168 (comment): implementation of special listeners

* Reviewed implementation of KNet version of ConnectStandalone and ConnectDistributed

* #88: full update to Apache Kafka 3.5.0

* Code alignment to latest JNetReflector: nullable native types converted into Java types

* Update workflows to avoid documentation generation out of main branch build
masesdevelopers added a commit that referenced this issue Jul 7, 2023
* Full project review based on latest version of JNet suite (#193)

* #168: first step: replace namespaces in all available classes

* #93: documentation alignment

* Added a benchmark to verify performance roundtrip (#178)

* #53: added new roundtrip benchmark

* #53: update performance description file

* #168: moved KNet specific classes into dedicated folder

* #179: fixed compacted topic creation (#180)

* #92: full class review due to breaking change in JCOBridge

* #168: first generation

* Added KNetCompactedReplicator, evolved KNetProducer and KNetConsumer, enhanced serializer/deserializer (#182)

* #92: full class review due to breaking change in JCOBridge

* #175: improvements in Consumer/Producer Builders

* #175: review classes to accept new KNetCompactedReplicator

* #175: added serialization projects

* #175: moved to Java SerDes due to error in C# compilation within container

* #175: review of KNetConsumer, KNetProducer and KNetSerDes; added specific test for KNetConsumer and KNetProducer

* #175: update serialization and added MessagePack type

* #175: updates on sync management

* #175: documentation update

* Update documentation after commit fb2bded

* Added missing SourceLink

* Configuration is now managed using a JSON file (masesgroup/JNet#179)

* Update configuration and files

* Temporary commit: many classes shall be removed because are old

* #185: fix .NET Framework PowerShell version (#186)

* #92: update to JNet 2.0.0.0

* #121: update to version 2.0.0.0

* Correction on namespace

* Update classes after JNetReflector update for masesgroup/JNet#195

* #168 (comment): implementation of special listeners

* Reviewed implementation of KNet version of ConnectStandalone and ConnectDistributed

* #88: full update to Apache Kafka 3.5.0

* Code alignment to latest JNetReflector: nullable native types converted into Java types

* Update workflows to avoid documentation generation out of main branch build

* Update documentation after commit 312b4bf

* Added missing documentation (#194)

* #24, #168: review documentation, removed unused classes, KafkaClientSupplier becomes a listener

* #24: fix documentation location (#195)

* #24: removed many warning from workflows output (#196)

* #24: added disclaimer for version 2.0.0 (#197)

* Update documentation after commit cb755ee

* V2 merge conflicts solved (#198)

* Remerge (#199)

* #24: added some documentation for serializer/deserializer (#201)

* Update documentation after commit b9e9db3

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
@masesdevelopers
Contributor

Reopening to update the statistics with the latest versions of KNet and Confluent.Kafka.

@masesdevelopers
Contributor

Many KNet operations may or may not be slowed down by JNI, because the data handed to native code may or may not be a copy of the data held in the JVM.

As stated in https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html, the JVM can decide to copy or pin data depending on its internal implementation.

The data transfer used in KNet is based on JNI, and the results of some benchmarks seem to be affected by the following hypothesized factors:

  • the JVM makes a copy of the data before passing it to JNI and writes the object data back when JNI returns, so the data is copied multiple times during the exchange;
  • JVM Garbage Collector: some JNI methods can affect GC operations since, depending on the JVM implementation, the GC can pin or copy arrays of primitive types.

Current benchmarks are based on byte arrays to reduce exogenous interference (such as the implementation of the serializers), and in general the KNet-specific implementations (KNetConsumer, KNetProducer, KNet Streams SDK, etc.) use byte arrays.

A better data exchange might be obtained by reducing the number of array copies made during execution.
An issue will be opened to investigate this possible evolution; meanwhile, this issue is reopened.

@masesdevelopers
Contributor

The information traverses the CLR-JVM boundary many times, which reduces speed.

The local serializer stubs invoke remote serializers: every conversion involves the JVM, as in

public static byte[] SerializeBoolean(string topic, bool data)
or
public static bool DeserializeBoolean(string topic, byte[] data)

Each conversion moves data between the CLR and the JVM:

  • the data (bool, int, double, etc.) is sent to the JVM
  • the JVM returns a converted byte array
  • the byte array is returned to the caller, which inserts it into e.g. a ProducerRecord
  • the ProducerRecord sends the array back to the JVM again

The same happens in reverse with ConsumerRecord:

  • the JVM receives a ConsumerRecord
  • if a piece of data is needed, e.g. the key, the byte array is requested from the JVM, which sends it back to the CLR
  • the CLR then sends the byte array back to the JVM
  • finally, the JVM returns the converted type (bool, int, double, etc.)
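
One way the extra crossings listed above could be avoided is to convert primitive types on the CLR side, so that only the final byte array reaches the JVM. The sketch below is illustrative only: `LocalSerDesSketch` is a hypothetical name, the thread does not say whether KNet adopts this approach, and the one-byte 0x01/0x00 boolean layout is an assumption. The four-byte big-endian integer layout matches what Kafka's `IntegerSerializer` produces.

```csharp
using System;
using System.Buffers.Binary;

public static class LocalSerDesSketch
{
    // Hypothetical CLR-side counterparts of the remote stubs: no JVM call is made.
    // Assumption: a boolean is encoded as a single byte, 0x01 for true and 0x00 for false.
    public static byte[] SerializeBoolean(string topic, bool data)
    {
        return new[] { data ? (byte)1 : (byte)0 };
    }

    public static bool DeserializeBoolean(string topic, byte[] data)
    {
        if (data == null || data.Length != 1)
            throw new ArgumentException("Expected exactly one byte.", nameof(data));
        return data[0] != 0;
    }

    // 32-bit integers are written as four bytes, most significant byte first.
    public static byte[] SerializeInt(string topic, int data)
    {
        var buffer = new byte[4];
        BinaryPrimitives.WriteInt32BigEndian(buffer, data);
        return buffer;
    }

    public static int DeserializeInt(string topic, byte[] data)
    {
        if (data == null || data.Length != 4)
            throw new ArgumentException("Expected exactly four bytes.", nameof(data));
        return BinaryPrimitives.ReadInt32BigEndian(data);
    }
}
```

With converters like these, a value would cross the boundary once, as a byte array inside the record, instead of once per conversion step. The `topic` parameter is kept only to mirror the signatures quoted above and is unused here.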
