Batch & parallelism send for sink connector? #54

Open
yichao-figma opened this issue Sep 16, 2024 · 1 comment

Comments

@yichao-figma

Looks like the SQS send is done on a per-record basis:
https://github.com/Nordstrom/kafka-connect-sqs/blob/master/src/main/java/com/nordstrom/kafka/connect/sqs/SqsSinkConnectorTask.java#L111

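Given that a single SQS send usually takes 10 ms or more, a sequential per-record loop tops out at roughly 100 records per second, so this won't scale for a partition that sees more than 100 RPS.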

There are two possible optimizations:

  1. Group records into batches of 10 (the maximum that SQS accepts in one SendMessageBatch request)
  2. Use an ExecutorService to send the batches in parallel, and use Futures to ensure that all batches succeeded before returning from put()
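
To make the proposal concrete, here is a rough sketch of how the two options could be combined. It is not the connector's code: `BatchedSinkSketch`, `BatchSender`, `partition`, and `putAll` are hypothetical names, and `BatchSender` is only a stand-in for whatever call actually issues the SQS batch request. A real sender would also need to inspect the per-entry failures that a batch response can report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BatchedSinkSketch {

    /** Hypothetical stand-in for one SQS batch request (assumption, not the connector's API). */
    interface BatchSender {
        void sendBatch(List<String> messages) throws Exception;
    }

    /** SQS accepts at most 10 messages per batch request. */
    static final int MAX_BATCH_SIZE = 10;

    /** Splits the records into consecutive batches of at most {@code size} elements. */
    static <T> List<List<T>> partition(List<T> records, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += size) {
            int end = Math.min(i + size, records.size());
            batches.add(new ArrayList<>(records.subList(i, end)));
        }
        return batches;
    }

    /**
     * Submits every batch to the pool, then blocks until all of them have finished.
     * If any batch failed, Future.get() rethrows the failure wrapped in an
     * ExecutionException, so the caller (for example put()) does not return normally.
     */
    static void putAll(List<String> records, BatchSender sender, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        List<Future<?>> inFlight = new ArrayList<>();
        for (List<String> batch : partition(records, MAX_BATCH_SIZE)) {
            // The lambda returns a value, so it is submitted as a Callable,
            // which is allowed to throw checked exceptions.
            inFlight.add(pool.submit(() -> {
                sender.sendBatch(batch);
                return null;
            }));
        }
        for (Future<?> f : inFlight) {
            f.get(); // wait for completion and surface any failure
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            records.add("record-" + i);
        }

        AtomicInteger sent = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Fake sender that only counts messages; a real one would call SQS.
            putAll(records, batch -> sent.addAndGet(batch.size()), pool);
        } finally {
            pool.shutdown();
        }
        System.out.println("sent " + sent.get() + " records");
    }
}
```

One trade-off to keep in mind: batches sent in parallel can complete out of order, so this approach only fits cases where ordering within a partition is not required.
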
@dylanmei
Contributor

There is an outstanding MR that adds an ASYNC mode; does it meet your expectations for optimization?

#53
