WriteAPI - WriteOptions #145

csballa · 2020-08-25T11:39:22Z

Hello,

I was trying to migrate a simple logger from the old Java client to this one. I expected WriteOptions to behave like the previous client's batching mechanism.

Using the writeApi with the default WriteOptions doesn't seem to actually keep a batch, or to retry writing if an attempt has failed.
It writes points to the given DB/retentionPolicy, but shutting down the Influx server or deleting the DB/RP results in various errors that are logged but swallowed (NotFoundException, InfluxException, ConnectException...). (I was trying to simulate connection loss or a wrong configuration this way.)
After starting the Influx server again and recreating the DB/RP, it continues to work, but the points written while InfluxDB was down are lost.
As the errors are handled inside the client, I don't see an easy way to implement my own batching for points.

Is some extra config needed for the write options to take effect? As far as I have seen, the default options are set implicitly, but if I have to activate them somehow, I totally missed where or how.

(Not related, and I don't know where else I should ask: how can I test/query with this client for the existence of a DB and/or RP?)

Thanks in advance for any fix/info!

Specifications:

Client Version: tested with: 1.10, 1.11
InfluxDB Version: 1.8
Platform: Windows

bednar · 2020-08-26T05:53:40Z

Hi @csballa,

thanks for using our client.

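You can configure batching through WriteOptions. Try something like:

```java
WriteOptions writeOptions = WriteOptions.builder()
        .batchSize(5000)        // number of points collected per batch
        .flushInterval(1000)    // milliseconds before a batch is written
        .bufferLimit(10000)     // maximum number of unwritten points kept
        .jitterInterval(1000)   // random delay in milliseconds added to the flush interval
        .retryInterval(5000)    // milliseconds to wait before retrying a failed write
        .build();

WriteApi writeApi = client.getWriteApi(writeOptions);
```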

https://github.com/influxdata/influxdb-client-java/tree/master/client#writes

> (Not related: But I don't know where should I ask, but how can I test/query with this client for existence of a DB and/or RP?)

We currently don't support the DB/RP API (https://v2.docs.influxdata.com/v2.0/api/#tag/DBRPs), but it is a good suggestion, so we will implement it.

Regards

csballa · 2020-08-26T06:37:59Z

Hi @bednar,

thanks for the quick response.

As far as I can see, the default write options are similar. Having looked into the source code, I figured out that the issue seems to be that losing the connection or not having the configured DB/RP is not considered a retryable error.

Consequently, the points in the batch are lost and the write operation is not retried. I would suggest:

1. An error response or unsuccessful write operation should never cause loss of data. (The older Influx client's batch kept retrying, and once the batch started to run out of room it would start calling the passed-in error-handling function, so on write failure it was easy to store the points that would otherwise be lost.)
2. The WriteSuccessEvent contains the Line Protocol written, but I couldn't find similar, easily accessible data about the attempted writes in the error events, so handling failed writes by other means is also tricky. Getting back the Point/data in case of errors would be a nice improvement for error handling.

What do you think about point 1? Would it be possible to change the implementation so that every unsuccessful write is retried later and the batch is kept until it grows over its limit?

bednar · 2020-09-24T12:27:46Z

Hi @csballa,

We are working on improving the retry strategy for the client. We will introduce new configuration options to make it more user-friendly:

| Property | Description | Default Value |
| --- | --- | --- |
| max_retries | the maximum number of retries when a write fails | 5 |
| max_retry_delay | the maximum delay when retrying a write, in milliseconds | 180000 |
| exponential_base | the base for the exponential retry delay; the next delay is computed as `retry_interval * exponential_base^(attempts - 1) + random(jitter_interval)` | 5 |
| batch_abort_on_exception | the batching worker will be aborted after a failed retry strategy | false |
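
To make the delay formula in the table concrete, here is a minimal, self-contained sketch of the computation. It is not the client's implementation: the class and method names are hypothetical, `retry_interval` is taken from the `retryInterval(5000)` example earlier in the thread, and capping the result at `max_retry_delay` is an assumption about how that option is applied.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of influxdb-client-java.
final class RetryDelaySketch {
    static final long RETRY_INTERVAL_MS = 5_000;    // from the WriteOptions example in this thread
    static final int EXPONENTIAL_BASE = 5;          // default from the table
    static final long MAX_RETRY_DELAY_MS = 180_000; // default from the table
    static final int MAX_RETRIES = 5;               // default from the table

    // retry_interval * exponential_base^(attempt - 1) + jitter,
    // capped at max_retry_delay (the cap is an assumption).
    static long delayMillis(int attempt, long jitterMs) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        double raw = RETRY_INTERVAL_MS * Math.pow(EXPONENTIAL_BASE, attempt - 1) + jitterMs;
        return (long) Math.min(raw, MAX_RETRY_DELAY_MS);
    }

    // random(jitter_interval): a uniform value in [0, jitterIntervalMs).
    static long randomJitter(long jitterIntervalMs) {
        return jitterIntervalMs <= 0 ? 0 : ThreadLocalRandom.current().nextLong(jitterIntervalMs);
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            System.out.println("attempt " + attempt + ": " + delayMillis(attempt, 0) + " ms");
        }
    }
}
```

With zero jitter, these values give delays of 5000, 25000 and 125000 ms for the first three attempts. Under the capping assumption, the fourth and fifth attempts are limited to 180000 ms.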

Regards

csballa · 2020-09-24T12:31:59Z

Hello @bednar,

Thank you for the improvement!
Just to clarify: can we expect that all import errors, not just the retryable ones, will be retried (like a wrong configuration or a lost connection)?

Thanks again!

bednar · 2020-09-24T12:39:18Z

We will retry all HTTP connection errors plus HTTP errors with status >= 429.
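
As a rough illustration of that rule, the sketch below classifies failures the way the sentence above describes. The class and method names are hypothetical, and treating every `IOException` as a "connection error" is an assumption, not the client's actual check.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical classification helper, not part of influxdb-client-java.
final class RetryPolicySketch {
    static final int FIRST_RETRYABLE_STATUS = 429;

    // HTTP errors with status >= 429 are retried, per the rule stated above.
    static boolean isRetryableStatus(int httpStatus) {
        return httpStatus >= FIRST_RETRYABLE_STATUS;
    }

    // Assumption: any IOException (for example ConnectException) counts as a connection error.
    static boolean isConnectionError(Throwable error) {
        return error instanceof IOException;
    }

    public static void main(String[] args) {
        System.out.println("404 retryable: " + isRetryableStatus(404));
        System.out.println("429 retryable: " + isRetryableStatus(429));
        System.out.println("503 retryable: " + isRetryableStatus(503));
        System.out.println("connect error retryable: "
                + isConnectionError(new ConnectException("connection refused")));
    }
}
```

Under this rule a 404 would not be retried. If the NotFoundException seen for a missing DB/RP corresponds to HTTP 404, as its name suggests, that case would still not be covered, while a lost connection would be.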

csballa · 2020-09-25T13:47:03Z

Will you consider improving the error handling to provide the failed points in the error events (especially in WriteErrorEvent)?
A similar option for handling points that overflow the batch would also be nice.
Without these options, ensuring that no data gets lost requires a workaround. If we had access to the failed points, it would also be much easier to write them to another target DB or file.
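
To illustrate the kind of overflow hook being asked for, here is a minimal sketch of a bounded buffer that hands evicted entries to a callback instead of dropping them silently. Everything in it is hypothetical (the class name, the eviction of the oldest entry, and the use of line-protocol strings); it does not describe how the client's buffer works.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical buffer, not part of influxdb-client-java.
final class BoundedPointBuffer {
    private final int limit;
    private final Consumer<String> onOverflow;
    private final Deque<String> lines = new ArrayDeque<>();

    BoundedPointBuffer(int limit, Consumer<String> onOverflow) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
        this.onOverflow = onOverflow;
    }

    // Adds a line-protocol record; when the buffer is full, the oldest record
    // is passed to the overflow handler before the new one is stored.
    synchronized void add(String lineProtocol) {
        if (lines.size() >= limit) {
            onOverflow.accept(lines.removeFirst());
        }
        lines.addLast(lineProtocol);
    }

    // Returns the buffered records in insertion order and empties the buffer.
    synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(lines);
        lines.clear();
        return batch;
    }
}
```

A caller could pass a handler that appends each evicted line to a file or writes it to a second database, which is the fallback described above.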

bednar · 2020-09-29T10:23:06Z

@csballa yes, we could do that in the next PR, after we improve our retry strategy.

csballa · 2021-01-05T14:24:42Z

Hello @bednar,

I was finally able to try out the changes:

Getting the points in case of an error is still sorely missed, but the new config options are really helpful.
It may be worth another ticket, but I discovered a new issue during testing (client version 1.14, Influx 1.8):

Scenario:
I tested the retries by starting InfluxDB only after some write attempts, so that I could see how my writes would be retried.

Result:

1. Points written before the retries start are written after DB startup. (expected behaviour)
2. Points written after the first retry throw an Interrupted exception after DB startup. These points are lost, while points written before the retries and points written after the error are saved correctly. Here I would expect points arriving during the retries to be added to the batch as well and become part of the same retry attempts, or be queued up with the other attempts.

This behaviour causes points written during the retries to get lost, which makes the retries unusable.
InterruptedStackTrace.txt

Regards

bednar · 2022-07-18T05:33:09Z

The interrupted exception should be fixed by #358.

bednar mentioned this issue Sep 30, 2020

feat: add exponential backoff strategy for retry #156

Merged

6 tasks

bednar closed this as completed Jul 18, 2022

bednar added this to the 6.4.0 milestone Jul 18, 2022

bednar added the duplicate label Jul 18, 2022
