ClientV2 ignoring schema inference hints and infer integer settings #1884

Closed
alxhill opened this issue Oct 24, 2024 · 5 comments · Fixed by ClickHouse/clickhouse-docs#2942

alxhill commented Oct 24, 2024

Describe your feedback

We run the following query through the ClickHouse Java client:

CREATE TABLE {tmpTableName:Identifier}  ENGINE=Log() AS SELECT * FROM s3('...file.csv', 'CSVWithNames')

With the following QuerySettings:

date_time_input_format=best_effort,
input_format_try_infer_integers=0,
schema_inference_hints='timestamp Nullable(Float64)',
input_format_try_infer_exponent_floats=1,
precise_float_parsing=1,
input_format_try_infer_dates=1,
input_format_try_infer_datetimes=1,
schema_inference_use_cache_for_s3=0,
schema_inference_make_columns_nullable=1

In ClientV1, the created table would have Nullable(Float64) for the timestamp column. ClientV2 seems to ignore the inference hints (and the "input_format_try_infer_integers=0" setting), as the column is Nullable(Int64) instead:

TableSchema{tableName='tmp_csv_0be155a3_920d_4496_87a7_ee2facee2efc', databaseName='nominal', columns=[timestamp Nullable(Int64), a Nullable(Float64), b Nullable(String)], metadata={a={type=Nullable(Float64)}, b={type=Nullable(String)}, timestamp={type=Nullable(Int64)}}, colIndex={a=1, b=2, timestamp=0}, hasDefaults=false}
@chernser

@alxhill Thank you for reporting!
I will look into it shortly.

@chernser chernser added the bug label Oct 25, 2024
@chernser chernser added this to the 0.7.2 milestone Oct 31, 2024

alxhill commented Nov 19, 2024

Okay, looks like the culprit is that ClientV1 appends all QuerySettings to the URI, while ClientV2 only appends server settings to the URI.

ClientV1

for (Entry<String, Serializable> entry : settings.entrySet()) {
    // Skip internal settings
    if (entry.getKey().equalsIgnoreCase("_set_roles_stmt")) {
        continue;
    }
    appendQueryParameter(builder, entry.getKey(), String.valueOf(entry.getValue()));
}

ClientV2

for (Map.Entry<String, Object> entry : requestConfig.entrySet()) {
    if (entry.getKey().startsWith(ClientSettings.SERVER_SETTING_PREFIX)) {
        req.addParameter(entry.getKey().substring(ClientSettings.SERVER_SETTING_PREFIX.length()), entry.getValue().toString());
    }
}

Changing from settings.setOption to settings.serverSetting fixed my tests. This seems like a breaking change worth resolving, since settings passed through setOption are silently ignored.
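
A minimal sketch of why that switch helps, for anyone hitting the same thing. It is a stand-in, not the real client code: SketchSettings is a hypothetical class, the prefix value is a placeholder for ClientSettings.SERVER_SETTING_PREFIX, and the assumption that serverSetting stores the name under that prefix is inferred from the ClientV2 loop quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the client's settings object; NOT the real QuerySettings. */
final class SketchSettings {
    // Placeholder value (assumption). The real constant is ClientSettings.SERVER_SETTING_PREFIX.
    static final String SERVER_SETTING_PREFIX = "server_setting_";

    private final Map<String, Object> raw = new LinkedHashMap<>();

    // Models setOption: the key is stored exactly as given.
    SketchSettings setOption(String key, Object value) {
        raw.put(key, value);
        return this;
    }

    // Models serverSetting (assumed behavior): the key is stored under the prefix.
    SketchSettings serverSetting(String name, String value) {
        raw.put(SERVER_SETTING_PREFIX + name, value);
        return this;
    }

    // Mirrors the ClientV2 loop quoted above: only prefixed keys become query parameters,
    // and the prefix is stripped before the parameter is added.
    Map<String, String> toQueryParameters() {
        Map<String, String> params = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            if (entry.getKey().startsWith(SERVER_SETTING_PREFIX)) {
                params.put(entry.getKey().substring(SERVER_SETTING_PREFIX.length()),
                        entry.getValue().toString());
            }
        }
        return params;
    }
}
```

With this model, new SketchSettings().setOption("input_format_try_infer_integers", 0) yields no query parameters at all, while new SketchSettings().serverSetting("input_format_try_infer_integers", "0") yields one parameter with the original setting name, which matches the behavior I saw.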

@chernser

@alxhill
Sorry, my bad. I need to document this.

Is the issue resolved, so that only the documentation needs to change?

Thanks!


alxhill commented Nov 26, 2024

We are unblocked, but this does seem like a breaking change in the client and not something I would expect to see when upgrading between minor versions.

@chernser

@alxhill Sorry about that, my bad. This is a consequence of using two different clients under the hood (old and new).
We always keep in mind that changes may be breaking and need to be handled properly, which is why many fixes are put behind feature flags.
