clickhouse_loader.py : fixing temporal and binary data types #338
  if 'datetime' == data_type or 'datetime(1)'== data_type or 'datetime(2)' == data_type or 'datetime(3)' == data_type:
  # CH datetime range is not the same as MySQL https://clickhouse.com/docs/en/sql-reference/data-types/datetime64/
- select += f"case when {column_name} > substr('2283-11-11 23:59:59.999', 1, length({column_name})) then TRIM(TRAILING '0' FROM CAST('2283-11-11 23:59:59.999' AS datetime(3))) else case when {column_name} <= '1925-01-01 00:00:00' then TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM CAST('1925-01-01 00:00:00.000' AS datetime(3)))) else TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM {column_name})) end end"
+ select += f"case when {column_name} > substr('2299-12-31 23:59:59.999', 1, length({column_name})) then substr(TRIM(TRAILING '0' FROM CAST('2299-12-31 23:59:59.999' AS datetime(3))),1,length({column_name})) else case when {column_name} <= '1900-01-01 00:00:00' then TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM CAST('1900-01-01 00:00:00.000' AS datetime(3)))) else TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM {column_name})) end end"
It would be nice if values such as 2299-12-31 were kept in a separate file or a separate variable, so that they can be updated easily.
Good point. This was tested with 23.3. @subkanthi, do you know if there is a JDBC driver that supports those maximum values?
I will add it as a parameter, as it may change again.
@subkanthi Added --min_date_value and --max_date_value, please review!
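As a rough illustration of how the two new options could feed the clamping expression from the diff, here is a minimal sketch. Only the flag names `--min_date_value` and `--max_date_value` come from this thread; the helper names `build_parser` and `clamp_datetime_expr`, the date-only value format, and the defaults (taken from the literals in the diff) are assumptions, not the actual code in clickhouse_loader.py.

```python
import argparse


def build_parser():
    """Hypothetical parser; only the two flag names come from the PR."""
    parser = argparse.ArgumentParser(description="datetime clamping sketch")
    # Defaults mirror the literals in the diff; the real defaults may differ.
    parser.add_argument("--min_date_value", default="1900-01-01")
    parser.add_argument("--max_date_value", default="2299-12-31")
    return parser


def clamp_datetime_expr(column_name, min_date_value, max_date_value):
    """Build a MySQL expression that clamps a datetime column to the range
    the target ClickHouse version accepts, following the shape of the diff."""
    max_ts = f"{max_date_value} 23:59:59.999"
    min_ts = f"{min_date_value} 00:00:00"
    return (
        f"case when {column_name} > substr('{max_ts}', 1, length({column_name})) "
        f"then substr(TRIM(TRAILING '0' FROM CAST('{max_ts}' AS datetime(3))), "
        f"1, length({column_name})) "
        f"else case when {column_name} <= '{min_ts}' "
        f"then TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM "
        f"CAST('{min_ts}.000' AS datetime(3)))) "
        f"else TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM {column_name})) "
        f"end end"
    )


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(clamp_datetime_expr("created_at", args.min_date_value, args.max_date_value))
```

Keeping the bounds in one place means a change in the supported range only touches the defaults or the command line, not the SQL template.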
This fixes the checksums for temporal types when the timezone is not UTC.
MySQL timestamps are converted to DateTime64 in the client timezone.
MySQL DateTime values are transferred as String, so that they are interpreted in the server timezone.
For the migration to be consistent, the source and target timezones should be identical. It should not matter for timestamps, but it could cause conversion issues for DateTime.
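The timezone point can be illustrated with plain Python, independent of MySQL or ClickHouse. This is only a sketch of the general behavior: a timestamp is an absolute instant whose rendered text depends on the timezone used, while a DateTime is wall-clock text whose instant depends on the timezone it is interpreted in. The helper names and the fixed UTC-5 offset are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"
UTC = timezone.utc
UTC_MINUS_5 = timezone(timedelta(hours=-5))  # illustrative fixed offset


def render_instant(epoch_seconds, tz):
    """Timestamp-like value: one instant, rendered as text in a given timezone."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime(FMT)


def interpret_wall_clock(text, tz):
    """DateTime-like value: wall-clock text, turned into an instant by
    assuming it was written in the given timezone."""
    naive = datetime.strptime(text, FMT)
    return int(naive.replace(tzinfo=tz).timestamp())


if __name__ == "__main__":
    instant = 1_700_000_000
    # Same instant, different text: a text-based checksum would differ
    # unless both sides render in the same timezone.
    print(render_instant(instant, UTC), "vs", render_instant(instant, UTC_MINUS_5))

    wall_clock = "2023-11-14 22:13:20"
    # Same text, different instants: this is the DateTime conversion risk
    # when source and target server timezones differ.
    print(interpret_wall_clock(wall_clock, UTC), "vs",
          interpret_wall_clock(wall_clock, UTC_MINUS_5))
```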
The binary types have been addressed. Binary types are encoded as base64 by clickhouse_loader.py, while the checksum encoding is currently hex only (see FR #340 for the sink-connector).
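To make the encoding mismatch concrete, the sketch below shows that the same bytes produce different text, and therefore different digests, under base64 and hex, and that decoding base64 and re-encoding as hex reconciles the two. The function names and the use of SHA-256 are assumptions for illustration; the thread does not say which hash the checksum tooling uses.

```python
import base64
import hashlib


def base64_to_hex(encoded):
    """Decode a base64 string and re-encode the raw bytes as lowercase hex."""
    return base64.b64decode(encoded).hex()


def text_digest(text):
    """Hypothetical text checksum; SHA-256 is an arbitrary stand-in."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    raw = bytes([0x00, 0xFF, 0x10, 0x20])
    as_base64 = base64.b64encode(raw).decode("ascii")
    as_hex = raw.hex()

    # Different encodings of the same bytes give different checksums...
    print(text_digest(as_base64) == text_digest(as_hex))
    # ...until both sides are normalized to the same encoding.
    print(text_digest(base64_to_hex(as_base64)) == text_digest(as_hex))
```

When comparing against hex produced elsewhere, letter case may also need normalizing, since `bytes.hex()` returns lowercase.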