[Format] Support an official "timestamp with time zone offset" type #44248

Open
CurtHagenlocher opened this issue Sep 27, 2024 · 5 comments

@CurtHagenlocher
Contributor

Describe the enhancement requested

The relational databases Snowflake, MSSQL, Oracle, Teradata, and SAP SQL Anywhere all support a data type that stores both a timestamp and a time zone offset. This differs from the existing Arrow timestamp type in two ways: each individual value in the column can have a different offset, and the offset is not tied to a geopolitical time zone. The type also appears in Java as OffsetDateTime and in .NET as DateTimeOffset. Given how commonly it appears, it would be nice to have a standard way to represent it in Arrow.

This could be done as an extension type for a structure consisting of separate 8-byte timestamp and 2-byte offset values, or as a new first-class type. Intervals are a structure with some similarity to this type and were done as a first-class type, but they also predate the extension type mechanism.
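To make the proposed structure concrete, here is a minimal sketch of one value stored as an 8-byte timestamp plus a 2-byte offset. It uses only the Python standard library rather than any Arrow API. The microsecond unit, the UTC normalization, the little-endian packing, and the names `encode` and `decode` are all assumptions for illustration; the issue does not specify them.

```python
import struct
from datetime import datetime, timedelta, timezone

# Hypothetical layout (not an Arrow-defined format):
#   int64  microseconds since the Unix epoch, normalized to UTC
#   int16  offset from UTC in minutes
_LAYOUT = struct.Struct("<qh")  # little-endian, no padding: 8 + 2 = 10 bytes
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode(dt: datetime) -> bytes:
    """Pack an offset-aware datetime into the assumed 10-byte layout."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("datetime must be offset-aware")
    micros = (dt - _EPOCH) // timedelta(microseconds=1)
    offset_minutes = offset // timedelta(minutes=1)
    return _LAYOUT.pack(micros, offset_minutes)


def decode(buf: bytes) -> datetime:
    """Rebuild the datetime, restoring the original per-value offset."""
    micros, offset_minutes = _LAYOUT.unpack(buf)
    tz = timezone(timedelta(minutes=offset_minutes))
    return (_EPOCH + timedelta(microseconds=micros)).astimezone(tz)
```

Storing the instant in UTC keeps values comparable without consulting the offset, while the offset field preserves the wall-clock time the source system recorded. Whether a real extension type would normalize to UTC is a design question this sketch does not settle.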

Component(s)

Format

@rok
Member

rok commented Sep 27, 2024

For the record, Arrow can currently store time zone offsets per array as strings (see here).
To store per-value offsets, an extension type sounds like a good idea. What temporal resolution would you propose? Minutes would fit into two bytes, I suppose.

@CurtHagenlocher
Contributor Author

The common convention seems to be minute-level resolution with a range of something like -14:00 to +14:00. Storing the number of minutes as an int16 seems right.
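As a quick check of the arithmetic: ±14:00 is ±840 minutes, which sits well inside the int16 range of -32768 to 32767. The sketch below converts an offset string to signed minutes and validates it. The exact ±14:00 bounds are taken from the "something like" estimate above and are an assumption, as is the helper name `parse_offset`.

```python
# Assumed bounds, taken from the approximate range quoted in the thread.
MIN_OFFSET_MINUTES = -14 * 60  # -14:00
MAX_OFFSET_MINUTES = 14 * 60   # +14:00
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1


def parse_offset(text: str) -> int:
    """Convert '+HH:MM' or '-HH:MM' to signed minutes, checking the range."""
    if len(text) != 6 or text[0] not in ("+", "-") or text[3] != ":":
        raise ValueError(f"expected +HH:MM or -HH:MM, got {text!r}")
    hours, minutes = int(text[1:3]), int(text[4:6])
    if minutes >= 60:
        raise ValueError(f"minutes out of range in {text!r}")
    sign = -1 if text[0] == "-" else 1
    total = sign * (hours * 60 + minutes)
    if not MIN_OFFSET_MINUTES <= total <= MAX_OFFSET_MINUTES:
        raise ValueError(f"offset {text!r} is outside the assumed range")
    return total
```

For example, `parse_offset("+05:30")` gives 330 and `parse_offset("-14:00")` gives -840, both far from the int16 limits.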

@rok
Member

rok commented Sep 28, 2024

Adding an extension type would start with opening a PR against CanonicalExtensions.rst that describes the proposed type, then calling for discussion and a vote on the mailing list (e.g. 8-bit boolean). It might make sense to wait for more people to chime in before doing so, though.

@kou kou changed the title [Format]: Support an official "timestamp with time zone offset" type [Format] Support an official "timestamp with time zone offset" type Sep 28, 2024
@CurtHagenlocher
Contributor Author

Related request: apache/arrow-java#171

@aiguofer
Contributor

Interesting, my request ^ was slightly unrelated. However, I've been doing some TZ work recently and would love to have a vector type that includes the offset/tz per-row instead of the existing type that includes the offset/tz for the entire vector.
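A per-row offset vector could look roughly like the following: two parallel fixed-width buffers, one for the instants and one for the offsets. This is a standard-library sketch, not an Arrow or arrow-java API. The class name `OffsetTimestampColumn`, the microsecond unit, and the UTC normalization are assumptions for illustration.

```python
from array import array
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class OffsetTimestampColumn:
    """Hypothetical column holding a separate UTC offset for every row."""

    def __init__(self) -> None:
        self.micros = array("q")   # signed 8-byte: microseconds since epoch, UTC
        self.offsets = array("h")  # signed 2-byte: offset from UTC in minutes

    def append(self, dt: datetime) -> None:
        offset = dt.utcoffset()
        if offset is None:
            raise ValueError("datetime must be offset-aware")
        self.micros.append((dt - _EPOCH) // timedelta(microseconds=1))
        self.offsets.append(offset // timedelta(minutes=1))

    def __len__(self) -> int:
        return len(self.micros)

    def __getitem__(self, i: int) -> datetime:
        tz = timezone(timedelta(minutes=self.offsets[i]))
        return (_EPOCH + timedelta(microseconds=self.micros[i])).astimezone(tz)
```

Unlike a vector with a single time zone for all values, rows appended with different offsets read back with their own offsets intact.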
