-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cubesql) Implement format / col_description #8947
base: master
Are you sure you want to change the base?
feat(cubesql) Implement format / col_description #8947
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 8 Skipped Deployments
|
doesn't appear related to my changes, all specs i've added have passed 🤔 |
@pauldheinrichs My PR is also failing with the same set of tests. I was trying to find a PR for which this GHA was succeeding. |
I am not sure, but I prepared a potential fix for the CI issue: b75a37a @pauldheinrichs Could you please rebase your PR? Thanks! |
@ovr All set! Rebased! |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8947 +/- ##
==========================================
- Coverage 82.66% 79.76% -2.90%
==========================================
Files 221 221
Lines 78303 78545 +242
==========================================
- Hits 64727 62654 -2073
- Misses 13576 15891 +2315
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
@ovr Is there anything further you'd like to see in this PR? I'd love to merge this so we can update our metabase instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for a delay!
Here's what I've noticed. Also you need to update it to fresh master
.
Other than that I think it's good to land.
@@ -3148,6 +3155,155 @@ pub fn create_pg_get_serial_sequence_udf() -> ScalarUDF { | |||
) | |||
} | |||
|
|||
// Return a NOOP for this so metabase can sync without failing | |||
// TODO: Implement this | |||
pub fn create_col_description_udf() -> ScalarUDF { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please attach links to respective parts of PostgreSQL documentation?
Like https://www.postgresql.org/docs/17/functions-info.html#FUNCTIONS-INFO-COMMENT here, and https://www.postgresql.org/docs/17/functions-string.html#FUNCTIONS-STRING-FORMAT for format, maybe other important references.
|
||
ScalarUDF::new( | ||
"col_description", | ||
&Signature::one_of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of this Signature::one_of
? It matches single type signature, why not use Signature::exact
directly?
ScalarUDF::new( | ||
"col_description", | ||
&Signature::one_of( | ||
vec![TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8])], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
col_description
is documented as ( table oid, column integer ) → text
,
so I expect that argument types here would be ints, not strings.
You can check against pg_description
system table provider for this
https://github.com/cube-js/cube/blob/master/rust/cubesql/cubesql/src/compile/engine/information_schema/postgres/pg_description.rs
First argument would be used to filter objoid
, and second - objsubid
as_str | ||
))) | ||
// If the type name contains a dot, it's a schema-qualified name | ||
// and we should return return the approprate RegClass to be converted to OID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: return return
))) | ||
// If the type name contains a dot, it's a schema-qualified name | ||
// and we should return return the approprate RegClass to be converted to OID | ||
// For now, we'll return 0 so metabase can sync without failing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add TODO here along the lines of "implement something"
"col_description", | ||
&Signature::one_of( | ||
vec![TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8])], | ||
Volatility::Immutable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
col_description
is not immutable, as it can change values for same inputs in a different queries: column comment can change between queries.
But it cannot change values for a single query (I think, assuming PostgreSQL isolation), so it should be Volatility::Stable
.
|
||
ScalarUDF::new( | ||
"format", | ||
&Signature::variadic(vec![DataType::Utf8], Volatility::Immutable), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
format
is documented like this
format(formatstr text [, formatarg "any" [, ...] ])
Signature::variadic([Utf8], ...)
will accept any number of arguments, but each argument must be one of [Utf8]
Unfortunately, there's no proper way to represent it in TypeSignature
right now. But current variant would just cast everything to Utf8 before calling format
, which is ok for now, I think. More recent DataFusion versions have extended TypeSignature
to support this case, so we can deal with this later.
Please comment this part: about casting and TODO about incorrect signature
// Quote identifier if necessary | ||
let needs_quoting = !value | ||
.chars() | ||
.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit wrong, because identifiers 1
and 1a
need quoting, but a1
- does not.
+----------------+ | ||
| quoted_keyword | | ||
+----------------+ | ||
| select | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PostgreSQL actually wraps keywords with quotes, and accepts keywords unwrapped only in assigning column alias position, but does not accept those in subquery alias or in column expression.
Maybe just wrap any %I
argument with quotes? I can't imagine right now what could go wrong, in a sense that format
argument can already be quoted in result, so any consumer should handle it as well. That would differ from PostgreSQL behavior, of course. But current implementation also differs, and "quote everything" approach have a benefit of something like "any %I substitution would make a proper SQL".
@mcheshkov |
@pauldheinrichs, hope you are doing well! |
@mcheshkov of course! Feel free |
Semi-resolves: #8926
Hoping to unblock metabase failing field sync on it's latest version. This pull requests implements
format
macroIt also noops
schema
.table
identifiers)I don't have a tonne of bandwidth to fully implement the missing macro but this at least helps unblock the current request metabase is making to cube and failing when attempting to sync-fields.
It's not entirely clear if there's already an inherent way to lookup the table OID to get that result from the RegClass result to be leveraged in col_description implementation
Issue Reference this PR resolves
[For example #12]
Description of Changes Made (if issue reference is not provided)
[Description goes here]