-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support expression in limit clause #19834
base: main
Are you sure you want to change the base?
Conversation
src/sqlparser/src/ast/query.rs
Outdated
@@ -655,7 +655,7 @@ impl fmt::Display for OrderByExpr { | |||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] | |||
pub struct Fetch { | |||
pub with_ties: bool, | |||
pub quantity: Option<String>, | |||
pub quantity: Option<Expr>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not as simple as it looks like. A parsing conflict can happen:
https://github.com/postgres/postgres/blob/REL_17_2/src/backend/parser/gram.y#L13233
* Allowing full expressions without parentheses causes various parsing * problems with the trailing ROW/ROWS key words. SQL spec only calls for * <simple value specification>, which is either a literal or a parameter (but * an <SQL parameter reference> could be an identifier, bringing up conflicts * with ROW/ROWS). We solve this by leveraging the presence of ONLY (see above) * to determine whether the expression is missing rather than trying to make it * optional in this rule. * c_expr covers almost all the spec-required cases (and more), but it doesn't * cover signed numeric literals, which are allowed by the spec. So we include * those here explicitly. We need FCONST as well as ICONST because values that * don't fit in the platform's "long", but do fit in bigint, should still be * accepted here. (This is possible in 64-bit Windows as well as all 32-bit * builds.)
For example,
select 1 offset 1 row fetch first row only;
^
This is valid in PostgreSQL. When we see this row
keyword, it should be part of the fetch first [n] row
syntax where n
is omitted, but it may be mistakenly treated as part of the row(field0, field1)
expression because we will attempt to parse an expression (instead of a numerical constant) after first
.
What is the actual parsing behavior when this commit sees this SQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about let me revert the changes for "fetch" but keep the changes for "limit",
the CI test required by SQLMesh integration needs limit simple expression
only.
let fetch
still stick to a literal number
I will open a new issue for the reasoning posted above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or we force parentheses around the expression in Fetch just like PG
SELECT id
FROM generate_series(1, 10) AS id
ORDER BY id
OFFSET 5 ROWS
FETCH NEXT (1+4) ROWS ONLY;
id
----
6
7
8
9
10
(5 rows)
SELECT id
FROM generate_series(1, 10) AS id
ORDER BY id
OFFSET 5 ROWS
FETCH NEXT 1+4 ROWS ONLY;
psql:commands.sql:11: ERROR: syntax error at or near "+"
LINE 5: FETCH NEXT 1+4 ROWS ONLY;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or we force parentheses around the expression in Fetch just like PG
PostgreSQL is more complex than this, but yes parenthesized expr is a subset of what PostgreSQL allows (c_expr
as defined in the linked gram.y
above, while limit
allows a_expr
. A includes C includes parenthesized expr.).
4ce5e6d
to
ee54595
Compare
src/frontend/src/planner/query.rs
Outdated
let limit = limit.unwrap_or(ExprImpl::literal_bigint(LIMIT_ALL_COUNT as i64)); | ||
if !limit.is_const() { | ||
return Err(ErrorCode::ExprError( | ||
"expects an integer or expression that can be evaluated to an integer after LIMIT" | ||
.into(), | ||
) | ||
.into()); | ||
} | ||
let limit_cast_to_bigint = limit.cast_explicit(DataType::Int64).map_err(|_| { | ||
RwError::from(ErrorCode::ExprError( | ||
"expects an integer or expression that can be evaluated to an integer after LIMIT" | ||
.into(), | ||
)) | ||
})?; | ||
let limit = match limit_cast_to_bigint.fold_const() { | ||
Ok(Some(datum)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This validation logic is better placed in binder, and
BoundQuery::limit
remains anOption<u64>
. Planner is in charge of transforming SQL into a plan graph. - We can do cast first and then
try_fold_const
, rather than usingis_const
andfold_const
separately. - The allowed casts in PostgreSQL is
assign
rather thanexplicit
. (int
requiresimplicit
,decimal
requiresassign
,jsonb
requiresexplicit
. I tested in PostgreSQL andLIMIT '1'::jsonb
is not allowed.) - When the evaluated output is
None
, which means SQL NULL, PostgreSQL treats it as no limit. Let's add a test case for this. - Test error cases: wrong type, not const, eval error, negative
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test error cases: wrong type, not const, eval error, negative
wrong type, negative can be constructed, I added them
but "eval error and not const" seems to be avoided by the parsing stage already
c66c0db
to
9aab24b
Compare
})?; | ||
let limit = match limit_cast_to_bigint.try_fold_const() { | ||
Some(Ok(Some(datum))) => { | ||
let value = datum.as_integral(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: as_i64
as we already did the cast above.
// wrong type, not const, eval error all belongs to this branch | ||
_ => return Err(ErrorCode::ExprError( | ||
"expects an integer or expression that can be evaluated to an integer after LIMIT" | ||
.into(), | ||
).into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Wrong type already handled as
cast_assign
error above. - Not const corresponds to
None
. The test case can beselect 1 limit generate_series(1, 2);
- Eval error corresponds to
Some(Err(_))
. The test case can beselect 1 limit 2/0;
Preferably _
shall be avoided when possible.
statement error | ||
SELECT * FROM generate_series(0,3,1) as t(v) order by v limit NULL + '1'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's include the error message for all statement error
cases below, to make sure they are not failing with an unexpected reason. For example, this query actually reports function add(unknown, unknown) is not unique
statement error
SELECT * FROM generate_series(0,3,1) as t(v) order by v limit NULL + '1';
----
db error: ERROR: Failed to run the query
Caused by these errors (recent errors listed first):
1: Failed to bind expression: NULL + '1'
2: Bind error: function add(unknown, unknown) is not unique
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
fix #19830
required by SQLMesh's CI test
The fix is to evaluate the expression during binding stage so that we either accept the result (it is a number >=0 after casting to bigint) or reject it (all the other cases)
Checklist
Documentation
Release note