"Leaking" of `_expr_*` columns into result relations #4719
Labels
bug
Invalid compiler output or panic
What happened?
prqlc produces SQL which uses `_expr_*` columns for intermediate calculation results. These should never end up in the resulting output relation.
The current situation is understandable, since not all SQL dialects support the `SELECT EXCLUDE _expr_*` syntax, and without schema information about the input relations the compiler cannot explicitly enumerate the output columns.
However, if we think about PRQL from first principles as "a language for Relational Algebra", then I hazard the guess that we all agree that such internal implementation details should not form part of the output.
As pointed out in #4633 (comment), such extraneous columns can also cause errors in downstream calculations.
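Until there is a compiler-side fix, one stopgap is for the client to strip these columns after the query has run. The following is only a minimal Python sketch of that idea, not a proposed solution. It assumes the internal columns are named `_expr_` followed by digits, and the SQL in it is hand-written to mimic the shape of the problem; it is not actual prqlc output. The helper `drop_internal_columns` and the table `t` are hypothetical names used for illustration.

```python
import re
import sqlite3

# Assumed naming pattern for compiler-internal columns (illustrative only).
INTERNAL_COLUMN = re.compile(r"^_expr_\d+$")


def drop_internal_columns(cursor):
    """Return (column_names, rows) with internal `_expr_N` columns removed."""
    names = [d[0] for d in cursor.description]
    keep = [i for i, name in enumerate(names) if not INTERNAL_COLUMN.match(name)]
    rows = [tuple(row[i] for i in keep) for row in cursor.fetchall()]
    return [names[i] for i in keep], rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, 4)])

# Hand-written SQL that only mimics a leaked intermediate column;
# this is NOT real prqlc output.
cur = conn.execute("SELECT *, a + b AS _expr_0 FROM t ORDER BY _expr_0")
columns, rows = drop_internal_columns(cur)
print(columns, rows)
```

This kind of filtering has to be repeated by every consumer and would also drop a user column that happens to match the pattern, which is part of why the columns should not reach the output in the first place.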
One of my personal ambitions for PRQL is to extend it beyond the current SQL backends to also include other "relational" languages and query engines such as Pandas, Polars, (Power Query) M Language, etc., to name just a few. In those cases the `_expr_*` columns should also not show up, and ideally PRQL should produce consistent results between different backends.
PRQL input
SQL output
Expected SQL output
MVCE confirmation
Anything else?
I'm not sure what the solution would be so just raising this as a tracking issue and discussion point.
There are some more examples in #3130 (comment).