Can not infer where a column is from but then the error message says that column is available? #4723

Open
2 tasks done
cottrell opened this issue Jul 12, 2024 · 2 comments
Labels
bug Invalid compiler output or panic

cottrell commented Jul 12, 2024

What happened?

I'm not sure if this is a real bug or just a strange error message?

PRQL input

from data
join side:left other (other.id == data.id)
into A

from codes
into S

from A
select {
  A.val
}

SQL output

Error: 
   ╭─[:7:3]
   │
 7 │   A.val
   │   ──┬──  
   │     ╰──── Cannot infer where `A`.val is from. It could be any of [(119, {}), (116, {})]
   │ 
   │ Help: available columns: `A`.val
───╯

Expected SQL output

Any SQL.

MVCE confirmation

  • Minimal example
  • New issue

Anything else?

No response

@cottrell cottrell added the bug Invalid compiler output or panic label Jul 12, 2024
Collaborator

kgutwin commented Jul 15, 2024

I did some experimentation with this example.

The `from codes \n into S` section isn't necessary to reproduce the problem; this produces the same output:

from data
join side:left other (==id)
into A

from A
select {A.val}

Error:
   ╭─[:6:9]
   │
 6 │ select {A.val}
   │         ──┬──  
   │           ╰──── Cannot infer where `A`.val is from. It could be any of [(119, {}), (116, {})]
   │ 
   │ Help: available columns: `A`.val
───╯

The error message becomes clear once you remove the `into A \n from A` sequence and drop the `A.` prefix in the `select`:

from data
join side:left other (==id)
select {val}

Error:
   ╭─[:3:9]
   │
 3 │ select {val}
   │         ─┬─  
   │          ╰─── Ambiguous name
   │ 
   │ Help: could be any of: data.val, other.val
───╯

So the root cause of the error message is that `A.val` is ambiguous: it could come from either `data` or `other`. Unfortunately, the `into A \n from A` break makes it impossible to reference columns from the source relations directly:

from data
join side:left other (==id)
into A

from A
select {data.val}

Error:
   ╭─[:6:9]
   │
 6 │ select {data.val}
   │         ────┬───  
   │             ╰───── Unknown name `data.val`
───╯

The most obvious solution to me is to recommend explicitly defining columns using a `select` prior to the `into A`. This resolves the error from the user's perspective.

from data
join side:left other (==id)
select {data.val, other.col}
into A

from A
select {A.val}  # no error now

This might actually be the only true solution to this problem, since there doesn't seem to be any other way for the resolver to disambiguate a column reference when there are multiple source relations with wildcard columns.
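
To make the reasoning above concrete, here is a toy model in Python of why two wildcard sources leave a name ambiguous and why an explicit `select` fixes it. This is only a sketch and not how the PRQL compiler's resolver is implemented; `Slot` and `resolve` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slot:
    """One output column of a pipeline (hypothetical model, not PRQL internals)."""
    source: str                 # relation the column originates from
    name: Optional[str] = None  # None stands for a wildcard, i.e. `source.*`


def resolve(column, slots):
    """Return the source relation that provides `column`, or raise LookupError."""
    # Explicitly named columns take priority over wildcards.
    named = [s.source for s in slots if s.name == column]
    if len(named) == 1:
        return named[0]
    if len(named) > 1:
        raise LookupError(f"Ambiguous name `{column}`: could be any of {named}")

    # Otherwise any wildcard source *might* contain the column.
    wildcards = [s.source for s in slots if s.name is None]
    if len(wildcards) == 1:
        return wildcards[0]
    if not wildcards:
        raise LookupError(f"Unknown name `{column}`")
    raise LookupError(f"Ambiguous name `{column}`: could be any of {wildcards}")


# `from data | join other (==id) | into A`: both inputs are wildcards.
without_select = [Slot("data"), Slot("other")]

# `... | select {data.val, other.col} | into A`: every column has one origin.
with_select = [Slot("data", "val"), Slot("other", "col")]

try:
    resolve("val", without_select)
except LookupError as err:
    print(err)

print(resolve("val", with_select))
```

In this model the first lookup fails because nothing says which wildcard holds `val`, while the second succeeds because the explicit `select` pins `val` to `data`.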

@cottrell
Author

Ah, yes. I was just thinking in general about explicit column definitions via `select` and didn't think to try it in these examples.

I wonder if this is perhaps the ONE COMPROMISE that PRQL should really push hard on its users. The more I get into the weeds, the more I understand that very little can be done without knowing the original column names.
