Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support JOIN ... USING ( col1, col2, ... ) #730

Open
dwickern opened this issue Oct 11, 2023 · 8 comments · May be fixed by #862
Open

Support JOIN ... USING ( col1, col2, ... ) #730

dwickern opened this issue Oct 11, 2023 · 8 comments · May be fixed by #862
Labels
api Related to library's API enhancement New feature or request mysql Related to MySQL postgres Related to PostgreSQL sqlite Related to sqlite

Comments

@dwickern
Copy link

Many databases support this shorthand for joins when the column names are the same in both tables.

From Postgres docs:

The USING clause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining T1 and T2 with USING (a, b) produces the join condition ON T1.a = T2.a AND T1.b = T2.b.

=> SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;
 num | name | num | value
-----+------+-----+-------
   1 | a    |   1 | xxx
   3 | c    |   3 | yyy
(2 rows)

=> SELECT * FROM t1 INNER JOIN t2 USING (num);
 num | name | value
-----+------+-------
   1 | a    | xxx
   3 | c    | yyy
(2 rows)

PostgreSQL: https://www.postgresql.org/docs/current/queries-table-expressions.html
MySQL https://dev.mysql.com/doc/refman/8.0/en/join.html
Sqlite: https://www.sqlite.org/syntax/join-constraint.html
Oracle: https://docs.oracle.com/javadb/10.6.2.1/ref/rrefsqljusing.html
SQL Server: not supported

Usage could be something like

db.selectFrom('person as a')
  .innerJoinUsing('person as b', ['first_name', 'last_name'])
  .select(['a.id', 'b.id'])
@koskimas
Copy link
Member

koskimas commented Oct 12, 2023

I don't know if this is worth the complication in the code. Especially if we did this in the most type-safe way possible, we should only accept columns both tables have. That'd be tricky.

But if we do add this, we could add it to the join builder

db.selectFrom('person as a')
  .innerJoin('person as b', (join) => join.using(['first_name', 'last_name']))
  .select(['a.id', 'b.id'])

@igalklebanov igalklebanov added enhancement New feature or request mysql Related to MySQL postgres Related to PostgreSQL api Related to library's API sql Related to general SQL labels Oct 12, 2023
@igalklebanov
Copy link
Member

igalklebanov commented Nov 4, 2023

@koskimas Apparently, they behave differently:

A USING clause can be rewritten as an ON clause that compares corresponding columns. However, although USING and ON are similar, they are not quite the same. Consider the following two queries:

a LEFT JOIN b USING (c1, c2, c3)
a LEFT JOIN b ON a.c1 = b.c1 AND a.c2 = b.c2 AND a.c3 = b.c3

With respect to determining which rows satisfy the join condition, both joins are semantically identical.

With respect to determining which columns to display for SELECT * expansion, the two joins are not semantically identical. The USING join selects the coalesced value of corresponding columns, whereas the ON join selects all columns from all tables. For the USING join, SELECT * selects these values:

COALESCE(a.c1, b.c1), COALESCE(a.c2, b.c2), COALESCE(a.c3, b.c3)

For the ON join, SELECT * selects these values:

a.c1, a.c2, a.c3, b.c1, b.c2, b.c3

With an inner join, COALESCE(a.c1, b.c1) is the same as either a.c1 or b.c1 because both columns have the same value. With an outer join (such as LEFT JOIN), one of the two columns can be NULL. That column is omitted from the result.

https://dev.mysql.com/doc/refman/8.0/en/join.html

Furthermore, the output of JOIN USING suppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. While JOIN ON produces all columns from T1 followed by all columns from T2, JOIN USING produces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns from T1, followed by any remaining columns from T2.

https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-JOIN


So it's not just syntactic sugar. But is it an important enough difference worth adding 3-4 methods for everywhere and the type complexity of query context (e.g. columns a.c1 & b.c1 no longer exist, but c1 does) and result type? 🤷

@dwickern
Copy link
Author

dwickern commented Nov 5, 2023

Interesting. There is a difference when using an unqualified select * from joined tables with duplicate column names.

select * from a join b on a.c1 = b.c1 is equivalent to select a.*, b.*. Column c1 appears twice in the result set. In my tool it's prefixed like a.c1 and b.c1.

select * from a join b using (c1) only has a single copy of c1.

Kysely seems to overwrite columns with the same name, so you get the same results in either case. There is only a single c1. So there's no need to change Kysely's types.

@igalklebanov
Copy link
Member

igalklebanov commented Nov 5, 2023

I'm not concerned with select *, but with query context - can b.c1 be referenced in clauses after from (e.g. where) when using join using? Seems to be, at least in https://sqliteonline.com/.

Kysely seems to overwrite columns with the same name, so you get the same results in either case. There is only a single c1. So there's no need to change Kysely's types.

So if query context is the same, and there's no difference in result (in Kysely), this is all just syntactic sugar. As such, the importance of having this in the API is quite low.

@igalklebanov igalklebanov added sqlite Related to sqlite and removed sql Related to general SQL labels Nov 5, 2023
@dwickern
Copy link
Author

dwickern commented Nov 5, 2023

Yes, the query context is the same

@EitanKatsightfull
Copy link

Is there any update on the implementation of this ? @igalklebanov @koskimas

@dwickern dwickern linked a pull request Jan 29, 2024 that will close this issue
@alexpmay
Copy link

alexpmay commented Feb 9, 2024

We use kysely extensively and love it. I just wanted to upvote this request as it make joins more concise. We use the using syntax extensively in raw sql, it would be great to have it in kysely too.

@dwickern
Copy link
Author

dwickern commented Feb 9, 2024

I have an open PR for this in #862

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Related to library's API enhancement New feature or request mysql Related to MySQL postgres Related to PostgreSQL sqlite Related to sqlite
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants