You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Nested pagination through m2m fields is broken with the optimizer enabled. This appears to be because using .prefetch_related() with a QuerySet annotated with a Window function where partition_by is an m2m field causes an extra join and duplicate results.
As suggested above, when you annotate a .prefetch_related()QuerySet with a Window function and refer back to the other side of the m2m, you add another join - causing duplicate results. When Django introduced the ability to prefetch with sliced querysets (which Strawberry-Django isn't directly using) they had to do a bunch of hacks to combine calls to filter together (see QuerySet._next_is_sticky() and _filter_prefetch_queryset()).
To demonstrate the difference directly with the Django ORM see: a6f4e31.
In the unit test above, Strawberry-Django produces the following SQL query for the paginated prefetch:
SELECT*FROM
(SELECT ("projects_issue_tags"."tag_id") AS"_prefetch_related_val_tag_id",
"projects_issue"."name"AS"col1",
"projects_issue"."id"AS"col2",
ROW_NUMBER() OVER (PARTITION BY "projects_issue_tags"."tag_id"ORDER BY"projects_issue"."id"ASC) AS"_strawberry_row_number",
COUNT(1) OVER (PARTITION BY "projects_issue_tags"."tag_id") AS"_strawberry_total_count"FROM"projects_issue"LEFT OUTER JOIN"projects_issue_tags"ON ("projects_issue"."id"="projects_issue_tags"."issue_id")
INNER JOIN"projects_issue_tags" T4 ON ("projects_issue"."id"= T4."issue_id")
WHERE T4."tag_id"IN (1,
2)
ORDER BY"projects_issue"."id"ASC) "qualify"WHERE"_strawberry_row_number"<=2ORDER BY"col2"ASC
Whereas Django produces this SQL query for the paginated prefetch:
SELECT"_prefetch_related_val_tag_id",
"col1",
"col2"FROM
(SELECT*FROM
(SELECT ("projects_issue_tags"."tag_id") AS"_prefetch_related_val_tag_id",
"projects_issue"."name"AS"col1",
"projects_issue"."id"AS"col2",
ROW_NUMBER() OVER (PARTITION BY "projects_issue_tags"."tag_id"ORDER BY"projects_issue"."id"ASC) AS"qual0"FROM"projects_issue"INNER JOIN"projects_issue_tags"ON ("projects_issue"."id"="projects_issue_tags"."issue_id")
WHERE"projects_issue_tags"."tag_id"IN (1,
2)
ORDER BY"projects_issue"."id"ASC) "qualify"WHERE ("qual0">0AND"qual0"<=2)) "qualify_mask"ORDER BY"col2"ASC
Note the extra LEFT OUTER JOIN which causes the issue.
The solution here may be to actually use Django's inbuilt support for .prefetch_related()QuerySet slicing. However, this means we can't annotate _strawberry_total_count onto the nodes anymore. It is likely that we would need to refactor that functionality onto the parent records, maybe using a subquery count and OuterRef?
Upvote & Fund
We're using Polar.sh so you can upvote and help fund this issue.
We receive the funding once the issue is completed & confirmed by you.
Thank you in advance for helping prioritize & fund our backlog.
The text was updated successfully, but these errors were encountered:
The main complexity here is how to get Strawberry / Strawberry-Django to:
Recognise that a tag's issues will actually be on the _strawberry_optimized_issues attribute (instead of issues)
Recognise that a tag's issue count will actually be on the _strawberry_total_count_issues attribute (instead of annotated on its issues as _strawberry_total_count)
The other complexity is how to calculate things like cursor / start_cursor / end_cursor / has_previous_page / has_next_page - but I think conceptually it should still possible.
Describe the Bug
Nested pagination through m2m fields is broken with the optimizer enabled. This appears to be because using
.prefetch_related()
with aQuerySet
annotated with aWindow
function wherepartition_by
is an m2m field causes an extra join and duplicate results.Reproducible Example
See a minimal reproducible example here:
mre/nested-pagination-m2m
In the unit test above, the expected result of the query is:
However, the test fails, and the actual result of the query is:
System Information
Additional Context
It appears that this is caused by the implementation of
apply_window_pagination()
.As suggested above, when you annotate a
.prefetch_related()
QuerySet
with aWindow
function and refer back to the other side of the m2m, you add another join - causing duplicate results. When Django introduced the ability to prefetch with sliced querysets (which Strawberry-Django isn't directly using) they had to do a bunch of hacks to combine calls to filter together (seeQuerySet._next_is_sticky()
and_filter_prefetch_queryset()
).To demonstrate the difference directly with the Django ORM see: a6f4e31.
In the unit test above, Strawberry-Django produces the following SQL query for the paginated prefetch:
Whereas Django produces this SQL query for the paginated prefetch:
Note the extra
LEFT OUTER JOIN
which causes the issue.Potential Solution
As noted in the
apply_window_pagination()
docstring, Django 4.2+ actually supports slicedQuerySet
s in.prefetch_related()
now.The solution here may be to actually use Django's inbuilt support for
.prefetch_related()
QuerySet
slicing. However, this means we can't annotate_strawberry_total_count
onto the nodes anymore. It is likely that we would need to refactor that functionality onto the parent records, maybe using a subquery count andOuterRef
?Upvote & Fund
The text was updated successfully, but these errors were encountered: