Unify and embed multiple filters that operate on the same property #2055

Open
wants to merge 2 commits into
base: master

Conversation

andrejtonev
Contributor

Description

A query like
MATCH(n) WHERE n.p > 10 AND n.p < 20 RETURN n;
will generate a tree with 2 independent filters (n.p > 10 and n.p < 20).
This becomes a problem in the optimizer, which tries to embed filters into range scans but accepts only a single filter per scan. As a result, only the lower or the upper bound gets embedded, and the other filter is left out of the scan.

This PR introduces a new operator, RangeOperator.
It addresses the problem by generating a single filter with both a lower and an upper bound, which can be embedded into a range scan as a whole.
The operator is not generated by the CypherMainVisitor but later, as part of the tree rewrite. This way, other optimizations that move filters around are taken into account, and filters from different parts of the query can potentially be combined into a single range.
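
The idea of folding two filters on the same property into one range can be sketched as follows. This is only an illustration under simplifying assumptions, not the code from this PR: `PropertyFilter`, `Bound`, `RangeFilter`, and `CombineIntoRange` are hypothetical names, and bounds are modeled as numeric literals, whereas real bounds can be arbitrary expressions.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the planner's types. The real AST
// nodes and the RangeOperator added by the PR are not shown on this page.
enum class Cmp { Less, LessEqual, Greater, GreaterEqual };

struct PropertyFilter {  // models `property <cmp> value`, e.g. n.p > 10
  std::string property;
  Cmp cmp;
  double value;  // assumption: literal bounds only
};

struct Bound {
  double value;
  bool inclusive;
};

struct RangeFilter {  // models `lower < property < upper`
  std::string property;
  std::optional<Bound> lower;
  std::optional<Bound> upper;
};

// Folds every filter on `property` into one range, keeping the tightest bounds.
// Filters on other properties are ignored.
RangeFilter CombineIntoRange(const std::vector<PropertyFilter> &filters,
                             const std::string &property) {
  RangeFilter range{property, std::nullopt, std::nullopt};
  for (const auto &f : filters) {
    if (f.property != property) continue;
    const bool is_lower = f.cmp == Cmp::Greater || f.cmp == Cmp::GreaterEqual;
    const bool inclusive = f.cmp == Cmp::GreaterEqual || f.cmp == Cmp::LessEqual;
    std::optional<Bound> &slot = is_lower ? range.lower : range.upper;
    const Bound candidate{f.value, inclusive};
    if (!slot) {
      slot = candidate;
      continue;
    }
    const bool tighter =
        is_lower ? candidate.value > slot->value : candidate.value < slot->value;
    // On equal values, an exclusive bound is tighter than an inclusive one.
    const bool same_but_stricter = candidate.value == slot->value && !candidate.inclusive;
    if (tighter || same_but_stricter) slot = candidate;
  }
  return range;
}
```

For the query above, the filters `n.p > 10` and `n.p < 20` would collapse into one `RangeFilter` with an exclusive lower bound of 10 and an exclusive upper bound of 20, which is the shape a range scan needs.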

[master < Task] PR

  • Provide the full content or a guide for the final git message
    • [FINAL GIT MESSAGE]

Documentation checklist

  • Add the documentation label tag
  • Add the bug / feature label tag
  • Add the milestone for which this feature is intended
    • If not known, set for a later milestone
  • Write a release note, including added/changed clauses
    • [Release note text]
  • Link the documentation PR here
    • [Documentation PR link]
  • Tag someone from docs team in the comments

@andrejtonev andrejtonev self-assigned this May 20, 2024
@andrejtonev andrejtonev added the labels Docs - changelog only, CI -build=release -test=core (Run release build and core tests on push), CI -build=release -test=e2e (Run release build and e2e tests on push), CI -build=release -test=stress (Run release build and stress tests on push), and CI -build=release -test=benchmark (Run release build and benchmark on push) May 20, 2024
@andrejtonev andrejtonev force-pushed the range_operator branch 2 times, most recently from 1af2133 to 640ce88 Compare May 21, 2024 14:30
@andrejtonev andrejtonev marked this pull request as ready for review May 21, 2024 14:32
@andrejtonev andrejtonev requested a review from Ignition May 21, 2024 14:32
Rewriting the Cypher AST tree to combine multiple filters into a range
Insert range filter into range scans
@@ -251,6 +252,13 @@ BINARY_OPERATOR_VISIT(SubscriptOperator, "Subscript");

#undef BINARY_OPERATOR_VISIT

void ExpressionPrettyPrinter::Visit(RangeOperator &op) {
Contributor

How do we reach this code/see its effect?

Contributor Author

This is used when outputting the plan as JSON.
The JSON is sent as part of the summary when running an EXPLAIN query.

Query (with index):

EXPLAIN MATCH (n:l) WHERE 1 < n.p < 10 RETURN n;

Explain query result:

+---------------------------------------------+
| QUERY PLAN                                  |
+---------------------------------------------+
| " * Produce {n}"                            |
| " * ScanAllByLabelPropertyRange (n :l {p})" |
| " * Once"                                   |
+---------------------------------------------+

JSON:

"explain": {
  "input":{
    "input":{
      "name":"Once"
    },
    "label":"l",
    "lower_bound":{
      "type":"exclusive",
      "value":"(ParameterLookup 7)"
    },
    "name":"ScanAllByLabelPropertyRange",
    "output_symbol":"n",
    "property":"p",
    "upper_bound":{
      "type":"exclusive",
      "value":"(ParameterLookup 13)"
    }
  }
}

@@ -184,7 +186,8 @@ void AddExpansionsToMatching(std::vector<Expansion> &expansions, Matching &match
}
}

-auto SplitExpressionOnAnd(Expression *expression) {
+template <typename F1, typename F2>
+void SplitExpressionOnAnd(Expression *expression, F1 &&if_cb, F2 &&else_cb) {
Contributor

There are only two usages. Inline it into both places and refactor them to have simple implementations; the generic callbacks reduce readability.

Contributor Author

Sure
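
One callback-free shape for such a split, sketched here only as an illustration of the reviewer's point and not as the code that ended up in the PR, is to return the conjuncts and let each call site loop over them. `Expr`, `Leaf`, `And`, and `SplitOnAnd` are hypothetical names standing in for the real AST types.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal AST: a node is either an AND of two children or a leaf.
struct Expr {
  std::string text;           // leaf payload, e.g. "n.p > 10"
  std::unique_ptr<Expr> lhs;  // lhs and rhs are both set only for AND nodes
  std::unique_ptr<Expr> rhs;
  bool IsAnd() const { return lhs && rhs; }
};

std::unique_ptr<Expr> Leaf(std::string text) {
  auto e = std::make_unique<Expr>();
  e->text = std::move(text);
  return e;
}

std::unique_ptr<Expr> And(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
  auto e = std::make_unique<Expr>();
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

// Collects the conjuncts of nested ANDs in left-to-right order.
// Each caller decides what to do with them; no callbacks are involved.
std::vector<const Expr *> SplitOnAnd(const Expr *root) {
  std::vector<const Expr *> conjuncts;
  std::vector<const Expr *> pending{root};
  while (!pending.empty()) {
    const Expr *current = pending.back();
    pending.pop_back();
    if (current->IsAnd()) {
      // Push rhs first so that lhs is popped, and therefore emitted, first.
      pending.push_back(current->rhs.get());
      pending.push_back(current->lhs.get());
    } else {
      conjuncts.push_back(current);
    }
  }
  return conjuncts;
}
```

A call site would then iterate over the returned vector and apply its own logic to each conjunct, which keeps the splitting itself small and non-generic.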

@@ -368,6 +369,12 @@ class ReturnBodyContext : public HierarchicalTreeVisitor {

#undef VISIT_BINARY_OPERATOR

bool PostVisit(RangeOperator &op) override {
bool res = op.expr1_->Accept(*this);
res |= op.expr2_->Accept(*this);
Contributor

Does the second Accept need to be visited?
Can we use a short-circuiting || instead?

Contributor Author

We want to check both branches.

Contributor Author

In the expression a < x < b, a and b do not have to be literals and thus need to be checked.
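
The difference between the two forms discussed in this thread can be shown with a small sketch. `Operand` is a hypothetical stand-in for an expression whose `Accept` has a side effect (here, recording the visit); it is not a type from the codebase.

```cpp
// Hypothetical stand-in for an AST operand: Accept records that the operand
// was visited and returns a result, much like a visitor walking a subtree.
struct Operand {
  bool matches;
  bool visited = false;
  bool Accept() {
    visited = true;
    return matches;
  }
};

// Mirrors the pattern in the diff: `|=` is not short-circuiting,
// so both operands are always visited.
bool VisitBoth(Operand &a, Operand &b) {
  bool res = a.Accept();
  res |= b.Accept();
  return res;
}

// Short-circuiting variant: b is skipped whenever a already returned true.
bool VisitShortCircuit(Operand &a, Operand &b) { return a.Accept() || b.Accept(); }
```

With `||`, the second bound of `a < x < b` would go unvisited whenever the first one returned true, which is why the author keeps the non-short-circuiting form.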

sonarcloud bot commented May 23, 2024

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
1.2% Duplication on New Code

See analysis details on SonarCloud

Development

Successfully merging this pull request may close these issues.

ScanAllByLabelPropertyRange does not work with < and > operators
2 participants