You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
No change to language semantics , but consider some implementation improvements to help with maintainability & delegation.
[1] Comment the implementation
[2] In Summarize, line 1446, we do a uniqueness check: var key = keyRecord.ToExpression();
This is not ideal. We really shouldn't call ToExpression at runtime like this - I'm wondering if there are cases it could be unpredictable.
This overlaps with Distinct. It would be good to compare how distinct does it.
[3] Consider a "Plan" design pattern.
At the source-level, grouping columns are identifiers.
Aggregate columns are 'As' nodes.
At the IR level, we pass all this through a formulaValue[].
grouping columns are encoded as StringValues
aggregate columns are encoded as LambdaFormulaValue around a LazyEvalNode around a RecordNode , where that Record has 1 field, where the field's Name is the 'As' , and the field's vakue is the lambda to run the expression. (phew, this is starting to get complex!)
This is kind of "hidden" contract between IR and interpreter - it's kind of arbitrary. But Delegation needs to be aware of this.
This is a pretty simple contract, so it's not a big deal. But it's super brittle- there's a log of arbitrary choices in here on encoding, and if we ever changed it, we'd break delegation.
Whereas a "Plan" object would make it an explicit contract. Ie, imagine if the IR captures all of the key information in some structured C# class that it passed through.
Join has a similar thing as is more brittle.
A "plan" object, might look like:
// Eg: Summarize(t1, SalesDate, Sum(ThisGroup.Amount) As Total);
// Binder has guaranteed uniqueness of all column names
class SummarizePlan
{
// The columns used to group. Ie "SalesDate".
// This is unordered.
IEnumerable<string> GroupingColumnNames {get;set; }
// The aggregate columns computed on each group.
// This takes in a 'ThisGroup' identifier specified via rule scope.
// Key is column name (eg, "Total")
// Value is the IR to run it, (eg, Sum(ThisGroup.Amount))
Dict<string, IRNode> AggregateColumns {get;set; }
// Possibly extra FormulaType info if needed.
}
The text was updated successfully, but these errors were encountered:
MikeStall
changed the title
Consider Summarize improvements
Consider Summarize implementation improvements
Dec 17, 2024
No change to language semantics , but consider some implementation improvements to help with maintainability & delegation.
[1] Comment the implementation
[2] In Summarize, line 1446, we do a uniqueness check:
var key = keyRecord.ToExpression();
This is not ideal. We really shouldn't call ToExpression at runtime like this - I'm wondering if there are cases it could be unpredictable.
This overlaps with Distinct. It would be good to compare how distinct does it.
[3] Consider a "Plan" design pattern.
At the source-level, grouping columns are identifiers.
Aggregate columns are 'As' nodes.
At the IR level, we pass all this through a formulaValue[].
grouping columns are encoded as StringValues
aggregate columns are encoded as LambdaFormulaValue around a LazyEvalNode around a RecordNode , where that Record has 1 field, where the field's Name is the 'As' , and the field's vakue is the lambda to run the expression. (phew, this is starting to get complex!)
This is kind of "hidden" contract between IR and interpreter - it's kind of arbitrary. But Delegation needs to be aware of this.
This is a pretty simple contract, so it's not a big deal. But it's super brittle- there's a log of arbitrary choices in here on encoding, and if we ever changed it, we'd break delegation.
Whereas a "Plan" object would make it an explicit contract. Ie, imagine if the IR captures all of the key information in some structured C# class that it passed through.
Join has a similar thing as is more brittle.
A "plan" object, might look like:
The text was updated successfully, but these errors were encountered: