Replies: 2 comments
-
Following a review of JOIN functionality, 0.14 will now focus on addressing problems found with 3+ way CROSS JOINs. This is also an opportunity to address other join issues and improvements, such as
-
0.14 has primarily been bug fixes and performance tuning. JOIN improvements require more information about tables, so statistics need to be introduced beforehand. This is starting to show the limitations of Arrow Tables as an internal format. However, we rely on pyarrow for some higher-level functionality for JOINs and for GROUP BY. Lower-level functionality such as column naming isn't significant friction to moving away from Arrow Tables, but rewriting GROUP BY is. Mabel should start writing manifests to storage, and the ANALYZE statement should do the same. Reading manifests may introduce a slow step in planning, but this should be trivially parallelizable, with memcache used as a cache. The same parallelization isn't currently feasible for data reads, because it would quickly read more data than fits in memory in the execution environment.
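A minimal sketch of the manifest-reading idea above, assuming manifests are small JSON documents and using an in-process dict as a stand-in for memcache. The names `read_manifest`, `read_manifests`, `fake_fetch`, and the `rows` field are hypothetical and are not taken from Mabel or Opteryx.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# In-process stand-in for memcache (assumption: a real deployment would use a shared cache).
_cache: Dict[str, dict] = {}


def read_manifest(path: str, fetch: Callable[[str], bytes]) -> dict:
    """Return the parsed manifest for `path`, consulting the cache first."""
    if path not in _cache:
        _cache[path] = json.loads(fetch(path))
    return _cache[path]


def read_manifests(paths: List[str], fetch: Callable[[str], bytes], workers: int = 8) -> List[dict]:
    """Read many small manifests concurrently; results keep the order of `paths`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: read_manifest(p, fetch), paths))


# Hypothetical storage: maps a manifest path to its raw bytes.
fake_storage = {
    "a/manifest.json": b'{"rows": 10}',
    "b/manifest.json": b'{"rows": 25}',
}
fetch_calls: List[str] = []


def fake_fetch(path: str) -> bytes:
    """Record each storage access so cache hits can be observed."""
    fetch_calls.append(path)
    return fake_storage[path]


manifests = read_manifests(list(fake_storage), fake_fetch)
total_rows = sum(m["rows"] for m in manifests)

# A second pass is served from the cache, so storage is not touched again.
manifests_again = read_manifests(list(fake_storage), fake_fetch)
```

Manifests are small, so reading them concurrently is bounded in memory in a way that parallel data reads are not, which is the distinction the comment draws.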
-
Version 0.13 was primarily a bugfix and performance-tuning release following the major refactor of the planner in 0.12. The primary goal for 0.14 is to make preparatory changes for 0.15.
0.12 took approximately six months to release because of the significant amount of change it contained. 0.15 aims to start on the push-based query engine, another major change to the engine. To reduce the amount of change in any one release, the move to a push-based engine is being split over two releases: 0.14 will change how execution nodes are called, from iterating over and yielding generators to performing an action on one morsel per call; 0.15 will then wrap these calls in a push-based engine.
It's anticipated that these two releases may each take longer than the usual monthly release cadence; however, each should be much shorter than the six months 0.12 took.
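The change in how execution nodes are called can be sketched roughly as follows. This is an illustration only, not Opteryx's implementation: `PullFilter`, `CallFilter`, and `push` are hypothetical names, and a morsel is modelled as a plain list of dicts rather than an Arrow Table.

```python
from typing import Callable, Iterable, Iterator, List

# Stand-in for a batch of rows (assumption: the real engine passes Arrow data).
Morsel = List[dict]


class PullFilter:
    """Generator style: the node iterates over its producer and yields morsels."""

    def __init__(self, predicate: Callable[[dict], bool]):
        self.predicate = predicate

    def execute(self, producer: Iterable[Morsel]) -> Iterator[Morsel]:
        for morsel in producer:
            yield [row for row in morsel if self.predicate(row)]


class CallFilter:
    """Per-call style: one call performs the action on exactly one morsel."""

    def __init__(self, predicate: Callable[[dict], bool]):
        self.predicate = predicate

    def __call__(self, morsel: Morsel) -> Morsel:
        return [row for row in morsel if self.predicate(row)]


def push(morsels: Iterable[Morsel], nodes: List[Callable[[Morsel], Morsel]]) -> List[Morsel]:
    """Push-style driver: each morsel is pushed through the chain of nodes in turn."""
    results = []
    for morsel in morsels:
        for node in nodes:
            morsel = node(morsel)
        results.append(morsel)
    return results


def is_even(row: dict) -> bool:
    return row["id"] % 2 == 0


source = [[{"id": 1}, {"id": 2}], [{"id": 3}, {"id": 4}]]

pulled = list(PullFilter(is_even).execute(source))
pushed = push(source, [CallFilter(is_even)])
```

Both styles produce the same morsels here. The difference is who owns the loop: in the generator style each node drives its producer, while in the per-call style the node is a plain callable that an external driver can schedule.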
0.14 also intends to include changes to how IN conditions are planned and executed, with a view to reinstating IN (SUBQUERY) support.