-
Thank you for initiating this discussion. We (the Databend Core Team) have reviewed the paper "Exploiting Cloud Object Storage for High-Performance Analytics" and analyzed the AnyBlob code: https://github.com/durner/AnyBlob. It would be valuable to hear @Xuanwo's thoughts on this, since he is the OpenDAL maintainer.
-
Okay, I got it. Thanks a lot!
-
I recently came across a fascinating paper titled 'Exploiting Cloud Object Storage for High-Performance Analytics' by the Umbra team, presented at VLDB 2023.
It got me thinking about Databend, which, as far as I understand, is an OLAP database built on cloud object storage that uses both memory and disk caching to speed up I/O. It also seems to have a scheduler based on Morsel-Driven Parallelism. Databend uses something like AnyBlob, namely OpenDAL, to provide a consistent interface across different cloud storage services, though AnyBlob is not 'free'.
Considering these similarities, I'm curious whether the Databend team has considered some of the key ideas from the Umbra paper, or whether you've already implemented some of these features. It feels like there's a structural resemblance between Databend and Umbra, and I wonder how these insights might be integrated into Databend's architecture.