SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
-
Updated
May 21, 2024 - Rust
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.
Apache Ignite 3
YTsaurus is a scalable and fault-tolerant open-source big data platform.
AI + Data, online. https://vespa.ai
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache Ignite
北京交通大学计算机与信息技术学院系统与网络实验室 https://fangvv.github.io/Homepage/
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.
Visual, interactive queries against big databases
An open source time-series database for fast ingest and SQL queries
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Resilient data pipeline framework running on Apache Spark
Scalable, redundant, and distributed object store for Apache Hadoop
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Add a description, image, and links to the big-data topic page so that developers can more easily learn about it.
To associate your repository with the big-data topic, visit your repo's landing page and select "manage topics."