Scalable datastore for metrics, events, and real-time analytics
datafusion
🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.
Production-grade Rust-native trading engine with deterministic event-driven architecture
📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics
Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x vs Airflow). Open-source alternative to Retool and Temporal.
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Visualize, query, and stream to train on multimodal robotics data.
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
Simple, Elastic-quality search for Postgres
Apache DataFusion SQL Query Engine
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.
Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, Readyset caches the results of cached SELECT statements and incrementally updates these results over time as the underlying data changes.
One SQL interface over APIs, files, and live sources — built for agents.
Restate is the platform for building resilient applications that tolerate all infrastructure faults without the need for a PhD.
A native Rust library for Delta Lake, with bindings into Python
An extensible, state-of-the-art framework for columnar compression, and the fastest FOSS columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LF AI & Data, part of the Linux Foundation.
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
Fastest library to load data from DB to DataFrames in Rust and Python
Parseable is an observability datalake built from first principles.
Apache DataFusion Ballista Distributed Query Engine
The Feldera Incremental Computation Engine
Apache Kafka® compatible broker with S3, PostgreSQL, SQLite, Apache Iceberg and Delta Lake
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing.
A cloud-native open source distributed time series database with high performance, high compression ratio and high availability.
Apache Iceberg
PyTorch Single Controller
Version control system for AI agents
Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust
Analytical database for data-driven Web applications 🪶
A single-node analytical database engine with geospatial as a first-class citizen
Protocol and libraries for sending and receiving OpenTelemetry data using Apache Arrow
Lakehouse native graph engine with git-style workflows
A SQL transformation engine that type-checks your whole pipeline and catches breaking changes before they run — branches, replay, column-level lineage, compile-time contracts, per-model cost. Adapters: Databricks, Snowflake, BigQuery, DuckDB. Single static Rust binary. Apache 2.0.
View parquet files online
Batteries included CLI, TUI, and server implementations for DataFusion.
Apache Paimon Rust: the Rust implementation of Apache Paimon.
A time-series database created for events, logs, traces and metrics. Speaks the Postgres dialect and stores data in S3 via the Delta Lake protocol.
Real-time data processing/feature engineering tailored for modern AI/ML systems.
Run Graph Queries with Lance
Postgres protocol frontend for DataFusion
A Rust-native DuckLake engine built on Apache DataFusion
Rust-based high-performance Apache Uniffle shuffle server
JSON support for DataFusion (unofficial)
Geometry and Geography Support for Apache DataFusion
Scalable Observability
Drop-in Acceleration of SQL/PromQL queries
A benchmark for assessing geospatial SQL analytics query performance across database systems