Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

tokenizers crossbeam-queue libm

⑂ 696◎ 1.1k

greptimedb ★ 6.3k active

db grpc

The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.

humantime-serde arrow-array crossbeam-utils

⑂ 495◎ 202

materialize ★ 6.3k active

grpc axum

The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL

reqwest-middleware maplit aho-corasick

⑂ 507◎ 593

Daft ★ 5.6k active

grpc axum

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

reqwest-middleware aho-corasick arrow-array

⑂ 484◎ 324

arrow-rs ★ 3.5k active

grpc tokio

Official Rust implementation of Apache Arrow

arrow-array lz4_flex brotli

⑂ 1.2k◎ 759

delta-rs ★ 3.2k active

A native Rust library for Delta Lake, with bindings into Python

reqwest-middleware maplit arrow-array

⑂ 624◎ 193

vortex ★ 3.0k active

wasm cli

An extensible, state-of-the-art framework for columnar compression, and the fastest FOSS columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LF AI & Data, part of the Linux Foundation.

aho-corasick arrow-array tabled

⑂ 165◎ 283

spiceai ★ 3.0k active

ai wasm

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

tokenizers arrow-array wat

⑂ 199◎ 379

parseable ★ 2.4k active

grpc cli

Parseable is an observability datalake built from first principles.

humantime-serde arrow-array fs_extra

⑂ 165◎ 31

tv ★ 2.2k dormant

cli

📺 (tv) Tidy Viewer is a cross-platform CLI CSV pretty printer that uses column styling to maximize viewer enjoyment.

parquet arrow owo-colors

⑂ 40◎ 29

tonbo ★ 1.6k active

wasm

Tonbo is an embedded database for serverless and edge runtimes.

arrow-array arrow-schema parquet

⑂ 98◎ 19

Raphtory ★ 621 active

ai axum

Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust

minijinja arrow-array sqlparser

⑂ 69◎ 164

sedona-db ★ 465 active

A single-node analytical database engine with geospatial as a first-class citizen

arrow-array approx aws-credential-types

⑂ 53◎ 111

geoarrow-rs ★ 410 active

wasm

GeoArrow in Rust, Python, and JavaScript (WebAssembly) with vectorized geometry operations

arrow-array approx arrow-schema

⑂ 46◎ 83

otel-arrow ★ 357 active

grpc cli

Protocol and libraries for sending and receiving OpenTelemetry data using Apache Arrow

humantime-serde minijinja data-encoding

⑂ 107◎ 299

omnigraph ★ 289 active

axum tokio

Lakehouse native graph engine with git-style workflows

arrow-array arrow-schema color-eyre

⑂ 24◎ 9

timefusion ★ 170 active

db grpc

A time-series database created for events, logs, traces, and metrics. Speaks the Postgres dialect and stores data in S3 via the Delta Lake protocol.

aws-sdk-s3 arrow-schema fastrand

⑂ 8◎ 5

swanlake ★ 150 active

grpc axum

DuckLake took Flight. Welcome to SwanLake.

arrow-array sqlparser native-tls

⑂ 17◎ 19

nanograph ★ 148 active

cli

On-device property graph database. Schema-as-code. One CLI → One Folder. No Server. Think: DuckDB for graphs.

napi arrow-array arrow-schema

⑂ 12

zerobus-sdk ★ 78 active

grpc tokio

Databricks's Zerobus Ingest SDKs

napi arrow-array arrow-schema

⑂ 18◎ 57

lance-context ★ 70 active

axum

Manage Multimodal Agentic Context Lifecycle with Lance

arrow-array arrow-schema pyo3

⑂ 11◎ 3

tangent ★ 61 dormant

wasm db

High-performance, DSL-free stream processing

handlebars aws-sdk-s3 fs_extra

⑂ 3◎ 2

micromegas ★ 46 active

wasm db

Scalable Observability

axum-extra rsa object_store

⑂ 6◎ 32

laminardb ★ 37 active

ai db

Open-source streaming SQL engine written in Rust using Apache Arrow and DataFusion. Supports continuous queries, temporal stream joins, tumbling/session windows, and CDC/Kafka connectors. Lightweight and embeddable, with sub-microsecond latency.

tokenizers humantime-serde arrow-array

⑂ 4◎ 25

uni-db ★ 37 active

ai wasm

Uni is a modern, embedded database that combines property graph (OpenCypher), vector search, and columnar storage (Lance) into a single, cohesive engine. It is designed for applications requiring local, fast, and multimodal data access, backed by object storage (S3/GCS) durability.

tokenizers crossbeam-queue arrow-array

⑂ 4◎ 7

KalamDB ★ 30 active

wasm grpc

KalamDB — a lightweight, real-time, storage-efficient SQL database. Designed for per-user data isolation and scalable performance — ideal for the AI era.

rocksdb sqlparser rpassword

⑂ 1◎ 9

arrow-wasm ★ 28 maintenance

wasm

Building block library for using Apache Arrow in Rust WebAssembly modules.

arrow-array arrow-schema serde-wasm-bindgen

⑂ 7◎ 11

greptimedb-ingester-rust ★ 24 active

grpc tokio

A Rust ingester for GreptimeDB, which is compatible with GreptimeDB protocol and lightweight.

arrow-array derive_builder arrow-schema

⑂ 11◎ 6

orbit-knowledge-graph ★ 21 active

grpc axum

Orbit, aka the GitLab Knowledge Graph, is a project that aims to provide a unified context API for AI systems and human users. This project has both a local Knowledge Graph for your code and a backend service for the entire SDLC.

tree-sitter sqlparser const_format

⑂ 6◎ 2
