columnar
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Lance is a modern columnar data format optimized for machine learning and AI applications. It efficiently handles diverse multimodal data types while providing high-performance querying and versioning capabilities.
1000x Faster Analytics in Postgres. Postgres-native Data Warehouse.
pg_mooncake is a Postgres extension that adds columnar storage and vectorized execution (DuckDB) for fast analytics within Postgres. Postgres + pg_mooncake ranks among the top 10 fastest in ClickBench.
The universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.
Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.
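To make the idea of a columnar layout with copy-free reads concrete, here is a minimal sketch using only the Python standard library. It is not Arrow's API or its actual memory format: the `array` module and `memoryview` stand in for typed column buffers and zero-copy views, and the sample records are invented for illustration.

```python
from array import array

# Row-oriented layout: every record keeps its fields together.
# (Hypothetical sample data, used only for this illustration.)
rows = [
    (1, 9.5),
    (2, 7.25),
    (3, 8.0),
    (4, 6.75),
]

# Column-oriented layout: each field lives in one contiguous typed buffer.
ids = array("q", (r[0] for r in rows))     # 64-bit signed integers
scores = array("d", (r[1] for r in rows))  # 64-bit floats

# An aggregate over one column touches only that column's buffer.
total_score = sum(scores)

# memoryview exposes the buffer without copying it; slicing the view
# produces another view over the same memory, not a new buffer.
view = memoryview(scores)
tail = view[2:]

# Writing through the original array is visible through the view,
# which shows that the slice shares memory with the column.
scores[3] = 10.0
tail_last = tail[1]

print(total_score, tail_last)
```

The same principle is what lets a columnar format hand a consumer a slice of a column without serializing or duplicating it; a real Arrow implementation adds a standardized buffer layout, validity bitmaps, and nested types on top.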
Zero-ETL data analytics with Postgres.
Simple and cost-effective cloud analytics platform automatically synced with your data sources.
BemiDB is a Postgres read replica optimized for analytics. It consists of a single binary that seamlessly connects to a Postgres database, replicates the data in a compressed columnar format, and allows you to run complex queries using its Postgres-compatible analytical query engine.
Fast Open-Source OLAP DBMS.
ClickHouse® is an open-source column-oriented database management system for generating analytical data reports in real time.