apache-iceberg
Your data lakehouse, built like software.
Bauplan is a cloud-native lakehouse platform for engineering teams who treat data like software. Ship pipelines without managing infrastructure, using a specialized Python runtime, Git-for-Data built on Apache Iceberg, and just a few simple APIs.
Real-Time Event Streaming Platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management.
RisingWave is a real-time event streaming platform designed to offer the simplest and most cost-effective way to process, analyze, and manage real-time event data — with built-in support for the Apache Iceberg™ open table format. It provides both a Postgres-compatible SQL interface and a DataFrame-style Python interface.
RisingWave can ingest millions of events per second, continuously join and analyze live streams with historical data, serve ad-hoc queries at low latency, and persist fresh, consistent results to Apache Iceberg™ or any other downstream system.
Transactional Catalog for Data Lakes with Git-like semantics.
Nessie supports Iceberg Tables/Views. Additionally, Nessie is focused on working with the widest range of tools possible, which can be seen in the feature matrix.
a data lakehouse for you and me. managed + real-time Iceberg.
🥮 is real-time + managed Apache Iceberg. bringing open analytical tables on object store to every team.
pg_mooncake is a ClickHouse alternative for real-time analytics built on Postgres. It turns Postgres into a real-time analytics database by adding:
Columnar storage (Apache Iceberg, via Moonlink)
Vectorized execution with DuckDB (via pg_duckdb).
Fast analytics queries require both columnar storage & vectorized execution, and previous Postgres analytics solutions only solved half the problem.
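The claim that analytics needs both halves can be illustrated with a small, library-free Python sketch. It is not pg_mooncake, Moonlink, or DuckDB code; the sample rows and the helper names (`to_columns`, `sum_row_at_a_time`, `sum_in_batches`) are hypothetical and only model the two ideas: a columnar layout keeps each field contiguous, and vectorized execution processes values a batch at a time instead of one row at a time.

```python
# Toy model only: real engines use compressed column files and SIMD-friendly
# batches, not Python lists. All names and data here are illustrative.

rows = [
    {"user_id": 1, "country": "DE", "amount": 10.0},
    {"user_id": 2, "country": "US", "amount": 25.5},
    {"user_id": 3, "country": "DE", "amount": 4.5},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per field (columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_at_a_time(rows, field):
    """Row-oriented aggregation: every full record is visited to read one field."""
    total = 0.0
    for row in rows:
        total += row[field]
    return total


def sum_in_batches(column, batch_size=2):
    """Batch-at-a-time aggregation over a single column (the vectorized idea)."""
    total = 0.0
    for start in range(0, len(column), batch_size):
        total += sum(column[start:start + batch_size])
    return total


columns = to_columns(rows)
row_total = sum_row_at_a_time(rows, "amount")
col_total = sum_in_batches(columns["amount"])
```

Both functions return the same total. The difference is in what they touch: the columnar version reads only the `amount` values, and it hands them to the aggregation in batches, which is the access pattern that columnar files and vectorized engines are designed around.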
The fastest way to replicate your database data into a data lake. OLake speeds up data replication by parallelizing full loads, leveraging change streams for real-time sync, and pulling data in a database-native format for efficient ingestion.
Fastest open-source tool for replicating databases to Apache Iceberg or a data lakehouse. ⚡ Efficient, quick, and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, and MySQL.
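As a rough illustration of what parallelizing a full load can look like, here is a minimal Python sketch. It is not OLake's implementation or API. It assumes a table with a dense integer key, splits the key range into chunks, and copies the chunks concurrently; the names `split_into_chunks`, `load_chunk`, and `parallel_full_load` are hypothetical, and an in-memory list stands in for the source database.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(min_id, max_id, chunk_size):
    """Split the inclusive key range [min_id, max_id] into contiguous chunks."""
    chunks = []
    start = min_id
    while start <= max_id:
        end = min(start + chunk_size - 1, max_id)
        chunks.append((start, end))
        start = end + 1
    return chunks


def load_chunk(source_rows, bounds):
    """Stand-in for a bounded read such as: WHERE id BETWEEN lo AND hi."""
    lo, hi = bounds
    return [row for row in source_rows if lo <= row["id"] <= hi]


def parallel_full_load(source_rows, min_id, max_id, chunk_size=250, workers=4):
    """Read all chunks concurrently and return the rows in chunk order."""
    chunks = split_into_chunks(min_id, max_id, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        parts = list(pool.map(lambda bounds: load_chunk(source_rows, bounds), chunks))
    return [row for part in parts for row in part]


# Hypothetical source table with ids 1..1000.
source = [{"id": i, "value": i * i} for i in range(1, 1001)]
copied = parallel_full_load(source, min_id=1, max_id=1000)
```

Because the chunks cover disjoint key ranges, each worker can read independently, and the results can be concatenated without deduplication. A real replicator would also need to handle sparse or non-integer keys and rows that change during the load, which is where the change-stream sync mentioned above comes in.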
Postgres with Iceberg and data lake access.
pg_lake integrates Iceberg and data lake files into Postgres. With the pg_lake extensions, you can use Postgres as a stand-alone lakehouse system that supports transactions and fast queries on Iceberg tables, and can directly work with raw data files in object stores like S3.
JavaScript Iceberg Client.
Icebird is a library for reading Apache Iceberg tables in JavaScript. It is built on top of hyparquet for reading the underlying parquet files.