apache-iceberg
Transactional Catalog for Data Lakes with Git-like semantics.
Nessie supports Iceberg Tables/Views. Additionally, Nessie is focused on working with the widest range of tools possible, which can be seen in the feature matrix.
a data lakehouse for you and me. managed + real-time Iceberg.
🥮 is real-time + managed Apache Iceberg. bringing open analytical tables on object store to every team.
pg_mooncake is a ClickHouse alternative for real-time analytics built on Postgres. It turns Postgres into a real-time analytics database by adding:
Columnar storage (Apache Iceberg, via Moonlink)
Vectorized execution with DuckDB (via pg_duckdb).
Fast analytics queries require both columnar storage & vectorized execution, and previous Postgres analytics solutions only solved half the problem.
JavaScript Iceberg Client.
Icebird is a library for reading Apache Iceberg tables in JavaScript. It is built on top of hyparquet for reading the underlying parquet files.
Postgres with Iceberg and data lake access.
pg_lake integrates Iceberg and data lake files into Postgres. With the pg_lake extensions, you can use Postgres as a stand-alone lakehouse system that supports transactions and fast queries on Iceberg tables, and can directly work with raw data files in object stores like S3.
Real-Time Event Streaming Platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management.
RisingWave is a real-time event streaming platform designed to offer the simplest and most cost-effective way to process, analyze, and manage real-time event data — with built-in support for the Apache Iceberg™ open table format. It provides both a Postgres-compatible SQL interface and a DataFrame-style Python interface.
RisingWave can ingest millions of events per second, continuously join and analyze live streams with historical data, serve ad-hoc queries at low latency, and persist fresh, consistent results to Apache Iceberg™ or any other downstream system.
Your data lakehouse, built like software.
Bauplan is a cloud-native lakehouse platform for engineering teams who treat data like software. Ship pipelines without managing infrastructure, using a specialized Python runtime, Git-for-Data built on Apache Iceberg, and just a few simple APIs.
Fastest way to replicate your database data into a data lake. OLake makes data replication faster by parallelizing full loads, leveraging change streams for real-time sync, and pulling data in a database-native format for efficient ingestion.
Fastest open-source tool for replicating databases to Apache Iceberg or a data lakehouse. ⚡ Efficient, quick, and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, and MySQL.