data-lake
Your data lakehouse, built like software.
Bauplan is a cloud-native lakehouse platform for engineering teams who treat data like software. Ship pipelines without managing infrastructure, using a specialized Python runtime, Git-for-Data built on Apache Iceberg, and just a few simple APIs.
Related contents:
DuckLake is an integrated data lake and catalog format
DuckLake delivers advanced data lake features without traditional lakehouse complexity by using Parquet files and your SQL database. It's an open, standalone format from the DuckDB team.
DuckLake is an open Lakehouse format that is built on SQL and Parquet. DuckLake stores metadata in a catalog database, and stores data in Parquet files. The DuckLake extension allows DuckDB to directly read and write data from DuckLake.
AI data platform.
From data warehouse to autonomous data and AI platform
BigQuery is the autonomous data to AI platform, automating the entire data life cycle, from ingestion to AI-driven insights, so you can go from data to AI to action faster.
Gemini in BigQuery features are now included in BigQuery pricing models.
Related contents:
Fastest way to Replicate your Database data in Data Lake. OLake makes data replication faster by parallelizing full loads, leveraging change streams for real-time sync, and pulling data in a database-native format for efficient ingestion.
Fastest open-source tool for replicating Databases to Apache Iceberg or Data Lakehouse. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres, MongoDB and MySQL
Related contents:
A unified metadata lake across all your sources, formats, cloud providers, and regions in a federated architecture. World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.
An Open Source Data Lake Platform.
Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics.