<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>columnar</title>
    <link rel="self" type="application/atom+xml" href="https://links.biapy.com/guest/tags/1133/feed"/>
    <updated>2026-04-29T08:43:37+00:00</updated>
    <id>https://links.biapy.com/guest/tags/1133/feed</id>
            <entry>
            <id>https://links.biapy.com/links/11775</id>
            <title type="text"><![CDATA[Parquet]]></title>
            <link rel="alternate" href="https://parquet.apache.org/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/11775"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
<![CDATA[Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high-performance compression and encoding schemes to handle complex data in bulk, and is supported by many programming languages and analytics tools.

- [Apache Parquet Format @ GitHub](https://github.com/apache/parquet-format).

Related contents:

- [Building Your Modern Data Analytics Stack with Python, Parquet, and DuckDB @ KD nuggets](https://www.kdnuggets.com/building-your-modern-data-analytics-stack-with-python-parquet-and-duckdb).]]>
            </summary>
            <updated>2026-02-11T10:19:02+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/11104</id>
            <title type="text"><![CDATA[Apache TsFile]]></title>
            <link rel="alternate" href="https://tsfile.apache.org/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/11104"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
                <![CDATA[File Format for Internet of Things

TsFile is a columnar storage file format designed for time series data. It supports efficient compression, high read and write throughput, and compatibility with frameworks such as Spark and Flink, making it easy to integrate into IoT big data processing pipelines.

- [Apache TsFile @ GitHub](https://github.com/apache/tsfile).]]>
            </summary>
            <updated>2025-11-26T12:41:09+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/10697</id>
            <title type="text"><![CDATA[Lance]]></title>
            <link rel="alternate" href="https://lancedb.github.io/lance/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/10697"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
<![CDATA[Modern columnar data format for ML and LLMs, implemented in Rust. Convert from Parquet in two lines of code for 100x faster random access, vector indexing, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

Lance is a modern columnar data format optimized for machine learning and AI applications. It efficiently handles diverse multimodal data types while providing high-performance querying and versioning capabilities.

- [Lance @ GitHub](https://github.com/lancedb/lance).

Related contents:

- [Lance takes aim at Parquet in file format joust @ The Register](https://www.theregister.com/2025/10/14/lance_parquet/).]]>
            </summary>
            <updated>2025-10-17T12:01:52+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/1339</id>
            <title type="text"><![CDATA[pg_mooncake]]></title>
            <link rel="alternate" href="https://github.com/Mooncake-Labs/pg_mooncake" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/1339"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
<![CDATA[1000x Faster Analytics in Postgres. Postgres-native Data Warehouse.

pg_mooncake is a Postgres extension that adds columnar storage and vectorized execution (DuckDB) for fast analytics within Postgres. Postgres + pg_mooncake ranks among the top 10 fastest in ClickBench.

Related contents:

- [Postgres Is the Gateway Drug @ Vignesh Ravichandran](https://viggy28.dev/article/postgres-gateway-drug/).]]>
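A minimal sketch of how the extension is used, based on the project README (the `columnstore` access method name should be checked against the installed version; table and column names are made up):

```sql
-- Illustrative only; verify syntax against your pg_mooncake version.
CREATE EXTENSION pg_mooncake;

-- Columnstore tables live alongside regular heap tables
-- and are queried with plain SQL:
CREATE TABLE metrics (ts timestamptz, value double precision)
USING columnstore;
```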
            </summary>
            <updated>2026-03-23T16:35:46+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/2586</id>
            <title type="text"><![CDATA[Apache Arrow]]></title>
            <link rel="alternate" href="https://arrow.apache.org/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/2586"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
                <![CDATA[The universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.

Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.

- [Apache Arrow @ GitHub](https://github.com/apache/arrow/).
- [arrow-rs @ GitHub](https://github.com/apache/arrow-rs).

Related contents:

- [Fast columnar JSON decoding with arrow-rs @ arroyo](https://www.arroyo.dev/blog/fast-arrow-json-decoding).
- [I spent 6 hours learning Apache Arrow: Overview @ Data Engineer Things' Medium](https://blog.det.life/i-spent-6-hours-learning-apache-arrow-overview-e7f3b8ee85b2).]]>
            </summary>
            <updated>2025-08-28T23:07:06+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/2931</id>
            <title type="text"><![CDATA[BemiDB]]></title>
            <link rel="alternate" href="https://bemidb.com/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/2931"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
                <![CDATA[Zero-ETL data analytics with Postgres.

Simple and cost-effective cloud analytics platform automatically synced with your data sources.

BemiDB is a Postgres read replica optimized for analytics. It consists of a single binary that seamlessly connects to a Postgres database, replicates the data in a compressed columnar format, and allows you to run complex queries using its Postgres-compatible analytical query engine.

- [BemiDB @ GitHub](https://github.com/BemiHQ/BemiDB).]]>
            </summary>
            <updated>2025-08-29T00:06:42+00:00</updated>
        </entry>
            <entry>
            <id>https://links.biapy.com/links/5505</id>
            <title type="text"><![CDATA[ClickHouse]]></title>
            <link rel="alternate" href="https://clickhouse.com/" />
            <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/5505"/>
            <author>
                <name><![CDATA[Biapy]]></name>
            </author>
            <summary type="text">
                <![CDATA[Fast Open-Source OLAP DBMS.

ClickHouse® is an open-source column-oriented database management system that allows generating analytical data reports in real-time.

- [ClickHouse @ GitHub](https://github.com/ClickHouse/ClickHouse).

Related contents:

- [Altinity Kubernetes Operator for ClickHouse @ GitHub](https://github.com/Altinity/clickhouse-operator).
- [ClickHouse on Kubernetes @ Sr. Data Engineer](https://blog.duyet.net/2024/03/clickhouse-on-kubernetes/).
- [Inside ClickHouse full-text search: fast, native, and columnar @ ClickHouse](https://clickhouse.com/blog/clickhouse-full-text-search).
- [How we made ClickHouse log queries 99.5% faster with resource fingerprinting @ SigNoz](https://signoz.io/blog/query-performance-improvement/).
- [From Millions to Billions @ geocodio](https://www.geocod.io/code-and-coordinates/2025-10-02-from-millions-to-billions/).
- [The KFC Architecture Blueprint: Kafka, Flink, and ClickHouse @ Big Data Boutique](https://bigdataboutique.com/blog/kfc-architecture-blueprint-kafka-flink-and-clickhouse).
- [How we give every user SQL access to a shared ClickHouse cluster @ Trigger.dev](https://trigger.dev/blog/how-trql-works).]]>
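As a brief illustration of the column-oriented model (database object names below are made up, not taken from the ClickHouse docs):

```sql
-- Illustrative schema; MergeTree is ClickHouse's primary
-- column-oriented table engine.
CREATE TABLE page_views
(
    event_date Date,
    url        String,
    views      UInt64
)
ENGINE = MergeTree
ORDER BY (event_date, url);

-- Aggregations read only the referenced columns from disk.
SELECT url, sum(views) AS total
FROM page_views
GROUP BY url
ORDER BY total DESC
LIMIT 10;
```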
            </summary>
            <updated>2026-03-23T14:51:58+00:00</updated>
        </entry>
</feed>
