<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>experiment</title>
    <link rel="self" type="application/atom+xml" href="https://links.biapy.com/guest/tags/3445/feed"/>
    <updated>2026-04-19T08:04:08+00:00</updated>
    <id>https://links.biapy.com/guest/tags/3445/feed</id>
    <entry>
        <id>https://links.biapy.com/links/12476</id>
        <title type="text"><![CDATA[Mesh LLM]]></title>
        <link rel="alternate" href="https://github.com/michaelneale/mesh-llm" />
        <link rel="via" type="application/atom+xml" href="https://links.biapy.com/links/12476"/>
        <author>
            <name><![CDATA[Biapy]]></name>
        </author>
        <summary type="text">
            <![CDATA[A reference implementation with llama.cpp, compiled for distributed inference across machines, with a real end-to-end demo.

Mesh LLM lets you pool spare GPU capacity across machines and expose the result as one OpenAI-compatible API.

If a model fits on one machine, it runs there. If it does not, Mesh LLM automatically spreads the work across the mesh.
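
Since the mesh exposes an OpenAI-compatible API, any standard OpenAI-style client should be able to talk to it. A minimal sketch, assuming the mesh endpoint is reachable at http://localhost:8000/v1 and serves a model named "llama" (the address and model name are hypothetical placeholders, not values from the project):

```python
import requests

# Point at the mesh's OpenAI-compatible endpoint.
# NOTE: base URL and model name are assumptions for illustration;
# check the mesh-llm README for the actual values.
BASE_URL = "http://localhost:8000/v1"

response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "llama",
        "messages": [
            {"role": "user", "content": "Say hello from the mesh."}
        ],
    },
    timeout=60,
)
response.raise_for_status()

# Standard OpenAI chat-completion response shape.
print(response.json()["choices"][0]["message"]["content"])
```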

Related content:

- [Episode 661: Sink Your Claws In @ Linux Unplugged](https://linuxunplugged.com/661).]]>
        </summary>
        <updated>2026-04-09T06:08:02+00:00</updated>
    </entry>
</feed>
