The Privacy-First Alternative to Ollama.
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
Shimmy is a 5.1 MB single binary that provides 100% OpenAI-compatible endpoints for GGUF models. Point your existing AI tools at Shimmy and they just work: locally, privately, and for free.
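Because the endpoints mirror OpenAI's wire format, the stock client libraries work with only the base URL changed. A minimal sketch with the Python openai package; the port and model name below are assumptions, so check Shimmy's startup output for the real values:

```python
from openai import OpenAI

# Shimmy exposes OpenAI-compatible routes; only base_url changes.
# Port 11435 and the model name are assumptions, not guaranteed defaults.
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="phi-3-mini",  # any GGUF model Shimmy has auto-discovered
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(resp.choices[0].message.content)
```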
A free and open documentation platform built with Laravel and Filament, enhanced by Ollama for local AI features, focused on clarity, structure, and self-hosted simplicity.
Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google
It offers a unified interface for integrating AI models from providers like Google, OpenAI, Anthropic, Ollama, and more. Rapidly build and deploy production-ready chatbots, automations, and recommendation systems using streamlined APIs for multimodal content, structured outputs, tool calling, and agentic workflows.
LocalSite is a 100% local web development platform powered by Ollama. Generate modern, responsive websites using AI models running directly on your machine.
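Under the hood, this pattern reduces to prompting a local Ollama model for a complete HTML document. A rough sketch of the idea (the model name and prompt are illustrative, not LocalSite's actual internals):

```python
import requests

r = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    json={
        "model": "llama3.2",  # illustrative; use whatever model you have pulled
        "prompt": "Generate a complete, responsive single-page HTML portfolio "
                  "site with inline CSS. Output only the HTML.",
        "stream": False,
    },
    timeout=600,
)
with open("index.html", "w") as f:
    f.write(r.json()["response"])
```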
Advanced LLM-powered brute-force tool combining AI intelligence with automated login attacks.
RamaLama strives to make working with AI simple, straightforward, and familiar by using OCI containers.
RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.
Chat UI for Coderunner.
coderunner-ui is a local‑first AI workspace that lets you:
- Chat with local or remote LLMs
- Run generated code inside a fully isolated Apple Container VM
- Browse the web and automate tasks via a built‑in headless browser (Playwright)

All without sending your data to the cloud.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI.
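Since both models ship as open weights, they can be served by the usual local runtimes. A quick sketch using the ollama Python package, assuming the 20B variant is available under the gpt-oss:20b tag:

```python
import ollama  # assumes `ollama pull gpt-oss:20b` has already been run

resp = ollama.chat(
    model="gpt-oss:20b",  # tag name is an assumption; check your local model list
    messages=[{"role": "user", "content": "Summarize what open-weight means in one sentence."}],
)
print(resp["message"]["content"])
```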
Fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models.
Open Codex is a fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models like phi-4-mini.
Your agentic CLI developer.
Sidekick is an agentic CLI-based AI tool inspired by Claude Code, Copilot, Windsurf and Cursor. It's meant to be an open source alternative to these tools, providing a similar experience but with the flexibility of using different LLM providers while keeping the agentic workflow.
Terminal-based AI coding tool that can use any model that supports the OpenAI-style API.
Ollama Automated Security Intelligence Scanner.
🛡️ An AI-powered security auditing tool that leverages Ollama models to detect and analyze potential security vulnerabilities in your code.
Advanced code security analysis through the power of AI
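The core loop of such a scanner is small: read source, wrap it in an auditor prompt, send it to a local model. A hedged sketch (file path, model, and prompt are illustrative, not the tool's actual internals):

```python
import pathlib
import ollama

source = pathlib.Path("app/auth.py").read_text()  # illustrative target file

report = ollama.chat(
    model="llama3.2",  # illustrative model choice
    messages=[
        {"role": "system", "content": (
            "You are a code security auditor. List potential vulnerabilities "
            "with line references, severity, and a suggested fix."
        )},
        {"role": "user", "content": source},
    ],
)
print(report["message"]["content"])
```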
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
Nut is an open source fork of Bolt.new for helping you develop full stack apps using AI. AI developers frequently struggle with fixing even simple bugs when they don't know the cause, and get stuck making ineffective changes over and over. We want to crack these tough nuts, so to speak, so you can get back to building.
A command-line Ollama client for scripting.
The Ollama function caller, otherwise known as ofc, is a command-line tool for prompting Ollama models locally on your system. There are other programs out there that do similar things, but they either don't support streaming or don't give me access to important settings, like context length or temperature.
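For comparison, here is roughly what such a client has to do with the ollama Python package: stream tokens as they arrive and pass temperature and context length through as options. A minimal sketch, not ofc's actual implementation:

```python
import ollama

stream = ollama.generate(
    model="llama3.2",  # illustrative model
    prompt="Summarize RFC 2324 in two sentences.",
    stream=True,  # print tokens as they arrive instead of waiting for the full reply
    options={"temperature": 0.2, "num_ctx": 8192},  # the settings the blurb calls out
)
for chunk in stream:
    print(chunk["response"], end="", flush=True)
print()
```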
LLM playground to experiment with local models and build fine-tuning datasets and benchmarks.
A playground that gives you full control over the contents of a chat conversation: add, remove and edit messages (system, user and assistant) and shape the flow of the conversation to be exactly what you need.
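What makes this possible is that a chat is just a list of role-tagged messages, so editing the conversation means editing plain data before each run. A small sketch of the idea against a local Ollama model (model name is illustrative):

```python
import ollama

# A conversation is plain data: add, remove, or rewrite any message before the next run.
messages = [
    {"role": "system", "content": "You are a terse SQL tutor."},
    {"role": "user", "content": "What does a LEFT JOIN return?"},
    {"role": "assistant", "content": "All left rows, plus matched right rows or NULLs."},
    {"role": "user", "content": "Show a one-line example."},
]
messages[0]["content"] = "You are a verbose SQL tutor."  # reshape the flow mid-conversation

resp = ollama.chat(model="llama3.2", messages=messages)
print(resp["message"]["content"])
```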
Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster performance, leveraging tensor parallelism and high-speed synchronization over Ethernet.
Supports Linux, macOS, and Windows. Optimized for ARM and x86_64 AVX2 CPUs.
LLM inference in C/C++.
The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.
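The "minimal setup" claim holds up in practice: via the llama-cpp-python bindings, a GGUF model is two calls away (the model path below is illustrative):

```python
from llama_cpp import Llama  # Python bindings over llama.cpp

llm = Llama(model_path="models/llama-3.2-1b-q4.gguf", n_ctx=4096)  # illustrative path
out = llm("Q: Name the planets of the solar system. A:", max_tokens=96, stop=["Q:"])
print(out["choices"][0]["text"])
```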
llamafile lets you distribute and run LLMs with a single file.
Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.
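Once a llamafile is running (e.g. `./model.llamafile`), it serves llama.cpp's built-in OpenAI-style API, typically on localhost:8080, so querying it is one HTTP call. A sketch, assuming the default port:

```python
import requests

r = requests.post(
    "http://localhost:8080/v1/chat/completions",  # llamafile's default local server
    json={
        "model": "local",  # the single-model server does not need a real name here
        "messages": [{"role": "user", "content": "Hello from a single-file LLM!"}],
    },
)
print(r.json()["choices"][0]["message"]["content"])
```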
Use your locally running AI models to assist you in your web browsing.
Page Assist is an open-source browser extension that provides a sidebar and web UI for your local AI model. It allows you to interact with your model from any webpage.
An automated document analyzer for Paperless-ngx that uses the OpenAI API or Ollama (Mistral, Llama, Phi-3, Gemma 2) to automatically analyze and tag your documents.
It features an automatic mode, a manual mode, support for both Ollama and OpenAI, a chat function to query your documents with AI, and a modern, intuitive web interface.
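The tagging step of such an analyzer can be reproduced in a few lines: hand the OCR text to a local model and constrain the reply to JSON. A sketch with the ollama package (the field names and model are assumptions, not the project's actual schema):

```python
import json
import ollama

doc_text = "Invoice #4711 from ACME GmbH, due 2024-05-01 ..."  # OCR text from Paperless-ngx

resp = ollama.chat(
    model="mistral",
    messages=[{
        "role": "user",
        "content": "Return JSON with keys 'title', 'correspondent', and 'tags' "
                   f"(max 4 tags) for this document:\n{doc_text}",
    }],
    format="json",  # ask Ollama to emit valid JSON only
)
meta = json.loads(resp["message"]["content"])
print(meta["tags"])
```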
Simple frontend for LLMs built in react-native.
ChatterUI is a native mobile frontend for LLMs.
Run LLMs on device or connect to various commercial or open source APIs. ChatterUI aims to provide a mobile-friendly interface with fine-grained control over chat structuring.
A smarter web fuzzing tool that combines local LLM models and ffuf to optimize directory and file discovery.
This tool enhances traditional web fuzzing by using local AI language models (via Ollama) to generate intelligent guesses for potential paths and filenames.
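The generation step amounts to asking the model for plausible paths and feeding them to ffuf as a wordlist. A rough sketch, assuming an authorized engagement (model and prompt are illustrative):

```python
import ollama

prompt = (
    "List 30 likely hidden paths and filenames on a corporate PHP web app, "
    "one per line, no commentary. Examples: admin/, backup.zip, old/config.php"
)
reply = ollama.generate(model="llama3.2", prompt=prompt)["response"]

with open("ai-wordlist.txt", "w") as f:
    f.write("\n".join(w.strip().lstrip("/") for w in reply.splitlines() if w.strip()))

# Then feed the list to ffuf, e.g.:
#   ffuf -u https://target.example/FUZZ -w ai-wordlist.txt
```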
A research project to add some brrrrrr to Burp.
"burpference" started as a research idea around offensive agent capabilities and is a fun take on Burp Suite and running inference. The extension is open-source and designed to capture in-scope HTTP requests and responses from Burp's proxy history and ship them to a remote LLM API in JSON format. It takes a flexible approach: you can configure custom system prompts, store API keys, and select remote hosts from numerous model providers, or create your own API configuration. The idea is for an LLM to act as an agent in an offensive web application engagement, augmenting your skills and surfacing findings and lingering vulnerabilities. Because you can create your own configuration and model provider, you can also host models locally via Ollama to avoid high inference costs, network delays, or rate limits.
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base (file upload / knowledge management / RAG), multi-modal capabilities (vision / TTS), and a plugin system. One-click FREE deployment of your private ChatGPT/Claude application.
Benchmark throughput performance of local large language models (LLMs) running via Ollama.
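Ollama's non-streaming responses already carry the numbers needed for a throughput figure: eval_count (generated tokens) and eval_duration (nanoseconds). A minimal benchmark sketch (model and prompt are illustrative):

```python
import requests

r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Explain TCP slow start.", "stream": False},
    timeout=300,
)
data = r.json()

# eval_count = tokens generated, eval_duration = generation time in nanoseconds
tps = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{data['eval_count']} tokens in {data['eval_duration'] / 1e9:.1f} s -> {tps:.1f} tok/s")
```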