ollama
Open-Source AI Gateway with Workflows, Aliases, and Usage Tracking. An AI gateway for tracking, decoupling, and debugging your AI usage.
High-performance AI gateway written in Go - unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, Groq, xAI & Ollama. LiteLLM alternative with observability, guardrails & streaming.
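Gateways like this expose one OpenAI-compatible endpoint in front of every backend, so ordinary OpenAI-style client code works unchanged whichever provider answers. A minimal stdlib sketch; the gateway URL and model name are assumptions, adjust to your deployment:

```python
import json
from urllib import request

# Assumption: gateway listening locally on port 8080 -- adjust to your setup.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Standard OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(prompt: str, model: str = "llama3") -> str:
    """POST to the gateway; any OpenAI-compatible backend behind it will do."""
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = request.Request(GATEWAY_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Swapping providers then means changing the gateway configuration, not this client code.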
AI You Control. The Open-Source, Cross-Platform, Extensible AI Client. Choose your models. Own your data. Eliminate vendor lock-in.
Related contents:
Virtual desktop pet cats for macOS — pixel art cats that live on your dock and chat with you via Ollama LLM.
Related contents:
Hundreds of models & providers. One command to find what runs on your hardware.
A terminal tool that right-sizes LLMs to your system's RAM, CPU, and GPU. It detects your hardware, scores each model across quality, speed, fit, and context dimensions, and tells you which ones will actually run well on your machine.
Ships with an interactive TUI (default) and a classic CLI mode. Supports multi-GPU setups, MoE architectures, dynamic quantization selection, speed estimation, and local runtime providers (Ollama, llama.cpp, MLX, Docker Model Runner, LM Studio).
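The "fit" dimension such a tool scores largely comes down to whether a quantized model's weights fit in available memory. A back-of-the-envelope sketch (weights only; the headroom factor standing in for KV cache and runtime overhead is an assumption):

```python
def approx_model_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight footprint: parameter count times bits per weight, in GB."""
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: float,
         mem_gb: float, headroom: float = 1.2) -> bool:
    """True if the model should fit with some headroom for KV cache/runtime."""
    return approx_model_gb(params_billion, bits_per_weight) * headroom <= mem_gb

# An 8B model at 4-bit quantization needs roughly 4 GB of weights,
# while a 70B model at 4-bit (~35 GB) will not fit in 16 GB of memory.
```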
Related contents:
Ollama for GitHub Copilot VS Code Extension. Run Ollama models with full tool and vision support inside GitHub Copilot Chat.
Opilot integrates the full Ollama ecosystem — local models, cloud models, and the Ollama model library — directly into VS Code's Copilot Chat interface. Your conversations never leave your machine when using local models, and you can switch between models without leaving the editor.
Can I Run AI locally?
Find out which AI models your machine can actually run.
Related contents:
macOS menu bar app that exposes Apple's on-device Foundation Models as an OpenAI-compatible local API. Zero cloud. Zero dependencies.
Related contents:
AI-powered subdomain enumeration tool with local LLM analysis via Ollama - 100% private, zero API costs.
God's Eye is a powerful, ultra-fast subdomain enumeration and reconnaissance tool written in Go. It combines multiple passive sources with active DNS brute-forcing and comprehensive security checks to provide a complete picture of a target's attack surface.
Related contents:
AI-Powered Reverse Engineering with Ghidra.
OGhidra bridges Large Language Models (LLMs) via Ollama with the Ghidra reverse engineering platform, enabling AI-driven binary analysis through natural language. Interact with Ghidra using conversational queries and automate complex reverse engineering workflows.
Related contents:
Claude Code. Any Model. The most powerful AI coding agent now speaks every language.
Run Claude Code with any AI model - OpenRouter, Gemini, OpenAI, or local models.
Claudish (Claude-ish) is a CLI tool that allows you to run Claude Code with any AI model by proxying requests through a local Anthropic API-compatible server. Supports OpenRouter (100+ models), direct Google Gemini API, direct OpenAI API, and local models (Ollama, LM Studio, vLLM, MLX).
Open Source Locally Hosted Lovable with Full Stack Support. The Open-Source AI Development Platform Built for Self-Hosting.
AI-powered development environment with advanced agent orchestration - designed for complete data sovereignty and infrastructure control.
Enchanted is an iOS and macOS app for chatting with private, self-hosted language models such as Llama 2, Mistral, or Vicuna using Ollama.
Enchanted is an open-source, Ollama-compatible, elegant macOS/iOS/visionOS app for working with privately hosted models such as Llama 2, Mistral, Vicuna, Starling, and more. It's essentially the ChatGPT app UI connected to your private models. The goal of Enchanted is to deliver a product allowing an unfiltered, secure, private, and multimodal experience across all of your devices in the Apple ecosystem (macOS, iOS, Watch, Vision Pro).
Related contents:
Chat for Ollama, Private AI Chat.
Empowering LLM researchers and hobbyists with seamless control over self-hosted models. Connect remotely, customize prompts, manage chats, and fine-tune configurations. All in one intuitive app.
Related contents:
Ollama Manager App.
Unofficial Ollama manager app for macOS, iOS, iPadOS, and visionOS, featuring server management, model management, and simple chat feature.
Related contents:
Simple and Fast Retrieval-Augmented Generation.
The LightRAG Server is designed to provide Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. The LightRAG Server also provides an Ollama-compatible interface, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily.
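Because the server speaks Ollama's chat protocol, any Ollama front end can query the knowledge graph with an ordinary chat payload. A hedged sketch; the port and model alias are assumptions, check your LightRAG Server configuration (note the reply lives at `message.content`, not at `choices[0]` as in the OpenAI format):

```python
import json
from urllib import request

# Assumptions: LightRAG Server on localhost:9621; model alias is illustrative.
LIGHTRAG_URL = "http://localhost:9621/api/chat"

def build_ollama_chat(model: str, question: str) -> dict:
    """Ollama-style chat payload, the shape front ends like Open WebUI send."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }

def query(question: str, model: str = "lightrag:latest") -> str:
    body = json.dumps(build_ollama_chat(model, question)).encode()
    req = request.Request(LIGHTRAG_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```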
Your AI. Your Data. Zero Cloud. Offline ChatGPT alternative: open-source, on-device, and 100% private.
By connecting to Ollama local LLMs, NativeMind delivers the latest AI capabilities right inside your favourite browser — without sending a single byte to cloud servers.
Related contents:
The Privacy-First Alternative to Ollama.
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
Shimmy is a 5.1MB single-binary that provides 100% OpenAI-compatible endpoints for GGUF models. Point your existing AI tools to Shimmy and they just work — locally, privately, and free.
A free and open documentation platform built with Laravel and Filament, enhanced by Ollama for local AI features, focused on clarity, structure, and self-hosted simplicity.
Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google
It offers a unified interface for integrating AI models from providers like Google, OpenAI, Anthropic, Ollama, and more. Rapidly build and deploy production-ready chatbots, automations, and recommendation systems using streamlined APIs for multimodal content, structured outputs, tool calling, and agentic workflows.
Related contents:
LocalSite is a 100% local web development platform powered by Ollama. Generate modern, responsive websites using AI models running directly on your machine.
Related contents:
Advanced LLM-powered brute-force tool combining AI intelligence with automated login attacks.
Related contents:
RamaLama strives to make working with AI simple, straightforward, and familiar by using OCI containers.
RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.
Related contents:
Chat UI for Coderunner.
coderunner-ui is a local-first AI workspace that lets you:
- Chat with local or remote LLMs
- Run generated code inside a fully isolated Apple Container VM
- Browse the web and automate tasks via a built-in headless browser (Playwright)

All without sending your data to the cloud.
Related contents:
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI.
Related contents:
Fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models.
Open Codex is a fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models like phi-4-mini.
Your agentic CLI developer.
Sidekick is an agentic CLI-based AI tool inspired by Claude Code, Copilot, Windsurf and Cursor. It's meant to be an open source alternative to these tools, providing a similar experience but with the flexibility of using different LLM providers while keeping the agentic workflow.
Terminal-based AI coding tool that can use any model that supports the OpenAI-style API.
Ollama Automated Security Intelligence Scanner.
🛡️ An AI-powered security auditing tool that leverages Ollama models to detect and analyze potential security vulnerabilities in your code.
Advanced code security analysis through the power of AI.
Related contents:
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
Nut is an open source fork of Bolt.new for helping you develop full stack apps using AI. AI developers frequently struggle with fixing even simple bugs when they don't know the cause, and get stuck making ineffective changes over and over. We want to crack these tough nuts, so to speak, so you can get back to building.
A command-line Ollama client for scripting.
The Ollama function caller, otherwise known as ofc, is a command-line tool for prompting Ollama models locally on your system. There are other programs out there that do similar things, but they either don't support streaming or don't give me access to important settings, like context length or temperature.
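Under the hood, settings like context length and temperature map onto the `options` object of Ollama's REST API, which any script can set directly. A minimal stdlib sketch (the model name is a placeholder; use one you have pulled):

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_generate(model: str, prompt: str,
                   num_ctx: int = 8192, temperature: float = 0.7) -> dict:
    """Generate payload; context length and temperature ride in `options`."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "temperature": temperature},
    }

def run(model: str, prompt: str, **opts) -> str:
    """Send the request and return the completed text."""
    body = json.dumps(build_generate(model, prompt, **opts)).encode()
    with request.urlopen(request.Request(OLLAMA_URL, data=body)) as resp:
        return json.load(resp)["response"]
```

Setting `"stream": True` instead yields one JSON object per generated token, which is what streaming clients consume.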
LLM playground to experiment with local models and build fine-tuning datasets and benchmarks.
A playground that gives you full control over the contents of a chat conversation: add, remove and edit messages (system, user and assistant) and shape the flow of the conversation to be exactly what you need.
Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster performance, leveraging tensor parallelism and high-speed synchronization over Ethernet.
Supports Linux, macOS, and Windows. Optimized for ARM and x86_64 AVX2 CPUs.
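The tensor parallelism mentioned above amounts to splitting each weight matrix across devices so every device computes only its slice of the product. A toy row-split sketch (sequential here; a real cluster runs the shards concurrently and synchronizes over Ethernet):

```python
def matvec(rows, x):
    """Dense matrix-vector product: one dot product per row."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def parallel_matvec(matrix, x, n_devices):
    """Row-split tensor parallelism: each device owns a contiguous slice of
    rows and computes its slice independently; results simply concatenate."""
    chunk = (len(matrix) + n_devices - 1) // n_devices
    shards = [matrix[i:i + chunk] for i in range(0, len(matrix), chunk)]
    partials = [matvec(shard, x) for shard in shards]  # concurrent in practice
    return [y for partial in partials for y in partial]
```

More devices mean smaller shards per device, which is why adding nodes speeds up inference until synchronization cost dominates.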
Related contents:
LLM inference in C/C++.
The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud.
Related contents:
- Everything I've learned so far about running local LLMs @ null program.
- Run large and small language models with llama.cpp (DeepSeek-R1, Phi-4) @ Modal Docs.
- Faire tourner un LLM localement sur votre ordinateur @ Quoi de neuf les devs ? :fr:.
- Optimizing Performance with llama.cpp @ Home Assistant community.
- Grand saut dans le déploiement sur site d'un serveur LLM à moindres privilèges @ Synacktiv :fr:.
Use your locally running AI models to assist you in your web browsing.
Page Assist is an open-source browser extension that provides a sidebar and web UI for your local AI model. It allows you to interact with your model from any webpage.
Discover, download, and run local LLMs.
Related contents:
- #104 Développer des projets IA - introduction @ Double Slash :fr:.
- Faire tourner un LLM localement sur votre ordinateur @ Quoi de neuf les devs ? :fr:.
- Drames et dramas d’août @ Le RDV Tech podcast.
- LM Studio : Faire tourner son IA (LLM) facilement (Chat, Developpement, ...) @ Adrien Linuxtricks' YouTube :fr:.
An automated document analyzer for Paperless-ngx that uses the OpenAI API or Ollama (Mistral, Llama, Phi-3, Gemma 2) to automatically analyze and tag your documents.
It features: Auto Mode, Manual Mode, Ollama and OpenAI support, a chat function to query your documents with AI, and a modern, intuitive web interface.
Simple frontend for LLMs built in react-native.
ChatterUI is a native mobile frontend for LLMs.
Run LLMs on device or connect to various commercial or open source APIs. ChatterUI aims to provide a mobile-friendly interface with fine-grained control over chat structuring.
Related contents:
A smarter web fuzzing tool that combines local LLM models and ffuf to optimize directory and file discovery.
This tool enhances traditional web fuzzing by using local AI language models (via Ollama) to generate intelligent guesses for potential paths and filenames.
A research project to add some brrrrrr to Burp.
"burpference" started as a research idea of offensive agent capabilities and is a fun take on Burp Suite and running inference. The extension is open-source and designed to capture in-scope HTTP requests and responses from Burp's proxy history and ship them to a remote LLM API in JSON format. It's designed with a flexible approach where you can configure custom system prompts, store API keys and select remote hosts from numerous model providers as well as the ability for you to create your own API configuration. The idea is for an LLM to act as an agent in an offensive web application engagement to leverage your skills and surface findings and lingering vulnerabilities. By being able to create your own configuration and model provider allows you to also host models locally via Ollama to prevent potential high inference costs and potential network delays or rate limits.
Benchmark the throughput performance of locally running large language models (LLMs) via Ollama.
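Ollama's final (non-streaming) response object reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds), so throughput is a one-line ratio. A small helper a benchmark like this might use; field names follow Ollama's API, verify against your version:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput from Ollama's eval_count (tokens) and
    eval_duration (nanoseconds) response fields."""
    return eval_count / (eval_duration_ns / 1e9)

# e.g. 128 tokens generated in 2 seconds of eval time -> 64.0 tokens/s
```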
Distribute and run LLMs with a single file.
Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.
Related contents: