llm
AI-powered subdomain enumeration tool with local LLM analysis via Ollama - 100% private, zero API costs.
God's Eye is a powerful, ultra-fast subdomain enumeration and reconnaissance tool written in Go. It combines multiple passive sources with active DNS brute-forcing and comprehensive security checks to provide a complete picture of a target's attack surface.
Related contents:
Kubernetes-native AI platform for scalable model serving.
Related contents:
Intercept LLM API traffic and visualize token usage in a real-time terminal dashboard. Track costs, debug prompts, and monitor context window usage across your AI development sessions.
tokentap tracks token usage for LLM CLI tools with a live terminal dashboard. See exactly how many tokens you're using in real-time.
Query your data in plain English.
Turn natural language questions into SQL queries with a small, local model that matches cloud LLM accuracy.
We fine-tuned a small language model to convert plain English questions into executable SQL queries. Because it's small, you can run it locally on your own machine: no API keys, no cloud dependencies, full privacy. Load your CSV files, ask questions, get answers.
Human-like Document AI
PageIndex is a vectorless, reasoning-based RAG engine that mirrors how humans read, delivering traceable, explainable, and context-aware retrieval, without vector databases or chunking.
AI Code Security Anti-Patterns distilled from 150+ sources to help LLMs generate safer code.
A comprehensive security reference distilled from 150+ sources to help LLMs generate safer code.
Related contents:
A native command-line interface for working with Apple Core ML models on macOS. Inspect, run inference, benchmark, and manage Core ML models without Xcode or Python.
Related contents:
An all-in-one enhancement suite for Google Gemini - timeline navigation, folder management, prompt library, and chat export in one powerful extension.
A deterministic, autonomous loop runner for the Codex CLI. It processes exactly one task per iteration from a JSON backlog, with fresh context each run, and keeps a JSONL audit log for traceability.
Related contents:
Autonomous task execution plugin for LLM CLI - refactored modular architecture.
llm loop is a powerful plugin for the LLM CLI tool that enables autonomous, goal-oriented task execution. Unlike traditional single-turn LLM interactions, llm loop allows the AI to work persistently towards a goal by making multiple tool calls, analyzing results, and iterating until the task is complete.
Related contents:
AI-Powered Reverse Engineering with Ghidra.
OGhidra bridges Large Language Models (LLMs) via Ollama with the Ghidra reverse engineering platform, enabling AI-driven binary analysis through natural language. Interact with Ghidra using conversational queries and automate complex reverse engineering workflows.
Related contents:
Agent harness framework for building, running, and verifying LLM workflows.
Gambit helps you build reliable LLM workflows by composing small, typed “decks” with clear inputs/outputs and guardrails. Run decks locally, stream traces, and debug with a built-in UI.
The Best Agent Harness. Meet Sisyphus: The Batteries-Included Agent that codes like you.
Open Responses is an open-source specification and ecosystem for building multi-provider, interoperable LLM interfaces based on the OpenAI Responses API. It defines a shared schema and tooling layer that enables a unified experience for calling language models, streaming results, and composing agentic workflows, independent of provider.
A simple, open format for giving agents new capabilities and expertise. Agent Skills are folders of instructions, scripts, and resources that agents can discover and use to do things more accurately and efficiently.
Related contents:
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.
It adopts a modular architecture that combines multimodal preprocessing, semantic vector indexing, intelligent retrieval, and large language model inference. At its core, WeKnora follows the RAG (Retrieval-Augmented Generation) paradigm, enabling high-quality, context-aware answers by combining relevant document chunks with model reasoning.
The Context Optimization Layer for LLM Applications.
Cut your LLM costs by 50-90% without losing accuracy.
Semantic search for agents.
A calm, CLI-native way to semantically grep everything: code, images, PDFs, and more.
Related contents:
Understand Privacy & Legal Terms. Plain-language analysis of Terms & Conditions. Know what you're agreeing to before you sign up.
Related contents:
A dead-simple unix tool for lightweight open-source local agents.
Orla is a unix tool for running lightweight open-source agents. It is easy to add to a script, use with pipes, or build things on top of.
Related contents:
A curated catalogue of agentic AI patterns — real‑world tricks, workflows, and mini‑architectures that help autonomous or semi‑autonomous AI agents get useful work done in production.
Related contents:
Multi-agent orchestrator for Claude Code. Track work with convoys; sling to agents.
Related contents:
- Welcome to Gas Town @ Steve Yegge's Medium.
- Gas Town Decoded @ Andrew Lilley Brinker.
- How to think about Gas Town @ Steve Klabnik.
- Agent Psychosis: Are We Going Insane? @ Armin Ronacher's Thoughts and Writings.
- Gas Town’s Agent Patterns, Design Bottlenecks, and Vibecoding at Scale @ Maggie Appleton.
- Move Over Gas Town, Claude Has First-Party Agent Orchestration @ Andrew Lilley Brinker.
Shared memory for agents. Get smarter alongside your AI.
Your intelligence shouldn't reset every conversation. Ensue is a persistent knowledge tree that grows with you - what you learn today enriches tomorrow's reasoning.
The World's First LLM-Native Language. The first intermediate language designed for machine authorship.
Reduce logs to their semantic anomalies.
Cordon uses transformer embeddings and density scoring to identify semantically unusual patterns in large log files, reducing massive logs to the most anomalous sections for analysis. Repetitive patterns (even errors) are considered "normal background." Cordon surfaces unusual, rare, or clustered events that stand out semantically from the bulk of the logs.
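The density-scoring idea can be sketched without a transformer. Below, a bag-of-words similarity stands in for real embeddings (purely illustrative, not Cordon's implementation): lines far from their nearest neighbour score as anomalous, so repetitive "background" lines score low even if they are errors.

```python
import math
from collections import Counter

def embed(line):
    """Toy stand-in for a transformer embedding: a bag-of-words vector."""
    return Counter(line.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def anomaly_scores(lines):
    """Density scoring: a line's score is its distance to its nearest neighbour."""
    vecs = [embed(l) for l in lines]
    scores = []
    for i, v in enumerate(vecs):
        sims = [cosine(v, w) for j, w in enumerate(vecs) if j != i]
        scores.append(1.0 - max(sims, default=0.0))
    return scores

logs = [
    "INFO request served in 12ms",
    "INFO request served in 14ms",
    "INFO request served in 11ms",
    "FATAL disk controller reset during write",
]
ranked = sorted(zip(anomaly_scores(logs), logs), reverse=True)
print(ranked[0][1])  # the semantically unusual line surfaces first
```

The repetitive INFO lines sit close to one another and score low; the one-off FATAL line shares no vocabulary with the bulk and rises to the top.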
Related contents:
Layrr is the visual editor for real code. Design visually, edit any stack, own everything. A browser coding agent interface for selecting elements and sending instructions directly to Claude Code.
Open Source Locally Hosted Lovable with Full Stack Support. The Open-Source AI Development Platform Built for Self-Hosting.
AI-powered development environment with advanced agent orchestration - designed for complete data sovereignty and infrastructure control.
Unterminated Block Parsing.
Remend is a lightweight, standalone preprocessor that completes incomplete Markdown syntax.
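The core idea can be sketched in a few lines. This is a minimal illustration of fence repair under the assumption of simple triple-backtick fences, not Remend's actual algorithm, which would also need to handle tildes, emphasis runs, and nesting:

```python
def close_unterminated_fences(markdown: str) -> str:
    """If a ``` fence is opened but never closed, append the closing fence."""
    fence_count = sum(
        1 for line in markdown.splitlines() if line.lstrip().startswith("```")
    )
    if fence_count % 2 == 1:  # odd number of fences => one left open
        if not markdown.endswith("\n"):
            markdown += "\n"
        markdown += "```\n"
    return markdown

broken = "Here is code:\n```python\nprint('hi')\n"
print(close_unterminated_fences(broken).rstrip().endswith("```"))  # True
```

This kind of preprocessing is useful when rendering streaming LLM output, where a code block may be cut off mid-generation.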
Related contents:
Production-ready RAG in your infrastructure.
Skald gives you a production-ready RAG in minutes through a plug-and-play API, and then lets you configure your RAG engine exactly to your needs.
Our solid defaults will work for most use cases, but you can tune every part of your RAG to better suit your needs. That means configurable vector search params, reranking, models, query rewriting, chunking (soon), and more.
Related contents:
Metis is an open-source, AI-driven tool for deep security code review.
Metis is an open-source, AI-driven tool for deep security code review, created by Arm's Product Security Team. It helps engineers detect subtle vulnerabilities, improve secure coding practices, and reduce review fatigue. This is especially valuable in large, complex, or legacy codebases where traditional tooling often falls short.
Executable GenAI Prompt Templates.
Dotprompt is an executable prompt template file format for Generative AI. It is designed to be agnostic to programming language and model provider to allow for maximum flexibility in usage. Dotprompt extends the popular Handlebars templating language with GenAI-specific features.
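A .prompt file pairs YAML frontmatter (model selection and input schema) with a Handlebars template body. A minimal sketch follows; the model name and schema fields are illustrative assumptions, not canonical values:

```handlebars
---
model: googleai/gemini-2.0-flash
input:
  schema:
    name: string
    style?: string
---
Write a greeting for {{name}}{{#if style}} in a {{style}} tone{{/if}}.
```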
Enforce the habit of self-documenting code through better commit messages.
smartcommit is an intelligent, AI-powered CLI tool that helps you write semantic, Conventional Commits messages effortlessly. It analyzes your staged changes, asks clarifying questions to understand the "why" behind your code, and generates a structured commit message for you.
The toolkit for AI devtools context engineering. Build with codebase mapping, symbol extraction, and many kinds of code search.
kit is a production-ready toolkit for codebase mapping, symbol extraction, code search, and building LLM-powered developer tools, agents, and workflows.
kit shines for getting precise, accurate, and relevant context to LLMs. Use kit to build code reviewers, code generators and graphs, even full-fledged coding assistants: all enriched with the right code context.
An open-source SQL-native memory engine for AI. Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems.
One line of code to give any LLM persistent, queryable memory using standard SQL databases.
Memori enables any LLM to remember conversations, learn from interactions, and maintain context across sessions with a single line: memori.enable(). Memory is stored in standard SQL databases (SQLite, PostgreSQL, MySQL) that you fully own and control.
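The underlying pattern, conversation turns persisted to a SQL table you own, can be sketched from scratch. This is not Memori's schema or API, just an illustration of why SQL-backed memory is attractive: recall is an ordinary query.

```python
import sqlite3

# Illustrative SQL-backed agent memory: turns land in a table the
# application owns and can query with plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (session TEXT, role TEXT, content TEXT)")

def remember(session, role, content):
    conn.execute("INSERT INTO memory VALUES (?, ?, ?)", (session, role, content))

def recall(session, keyword):
    rows = conn.execute(
        "SELECT role, content FROM memory WHERE session = ? AND content LIKE ?",
        (session, f"%{keyword}%"),
    )
    return rows.fetchall()

remember("s1", "user", "My deploy target is eu-west-1")
remember("s1", "assistant", "Noted: deploys go to eu-west-1")
print(recall("s1", "eu-west-1"))  # both matching turns come back
```

Because the store is just SQLite/PostgreSQL/MySQL, the same data is inspectable with any SQL client, independent of the agent framework.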
AI Browser Automation. Automate browser based workflows with AI.
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows on a large number of websites, replacing brittle or unreliable automation solutions.
Emdash is an orchestration layer for running multiple coding agents in parallel in isolated Git worktrees.
An orchestration layer for running multiple coding agents in parallel, each isolated in its own Git worktree. Run several agent instances concurrently to tackle independent subtasks or experiments.
Emdash lets you develop and test multiple features with multiple agents in parallel. It’s provider-agnostic (we support 10+ CLIs, such as Claude Code and Codex) and runs each agent in its own Git worktree to keep changes clean; when the environment matters, you can run a PR in its own Docker container. Hand off Linear, GitHub, or Jira tickets to an agent, review diffs side-by-side, and keep everything local.
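Underneath this pattern is plain `git worktree`. A rough sketch of what per-agent isolation looks like by hand (Emdash automates and supervises this; the paths and branch names below are made up):

```shell
set -e
# One isolated worktree + branch per agent, from a throwaway repo.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
git -C "$repo" worktree add -q "$repo-agent-a" -b agent/task-a
git -C "$repo" worktree add -q "$repo-agent-b" -b agent/task-b
git -C "$repo" worktree list
```

Each worktree is a full checkout on its own branch, so two agents can edit, build, and commit concurrently without stepping on each other's files.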
Tokenflood is a load testing framework for simulating arbitrary loads on instruction-tuned LLMs.
Tokenflood is a load testing tool for instruction-tuned LLMs that allows you to run arbitrary load profiles without needing specific prompt and response data. Define desired prompt lengths, prefix lengths, output lengths, and request rates, and tokenflood simulates this workload for you.
🔂 Run Claude Code in a continuous loop, autonomously creating PRs, waiting for checks, and merging.
Automated workflow that orchestrates Claude Code in a continuous loop, autonomously creating PRs, waiting for checks, and merging - so multi-step projects complete while you sleep.
Fully automatic censorship removal for language models.
Heretic is a tool that removes censorship (aka "safety alignment") from transformer-based language models without expensive post-training. It combines an advanced implementation of directional ablation, also known as "abliteration" (Arditi et al. 2024), with a TPE-based parameter optimizer powered by Optuna.
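At its core, directional ablation projects a "refusal direction" out of weight rows: w' = w - (w·r)r for a unit vector r. A toy sketch with made-up numbers follows; real abliteration identifies r from model activations and applies this across many weight matrices, which this deliberately omits.

```python
import math

def project_out(w, r):
    """Remove the component of vector w along unit direction r: w - (w.r) r."""
    dot = sum(wi * ri for wi, ri in zip(w, r))
    return [wi - dot * ri for wi, ri in zip(w, r)]

# Toy "refusal direction" (unit vector) and a toy weight row.
norm = math.sqrt(2)
r = [1 / norm, 1 / norm, 0.0]
w = [3.0, 1.0, 2.0]

w_ablated = project_out(w, r)
# After ablation the row has zero component along r, so the model can
# no longer write activations into that direction through this row.
residual = sum(wi * ri for wi, ri in zip(w_ablated, r))
print(abs(residual) < 1e-9)  # True
```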
Enchanted is an iOS and macOS app for chatting with private, self-hosted language models such as Llama 2, Mistral, or Vicuna using Ollama.
Enchanted is an open-source, Ollama-compatible, elegant macOS/iOS/visionOS app for working with privately hosted models such as Llama 2, Mistral, Vicuna, Starling, and more. It's essentially a ChatGPT-style app UI that connects to your private models. The goal of Enchanted is to deliver an unfiltered, secure, private, and multimodal experience across all of your devices in the Apple ecosystem (macOS, iOS, Apple Watch, Vision Pro).
Related contents:
Chat for Ollama, Private AI Chat.
Empowering LLM researchers and hobbyists with seamless control over self-hosted models. Connect remotely, customize prompts, manage chats, and fine-tune configurations. All in one intuitive app.
Related contents:
A desktop app to easily run Large Language Models locally.
CLI tool – estimates LLM tokens/costs and runs provider-aware load tests for OpenAI, Anthropic, OpenRouter, or custom endpoints.
A fast, CLI-based tool to estimate token usage and API cost for prompts targeting various LLM providers (OpenAI, Claude, Mistral, etc.). Built in Rust for performance, portability, and safety.
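A back-of-envelope version of such an estimate looks like the sketch below. It uses the crude chars/4 heuristic for English text and a placeholder price, not any provider's real tokenizer or published rate; real tools count tokens with the provider's actual tokenizer.

```python
def estimate_cost(prompt, price_per_1k_input=0.003):
    """Rough estimate: ~4 characters per token for English text.

    price_per_1k_input is a placeholder, not a real provider rate.
    """
    tokens = max(1, len(prompt) // 4)
    return tokens, tokens / 1000 * price_per_1k_input

tokens, cost = estimate_cost("Summarize the following document ..." * 10)
print(tokens, round(cost, 6))  # 90 0.00027
```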
AI-powered git commit message rewriter using GPT.
Automatically rewrite your entire git commit history with better, conventional commit messages using AI. Perfect for cleaning up messy commit histories before open-sourcing projects or improving repository maintainability.
Ollama Manager App.
Unofficial Ollama manager app for macOS, iOS, iPadOS, and visionOS, featuring server management, model management, and simple chat feature.
Related contents:
Fast, stateless LLM-powered assistant for your shell: qq answers; qa runs commands.
qq means quick question. qa means quick agent. Both are easy to type rapidly on QWERTY keyboards with minimal finger movement. That makes interacting with LLMs faster and more natural during real work.
Web Codegen Scorer is a tool for evaluating the quality of web code generated by LLMs.
JSON for LLM prompts at half the tokens. Spec, benchmarks & TypeScript implementation.
Token-Oriented Object Notation is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage. It's intended for LLM input, not output.
TOON's sweet spot is uniform arrays of objects – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts. For deeply nested or non-uniform data, JSON may be more efficient.
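An encoder for that sweet spot can be sketched as follows. The output loosely imitates TOON's tabular layout (declared length, shared field header, one CSV-like row per item); consult the spec for the actual quoting, delimiter, and nesting rules.

```python
def toon_encode(key, rows):
    """Encode a uniform list of dicts in a TOON-style tabular layout.

    Illustrative only: assumes uniform rows and values needing no quoting.
    """
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)

users = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Alan"}]
print(toon_encode("users", users))
```

The field names appear once in the header instead of repeating per object as in JSON, which is where the token savings on uniform arrays come from.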
Related contents:
Large language model made in Europe, built to support all 24 official EU languages.
The EuroLLM project includes Instituto Superior Técnico, the University of Edinburgh, Instituto de Telecomunicações, Université Paris-Saclay, Unbabel, Sorbonne University, Naver Labs, and the University of Amsterdam. Together they created EuroLLM-9B, a multilingual AI model supporting all 24 official EU languages. Developed with support from Horizon Europe, the European Research Council, and EuroHPC, this open-source LLM aims to enhance Europe’s digital sovereignty and foster AI innovation. Trained on the MareNostrum 5 supercomputer, EuroLLM outperforms similar-sized models. It is fully open source and available via Hugging Face.
Related contents:
The Web Access Layer for AI Agents.
Connect Your Agent to the Web. Powering the Internet of Agents with fast, secure and reliable web access APIs.
Related contents:
The Intelligence Engine. Turn Natural Language Into Action. The Intelligence Layer for AI agents. Connect your models, tools, and data to create agentic apps that can think, act and talk to you.
An all-in-one toolkit to build agentic applications that turn natural language into real-world actions.
Want to build AI-native apps that respond to natural language? Dexto is the intelligence layer that makes it easy to build agentic apps like AI assistants and copilots. Describe your agents, plug in your tools, and watch them respond to plain English.
Butter is a cache that identifies patterns in LLM responses and saves you money by serving responses directly.
It's also deterministic, allowing your AI systems to consistently repeat past behaviors.
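The simplest form of the idea is an exact-match cache keyed on the prompt. Butter itself goes further and generalizes over templated patterns in responses; this sketch only illustrates the caching and the determinism, with a stand-in for the model call:

```python
import hashlib

cache = {}

def cached_complete(prompt, call_model):
    """Serve a stored response for a repeated prompt; call the model once."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in cache:
        cache[key] = call_model(prompt)  # only the first request is billed
    return cache[key]

calls = []
def fake_model(prompt):  # stand-in for a real LLM call
    calls.append(prompt)
    return f"echo: {prompt}"

a = cached_complete("extract the invoice date", fake_model)
b = cached_complete("extract the invoice date", fake_model)
print(a == b, len(calls))  # True 1
```

Identical requests return byte-identical responses, which is what makes replayed agent runs repeat past behavior exactly.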
All-in-One RAG Framework
Modern documents increasingly contain diverse multimodal content—text, images, tables, equations, charts, and multimedia—that traditional text-focused RAG systems cannot effectively process. RAG-Anything addresses this challenge as a comprehensive All-in-One Multimodal Document Processing RAG system built on LightRAG.
As a unified solution, RAG-Anything eliminates the need for multiple specialized tools. It provides seamless processing and querying across all content modalities within a single integrated framework. Unlike conventional RAG approaches that struggle with non-textual elements, our all-in-one system delivers comprehensive multimodal retrieval capabilities.
Build Frontier RAG Apps. The open-source RAG platform: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.
Ground AI agents in your knowledge base, minimize hallucinations, and impress out of the box. Agentset is the open-source platform to build, evaluate, and ship production-ready RAG and agentic applications. It provides end-to-end tooling: ingestion, vector indexing, evaluation/benchmarks, chat playground, hosting, and a clean API with first-class developer experience.
Related contents:
Customizable AI Research & Knowledge Management Assistant. The AI Workspace Built for Teams. Connect any LLM to your internal knowledge sources and chat with it in real time alongside your team.
Open Source Alternative to NotebookLM / Perplexity, connected to external sources such as Search Engines, Slack, Linear, Jira, ClickUp, Confluence, Notion, YouTube, GitHub, Discord and more.
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Lance is a modern columnar data format optimized for machine learning and AI applications. It efficiently handles diverse multimodal data types while providing high-performance querying and versioning capabilities.
Related contents:
Your AI. Your Data. Zero Cloud. Offline ChatGPT alternative: open-source, on-device, and 100% private.
By connecting to Ollama local LLMs, NativeMind delivers the latest AI capabilities right inside your favourite browser — without sending a single byte to cloud servers.
Related contents:
The best ChatGPT that $100 can buy.
This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase. nanochat is designed to run on a single 8XH100 node via scripts like speedrun.sh that run the entire pipeline end to end. This includes tokenization, pretraining, finetuning, evaluation, inference, and web serving over a simple UI so that you can talk to your own LLM just like ChatGPT. nanochat will become the capstone project of the course LLM101n being developed by Eureka Labs.
Related contents:
This course is intended to provide you with a comprehensive step-by-step understanding of how to engineer optimal prompts within Claude.