ai-agent
🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models
An ML intern that autonomously researches, writes, and ships good quality ML related code using the Hugging Face ecosystem — with deep access to docs, papers, datasets, and cloud compute.
The Agentic Development Environment.
Warp is an agentic development environment, born out of the terminal. Use Warp's built-in coding agent, or bring your own CLI agent (Claude Code, Codex, Gemini CLI, and others).
The Token-Efficient Coding Agent.
Coding Agent singularly focused efficiency and context curation. Reduces API costs by 50-80% vs other agent AND improves the code quality at the same time. Uses Hash Anchored edits, massively parallel operations, AST manipulation and many many other optimizations.
A CLI issue tracker for AI Agents.
A simple, lean issue tracker CLI designed for AI-assisted development. Track tasks across sessions with context preservation.
Related contents:
Build, benchmark, and deploy agents that use computers.
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
An LLM-as-a-judge HTTP proxy to secure agents in production .
Deploy agents. Safely. CrabTrap is an LLM-as-a-judge HTTP proxy to secure agents in production. It intercepts every request your AI agent makes, evaluates it against a policy, and allows or blocks it in real time.
Scan your website to see how ready it is for AI agents. We check multiple emerging standards — from robots.txt and Markdown negotiation to MCP, OAuth, Agent Skills and agentic commerce.
Close the loop with Argent Agentic AI Toolkit for React Native & Swift. An agentic toolkit to control, debug, and profile the iOS Simulator. Made by Software Mansion.
An agentic toolkit to control, debug, and profile the iOS Simulator. Your agent can navigate your app, explore the component tree, inspect network requests, and trace performance problems in your mobile apps.
Like BrowserUse, but for the terminal.
tui-use lets agents interact with programs that expect a human at the keyboard — REPLs, debuggers, TUI apps, and anything else bash can't reach.
AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.
Runtime governance for AI agents — the only toolkit covering all 10 OWASP Agentic risks with 9,500+ tests. Governs what agents do, not just what they say — deterministic policy enforcement, zero-trust identity, execution sandboxing, and SRE — Python · TypeScript · .NET · Rust · Go
Related contents:
Full autonomy. Controlled environment. OS-level containment for AI coding agents on macOS.
macOS containment for AI agents — user isolation, kernel sandbox, pf firewall, DNS blocklist, backup/rollback. TLA+ verified.
AI coding agents are most useful when you let them work autonomously. But full autonomy means the agent runs with your full privileges, your credentials, your files.
Hazmat makes that safe.
Related contents:
A 30-minute talk sharing practical lessons from building real production projects with AI coding agents — from brainstorming to shipped code.
Related contents:
Track AI Code all the way to production
An open-source Git extension for tracking AI code through the entire SDLC. Once installed, it automatically links every AI-written line to the agent, model, and transcripts that generated it — so you never lose the intent, requirements, and architecture decisions behind your code.
an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM.
goose is your on-machine AI agent, capable of automating complex development tasks from start to finish. More than just code suggestions, goose can build entire projects from scratch, write and execute code, debug failures, orchestrate workflows, and interact with external APIs - autonomously.
Related contents:
our Own AI Co-Worker. Its own computer.
Phantom deploys to a dedicated machine, learns from every conversation, and evolves its own capabilities. Open source. Self-hosted or managed.
tmux config with built-in terminal automation and agent-to-agent communication.
Write SQL. Ship confidence. SQL-first tooling for PostgreSQL Type-safe PostgreSQL client code generator.
pGenie validates SQL, manages indexes, and generates type-safe client SDKs — all derived from the migrations and queries in plain SQL.
Your AI agent configs, skills, and instructions on every device. One config repo. Every agent. Every machine.
You've spent hours perfecting your CLAUDE.md, building custom skills, tuning your settings. Then you open your laptop and none of it is there. Or you switch from Claude Code to Codex and start from scratch. agents-anywhere keeps your agent setup in a git repo and symlinks it to every agent on every machine.
Rewrite of Claude Code
The fastest repo in history to surpass 50K stars ⭐, reaching the milestone in just 2 hours after publication. Better Harness Tools that make real things done. Now writing in Rust using oh-my-codex.
Related contents:
Go hard on agents, not on your filesystem. easy containment for AI agents.
Use jai for effortless containment of AI agents on Linux. jai strives to be the easiest container in the world to configure--so easy that you never again need to run a code assistant without protection. It's not a substitute for docker or podman when you need better isolation. But if you regularly do risky things like run an AI CLI with your own privileges in your home directory on a computer that you care about, then jai could reduce the damage when things go wrong.
Let agents test your code in a real browser.
One command scans your unstaged changes or branch diff, then generates a test plan, and runs it against a live browser.
Expect reads your unstaged changes or branch diff, sends them to an AI agent (Claude Code or Codex CLI), and generates a step-by-step test plan describing how to validate the changes. You review and approve the plan in an interactive TUI, then the agent executes each step against a live browser - using your real login sessions so there's no manual auth setup. Every session is recorded so you can replay exactly what happened.
A powerful meta-prompting, context engineering and spec-driven development system that enables agents to work for long periods of time autonomously without losing track of the big picture
Run agentic coding workflows in a fully native desktop app for Git worktrees, terminals, and diffs. Fully Native App for Agentic Coding. Run issue-driven coding workflows in one native workspace.
Arbor is a Rust-powered native workspace with a shared daemon for the desktop app, web UI, CLI, and MCP server. Create worktrees from issues, run embedded terminals and managed processes, inspect PR context, and keep coding agents visible without juggling tools.
Workflow orchestration for AI coding agents, from task to merged PR.
Optio turns coding tasks into merged pull requests — without human babysitting. Submit a task (manually, from a GitHub Issue, or from Linear), and Optio handles the rest: provisions an isolated environment, runs an AI agent, opens a PR, monitors CI, triggers code review, auto-fixes failures, and merges when everything passes.
An open standard for shared agent learning. Agents persist, share, and query collective knowledge so they stop rediscovering the same failures independently.
cq is derived from colloquy (/ˈkɒl.ə.kwi/), a structured exchange of ideas where understanding emerges through dialogue rather than one-way output. It reflects a focus on reciprocal knowledge sharing; systems that improve through participation, not passive use. In radio, CQ is a general call ("any station, respond"), capturing the same model: open invitation, response, and collective signal built through interaction.
Related contents:
AI Coding Agent, Terminal, IDE.
Work with Claude directly in your codebase. Build, debug, and ship from your terminal, IDE, Slack, or the web. Describe what you need, and Claude handles the rest.
Related contents:
- Claude Code Cheat Sheet.
- Claude Code Essentials @ freeCodeCamp.org's YouTube.
- How I'm Productive with Claude Code @ Neil Kakkar.
- Auto mode for Claude Code @ Simon Willison's Weblog.
- Inside the Claude Code source @ Haseeb Qureshi's GitHub Gist.
- Entire Claude Code CLI source code leaks thanks to exposed map file @ Ars Technica.
- Reading leaked Claude Code source code @ Vita Nouva.
- The Claude Code Source Leak: fake tools, frustration regexes, undercover mode, and more @ Alex Kim's blog.
- What's cch? Reverse Engineering Claude Code's Request Signing.
- How Claude Code Builds a System Prompt @ dbreunig.com.
- Fuite Claude Code - 6 trucs à piquer pour vos hooks @ Korben :fr:.
- Leveling Up Secure Code Reviews with Claude Code @ SpecterOps.
agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.
Related contents:
Serena is a powerful coding agent toolkit capable of turning an LLM into a fully-featured agent that works directly on your codebase. Unlike most other tools, it is not tied to an LLM, framework or an interface, making it easy to use it in a variety of ways.
Serena provides essential semantic code retrieval and editing tools that are akin to an IDE’s capabilities, extracting code entities at the symbol level and exploiting relational structure. When combined with an existing coding agent, these tools greatly enhance (token) efficiency.
Serena is free & open-source, enhancing the capabilities of LLMs you already have access to free of charge.
Related contents:
Full computer-use for AI agents. Self-learning workflows. Native macOS. No screenshots required.
Your AI agent can write code, run tests, search files. But it can't click a button, send an email, or fill out a form. It lives inside a chat box.
Ghost OS changes that. One install, and any AI agent can see and operate every app on your Mac.
OpenShell is the safe, private runtime for autonomous AI agents.
NVIDIA OpenShell is the safe, private runtime for autonomous AI agents. It provides sandboxed execution environments that protect your data, credentials, and infrastructure. Agents run with exactly the permissions they need and nothing more, governed by declarative policies that prevent unauthorized file access, data exfiltration, and uncontrolled network activity.
An Open-Source Asynchronous Coding Agent. Open-source framework for building your org's internal coding agent.
Elite engineering orgs like Stripe, Ramp, and Coinbase are building their own internal coding agents — Slackbots, CLIs, and web apps that meet engineers where they already work. These agents are connected to internal systems with the right context, permissioning, and safety boundaries to operate with minimal human oversight.
Open SWE is the open-source version of this pattern. Built on LangGraph and Deep Agents, it gives you the same architecture those companies built internally: cloud sandboxes, Slack and Linear invocation, subagent orchestration, and automatic PR creation — ready to customize for your own codebase and workflows.
Related contents:
your repository becomes your agent. The Open Standard for Git-Native AI Agents. The open standard for defining, versioning, and running AI agents natively in git.
A git-native, framework-agnostic, open standard for defining AI agents. Version-controlled config that exports to Claude Code, OpenClaw, Lyzr Agent, Chimera, NanoBot, CrewAgent, and Agents SDK.
Run a team of coding agents on your Mac.
Create parallel Codex + Claude Code agents in isolated workspaces. See at a glance what they're working on, then review and merge their changes.
Related contents:
Open-source EDR for AI agents. Monitor processes, files, network, and behavior of autonomous AI agents.
Aegis is an open-source endpoint detection and response (EDR) tool that monitors AI agent processes, file access, network activity, and behavioral anomalies in real time. Built with Electron 33, Svelte 5, and TypeScript, it provides the same class of oversight for autonomous AI agents that CrowdStrike provides for traditional endpoints. No telemetry. No cloud. Everything stays local.
The Backend Built forAgentic Development
Give agents everything they need to ship fullstack apps.
InsForge is a backend development platform built for AI coding agents and AI code editors. It exposes backend primitives like databases, auth, storage, and functions through a semantic layer that agents can understand, reason about, and operate end to end.
Watch once. Then take the stage. A Teachable Desktop Agent.
Understudy is a teachable desktop agent. It operates your computer like a human colleague across GUI, browser, shell, files, and messaging. You demonstrate a task once, it learns the intent, remembers successful paths, and gradually upgrades to faster execution routes.
Give your agents access, not your secrets. The open-source secret vault for AI agents. Store once. Inject anywhere. Agents never see the keys.
Open-source credential vault. Your agents call services and never see a key.
OneCLI is an open-source gateway that sits between your AI agents and the services they call. Instead of baking API keys into every agent, you store credentials once in OneCLI and the gateway injects them transparently. Agents never see the secrets.
Symphony turns project work into isolated, autonomous implementation runs, allowing teams to manage work instead of supervising coding agents.
Related contents:
An agent that grows with you.
Install it on a machine, give it your messaging accounts, and it becomes a persistent personal agent that grows with you — learning your projects, building its own skills, and reaching you wherever you are.
The self-improving AI agent built by Nous Research. It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Related contents:
Contracts before code. Tests as law. Agents that can't cheat.
Pact is a multi-agent software engineering framework where the architecture is decided before a single line of implementation is written. Tasks are decomposed into components, each component gets a typed interface contract, and each contract gets executable tests. Only then do agents implement -- independently, in parallel, even competitively -- with no way to ship code that doesn't honor its contract. Generates Python, TypeScript, or JavaScript.
Touch-to-grab context tool for React Native UI changes.
Bridge the context gap: point at the exact native UI element, capture precise source context, and hand it to your coding agent without guesswork.
Let your AI go full send. Your home directory stays home.
Run Claude Code, Codex, or any AI coding agent in "yolo mode" without nuking your home directory.
Related contents:
A project to reject AI agents via AGENTS.md. This project provides the context and instructions to AI agents that their presence is unwelcome.
Related contents:
How to design, build, and operate AI agents for infrastructure teams — safely. 13 chapters covering architecture, sandboxing, credentials, change control, observability, and more.
AI agents can write IaC, fix compliance findings, detect drift, review PRs, and respond to incidents — all autonomously. But autonomy without guardrails is a liability. Agents that can terraform apply can also terraform destroy. Agents that read configs can leak secrets. Agents that loop can burn budgets.
This guide covers every architectural decision you need to make when building infrastructure agents — with real patterns, code snippets, multiple alternatives, and the risk framework to evaluate your choices.
Memory Management Kit for Agents - Remember Me, Refine Me.
🧠 ReMe is a memory management framework built for AI agents, offering both file-based and vector-based memory systems.
It addresses two core problems of agent memory: limited context windows (early information gets truncated or lost during long conversations) and stateless sessions (new conversations cannot inherit history and always start from scratch).
ReMe gives agents real memory — old conversations are automatically condensed, important information is persisted, and the next conversation can recall it automatically.
Security proxy for AI agents. Scans every message for prompt injection, PII, and secrets. Defense-in-depth: Go proxy + iptables firewall + eBPF kernel monitor. YAML policy engine, audit logging, 5 AI agents with RAG knowledge bases.
Security proxy for AI agents. Sits in front of OpenClaw and scans every message for prompt injection, PII leaks, and secrets — before they reach the model or leave the network.
The Terminal for Coding Agents.
IDE for the AI Agents Era - Run an army of Claude Code, Codex, etc. on your machine. Superset is a turbocharged terminal that allows you to run any CLI coding agents along with the tools to 10x your development workflow.
Memory for Proactive 24/7 Agents.
MemU powers autonomous AI agents with persistent, evolving memory. Continuously predict user intentions, act proactively, and work for you — even while you sleep.
Minimal CLI coding agent by Mistral.
Mistral Vibe is a command-line coding assistant powered by Mistral's models. It provides a conversational interface to your codebase, allowing you to use natural language to explore, modify, and interact with your projects through a powerful set of tools.
CLI Code Agent Orchestrator. Autonomous AI agent orchestrator powered by Claude Code CLI.
OpenSwarm orchestrates multiple Claude Code instances as autonomous agents. It picks up Linear issues, runs Worker/Reviewer pair pipelines to produce code changes, reports progress to Discord, and retains long-term memory via LanceDB vector embeddings.
Open-source task management for the agentic era. The command center for solo entrepreneurs who delegate work to AI agents.
Open-source task management for the agentic era. The command center for solo entrepreneurs who delegate work to AI agents.
Mission Control gives your AI agents structure. Agents get roles, inboxes, and reporting protocols. You delegate work through a visual dashboard, they execute and report back. You stay in control without micromanaging.
A sandboxed bash interpreter for AI agents. Pure TypeScript with in-memory filesystem.
A simulated bash environment with an in-memory virtual filesystem, written in TypeScript. Designed for AI agents that need a secure, sandboxed bash environment. Supports optional network access via curl with secure-by-default URL filtering.
Game character voice lines + visual overlay notifications when your AI coding agent needs attention — or let the agent pick its own sound via MCP.
AI coding agents don't notify you when they finish or need permission. You tab away, lose focus, and waste 15 minutes getting back into flow. peon-ping fixes this with voice lines and bold on-screen banners from Warcraft, StarCraft, Portal, Zelda, and more — works with Claude Code, GitHub Copilot, Codex, Cursor, OpenCode, Kilo CLI, Kiro, Windsurf, Google Antigravity, and any MCP client.
Related contents:
Let coding agents diagnose and fix your React code.
One command scans your codebase for security, performance, correctness, and architecture issues, then outputs a 0–100 score with actionable diagnostics.
Continuous, non-invasive background code review for agents, to work better and faster. With TUI and CLI support.
Continuous code review for coding agents. Review commits immediately, catch issues early, and fix them while context is fresh.
Related contents:
The SQLite for AI memory. One file. Full RAG. Zero infrastructure.
🍯 Memory layer for on-device AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer.
kubectl for AI Agents. Enterprise AI agent orchestration. Manage, monitor, and scale your AI workforce.
Build your agent team in OpenClaw with one command.
You don't need to hire a dev team. You need to define one. Antfarm gives you a team of specialized AI agents — planner, developer, verifier, tester, reviewer — that work together in reliable, repeatable workflows. One install. Zero infrastructure.
Related contents:
Coding agents, visible to your team. Collaboration in the age of agentic engineering .
Open-source and self-hostable. Track sessions, share prompts, and link every conversation to the commit it produced.
AgentLogs captures and analyzes transcripts from AI coding agents (like Claude Code, Codex, OpenCode, and Pi) to give your team visibility into how AI tools are used in their codebases.
Related contents: