Transform AI Prototypes into Enterprise-Grade Products.
Langtrace is an Open Source Observability and Evaluations Platform for AI Agents.
Balance agent control with agency. Build resilient language agents as graphs.
Gain control with LangGraph to design agents that reliably handle complex tasks. Build and scale agentic applications with LangGraph Platform.
LangGraph, used by Replit, Uber, LinkedIn, GitLab and more, is a low-level orchestration framework for building controllable agents. While LangChain provides integrations and composable components to streamline LLM application development, the LangGraph library enables agent orchestration, offering customizable architectures, long-term memory, and human-in-the-loop support to reliably handle complex tasks.
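To make the "agents as graphs" idea concrete, here is a minimal sketch in plain Python. It does not use LangGraph's API: `TinyGraph`, `END`, `draft`, and `review` are hypothetical names invented for illustration. Each node transforms a shared state, and a router function picks the next node, so the agent can loop back until a check passes.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
END = "END"  # hypothetical sentinel marking the end of the graph


class TinyGraph:
    """Toy state graph: nodes transform state, routers pick the next node."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[State], State]] = {}
        self.routers: Dict[str, Callable[[State], str]] = {}

    def add_node(
        self,
        name: str,
        fn: Callable[[State], State],
        router: Callable[[State], str],
    ) -> None:
        self.nodes[name] = fn
        self.routers[name] = router

    def run(self, start: str, state: State, max_steps: int = 20) -> State:
        current = start
        for _ in range(max_steps):
            if current == END:
                return state
            state = self.nodes[current](state)
            current = self.routers[current](state)
        raise RuntimeError("step limit reached before END")


def draft(state: State) -> State:
    # Stand-in for an LLM call that produces a new attempt.
    attempts = state["attempts"] + 1
    return {**state, "attempts": attempts, "answer": f"draft {attempts}"}


def review(state: State) -> State:
    # Stand-in for a quality check: accept from the second attempt on.
    return {**state, "approved": state["attempts"] >= 2}


graph = TinyGraph()
graph.add_node("draft", draft, lambda s: "review")
graph.add_node("review", review, lambda s: END if s["approved"] else "draft")

result = graph.run("draft", {"attempts": 0, "approved": False})
```

The `max_steps` guard is one simple way to keep a cyclic graph from looping forever; a real framework would offer richer controls.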
JobSet: a k8s native API for distributed ML training and HPC workloads
JobSet is a Kubernetes-native API for managing a group of k8s Jobs as a unit. It aims to offer a unified API for deploying HPC (e.g., MPI) and AI/ML training workloads (PyTorch, JAX, TensorFlow, etc.) on Kubernetes.
Bringing Agentic AI to cloud native.
An open-source framework for DevOps and platform engineers to run AI agents in Kubernetes, automating complex operations and troubleshooting tasks.
An in-depth book and reference on building agentic systems like Claude Code.
A deep-dive guide into architecture patterns for building responsive, reliable AI coding agents.
Many people have asked how Claude Code works under the hood. Usually, people see the prompts, but they don't see how it all comes together. This is that book: all of the systems, tools, and commands that go into building one of these.
A practical deep dive and code review into how to build a self-driving coding agent, execution engine, tools, and commands. Rather than the prompts and AI engineering, it covers the systems and design decisions that go into making agents that are real-time, self-corrective, and useful for productive work.
Go beyond nascent AI demos. The intelligent AI-native gateway for prompts and agentic apps.
Effortlessly build AI apps that can answer questions and help users get things done. Arch is the AI-native proxy that handles the pesky heavy lifting so that you can move faster in building agentic apps, prevent harmful outcomes, and rapidly incorporate the latest models.
AI-native (edge and LLM) proxy for agents. Move faster by letting Arch handle the pesky heavy lifting in building agentic apps -- query understanding and routing, seamless integration of prompts with tools, and unified access and observability of LLMs. Built by the contributors of Envoy proxy.
A curated list of resources for AI Engineering.
The Open-Source LLM Evaluation Framework.
DeepEval is a simple-to-use, open-source LLM evaluation framework for evaluating and testing large language model systems. It is similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., which use LLMs and various other NLP models that run locally on your machine for evaluation.
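The idea of unit testing LLM outputs in a Pytest style can be sketched without any framework. The example below is not DeepEval's API: `keyword_relevancy` is a deliberately crude, hypothetical metric and `fake_llm` is a stand-in for a real model call. It shows only the pattern of scoring an output and asserting that the score clears a threshold.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_relevancy(question: str, answer: str) -> float:
    """Toy metric: fraction of question tokens that also appear in the answer."""
    q = tokens(question)
    if not q:
        return 0.0
    return len(q & tokens(answer)) / len(q)


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call.
    return "The refund policy allows returns within 30 days."


def test_refund_answer_is_relevant() -> None:
    question = "What is the refund policy?"
    answer = fake_llm(question)
    # Fail the test when the score drops below the chosen threshold.
    assert keyword_relevancy(question, answer) >= 0.5
```

Saved in a file named like `test_llm.py`, this runs under `pytest` like any other test. Real evaluation frameworks replace the toy metric with research-backed ones such as those listed above.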
Simple, secure, and reproducible packaging for AI/ML projects.
KitOps is an open source DevOps tool that packages and versions your AI/ML model, datasets, code, and configuration into a reproducible artifact called a ModelKit. ModelKits are built on existing standards, ensuring compatibility with the tools your data scientists and developers already use.
Secure & reliable LLMs.
Test & secure your LLM apps.
Open-source LLM testing used by 51,000+ developers.
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
llamafile lets you distribute and run LLMs with a single file.
Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.
Enable AI to control your browser. Make websites accessible for AI agents.
We make websites accessible for AI agents by extracting all interactive elements, so agents can focus on what makes their beer taste better.
Open Source LLM Engineering Platform.
Traces, evals, prompt management and metrics to debug and improve your LLM application.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, LangChain, OpenAI SDK, LiteLLM, and more. YC W23
Prompt Engineering, Evaluation, and Observability for LLM apps.
Your Collaborative Open Source End-to-End LLM Engineering Platform.
Agenta provides integrated tools for prompt engineering, versioning, evaluation, and observability—all in one place.
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
Ship AI features in minutes.
Pezzo enables you to build, test, monitor and instantly ship AI all in one platform, while constantly optimizing for cost and performance.
Open-source, developer-first LLMOps platform designed to streamline prompt design, version management, instant delivery, collaboration, troubleshooting, observability and more.