observability
A scalable, fault-tolerant, and low-latency storage service optimized for real-time append-only workloads.
Related contents:
Telemetry Harbor OSS is the open-source ingestion and visualization stack behind Telemetry Harbor. Self-host your own telemetry backend with full control over your data and infrastructure.
Open Source Continuous Profiling Platform. Debug performance issues down to a single line of code.
Grafana Pyroscope is a continuous profiling platform designed to surface performance insights from your applications, helping you optimize resource usage such as CPU, memory, and I/O operations. With Pyroscope, you can both proactively and reactively address performance bottlenecks across your system.
High Performance, Resource Efficient OpenTelemetry Collection.
Rotel provides an efficient, high-performance solution for collecting, processing, and exporting telemetry data. Rotel is ideal for resource-constrained environments and applications where minimizing overhead is critical.
All-in-One Observability Platform.
Coroot is an open-source APM & Observability tool, a Datadog and New Relic alternative. Metrics, logs, traces, continuous profiling, and SLO-based alerting, supercharged with predefined dashboards and inspections.
Cloud native networking and network security.
Calico is a single platform for networking, network security, and observability for any Kubernetes distribution in the cloud, on-premises, or at the edge. Whether you're just starting with Kubernetes or operating at scale, Calico's open source, enterprise, and cloud editions provide the networking, security, and observability you need.
A Prometheus exporter for PHP-FPM.
The exporter connects directly to PHP-FPM and exports the metrics via HTTP.
eBPF-based Security Observability and Runtime Enforcement.
Tetragon is a flexible Kubernetes-aware security observability and runtime enforcement tool that applies policy and filtering directly with eBPF, allowing for reduced observation overhead, tracking of any process, and real-time enforcement of policies.
The Single Database for Big Observability. Fast, Efficient, Single Database for Real-Time Observability. The real-time, cloud-native observability database for metrics, logs, and traces, providing sub-second insights from edge to cloud—at any scale.
Scalable, Open Source, Logs DB & Logging Solution.
Transform AI Prototypes into Enterprise-Grade Products.
Langtrace is an Open Source Observability and Evaluations Platform for AI Agents.
Get your app Performance score. Flashlight is a Lighthouse-like tool for mobile apps. No installation required.
📱⚡️ Lighthouse for Mobile - audits your app and gives a performance score to your Android apps (native, React Native, Flutter..). Measure performance on CLI, E2E tests, CI...
Dynamic Tracing for Linux.
bpftrace is a high-level tracing language for Linux and provides a quick and easy way for people to write observability-based eBPF programs, especially those unfamiliar with the complexities of eBPF.
Dynamically program the kernel for efficient networking, observability, tracing, and security.
eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring changes to kernel source code or loading kernel modules.
VictoriaLogs is an open source, user-friendly database for logs from VictoriaMetrics.
Scraparr is a Prometheus Exporter for various components of the *arr Suite.
Self-hosted Error Tracking.
Bugsink offers real-time error tracking for your applications with full control through self-hosting.
Parseable is a diskless, cloud native database for logs, observability, security, and compliance. Parseable is built with a focus on simplicity & resource efficiency.
Dashboards for DevOps.
Visualize cloud configurations. Assess security posture against a massive library of benchmarks. Build custom dashboards with code.
OpenTelemetry-native GenAI and LLM Application Observability.
Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. 🚀💻 Integrates with 50+ LLM Providers, VectorDBs, Agent Frameworks and GPUs.
Open-source observability for your LLM application, based on OpenTelemetry.
OpenLLMetry is a set of extensions built on top of OpenTelemetry that gives you complete observability over your LLM application. Because it uses OpenTelemetry under the hood, it can be connected to your existing observability solutions - Datadog, Honeycomb, and others.
A distributed tracing system.
Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in service architectures. Features include both the collection and lookup of this data.
If you have a trace ID in a log file, you can jump directly to it. Otherwise, you can query based on attributes such as service, operation name, tags and duration. Some interesting data will be summarized for you, such as the percentage of time spent in a service, and whether or not operations failed.
🔥 Airbroke: Lightweight, Airbrake-compatible, PostgreSQL-based Open Source Error Catcher. Self-hosted, Cost-effective, Open Source Error Tracking for a Sustainable Startup Journey.
Open Source Metrics Engine. Distributed TSDB and Query Engine, Prometheus Sidecar, Metrics Aggregator, and more such as Graphite storage and query engine.
M3 is a Prometheus compatible, easy to adopt metrics engine that provides visibility for some of the world’s largest brands.
Empower your testing with AI & usage insights.
Gravity monitors real-world user behaviors and usage patterns in live production and test environments to generate quality analytics, identify test coverage gaps, and assist in prioritizing and generating test cases.
App Monitoring, Error Tracking & Real User Monitoring. Application insights your developers need without the noise. Data means nothing without context. Get the full picture with secure, scalable error tracking and performance monitoring.
This is an OpenTelemetry auto-instrumentation package for Symfony framework applications.
OpenTelemetry Tail Sampling Configuration UI.
OTail is a user-friendly web interface for creating and managing OpenTelemetry tail sampling processor configurations. It provides a visual way to configure complex sampling policies without having to write YAML directly.
Tools to measure and visualize energy use on desktop computers.
Kubernetes Monitoring, Application Debug Platform. Instant Kubernetes-Native Application Observability.
Pixie is an open-source observability tool for Kubernetes applications. Use Pixie to view the high-level state of your cluster (service maps, cluster resources, application traffic) and also drill down into more detailed views (pod state, flame graphs, individual full-body application requests).
Batteries included UI to monitor your Messenger workers, transports, schedules, and messages.
Prometheus exporter for AWS CloudWatch - Discovers services through AWS tags, gets CloudWatch metrics data and provides them as Prometheus metrics with AWS tags as labels.
PoWA is a PostgreSQL Workload Analyzer that gathers performance stats and provides real-time charts and graphs to help monitor and tune your PostgreSQL servers.
pg_activity is a top-like application for PostgreSQL server activity monitoring.
PostgreSQL Remote Control.
temBoard is a powerful management tool for PostgreSQL. It lets you observe, optimize, or configure PostgreSQL instances.
eks-node-viewer is a tool for visualizing dynamic node usage within a cluster. It was originally developed as an internal tool at AWS for demonstrating consolidation with Karpenter. It displays the scheduled pod resource requests vs the allocatable capacity on the node. It does not look at the actual pod resource usage.
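The distinction between requests and actual usage can be illustrated with a little arithmetic. The following is a minimal sketch, not eks-node-viewer's code: the `Pod` type, the `requested_fraction` function, and the millicore figures are all hypothetical names and values chosen for illustration. It sums the CPU requests of the pods scheduled on a node and divides by the node's allocatable CPU, ignoring what the pods really consume.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Pod:
    """Hypothetical stand-in for a scheduled pod; only the CPU request matters here."""
    name: str
    cpu_request_millicores: int


def requested_fraction(pods: Iterable[Pod], allocatable_millicores: int) -> float:
    """Return the share of a node's allocatable CPU claimed by pod requests.

    Actual pod usage is deliberately not part of the calculation.
    """
    if allocatable_millicores <= 0:
        raise ValueError("allocatable capacity must be positive")
    total_requested = sum(p.cpu_request_millicores for p in pods)
    return total_requested / allocatable_millicores


if __name__ == "__main__":
    pods = [Pod("web", 500), Pod("worker", 250)]
    # 750 requested out of 4000 allocatable millicores
    print(f"{requested_fraction(pods, 4000):.1%}")
```

A node can therefore show as nearly full by requests while its real CPU usage is low, or the reverse.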
Data and AI reliability. Delivered.
Data breaks. Monte Carlo ensures your team is the first to know and solve with end-to-end data observability.
Status Page On Demand. ⛑ Automated developer-oriented status page. The automated status page that you deserve.
If your infrastructure went down right now, how long would it take for you to know?
Gatus is a developer-oriented health dashboard that gives you the ability to monitor your services using HTTP, ICMP, TCP, and even DNS queries as well as evaluate the result of said queries by using a list of conditions on values like the status code, the response time, the certificate expiration, the body and many others. The icing on top is that each of these health checks can be paired with alerting via Slack, Teams, PagerDuty, Discord, Twilio and many more.
Cloud-native orchestration of data pipelines. Ship data pipelines with extraordinary velocity. An orchestration platform for the development, production, and observation of data assets.
Dagster is a cloud-native data pipeline orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability.
It is designed for developing and maintaining data assets, such as tables, data sets, machine learning models, and reports.
An 'Observe and Report Buddy' for your SRE toolbox.
Green Orb is a lightweight monitoring tool that enhances your application's reliability by observing its console output for specific patterns and executing predefined actions in response. Designed to integrate seamlessly, it's deployed as a single executable binary that runs your application as a subprocess, where it can monitor all console output, making it particularly useful in containerized environments. Green Orb acts as a proactive assistant, handling essential monitoring tasks and enabling SREs to automate responses to critical system events effectively.
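The general pattern described here (run the application as a subprocess, watch its console output, react to matching lines) can be sketched in a few lines of Python. This is an illustration of the idea under stated assumptions, not Green Orb's implementation or configuration format: the `watch` function and the `(pattern, action)` rule shape are hypothetical.

```python
import re
import subprocess
import sys
from typing import Callable, List, Sequence, Tuple

# A rule pairs a regular expression with a callback (hypothetical shape).
Rule = Tuple[str, Callable[[str, "re.Match[str]"], None]]


def watch(command: Sequence[str], rules: List[Rule]) -> int:
    """Run `command` as a subprocess and fire an action for each matching output line.

    stderr is merged into stdout so every console line is observed.
    Returns the subprocess exit code.
    """
    compiled = [(re.compile(pattern), action) for pattern, action in rules]
    proc = subprocess.Popen(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    assert proc.stdout is not None  # guaranteed because stdout=PIPE
    for raw in proc.stdout:
        line = raw.rstrip("\n")
        for pattern, action in compiled:
            match = pattern.search(line)
            if match:
                action(line, match)
    return proc.wait()


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('starting'); print('ERROR: disk full')"]
    watch(child, [(r"ERROR: (.+)", lambda line, m: print("alert:", m.group(1)))])
```

Because the watcher owns the child process, it sees all output without any logging agent, which is why this approach suits containers.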
Manage your Observability Systems. Command Line utility for managing Grafana Resources.
Software engineers know how to version and deploy their resources. Tools like Git or CI enable reliable workflows that track changes, with meaningful review processes giving confidence in the expected outcomes. Now, with Grizzly, you can have all this with Grafana resources, dashboards, datasources and more.
Network Analysis & Packet Capture. It's amazing what you discover when you start looking.
Arkime is an open source, large scale, full packet capturing, indexing, and database system.
Open-Source ML Monitoring and LLM Observability.
Open-source evaluation and observability for ML and LLM systems. Evaluate, test, and monitor AI-powered systems. From tabular data to LLMs. Built for data scientists, AI, and ML engineers.
An open source, real-time monitoring tool with custom monitoring and agentless collection.
Apache HertzBeat is a real-time monitoring system with agentless collection, a high-performance cluster mode, Prometheus compatibility, custom monitoring, and status page building capabilities.
Intelligent Prompt Gateway.
Arch is an intelligent prompt gateway. Engineered with (fast) LLMs for the secure handling, robust observability, and seamless integration of prompts with APIs - all outside business logic. Built by the core contributors of Envoy proxy, on Envoy.
Arch is an intelligent Layer 7 gateway designed to protect, observe, and personalize LLM applications (agents, assistants, co-pilots) with your APIs.
OpenClarity is an open source platform to enhance security and observability of cloud native applications and infrastructure.
OpenClarity is an open source tool for agentless detection and management of Virtual Machine Software Bill Of Materials (SBOM) and security threats such as vulnerabilities, exploits, malware, rootkits, misconfigurations and leaked secrets.
Takes alerts from Prometheus Alertmanager, and shows them on a webpage for heads up displays. No-Nonsense.
APM for Ruby, Elixir, Node.js & Python. No-brainer monitoring for smart developers. Application Monitoring for Ruby on Rails, Elixir, Node.js & Python.
🖧🔍 WIFI / LAN intruder detector. Scans for devices connected to your network and alerts you if new and unknown devices are found.
Get visibility of what's going on on your WIFI/LAN network. Schedule scans for devices and port changes, and get alerts if unknown devices or changes are found. Write your own plugins with auto-generated UI and a built-in notification system. Build out and easily maintain your network source of truth (NSoT).
An asynchronous Prometheus exporter for WireGuard.
wireguard_exporter runs wg show [..] and scrapes the output to build Prometheus metrics.
An asynchronous Prometheus exporter for iptables.
iptables_exporter runs one of several backend "scrape targets" such as iptables-save --counter and scrapes the output to build Prometheus metrics.
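To illustrate the "run a command, scrape its output, emit metrics" approach, here is a minimal sketch that turns counter-prefixed rule lines of the form `[packets:bytes] -A CHAIN ...` into Prometheus text exposition lines. It is not the exporter's code: the metric names `example_iptables_rule_packets_total` and `example_iptables_rule_bytes_total`, the label layout, and the sample input are assumptions made for this example, and the real exporter's output may differ.

```python
import re

# Matches rule lines such as "[10:840] -A INPUT -i lo -j ACCEPT".
RULE_RE = re.compile(r"^\[(\d+):(\d+)\] -A (\S+) (.*)$")


def escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline for a Prometheus label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(dump: str) -> str:
    """Convert counter-prefixed rule lines into Prometheus text-format samples.

    Lines that are not rules (table headers, chain policies, COMMIT) are skipped.
    Metric names are hypothetical.
    """
    out = []
    for raw in dump.splitlines():
        match = RULE_RE.match(raw.strip())
        if not match:
            continue
        packets, byte_count, chain, rule = match.groups()
        labels = f'chain="{escape_label(chain)}",rule="{escape_label(rule)}"'
        out.append(f"example_iptables_rule_packets_total{{{labels}}} {packets}")
        out.append(f"example_iptables_rule_bytes_total{{{labels}}} {byte_count}")
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    sample = "*filter\n:INPUT ACCEPT [0:0]\n[10:840] -A INPUT -i lo -j ACCEPT\nCOMMIT\n"
    print(render_metrics(sample), end="")
```

A real exporter would obtain `dump` by executing the backend command and would serve the rendered text over HTTP for Prometheus to scrape.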
OpenTelemetry command-line tool for sending events from shell scripts & similar environments.
otel-cli is a command-line tool for sending OpenTelemetry traces. It is written in Go and intended to be used in shell scripts and other places where the best option available for sending spans is executing another program.
Low-code log management solution.
FlowG is a log management platform that lets you ingest, transform, and query logs using a visual pipeline builder. It handles structured logs without requiring predefined schemas and relies on BadgerDB as its storage backend.
Lightweight network IP scanner. Can be used to notify about new hosts and monitor host online/offline history.
Simple, open source error tracking. Open source error, performance, and uptime monitoring.
Collect every error from your project in real time, organize them to make them useful, and receive alerts when and where you want...without breaking the budget.
Like Prometheus, but for logs.
Grafana Loki is a set of open source components that can be composed into a fully featured logging stack. A small index and highly compressed chunks simplify the operation and significantly lower the cost of Loki.
Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.
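The idea of indexing only labels, not log contents, can be shown with a toy in-memory store. This is a conceptual sketch and not Loki's storage engine or query API: the `LabelIndexedStore` class and its `push` and `query` methods are hypothetical. Streams are keyed by their label set, a query first selects streams by label, and only then scans the selected lines for a substring.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

LabelSet = FrozenSet[Tuple[str, str]]


class LabelIndexedStore:
    """Toy log store: the index holds label sets only, never line contents."""

    def __init__(self) -> None:
        # label set -> lines of that stream, in arrival order
        self._streams: Dict[LabelSet, List[str]] = defaultdict(list)

    def push(self, labels: Dict[str, str], line: str) -> None:
        """Append a line to the stream identified by its labels."""
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector: Dict[str, str], contains: str = "") -> List[str]:
        """Select streams whose labels include `selector`, then filter lines by substring."""
        wanted = set(selector.items())
        results: List[str] = []
        for stream_labels, lines in self._streams.items():
            if wanted <= stream_labels:
                # Content filtering is a scan over the selected streams only.
                results.extend(l for l in lines if contains in l)
        return results


if __name__ == "__main__":
    store = LabelIndexedStore()
    store.push({"app": "api", "env": "prod"}, "GET /health 200")
    store.push({"app": "api", "env": "prod"}, "GET /orders 500 timeout")
    store.push({"app": "db", "env": "prod"}, "checkpoint complete")
    print(store.query({"app": "api"}, contains="500"))
```

The index stays small because it grows with the number of distinct label sets rather than with log volume; the trade-off is that content searches must scan the selected streams.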
MQTT Web Interface is an open-source web application that provides a real-time visualization of MQTT (Message Queuing Telemetry Transport) message flows. It allows users to monitor MQTT topics, publish messages, and view message statistics through an intuitive web interface.
Affordable full-stack production debugging & monitoring. Resolve Production Issues, Fast. An Open Source Observability Platform: Unify Session Replays, Logs, Traces, Metrics and Errors – All Without the Datadog Price Tag.
Resolve production issues, fast. An open source observability platform unifying session replays, logs, metrics, traces and errors powered by ClickHouse and OpenTelemetry.
Cabourotte verifies if your infrastructure is healthy.
Cabourotte is a tool that can be configured to execute health checks on your infrastructure. You can use it as a standalone tool but you can also integrate it with the Appclacks server to manage health checks configuration globally and with good tooling (CLI, Terraform provider, Kubernetes operator…).
Observability tools for modern infrastructures.
Centralized Blackbox health checks configuration and a Prometheus push gateway alternative.