HypStack: An Open Observability Stack for AI
HypStack is the open, three-layer stack that Hyperparam is designed to plug into. It collects logs from your AI surfaces with OpenTelemetry, stores them as Apache Iceberg tables in your own object storage, and analyzes them in the browser with Hyperparam. Every layer is open source and replaceable, including Hyperparam itself.
If you only want to use Hyperparam against existing files or buckets, you can skip this page. If you are setting up AI observability for an organization and asking "where do these logs go and who owns them," this is the architecture we recommend.
Why a New Stack
Every company already has a data stack. Datadog and Splunk for ops. Snowflake, Databricks, BigQuery, Redshift for analytics. None of it was designed for AI workloads.
Two things have changed at once:
- A new class of applications. Codex, Claude Code, Cursor, Claude Desktop, Copilot, plus every product team adding an LLM to their own app. These run on the client (terminal, IDE, desktop, browser), not behind your load balancer.
- A new shape of data. A single agent turn is a structured document with prompt, retrieved context, tool calls, tool responses, completion, and token counts. A real session is hundreds of those. That is megabytes of structured text per user per hour, not 200-byte access log lines.
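The data-shape claim is easy to sanity-check. A back-of-the-envelope sketch, where the turn below and the per-hour rate are invented illustrations, not measurements:

```python
import json

# A single (invented) agent turn: prompt, retrieved context, tool call,
# tool response, completion, token counts -- one structured document.
turn = {
    "prompt": "x" * 2_000,                    # user + system prompt text
    "retrieved_context": ["y" * 4_000] * 3,   # RAG chunks
    "tool_calls": [{"name": "read_file",
                    "input": {"path": "src/app.py"},
                    "output": "z" * 8_000}],
    "completion": "w" * 1_500,
    "usage": {"input_tokens": 5200, "output_tokens": 400},
}

turn_bytes = len(json.dumps(turn).encode())
turns_per_hour = 300                          # assumption: a busy agent session
mb_per_user_hour = turn_bytes * turns_per_hour / 1e6

print(f"{turn_bytes} bytes/turn, ~{mb_per_user_hour:.1f} MB per user-hour")
```

Even with these modest placeholder sizes, one user generates megabytes per hour, which is the regime the rest of this page is designed for.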
Observability vendors bill per GB ingested, default to short retention, and are built around metrics rather than large structured text. Warehouses are great at SQL but awkward with nested LLM payloads, and the compute is rented. AI traces hit the weak points of both categories, and the typical answer is to add yet another vendor (LLM observability, evals, agent tracing, RAG monitoring, prompt management), each holding a copy of your data behind its API.
HypStack is the alternative: one stack you own end to end.
The Three Layers
1. Collect with Collectivus (OpenTelemetry)
Collectivus is an OpenTelemetry collector designed for AI workloads, deployable across your fleet via your existing MDM (Jamf, Kandji, Intune). Underneath it is pure OTel, so anything that already speaks OTLP works.
What it captures:
- Developer laptops and IDE plugins
- Agent runs and MCP tool calls
- Chatbots and customer-facing LLM features
- Production services that wrap an LLM
- Prompts, retrieved context, tool input and output, evaluations, token usage
The data flows into your bucket. You install Collectivus, not a Hyperparam SDK.
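Because the collector is pure OTel, "speaks OTLP" just means emitting the standard trace payload. A sketch of the OTLP/JSON wire shape for one LLM-call span; the attribute keys follow the OpenTelemetry gen_ai semantic conventions, while the ids, names, and values are invented:

```python
import json

# OTLP/JSON trace payload containing a single span for one LLM call.
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-agent"}},
        ]},
        "scopeSpans": [{
            "scope": {"name": "example-instrumentation"},
            "spans": [{
                "traceId": "5b8efff798038103d269b633813fc60c",
                "spanId": "eee19b7ec3c1b174",
                "name": "chat claude-sonnet",
                "kind": 3,  # SPAN_KIND_CLIENT
                "startTimeUnixNano": "1700000000000000000",
                "endTimeUnixNano": "1700000002000000000",
                "attributes": [
                    {"key": "gen_ai.request.model",
                     "value": {"stringValue": "claude-sonnet"}},
                    {"key": "gen_ai.usage.input_tokens",
                     "value": {"intValue": "5200"}},
                    {"key": "gen_ai.usage.output_tokens",
                     "value": {"intValue": "400"}},
                ],
            }],
        }],
    }],
}

# This is the body a client would POST to the collector's /v1/traces endpoint.
body = json.dumps(payload)
print(len(body), "bytes")
```

Anything that can produce this shape, whether an SDK, a proxy, or a hand-rolled exporter, can feed the collection layer.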
If you only need Claude Code traces on a single developer machine, the bundled Hyperparam Desktop daemon is a lighter-weight option. See Claude Code OpenTelemetry Setup for the local single-machine flow.
2. Store as Iceberg in Your Bucket
Apache Iceberg on object storage is the consensus winner for open analytical storage. HypStack writes traces into Iceberg tables in your own S3, GCS, or Azure account.
What this gives you:
- Cheap, durable storage at object-storage prices, with infinite retention if you want it
- Schema evolution and time travel built into the table format
- Compaction and partitioning handled by the table layer
- A catalog of your choice: Polaris, Lakekeeper, Glue, Nessie
- Same-day interop: Snowflake, Databricks, DuckDB, Spark, and Trino all read Iceberg, so a customer who wants their AI traces in their existing warehouse can point at the same table with no export job
The raw logs, schema, retention policy, and access control all live in your account, under your IAM, in your region.
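As one concrete example of "retention policy in your account": an S3 bucket lifecycle rule (the prefix here is illustrative) that moves older trace partitions to a cold storage tier instead of deleting them:

```json
{
  "Rules": [
    {
      "ID": "ai-traces-to-cold-tier",
      "Status": "Enabled",
      "Filter": { "Prefix": "warehouse/ai_traces/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER_IR" }
      ]
    }
  ]
}
```

Retention becomes a bucket setting you control, not a vendor pricing tier.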
3. Analyze with Hyperparam
Hyperparam is a browser-native client that reads Iceberg and Parquet directly, with no cluster, SQL endpoint, or ingestion step. Built on hyparquet, icebird, squirreling, and parquetindex, it can open millions of traces in a single tab.
The differentiator is joins across sources. Agent traces become useful when you can join them with the code the agent was editing, the issue it was assigned, or the eval results from yesterday's prompt change. Hyperparam can join Iceberg with GitHub repos, other buckets, other Iceberg tables, and existing warehouse exports in one workspace.
Because it is a JS client rather than a query engine:
- HTTP range requests pull only the bytes needed
- Credentials stay in the browser
- An agent can query its own traces with no SQL endpoint or auth proxy in between
- The same client embeds in VS Code, Claude Desktop, or any agent harness
See Data Sources for connecting Hyperparam to S3, GCS, Azure, Iceberg, and Hugging Face.
Closing the Loop
The point of HypStack is not a dashboard. It is a feedback loop between your agents and your knowledge base.
```
knowledge base (CLAUDE.md, prompts, skills, docs)
        |
        v
agents / chatbots / coding harness
        |
        v  (OTel)
Collectivus
        |
        v
Iceberg (your bucket)
        |
        v
Hyperparam (join, analyze, surface)
        |
        +---> updates back to the knowledge base
```

The workflow inside Hyperparam is three steps:
- Explore. Open a multi-million-row trace dataset in a browser tab. Use AI-assisted search to find the conversation, retry, or tool call you care about. Join with GitHub or a warehouse to correlate with code or product context.
- Surface. Generate label columns at scale ("did this conversation succeed", "was this tool call necessary", "did the user retry"), then filter and cluster down to the 1% that is actually broken: token-burn hotspots, retry loops, tool failures, rabbit-holes.
- Improve. A prompt or tool change is a hypothesis you can validate against the traces you already have. The change goes back into the knowledge base your team already maintains as markdown (CLAUDE.md, prompts, skills, standards). The agents pick it up. The loop repeats.
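A minimal sketch of the surface step, using invented trace rows and a deliberately crude retry heuristic (Hyperparam generates such label columns with AI assistance; this only shows the shape of label-then-filter):

```python
# Invented trace rows: one dict per conversation turn.
traces = [
    {"conv": "a", "user_msg": "fix the bug", "tool_calls": 9, "error": True},
    {"conv": "a", "user_msg": "fix the bug", "tool_calls": 7, "error": True},
    {"conv": "b", "user_msg": "add a test",  "tool_calls": 2, "error": False},
]

def label_retry(rows):
    """Label a turn as a retry if the same conversation already saw the
    same user message -- a stand-in for an AI-generated label column."""
    seen, labels = set(), []
    for r in rows:
        key = (r["conv"], r["user_msg"])
        labels.append(key in seen)
        seen.add(key)
    return labels

retry = label_retry(traces)

# Filter down to the turns that look broken: retries, or errors that
# burned many tool calls.
broken = [r for r, is_retry in zip(traces, retry)
          if is_retry or (r["error"] and r["tool_calls"] > 5)]
print(f"{len(broken)} of {len(traces)} turns flagged")
```

The label column is cheap to compute over millions of rows, and the filtered remainder is what a human (or agent) actually reads.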
See How to Debug Wasted Tool Calls in LLM Logs for a worked example of the explore, surface, improve cycle.
What HypStack Enables
- Full-fidelity AI observability: prompts, tools, retrieval, evals, all queryable in one place
- Infinite retention at object-storage prices, not per-GB ingest pricing
- Agents that read their own traces directly from object storage, with no SQL endpoint to provision
- Compliance that does not fight you: your IAM, your VPC, your region, no vendor copy of sensitive prompts
What You Bring, What We Bring
HypStack is a pattern, not a SaaS contract. Most of it is open source you can deploy yourself.
You bring:
- Your apps, agents, and services
- Your S3, GCS, or Azure account
- Your existing warehouse, if you want to query traces from it
- Your security and compliance posture
The open-source pieces:
- Collectivus for collection
- Iceberg for storage, plus a catalog of your choice
- Hyperparam and the hyparam JS libraries for analysis
For organizations rolling this out at scale, Hyperparam offers paid help with OTel and MDM instrumentation patterns, opinionated schemas for AI workloads, collector and Iceberg deployment, and analysis-client support. Reach out via hyperparam.app.
Cost Shape
For a sense of why this matters financially:
| Tier | Approximate price | Retention |
|---|---|---|
| Datadog logs | $0.10 to $2.50 per GB ingested | 15 to 30 days default |
| Splunk | $0.50 to $5 per GB ingested | premium for retention |
| Object storage standard | ~$0.023 per GB-month | indefinite |
| Object storage cold tier | ~$0.004 per GB-month | indefinite |
Roughly two orders of magnitude cheaper on storage, with compute moved to the client.
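Using the table's approximate figures, a worked example for 1 TB of traces per month. Note that ingest is charged once per GB while storage recurs monthly, so this is a shape comparison, not an invoice:

```python
gb = 1000  # 1 TB of traces in a month

ingest_low, ingest_high = 0.10 * gb, 2.50 * gb   # vendor per-GB ingest fees
store_std = 0.023 * gb                            # object storage, standard tier
store_cold = 0.004 * gb                           # object storage, cold tier

print(f"vendor ingest:  ${ingest_low:,.0f} to ${ingest_high:,.0f}")
print(f"object storage: ${store_std:.0f}/mo standard, ${store_cold:.0f}/mo cold")
print(f"per-GB gap:     {ingest_low / store_std:.0f}x to "
      f"{ingest_high / store_cold:.0f}x")
```

At the high end of ingest pricing against cold-tier storage, the per-GB gap is in the hundreds.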
Who Owns What
| Asset | Traditional SaaS | HypStack |
|---|---|---|
| Raw logs | vendor cloud | your bucket |
| Schema | vendor-defined | open Iceberg |
| Query engine | vendor | JS client or your warehouse |
| Retention policy | vendor pricing tier | bucket lifecycle |
| Access control | vendor IAM | your IAM |
Next Steps
- Claude Code OpenTelemetry Setup: single-machine OTel flow with the bundled daemon
- Data Sources: connect Hyperparam to S3, GCS, Azure, and Iceberg
- Quick Start: load your first log file
- Exporting Chat Logs: pull traces out of Claude Code, ChatGPT, Langfuse, LangSmith, Phoenix, Datadog, and more