HypStack: An Open Observability Stack for AI
HypStack is the open, three-layer stack that Hyperparam is designed to plug into. It collects logs from your AI surfaces with OpenTelemetry, stores them as Apache Iceberg tables in your own object storage, and analyzes them in the browser with Hyperparam. Every layer is open source and replaceable, including Hyperparam itself.
If you only want to use Hyperparam against existing files or buckets, you can skip this page. If you are setting up AI observability for an organization and asking "where do these logs go and who owns them," this is the architecture we recommend.
Why a New Stack
Every company already has a data stack. Datadog and Splunk for ops. Snowflake, Databricks, BigQuery, Redshift for analytics. None of it was designed for AI workloads.
Two things have changed at once:
- A new class of applications. Codex, Claude Code, Cursor, Claude Desktop, Copilot, plus every product team adding an LLM to their own app. These run on the client (terminal, IDE, desktop, browser), not behind your load balancer.
- A new shape of data. A single agent turn is a structured document with prompt, retrieved context, tool calls, tool responses, completion, and token counts. A real session is hundreds of those. That is megabytes of structured text per user per hour, not 200-byte access log lines.
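The data-shape claim is easy to sanity-check. A back-of-the-envelope sketch, where the turn below and the per-hour rate are invented illustrations, not measurements:

```python
import json

# A single (invented) agent turn: prompt, retrieved context, tool call,
# tool response, completion, token counts -- one structured document.
turn = {
    "prompt": "x" * 2_000,                    # user + system prompt text
    "retrieved_context": ["y" * 4_000] * 3,   # RAG chunks
    "tool_calls": [{"name": "read_file",
                    "input": {"path": "src/app.py"},
                    "output": "z" * 8_000}],
    "completion": "w" * 1_500,
    "usage": {"input_tokens": 5200, "output_tokens": 400},
}

turn_bytes = len(json.dumps(turn).encode())
turns_per_hour = 300                          # assumption: a busy agent session
mb_per_user_hour = turn_bytes * turns_per_hour / 1e6

print(f"{turn_bytes} bytes/turn, ~{mb_per_user_hour:.1f} MB per user-hour")
```

Even with these modest placeholder sizes, one user generates megabytes per hour, which is the regime the rest of this page is designed for.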
Observability vendors bill per GB ingested, default to short retention, and are built around metrics rather than large structured text. Warehouses are great at SQL but awkward with nested LLM payloads, and the compute is rented. AI traces hit the weak points of both categories, and the typical answer is to add yet another vendor (LLM observability, evals, agent tracing, RAG monitoring, prompt management), each holding a copy of your data behind its API.
HypStack is the alternative: one stack you own end to end.
The Three Layers
1. Collect with Collectivus (OpenTelemetry)
Collectivus is an OpenTelemetry collector designed for AI workloads, deployable across your fleet via your existing MDM (Jamf, Kandji, Intune). Underneath it is pure OTel, so anything that already speaks OTLP works.
What it captures:
- Developer laptops and IDE plugins
- Agent runs and MCP tool calls
- Chatbots and customer-facing LLM features
- Production services that wrap an LLM
- Prompts, retrieved context, tool input and output, evaluations, token usage
The data flows into your bucket. You install Collectivus, not a Hyperparam SDK.
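Because the collector is pure OTel, "speaks OTLP" just means emitting the standard trace payload. A sketch of the OTLP/JSON wire shape for one LLM-call span; the attribute keys follow the OpenTelemetry gen_ai semantic conventions, while the ids, names, and values are invented:

```python
import json

# OTLP/JSON trace payload containing a single span for one LLM call.
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-agent"}},
        ]},
        "scopeSpans": [{
            "scope": {"name": "example-instrumentation"},
            "spans": [{
                "traceId": "5b8efff798038103d269b633813fc60c",
                "spanId": "eee19b7ec3c1b174",
                "name": "chat claude-sonnet",
                "kind": 3,  # SPAN_KIND_CLIENT
                "startTimeUnixNano": "1700000000000000000",
                "endTimeUnixNano": "1700000002000000000",
                "attributes": [
                    {"key": "gen_ai.request.model",
                     "value": {"stringValue": "claude-sonnet"}},
                    {"key": "gen_ai.usage.input_tokens",
                     "value": {"intValue": "5200"}},
                    {"key": "gen_ai.usage.output_tokens",
                     "value": {"intValue": "400"}},
                ],
            }],
        }],
    }],
}

# This is the body a client would POST to the collector's /v1/traces endpoint.
body = json.dumps(payload)
print(len(body), "bytes")
```

Anything that can produce this shape, whether an SDK, a proxy, or a hand-rolled exporter, can feed the collection layer.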
If you only need Claude Code traces on a single developer machine, the bundled Hyperparam Desktop daemon is a lighter-weight option. See Claude Code OpenTelemetry Setup for the local single-machine flow.
2. Store as Iceberg in Your Bucket
Apache Iceberg on object storage is the consensus winner for open analytical storage. HypStack writes traces into Iceberg tables in your own S3, GCS, or Azure account.
What this gives you:
- Cheap, durable storage at object-storage prices, with infinite retention if you want it
- Schema evolution and time travel built into the table format
- Compaction and partitioning handled by the table layer
- A catalog of your choice: Polaris, Lakekeeper, Glue, Nessie
- Same-day interop: Snowflake, Databricks, DuckDB, Spark, and Trino all read Iceberg, so a customer who wants their AI traces in their existing warehouse can point at the same table with no export job
The raw logs, schema, retention policy, and access control all live in your account, under your IAM, in your region.
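As one concrete example of "retention policy in your account": an S3 bucket lifecycle rule (the prefix here is illustrative) that moves older trace partitions to a cold storage tier instead of deleting them:

```json
{
  "Rules": [
    {
      "ID": "ai-traces-to-cold-tier",
      "Status": "Enabled",
      "Filter": { "Prefix": "warehouse/ai_traces/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER_IR" }
      ]
    }
  ]
}
```

Retention becomes a bucket setting you control, not a vendor pricing tier.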
3. Analyze with Hyperparam
Hyperparam is a browser-native client that reads Iceberg and Parquet directly, with no cluster, SQL endpoint, or ingestion step. Built on hyparquet, icebird, squirreling, and parquetindex, it can open millions of traces in a single tab.
The differentiator is joins across sources. Agent traces become useful when you can join them with the code the agent was editing, the issue it was assigned, or the eval results from yesterday's prompt change. Hyperparam can join Iceberg with GitHub repos, other buckets, other Iceberg tables, and existing warehouse exports in one workspace.
Because it is a JS client rather than a query engine:
- HTTP range requests pull only the bytes needed
- Credentials stay in the browser
- An agent can query its own traces with no SQL endpoint or auth proxy in between
- The same client embeds in VS Code, Claude Desktop, or any agent harness
See Data Sources for connecting Hyperparam to S3, GCS, Azure, Iceberg, and Hugging Face.
Closing the Loop
The point of HypStack is not a dashboard. It is a feedback loop between your agents and your knowledge base.
```
knowledge base (CLAUDE.md, prompts, skills, docs)
        |
        v
agents / chatbots / coding harness
        |
        v  (OTel)
Collectivus
        |
        v
Iceberg (your bucket)
        |
        v
Hyperparam (join, analyze, surface)
        |
        +---> updates back to the knowledge base
```

The workflow inside Hyperparam is three steps:
- Explore. Open a multi-million-row trace dataset in a browser tab. Use AI-assisted search to find the conversation, retry, or tool call you care about. Join with GitHub or a warehouse to correlate with code or product context.
- Surface. Generate label columns at scale ("did this conversation succeed", "was this tool call necessary", "did the user retry"), then filter and cluster down to the 1% that is actually broken: token-burn hotspots, retry loops, tool failures, rabbit-holes.
- Improve. A prompt or tool change is a hypothesis you can validate against the traces you already have. The change goes back into the knowledge base your team already maintains as markdown (CLAUDE.md, prompts, skills, standards). The agents pick it up. The loop repeats.
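A minimal sketch of the surface step, using invented trace rows and a deliberately crude retry heuristic (Hyperparam generates such label columns with AI assistance; this only shows the shape of label-then-filter):

```python
# Invented trace rows: one dict per conversation turn.
traces = [
    {"conv": "a", "user_msg": "fix the bug", "tool_calls": 9, "error": True},
    {"conv": "a", "user_msg": "fix the bug", "tool_calls": 7, "error": True},
    {"conv": "b", "user_msg": "add a test",  "tool_calls": 2, "error": False},
]

def label_retry(rows):
    """Label a turn as a retry if the same conversation already saw the
    same user message -- a stand-in for an AI-generated label column."""
    seen, labels = set(), []
    for r in rows:
        key = (r["conv"], r["user_msg"])
        labels.append(key in seen)
        seen.add(key)
    return labels

retry = label_retry(traces)

# Filter down to the turns that look broken: retries, or errors that
# burned many tool calls.
broken = [r for r, is_retry in zip(traces, retry)
          if is_retry or (r["error"] and r["tool_calls"] > 5)]
print(f"{len(broken)} of {len(traces)} turns flagged")
```

The label column is cheap to compute over millions of rows, and the filtered remainder is what a human (or agent) actually reads.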
See How to Debug Wasted Tool Calls in LLM Logs for a worked example of the explore, surface, improve cycle.
What HypStack Enables
- Full-fidelity AI observability: prompts, tools, retrieval, evals, all queryable in one place
- Infinite retention at object-storage prices, not per-GB ingest pricing
- Agents that read their own traces directly from object storage, with no SQL endpoint to provision
- Compliance that does not fight you: your IAM, your VPC, your region, no vendor copy of sensitive prompts
What You Bring, What We Bring
HypStack is a pattern, not a SaaS contract. Most of it is open source you can deploy yourself.
You bring:
- Your apps, agents, and services
- Your S3, GCS, or Azure account
- Your existing warehouse, if you want to query traces from it
- Your security and compliance posture
The open-source pieces:
- Collectivus for collection
- Iceberg for storage, plus a catalog of your choice
- Hyperparam and the hyparam JS libraries for analysis
For organizations rolling this out at scale, Hyperparam offers paid help with OTel and MDM instrumentation patterns, opinionated schemas for AI workloads, collector and Iceberg deployment, and analysis-client support. Reach out via hyperparam.app.
Cost Shape
For a sense of why this matters financially:
| Tier | Approximate price | Retention |
|---|---|---|
| Datadog logs | $0.10 to $2.50 per GB ingested | 15 to 30 days default |
| Splunk | $0.50 to $5 per GB ingested | premium for retention |
| Object storage standard | ~$0.023 per GB-month | indefinite |
| Object storage cold tier | ~$0.004 per GB-month | indefinite |
Roughly two orders of magnitude cheaper on storage, with compute moved to the client.
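Using the table's approximate figures, a worked example for 1 TB of traces per month. Note that ingest is charged once per GB while storage recurs monthly, so this is a shape comparison, not an invoice:

```python
gb = 1000  # 1 TB of traces in a month

ingest_low, ingest_high = 0.10 * gb, 2.50 * gb   # vendor per-GB ingest fees
store_std = 0.023 * gb                            # object storage, standard tier
store_cold = 0.004 * gb                           # object storage, cold tier

print(f"vendor ingest:  ${ingest_low:,.0f} to ${ingest_high:,.0f}")
print(f"object storage: ${store_std:.0f}/mo standard, ${store_cold:.0f}/mo cold")
print(f"per-GB gap:     {ingest_low / store_std:.0f}x to "
      f"{ingest_high / store_cold:.0f}x")
```

At the high end of ingest pricing against cold-tier storage, the per-GB gap is in the hundreds.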
Who Owns What
| Asset | Traditional SaaS | HypStack |
|---|---|---|
| Raw logs | vendor cloud | your bucket |
| Schema | vendor-defined | open Iceberg |
| Query engine | vendor | JS client or your warehouse |
| Retention policy | vendor pricing tier | bucket lifecycle |
| Access control | vendor IAM | your IAM |
Next Steps
- Claude Code OpenTelemetry Setup: single-machine OTel flow with the bundled daemon
- Data Sources: connect Hyperparam to S3, GCS, Azure, and Iceberg
- Quick Start: load your first log file
- Exporting Chat Logs: pull traces out of Claude Code, ChatGPT, Langfuse, LangSmith, Phoenix, Datadog, and more