Glass Box, Not Black Box: Trust in AI

There is a moment in every AI product where the user stops and asks: why did you do that?

It happens when the system makes a decision the user did not expect. A different chart type than requested. A data source the user did not mention. An agent delegation that produced a different result than a previous run. The system did something, and the user cannot see why.

This is the trust problem. And it gets worse as AI systems become more capable. The more autonomous the system, the more decisions it makes without explicit instruction, and the more opportunities for “why did you do that?” moments. A chatbot that responds to direct questions rarely surprises. An agentic system that routes queries through 25 specialist agents, selects model tiers based on reinforcement learning, and coordinates multi-agent swarms — that system makes hundreds of decisions per interaction.

If those decisions are invisible, trust erodes. If trust erodes, adoption stalls. This is not a UX polish problem. It is an architecture problem.

The Glass Box DAG

Every decision in DeepHarness is traced in a directed acyclic graph. We call it the Glass Box because the intent is literal: you can see through it.

When a query enters the system, the first node in the DAG records the raw input. The Q-learning router’s state encoding — intent hash, complexity bucket, entity bitmask, recent agent hash — is logged as the second node. The router’s action selection is the third: which agent was chosen, what the Q-values were for the alternatives, whether this was an exploitation (highest value) or exploration (random) decision.
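To make the first three nodes concrete, here is a minimal sketch rather than the DeepHarness implementation. The GlassBoxDAG class, the node kinds, and the payload field names are all hypothetical, and the epsilon-greedy rule is an assumption about how exploration and exploitation are chosen.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    kind: str
    payload: dict
    parents: list = field(default_factory=list)


class GlassBoxDAG:
    """Hypothetical append-only store of decision nodes."""

    def __init__(self):
        self.nodes = []

    def add(self, kind, payload, parents=()):
        node = Node(len(self.nodes), kind, dict(payload), list(parents))
        self.nodes.append(node)
        return node.node_id


def select_action(q_values, epsilon, rng):
    """Assumed epsilon-greedy rule: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values)), "exploration"
    return max(q_values, key=q_values.get), "exploitation"


def trace_routing(dag, query, state, q_values, epsilon=0.1, rng=None):
    """Record raw input, router state, and router action as nodes 1 to 3."""
    rng = rng or random.Random()
    n_input = dag.add("input", {"query": query})
    n_state = dag.add("router_state", state, parents=[n_input])
    agent, mode = select_action(q_values, epsilon, rng)
    # The Q-values of the alternatives are stored next to the choice.
    payload = {"chosen": agent, "q_values": dict(q_values), "mode": mode}
    return dag.add("router_action", payload, parents=[n_state])
```

Storing the full Q-value table alongside the chosen agent is what lets a reader later see which alternatives were available and how close the decision was.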

From there, the graph branches. If the selected agent delegates to a specialist via the delegate tool, a new edge is created. The delegation records the source agent, the target agent, the delegation depth (max 3), and the reason for delegation. If the specialist delegates further, another edge. The graph captures the complete decision tree.
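A delegation edge can be sketched as a small record with a depth guard. The four fields and the depth limit of 3 come from the description above; the class names, the exception, and the agent names in the usage note are hypothetical.

```python
from dataclasses import dataclass

MAX_DELEGATION_DEPTH = 3  # limit stated in the text


@dataclass(frozen=True)
class DelegationEdge:
    source: str
    target: str
    depth: int
    reason: str


class DelegationDepthExceeded(Exception):
    """Raised when a delegation would go deeper than the limit."""


def delegate(edges, source, target, reason, parent_depth=0):
    """Append one delegation edge, refusing to exceed the depth limit."""
    depth = parent_depth + 1
    if depth > MAX_DELEGATION_DEPTH:
        raise DelegationDepthExceeded(
            f"{source} -> {target} would reach depth {depth}"
        )
    edge = DelegationEdge(source, target, depth, reason)
    edges.append(edge)
    return edge
```

Calling delegate(edges, "analyst", "sql_specialist", "needs a query") produces an edge at depth 1. Passing that edge's depth as parent_depth for the next hop keeps the chain's depth explicit in every record.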

Model selection is a separate branch. The 5-layer cost cascade — confidentiality check, budget degradation, user override, cost-aware routing, agent default tier — is recorded as a sequence of nodes. Each layer either passes through or overrides, and the DAG records which layer determined the final model. If the cost router downgraded from Sonnet to Haiku because complexity scored below 33, that decision and its inputs are visible.
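The cascade can be sketched as an ordered list of functions, each returning either a model or None for pass-through. The layer names and the complexity threshold of 33 come from the text. Everything else is an assumption made for illustration: the rule that the first overriding layer wins, the context keys, and the "local" tier for confidential requests.

```python
def confidentiality_check(ctx):
    # Hypothetical rule: confidential requests are pinned to a "local" tier.
    return "local" if ctx.get("confidential") else None


def budget_degradation(ctx):
    # Hypothetical rule: an exhausted budget forces the cheapest tier.
    return "haiku" if ctx.get("budget_remaining", 1.0) <= 0.0 else None


def user_override(ctx):
    return ctx.get("user_model")


def cost_aware_routing(ctx):
    # Threshold from the text: complexity below 33 downgrades to Haiku.
    return "haiku" if ctx.get("complexity", 100) < 33 else None


def agent_default_tier(ctx):
    return ctx.get("agent_default", "sonnet")


LAYERS = [
    ("confidentiality_check", confidentiality_check),
    ("budget_degradation", budget_degradation),
    ("user_override", user_override),
    ("cost_aware_routing", cost_aware_routing),
    ("agent_default_tier", agent_default_tier),
]


def select_model(ctx):
    """Return (model, deciding_layer, trace); the first override wins."""
    trace = []
    for name, layer in LAYERS:
        decision = layer(ctx)
        trace.append({"layer": name, "decision": decision})
        if decision is not None:
            return decision, name, trace
    raise RuntimeError("no layer produced a model")
```

The trace list plays the role of the sequence of DAG nodes: it shows which layers passed through and which one determined the final model.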

The DAG is not a log. Logs are sequential and flat. The Glass Box is a graph with branches, merges, and parallel paths. When a swarm runs in parallel-discovery topology — multiple scouts searching data sources simultaneously — the DAG shows the parallel branches, their individual results, and the merge point where results were synthesized. You can trace any output back to every decision that produced it.
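Tracing an output back through parallel branches is an ordinary ancestor walk over the graph. The sketch below assumes the DAG is available as a mapping from each node to its parents; the node names are hypothetical.

```python
def ancestors(parents, node):
    """Return every node that the given node transitively depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Two scouts run in parallel from the same input and are merged.
PARENTS = {
    "input": [],
    "scout_a": ["input"],
    "scout_b": ["input"],
    "merge": ["scout_a", "scout_b"],
    "output": ["merge"],
}
```

Here ancestors(PARENTS, "output") returns the merge node, both scouts, and the input. A flat log could list the same events but would not say which of them fed the output.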

Blueprint-First Governance

Transparency after the fact is necessary but insufficient. Showing users what happened does not prevent things that should not happen. This is why DeepHarness implements blueprint-first governance: nothing runs without a human-approved plan.

When the system determines that a task requires a swarm — multiple agents coordinating over multiple steps — it generates a blueprint before executing anything. The blueprint is a structured document that specifies the agents involved, the data sources each agent will access, the operations each agent will perform, the expected outputs, and the resource budget (model tiers, token limits, time constraints).
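As an illustration of what such a structured document might contain, here is a sketch using dataclasses. The categories (agents, data sources, operations, expected outputs, resource budget) come from the paragraph above; the class names, field names, and status values are assumptions.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AgentPlan:
    agent: str
    data_sources: list
    operations: list
    expected_output: str


@dataclass
class Budget:
    model_tiers: list
    token_limit: int
    time_limit_seconds: int


@dataclass
class Blueprint:
    task: str
    plans: list
    budget: Budget
    status: str = "pending"  # hypothetical values: pending, approved, rejected


def render_for_review(blueprint):
    """Serialize the blueprint so it can be shown to the user."""
    return json.dumps(asdict(blueprint), indent=2, sort_keys=True)
```

Keeping the blueprint as plain data means the document the user approves is the same document the execution engine checks against.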

The user sees this blueprint before execution begins. They can approve it, modify it, or reject it. If approved, the system executes within the blueprint’s constraints. If an agent attempts an operation not covered by the blueprint, the system blocks it. If the task evolves and requires capabilities beyond the original scope, a new blueprint is generated and presented for approval.

This is governance as architecture, not as policy. The constraints are enforced by the execution engine, not by agent behavior. An agent cannot decide to access a data source that the blueprint does not authorize, because the authorization check happens at the system level, outside the agent’s control.
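A minimal sketch of that idea, assuming a blueprint reduces to a set of permitted (agent, data source, operation) triples. The ExecutionEngine class and the BlueprintViolation exception are hypothetical names. The point is only that the check lives in the engine, so agent code never gets the chance to skip it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    approved: bool
    allowed: frozenset  # set of (agent, data_source, operation) triples


class BlueprintViolation(Exception):
    """Raised when execution would step outside the approved plan."""


class ExecutionEngine:
    def __init__(self, blueprint):
        if not blueprint.approved:
            raise BlueprintViolation("blueprint has not been approved")
        self._blueprint = blueprint

    def run(self, agent, data_source, operation, action):
        """Run `action` only if the blueprint authorizes this triple."""
        if (agent, data_source, operation) not in self._blueprint.allowed:
            raise BlueprintViolation(
                f"{agent} may not {operation} {data_source}"
            )
        return action()
```

An agent asks the engine to run an action and never touches the data source directly, so an unauthorized request fails before any agent code executes.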

The result is that users always know what will happen before it happens. And when it is happening, they can verify that the execution matches the plan, because the Glass Box DAG shows every step.

Agent Action Receipts

Between the blueprint (what will happen) and the DAG (what did happen), there is the narration layer: what is happening right now.

The StreamClassifier maps raw SSE events from the API into three cognitive phases that the UI renders in real time.

Observe. The system acknowledges the query, identifies the intent, and reports which agents and data sources are relevant. The user sees what the system understood.

Reason. The system reports its analysis: which agent was selected and why, what model tier was chosen, what delegation strategy will be used. The user sees how the system is thinking about the problem.

Act. The system executes and streams results: data retrieved, charts rendered, configurations generated. The user sees the output as it is produced.

These phases are not cosmetic. The StreamClassifier parses actual SSE event types — text-start, text-delta, tool_call_start, tool_call_end, delegate-start, delegate-complete — and maps them to cognitive phases based on the event sequence. The Observe phase corresponds to intent classification events. The Reason phase corresponds to routing and delegation events. The Act phase corresponds to output generation events.
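The mapping can be sketched as a lookup with a carry-over rule. The six event names are the ones listed above. The assignment of each event to a phase is a guess made for illustration, "intent-classified" is a hypothetical event name standing in for the intent classification events, and carrying the current phase forward for unrecognized events is an assumed simplification of sequence-based classification.

```python
PHASE_BY_EVENT = {
    "intent-classified": "observe",  # hypothetical event name
    "delegate-start": "reason",
    "delegate-complete": "reason",
    "tool_call_start": "act",
    "tool_call_end": "act",
    "text-start": "act",
    "text-delta": "act",
}


def classify(events):
    """Label each event with a phase; unknown events keep the current phase."""
    phase = "observe"  # assumed starting phase
    labeled = []
    for event in events:
        phase = PHASE_BY_EVENT.get(event, phase)
        labeled.append((event, phase))
    return labeled
```

Because each label is computed from an event the system actually emitted, the UI has nothing to display when nothing happened.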

The narration is derived from the system’s actual behavior, not generated as a separate explanation layer. This distinction matters. An explanation generated after the fact is a second model output, and it can hallucinate a rationale that never drove the decision. A narration derived from execution events has no such gap: the narration is the execution, exposed.

Stuck-Agent Recovery

Transparency also means being honest when things go wrong. Agentic systems fail. Agents get stuck — they enter loops, produce outputs that do not match expected schemas, or exceed time budgets. The question is not whether failure happens, but whether the user can see it.

DeepHarness implements stuck-agent recovery as a visible process. When an agent exceeds its time budget or produces outputs that fail validation, the system does not silently retry or fall back. It reports the failure in the narration stream, terminates the stuck agent, and either escalates to a higher-capability model tier or presents the partial results with a clear explanation of what went wrong.

The Glass Box DAG records the failure node, the recovery decision, and the outcome. If the system upgraded the model tier and retried, the DAG shows both attempts. If the system returned partial results, the DAG shows which agents completed and which failed.
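The escalate-and-retry path can be sketched for a single agent as follows. The tier order (Haiku, then Sonnet) is inferred from the downgrade example earlier. The function name, the result fields, and the narrate callback are hypothetical, and the time-budget check is omitted to keep the sketch short.

```python
def run_with_recovery(agent, validate, tiers, narrate):
    """Try each tier in order, recording every attempt, never retrying silently."""
    attempts = []
    for tier in tiers:
        output = agent(tier)
        ok = validate(output)
        attempts.append({"tier": tier, "ok": ok})
        if ok:
            return {"status": "complete", "output": output, "attempts": attempts}
        # The failure is reported to the user before any escalation.
        narrate(f"agent output failed validation on tier {tier}")
    return {
        "status": "partial",
        "output": None,
        "attempts": attempts,
        "explanation": "all tiers failed validation",
    }
```

The attempts list corresponds to what the DAG would hold: one entry per try, including the failed ones. A result marked "partial" tells the caller that this agent contributed nothing and explains why.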

This is uncomfortable transparency. Most products hide their failures. But hiding failures in an agentic system is worse than showing them, because the user cannot distinguish between “the system did not try” and “the system tried and failed” — and those require very different responses.

PII Masking

One category of data must stay invisible: the personally identifiable information the system handles.

DeepHarness’s PII masking runs before data leaves the infrastructure. When a query or data payload contains PII — names, email addresses, phone numbers, identifiers — the masking layer strips or replaces it before the content reaches an external model provider. The Glass Box records that masking occurred and what categories were detected, but does not record the PII itself.
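A minimal sketch of the record-the-category, not-the-content idea, using two deliberately simple regular expressions. The patterns, the placeholder tokens, and the shape of the audit record are assumptions; a production masker would need much more robust detection, and names in particular cannot be caught by patterns like these.

```python
import re

# Simplified illustrative patterns; applied in order, email first.
PATTERNS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    ("phone", re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
]


def mask_pii(text):
    """Return (masked_text, audit_record); the record never holds raw PII."""
    categories = []
    for name, pattern, placeholder in PATTERNS:
        text, count = pattern.subn(placeholder, text)
        if count:
            categories.append(name)
    record = {"masked": bool(categories), "categories": sorted(categories)}
    return text, record
```

Only the masked text would be sent to an external provider, and only the audit record would be written to the Glass Box.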

This is the one exception to radical transparency. The system is transparent about the fact that masking happened, and about the policy that triggered it, but not about the content that was masked. This is by design: the transparency is about the decision, not about the data.

Why This Enables Enterprise Adoption

Enterprise buyers have a specific concern with AI systems: accountability. When a system makes a decision that affects business operations, someone needs to be able to answer “why did it do that?” with specifics, not with “the AI decided.”

The Glass Box DAG provides those specifics. Every routing decision has a Q-value justification. Every model selection has a cost-cascade trace. Every delegation has a source, target, and reason. Every failure has a recovery trace. These are not summaries or explanations — they are the actual decision records.

Blueprint governance provides the other half of accountability: authorization. The blueprint is a record of what was approved. The DAG is a record of what happened. The two together answer both “who authorized this?” and “how was it executed?”

This is what trust looks like in an agentic system. Not a trust badge. Not a compliance certification. A structural property of the architecture that makes every decision inspectable, every authorization traceable, and every failure visible.

We call it a Glass Box because that is what it is. You can see every decision inside.