The Lab — System Evidence
Not a demo.
The actual system.
Every number on this page comes from a real command run against real infrastructure. Every decision log entry is verbatim AI output — unedited reasoning from the pipeline as it worked. This is what AWACS actually produces.
Live data from status.json — last updated 2026-04-06T13:32
Knowledge Base — Domain Coverage
What the system has learned
49 Class A entries across 7 domains. Each entry ran through the full write chain: command executed → output hashed → candidate prepared → gate evaluated → admitted or rejected. No shortcuts.
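The write chain above can be sketched roughly as follows. This is a minimal illustration, not the AWACS implementation: the `Candidate` fields, the function names, the choice of SHA-256 for the output hash, and the placeholder gate are all assumptions made for the example.

```python
import hashlib
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate record built from one observed command run."""
    command: list[str]
    output_sha256: str
    wall_ms: int
    size_bytes: int


def execute_and_prepare(command: list[str]) -> Candidate:
    """Run the command, hash its stdout, and package the measurements."""
    start = time.perf_counter()
    result = subprocess.run(command, capture_output=True, check=True)
    wall_ms = int((time.perf_counter() - start) * 1000)
    return Candidate(
        command=command,
        output_sha256=hashlib.sha256(result.stdout).hexdigest(),
        wall_ms=wall_ms,
        size_bytes=len(result.stdout),
    )


def gate(candidate: Candidate) -> bool:
    """Placeholder gate (assumption): admit only candidates with output."""
    return candidate.size_bytes > 0


if __name__ == "__main__":
    # A harmless stand-in command; the real pipeline runs infrastructure CLIs.
    cand = execute_and_prepare([sys.executable, "-c", "print('hello')"])
    print("admitted" if gate(cand) else "rejected", cand.output_sha256[:12])
```

Hashing the raw output before any processing means the stored entry can later be checked against exactly what the command returned.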
Gate Decision Log — Verbatim
The admission gate in action
Every entry that reaches Class A passes a 5-question gate. Every rejection is logged with a reason. This is the actual librarian-decisions.jsonl — not a curated excerpt.
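A five-question gate that logs every decision with a reason might look something like the sketch below. The five question keys mirror the questions shown in the gate log on this page; the record fields (`entry_id`, `decision`, `reason`) and the function name are hypothetical, and the real librarian-decisions.jsonl schema may differ.

```python
import io
import json

# The five gate questions, taken from the gate log shown on this page.
GATE_QUESTIONS = (
    "validated",
    "reusable",
    "has_metadata",
    "structured",
    "worth_remembering",
)


def evaluate_gate(entry_id: str, answers: dict[str, bool], log) -> bool:
    """Admit only if all five answers are True; write one JSON line per decision."""
    failed = [q for q in GATE_QUESTIONS if not answers.get(q, False)]
    record = {
        "entry_id": entry_id,
        "decision": "reject" if failed else "admit",
        "reason": "failed: " + ", ".join(failed) if failed else "all five questions passed",
    }
    log.write(json.dumps(record) + "\n")
    return not failed


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for an append-mode JSONL file
    evaluate_gate("example-001", {q: True for q in GATE_QUESTIONS}, buffer)
    evaluate_gate("example-002", {"validated": True}, buffer)
    print(buffer.getvalue(), end="")
```

A missing answer counts as a failure here, so an incomplete candidate is rejected with the unanswered questions named in the reason.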
az resource list --resource-group rg-apps-prod --output table (2,065 ms wall time, 317 bytes). Revealed: all web resources are Microsoft.Web/staticSites, not App Service. Without this KB entry, every cold session pays about 2 s to rediscover this.
AI Self-Dialogue: Unedited
The system reasoning about itself
This is the Analyzer and Librarian working through a real candidate. The Analyzer ran on 2026-03-29 against the official Claude Code hooks documentation — 22 event types, 30 operational gotchas. Watch it classify its own finding and argue for the right trust tier.
From cmd-007 we knew our hooks don't return JSON stdout per spec: we had identified the missing additionalContext fields. That was one finding from one observation.
What the fetch returned: Complete operational surface. 22 hook event types documented. We are using 3. That means 19 event types are completely unknown to our implementation. This is not a partial match — this far-exceeded expectations.
Key deltas from what we believed:
• Exit code behavior is event-specific: exit 2 on PostToolUse does NOT block (tool already ran). We had the wrong mental model.
• CLAUDE_ENV_FILE env var unknown to us: hooks can persist env vars for subsequent Bash commands.
• once: true field in skill frontmatter. The hook runs once per session, then removes itself. We didn't know this existed.
• stop_hook_active field on Stop event: prevents infinite re-triggering loops. Critical for our Stop hook architecture.
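As a rough illustration of the stop_hook_active guard, a Stop hook could decide its exit code along these lines. This sketch assumes, per the vendor documentation summarized above, that the hook receives a JSON payload on stdin and that exit code 2 on a Stop event asks the session to continue. The `work_remaining` flag and both function names are hypothetical, and none of this has been verified in this environment.

```python
import json
import sys


def decide_stop(payload: dict, work_remaining: bool) -> int:
    """Return the hook exit code: 2 asks to keep going, 0 allows the stop."""
    # If an earlier Stop hook already forced a continuation, allow the stop
    # so the hook cannot keep re-triggering itself.
    if payload.get("stop_hook_active"):
        return 0
    return 2 if work_remaining else 0


def main() -> int:
    """Entry point for the hook script (reads the event payload from stdin)."""
    payload = json.load(sys.stdin)
    code = decide_stop(payload, work_remaining=False)  # hypothetical check
    if code == 2:
        print("Pending work remains; continue.", file=sys.stderr)
    return code
```

Keeping the decision in a pure function makes it testable without a live session; a script would end with `sys.exit(main())`.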
This is the official vendor spec, fetched from primary source. It is authoritative on what the API says it does. It is not authoritative on what it actually does in our environment.
Class A promotion path: each finding that matters gets its own test execution.
CLAUDE_ENV_FILE behavior, PermissionRequest hook, stop_hook_active pattern, once: true field — each should become a separate command execution candidate. When we run the command and observe the output, that becomes Class A.
✓ ADMIT to Class B
✗ DO NOT promote to Class A yet
| Question | Check | Answer | Reasoning |
| --- | --- | --- | --- |
| Q1 | Validated? | ✓ YES | Fetched from official Anthropic docs. Primary source, canonical state 2026-03-29. |
| Q2 | Reusable? | ✓ YES | Complete hook reference applicable to all Claude Code projects. Critical for fixing our JSON stdout gap. |
| Q3 | Has metadata? | ✓ YES | Full frontmatter, all required fields present. |
| Q4 | Structured? | ✓ YES | Tables, code blocks, knowledge-extracted sections. Retrievable. |
| Q5 | Worth remembering? | ✓ ABSOLUTELY | We use 3/22 hook types. This is a capability map revealing 19 unknown capabilities. The exit code behavior matrix alone prevents common implementation errors. |
Librarian note: Concur with Analyzer classification. This entry documents what the vendor says is true. It goes into Class B. When we test CLAUDE_ENV_FILE in our environment and observe actual behavior, that becomes Class A. The distinction matters: vendor documentation is one data point; observed behavior in our environment is the ground truth.
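The Class A / Class B distinction can be expressed as a small rule: documentation alone stays in Class B, and only an observed execution with captured output earns Class A. The sketch below is illustrative; the `Entry` fields, the source labels, and the function names are assumptions, not the system's actual schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical knowledge-base entry."""
    claim: str
    source: str                          # "vendor_doc" or "observed_execution"
    output_sha256: Optional[str] = None  # set only when we ran the command


def trust_class(entry: Entry) -> str:
    """Class A needs an observed run with captured output; otherwise Class B."""
    observed = (
        entry.source == "observed_execution" and entry.output_sha256 is not None
    )
    return "A" if observed else "B"


def promote(entry: Entry, output_sha256: str) -> Entry:
    """Return a copy of the entry backed by a test execution in our environment."""
    return replace(entry, source="observed_execution", output_sha256=output_sha256)


if __name__ == "__main__":
    doc = Entry("CLAUDE_ENV_FILE persists env vars for later Bash commands", "vendor_doc")
    print(trust_class(doc))                       # vendor claim only
    print(trust_class(promote(doc, "ab" * 32)))   # after a (pretend) test run
```

Promotion produces a new entry rather than mutating the old one, so the original vendor-sourced record remains available for comparison.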
Command intercepted: git commit -m "feat: wire brake system — blocks az group delete with exit(2)..." Matched pattern: az\s+group\s+delete found in commit message body.
⛔ BRAKE — DESTRUCTIVE COMMAND BLOCKED
Risk: CRITICAL
Blast: ALL resources in the resource group are permanently deleted.
Safe: az resource list -g <rg> — inventory first
Resolution: The commit message was reworded to say "group-delete pattern" instead of the literal match. The brake logged the event to captures/brake-events/ as tamper-evident JSONL proof.
What this demonstrates: The system is live enough that it cannot describe itself without triggering itself. That's not a bug — it's proof.
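The self-triggering behavior follows directly from matching a regular expression against the whole command string. The sketch below reuses the pattern and blast description quoted above; the catalog structure, the function name, and the shortened commit messages are illustrative assumptions.

```python
import re
from typing import Optional

# Hypothetical danger catalog: pattern -> blast-radius description.
DANGER_PATTERNS = {
    r"az\s+group\s+delete": "ALL resources in the resource group are permanently deleted.",
}


def check_brake(command: str) -> Optional[dict]:
    """Return a brake event if any danger pattern occurs anywhere in the command."""
    for pattern, blast in DANGER_PATTERNS.items():
        if re.search(pattern, command):
            return {"pattern": pattern, "risk": "CRITICAL", "blast": blast}
    return None


if __name__ == "__main__":
    # The literal text inside a commit message is enough to trip the brake.
    literal = 'git commit -m "feat: wire brake system, blocks az group delete"'
    reworded = 'git commit -m "feat: wire brake system, blocks group-delete pattern"'
    print(check_brake(literal) is not None)   # True
    print(check_brake(reworded) is not None)  # False
```

Because the match runs over the full string rather than a parsed argument list, quoted text is treated the same as an executable command. That errs on the side of blocking.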
Real Command — Real Output
What gets stored, and why
Class A entry cmd-20260405-2242-001, executed live against the Personal Portfolio Azure subscription. This is the actual stored entry, including the discovery that saves every future cold session roughly 2 seconds.
az webapp list returns 0 results here — all web resources are Microsoft.Web/staticSites, not App Service. Use az staticwebapp command family. This discovery call costs 2,065ms and 317 bytes. Without this entry in KB, every cold session must run this call before knowing which CLI to use.
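One way to picture the saving: a session consults the knowledge base first and only falls back to the discovery call when no entry exists. The sketch below is a simplification under stated assumptions. The in-memory `kb` dict, the `discover` callback, and the function name are hypothetical; the type-to-CLI mapping uses the resource type named above plus `Microsoft.Web/sites` for App Service.

```python
from typing import Callable

CLI_FAMILY_BY_TYPE = {
    "Microsoft.Web/staticSites": "az staticwebapp",
    "Microsoft.Web/sites": "az webapp",  # App Service
}


def cli_families(
    resource_group: str,
    kb: dict[str, list[str]],
    discover: Callable[[str], list[str]],
) -> list[str]:
    """Return the CLI families to use, paying for discovery only on a KB miss."""
    types = kb.get(resource_group)
    if types is None:
        types = discover(resource_group)  # the slow call (about 2 s in the entry above)
        kb[resource_group] = types
    return sorted({CLI_FAMILY_BY_TYPE[t] for t in types if t in CLI_FAMILY_BY_TYPE})


if __name__ == "__main__":
    kb = {"rg-apps-prod": ["Microsoft.Web/staticSites"]}
    # With the entry present, the discovery callback is never invoked.
    print(cli_families("rg-apps-prod", kb, discover=lambda rg: []))
```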
Infrastructure Brake — Live Event Log
The system stopping dangerous commands
When a command matches the danger catalog, the brake fires before the command runs. This is a real event from captures/brake-events/ — logged with SHA-256 tamper detection.
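The page does not say how the SHA-256 tamper detection is implemented. One common approach, shown here purely as an assumption, is a hash chain: each JSONL record stores the hash of the previous record, so editing any earlier line invalidates everything after it. The field names and the all-zero genesis value are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first record's "prev"


def _digest(event: dict, prev: str) -> str:
    """SHA-256 over a canonical JSON encoding of the event and previous hash."""
    body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(lines: list[str], event: dict) -> None:
    """Append one JSONL record chained to the hash of the previous record."""
    prev = json.loads(lines[-1])["sha256"] if lines else GENESIS
    record = {"event": event, "prev": prev, "sha256": _digest(event, prev)}
    lines.append(json.dumps(record, sort_keys=True))


def verify(lines: list[str]) -> bool:
    """Recompute every hash and check that each record links to its predecessor."""
    prev = GENESIS
    for line in lines:
        record = json.loads(line)
        if record["prev"] != prev:
            return False
        if record["sha256"] != _digest(record["event"], record["prev"]):
            return False
        prev = record["sha256"]
    return True


if __name__ == "__main__":
    log: list[str] = []
    append_event(log, {"command": "az group delete", "action": "blocked"})
    append_event(log, {"command": "az resource list", "action": "allowed"})
    print(verify(log))
```

A chain like this detects edits and reordering, but it does not by itself detect truncation of the most recent records; that would need the latest hash stored somewhere else.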
More Evidence
Dig deeper into the data
Every page below contains real infrastructure data, real command outputs, and real cost/attempt comparisons. No mocked results.