Every number on this page came from running actual az commands against a live Azure subscription. Attempt counts are command counts. Token estimates use measured output bytes ÷ 4 plus prompt overhead. No extrapolation.
Executed against subscription 49521d08 (Personal Portfolio) · resource-group rg-apps-prod · 2026-04-06
One attempt = one az CLI call. We ran each path in sequence and counted. Failed calls (empty result, wrong type) count as attempts — they still consume context.
Measured with wc -c on each command's stdout. Token estimate = bytes ÷ 4 (standard approximation for CLI output).
Each retry adds ~300 tokens: the original question + prior failed output + "try again" context. This is estimated, not measured — and labeled accordingly.
We do not have instrumented token counts from a live Claude API session. Response tokens are measured. Prompt overhead is estimated.
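The estimation rule stated above can be sketched as a small function. This is a sketch of the document's own formula (bytes ÷ 4 plus ~300 tokens of per-attempt prompt overhead); the overhead value is the document's estimate, not a measured quantity:

```python
def estimate_tokens(response_bytes: int, attempts: int, overhead_per_attempt: int = 300) -> int:
    """Token estimate = measured output bytes / 4 + estimated prompt overhead per attempt."""
    return response_bytes // 4 + attempts * overhead_per_attempt

# Figures from the measured runs in this section:
print(estimate_tokens(1456, 4))  # naive path -> 1564
print(estimate_tokens(501, 2))   # AIOS cold  -> 725
print(estimate_tokens(184, 1))   # AIOS + KB  -> 346
```

These reproduce the "Est. total tokens" row in the summary table exactly.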
The most common starting question. Resource type is the trap — App Service and Static Web Apps use different CLI commands.
(no output) Empty result. No App Service resources in rg-apps-prod. Static Web Apps are a different resource type.
(no output) Still empty. App Service Plans also absent.
Name         Location   Branch  RepositoryUrl
-----------  ---------  ------  -------------
front9back9  East US 2  master  github.com/...
awacs-ai     East US 2  master  github.com/...
→ 1,272 bytes. Missing --resource-group, so this lists every Static Web App in the subscription, not just rg-apps-prod. The command is now correct but the query is unfocused; a show call is still needed for the URLs.
{
"name": "awacs-ai",
"customDomains": ["awacs.ai","www.awacs.ai"],
"repositoryUrl": "github.com/Dwink213/awacs-ai-website",
"branch": "master",
"sku": { "name": "Free" }
}
→ 184 bytes

AIOS knows to check the resource type before guessing the CLI command. It runs a discovery call first: "what types of resources exist in this resource group?"
Name         ResourceGroup  Location  Type
-----------  -------------  --------  -------------------------
front9back9  rg-apps-prod   eastus2   Microsoft.Web/staticSites
awacs-ai     rg-apps-prod   eastus2   Microsoft.Web/staticSites

→ 317 bytes · type confirmed → use az staticwebapp
{
"customDomains": ["awacs.ai","www.awacs.ai"],
"repositoryUrl": "github.com/Dwink213/awacs-ai-website",
"sku": { "name": "Free" }
}
→ 184 bytes

What AIOS cold still can't skip: the 317-byte discovery call. It must ask "what's in this resource group?" before it knows whether to use az webapp, az staticwebapp, az functionapp, or something else.
MATCH — resource type: Microsoft.Web/staticSites
resource group: rg-apps-prod
names: awacs-ai, front9back9

0 bytes from Azure · in-context · no API call needed
{
"customDomains": ["awacs.ai","www.awacs.ai"],
"repositoryUrl": "github.com/Dwink213/awacs-ai-website",
"branch": "master",
"sku": { "name": "Free" }
}
→ 184 bytes

What the KB eliminates: the 317-byte discovery call. That call exists only to answer "what type of resource is this?" The KB already answered that question in a prior session; it doesn't need to be asked again.
Between "No KB" and "AIOS cold": 2 saved attempts, 5.7 seconds faster. AIOS doesn't guess at the resource type — it asks Azure directly via a discovery call. Cost: 317 bytes and 2,065ms. That's still one call the KB eliminates entirely.
Between "AIOS cold" and "AIOS+KB": 1 saved attempt, 1.8 seconds faster — the 317-byte discovery call. The KB already knows that rg-apps-prod contains two Microsoft.Web/staticSites resources named awacs-ai and front9back9. That knowledge came from the prior session and costs nothing to reuse. This is the compounding effect: every session after the first, that discovery call never happens.
The full gap — naked to KB: 3 attempts saved · 7.5 seconds saved · ~1,218 tokens saved per question. Across 50 similar questions in a session: 375 seconds and ~61,000 tokens. At standard API rates, that's real money.
| Metric | Naive | AIOS Cold | AIOS + KB |
|---|---|---|---|
| Attempts per question | 4 | 2 | 1 |
| Wall time (1 question) | 10,412ms | 4,668ms | ~2,932ms |
| Response bytes | 1,456 | 501 | 184 |
| Est. total tokens | ~1,564 | ~725 | ~346 |
| Saved vs naive | — | 5,744ms / 839t | 7,480ms / 1,218t |
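The savings row and the 50-question projection follow directly from the table. A quick arithmetic check (the small gap between 374 s computed and 375 s quoted comes from rounding 7,480 ms to 7.5 s before multiplying):

```python
# Per-question totals from the summary table above.
naive = {"ms": 10_412, "tokens": 1_564}
kb    = {"ms": 2_932,  "tokens": 346}

saved_ms  = naive["ms"] - kb["ms"]           # 7,480 ms per question
saved_tok = naive["tokens"] - kb["tokens"]   # 1,218 tokens per question

# Projected across 50 similar questions in one session:
print(saved_ms * 50 / 1000)  # 374.0 seconds (~375 s when quoted as 7.5 s each)
print(saved_tok * 50)        # 60900 tokens (~61,000)
```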
Multi-source chain. Azure environment list + GitHub CI/CD status. The stuck preview environment was found by running the command — not by reading docs.
To verify your deployment is current, check the Azure portal → your Static Web App → Deployments tab. You can also run:
az staticwebapp show --name <app-name> --resource-group <rg>
For GitHub Actions status, check the Actions tab in your repo for the latest workflow run.
Issues found: None — no commands run against the actual environment, nothing observed.
awacs-ai  rg-apps-prod  Microsoft.Web/staticSites

→ 317 bytes · needed to learn the resource name

Name     Branch  Status     LastUpdatedOn
-------  ------  ---------  ----------------------------
default  master  Ready      2026-04-06T01:44:15Z
preview  (none)  Uploading  2026-02-22T20:08:23Z
                            ↑ stuck 43 days · never completed

→ 363 bytes
Name     Branch  Status     LastUpdatedOn
-------  ------  ---------  ----------------------------
default  master  Ready      2026-04-06T01:44:15Z   ✓ current
preview  (none)  Uploading  2026-02-22T20:08:23Z   ✗ stuck

→ 363 bytes · direct (resource name from KB)
Production is current. The default environment was updated 2026-04-06T01:44:15Z, after the most recent master push.
Stuck preview environment: Created 2026-02-22, stuck in Uploading for 43 days with no source branch. Not affecting production but consuming resources.
Cleanup command (from KB — no research needed):
az staticwebapp environment delete --name awacs-ai --resource-group rg-apps-prod --environment-name preview
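The stuck environment only exists in command output, so detecting it is a matter of checking status and age over the JSON from az staticwebapp environment list -o json. A minimal sketch (the helper name, the 7-day threshold, and the inline sample data are illustrative, not part of AIOS; field names follow the output shown above):

```python
from datetime import datetime, timezone

def stuck_environments(envs, now, max_days=7):
    """Return (name, status, age_days) for environments not Ready past a threshold."""
    stuck = []
    for env in envs:
        # Azure timestamps end in "Z"; normalize for fromisoformat.
        updated = datetime.fromisoformat(env["lastUpdatedOn"].replace("Z", "+00:00"))
        age_days = (now - updated).days
        if env["status"] != "Ready" and age_days > max_days:
            stuck.append((env["name"], env["status"], age_days))
    return stuck

# Sample data mirroring the environment list output above.
envs = [
    {"name": "default", "status": "Ready",     "lastUpdatedOn": "2026-04-06T01:44:15Z"},
    {"name": "preview", "status": "Uploading", "lastUpdatedOn": "2026-02-22T20:08:23Z"},
]
now = datetime(2026, 4, 6, 21, 0, tzinfo=timezone.utc)
print(stuck_environments(envs, now))  # [('preview', 'Uploading', 43)]
```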
The naked LLM cannot run commands — it describes what you could do. That means it can't see the preview environment stuck in Uploading. The issue only exists in the command output. No amount of documentation reading reveals it. This is the core argument for KB grounding: some facts only exist after you run the command.
Between AIOS cold and AIOS+KB: the difference is one discovery call (317 bytes). Not dramatic — but it compounds. If this question is asked 20 times in a session, AIOS cold burns 6,340 bytes and 20 discovery calls. AIOS+KB burns 0. The KB pays for itself in the second question.
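The compounding claim is plain arithmetic over the measured 317-byte discovery call; a sketch:

```python
def discovery_cost(questions, bytes_per_call=317):
    """AIOS cold repeats the discovery call once per question; AIOS+KB never does."""
    return questions, questions * bytes_per_call

calls, cost_bytes = discovery_cost(20)
print(calls, cost_bytes)  # 20 6340 — vs 0 calls, 0 bytes with the KB
```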
The ~300 token overhead per retry is a standard estimate, not measured. Actual prompt overhead depends on conversation history length, which grows with each retry. Real overhead is likely higher.
We do not have a Claude API session with token-level instrumentation. Response bytes are measured (wc -c on stdout). Total tokens = response bytes ÷ 4 + estimated prompt overhead.
A well-prompted LLM might skip directly to az staticwebapp if system context includes "this is a Static Web Apps environment." The naive path shown assumes no system context — a cold session.
This environment is simple. Token savings scale with environment complexity. A subscription with 50 resource groups and 200 resources would show larger KB vs cold-start gaps.