Add telemetry documentation #8133
Document the end-to-end telemetry system across azure-dev, azd-queries, and azure-dev-tools. Covers architecture, data reference, feature instrumentation guide, dashboards/reports, and product overview. Partially contributes to Azure/azure-dev-pr#1772.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add service target values (including containerapp-dotnet, ai.endpoint)
- Add service language values
- Add feature-to-telemetry mapping table for outside-in lookup
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep docs focused on describing the system, not tracking issues. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move env.name row out of Service Languages table (wrong column count)
- Fix Feature Mapping: auth.login.method → auth.method (actual field)
- Fix Feature Mapping: project.infra.type → infra.provider
- Fix Feature Mapping: packaging.type → pack.builder.image/tag
- Fix Feature Mapping: update.availableVersion → fromVersion/toVersion
- Fix Feature Mapping: ExecutionEnvironment → execution.environment
- Fix KQL examples: MachineId → Properties['machine.id']
- Fix PII claim: no PII → no direct PII, sensitive values are hashed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a comprehensive set of telemetry documentation pages covering architecture, data schema, dashboards, instrumentation guidance, and a product-facing overview, and wires them into the docs index plus cspell allowlists.
Changes:
- New telemetry docs: architecture, data reference, dashboards, feature instrumentation guide, and product overview.
- Updated `docs/README.md` to link the new docs in Guides/Reference/Architecture sections.
- Extended `.vscode/cspell.misc.yaml` with terminology used by the new docs.
Show a summary per file
| File | Description |
|---|---|
| docs/architecture/telemetry.md | New end-to-end architecture doc with diagrams and pipeline detail. |
| docs/reference/telemetry-data.md | New schema reference for events, fields, ResultCode taxonomy, and KQL patterns. |
| docs/reference/telemetry-dashboards.md | New reference for Kusto functions, Power BI reports, and analysis layout. |
| docs/guides/feature-telemetry.md | New step-by-step instrumentation guide for new features. |
| docs/guides/telemetry-overview.md | New product-facing overview of telemetry metrics and dashboards. |
| docs/README.md | Adds links to the new telemetry docs. |
| .vscode/cspell.misc.yaml | Adds terms used in the new docs to the spell-check allowlist. |
Copilot's findings
Comments suppressed due to low confidence (1)
docs/guides/feature-telemetry.md:1
- Trailing whitespace at the end of line 184 (after `fields/`). Minor formatting cleanup.
# Feature Telemetry Guide — Adding Telemetry to New Features
- Files reviewed: 7/7 changed files
- Comments generated: 10
These are set once at process startup and attached to **every** span.

| Field Key | Type | Description | Example Values |
|-----------|------|-------------|----------------|
| `service.name` | string | Always `"azd"` | `azd` |

| Field Key | Type | Description |
|-----------|------|-------------|
| `service.host` | string | Azure service host |
| `service.name` | string | Azure service name |

| `service.name` | string | Azure service name |
| `service.statusCode` | measurement | HTTP status code |
| `service.method` | string | HTTP method |
| `service.errorCode` | measurement | Service-specific error code |

Init = toscalar(initUsers | count),
Provision = toscalar(provisionUsers | count),
Deploy = toscalar(deployUsers | count)

### Basic: Command Usage Over Time
```kql
getAzdEvents(startDate=ago(30d), endDate=now(), true, true)
```

| `js` | JavaScript |
| `ts` | TypeScript |

### PII Protection

- **Hashed fields:** `project.template.id`, `project.template.version`, `project.name`, `env.name` are SHA-256 hashed (case-insensitive) before emission
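
The case-insensitive SHA-256 hashing described in the quoted line can be illustrated with a minimal sketch. It assumes that "case-insensitive" means the value is lowercased before hashing and that the digest is hex-encoded; `hashCaseInsensitive` is a hypothetical helper name, not a function from the azd codebase.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashCaseInsensitive is a hypothetical helper. It lowercases the input
// (an assumption about how case-insensitivity is achieved) and returns
// the hex-encoded SHA-256 digest of the result.
func hashCaseInsensitive(value string) string {
	sum := sha256.Sum256([]byte(strings.ToLower(value)))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := hashCaseInsensitive("My-Project")
	b := hashCaseInsensitive("my-project")
	// Values that differ only by case map to the same digest.
	fmt.Println(a == b) // prints "true"
}
```

Because the hash is one-way, analysts can count distinct projects or environments without recovering their names.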

## KQL Query Library (azd-queries repo)

The [`azd-queries`](https://github.com/devdiv-azure-service-dmitryr/azd-queries) repo contains standalone KQL queries used for dashboards and analysis. These are separate from the deployed Kusto functions.

// Extensions report errors back to the host
return &azdext.CommandResult{
    Error: &azdext.ServiceError{
        Service: "arm",
        StatusCode: resp.StatusCode,
        Code: resp.Error.Code,
    },

## Extension Framework Telemetry

**File:** `cli/azd/cmd/middleware/telemetry.go` (host side), `cli/azd/docs/extensions/extension-framework.md`

| [`devdiv-azure-service-dmitryr/azd-queries`](https://github.com/devdiv-azure-service-dmitryr/azd-queries) | **Pipeline & Governance** | GDPR classification, Kusto table sync, KQL query library |
| [`coreai-microsoft/azure-dev-tools`](https://github.com/coreai-microsoft/azure-dev-tools) → `product-telemetry/azd/` | **Analysis** | Power BI reports, Kusto functions, funnel metrics, investigations |
Are these safe to include?
jongio
left a comment
Summary
Solid docs addition that'll help contributors understand the telemetry stack end-to-end. Three code references don't match what's currently in the codebase, and one of them could mislead users trying to opt out of telemetry.
Findings
| # | Severity | File | Issue |
|---|---|---|---|
| 1 | 🔴 high | telemetry-overview.md | Opt-out config path `defaults.collectTelemetry` doesn't exist in code |
| 2 | 🟡 medium | telemetry-data.md | `ext.upgrade` and `ext.promote` events aren't defined in `events.go` |
| 3 | 🟡 medium | feature-telemetry.md | `azdext.CommandResult` type doesn't exist; extensions use `ReportError()` |
| 4 | 🟣 question | multiple | Internal infrastructure details (Kusto cluster, LENS job IDs, internal repo URLs) in a public repo |
- No Azure credentials, tokens, or connection strings
- No personal names, emails, or IP addresses
- Project names and template names are **hashed** (one-way): we can count unique projects but can't see what they're called
- Users opt out via `azd config set defaults.collectTelemetry no` or `AZURE_DEV_COLLECT_TELEMETRY=no`
🔴 high: The config path defaults.collectTelemetry doesn't exist anywhere in the azd codebase. The only working opt-out is the environment variable AZURE_DEV_COLLECT_TELEMETRY=no (see cli/azd/internal/telemetry/telemetry.go:33,78).
If a user runs azd config set defaults.collectTelemetry no, the key gets stored but nothing reads it. Telemetry keeps flowing.
Suggestion: Remove the config-based opt-out reference and document only the env var, or implement the config path before shipping these docs.
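
A minimal sketch of an environment-variable opt-out check, in the spirit of the behavior described above. The variable name and the value `no` come from the review comment; the helper names and the exact comparison are assumptions for illustration, not the code in `telemetry.go`.

```go
package main

import (
	"fmt"
	"os"
)

// Variable name as cited in the review comment.
const collectTelemetryEnvVar = "AZURE_DEV_COLLECT_TELEMETRY"

// telemetryEnabledFor is a hypothetical helper: telemetry stays on unless
// the variable is set to exactly "no" (exact matching is an assumption).
func telemetryEnabledFor(value string) bool {
	return value != "no"
}

// telemetryEnabled reads the process environment and applies the rule above.
func telemetryEnabled() bool {
	return telemetryEnabledFor(os.Getenv(collectTelemetryEnvVar))
}

func main() {
	os.Setenv(collectTelemetryEnvVar, "no")
	fmt.Println(telemetryEnabled()) // prints "false"

	os.Unsetenv(collectTelemetryEnvVar)
	fmt.Println(telemetryEnabled()) // prints "true"
}
```

Under this sketch, a stored config key such as `defaults.collectTelemetry` has no effect because nothing reads it, which matches the concern raised in the comment.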
| `ext.run` | Extension command execution |
| `ext.install` | Extension installation |
| `ext.upgrade` | Extension upgrade attempt |
| `ext.promote` | Registry promotion (e.g., dev → main) |
🟡 medium: ext.upgrade and ext.promote aren't defined as event constants in cli/azd/internal/tracing/events/events.go. The only extension events that exist today are ext.run (ExtensionRunEvent) and ext.install (ExtensionInstallEvent).
If these are planned but not yet implemented, consider marking them as such. Otherwise, remove them so contributors don't query Kusto for events that return zero rows.
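
The mismatch between documented and defined events can be checked mechanically. The sketch below is illustrative only: the constant names and values are the ones cited in the comment above, while `definedExtensionEvents` and `undefinedEvents` are hypothetical names that do not exist in the azd codebase.

```go
package main

import "fmt"

// Constant names and values as cited in the review comment.
const (
	ExtensionRunEvent     = "ext.run"
	ExtensionInstallEvent = "ext.install"
)

// definedExtensionEvents is a hypothetical lookup set built from the
// constants above.
var definedExtensionEvents = map[string]bool{
	ExtensionRunEvent:     true,
	ExtensionInstallEvent: true,
}

// undefinedEvents returns the documented event names that have no
// matching constant, preserving input order.
func undefinedEvents(documented []string) []string {
	var missing []string
	for _, name := range documented {
		if !definedExtensionEvents[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	documented := []string{"ext.run", "ext.install", "ext.upgrade", "ext.promote"}
	fmt.Println(undefinedEvents(documented)) // prints "[ext.upgrade ext.promote]"
}
```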
        StatusCode: resp.StatusCode,
        Code: resp.Error.Code,
    },
}
🟡 medium: azdext.CommandResult doesn't exist in the azdext package. Extensions report errors back to the host via azdext.ReportError(ctx, err) using the azdext.ServiceError type (defined at cli/azd/pkg/azdext/extension_error.go:16).
This code example won't compile if someone copies it. Consider replacing it with the actual error reporting pattern:
// Extensions report errors back to the host
azdext.ReportError(ctx, &azdext.ServiceError{
    Service: "arm",
    StatusCode: resp.StatusCode,
    Code: resp.Error.Code,
})

## Kusto Functions

Deployed to **`DDAzureClients.DevCli`** under the `azd` folder via a [LENS job](https://lens.msftcloudes.com/#/job/24ce3f0fd3d6499ab8a0d85d0c0c05e2).
🟣 question: Several docs reference internal Microsoft infrastructure that external contributors can't access:
- Kusto cluster `DDAzureClients`, database `DevCli`
- LENS job ID `24ce3f0fd3d6499ab8a0d85d0c0c05e2`
- Internal repos `devdiv-azure-service-dmitryr/azd-queries` and `coreai-microsoft/azure-dev-tools`
Is this intentional? If these docs are for internal contributors only, it might be worth noting that somewhere. If they're meant for external contributors too, the internal links will 404.
Documents the telemetry system end-to-end across azure-dev, azd-queries, and azure-dev-tools.
TLDR
azd telemetry spans 3 repos — instrumentation in azure-dev, pipeline/governance in azd-queries, analysis in azure-dev-tools. These docs capture the full picture: how data flows, what gets collected, how to add telemetry for new features, and where dashboards live.
Architecture
flowchart TB
    subgraph Instrumentation ["azure-dev (Instrumentation)"]
        CLI["azd CLI<br/>(Go + OpenTelemetry)"]
        VSC["VS Code Extension<br/>(@microsoft/vscode-azext-utils)"]
        EXT["Extensions<br/>(structured error reporting)"]
    end
    subgraph Export ["CLI Export Pipeline"]
        MW["Command Middleware<br/>cli/azd/cmd/middleware/telemetry.go"]
        OTel["OTel TracerProvider"]
        AIExp["App Insights Exporter<br/>SpanToEnvelope()"]
        DiskQ["Disk Queue<br/>~/.azd/telemetry/*.trn"]
        Upload["azd telemetry upload<br/>(background / deferred)"]
    end
    subgraph Ingestion ["Azure Monitor / Kusto"]
        AppInsights["Azure Application Insights"]
        Kusto["Azure Data Explorer (Kusto)<br/>DDAzureClients.DevCli"]
        RawTable["RawEventsAppRequests"]
    end
    subgraph Pipeline ["azd-queries (Pipeline & Governance)"]
        GDPR["GDPR Classify Pipeline<br/>eng/pipelines/classify.yml"]
        GDPRTool["gdpr tool<br/>(export → publish → ingest)"]
        GDPRAPI["GDPR API"]
        TableSync["Kusto Table Sync<br/>.github/workflows/ci.yml"]
        IngestScripts["Ingest Scripts<br/>(templates, template versions)"]
        KQLLib["KQL Query Library<br/>(core-usage, insights, aspire, vscode)"]
    end
    subgraph Analysis ["azure-dev-tools (Analysis)"]
        KustoFn["Kusto Functions<br/>(getAzdEvents, addTemplateColumns, etc.)"]
        PBI["Power BI Reports<br/>(KPIs, funnels, user journeys)"]
        Investigations["Ad-hoc Investigations"]
    end
    CLI --> MW --> OTel --> AIExp --> DiskQ --> Upload --> AppInsights
    VSC -->|VS Code telemetry framework| AppInsights
    EXT -->|structured errors via host| MW
    AppInsights --> Kusto --> RawTable
    GDPR -->|reads azure-dev source| GDPRTool --> GDPRAPI
    TableSync --> Kusto
    IngestScripts --> Kusto
    KQLLib -->|queries| RawTable
    KustoFn -->|deployed to DDAzureClients.DevCli| RawTable
    PBI -->|reads via| KustoFn
    Investigations -->|ad-hoc KQL| RawTable

What is added
- `docs/architecture/telemetry.md`: architecture and data flow
- `docs/reference/telemetry-data.md`: events, fields, schema, KQL patterns
- `docs/reference/telemetry-dashboards.md`: Kusto functions, Power BI reports, analysis tools
- `docs/guides/feature-telemetry.md`: how to instrument telemetry for new features
- `docs/guides/telemetry-overview.md`: product-facing overview of metrics and dashboards
- `docs/README.md` with links to the above

Partially contributes to Azure/azure-dev-pr#1772.
Note: Once #8041 merges, will incorporate its extension/error attribute additions into the data reference.