
Add telemetry documentation #8133

Open
rajeshkamal5050 wants to merge 6 commits into main from telemetry-docs

@rajeshkamal5050
Contributor

@rajeshkamal5050 rajeshkamal5050 commented May 11, 2026

Documents the telemetry system end-to-end across azure-dev, azd-queries, and azure-dev-tools.

TLDR

azd telemetry spans 3 repos — instrumentation in azure-dev, pipeline/governance in azd-queries, analysis in azure-dev-tools. These docs capture the full picture: how data flows, what gets collected, how to add telemetry for new features, and where dashboards live.

Architecture

flowchart TB
    subgraph Instrumentation ["azure-dev (Instrumentation)"]
        CLI["azd CLI<br/>(Go + OpenTelemetry)"]
        VSC["VS Code Extension<br/>(@microsoft/vscode-azext-utils)"]
        EXT["Extensions<br/>(structured error reporting)"]
    end

    subgraph Export ["CLI Export Pipeline"]
        MW["Command Middleware<br/>cli/azd/cmd/middleware/telemetry.go"]
        OTel["OTel TracerProvider"]
        AIExp["App Insights Exporter<br/>SpanToEnvelope()"]
        DiskQ["Disk Queue<br/>~/.azd/telemetry/*.trn"]
        Upload["azd telemetry upload<br/>(background / deferred)"]
    end

    subgraph Ingestion ["Azure Monitor / Kusto"]
        AppInsights["Azure Application Insights"]
        Kusto["Azure Data Explorer (Kusto)<br/>DDAzureClients.DevCli"]
        RawTable["RawEventsAppRequests"]
    end

    subgraph Pipeline ["azd-queries (Pipeline & Governance)"]
        GDPR["GDPR Classify Pipeline<br/>eng/pipelines/classify.yml"]
        GDPRTool["gdpr tool<br/>(export → publish → ingest)"]
        GDPRAPI["GDPR API"]
        TableSync["Kusto Table Sync<br/>.github/workflows/ci.yml"]
        IngestScripts["Ingest Scripts<br/>(templates, template versions)"]
        KQLLib["KQL Query Library<br/>(core-usage, insights, aspire, vscode)"]
    end

    subgraph Analysis ["azure-dev-tools (Analysis)"]
        KustoFn["Kusto Functions<br/>(getAzdEvents, addTemplateColumns, etc.)"]
        PBI["Power BI Reports<br/>(KPIs, funnels, user journeys)"]
        Investigations["Ad-hoc Investigations"]
    end

    CLI --> MW --> OTel --> AIExp --> DiskQ --> Upload --> AppInsights
    VSC -->|VS Code telemetry framework| AppInsights
    EXT -->|structured errors via host| MW
    AppInsights --> Kusto --> RawTable

    GDPR -->|reads azure-dev source| GDPRTool --> GDPRAPI
    TableSync --> Kusto
    IngestScripts --> Kusto
    KQLLib -->|queries| RawTable

    KustoFn -->|deployed to DDAzureClients.DevCli| RawTable
    PBI -->|reads via| KustoFn
    Investigations -->|ad-hoc KQL| RawTable

What is added

  • docs/architecture/telemetry.md — architecture and data flow
  • docs/reference/telemetry-data.md — events, fields, schema, KQL patterns
  • docs/reference/telemetry-dashboards.md — Kusto functions, Power BI reports, analysis tools
  • docs/guides/feature-telemetry.md — how to instrument telemetry for new features
  • docs/guides/telemetry-overview.md — product-facing overview of metrics and dashboards
  • Updated docs/README.md with links to the above

Partially contributes to Azure/azure-dev-pr#1772.

Note: Once #8041 merges, I will incorporate its extension/error attribute additions into the data reference.

Document the end-to-end telemetry system across azure-dev, azd-queries,
and azure-dev-tools. Covers architecture, data reference, feature
instrumentation guide, dashboards/reports, and product overview.

Partially contributes to Azure/azure-dev-pr#1772.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rajeshkamal5050 and others added 5 commits May 11, 2026 09:41
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add service target values (including containerapp-dotnet, ai.endpoint)
- Add service language values
- Add feature-to-telemetry mapping table for outside-in lookup

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep docs focused on describing the system, not tracking issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move env.name row out of Service Languages table (wrong column count)
- Fix Feature Mapping: auth.login.method → auth.method (actual field)
- Fix Feature Mapping: project.infra.type → infra.provider
- Fix Feature Mapping: packaging.type → pack.builder.image/tag
- Fix Feature Mapping: update.availableVersion → fromVersion/toVersion
- Fix Feature Mapping: ExecutionEnvironment → execution.environment
- Fix KQL examples: MachineId → Properties['machine.id']
- Fix PII claim: no PII → no direct PII, sensitive values are hashed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rajeshkamal5050 rajeshkamal5050 marked this pull request as ready for review May 15, 2026 19:16
@rajeshkamal5050 rajeshkamal5050 requested a review from wbreza as a code owner May 15, 2026 19:16
Copilot AI review requested due to automatic review settings May 15, 2026 19:16
@rajeshkamal5050 rajeshkamal5050 added the skip-governance Skip PR governance checks label May 15, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a comprehensive set of telemetry documentation pages covering architecture, data schema, dashboards, instrumentation guidance, and a product-facing overview, and wires them into the docs index plus cspell allowlists.

Changes:

  • New telemetry docs: architecture, data reference, dashboards, feature instrumentation guide, and product overview.
  • Updated docs/README.md to link the new docs in Guides/Reference/Architecture sections.
  • Extended .vscode/cspell.misc.yaml with terminology used by the new docs.
| File | Description |
|------|-------------|
| docs/architecture/telemetry.md | New end-to-end architecture doc with diagrams and pipeline detail. |
| docs/reference/telemetry-data.md | New schema reference for events, fields, ResultCode taxonomy, and KQL patterns. |
| docs/reference/telemetry-dashboards.md | New reference for Kusto functions, Power BI reports, and analysis layout. |
| docs/guides/feature-telemetry.md | New step-by-step instrumentation guide for new features. |
| docs/guides/telemetry-overview.md | New product-facing overview of telemetry metrics and dashboards. |
| docs/README.md | Adds links to the new telemetry docs. |
| .vscode/cspell.misc.yaml | Adds terms used in the new docs to the spell-check allowlist. |

Copilot's findings

Comments suppressed due to low confidence (1)

docs/guides/feature-telemetry.md:1

  • Trailing whitespace at the end of line 184 (after fields/). Minor formatting cleanup.
# Feature Telemetry Guide — Adding Telemetry to New Features
  • Files reviewed: 7/7 changed files
  • Comments generated: 10

Comment on lines +145 to +149
These are set once at process startup and attached to **every** span.

| Field Key | Type | Description | Example Values |
|-----------|------|-------------|----------------|
| `service.name` | string | Always `"azd"` | `azd` |
Comment on lines +266 to +269
| Field Key | Type | Description |
|-----------|------|-------------|
| `service.host` | string | Azure service host |
| `service.name` | string | Azure service name |
| `service.name` | string | Azure service name |
| `service.statusCode` | measurement | HTTP status code |
| `service.method` | string | HTTP method |
| `service.errorCode` | measurement | Service-specific error code |
Comment on lines +600 to +602
Init = toscalar(initUsers | count),
Provision = toscalar(provisionUsers | count),
Deploy = toscalar(deployUsers | count)

### Basic: Command Usage Over Time
```kql
getAzdEvents(startDate=ago(30d), endDate=now(), true, true)
Comment on lines +207 to +208
| `js` | JavaScript |
| `ts` | TypeScript |

### PII Protection

- **Hashed fields:** `project.template.id`, `project.template.version`, `project.name`, `env.name` are SHA-256 hashed (case-insensitive) before emission
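
A minimal Go sketch of the hashing described in the bullet above. It assumes "case-insensitive" means the value is lowercased before hashing and that the digest is hex-encoded; the helper name `hashField` is hypothetical and is not taken from the azd source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashField returns a hex-encoded SHA-256 digest of value.
// Assumption: inputs are lowercased first, so values that differ
// only by case produce the same digest.
func hashField(value string) string {
	sum := sha256.Sum256([]byte(strings.ToLower(value)))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two spellings of the same project name map to one digest.
	fmt.Println(hashField("My-Project") == hashField("my-project"))
}
```

Because the digest is one-way, a query can count distinct hashed values without recovering the original names.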

## KQL Query Library (azd-queries repo)

The [`azd-queries`](https://github.com/devdiv-azure-service-dmitryr/azd-queries) repo contains standalone KQL queries used for dashboards and analysis. These are separate from the deployed Kusto functions.
Comment on lines +160 to +166
// Extensions report errors back to the host
return &azdext.CommandResult{
Error: &azdext.ServiceError{
Service: "arm",
StatusCode: resp.StatusCode,
Code: resp.Error.Code,
},

## Extension Framework Telemetry

**File:** `cli/azd/cmd/middleware/telemetry.go` (host side), `cli/azd/docs/extensions/extension-framework.md`
Comment on lines +14 to +15
| [`devdiv-azure-service-dmitryr/azd-queries`](https://github.com/devdiv-azure-service-dmitryr/azd-queries) | **Pipeline & Governance** | GDPR classification, Kusto table sync, KQL query library |
| [`coreai-microsoft/azure-dev-tools`](https://github.com/coreai-microsoft/azure-dev-tools) → `product-telemetry/azd/` | **Analysis** | Power BI reports, Kusto functions, funnel metrics, investigations |
Contributor

Are these safe to include?

Member

@jongio jongio left a comment

Summary

Solid docs addition that'll help contributors understand the telemetry stack end-to-end. Three code references don't match what's currently in the codebase, and one of them could mislead users trying to opt out of telemetry.

Findings

| # | Severity | File | Issue |
|---|----------|------|-------|
| 1 | 🔴 high | telemetry-overview.md | Opt-out config path `defaults.collectTelemetry` doesn't exist in code |
| 2 | 🟡 medium | telemetry-data.md | `ext.upgrade` and `ext.promote` events aren't defined in `events.go` |
| 3 | 🟡 medium | feature-telemetry.md | `azdext.CommandResult` type doesn't exist; extensions use `ReportError()` |
| 4 | 🟣 question | multiple | Internal infrastructure details (Kusto cluster, LENS job IDs, internal repo URLs) in a public repo |

- No Azure credentials, tokens, or connection strings
- No personal names, emails, or IP addresses
- Project names and template names are **hashed** (one-way) — we can count unique projects but can't see what they're called
- Users opt out via `azd config set defaults.collectTelemetry no` or `AZURE_DEV_COLLECT_TELEMETRY=no`
Member

🔴 high: The config path `defaults.collectTelemetry` doesn't exist anywhere in the azd codebase. The only working opt-out is the environment variable `AZURE_DEV_COLLECT_TELEMETRY=no` (see `cli/azd/internal/telemetry/telemetry.go:33,78`).

If a user runs `azd config set defaults.collectTelemetry no`, the key gets stored but nothing reads it. Telemetry keeps flowing.

Suggestion: Remove the config-based opt-out reference and document only the env var, or implement the config path before shipping these docs.
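
For illustration, an environment-variable-only opt-out could look roughly like the sketch below. This is not the code in `telemetry.go`; the function name `telemetryEnabled`, the injected lookup function, and the exact-match comparison against `no` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// Name of the opt-out variable quoted in the review comment.
const collectTelemetryEnvVar = "AZURE_DEV_COLLECT_TELEMETRY"

// telemetryEnabled reports whether telemetry should be collected.
// The lookup function is injected so the check can be exercised
// without touching the real process environment.
// Assumptions: telemetry is on when the variable is unset, and only
// the exact value "no" turns it off.
func telemetryEnabled(lookup func(string) (string, bool)) bool {
	value, ok := lookup(collectTelemetryEnvVar)
	if !ok {
		return true
	}
	return value != "no"
}

func main() {
	fmt.Println("telemetry enabled:", telemetryEnabled(os.LookupEnv))
}
```

A config-based opt-out would need an equivalent read of the stored key somewhere on this path, which is the piece the comment above reports as missing.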

| `ext.run` | Extension command execution |
| `ext.install` | Extension installation |
| `ext.upgrade` | Extension upgrade attempt |
| `ext.promote` | Registry promotion (e.g., dev → main) |
Member

🟡 medium: `ext.upgrade` and `ext.promote` aren't defined as event constants in `cli/azd/internal/tracing/events/events.go`. The only extension events that exist today are `ext.run` (`ExtensionRunEvent`) and `ext.install` (`ExtensionInstallEvent`).

If these are planned but not yet implemented, consider marking them as such. Otherwise, remove them so contributors don't query Kusto for events that return zero rows.
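
A check like this could be automated. The sketch below compares documented event names against the defined ones; the two constants are local copies of the names quoted in this comment rather than an import of the real `events` package, and `undefinedEvents` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// Local copies of the event constants named in the review comment.
// They are not imported from the azd events package.
const (
	ExtensionRunEvent     = "ext.run"
	ExtensionInstallEvent = "ext.install"
)

// undefinedEvents returns, in sorted order, every documented event
// name that has no matching entry in defined.
func undefinedEvents(documented, defined []string) []string {
	known := make(map[string]bool, len(defined))
	for _, name := range defined {
		known[name] = true
	}
	var missing []string
	for _, name := range documented {
		if !known[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	documented := []string{"ext.run", "ext.install", "ext.upgrade", "ext.promote"}
	defined := []string{ExtensionRunEvent, ExtensionInstallEvent}
	fmt.Println(undefinedEvents(documented, defined)) // prints [ext.promote ext.upgrade]
}
```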

StatusCode: resp.StatusCode,
Code: resp.Error.Code,
},
}
Member

🟡 medium: `azdext.CommandResult` doesn't exist in the `azdext` package. Extensions report errors back to the host via `azdext.ReportError(ctx, err)` using the `azdext.ServiceError` type (defined at `cli/azd/pkg/azdext/extension_error.go:16`).

This code example won't compile if someone copies it. Consider replacing it with the actual error reporting pattern:

// Extensions report errors back to the host
azdext.ReportError(ctx, &azdext.ServiceError{
    Service:    "arm",
    StatusCode: resp.StatusCode,
    Code:       resp.Error.Code,
})


## Kusto Functions

Deployed to **`DDAzureClients.DevCli`** under the `azd` folder via a [LENS job](https://lens.msftcloudes.com/#/job/24ce3f0fd3d6499ab8a0d85d0c0c05e2).
Member

🟣 question: Several docs reference internal Microsoft infrastructure that external contributors can't access:

  • Kusto cluster DDAzureClients, database DevCli
  • LENS job ID 24ce3f0fd3d6499ab8a0d85d0c0c05e2
  • Internal repos devdiv-azure-service-dmitryr/azd-queries and coreai-microsoft/azure-dev-tools

Is this intentional? If these docs are for internal contributors only, it might be worth noting that somewhere. If they're meant for external contributors too, the internal links will 404.
