Proposal
Add GET /v1/client/allocations/prometheus-sd to the client agent HTTP API, serving the node's running allocations as Prometheus HTTP SD target groups.
The endpoint would return one target group per allocated port of every running allocation, labeled with __meta_nomad_* labels (namespace, job, task group, allocation, node, port) plus job/group meta as __meta_nomad_meta_<key>. A ?port=<label> query parameter filters the output to a single port label (e.g. ?port=metrics). The endpoint would serve local client state only and require node:read.
Implementation sketch: ~200 lines in command/agent/ plus a client method. Test plan: unit tests covering target-group rendering, port filtering, ACL enforcement, and IPv6 host-IP bracketing.
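To make the rendering concrete, here is a minimal sketch of how target groups could be built, not the proposed implementation itself. The `Alloc` and `AllocPort` types, the `renderTargetGroups` function, and the exact label names after the `__meta_nomad_` prefix are all assumptions made for illustration; the real code would read from the client's allocation state. The input is assumed to be pre-filtered to running allocations. The output shape (a JSON list of objects with `targets` and `labels`) follows the Prometheus HTTP SD format, and `net.JoinHostPort` handles the IPv6 bracketing.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// TargetGroup mirrors the JSON object shape used by Prometheus HTTP SD.
type TargetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

// AllocPort is a hypothetical, simplified view of one allocated port.
type AllocPort struct {
	Label  string
	HostIP string
	Value  int
}

// Alloc is a hypothetical stand-in for a running allocation in client state.
type Alloc struct {
	ID, Namespace, Job, TaskGroup, Node string
	Meta                                map[string]string // merged job/group meta
	Ports                               []AllocPort
}

// renderTargetGroups emits one target group per allocated port.
// An empty portFilter keeps every port; otherwise only ports whose
// label matches are kept (the ?port=<label> behavior).
func renderTargetGroups(allocs []Alloc, portFilter string) []TargetGroup {
	groups := []TargetGroup{} // non-nil so an empty result encodes as []
	for _, a := range allocs {
		for _, p := range a.Ports {
			if portFilter != "" && p.Label != portFilter {
				continue
			}
			// Label names below are assumed; the proposal only fixes the prefix.
			labels := map[string]string{
				"__meta_nomad_namespace":  a.Namespace,
				"__meta_nomad_job":        a.Job,
				"__meta_nomad_task_group": a.TaskGroup,
				"__meta_nomad_allocation": a.ID,
				"__meta_nomad_node":       a.Node,
				"__meta_nomad_port":       p.Label,
			}
			// Meta key sanitization is omitted for brevity.
			for k, v := range a.Meta {
				labels["__meta_nomad_meta_"+k] = v
			}
			// JoinHostPort wraps IPv6 literals in brackets.
			target := net.JoinHostPort(p.HostIP, strconv.Itoa(p.Value))
			groups = append(groups, TargetGroup{Targets: []string{target}, Labels: labels})
		}
	}
	return groups
}

func main() {
	allocs := []Alloc{{
		ID: "example-alloc", Namespace: "default", Job: "web", TaskGroup: "app", Node: "example-node",
		Meta:  map[string]string{"team": "infra"},
		Ports: []AllocPort{{Label: "metrics", HostIP: "::1", Value: 9100}, {Label: "http", HostIP: "10.0.0.5", Value: 8080}},
	}}
	out, err := json.MarshalIndent(renderTargetGroups(allocs, "metrics"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the rendering as a pure function over already-collected allocation data would let the unit tests in the test plan cover target-group rendering, port filtering, and IPv6 bracketing without starting an agent; ACL enforcement would sit in the HTTP handler in front of it.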
Use-cases
Scraping per-allocation metrics with a node-local collector (e.g. a Prometheus agent per node). Each node's collector discovers its own allocations directly from the local client, so scrape-target discovery fans out to the clients instead of funneling through the servers — no server round-trip, no single discovery bottleneck, and discovery keeps working on a node even when servers are briefly unreachable.