Add configurability to set tags and metadata on langfuse traces #53

Merged
eshulman2 merged 4 commits into forge-sdlc:main from danchild:feature-metadata-tags-config
Jun 18, 2026

@danchild

@danchild danchild commented May 21, 2026

Contributor

Summary

  • Adds configurability for Langfuse trace tags and metadata via LANGFUSE_TRACE_TAGS and LANGFUSE_TRACE_METADATA env vars
  • Ports Grafana dashboards (business, engineering, issue-detail) into devtools/grafana/ with full provisioning
  • Fixes trace field leakage into agent system prompts (workflow-internal fields were appearing in LLM context)

Changes

Langfuse trace configurability

  • LANGFUSE_TRACE_TAGS / LANGFUSE_TRACE_METADATA: comma-separated field lists that control what workflow context is attached to Langfuse traces
  • Available fields: ticket_key, ticket_type, project_id, workflow_step, repo, pr_number, ci_status, event_source, event_type, llm_model, retry_count, system_prompt_length
  • Fields are validated at startup; unknown values are ignored and logged rather than raising an error
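
As a rough illustration of the parsing and validation described above, here is a minimal sketch. The enum below covers only a subset of the listed fields, and `parse_field_list` is a hypothetical name used for illustration, not necessarily the function in this PR.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class TracingField(str, Enum):
    # Illustrative subset of the available fields; the real enum may differ.
    TICKET_KEY = "ticket_key"
    TICKET_TYPE = "ticket_type"
    WORKFLOW_STEP = "workflow_step"
    RETRY_COUNT = "retry_count"


def parse_field_list(raw: str) -> list[TracingField]:
    """Parse a comma-separated value, skipping and logging unknown names."""
    fields: list[TracingField] = []
    for name in raw.split(","):
        name = name.strip()
        if not name:
            continue  # tolerate empty entries such as a trailing comma
        try:
            fields.append(TracingField(name))
        except ValueError:
            logger.warning("Ignoring unknown trace field: %s", name)
    return fields
```

With this sketch, a value such as `ticket_key, bogus, retry_count` yields the two known fields and logs one warning for `bogus`.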

Grafana dashboard stack

  • Three dashboards: forge-business (ticket throughput, cycle time), forge-engineering (LLM latency, token usage), forge-issue-detail (per-ticket trace drill-down)
  • Datasources: ClickHouse (Langfuse traces), Prometheus (Forge metrics), Redis
  • devtools/grafana/compose.grafana.yml: standalone Grafana compose for dashboard dev
  • devtools/grafana/compose.langfuse-network.yml: optional overlay that joins Grafana to a self-hosted Langfuse Docker network (requires Langfuse to be running — omit this file if not using self-hosted Langfuse)
  • Wired into both docker-compose.yml and devtools/docker-compose.dev.yml

Trace context leakage fix (prompted by review feedback)

  • Added trace_context parameter to ForgeAgent.run_task() — fields passed there are forwarded to Langfuse only and never written to the system prompt
  • Reverted generate_prd, generate_spec, generate_epics, regenerate_with_feedback, answer_question to their original minimal prompt contexts; workflow state trace fields go via trace_context instead
  • Fixed sync_pr_description in code_review.py the same way
  • Previously, fields like current_node, event_type, retry_count were leaking verbatim into the agent system prompt for tasks where they're irrelevant

Docs

  • docs/reference/config.md: added Langfuse trace field and Grafana variable reference
  • docs/developer-guide.md: added trace tags/metadata section and Grafana setup instructions
  • .env.example: documented new vars with recommended values for dashboard compatibility

Test plan

  • uv run pytest tests/unit/ -q passes
  • Start stack with docker compose --env-file .env -f docker-compose.yml up -d prometheus grafana — Grafana reachable at http://localhost:3010
  • With self-hosted Langfuse: add -f devtools/grafana/compose.langfuse-network.yml — ClickHouse datasource connects and dashboard panels render
  • Without Langfuse: base command (no network overlay) starts successfully — Prometheus and Redis panels work

🤖 Generated with Claude Code

@danchild danchild force-pushed the feature-metadata-tags-config branch 3 times, most recently from 34b8135 to bc14b93 Compare May 26, 2026 19:58
@eshulman2

Collaborator

Review Notes

Overall

Good architecture — the TracingField enum + resolver pattern is clean, the config parsing with validation is solid, and test coverage is comprehensive. A few things to address before merge:

1. Bug workflow nodes are not covered

The PR only enriches context in feature workflow nodes (prd_generation, spec_generation, epic_decomposition, task_generation, pr_creation, qa_handler, code_review).

The bug workflow nodes that invoke the agent were not updated:

  • triage.py — only passes context={"ticket_key": ticket_key}, missing all other trace fields
  • rca_analysis.py — invokes the agent via ContainerRunner, different code path entirely
  • plan_bug_fix.py — same, uses ContainerRunner

If someone configures LANGFUSE_TRACE_TAGS=ticket_type,workflow_step, bug workflow traces will be missing those tags.

2. The per-node enrichment approach is fragile

The current design requires every node that calls the agent to manually build a ~6-line context dict:

context = {
    "ticket_key": ticket_key,
    "ticket_type": state.get("ticket_type", ""),
    "current_node": state.get("current_node", ""),
    "event_type": state.get("event_type", ""),
    "event_source": state.get("context", {}).get("source", ""),
    "retry_count": state.get("retry_count", 0),
}

This is copy-pasted into 7 files and will need to be added to every future node that calls the agent. If someone forgets (as happened with the bug workflow nodes), traces from that node get no tags/metadata.

Suggested alternative: Resolve the trace fields once in the orchestrator worker — it already has the full workflow state at invocation time — and store the resolved (tags, metadata) in the state dict or pass them via the LangGraph config. The agent's run_task() would then pick them up automatically without any node needing to know about tracing. This would:

  • Eliminate the per-node boilerplate
  • Cover all nodes automatically (including future ones)
  • Remove the risk of forgetting to add trace context to new nodes
  • Work with ContainerRunner-based nodes without changes
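
A minimal sketch of that alternative, assuming a `ContextVar`-based holder. The names `set_trace_context` and `get_trace_context` match helpers mentioned later in this thread, but their bodies here, along with `worker_invoke`, `triage_node`, and the simplified `run_task`, are assumptions made for illustration.

```python
import asyncio
from contextvars import ContextVar
from typing import Any

# Hypothetical holder for trace fields resolved once per workflow run.
_trace_context: ContextVar[dict[str, Any]] = ContextVar("trace_context", default={})

TRACE_KEYS = ("ticket_key", "ticket_type", "current_node")  # illustrative subset


def set_trace_context(fields: dict[str, Any]) -> None:
    _trace_context.set(dict(fields))


def get_trace_context() -> dict[str, Any]:
    return dict(_trace_context.get())


async def run_task(task: str) -> dict[str, Any]:
    # Agent side: picks up trace fields without the node passing anything.
    return {"task": task, "trace": get_trace_context()}


async def triage_node(state: dict[str, Any]) -> dict[str, Any]:
    # The node knows nothing about tracing.
    return await run_task("triage")


async def worker_invoke(state: dict[str, Any]) -> dict[str, Any]:
    # Worker side: resolve once from the full workflow state.
    set_trace_context({key: state.get(key) for key in TRACE_KEYS})
    return await triage_node(state)
```

Because the node never builds a trace dict, a newly added node would be covered without any tracing-specific code.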

3. Field naming inconsistency

TracingField.WORKFLOW_STEP resolves by reading state["current_node"] but produces a metadata key called "workflow_step". Same for current_repo → repo, current_pr_number → pr_number. The resolvers work correctly, but the mismatch between config names, Langfuse keys, and actual state keys is confusing. Consider naming TracingField members to match their state key names (e.g., CURRENT_NODE instead of WORKFLOW_STEP).

@danchild danchild force-pushed the feature-metadata-tags-config branch from bc14b93 to 1bcdf07 Compare June 8, 2026 14:40
danchild added a commit to danchild/forge that referenced this pull request Jun 8, 2026
Addresses review feedback from forge-sdlc#53:
- Resolve tags/metadata once in the worker and pass via state/config,
  eliminating copy-pasted per-node context dicts across 7 feature nodes
- Extend coverage to ContainerRunner-based bug workflow nodes (triage,
  rca_analysis, plan_bug_fix) so all nodes get trace enrichment
- Fix TracingField naming to match state keys (WORKFLOW_STEP → CURRENT_NODE,
  etc.) for consistency between config names, Langfuse keys, and state keys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dan Childers <dchilder@redhat.com>
@danchild danchild force-pushed the feature-metadata-tags-config branch from 9da99f6 to 6eb63de Compare June 8, 2026 17:19
@danchild danchild force-pushed the feature-metadata-tags-config branch from 6eb63de to 2b787f3 Compare June 8, 2026 17:32
@danchild

danchild commented Jun 8, 2026

Contributor Author

Hi @eshulman2, thank you for the feedback. I've pushed a number of changes that address all of your concerns. Please let me know if you have questions.

@danchild danchild force-pushed the feature-metadata-tags-config branch from 2b787f3 to 8e82dcb Compare June 8, 2026 18:07
@danchild danchild force-pushed the feature-metadata-tags-config branch from 8e82dcb to 3c11e12 Compare June 8, 2026 18:12
@eshulman2

eshulman2 commented Jun 11, 2026

Collaborator

Claude Code review

All three concerns from the original review are addressed: bug workflow nodes are now covered via the orchestrator worker, per-node context boilerplate is eliminated, and TracingField members are renamed to match state key names (CURRENT_NODE, REPO, PR_NUMBER).

Found 2 issues in the refactored code:

  1. if value: should be if value is not None: — empty-string resolved values (e.g., ticket_key="") are silently dropped with no warning, making misconfigured state invisible in traces

for field in settings.trace_tag_fields:
    value = resolve_field(field, state)
    if value:
        tags.append(value)
metadata: dict[str, Any] = {}
for field in settings.trace_metadata_fields:
    value = resolve_field(field, state)
    if value:
        metadata[field.value] = value
return tags, metadata

  2. "ticket_key": ticket_key or "" in regenerate_with_feedback puts an empty string into task_context when ticket_key=None. Since run_task merges explicit context over trace context (merged = {**get_trace_context(), **(context or {})}), this overwrites the valid ticket_key from the orchestrator's trace context with "", silently breaking Langfuse session tracking for that invocation. Fix: only include ticket_key in task_context when it is not None.

logger.info(f"Regenerating {content_type} with feedback using Deep Agents")
task_context = {
    "is_revision": True,
    "ticket_key": ticket_key or "",
}
result = await self.run_task(
    task=skill_name,
    prompt=prompt,
    context=task_context,
)
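
Both fixes can be sketched in isolation as follows. This is an illustration under stated assumptions: `collect_metadata`, `build_task_context`, and `merge_context` are hypothetical helper names, and `state.get(field)` stands in for `resolve_field(field, state)`. Only the merge expression is taken from the review text.

```python
from typing import Any, Optional


def collect_metadata(fields: list[str], state: dict[str, Any]) -> dict[str, Any]:
    """Keep every resolved value except None, so "" and 0 stay visible in traces."""
    metadata: dict[str, Any] = {}
    for field in fields:
        value = state.get(field)  # stand-in for resolve_field(field, state)
        if value is not None:
            metadata[field] = value
    return metadata


def build_task_context(ticket_key: Optional[str]) -> dict[str, Any]:
    """Include ticket_key only when the caller actually has one."""
    task_context: dict[str, Any] = {"is_revision": True}
    if ticket_key is not None:
        task_context["ticket_key"] = ticket_key
    return task_context


def merge_context(trace_context: dict[str, Any], context: Optional[dict[str, Any]]) -> dict[str, Any]:
    # Same merge order as quoted in the review: explicit context wins.
    return {**trace_context, **(context or {})}
```

With `build_task_context(None)`, the merged result keeps the orchestrator's `ticket_key` instead of overwriting it with an empty string.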

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

danchild added a commit to danchild/forge that referenced this pull request Jun 11, 2026
Addresses review feedback from forge-sdlc#53:
- Resolve tags/metadata once in the worker and pass via state/config,
  eliminating copy-pasted per-node context dicts across 7 feature nodes
- Extend coverage to ContainerRunner-based bug workflow nodes (triage,
  rca_analysis, plan_bug_fix) so all nodes get trace enrichment
- Fix TracingField naming to match state keys (WORKFLOW_STEP → CURRENT_NODE,
  etc.) for consistency between config names, Langfuse keys, and state keys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dan Childers <dchilder@redhat.com>
@danchild danchild force-pushed the feature-metadata-tags-config branch from 3c11e12 to ede1ba3 Compare June 11, 2026 18:42
@danchild danchild force-pushed the feature-metadata-tags-config branch from b47dbef to 27e6c43 Compare June 11, 2026 19:14
@danchild

Contributor Author

Hi @eshulman2 - thank you for the feedback. I made the changes and also edited .env.example to improve its documentation and make sure the new env vars are empty by default.

@eshulman2 eshulman2 left a comment

Collaborator

the last thing that is missing IMHO is docs, please update the config reference and the developer guide with information regarding this change

@danchild

Contributor Author

Good idea, @eshulman2 - I just added documentation as described and pushed the changes. Please let me know what you think

@eshulman2 eshulman2 left a comment

Collaborator

I noticed you removed some context from some nodes, why is that?

Comment thread src/forge/workflow/nodes/code_review.py Outdated
  task="sync-pr-description",
  prompt=prompt,
- context={"owner": owner, "repo": repo, "pr_number": pr_number},
+ context={

Collaborator

any reason for this change?

Comment thread src/forge/workflow/nodes/epic_decomposition.py
@eshulman2

eshulman2 commented Jun 16, 2026

Collaborator

I believe this might give some context on the issues I mentioned in the review.

The trace context should not be merged into the agent system prompt. Agent prompts and skills should be isolated from the tags/metadata we collect for observability.

The problem is in run_task() — the trace context (set via set_trace_context) is merged into merged which then feeds both resolve_trace_fields() and the system prompt context block. This causes observability fields like is_blocked, ci_fix_attempts, ai_review_status, revision_requested, etc. to appear in every agent's system prompt regardless of whether they're relevant to the task.

The merge was introduced because ticket_key was removed from explicit node context dicts (reclassified as a trace field), and merging the trace context back into the prompt was the way to restore it. But ticket_key is domain context the agent needs — it shouldn't have been removed from the explicit context in the first place.

Suggested fix: keep the two concerns separate in run_task():

  • System prompt gets only the explicit context dict passed by the caller (domain context)
  • Langfuse fields are resolved from get_trace_context() alone, independently

# Domain context → system prompt only
if context:
    system_prompt += "\n\nContext:\n"
    for key, value in context.items():
        if value is not None:
            system_prompt += f"- {key}: {value}\n"

# Trace context → Langfuse only, never touches the prompt
trace_state = {**get_trace_context(), "system_prompt_length": len(system_prompt), "llm_model": ...}
trace_tags, trace_metadata = resolve_trace_fields(trace_state)

Nodes that need ticket_key in the prompt should pass it explicitly in their domain context dict — as they did before this PR.

@eshulman2

Collaborator

A few other things worth addressing:

event_type is never written to workflow state. BaseState gets a new event_type: str field and there's a resolver for it, but event_type lives on QueueMessage — it's never written into the LangGraph state dict. So _resolve_event_type will always return None. Either write it to state at workflow start, or remove the field from the resolver until it has a real source.

The system_prompt_length missing warning is presumptuous. trace_metadata_fields warns at startup when Langfuse is enabled but system_prompt_length isn't in LANGFUSE_TRACE_METADATA. The whole point of this PR is opt-in configuration — treating one specific field as implicitly required contradicts that. If the operator intentionally left it out, they still get a warning. Just remove it.

parse_trace_fields naming is inverted. Called with allow_tags=True for the tags config and allow_tags=False for metadata — but allow_tags=False means "don't enforce tag eligibility" (i.e. allow everything). The boolean reads backwards. tags_only=True/False or enforce_tag_eligibility would be clearer.

ContextVar is never reset after workflow completion. set_trace_context is called before ainvoke but there's no cleanup after it returns. In the worker's long-running loop the same asyncio task context could carry a previous run's trace fields into the next invocation. Reset to {} after each ainvoke call.
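
One way to make that cleanup hard to forget is a `try`/`finally` around the invocation. The sketch below is an assumption-laden illustration: `fake_ainvoke` stands in for the compiled graph's `ainvoke`, and `run_workflow` and `worker_loop` are hypothetical names.

```python
import asyncio
from contextvars import ContextVar
from typing import Any

_trace_context: ContextVar[dict[str, Any]] = ContextVar("trace_context", default={})


async def fake_ainvoke(state: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the graph invocation; returns the trace fields it can see.
    return dict(_trace_context.get())


async def run_workflow(state: dict[str, Any]) -> dict[str, Any]:
    _trace_context.set({"ticket_key": state["ticket_key"]})
    try:
        return await fake_ainvoke(state)
    finally:
        # Reset even if the workflow raises, so the next run starts clean.
        _trace_context.set({})


async def worker_loop() -> tuple[dict[str, Any], dict[str, Any]]:
    seen = await run_workflow({"ticket_key": "ABC-1"})
    leftover = dict(_trace_context.get())  # what the next iteration would inherit
    return seen, leftover
```

Here the first run sees its own `ticket_key`, and the same asyncio task finds an empty trace context afterwards.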

Duplicate fields are silently accepted. LANGFUSE_TRACE_TAGS=ticket_key,ticket_key produces duplicate tags. A quick dedup pass in parse_trace_fields before returning would prevent it.
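
A possible shape for the dedup pass, combined with the `tags_only` naming suggested above. This is a sketch only: the real `parse_trace_fields` signature and return type are not shown in this thread, and `TAG_ELIGIBLE` is a hypothetical set invented for the example.

```python
# Hypothetical set of fields allowed as tags; the real eligibility rules may differ.
TAG_ELIGIBLE = frozenset({"ticket_key", "ticket_type"})


def parse_trace_fields(raw: str, *, tags_only: bool) -> list[str]:
    """Split a comma-separated list, optionally enforce tag eligibility, and dedupe."""
    names = [name.strip() for name in raw.split(",") if name.strip()]
    if tags_only:
        names = [name for name in names if name in TAG_ELIGIBLE]
    # dict.fromkeys preserves first-seen order while dropping repeats.
    return list(dict.fromkeys(names))
```

`dict.fromkeys` keeps the operator's ordering, which a plain `set()` would not guarantee.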

danchild and others added 2 commits June 18, 2026 11:28
- Remove inline "system_prompt_length"
- Create the mechanism to pass node state to context to allow writing
  configured metadata and traces to langfuse traces

Signed-off-by: Dan Childers <dchilder@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@eshulman2 eshulman2 force-pushed the feature-metadata-tags-config branch from 3434028 to 1a9a736 Compare June 18, 2026 09:19
…prompts

Grafana dashboards:
- Port forge-business, forge-engineering, forge-issue-detail dashboards
  into devtools/grafana/dashboards/
- Add datasource provisioning for ClickHouse (Langfuse), Prometheus, Redis
- Add compose.grafana.yml (standalone) and compose.langfuse-network.yml
  (optional overlay for self-hosted Langfuse) compose files
- Wire Grafana into docker-compose.yml and devtools/docker-compose.dev.yml
- Document required LANGFUSE_TRACE_TAGS/METADATA values for dashboard
  queries in .env.example, config reference, and developer guide
- Clarify compose.langfuse-network.yml as opt-in: running it without
  Langfuse up fails the whole stack; base commands now work without it

Trace context fix:
- Add trace_context parameter to ForgeAgent.run_task() — fields passed
  there go to Langfuse only and are never written to the system prompt
- Revert generate_prd/spec/epics/regenerate/answer_question to original
  minimal prompt contexts; workflow state trace fields forwarded via
  trace_context instead
- Fix sync_pr_description in code_review.py the same way

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@eshulman2 eshulman2 merged commit 6b81238 into forge-sdlc:main Jun 18, 2026
2 checks passed