feat(pepper): route Bedrock calls through tagged inference profiles [DEV-245] #42

Merged: brodkin merged 1 commit into main from ryan/dev-245-bedrock-cost-attribution-profiles on May 10, 2026

brodkin (Contributor) commented May 10, 2026

Summary

Closes DEV-245. Pepper currently runs both review and on-demand flows through the same Bedrock model ID, so Cost Explorer collapses them into one undifferentiated line item — we can't tell which flow drives spend.

This wires Pepper through two AWS Application Inference Profiles, both wrapping the existing us.anthropic.claude-sonnet-4-5-20250929-v1:0 system inference profile, each carrying its own cost-allocation tags:

| Profile | ARN | Tags |
| --- | --- | --- |
| pepper-pr-review | arn:aws:bedrock:us-west-2:618640261060:application-inference-profile/cz21awrop223 | Product=pepper, Mode=review |
| pepper-on-demand | arn:aws:bedrock:us-west-2:618640261060:application-inference-profile/68jw718dw1jv | Product=pepper, Mode=on-demand |

Both profiles already exist in account 618640261060. The GitHubActions-ClaudeCode-Bedrock role's existing BedrockModelAccess policy already grants InvokeModel* on application-inference-profile/* — no IAM change required.
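
For readers who need to reproduce this setup elsewhere, a tagged profile could plausibly be created through boto3's Bedrock control-plane client. The sketch below only builds the request arguments, so it runs without AWS credentials. The helper name `build_profile_request` and the placeholder source ARN are illustrative assumptions, not part of this PR.

```python
from typing import Any


def build_profile_request(name: str, source_arn: str, tags: dict[str, str]) -> dict[str, Any]:
    """Build keyword arguments for boto3's bedrock `create_inference_profile` call."""
    return {
        "inferenceProfileName": name,
        # copyFrom names the system inference profile (or foundation model) being wrapped.
        "modelSource": {"copyFrom": source_arn},
        # Bedrock expects tags as a list of lowercase key/value pairs.
        "tags": [{"key": key, "value": value} for key, value in tags.items()],
    }


# Placeholder: substitute the ARN of the Sonnet 4.5 system inference profile.
SOURCE_ARN = "<system-inference-profile-arn>"

review_request = build_profile_request(
    "pepper-pr-review", SOURCE_ARN, {"Product": "pepper", "Mode": "review"}
)
# With credentials available, this would be sent as:
#   boto3.client("bedrock").create_inference_profile(**review_request)
```

The tags attached here are what later surface as cost-allocation tag keys, so both profiles should use identical key spelling (Product, Mode).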

Workflow changes

  • New review_model and on_demand_model inputs default to the two profile ARNs.
  • Existing model input becomes an empty-default override that, when set, wins for both modes (back-compat for callers that currently set the model input; this repo's example does not).
  • New "Resolve model for this run" step picks the right ARN for the resolved mode.
  • claude_args: --model reads from the resolved value.
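
The precedence described in the list above (override wins, else per-mode default) can be restated as a small function. This is a Python sketch of the logic only, not the workflow step itself; the function name `resolve_model` and the mode strings "review" and "on-demand" are assumptions inferred from the profile tags.

```python
REVIEW_PROFILE = (
    "arn:aws:bedrock:us-west-2:618640261060:application-inference-profile/cz21awrop223"
)
ON_DEMAND_PROFILE = (
    "arn:aws:bedrock:us-west-2:618640261060:application-inference-profile/68jw718dw1jv"
)


def resolve_model(
    mode: str,
    model: str = "",
    review_model: str = REVIEW_PROFILE,
    on_demand_model: str = ON_DEMAND_PROFILE,
) -> str:
    """Return the model identifier to pass to --model for this run."""
    # A non-empty override wins regardless of mode (back-compat path).
    if model:
        return model
    # Otherwise fall back to the per-mode default profile.
    if mode == "review":
        return review_model
    if mode == "on-demand":
        return on_demand_model
    # Assumed behavior: fail loudly rather than bill an unknown mode to the wrong profile.
    raise ValueError(f"unknown mode: {mode!r}")
```

Raising on an unknown mode is a design choice of this sketch; silently defaulting to one profile would misattribute spend, which is the problem the PR sets out to fix.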

Follow-up (not in this PR)

  • Cost allocation tag activation. AWS only registers tag keys after a billable resource emits them, so aws ce update-cost-allocation-tags-status failed with "Tag keys not found". After this lands and the first review runs, activate Product and Mode in Billing → Cost Allocation Tags. Cost Explorer will populate ~24h later.
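
Once the tag keys are registered, the activation could also be scripted rather than done in the console. The sketch below builds the payload for boto3's Cost Explorer `update_cost_allocation_tags_status` call; the helper name `build_activation_request` is hypothetical, and the actual API call is left as a comment because it needs billing credentials.

```python
def build_activation_request(tag_keys: list[str]) -> dict:
    """Build keyword arguments that mark each tag key as an active cost allocation tag."""
    return {
        "CostAllocationTagsStatus": [
            {"TagKey": key, "Status": "Active"} for key in tag_keys
        ]
    }


activation_request = build_activation_request(["Product", "Mode"])
# With credentials available, this would be sent as:
#   boto3.client("ce").update_cost_allocation_tags_status(**activation_request)
```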

Test plan

  • PR triggers Pepper auto-review against itself; review-mode profile ARN appears in the "Resolve model for this run" step output
  • @pepper comment triggers on-demand mode; on-demand profile ARN appears in resolution step output
  • After ~24h, Product and Mode tag keys are selectable in Billing → Cost Allocation Tags
  • After ~24h post-activation, Cost Explorer grouped by Mode shows two non-zero series for Bedrock spend
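
The last check could be run programmatically as well as in the Cost Explorer UI. The sketch below builds arguments for boto3's Cost Explorer `get_cost_and_usage` call, grouped by the Mode tag. The helper name `build_mode_cost_query` is hypothetical, and the service filter is left optional because the exact service name under which this spend is billed should be confirmed in Cost Explorer first.

```python
from datetime import date, timedelta
from typing import Optional


def build_mode_cost_query(
    end: date, days: int = 7, service_values: Optional[list[str]] = None
) -> dict:
    """Build keyword arguments for a daily cost query grouped by the Mode tag."""
    start = end - timedelta(days=days)
    query = {
        # Start is inclusive and End is exclusive, both as YYYY-MM-DD strings.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": "Mode"}],
    }
    if service_values:
        # Optional: restrict to specific billed services (names are an assumption to verify).
        query["Filter"] = {"Dimensions": {"Key": "SERVICE", "Values": service_values}}
    return query


mode_query = build_mode_cost_query(date(2026, 5, 17))
# With credentials available, this would be sent as:
#   boto3.client("ce").get_cost_and_usage(**mode_query)
```

If the wiring works, the response should contain separate groups for the review and on-demand tag values, each with non-zero cost.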

…DEV-245]

Adds per-mode AWS Application Inference Profiles so we can attribute
Bedrock spend to review vs on-demand flows in Cost Explorer:

- review_model defaults to pepper-pr-review profile (Mode=review)
- on_demand_model defaults to pepper-on-demand profile (Mode=on-demand)
- model input becomes an opt-in override that wins for both modes

Both profiles wrap the same Sonnet 4.5 system inference profile that
was previously used directly, so behavior is unchanged — only billing
attribution changes. Cost allocation tag activation in AWS Billing is
follow-up: the tags only register after the first invocation through
each profile.

pepper-pr-review (bot) added the pepper-cooking label ("Pepper is currently reviewing this PR") May 10, 2026

pepper-pr-review (bot) left a comment

Verified against DEV-245 — aligned. The workflow now routes review-mode and on-demand-mode Bedrock calls through separate Application Inference Profiles tagged Mode=review and Mode=on-demand, so Cost Explorer can finally tell you which flow is eating the budget. The three-input hierarchy (override wins, else per-mode default) is clearly documented and the resolution step at :348–367 branches exactly right. Test plan is observational (check the resolution logs, wait 24h for billing to populate) — correct shape for a config-wiring change where the proof is in the spend attribution downstream.

Yours,
Pepper

When you're ready for another look, drop a comment with @pepper review.

pepper-pr-review (bot) added the pepper-approved label ("Pepper approved this PR") and removed the pepper-cooking label ("Pepper is currently reviewing this PR") May 10, 2026
brodkin merged commit d0fdf8d into main May 10, 2026
1 check passed
brodkin deleted the ryan/dev-245-bedrock-cost-attribution-profiles branch May 10, 2026 01:04