Skip to content

fix(ci): record and use the PR number to post the comment#286

Open
upils wants to merge 5 commits into
canonical:mainfrom
upils:fix/comment-posting
Open

fix(ci): record and use the PR number to post the comment#286
upils wants to merge 5 commits into
canonical:mainfrom
upils:fix/comment-posting

Conversation

@upils
Copy link
Copy Markdown
Collaborator

@upils upils commented Apr 8, 2026

  • Have you signed the CLA?

Properly get the PR number associated to the benchmark run to post the comment
in the right place.
The approach tries to limit injection risk, applying the strategy recommended in
https://securitylab.github.com/resources/github-actions-untrusted-input

Tested with upils#12 after merging this change in my fork.

@upils upils added Simple Nice for a quick look on a minute or two Bug An undesired feature ;-) labels Apr 8, 2026
@upils upils requested a review from cjdcordeiro April 8, 2026 12:52
Copy link
Copy Markdown

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a glance I think these changes would introduce an exploit to enable an attacker access to a trusted workflow environment with access to secrets?

I've linked an alternative approach that demonstrates how to get a PR number more safely.

Comment thread .github/workflows/comment-perf.yaml Outdated
Comment thread .github/workflows/performance.yaml Outdated
Comment thread .github/workflows/performance.yaml Outdated
@upils upils requested a review from letFunny April 21, 2026 07:17
Copy link
Copy Markdown

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short version:

  • ⚠️ Careful with expression use for action inputs like run. The contents is interpolated by GHA before processed by the action (in this case shell script execution). An attacker can manipulate their branch name to compromise this workflow and gain access to secrets.
  • I've provided two suggested revisions with notes, but the gh pr list (GraphQL) suggestion is more strict on searching for the intended PR and should be less at risk to trigger secondary rate limits (but for this repo I don't think you have the activity for that to be a real concern).

uses: mshick/add-pr-comment@dd126dd8c253650d181ad9538d8b4fa218fc31e8 # v2
with:
message-path: benchmark-report.txt
issue: ${{ github.event.workflow_run.pull_requests[0].number }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for context of this removal for anyone interested, github.event.workflow_run.pull_requests is only present when the PR is not a branch from a fork of the repo IIRC, so it's an unreliable source.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing something the context for this PR... how come we have the PR at hand and not the number?

This whole workflow is conditioning on github.event.workflow_run.event == 'pull_request'.

I realize that this is all being done in the name of security, but the amout of fiddling with stuff that supposedly should be simple really makes it feel like we'll see more serious issues soon.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing something the context for this PR... how come we have the PR at hand and not the number?

This workflow is triggered by the other one and is receiving a github.event.workflow_run payload. However github.event.workflow_run.pull_requests is empty when triggered from a fork. This issue was raised to GitHub multiple times, but AFAIK they never really answered nor addressed it. So we have to work around that and resort to a more convoluted method to get the PR number.

Comment thread .github/workflows/comment-perf.yaml
Copy link
Copy Markdown
Collaborator

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions only. Sad state of affairs for Github, honestly. Thanks for the work here!

Comment thread .github/workflows/comment-perf.yaml Outdated
Comment thread .github/workflows/comment-perf.yaml
Copy link
Copy Markdown
Collaborator

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this Paul, we discussed offline and this looks like one of the safest and simplest approaches. It is a bit sad that:

  1. There is not a good way in Github to do something as trivial as posting a comment securely.
  2. There is no standardized action in Canonical that has been vetted by security.

For now, let's move forward with this.

Comment thread .github/workflows/comment-perf.yaml Outdated
Comment thread .github/workflows/comment-perf.yaml Outdated
uses: mshick/add-pr-comment@dd126dd8c253650d181ad9538d8b4fa218fc31e8 # v2
with:
message-path: benchmark-report.txt
issue: ${{ github.event.workflow_run.pull_requests[0].number }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing something the context for this PR... how come we have the PR at hand and not the number?

This whole workflow is conditioning on github.event.workflow_run.event == 'pull_request'.

I realize that this is all being done in the name of security, but the amout of fiddling with stuff that supposedly should be simple really makes it feel like we'll see more serious issues soon.

@polarathene
Copy link
Copy Markdown

@niemeyer If you are aware of a better way to approach this, let me know. Review already addressed two concerns where the contributor could have gained access to secrets 😓

If you would like to simplify it in this case you could just upload the PR number from pull_request to the workflow_run add that to the GITHUB_OUTPUT, but be mindful if anyone were to maintain the workflow or another like it and introduce another output, the PR author would be able to manipulate that.

Below is just extra context on the security decisions if it's helpful.


+ uses: mshick/add-pr-comment@dd126dd8c253650d181ad9538d8b4fa218fc31e8 # v2
  with:
-   message-path: benchmark-report.txt
-   issue: ${{ github.event.workflow_run.pull_requests[0].number }}

I'm missing something the context for this PR... how come we have the PR at hand and not the number?

As mentioned in earlier review comment (which yours appears associated to but Github shows it disjoint on my end at least), the information is not available from PRs coming from other forks.

I realize that this is all being done in the name of security, but the amout of fiddling with stuff that supposedly should be simple really makes it feel like we'll see more serious issues soon.

It's an unfortunate issue with public CI running with third-party contributions that can add untrusted code (a contributor can modify the pull_request event workflow trigger in their own PR and Github Actions will run that modification).

Thus if you need access to secrets, you must do this fiddling to volley over into a trusted workflow.

For a common scenario like needing a PR number as you switch between the untrusted pull_request context to a trusted context in a separate workflow, this is indeed frustrating and has security risks as discussed in this PR that can be easy to make. Until Github resolves it, you either do it this way or adopt an alternative CI.

If you'd like me to direct you to advice from the Github Security Labs team, let me know.


It's possible to do an alternative with the pull_request_target event, which will not run any modified version on the PR branch, but will run with access to secrets.

pull_request_target is strongly discouraged however, since you are running the workflow on PR content with access to secrets if you are likewise not careful, that can be exploited (recently this happened to the trivy project (a popular security scanner) IIRC that used pull_request_target, and as a result their github action was compromised that all projects that ran that newer version would likewise be compromised by running malware in the downstream projects trusted workflows using that action 😓).

The approach taken in this PR here is much safer as only the PR number is needed to add a comment (paired with a markdown document that the PR author could manipulate the file content of). pull_request_target can have other risks like cache poisoning IIRC.

The approach taken here is more important if you did have execution of untrusted code occurring (which you'd handle that in the pull_request workflow), and the workflow_run workflow would strictly not run any untrusted code.

# contributor's control), so store it in a variable to avoid shell
# injection.
# In the unlikely case where multiple PRs, on the same branch, from the
# same author exists, the most recent one is selected.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean we are silently injecting wrong data to the PR?

Can we come up with a better plan here? As mentioned before, this is not looking great.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean we are silently injecting wrong data to the PR?

Sorry it's not clear which part of the highlighted comments you're referring to?:

  • Shell injection? No the comments are informing the maintainer how that is being avoided (so that no future edit unknowingly introduces a vulnerability by inlining third-party controlled data).
  • Selecting the most recent PR? That is best effort and realistically should not happen, but if someone had malicious intent they may attempt to automate a separate PR to be selected instead (which could technically be done to produce a single result in the same way, so long as the timing is right AFAIK).

So it's preventative to the extent that is possible. The exploit is avoided, but technically getting the workflow to target a different PR is viable AFAIK. In this case it's just the posting of a comment of benchmark report to a PR... not great if it's the wrong PR but not necessarily bad either?

The extra care and verbosity in commentary however is preventative towards future maintainers introducing other workflows that may introduce more risk if manipulated.

At least I assume that referencing and trusting a vetted workflow in the repo is far more likely (by said future maintainer/reviewer) than them investing time to understand and be confident in the details, so it's best to account for those risks and surface extra context for awareness (as digging into history of discussion/decisions like in this PR via git blame is far more time intensive).

Copy link
Copy Markdown

@polarathene polarathene Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we come up with a better plan here?

The solution is quite terse (commentary / context aside).

I'm not sure how you want to reduce that further without introducing the security risks?

You can certainly add more complexity, such as adding labels for filter include/exclude criteria, or using manual workflow triggers and reviewers providing the input (again manually).

As the task itself is just to post a comment with text (which the contributor has full control of due to pull_request to a PR), you can choose to keep it simple.

  • The proposal here is only to prevent a bad actor from controlling the PR number (and avoiding earlier accidents that would have compromised the repo).
  • If in future a workflow requires a similar untrusted to trusted context, but the consequences are worse, you would go through all this again to enable it securely. The same mistakes can happen and not be caught by review.

Overview - Workflow flow and challenges faced

Let's just quickly go over the whole scenario:

  1. pull_request workflow (untrusted context) produces output (a benchmark report) that should be posted as a comment on that PR.
  2. Automating a comment requires a trusted context (and thus a separate workflow) to grant permission via secrets / token.
  3. The trusted workflow (in this case workflow_run, which is triggered upon the pull_request workflow completing) lacks the context to identify the correct PR with certainity (only applies when PR source is from a repo fork).
  • This is a fault of the GitHub team, which has done nothing to resolve the concern over the years they've been aware of it. If you'd like me to direct you to advice from the Github Security Labs team on this particular workflow workaround, let me know.
  • Therefore you must reconstruct this information as best you can with the information you do have available. However you must be very careful when that information could be exploited by a third-party.
  1. Query for PRs at the repo with constraints on context available to the workflow. Ideally this is a single PR, but in the odd event it is not and is genuine, the first result will represent the most recently updated PR (which should be correct, assuming no malicious activity to game that).

Beyond that you can add untrusted input from the pull_request workflow to provide the PR number as additional validation. Just be mindful that the attacker can manipulate pull_request as they please, so while it can help avoid extremely rare false-positives, it adds nothing to the preventative security aspect.

Overview - Query for the PR number via gh pr list

A GraphQL query using the available context for the workflow to be as precise as possible at narrowing down the PR number:

  • head:${{ github.event.workflow_run.head_branch }}
    • PR branch name defined by the contributor.
    • Default via GitHub web UI is patch-1.
    • A PR author can raise multiple PRs with this same branch name, and may have already done so in the past, and could do so in future (notable while a workflow run is in progress).
  • sort:updated-desc
    • In that unlikely event the author has existing PRs to the project with the same branch and one of those were updated instead, this constraint prevents mistakenly selecting the newer PR by creation date.
  • author:${{ github.event.workflow_run.head_repository.owner.login }}
    • As head only filters by branch name (eg: the common patch-1), constraining by author is important.
  • ${{ github.event.workflow_run.head_sha }}
    • IIRC, this only ensures that commit SHA that triggered the workflow from the PR exists in any of the PRs. Thus it could technically match more than one PR.

gh pr list is then invoked with those search parameters and a JQ query to extract the PR number from the results or trigger an error to bail early if that search failed.

It is effectively this:

gh pr list --repo "${{ github.repository }}" --state all --search "${QUERY_PR}" \  
  --json number --jq "${JQ_FILTER}" >> "${GITHUB_OUTPUT}"

Inlined QUERY_PR + JQ_FILTER for clarity:

gh pr list \
  --repo "${{ github.repository }}" \
  --state all \
  --search '
    head:${{ github.event.workflow_run.head_branch }}
    sort:updated-desc
    author:${{ github.event.workflow_run.head_repository.owner.login }}
    ${{ github.event.workflow_run.head_sha }}
  ' \
  --json number --jq '
    .[0]
    | if (.number == null) then error("Cannot find PR number") end
    | "number=\(.number)"
  ' \
  >> "${GITHUB_OUTPUT}"

NOTE: As detailed in comments, ENV are used not only to make that expanded version more compact to be easier to digest, but to prevent an attacker from performing shell-injection in the run (_since the ${{ ... }} are interpolated prior to parsing the YAML and executing run.

Copy link
Copy Markdown

@polarathene polarathene Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: Please clarify the direction you want to take:

  • Only allow benchmarks to be posted to be posted by members that create branches on the repo.
  • Take preventative measures to minimize PR number manipulation.
  • Do the bare minimum, permit PR number manipulation.

If you want to pursue the preventative approach:

  • What exactly doesn't sit well with the step in question?
  • What would a more ideal approach look like to you given the constraints (requiring a workaround due to GH not providing what is needed)?
    • The noise can be shifted away from the workflow to live somewhere else
    • You could minimize the logic with pull_request_target, which may feel good to look at as a diff but more prone to compromising security (which this team unfortunately lacks timely access to the needed security expertise for this to even be worth entertaining?).
- name: Get PR number
  env:
    GH_TOKEN: # ...
    QUERY_PR: # ...
    JQ_FILTER: # ...
  run: gh pr list --repo "${{ github.repository }}" --state all --search "${QUERY_PR}" --json number --jq "${JQ_FILTER}" >> "${GITHUB_OUTPUT}"

That isn't that complicated to grok, especially with the detailed breakdown provided.

It should not require further modification, but if that were necessary, you can always choose to communicate very clearly inline for reviewers to be very careful here (with git blame leading to this PR which is abundant in context).


Original comment

As mentioned before, this is not looking great.

You could:

  • Revert to the original proposal if that sits better with you?
  • Decide that posting a benchmark report comment in PRs should be exclusive to branches within the source repo only.
  • Accept the workaround provided it's documented clear and sufficiently (_advisable, given risks caught here may be missed in future_).
  • Hope someone else eventually chimes in with something "better" (I've invested far more time than I'd like on the topic, I'm doubtful unless you've got access to a specialist or convince GH to resolve it).

Or continue to express concerns (1, 2), whilst ignoring my attempt to engage 🤷‍♂️ (I realize I'm incredibly verbose this 2nd time around, but hopefully it's structured in a way that's easier to digest?)

I understand your time is limited, so that you may only be able to glance over the PR diff itself (and maybe skim the discussion for some context).

I hope my replies this time make that skimming easier, such that you're able to ask more meaningful questions regarding how the concern could be approached better, or what tradeoffs you're willing to make instead.

Original Proposed change

The original PR commit effectively added these changes:

# ‎.github/workflows/performance.yaml
- name: Save PR number
  run: echo "${{ github.event.pull_request.number }}" > pr-number

# ‎.github/workflows/comment-perf.yaml
- name: Get PR number
  id: get-pr-number
  run: echo "number=$(cat ./pr-number)" >> "${GITHUB_OUTPUT}"

Quite simple approach, it works good enough?

Broader context view of the integration (NOTE: content below from the referenced commit has been slightly adjusted to fit the current PR review feedback):

    • Added "Save PR number" step.
    • Additionally upload the pr-number file.

‎.github/workflows/performance.yaml: (pull_request, untrusted context)

- name: Run benchmark
  # ...

- name: Save PR number
  run: echo "${{ github.event.pull_request.number }}" > pr-number

- name: Upload result
  uses: actions/upload-artifact@v4
  with:
    name: benchmark-report
    path: |
      benchmark-report.md
      pr-number
    retention-days: 1
    • Added "Read PR number" step.
    • Removed path in "Download comment" (extract to CWD).
    • Swapped ${{ github.event.workflow_run.pull_requests[0].number }} for ${{ steps.get-pr-number.outputs.number }}.

‎.github/workflows/comment-perf.yaml: (workflow_run, untrusted context)

- name: Download comment
  uses: actions/download-artifact@v4
  with:
    name: benchmark-report
    github-token: ${{ secrets.ROCKSBOT_CHISEL_PR_COMMENTER }}
    run-id: ${{ github.event.workflow_run.id }}

- name: Get PR number
  id: get-pr-number
  run: echo "number=$(cat ./pr-number)" >> "${GITHUB_OUTPUT}"

- name: Post message to PR
  uses: mshick/add-pr-comment@dd126dd8c253650d181ad9538d8b4fa218fc31e8 # v2
  with:
    message-path: benchmark-report.md
    issue: ${{ steps.get-pr-number.outputs.number }}

My original feedback was mistakenly aimed at >> "${GITHUB_ENV}", which detailed and exploit, but since the environment wasn't being updated there was no issue here (beyond contributor manipulating PR number that the comment gets posted to).

The PR then tried to better defend against the PR number manipulation, but introduced a vulnerability that would have compromised the workflow (shell injection, enabling RCE). Which has since been resolved 👍


Diff - Single step to find PR number

For context, the only difference from the above snippets (in this comment) was to replace the "Save PR number" + "Get PR number" steps with:

‎.github/workflows/comment-perf.yaml:

- name: Get PR number
  id: get-pr-number
  env:
    GH_TOKEN: ${{ secrets.ROCKSBOT_CHISEL_PR_COMMENTER }}
    # Search constraints - Isolate the PR as best as possible, select number of first PR result.
    # SECURITY: Contributor controlled context expressions must evaluate here, not in `run`
    #           to avoid shell injection vulnerability.
    QUERY_PR: "head:${{ github.event.workflow_run.head_branch }} sort:updated-desc author:${{ github.event.workflow_run.head_repository.owner.login }} ${{ github.event.workflow_run.head_sha }}"
    # Filters and formats the JSON into a `key=value` string with basic error handling:
    JQ_FILTER: >-
      .[0]
      | if (.number == null) then error("Cannot find PR number") end
      | "number=\(.number)"
  run: |
    gh pr list --repo "${{ github.repository }}" --state all --search "${QUERY_PR}" \ 
    --json number --jq "${JQ_FILTER}" >> "${GITHUB_OUTPUT}"

No need to upload a potentially manipulated PR number from the contributor pull_request workflow 👍

NOTE: As noted in this feedback (end of comment), GH_TOKEN could be GH_TOKEN: ${{ github.token }} instead, you'd just need to scope the workflow token permissions like this:

permissions:
  contents: read
  pull-requests: read

That verbosity just better mitigates manipulation of the PR number as opposed to blindly trusting it. I'd love to know a better way but unless GH bothers to resolve it properly and include that in the workflow_run context when PRs are from forks, there's not much you can do. You can of course hide this verbosity away in a separate file to invoke as an action (or composite action if folding multiple steps).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug An undesired feature ;-) Simple Nice for a quick look on a minute or two

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants