🐛 webhook: clarify CertDir default and wrap serving-cert load error#3500

Open
alliasgher wants to merge 2 commits into
kubernetes-sigs:mainfrom
alliasgher:fix-webhook-certdir-error
@alliasgher


Summary

Addresses the two small, agreed-upon asks from #900:

  1. Better docs. The `CertDir` default, `<temp-dir>/k8s-webhook-server/serving-certs`, depends on `os.TempDir()`, which is `/tmp` on most Linux installs but follows `TMPDIR` when it is set and differs on other platforms. The maintainer asked to keep the default stable but update the comment to explain this and to nudge operators toward setting `CertDir` explicitly (usually to a mounted Secret path). Done.
  2. Helpful error. When `certwatcher.New` fails because the certs weren't where `CertDir` was pointed, users got a bare `open ...: no such file or directory`. Wrap the error so it names the paths that were attempted, prints the effective `CertDir`/`CertName`/`KeyName`, and points at `webhook.Server.Options.CertDir` as the thing to fix. The underlying error is preserved via `%w`.
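
To make the first point concrete, here is a minimal sketch of how a temp-dir-based default resolves. `defaultCertDir` is a hypothetical helper written for illustration, and the assumption that the default is built by joining the temp dir with these two path segments is mine; it is not copied from the controller-runtime source.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// defaultCertDir is a hypothetical helper: it joins a temp directory with
// the two path segments named in the CertDir doc comment.
func defaultCertDir(tempDir string) string {
	return filepath.Join(tempDir, "k8s-webhook-server", "serving-certs")
}

func main() {
	// os.TempDir() honors TMPDIR on Unix systems and falls back to /tmp,
	// so the effective default moves with the environment.
	fmt.Println(defaultCertDir(os.TempDir()))
}
```

Running this with `TMPDIR` set to a different directory changes the printed path, which is why setting `CertDir` explicitly is the safer choice.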

Before

```
open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory
```

After

```
failed to load serving cert from "/tmp/k8s-webhook-server/serving-certs/tls.crt" / "/tmp/k8s-webhook-server/serving-certs/tls.key" — did you mount the certificate files at webhook.Server.Options.CertDir? (CertDir="/tmp/k8s-webhook-server/serving-certs", CertName="tls.crt", KeyName="tls.key"): open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory
```

Fixes #900

Tests

`go build ./pkg/webhook/...` and `go vet ./pkg/webhook/...` pass. No behavior change except the added context on the error path.

@k8s-ci-robot

Contributor

Welcome @alliasgher!

It looks like this is your first PR to kubernetes-sigs/controller-runtime 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/controller-runtime has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 13, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alliasgher
Once this PR has been reviewed and has the lgtm label, please assign sbueringer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot

Contributor

Hi @alliasgher. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 13, 2026
@alliasgher alliasgher changed the title webhook: clarify CertDir default and wrap serving-cert load error 🐛 webhook: clarify CertDir default and wrap serving-cert load error Apr 14, 2026
@troy0820

Member

/ok-to-test
/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 14, 2026
@alliasgher

Author

/retest

1 similar comment
@sbueringer

Member

/retest

The CertDir default — <temp-dir>/k8s-webhook-server/serving-certs — is
driven by os.TempDir(), which on non-Linux systems (and Linux systems
with TMPDIR set) is not /tmp. When a user misplaces the serving cert,
certwatcher.New returns a bare "no such file or directory" error with
no hint that the path is webhook-specific and no pointer to the option
they need to set.

- Expand the CertDir doc comment to describe the defaulting behavior
  and recommend operators set it explicitly (typically to the path of
  a mounted Secret) rather than rely on the temp-dir default.
- Wrap the certwatcher.New error so the resulting message names the
  paths that were attempted, includes the CertDir / CertName / KeyName
  values, and points at webhook.Server.Options.CertDir as the place to
  fix things.

Fixes kubernetes-sigs#900

Signed-off-by: alliasgher <alliasgher123@gmail.com>
@alliasgher alliasgher force-pushed the fix-webhook-certdir-error branch from e31aa60 to 0f33ec9 on June 14, 2026 11:16
@sbueringer

Member

Looks like this needs some test fixes, PTAL

The builder webhook tests tolerate a missing serving cert when starting
the server via os.IsNotExist(err). Now that Start wraps the cert-load
failure with fmt.Errorf(...: %w), os.IsNotExist no longer matches (it
does not unwrap). Switch the checks to errors.Is(err, fs.ErrNotExist),
which unwraps the chain and still matches the underlying os.PathError.

Signed-off-by: alliasgher <alliasgher123@gmail.com>
@alliasgher

Author

Thanks @sbueringer. Fixed in 0e057fb — the builder webhook tests gate on os.IsNotExist(err) to tolerate the missing serving cert, but os.IsNotExist doesn't unwrap, so the new fmt.Errorf(...: %w) wrapper no longer matched. Switched those checks to errors.Is(err, fs.ErrNotExist), which unwraps and still matches the underlying os.PathError. TestBuilder passes locally with envtest.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 15, 2026
@sbueringer

sbueringer commented Jun 18, 2026

Member

> Thanks @sbueringer. Fixed in 0e057fb — the builder webhook tests gate on os.IsNotExist(err) to tolerate the missing serving cert, but os.IsNotExist doesn't unwrap, so the new fmt.Errorf(...: %w) wrapper no longer matched. Switched those checks to errors.Is(err, fs.ErrNotExist), which unwraps and still matches the underlying os.PathError. TestBuilder passes locally with envtest.

So this PR would break folks that are using os.IsNotExist.

I'm not really sure if that change in error message is worth it. I don't think a lot of folks are using os.IsNotExist, but I also don't think that the current error reporting is a real problem that folks have.

@alvaroaleman WDYT?

@alvaroaleman

Member

> @alvaroaleman WDYT?

Yeah I wouldn't change it

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

ManagerOptions#CertDir default is confusing

5 participants