authz: add onPolicyUpdate callback to authz file watcher #9142

Open
hnefatl wants to merge 1 commit into grpc:master from hnefatl:onpolicyupdate

hnefatl commented May 27, 2026

Small additional capability, so user code can tell when a new policy has been loaded.

Our current use case is updating an OpenTelemetry metric to reflect the policy version (~= file mtime). We could just run a goroutine polling the file separately, but then there's no guarantee that the authz policy has actually been loaded: it could have failed parsing, or the process could be CPU-starved, etc.

RELEASE NOTES:

  • TBD

linux-foundation-easycla Bot commented May 27, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: hnefatl / name: Keith Collister (bc68034)

@hnefatl hnefatl force-pushed the onpolicyupdate branch 3 times, most recently from bc68034 to 183e653, May 27, 2026 14:23

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 78.57143% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.23%. Comparing base (5013974) to head (1bb5b67).

Files with missing lines Patch % Lines
authz/grpc_authz_server_interceptors.go 78.57% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9142      +/-   ##
==========================================
- Coverage   83.27%   83.23%   -0.04%     
==========================================
  Files         420      420              
  Lines       34022    34026       +4     
==========================================
- Hits        28331    28323       -8     
- Misses       4262     4271       +9     
- Partials     1429     1432       +3     
Files with missing lines Coverage Δ
authz/grpc_authz_server_interceptors.go 83.33% <78.57%> (+0.98%) ⬆️

... and 30 files with indirect coverage changes

@hnefatl hnefatl marked this pull request as ready for review May 27, 2026 14:29
Comment thread authz/grpc_authz_server_interceptors.go Outdated
@easwars easwars added the "Type: Feature" and "Area: Auth" labels May 27, 2026
@easwars easwars added this to the 1.82 Release milestone May 27, 2026
@hnefatl hnefatl force-pushed the onpolicyupdate branch 4 times, most recently from 4381232 to 056e67a, May 28, 2026 09:27
@easwars easwars assigned easwars and unassigned hnefatl Jun 3, 2026
@mbissa mbissa modified the milestones: 1.82 Release, 1.83 Release Jun 5, 2026
Comment thread authz/grpc_authz_server_interceptors.go Outdated
Comment thread authz/grpc_authz_server_interceptors.go Outdated
Comment thread authz/grpc_authz_server_interceptors.go Outdated
Comment thread authz/grpc_authz_server_interceptors.go Outdated
Comment thread authz/grpc_authz_server_interceptors.go Outdated
Comment thread authz/grpc_authz_server_interceptors_test.go Outdated
Comment thread authz/grpc_authz_server_interceptors_test.go Outdated
Comment thread authz/grpc_authz_server_interceptors_test.go Outdated
Comment thread authz/grpc_authz_server_interceptors_test.go Outdated
Comment on lines +148 to +151
close(updates)
if len(updates) != 0 {
t.Fatalf("expected exactly 2 updates in channel")
}

Contributor

I don't quite understand what this block of code is for.

Author

The intent was to catch if the code calls the callback >1x per file modification.

Contributor

Please replace this block of code. If we make updates a channel of size 1, as I recommended above, you could instead do the following here:

sCtx, sCancel := context.WithTimeout(ctx, 100*time.Millisecond)
defer sCancel()
select {
case <-updates:
	t.Fatal("OnPolicyUpdate callback invoked more times than expected")
case <-sCtx.Done():
}

The above approach has the added benefit of waiting for a short amount of time to ensure that no additional callback invocations happen.

@easwars easwars assigned hnefatl and unassigned easwars Jun 5, 2026
@hnefatl hnefatl force-pushed the onpolicyupdate branch 3 times, most recently from dce75de to a7534e0, June 10, 2026 14:25
@hnefatl hnefatl requested a review from easwars June 10, 2026 14:26

easwars commented Jun 11, 2026

Contributor

@hnefatl : Please don't mark review comment threads as resolved. It is the responsibility of the reviewer to do that. If the author marks them as "resolved", the reviewer would have to "unresolve" each of those to verify if the comments were addressed satisfactorily. Thanks for understanding.

@easwars easwars left a comment

Contributor

LGTM, modulo minor nits.

Comment thread authz/grpc_authz_server_interceptors.go Outdated
// options.
func NewFileWatcherWithOptions(options FileWatcherOptions) (*FileWatcherInterceptor, error) {
if options.PolicyFile == "" {
return nil, fmt.Errorf("authorization policy file path is empty")

Contributor

Nit: While you are here, could you please add an "authz: " prefix to all errors returned by this package from its exported functions? I can only see three of them now.

Author

Done+updated tests in a separate commit (iiuc they'll be merged into a single commit on main? but helps keep things distinct during review).

I kept the existing "exact error matching" rather than switching to e.g. regex matching errors because it was less disruptive, but lmk if you'd prefer switching to a fuzzier match.
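
For illustration, a prefixed error and an exact-match check might look like the sketch below. The `options` type and `validate` function are hypothetical stand-ins, not the package's actual code; only the error text comes from the snippet under review, with the requested prefix added.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical stand-in for the PR's FileWatcherOptions.
type options struct {
	PolicyFile string
}

// validate is a hypothetical helper showing the "authz: " prefix on an
// error returned to callers.
func validate(o options) error {
	if o.PolicyFile == "" {
		return errors.New("authz: authorization policy file path is empty")
	}
	return nil
}

func main() {
	// Exact error matching, as kept in the existing tests.
	want := "authz: authorization policy file path is empty"
	err := validate(options{})
	fmt.Println(err != nil && err.Error() == want)
}
```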


select {
case <-ctx.Done():
t.Fatalf("timeout waiting for policy update")

Contributor

Nit: s/timeout/Timeout

The non-capitalization of error messages does not apply to test error strings and logs. See: https://google.github.io/styleguide/go/decisions#error-strings

Here and elsewhere in this test. Thanks.

Author

Done.

ctx, cancel := context.WithTimeout(t.Context(), defaultTestTimeout)
defer cancel()

updates := make(chan string, 10)

Contributor

Please make this a channel of size 1.

Author

Done.

@easwars easwars left a comment

Contributor

Apologies for the delay in the review.

LGTM, modulo minor nits

}

// FileWatcherOptions contains configuration options for the
// FileWatcherInterceptor.

Contributor

Can we please make this API experimental? We do this by adding the following block to the docstring here:

// # Experimental
//
// Notice: This API is EXPERIMENTAL and may be changed or removed in a
// later release.

Comment on lines +120 to +122
// NewFileWatcherWithOptions returns a new FileWatcherInterceptor from a set of
// options.
func NewFileWatcherWithOptions(options FileWatcherOptions) (*FileWatcherInterceptor, error) {

Contributor

Same with this one. Please make it experimental.

file := createTmpPolicyFile(t, "onpolicyupdate", []byte(content))
i, err := authz.NewFileWatcherWithOptions(authz.FileWatcherOptions{PolicyFile: file, RefreshDuration: time.Millisecond, OnPolicyUpdate: onPolicyUpdate})
if err != nil {
t.Fatalf("NewFileWatcherWithCallback() returned err: %v", err)

Contributor

Nit: s/NewFileWatcherWithCallback/NewFileWatcherWithOptions


content := `{"name": "foo1", "allow_rules":[{"name":"bar"}]}`
file := createTmpPolicyFile(t, "onpolicyupdate", []byte(content))
i, err := authz.NewFileWatcherWithOptions(authz.FileWatcherOptions{PolicyFile: file, RefreshDuration: time.Millisecond, OnPolicyUpdate: onPolicyUpdate})

Contributor

Nit: Please make a local var opts to hold the options and pass it to authz.NewFileWatcherWithOptions. That way, the options can be printed in the following t.Fatalf. See: https://google.github.io/styleguide/go/decisions#identify-the-input


select {
case <-ctx.Done():
t.Fatalf("timeout waiting for policy update")

Contributor

Nit: s/timeout/Timeout

easwars commented Jun 30, 2026

Contributor

/gemini review

easwars commented Jun 30, 2026

Contributor

Moving to @mbissa for a second set of eyes. I had some minor nits in my last pass, but it looks mostly good to me.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request prefixes authorization error messages with "authz: " and introduces FileWatcherOptions to support an OnPolicyUpdate callback when the authorization policy is updated. Feedback on these changes includes replacing t.Context() with context.Background() in tests to maintain compatibility with older Go versions, documenting that the OnPolicyUpdate callback runs synchronously to prevent blocking the background refresh goroutine, and increasing the test's refresh duration from 1ms to 20ms to avoid high CPU usage and test flakiness.

}

func (s) TestOnPolicyUpdate(t *testing.T) {
ctx, cancel := context.WithTimeout(t.Context(), defaultTestTimeout)

Contributor

high

Using t.Context() requires Go 1.24 or later. Since grpc-go maintains compatibility with older Go versions (such as Go 1.22 and 1.23), using t.Context() will break compilation on those versions.

Please use context.Background() instead to ensure backward compatibility.

Suggested change
- ctx, cancel := context.WithTimeout(t.Context(), defaultTestTimeout)
+ ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)

Comment on lines +108 to +110
// OnPolicyUpdate is a callback to be invoked when a policy is
// loaded/updated. The loaded policy string is passed as an argument.
OnPolicyUpdate func(string)

Contributor

medium

Since this callback is executed synchronously within the file watcher's background goroutine, any blocking or slow operations inside OnPolicyUpdate will delay subsequent policy refreshes.

It is highly recommended to document this behavior so users know they should keep the callback fast and non-blocking (or offload heavy work to a separate goroutine).

	// OnPolicyUpdate is a callback to be invoked when a policy is
	// loaded/updated. The loaded policy string is passed as an argument.
	//
	// Note: This callback is executed synchronously on the file watcher's
	// background goroutine. To avoid blocking or delaying subsequent policy
	// refreshes, the callback should be non-blocking and fast. If any
	// slow or blocking operations are needed, they should be run in a
	// separate goroutine.
	OnPolicyUpdate func(string)
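
One way a user might keep such a callback fast is to hand the policy off to a worker goroutine. The sketch below is an assumption about usage, not part of the PR: `asyncCallback` is a hypothetical helper that returns a `func(string)` matching the shape of `OnPolicyUpdate`, and it drops a notification when the worker is still busy rather than blocking the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// asyncCallback is a hypothetical helper (not part of the PR). It returns a
// non-blocking callback with the func(string) shape of OnPolicyUpdate, plus a
// stop function that shuts the worker down and waits for it to finish.
// The callback must not be invoked after stop has been called.
func asyncCallback(handle func(string)) (cb func(string), stop func()) {
	queue := make(chan string, 1)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Slow or blocking work runs here, off the caller's goroutine.
		for policy := range queue {
			handle(policy)
		}
	}()
	cb = func(policy string) {
		select {
		case queue <- policy:
		default:
			// The queue is full because the worker is busy; drop this
			// notification instead of blocking the caller.
		}
	}
	stop = func() {
		close(queue)
		wg.Wait()
	}
	return cb, stop
}

func main() {
	var handled []string
	cb, stop := asyncCallback(func(p string) { handled = append(handled, p) })

	cb(`{"name": "foo1"}`) // Returns immediately; the worker handles it.
	stop()                 // Waits for the worker to drain the queue.

	fmt.Println(handled)
}
```

Dropping notifications is a design choice that suits the metric use case, where only a recent value matters; a caller that needs every update would need a larger queue or a different hand-off.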


content := `{"name": "foo1", "allow_rules":[{"name":"bar"}]}`
file := createTmpPolicyFile(t, "onpolicyupdate", []byte(content))
i, err := authz.NewFileWatcherWithOptions(authz.FileWatcherOptions{PolicyFile: file, RefreshDuration: time.Millisecond, OnPolicyUpdate: onPolicyUpdate})

Contributor

medium

Using a RefreshDuration of time.Millisecond (1ms) is extremely aggressive and causes the background goroutine to poll the file 1000 times per second. This can lead to high CPU usage and test flakiness, especially on resource-constrained CI environments.

Consider increasing this to a more reasonable value like 20 * time.Millisecond.

Suggested change
- i, err := authz.NewFileWatcherWithOptions(authz.FileWatcherOptions{PolicyFile: file, RefreshDuration: time.Millisecond, OnPolicyUpdate: onPolicyUpdate})
+ i, err := authz.NewFileWatcherWithOptions(authz.FileWatcherOptions{PolicyFile: file, RefreshDuration: 20 * time.Millisecond, OnPolicyUpdate: onPolicyUpdate})

Signed-off-by: Keith Collister <kcollister@google.com>

hnefatl commented Jul 1, 2026

Author

Addressed comments from Easwar+Gemini.

3 participants