Skip to content

[fix][broker] Fix tenant admin authorization bug.#20068

Merged
nodece merged 6 commits into
apache:masterfrom
dragonls:master-fix-tenant-admin-auth
Apr 19, 2023
Merged

[fix][broker] Fix tenant admin authorization bug.#20068
nodece merged 6 commits into
apache:masterfrom
dragonls:master-fix-tenant-admin-auth

Conversation

@dragonls

@dragonls dragonls commented Apr 11, 2023

Copy link
Copy Markdown
Contributor

Fixes #20066

Motivation

Fixes the bug that producers/consumers will all disconnect while using tenant admin to produce/consume.

The root cause is that the permission check logic is not the same bewteen org.apache.pulsar.broker.service.ServerCnx and org.apache.pulsar.broker.service.persistent.PersistentTopic#onPoliciesUpdate.

In org.apache.pulsar.broker.service.ServerCnx, while handleProducer and handleSubscribe, the permission check will go to isTopicOperationAllowed and finally be processed by org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync, which will validateTenantAdminAccess. Everything is fine.

In org.apache.pulsar.broker.service.persistent.PersistentTopic#onPoliciesUpdate, it checks the permission by producer.checkPermissionsAsync and consumer.checkPermissionsAsync. Let's take the processing logic of the producer as an example. It will be processed by org.apache.pulsar.broker.authorization.AuthorizationService#canProduceAsync and finally org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#canProduceAsync. The tenant admin can not pass the check.

Modifications

Update org.apache.pulsar.broker.authorization.AuthorizationProvider#canProduceAsync to org.apache.pulsar.broker.authorization.AuthorizationProvider#allowTopicOperationAsync in org.apache.pulsar.broker.authorization.AuthorizationService#canProduceAsync. So do org.apache.pulsar.broker.authorization.AuthorizationService#canConsumeAsync and org.apache.pulsar.broker.authorization.AuthorizationService#canLookupAsync

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Added and fixed the test in org.apache.pulsar.broker.auth.AuthorizationTest#simple
  • Added and fixed the test in org.apache.pulsar.websocket.proxy.ProxyAuthorizationTest

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: dragonls#7

@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Apr 11, 2023
@dragonls dragonls force-pushed the master-fix-tenant-admin-auth branch from 093e668 to 78d265f Compare April 11, 2023 12:48
@dragonls

dragonls commented Apr 12, 2023

Copy link
Copy Markdown
Contributor Author

In fact, there are many places in the current code that check the permission of super user or tenant admin:

  1. org.apache.pulsar.broker.authorization.AuthorizationProvider#isSuperUser in org.apache.pulsar.broker.authorization.AuthorizationService, including functions canProduceAsync, canConsumeAsync, canLookupAsync. No tenant admin check here.
  2. org.apache.pulsar.broker.authorization.AuthorizationProvider#isSuperUserOrAdmin in org.apache.pulsar.broker.authorization.AuthorizationService, including function allowSinkOpsAsync, allowSourceOpsAsync, allowFunctionOpsAsync.
  3. After this PR, org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#validateTenantAdminAccess in org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider, including functions canProduceAsync, canConsumeAsync. No tenant admin check in allowTheSpecifiedActionOpsAsync.

Maybe we need to uniformly add/update permission check wherever we need. And also remove some reduplicated check, for example:

  1. validateTenantAdminAccess in org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync
  2. isSuperUser in org.apache.pulsar.broker.authorization.AuthorizationService, including functions canProduceAsync, canConsumeAsync, canLookupAsync.

I am just afaid that there are still some authorization bug not found.

What do you think? @Technoboy- @nodece

@nodece

nodece commented Apr 12, 2023

Copy link
Copy Markdown
Member

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

@dragonls

dragonls commented Apr 12, 2023

Copy link
Copy Markdown
Contributor Author

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

Sounds good. I can do this after this PR merged.

@Technoboy-

Copy link
Copy Markdown
Contributor

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

Sounds good. I can do this after this PR merged.

Yes, this will keep the logic the same. But we'd better discuss it in the dev mail list first.

@dragonls

Copy link
Copy Markdown
Contributor Author

Make sense. I also find some function in org.apache.pulsar.broker.authorization.AuthorizationService marked deprecated.
image

Maybe we can do the same thing.

@nodece

nodece commented Apr 12, 2023

Copy link
Copy Markdown
Member

Thank @dragonls for your contribution, I made #19461 to that before.

@dragonls

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter

codecov-commenter commented Apr 12, 2023

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.96%. Comparing base (dc1cdac) to head (44bf9fe).
⚠️ Report is 2644 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20068       +/-   ##
=============================================
+ Coverage     24.26%   72.96%   +48.69%     
- Complexity      294    31823    +31529     
=============================================
  Files          1609     1868      +259     
  Lines        125669   138332    +12663     
  Branches      13707    15220     +1513     
=============================================
+ Hits          30490   100930    +70440     
+ Misses        90689    29390    -61299     
- Partials       4490     8012     +3522     
Flag Coverage Δ
inttests 24.13% <0.00%> (-0.13%) ⬇️
systests 24.78% <0.00%> (?)
unittests 72.25% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...sar/broker/authorization/AuthorizationService.java 59.75% <100.00%> (+54.55%) ⬆️

... and 1590 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@michaeljmarshall michaeljmarshall left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to change the canProduceAsync and canConsumeAsync methods. Instead, we can change the calling code. Let me know what you think.

@nodece nodece left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

@dragonls, Sorry for providing an incorrect review.

@dragonls

Copy link
Copy Markdown
Contributor Author

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

I didn't find dead loop. Can you be more specific?

@nodece

nodece commented Apr 17, 2023

Copy link
Copy Markdown
Member

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

I didn't find dead loop. Can you be more specific?

Sorry that I provide an incorrect review.

@nodece nodece left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nodece nodece requested a review from Technoboy- April 18, 2023 09:36
@nodece

nodece commented Apr 18, 2023

Copy link
Copy Markdown
Member

/pulsarbot rerun-failure-checks

@nodece

nodece commented Apr 19, 2023

Copy link
Copy Markdown
Member

Thank @dragonls for your contribution!

@nodece nodece merged commit fc17c1d into apache:master Apr 19, 2023
@nodece nodece added this to the 3.1.0 milestone Apr 19, 2023
michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
@michaeljmarshall

Copy link
Copy Markdown
Member

@nodece @dragonls - I don't think we should have taken this direction with the PR. I propose an alternative here: #20142.

michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
michaeljmarshall added a commit that referenced this pull request Apr 20, 2023
…20142)

Fixes: #20066

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, this change helps us move in the right direction.

### Modifications

* Update `Producer` and `Consumer` in broker to call the `AuthorizationService#allowTopicOperationAsync` method.

### Verifying this change

This change is trivial.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.
michaeljmarshall added a commit that referenced this pull request Apr 20, 2023
…#20143)

This reverts commit fc17c1d.

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think this approach could have unintended consequences. Instead, I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method. I propose an update to the `Consumer` and `Producer` logic here #20142.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, I think we should not change their implementations.

### Modifications

* Revert #20068

### Verifying this change

This change is trivial. It removes certain test changes that were only made to make the previous PR work.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.
@dragonls

dragonls commented Apr 20, 2023

Copy link
Copy Markdown
Contributor Author

@nodece @dragonls - I don't think we should have taken this direction with the PR. I propose an alternative here: #20142.

OK. I looked into all relative discussion, the right direction seems to be:

  1. All superUser/admin check should be moved into AuthorizationService, which keep AuthorizationProvider clean and more focused on the inner check.
  2. Outer callers, such as Producer/Consumer, shoud call AuthorizationService.allowTopicOperationAsync instead of canProduceAsync/canConsumerAsync/canLookupAsync. The canProduceAsync/canConsumerAsync/canLookupAsync in AuthorizationProvider might be deprecated in the future.

@dragonls dragonls deleted the master-fix-tenant-admin-auth branch April 20, 2023 08:56
@poorbarcode

poorbarcode commented May 7, 2023

Copy link
Copy Markdown
Contributor

poorbarcode pushed a commit that referenced this pull request May 7, 2023
Co-authored-by: druidliu <druidliu@tencent.com>
(cherry picked from commit fc17c1d)
poorbarcode pushed a commit that referenced this pull request May 7, 2023
…20142)

Fixes: #20066

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, this change helps us move in the right direction.

### Modifications

* Update `Producer` and `Consumer` in broker to call the `AuthorizationService#allowTopicOperationAsync` method.

### Verifying this change

This change is trivial.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.

(cherry picked from commit dc5e497)
poorbarcode pushed a commit that referenced this pull request May 7, 2023
…#20143)

This reverts commit fc17c1d.

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think this approach could have unintended consequences. Instead, I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method. I propose an update to the `Consumer` and `Producer` logic here #20142.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, I think we should not change their implementations.

### Modifications

* Revert #20068

### Verifying this change

This change is trivial. It removes certain test changes that were only made to make the previous PR work.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.

(cherry picked from commit 00dc7a0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] [broker] Producer and consumer will all be disconnected while using tenant admin to produce/consume

7 participants