Azure: Require catalog-provided credentials for Key Vault client #16541
wombatu-kun wants to merge 1 commit into
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Can using DefaultAzureCredential cause a similar problem with the ADLS client? It seems to me that when
@nandorKollar good catch: yes, the same

That ADLS exposure is actually called out in #16465 itself: the report notes the Key Vault case "is less exposed than the ADLS location issue because it is config-driven rather than metadata-driven, but it is still an endpoint-trust break." So the ADLS case is known as a separate concern and considered the more exposed of the two: there the target storage host comes from table-metadata locations rather than a single catalog config property.

I kept this PR scoped to Key Vault on purpose, because the ADLS mitigation can't be the same shape. For Key Vault, requiring a catalog-vended credential is reasonable since ambient Key Vault auth is a niche setup. For ADLS, authenticating to your own storage via a managed identity (

Since #16465 came in through the private Apache security list, I'd suggest the ADLS location case be triaged the same way rather than expanded here. But let's land this Key Vault PR first, and I'm happy to help drive the ADLS follow-up afterwards.
Great, thanks! I didn't notice that #16465 mentioned ADLS too! It sounds more important; I don't think the Key Vault path is used in Iceberg as often as the storage access path.
    }

    if (allProperties.containsKey(ADLS_TOKEN_CREDENTIAL_PROVIDER)) {
      return Optional.of(AdlsTokenCredentialProviders.from(allProperties).credential());
Nit: probably AdlsTokenCredentialProviders is not the best name here, as this is connecting to Key Vault now.
Would it make sense to handle the two separately? Is it a plausible use case that someone would want to use, for example, managed identities to connect to the Key Vault and, at the same time, SAS tokens for the storage accounts?
Closes #16465.
Summary
`AzureKeyManagementClient` built its Key Vault `KeyClient` from an arbitrary `azure.keyvault.url` and authenticated using ambient Azure credentials: when no `adls.token-credential-provider` was configured, the credential chain fell back to `DefaultAzureCredentialBuilder().build()` (managed identity / environment / Azure CLI). A malicious or mis-scoped catalog configuration could therefore redirect Key Vault authentication traffic to an attacker-controlled endpoint and exfiltrate the client's ambient bearer token, as reported in #16465.

Following the direction suggested on the issue, the Key Vault client now authenticates only with a credential supplied through configuration (a catalog-vended `adls.token`, or an explicitly configured `adls.token-credential-provider`) and no longer silently falls back to ambient `DefaultAzureCredential`. When no such credential is provided it fails fast with a `ValidationException`. With this change a malicious catalog can at most recover a credential it already issued, rather than the client's ambient identity.

What changed

`AzureProperties` gains `keyVaultTokenCredential()`, which returns the catalog-/operator-supplied credential and is empty when only ambient credentials would be available; the inline `TokenCredential` builder previously used for ADLS is extracted into a shared helper and reused.

`AzureKeyManagementClient` now resolves its credential through this method and throws a `ValidationException` when none is configured. ADLS storage credential behavior is unchanged; this is a Key Vault-only change.

Note: the supplied token must be scoped for Key Vault (for the public cloud, `https://vault.azure.net`). `adls.token` is reused as the carrier rather than introducing a new property; a dedicated `azure.keyvault.token` could be added as a follow-up if preferred.

Tests

`TestAzureProperties` adds unit coverage for `keyVaultTokenCredential()`: resolution from `adls.token`, resolution from a custom token-credential provider, and, critically, that it is empty (no ambient fallback) when nothing is configured. The integration `TestAzureKeyManagementClient` now asserts both sides of the new contract against a live vault: the old URL-only configuration fails with `ValidationException`, and wrap/unwrap works when a Key Vault-scoped token is provided via `adls.token`.
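
For readers skimming the description, the contract above (resolve only a configured credential, never fall back to ambient identity, fail fast otherwise) can be sketched without the Azure SDK. This is a minimal illustration, not the code in this PR: the class `KeyVaultCredentialSketch`, the `CredentialSource` enum, and `requireKeyVaultCredential` are hypothetical names, `IllegalStateException` stands in for Iceberg's `ValidationException`, the vault URL is a placeholder, and checking `adls.token` before the provider is an assumption about precedence. Only the property keys come from the description.

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; hypothetical names, not the Iceberg implementation. */
final class KeyVaultCredentialSketch {
  // Property keys named in the PR description.
  static final String ADLS_TOKEN = "adls.token";
  static final String ADLS_TOKEN_CREDENTIAL_PROVIDER = "adls.token-credential-provider";

  /** Hypothetical stand-in for a resolved TokenCredential: records only its origin. */
  enum CredentialSource {
    STATIC_TOKEN,
    CUSTOM_PROVIDER
  }

  /**
   * Returns the configured credential source, or empty when only ambient credentials
   * would be available. Checking adls.token first is an assumption of this sketch.
   */
  static Optional<CredentialSource> keyVaultTokenCredential(Map<String, String> properties) {
    if (properties.containsKey(ADLS_TOKEN)) {
      return Optional.of(CredentialSource.STATIC_TOKEN);
    }
    if (properties.containsKey(ADLS_TOKEN_CREDENTIAL_PROVIDER)) {
      return Optional.of(CredentialSource.CUSTOM_PROVIDER);
    }
    // Deliberately no ambient (DefaultAzureCredential-style) fallback.
    return Optional.empty();
  }

  /** Fails fast when nothing is configured; IllegalStateException stands in for ValidationException. */
  static CredentialSource requireKeyVaultCredential(Map<String, String> properties) {
    return keyVaultTokenCredential(properties)
        .orElseThrow(
            () ->
                new IllegalStateException(
                    "No Key Vault credential configured; set "
                        + ADLS_TOKEN
                        + " or "
                        + ADLS_TOKEN_CREDENTIAL_PROVIDER));
  }

  public static void main(String[] args) {
    // Placeholder vault URL and token, for illustration only.
    Map<String, String> urlOnly =
        Map.of("azure.keyvault.url", "https://example.vault.azure.net");
    Map<String, String> withToken =
        Map.of(
            "azure.keyvault.url", "https://example.vault.azure.net",
            ADLS_TOKEN, "placeholder-token");

    System.out.println("with token: " + requireKeyVaultCredential(withToken));
    try {
      requireKeyVaultCredential(urlOnly);
    } catch (IllegalStateException e) {
      System.out.println("url only rejected: " + e.getMessage());
    }
  }
}
```

The three cases exercised by the unit tests map onto this sketch directly: a map containing `adls.token`, a map containing only `adls.token-credential-provider`, and a URL-only map, which must resolve to empty rather than to an ambient credential.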