Commit 3ea7c43

zhoward-1 and claude authored
docs: add authentication and identity provider setup guide (#1039)
## Summary

Adds `docs/operator-guides/authentication.md`. The compliance guide mentions RBAC and OIDC but has no setup instructions; this fills that gap.

Covers:

- Enabling RBAC in the API server ConfigMap
- OIDC configuration with step-by-step instructions for Okta, Google Workspace, Azure AD, and Keycloak
- Session token expiry configuration
- MFA (enforced at IdP level; explains the delegation pattern)
- `RoleBinding` and `ClusterRoleBinding` examples for user and group access
- Multi-tenant namespace isolation with `NetworkPolicy`
- Service-to-service auth: worker → API server (TLS + service account), controller → compute cluster (ray-manager token)
- Guidance on preventing direct S3/etcd access

Part of the operator/contributor guide improvements proposed in #1033.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent a291f54 commit 3ea7c43

1 file changed

Lines changed: 195 additions & 0 deletions

@@ -0,0 +1,195 @@
# Authentication & Identity

Authentication in Michelangelo operates at two levels:

- **User authentication**: end users authenticate to the Michelangelo API and UI via an identity provider (IdP)
- **Service authentication**: internal services (worker, controller manager) authenticate to each other using Kubernetes service account tokens

This guide covers configuring both, plus RBAC authorization and multi-tenant isolation.

## Enabling RBAC

RBAC is disabled by default. Enable it in the API server ConfigMap overlay before connecting an identity provider:

```yaml
apiserver:
  auth:
    rbacEnabled: true
```

Apply the overlay and restart the API server:

```bash
kubectl rollout restart deployment/michelangelo-apiserver -n ma-system
```

Once RBAC is enabled, users without a `RoleBinding` or `ClusterRoleBinding` are denied access to all resources.

## Connecting an Identity Provider (OIDC)

Michelangelo supports any OIDC-compliant identity provider. Configure it in the API server ConfigMap:

```yaml
apiserver:
  auth:
    rbacEnabled: true
    oidc:
      issuerUrl: https://accounts.your-idp.com
      clientId: michelangelo
      usernameClaim: email   # JWT claim used as the Michelangelo username
      groupsClaim: groups    # JWT claim used for group-based RBAC
```

### Okta

1. In the Okta admin console, create an application of type **Web**
2. Set the **Sign-in redirect URI** to `https://michelangelo-envoy.your-domain/callback`
3. Copy the **Client ID** and **Okta domain** into the config:
   ```yaml
   oidc:
     issuerUrl: https://your-org.okta.com
     clientId: <client-id-from-okta>
   ```

### Google Workspace

1. In Google Cloud Console, create an **OAuth 2.0 Client ID** of type Web application
2. Add your Michelangelo Envoy URL as an authorized redirect URI
3. Set the issuer URL:
   ```yaml
   oidc:
     issuerUrl: https://accounts.google.com
     clientId: <client-id>.apps.googleusercontent.com
     usernameClaim: email
     groupsClaim: hd   # Google Workspace hosted domain: a single domain string, not a list of groups
   ```

### Azure Active Directory

1. Register a new application in the Azure portal
2. Set the redirect URI to your Michelangelo Envoy callback URL
3. Note the **Application (client) ID** and **Directory (tenant) ID**:
   ```yaml
   oidc:
     issuerUrl: https://login.microsoftonline.com/<tenant-id>/v2.0
     clientId: <application-client-id>
     usernameClaim: upn   # User Principal Name (email format)
     groupsClaim: groups
   ```

### Keycloak

1. Create a realm and a Client with Client Protocol `openid-connect`
2. Set the redirect URI and note the client ID:
   ```yaml
   oidc:
     issuerUrl: https://keycloak.your-domain.com/realms/<realm-name>
     clientId: michelangelo
   ```

## Session Token Configuration

Control how long a user's session remains valid:

```yaml
apiserver:
  auth:
    sessionTokenExpiry: 8h   # Valid time units: h, m, s
```

8 hours is a reasonable default for a standard workday. A shorter expiry limits how long a leaked token stays usable but requires more frequent re-authentication.

## Multi-Factor Authentication

MFA is enforced at the IdP level, not within Michelangelo. Configure MFA policies in your identity provider's admin console. Michelangelo requires users to complete the full IdP authentication flow, including MFA, before issuing a session token.

## Granting Access with RBAC

After RBAC is enabled, users need a `RoleBinding` or `ClusterRoleBinding` to access Michelangelo resources.

### Grant a user read access to a project namespace

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: alice-reader
  namespace: ml-team-project
subjects:
  - kind: User
    name: alice@your-company.com   # Must match the value of the JWT claim named by usernameClaim
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: viewer
  apiGroup: rbac.authorization.k8s.io
```

### Grant a team admin access via group membership

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ml-team-admins
  namespace: ml-team-project
subjects:
  - kind: Group
    name: ml-team   # Must match a value in the JWT claim named by groupsClaim
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: editor
  apiGroup: rbac.authorization.k8s.io
```

Use `RoleBinding` to scope access to a specific namespace. Use `ClusterRoleBinding` only for platform administrators who need cross-namespace access.
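
For the platform-administrator case, a `ClusterRoleBinding` might look like the following sketch. The binding name, the `platform-admins` group, and the `admin` ClusterRole are assumptions for illustration and are not confirmed by this guide; substitute a group that your IdP actually emits and a ClusterRole that exists in your installation.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-admins        # hypothetical name; no namespace, the binding is cluster-wide
subjects:
  - kind: Group
    name: platform-admins      # hypothetical group; must appear in the JWT claim named by groupsClaim
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: admin                  # assumed role name; replace with the role your platform defines
  apiGroup: rbac.authorization.k8s.io
```

Unlike the `RoleBinding` examples, this manifest has no `metadata.namespace`, so the referenced ClusterRole applies in every namespace.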

## Multi-Tenant Namespace Isolation

Each team or project should have its own Kubernetes namespace. Use `NetworkPolicy` resources to prevent cross-namespace access to ML workloads:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: ml-team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ml-team-a
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ma-system   # Control plane needs access
```

This allows inbound traffic from within the team's namespace and from the Michelangelo control plane, and blocks ingress from all other namespaces. Because the policy lists only `Ingress`, it does not restrict outbound traffic.
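
The policy above selects namespaces by the `kubernetes.io/metadata.name` label, which Kubernetes 1.21 and later set automatically on every namespace. A minimal sketch of the tenant namespace itself, assuming the `ml-team-a` name from the policy, therefore needs no explicit labels:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ml-team-a
  # No labels are required for the NetworkPolicy above:
  # the API server adds kubernetes.io/metadata.name: ml-team-a automatically.
```

On older clusters the label would have to be added by hand for the `namespaceSelector` entries to match.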

## Service Authentication (Internal)

Michelangelo services authenticate to each other using Kubernetes service account tokens.

**Worker → API server**: Configured via `worker.useTLS: true` in the worker ConfigMap. The worker uses its Kubernetes pod service account token. Do not set `useTLS: false` in production.

```yaml
worker:
  address: michelangelo-apiserver.ma-system.svc.cluster.local:15566
  maApiServiceName: ma-apiserver
  useTLS: true
```

**Controller manager → compute cluster**: Uses the `ray-manager` service account token stored as a Secret in the control plane namespace. See [Register a Compute Cluster](jobs/register-a-compute-cluster-to-michelangelo-control-plane.md) for the full setup including token rotation guidance.

## Disabling Direct Storage Access

Do not allow users or services to directly access etcd or object storage (S3/MinIO) in ways that bypass the Michelangelo API. For S3 access:

- Set `useIam: true` in the controller manager ConfigMap. This uses IAM roles attached to pods via ServiceAccount annotations, not hardcoded credentials
- Do not grant `s3:*` to individual users; use IAM policies scoped to specific buckets and prefixes
- Audit S3 bucket policies regularly to ensure no public or cross-account access is inadvertently granted
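
As an illustration of the ServiceAccount-annotation approach in the first bullet, the sketch below assumes the cluster runs on Amazon EKS with IAM Roles for Service Accounts, where the `eks.amazonaws.com/role-arn` annotation links a ServiceAccount to an IAM role. The ServiceAccount name is hypothetical, and the account ID and role name are placeholders; other platforms use different mechanisms.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: michelangelo-controller-manager   # hypothetical name; use the ServiceAccount your deployment runs as
  namespace: ma-system
  annotations:
    # EKS-specific (assumption): pods using this ServiceAccount receive temporary
    # credentials for the referenced IAM role instead of static access keys.
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>
```

The IAM role referenced here is where the bucket- and prefix-scoped policy from the second bullet would be attached.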
