Capture Authentik SCIM Provider wire-protocol traffic (research spike for SCIM Provider spec) #21

Description

@herbie-bot

Background

We are about to design a SCIM 2.0 Provider for spring-services (an HTTP server that
Authentik can push user / group changes into via the SCIM 2.0
protocol — RFC 7643 + RFC 7644). The foundation work (data-model preparation in
UserEntity, PrincipalDirectory port) is done in spec
012-scim-foundation-user-model. The actual SCIM
endpoints, group model, and PATCH/filter machinery come next as the SCIM Provider spec.

Before we write that spec, we need to know what Authentik actually sends on the wire.
RFC 7644 allows a broad range of behaviours (PATCH operation shapes, filter syntax variants,
group-membership update strategies, bulk vs sequential, …) and Authentik picks one concrete
point in that space. Without observing Authentik's real traffic, the spec will be a
collection of half-grounded guesses, and 30–50 % of the design decisions will need to be
revised during implementation.

The task in this issue is to capture that wire-protocol traffic empirically, so the
spec can be written against ground truth.

What you'll deliver

A documented capture of Authentik's SCIM-outbound HTTP traffic across nine concrete
scenarios, posted as a comment on this issue. The traffic is captured by a tiny Spring Boot
HTTP recorder that you build in a throwaway sub-project (the code is provided below — you
do not have to design it yourself).

Concretely, the deliverable is a markdown document with:

  1. The discovery-call sequence Authentik makes when the provider is first configured
    and synced (see Scenario 1).
  2. One full POST /Users example (method, path, headers, JSON body).
  3. One full user-update example (PUT or PATCH — whichever Authentik uses, with the full
    operation list).
  4. One full POST /Groups example.
  5. One full membership-add PATCH on a group.
  6. One full membership-remove PATCH on a group.
  7. The deactivation behaviour (PATCH active=false vs DELETE — whichever Authentik
    chooses, with the exact request).
  8. The user-deletion behaviour (Authentik may issue DELETE, may issue a soft-revoke
    PATCH, or may issue nothing — observe).
  9. The initial-sync behaviour with ≥ 10 users (rate, parallelism, bulk endpoint usage).

For each scenario, include: HTTP method, full URI (with query string), every header,
verbatim JSON body. No editorialising; this is raw observation.

Why this is needed

The downstream SCIM Provider spec has open design questions whose answers are dictated by
Authentik's behaviour. Examples:

  • PATCH scope — RFC 7644 §3.5.2 defines add/replace/remove with optional path
    expressions like members[value eq "..."]. We need to know which subset Authentik
    actually uses so we implement only what's needed.
  • Filter grammar: eq is mandatory by the RFC, but sw/co/ne/AND/OR/pr
    are optional. We need to know which ones Authentik sends.
  • Group-membership update strategy — does Authentik PATCH each membership change
    individually, or send a full member list in one PUT? Affects performance design and the
    shape of the controller method.
  • Vendor extensions — does Authentik send urn:authentik:* schema extensions on User
    or Group resources? If so, we need a strategy (accept-and-ignore vs reject).
  • Discovery dependency — does Authentik probe /ServiceProviderConfig /
    /Schemas / /ResourceTypes before provisioning? If yes, those endpoints are blocking
    prerequisites; if no, they are nice-to-have.

Every one of these decisions feeds directly into the spec we'll write after this issue
is resolved.
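
To make the filter-grammar question concrete, here is a minimal sketch of how the captured URIs could be tallied by operator once the log exists. It uses only the JDK. The class name `FilterOperatorTally` is hypothetical, and the operator list is taken from RFC 7644 section 3.4.2.2; nothing here reflects what Authentik actually sends.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Sketch: count which SCIM filter operators appear in captured request URIs. */
class FilterOperatorTally {

    // Operator keywords from RFC 7644 section 3.4.2.2 (matched case-insensitively).
    private static final Set<String> OPERATORS = Set.of(
        "eq", "ne", "co", "sw", "ew", "pr", "gt", "ge", "lt", "le", "and", "or", "not");

    /** Returns the URL-decoded "filter" query parameter of a URI, or null if absent. */
    static String filterOf(String uri) {
        int q = uri.indexOf('?');
        if (q < 0) return null;
        for (String pair : uri.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("filter")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    /** Adds the operators found in one filter expression to the running counts. */
    static void tally(String filter, Map<String, Integer> counts) {
        // Blank out "..." literals first so a value such as "or" is not miscounted.
        String unquoted = filter.replaceAll("\"(\\\\.|[^\"\\\\])*\"", " ");
        for (String token : unquoted.split("[\\s()\\[\\]]+")) {
            String op = token.toLowerCase(Locale.ROOT);
            if (OPERATORS.contains(op)) counts.merge(op, 1, Integer::sum);
        }
    }

    /** Tallies every URI that carries a filter; URIs without one are skipped. */
    static Map<String, Integer> tallyAll(Iterable<String> uris) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String uri : uris) {
            String filter = filterOf(uri);
            if (filter != null) tally(filter, counts);
        }
        return counts;
    }
}
```

Feeding it the `URI:` lines from the capture gives a per-operator count. The tokeniser is deliberately crude (it does not parse the grammar), so treat the result as a hint about which operators to look at in the raw log, not as a replacement for reading it.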

Step-by-step procedure

Step 1 — Build the recorder

The recorder is a single-file Spring Boot app that pretends to be a SCIM endpoint. It
returns minimal RFC-compliant JSON so Authentik treats it as a real, functional provider —
but its real job is to write every request to a log file.

Create a new directory outside the spring-services repo (e.g. ~/scim-recorder/),
and add these two files.

pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.5.14</version>
        <relativePath/>
    </parent>
    <groupId>local</groupId>
    <artifactId>scim-recorder</artifactId>
    <version>0.0.1</version>
    <properties>
        <java.version>21</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

src/main/java/local/ScimRecorderApp.java:

package local;

import jakarta.servlet.http.HttpServletRequest;
import java.nio.file.*;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import org.slf4j.*;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@SpringBootApplication
@RestController
@RequestMapping("/scim/v2")
public class ScimRecorderApp {

    private static final Logger log = LoggerFactory.getLogger(ScimRecorderApp.class);
    private static final Path LOG_FILE = Paths.get("scim-traffic.log");
    private final AtomicInteger ids = new AtomicInteger();

    public static void main(String[] args) {
        SpringApplication.run(ScimRecorderApp.class, args);
    }

    @GetMapping(value = "/ServiceProviderConfig", produces = "application/scim+json")
    public Map<String, Object> serviceProviderConfig(HttpServletRequest req) {
        record(req, null);
        return Map.of(
            "schemas", List.of("urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig"),
            "patch", Map.of("supported", true),
            "bulk", Map.of("supported", false, "maxOperations", 1000, "maxPayloadSize", 1048576),
            "filter", Map.of("supported", true, "maxResults", 200),
            "changePassword", Map.of("supported", false),
            "sort", Map.of("supported", false),
            "etag", Map.of("supported", false),
            "authenticationSchemes", List.of(
                Map.of("type", "oauthbearertoken",
                       "name", "OAuth Bearer Token",
                       "description", "Bearer token")));
    }

    @GetMapping(value = "/Schemas", produces = "application/scim+json")
    public Map<String, Object> schemas(HttpServletRequest req) {
        record(req, null);
        return Map.of("schemas",
                List.of("urn:ietf:params:scim:api:messages:2.0:ListResponse"),
                "totalResults", 0, "Resources", List.of());
    }

    @GetMapping(value = "/ResourceTypes", produces = "application/scim+json")
    public Map<String, Object> resourceTypes(HttpServletRequest req) {
        record(req, null);
        return Map.of("schemas",
                List.of("urn:ietf:params:scim:api:messages:2.0:ListResponse"),
                "totalResults", 0, "Resources", List.of());
    }

    @RequestMapping(value = "/**", produces = "application/scim+json")
    public ResponseEntity<?> catchAll(HttpServletRequest req,
                                      @RequestBody(required = false) String body) {
        record(req, body);
        String method = req.getMethod();
        String uri = req.getRequestURI();
        String id = "captured-" + ids.incrementAndGet();
        // Derive the resource type from the path so that Group responses carry the
        // Group schema URN instead of being mislabelled as User resources.
        String type = uri.contains("/Groups") ? "Group" : "User";
        String schema = "urn:ietf:params:scim:schemas:core:2.0:" + type;

        if ("POST".equals(method)) {
            // 201 Created with a synthetic id that later requests can reference.
            return ResponseEntity.status(201).body(Map.of(
                "schemas", List.of(schema),
                "id", id,
                "meta", Map.of(
                    "resourceType", type,
                    "created", LocalDateTime.now().toString(),
                    "lastModified", LocalDateTime.now().toString())));
        }
        if ("GET".equals(method) && (uri.endsWith("/Users") || uri.endsWith("/Groups"))) {
            // Empty list response: the recorder stores nothing.
            return ResponseEntity.ok(Map.of(
                "schemas", List.of("urn:ietf:params:scim:api:messages:2.0:ListResponse"),
                "totalResults", 0, "Resources", List.of(),
                "startIndex", 1, "itemsPerPage", 0));
        }
        if ("GET".equals(method)) {
            return ResponseEntity.ok(Map.of("id", id,
                "schemas", List.of(schema),
                "meta", Map.of("resourceType", type)));
        }
        if ("PUT".equals(method) || "PATCH".equals(method)) {
            // Echo the id from the path so the response matches the addressed resource.
            return ResponseEntity.ok(Map.of("id", lastSegment(uri),
                "schemas", List.of(schema),
                "meta", Map.of("resourceType", type,
                    "lastModified", LocalDateTime.now().toString())));
        }
        if ("DELETE".equals(method)) {
            return ResponseEntity.noContent().build();
        }
        return ResponseEntity.status(405).build();
    }

    private static String lastSegment(String uri) {
        int i = uri.lastIndexOf('/');
        return i < 0 ? uri : uri.substring(i + 1);
    }

    private static void record(HttpServletRequest req, String body) {
        StringBuilder sb = new StringBuilder(2048);
        sb.append("\n=========================================================\n");
        sb.append("TIME:   ").append(LocalDateTime.now()
                .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME)).append("\n");
        sb.append("METHOD: ").append(req.getMethod()).append("\n");
        sb.append("URI:    ").append(req.getRequestURI());
        if (req.getQueryString() != null) sb.append("?").append(req.getQueryString());
        sb.append("\nHEADERS:\n");
        Collections.list(req.getHeaderNames()).forEach(h ->
            sb.append("  ").append(h).append(": ")
              .append(req.getHeader(h)).append("\n"));
        sb.append("BODY:\n").append(body == null ? "(none)" : body).append("\n");
        String entry = sb.toString();
        log.info(entry);
        try {
            Files.writeString(LOG_FILE, entry,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (Exception e) {
            log.warn("Failed to append to log file: {}", e.getMessage());
        }
    }
}

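Start the recorder. The two files above do not include a Maven wrapper, so use a locally
installed Maven (or generate ./mvnw first with mvn wrapper:wrapper):

cd ~/scim-recorder
mvn spring-boot:run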

It will listen on http://localhost:8080. Verify it works:

curl -s http://localhost:8080/scim/v2/ServiceProviderConfig | python3 -m json.tool

You should see a JSON response, and the request should appear both in the recorder's
terminal output and in ~/scim-recorder/scim-traffic.log.
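
The curl check only exercises a dedicated GET handler. As a hedged extra, the sketch below sends one POST /Users through the catch-all handler using the JDK's `java.net.http` client, so you can confirm that bodies are recorded before involving Authentik. The class name `RecorderSmokeTest`, the `smoke-test` user name, and the minimal JSON body are assumptions made for illustration; they are not what Authentik sends.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sketch: send one hand-written SCIM create request to the recorder. */
class RecorderSmokeTest {

    /** Builds a POST {baseUrl}/Users request with a minimal, made-up User body. */
    static HttpRequest createUserRequest(String baseUrl, String token) {
        String body = "{\"schemas\":[\"urn:ietf:params:scim:schemas:core:2.0:User\"],"
                + "\"userName\":\"smoke-test\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/Users"))
                .header("Content-Type", "application/scim+json")
                .header("Accept", "application/scim+json")
                .header("Authorization", "Bearer " + token)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java RecorderSmokeTest.java <base-url, e.g. http://localhost:8080/scim/v2>");
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                createUserRequest(args[0], "dummy-token-for-experiment"),
                HttpResponse.BodyHandlers.ofString());
        // Print status and body; the request itself should now be in scim-traffic.log.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Run it with `java RecorderSmokeTest.java http://localhost:8080/scim/v2`. Given the catch-all handler above, the recorder should answer with status 201, and the log entry should contain the JSON body verbatim. Clear the log afterwards so the smoke-test entry does not end up in a scenario capture.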

Step 2 — Make the recorder reachable from Authentik

Where does your Authentik instance run?

  • Docker on the same Mac/Windows machine: Authentik can reach your laptop at
    http://host.docker.internal:8080.
  • Docker on Linux: add --network=host to your Authentik container and use
    http://localhost:8080, OR use http://172.17.0.1:8080 (Docker default bridge gateway).
  • On a remote server / Kubernetes cluster: the easiest path is ngrok http 8080,
    which gives you a public HTTPS URL (https://xxx.ngrok-free.app) you can paste into
    Authentik. The free plan is sufficient for an evening's experiment.

Pick whichever fits your setup.

Step 3 — Configure the SCIM provider in Authentik

In the Authentik admin UI:

  1. Go to Applications → Providers → Create, select SCIM Provider.
  2. Fill in:
    • Name: scim-recorder-experiment
    • URL: the URL from Step 2 followed by /scim/v2. Examples:
      • http://host.docker.internal:8080/scim/v2
      • https://xxx.ngrok-free.app/scim/v2
    • Token: any string. The recorder does not validate. Suggested:
      dummy-token-for-experiment.
    • Property Mappings — User / Group: leave the Authentik defaults.
  3. Save.
  4. Create a new Application (or pick an existing one) and assign
    scim-recorder-experiment as a Backchannel Provider.
  5. Trigger an initial sync from the provider detail view ("Run sync again" button or
    similar). This kicks off the first wave of HTTP traffic.

Step 4 — Run the nine scenarios

For each scenario:

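  1. Clear the log file so the scenario's traffic is isolated (the leading : makes the
    truncation work in zsh as well as bash, where a bare redirect would wait on stdin):
    : > ~/scim-recorder/scim-traffic.log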
  2. Perform the action in Authentik.
  3. Wait ~10 seconds for any async retries.
  4. Copy the contents of scim-traffic.log into your deliverable document under the
    scenario's heading.
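
Before pasting a scenario's log into the writeup, a one-line-per-request overview helps to check that the capture contains what you expect. The sketch below splits the file on the separator lines that the recorder's record method writes and prints time, method, and URI for each entry. The class name `TrafficLogSummary` is hypothetical, and the parser assumes the exact log layout produced by the recorder code above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch: split recorder output into per-request entries and summarise them. */
class TrafficLogSummary {

    /** One captured request, reduced to the fields needed for an overview. */
    record Entry(String time, String method, String uri, String body) {}

    /** Splits the log text on the "=====" separator lines written by the recorder. */
    static List<Entry> parse(String logText) {
        List<Entry> entries = new ArrayList<>();
        for (String chunk : logText.split("(?m)^={10,}$")) {
            if (chunk.isBlank()) continue;
            int bodyAt = chunk.indexOf("\nBODY:\n");
            String body = bodyAt < 0 ? "" : chunk.substring(bodyAt + "\nBODY:\n".length()).strip();
            entries.add(new Entry(field(chunk, "TIME:"), field(chunk, "METHOD:"),
                    field(chunk, "URI:"), body));
        }
        return entries;
    }

    /** Returns the value of the first line starting with the label; headers are indented, so they never match. */
    private static String field(String chunk, String label) {
        for (String line : chunk.split("\n")) {
            if (line.startsWith(label)) return line.substring(label.length()).strip();
        }
        return "";
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java TrafficLogSummary.java <path-to-scim-traffic.log>");
            return;
        }
        for (Entry e : parse(Files.readString(Path.of(args[0])))) {
            System.out.println(e.time() + "  " + e.method() + " " + e.uri());
        }
    }
}
```

The summary is only a sanity check. The deliverable still needs the verbatim log, including every header and the full body.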

Scenario 1 — Discovery probing. Just configure the provider and trigger an initial
sync against an empty backchannel. Observe what Authentik probes before any user data is
sent.

Scenario 2 — Create one user. Create a user scim-alice in Authentik with name,
email, surname, given name. Assign her (directly or via a group) to the application.

Scenario 3 — Modify the user. Change scim-alice's display name. Save. Then change
her email. Save. (Two separate actions so we see whether Authentik batches or sends two
distinct PATCHes.)
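
If Authentik does send PATCH requests here, the interesting part is which op values and path expressions appear. The sketch below pulls both out of a captured body with two regular expressions. The class name `PatchShape` is hypothetical, and the approach is a rough assumption-laden shortcut: it does not parse JSON and does not pair each op with its path, so a real JSON parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: list the op values and path expressions found in a SCIM PatchOp body. */
class PatchShape {

    // Matches "op": "<letters>" anywhere in the body.
    private static final Pattern OP = Pattern.compile("\"op\"\\s*:\\s*\"([A-Za-z]+)\"");
    // Matches "path": "<json string>", allowing backslash-escaped characters inside.
    private static final Pattern PATH =
            Pattern.compile("\"path\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    /** Counts operations by lower-cased op name, e.g. {add=1, remove=1}. */
    static Map<String, Integer> opCounts(String patchBody) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = OP.matcher(patchBody);
        while (m.find()) counts.merge(m.group(1).toLowerCase(Locale.ROOT), 1, Integer::sum);
        return counts;
    }

    /** Returns every path value in document order, with escaped quotes unescaped. */
    static List<String> paths(String patchBody) {
        List<String> paths = new ArrayList<>();
        Matcher m = PATH.matcher(patchBody);
        while (m.find()) paths.add(m.group(1).replace("\\\"", "\""));
        return paths;
    }
}
```

An empty result from paths on a body that does contain operations would itself be a finding: it means the operations carry no path and put the attribute names inside value instead, which RFC 7644 section 3.5.2 also permits.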

Scenario 4 — Create a group with one member. Create a group scim-test-group, add
scim-alice to it, assign the group to the application.

Scenario 5 — Add a second member. Create scim-bob, add him to scim-test-group.

Scenario 6 — Remove a member. Remove scim-alice from scim-test-group.

Scenario 7 — Deactivate a user. Deactivate scim-bob in Authentik (try both "remove
from the application" and "deactivate the user account globally" — these may produce
different requests).

Scenario 8 — Delete a user. Delete scim-alice entirely in Authentik.

Scenario 9 — Bulk-ish sync. Create 10–20 additional users at once in Authentik (CSV
import works well if available; otherwise script the creation via Authentik's API or
create them manually). Trigger a fresh sync. Observe rate, parallelism, and whether
/Bulk is used.
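
For the rate and /Bulk observations, the timestamps and URIs in the log can be reduced to a few numbers. The following sketch is one possible way to do that; the class name `SyncRateReport` is hypothetical and it assumes the TIME and URI line format written by the recorder above. Note that the log records arrival times only, so it cannot prove parallelism on its own; closely clustered timestamps are at most a hint.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: derive request count, time span, average rate and /Bulk usage from the log. */
class SyncRateReport {

    record Report(int requests, Duration span, double perSecond, boolean bulkUsed) {}

    static Report analyse(String logText) {
        List<LocalDateTime> times = new ArrayList<>();
        boolean bulkUsed = false;
        // Assumes no request body contains a line that starts with "TIME:" or "URI:".
        for (String line : logText.split("\n")) {
            if (line.startsWith("TIME:")) {
                times.add(LocalDateTime.parse(line.substring("TIME:".length()).strip()));
            } else if (line.startsWith("URI:")) {
                String path = line.substring("URI:".length()).strip();
                int q = path.indexOf('?');
                if (q >= 0) path = path.substring(0, q);
                bulkUsed |= path.endsWith("/Bulk");
            }
        }
        if (times.isEmpty()) return new Report(0, Duration.ZERO, 0.0, bulkUsed);
        // Use min/max rather than first/last: concurrent requests may be logged out of order.
        Duration span = Duration.between(Collections.min(times), Collections.max(times));
        double seconds = span.toMillis() / 1000.0;
        // With a zero-length span the rate is not measurable; report 0.0 in that case.
        double perSecond = seconds > 0 ? times.size() / seconds : 0.0;
        return new Report(times.size(), span, perSecond, bulkUsed);
    }
}
```

Call `SyncRateReport.analyse(Files.readString(Path.of("scim-traffic.log")))` and include the numbers in the Notes of Scenario 9 next to the verbatim log.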

Step 5 — Write up and post

Create a markdown document with one section per scenario. Format suggestion:

## Scenario 1 — Discovery probing

**Action:** Configured the provider with empty backchannel, clicked "Run sync".

**Observed traffic (verbatim from `scim-traffic.log`):**

\`\`\`
=========================================================
TIME:   2026-06-21T...
METHOD: GET
URI:    /scim/v2/ServiceProviderConfig
HEADERS:
  ...
BODY:
  (none)

=========================================================
TIME:   ...
METHOD: ...
...
\`\`\`

**Notes:** [anything you noticed, e.g. "Authentik probed in this order: ServiceProviderConfig
first, then Schemas, then started provisioning"]

Post the document as a single comment on this issue. Plain copy-paste, no need to
sanitise — there is no production data in the experiment.

Definition of Done

  • Recorder runs locally and captures requests into scim-traffic.log.
  • Authentik is configured with a SCIM provider pointing at the recorder.
  • All nine scenarios are executed and the traffic is captured.
  • A markdown writeup with one section per scenario is posted as a comment on this
    issue, including verbatim HTTP method, URI, headers, and JSON body for at least
    the seven mandatory deliverables listed under "What you'll deliver".
  • Any unexpected behaviour (5xx retries, requests with vendor-specific schema URNs,
    requests on paths the recorder did not anticipate) is flagged in the writeup.

The recorder code does not need to be committed to this repo — it is a throwaway
artefact. Keep it locally; we may want to re-run the experiment after Authentik
upgrades.

Notes / hints

  • The recorder will log to both stdout and scim-traffic.log. The file is the
    source of truth — long JSON bodies get truncated in some terminals.
  • If you see Authentik retry the same request three or four times in a row with backoff,
    one of your responses is probably the wrong status code. The catch-all handler returns
    201 for POST, 200 for GET/PUT/PATCH, 204 for DELETE, and 405 for any other method; if
    that breaks something, capture the failure but don't try to "fix" the recorder. We want
    to know what Authentik does on perceived errors.
  • If a scenario produces zero traffic when you expected some, double-check that the
    group/user is actually assigned to the application that has the SCIM backchannel.
    Authentik silently skips entities outside the assigned scope.
  • If ngrok keeps disconnecting between scenarios, restart it and update the URL in
    the Authentik provider — but try to do all nine scenarios in one ngrok session if you
    can, because the URL changes per session.
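
Retries are easy to miss in a long log. As a rough aid, the sketch below flags runs of identical consecutive requests (same method, URI, and body). The class name `RetryDetector` and its record types are hypothetical; it assumes you have already extracted those three fields per request from scim-traffic.log, and it treats any back-to-back repetition as a possible retry, which may also match legitimate repeated calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: find runs of identical consecutive requests that may be retries. */
class RetryDetector {

    /** A request reduced to the fields that identify a repeat. */
    record Request(String method, String uri, String body) {}

    /** A run of identical consecutive requests; count is always at least 2. */
    record Run(Request request, int count) {}

    static List<Run> findRuns(List<Request> requests) {
        List<Run> runs = new ArrayList<>();
        int i = 0;
        while (i < requests.size()) {
            int j = i + 1;
            // Records compare by value, so equals() checks method, URI and body together.
            while (j < requests.size() && requests.get(j).equals(requests.get(i))) j++;
            if (j - i > 1) runs.add(new Run(requests.get(i), j - i));
            i = j;
        }
        return runs;
    }
}
```

Any run it reports is worth flagging in the writeup together with the timestamps of the repeated entries, since the spacing between them shows the backoff.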

Time budget

Realistic estimate: 2–3 hours end-to-end for someone who has the Authentik admin
account in hand. Most of the time is in Authentik UI navigation, not in the experiment
itself.
