ci: run the Java integration test-suite on GitHub against AIO (jenkins offboarding D) #14225
mo-auto wants to merge 35 commits into
…one image (offboarding D1)

Add CN_PERSISTENCE_LOAD_TEST_DATA (default false) so the Janssen integration-test dataset can be loaded into the SQL backend, enabling the test-suite to run against the all-in-one image without jenkins.jans.io.

- test_data_setup.py mirrors jans-linux-setup test_data_loader for SQL: adds the custom test columns (parsed from the vendored test schema LDIFs), imports the auth/scim/fido2 test data LDIFs, adds the password grant to the SCIM client, applies the jans-auth dynamic-config delta, enables the test scripts and promotes the test default scopes. Idempotent and lock-guarded.
- bootstrap.py runs it as a gated stage after the custom-ldif stage.
- vendor templates/test into the persistence-loader image.
- all-in-one: default CN_PERSISTENCE_LOAD_TEST_DATA=false.
- config-api: honor CN_CONFIG_API_TEST_CLIENT_ID/SECRET/TRUSTED so the config-api test client is deterministic for the integration test-suite.

The config-api test client already receives every scope on upgrade (update_test_client_scopes), so no config-api test data is loaded here.

Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
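A minimal sketch of the gating described in this commit, assuming the flag is read as a case-insensitive boolean string. The helper names `is_test_data_enabled` and `run_gated_stage` are hypothetical and are not taken from bootstrap.py; only the variable name and its default of false come from the commit message.

```python
import os

# Strings treated as "on"; this set is an assumption, not the loader's actual parser.
TRUTHY = {"1", "true", "yes", "on"}


def is_test_data_enabled(env=None):
    """Return True only when CN_PERSISTENCE_LOAD_TEST_DATA is explicitly truthy."""
    env = os.environ if env is None else env
    raw = env.get("CN_PERSISTENCE_LOAD_TEST_DATA", "false")
    return str(raw).strip().lower() in TRUTHY


def run_gated_stage(stage, env=None):
    """Run the stage callable only when the flag is on; report whether it ran."""
    if not is_test_data_enabled(env):
        return False
    stage()
    return True
```

With the default in place, an image that never sets the variable skips the stage entirely, which is what keeps the test dataset out of ordinary deployments.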
…fboarding D2)

Add .github/workflows/test-integration.yml + scripts/render_test_profiles.py to run the HTTP integration test-suites against an all-in-one server with test data loaded, on a MySQL/PGSQL matrix - replacing the jenkins.jans.io / docker-jans-monolith runner.

- builds persistence-loader + config-api from the checkout and assembles the AIO so the test-data changes are exercised; runs the DB + AIO + a TLS-terminating proxy on 443
- extracts the live salt / SCIM / config-api secrets from the AIO through its own pycloudlib manager (adapter-agnostic) and renders profiles/<fqdn>/ for each module from the canonical templates/test sources; copies the committed client keystores
- imports the AIO cert into the JDK truststore, builds the modules, then runs 'mvn -Dcfg=<fqdn> test' for jans-auth/client, jans-scim/client, jans-config-api and jans-fido2/client (continue-on-error, gated at the end)
- publishes results three ways: GITHUB_STEP_SUMMARY table, uploaded artifact, and a SHA-pinned dorny/test-reporter PR check; dumps AIO logs on failure
- all actions SHA-pinned + harden-runner; matrix is MYSQL-only on PRs, both nightly

Server-side persistence-coupled suites (jans-auth-server/server, jans-orm) and agama are a follow-up.

Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
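The profile-rendering step could look roughly like the sketch below. It assumes the templates use `${name}` placeholders so that Python's `string.Template` applies; the real templates/test sources may use a different placeholder syntax, and `render_profile` is a hypothetical helper rather than the code in render_test_profiles.py.

```python
from pathlib import Path
from string import Template


def render_profile(template_dir, output_root, fqdn, values):
    """Render every template file into <output_root>/profiles/<fqdn>/."""
    out_dir = Path(output_root) / "profiles" / fqdn
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(template_dir).iterdir()):
        if not src.is_file():
            continue
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        text = Template(src.read_text(encoding="utf-8")).safe_substitute(values)
        dest = out_dir / src.name
        dest.write_text(text, encoding="utf-8")
        written.append(dest)
    return written
```

Leaving unknown placeholders in place makes a missing secret visible in the rendered file rather than failing the whole render.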
✅ Snyk checks have passed. No issues have been found so far.
…l+vault+traefik) The AIO requires consul + vault (pycloudlib defaults CN_CONFIG_ADAPTER=consul / CN_SECRET_ADAPTER=vault); the previous hand-rolled 'docker run' started neither, so the configurator hung and the AIO never became healthy (first run #27060489146). Reuse the canonical automation/start_janssen_aio_demo.sh which brings up the full compose stack (consul + vault + traefik TLS + db + AIO). Bake the integration-test env (CN_PERSISTENCE_LOAD_TEST_DATA + CN_CONFIG_API_TEST_CLIENT_*) into a thin image layer tagged as the image the demo compose expects, so the script is used unmodified. Import the demo's generated CA cert into the JDK truststore; extract live secrets from the 'jans' container. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…er configurator start_janssen_aio_demo.sh generates the TLS certs as mode-600 files owned by the host user; the AIO configurator runs as uid 1000 and got PermissionError reading the mounted ca.key, so configurator-load failed and the auth_jks_base64 secret was never created (AIO stayed unhealthy). Relax the cert perms and restart the AIO after the stack is up. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…data loader pycloudlib's create_from_ldif only inserts plain entries, but the test data LDIFs also carry 'changetype: modify' records (flip jansDefScope on extra scopes, enable extra scripts, activate an attribute, add group members, add scope claims). Those records have no objectClass, so _data_from_ldif crashed with 'NoneType object is not subscriptable' (oc[-1] on None). Split each LDIF into plain entries (-> create_from_ldif) and modify records, and apply the modifies via client.update with a DN-suffix->table map, inferring scalar/multivalued + boolean columns from the existing value. Comment-only/no-DN blocks are skipped. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
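The split described here can be sketched in plain Python. This is an illustration only: `split_ldif` is a hypothetical name, it ignores LDIF line folding and base64 values, and it is not the code in test_data_setup.py.

```python
import re


def split_ldif(text):
    """Split LDIF text into (plain_entries, modify_records).

    Each item is the list of non-comment lines of one blank-line-separated block.
    Blocks without a DN (for example comment-only blocks) are skipped.
    """
    plain, modifies = [], []
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n")):
        lines = [ln for ln in block.split("\n") if ln.strip() and not ln.startswith("#")]
        if not any(ln.lower().startswith("dn:") for ln in lines):
            continue
        is_modify = any(
            ln.lower().replace(" ", "") == "changetype:modify" for ln in lines
        )
        (modifies if is_modify else plain).append(lines)
    return plain, modifies
```

Plain entries would then go to create_from_ldif as before, while the modify records are applied separately, so a record without an objectClass never reaches the insert path.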
…a reflect + resilience) The modify applier crashed (AttributeError: NoneType has no attribute 'c') when client.get hit a table missing from the cached SQLAlchemy metadata. Force a fresh metadata reflection before applying modifies, skip a modify whose table still isn't reflected, wrap each modify so one failure can't abort the whole load, and initialise multivalued 'add' targets with the correct JSON shape when the column is empty. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
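A minimal sketch of the "one failure can't abort the whole load" behaviour. `apply_all` and its `apply_one` callback are hypothetical names; the callback stands in for whatever applies a single modify record.

```python
def apply_all(records, apply_one, log=print):
    """Apply every record, collecting failures instead of stopping at the first."""
    failures = []
    for record in records:
        try:
            apply_one(record)
        except Exception as exc:  # deliberately broad: the load is best-effort
            failures.append((record, exc))
            log(f"skipping modify {record!r}: {exc}")
    return failures
```

Returning the failures lets the caller log a summary at the end while the loader still finishes and the AIO can become healthy.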
…oad best-effort Some test entries carry attributes (e.g. creationDate on jansFido2RegistrationEntry / jansCustomPerson) that are not columns of their SQL table, so pycloudlib's insert raised 'CompileError: Unconsumed column names'. Pass a transform_column_mapping callback to create_from_ldif that drops attributes without a matching reflected column. Also wrap the post-import config tweaks so a single failure can't crash-loop the loader and keep the AIO unhealthy. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
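The column filter amounts to a dictionary comprehension. The sketch below assumes the reflected column names are available as an iterable; the actual signature of pycloudlib's transform_column_mapping callback is not shown in this PR, so `drop_unknown_columns` takes the columns explicitly and is hypothetical.

```python
def drop_unknown_columns(known_columns, column_mapping):
    """Keep only attributes that have a matching reflected SQL column."""
    known = set(known_columns)
    return {name: value for name, value in column_mapping.items() if name in known}
```

For example, an entry carrying creationDate against a table without that column would lose just that attribute and still be inserted.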
… failure The AIO comes up (loader completes, fido2 deploys) but jans-auth isn't serving openid-configuration (404) and the tail-400 dump truncates the service logs. Capture supervisorctl program states + a targeted jans-auth/error grep so the jans-auth startup failure is diagnosable. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
* fix(casa): correct otp-java import package to com.bastiaanjansen.otp
otp-java's Maven groupId is com.github.bastiaanjansen, but its Java package
is com.bastiaanjansen.otp. The OTP migration used the groupId as the import
package, so casa failed to compile ("package com.github.bastiaanjansen.otp
does not exist"). Fix the imports in the casa HOTP/TOTP services and the
two OTP person-auth scripts. The pom dependency groupId
(com.github.bastiaanjansen:otp-java) is correct and unchanged.
Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
* fix(casa): pass counter to HOTP getURI (otp-java API)
otp-java's HOTPGenerator.getURI takes the moving-factor counter first:
getURI(int counter, String issuer, String account). Pass 0 (initial
counter, matching the previous lochbridge default) in the casa HOTP
service and the HOTP URI calls in both OTP scripts. TOTPGenerator's
getURI(issuer, account) is correct and unchanged. Verified all other
otp-java calls (Builder, generate, verify, HMACAlgorithm) against the
v2.1.0 source.
Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
---------
Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
Co-authored-by: moauto <54212639+mo-auto@users.noreply.github.com>
docker-jans-auth-server pulls casa-config + jans-fido2-client + jans-fido2-model + agama-inbound as custom libs from the release, but build-test only collected the first three, so agama-inbound-0.0.0-nightly.jar was 404 and the 'docker (auth-server)' image build failed (run 27098512184) - leaving a stale ghcr auth-server image (and breaking the AIO jans-auth). agama-inbound is a jans-auth-server submodule (jans-auth-server/agama/inboundID, artifactId agama-inbound), installed to ~/.m2 during the auth-server deploy; add it to the release-asset collection alongside the other auth-server custom libs. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…ogress Cut the health wait 1200s->600s so a stuck AIO fails fast, dump jans-auth/configurator log lines periodically during the wait, and enrich the failure dump (supervisorctl status + grepped service/error logs) so the reason jans-auth stays at 404 is visible without waiting/cancelling. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
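A bounded health wait with periodic progress output might look like the following sketch. The 600-second default mirrors the commit message; `wait_healthy`, the 10-second interval, and the injectable `clock` and `sleep` parameters are assumptions made so the loop can be exercised without real waiting.

```python
import time


def wait_healthy(is_healthy, timeout=600, interval=10, on_progress=None,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll is_healthy until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False
        if on_progress is not None:
            on_progress()  # e.g. print recent jans-auth/configurator log lines
        sleep(interval)
```

Calling `on_progress` on every unsuccessful poll is what makes a stuck startup visible while the job is still running.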
…ebKeys date On AIO-health failure, dump the configurator auth-keys.json, the decoded auth_openid_key_base64 secret, and the jansConfWebKeys DB value, to pinpoint which stage turns the JWKS into a date string. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…tore corruption The configurator's first run failed to read the demo's mode-600 TLS certs, and the restart workaround re-ran key generation against a half-initialised keystore -- writing a 'Keystore was tampered with' error string into jansConfWebKeys and breaking jans-auth and config-api WebKeysConfiguration parsing. Run the demo in the background and relax the cert perms as they appear so the configurator succeeds on its first run, no restart needed. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
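One pass of the permission relaxation could be sketched as below, to be called repeatedly while the demo script is still generating certs. The target mode 0o644 and the name `relax_cert_perms` are assumptions; the commit message does not state which mode the workflow applies.

```python
import stat
from pathlib import Path


def relax_cert_perms(cert_dir, mode=0o644):
    """Make every regular file in cert_dir readable; return the names changed."""
    changed = []
    for path in sorted(Path(cert_dir).glob("*")):
        if path.is_file() and stat.S_IMODE(path.stat().st_mode) != mode:
            path.chmod(mode)
            changed.append(path.name)
    return changed
```

Because the function only touches files whose mode differs, running it in a loop is cheap and naturally stops reporting changes once all certs exist. World-readable key files are only reasonable for a throwaway CI stack like this one.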
… as root The setup-java JDK's cacerts is not writable by the runner user (keytool failed with FileNotFoundException: .../cacerts Permission denied), so run the import via sudo using the explicit keytool path. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
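For illustration, the import command can be assembled as an argument list. This sketch assumes a JDK 9+ layout (`lib/security/cacerts`) and the default truststore password `changeit`; the alias and the helper name `keytool_import_cmd` are hypothetical.

```python
from pathlib import Path


def keytool_import_cmd(java_home, cert_path, alias, storepass="changeit"):
    """Build a sudo + keytool command that imports a CA cert into the JDK truststore."""
    java_home = Path(java_home)
    return [
        "sudo",
        str(java_home / "bin" / "keytool"),  # explicit path: sudo may not inherit PATH
        "-importcert", "-noprompt", "-trustcacerts",
        "-alias", alias,
        "-file", str(cert_path),
        "-keystore", str(java_home / "lib" / "security" / "cacerts"),
        "-storepass", storepass,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using the explicit keytool path avoids picking up a different JDK under sudo.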
…e live FQDN The build runs clean install over the full jans-auth-server/scim/config-api/fido2 reactors; -Dcfg=<fqdn> required a rendered profile for every filtering module (agama-engine has none -> 'Error loading property file .../agama/engine/profiles/<fqdn>/config-agama-test.properties'). Compilation does not need the live FQDN, and every module ships a default profile, so build with -Dcfg=default and keep -Dcfg=<fqdn> for the test step (whose 4 legs are fully rendered). Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…run unit suites Triage support: the failure log dump never ran (job only fails at the gate, so failure() was false), so dump jetty service logs (scim/config-api/auth/fido2) on always() to expose the 503/401 cause. Collect surefire reports workspace-wide (the jans-auth-client reports were not captured). Add a unit-suite step (jans-orm, jans-core). Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…ead of 503 jans-scim returns 503 (AuthorizationProcessingFilter.disabledApiResponse) on every CRUD call because jansScimEnabled is false: the persistence-loader defaults CN_SCIM_ENABLED=false and the workflow never set it (linux-setup always enables SCIM). Bake CN_SCIM_ENABLED=true into the AIO. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…er calls work The AIO demo pinned the FQDN to the host IP via the jans container's extra_hosts; with a loopback IP (CI) the in-container public URL had no :443 listener, so config-api's jans-auth introspection callback got 'Connection refused' and every config-api request returned 401. Alias the FQDN to the traefik service on the compose network (and drop the extra_hosts override) so https://<fqdn> resolves to traefik's TLS entrypoint and routes back to the AIO. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
Mirror the Jenkins full-build-with-tests model: instead of building with -DskipTests and running only 4 client legs, run 'mvn clean install' with tests ENABLED and -Dcfg=<fqdn> over every reactor (jans-orm, jans-core, jans-auth-server, jans-scim, jans-config-api, jans-fido2), so each module's unit AND integration suites execute against the live AIO in one pass. Render the agama-engine profile so that reactor compiles under -Dcfg, and gate the job on the collected testng failure count (failure.ignore lets all suites run first). Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
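The end-of-job gate could be computed roughly as follows. This sketch assumes the count is taken from surefire's JUnit-style `TEST-*.xml` reports, summing the `failures` and `errors` attributes of each `testsuite` root; the workflow may instead read TestNG's own result files, and `count_failures` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_failures(root_dir):
    """Sum failures and errors across all surefire TEST-*.xml reports under root_dir."""
    total = 0
    for report in Path(root_dir).rglob("TEST-*.xml"):
        if report.parent.name != "surefire-reports":
            continue
        suite = ET.parse(report).getroot()
        total += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
    return total
```

A final workflow step can then exit non-zero when `count_failures(".")` is greater than zero, so every suite runs first and the job still fails if any test did.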
…r diagnostic The bulk metadata reflect omits jansCustomScript (while jansAttr/jansScope/jansAppConf resolve), so the test auth scripts (031C-5621/5622) were never enabled -- a likely driver of the auth-client HtmlUnit login-flow hang. Reflect the single table on demand before skipping. Also dump the scimCustom* attr DB state + jans-scim /Schemas so we can see why the SCIM extension isn't recognised despite the attrs being present + jans-scim restarted. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…mScr The SQL table for custom scripts is jansCustomScr (objectClass jansCustomScr), not jansCustomScript -- so DN_TABLE_SUFFIX and enable_test_scripts hit NoSuchTableError and the test auth scripts (031C-5621/5622) were never enabled. Use the correct name everywhere. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
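A DN-suffix lookup of the kind described could be sketched as below. The table names jansCustomScr, jansScope and jansAttr appear in this PR's commit messages; the DN suffixes paired with them here are assumptions for illustration, and `table_for_dn` is a hypothetical helper.

```python
# Illustrative subset only; the suffix-to-table pairs are assumptions.
DN_TABLE_SUFFIX = {
    "ou=scripts,o=jans": "jansCustomScr",
    "ou=scopes,o=jans": "jansScope",
    "ou=attributes,o=jans": "jansAttr",
}


def table_for_dn(dn, mapping=DN_TABLE_SUFFIX):
    """Return the SQL table for a DN by suffix match, or None when unmapped."""
    normalized = dn.replace(" ", "").lower()
    for suffix, table in mapping.items():
        if normalized.endswith(suffix.lower()):
            return table
    return None
```

Returning None for an unmapped DN lets the caller skip that modify record instead of raising, which fits the best-effort loading described in the earlier commits.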
The 2-core GitHub runner is ~13x too slow for the HtmlUnit-heavy auth-server/client suite (Jenkins ran all ~2185 in 14.5 min on a dedicated VM). Provision an ephemeral s-8vcpu-16gb DO droplet via the DO API, rsync the checkout, and run the whole jans-side flow there via the new automation/ci/run_aio_integration.sh (build AIO + demo stack + secrets + render + build + bounded suites + units). Results scp'd back; the runner publishes them. Always tear the droplet down, with a tag-scoped reap job as a cancellation safety net. Requires a new DO_TOKEN repo secret. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
The first VM run failed in the bootstrap step: 'Could not get lock /var/lib/apt/lists/lock' -- DO's cloud-init/unattended-upgrades still holds apt at boot. Wait for cloud-init, and give apt -o DPkg::Lock::Timeout=600 so it waits for the lock instead of failing. (Provisioning + SSH + env passing all worked.) Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
apt now succeeds, but the on-VM script's build-AIO step runs 'docker build' before the demo script installs docker -> 'docker: command not found'. Install docker (get.docker.com, which also brings the compose + buildx plugins) in the VM bootstrap; the demo's install_docker is guarded and skips when docker is present. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
… healthy On the VM the mysql container Started then died (gone from compose ps), so the AIO services fail with UnknownHostException: mysql. Dump the mysql/postgresql container state + logs in the health-failure path so the next run shows why the DB exited. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…plugin The on-VM build reached jans-scim-server then failed: buildnumber-maven-plugin runs 'git log' and the rsync had excluded .git. Ship .git (the runner checkout is shallow, so it's small) and set persist-credentials:false on checkout so the GITHUB_TOKEN isn't carried onto the VM. (The prior mysql death was transient -- this run the AIO was healthy and reached the build step.) Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
… on the VM With .git shipped, the buildnumber plugin's 'git log' then hit 'detected dubious ownership' because rsync -a preserved the runner uid while the build runs as root. Add safe.directory '*' in the VM bootstrap. (Also DB-container logs in the on-VM health-fail diagnostic for the transient mysql death.) Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
…h-client timeout The GitHub run log is too large to fetch reliably, so each maven suite's output is redirected to aio-logs/ and uploaded unconditionally for offline diagnosis. Raise the auth-client suite timeout to 2400s: the Jenkins reference (build jans-auth-server/15291) runs it serially in ~711s (1935 tests, 28 expected failures), and the AIO is slower per round-trip, so 900s cut it off mid-suite. The HtmlUnit EvaluatorException noise is normal -- it floods the Jenkins log too -- not a failure. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
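A bounded suite run with its output redirected to a log file might look like this sketch. `run_suite` is a hypothetical helper; the real script is a shell script, so this only illustrates the pattern of "timeout plus per-suite log".

```python
import subprocess


def run_suite(cmd, log_path, timeout):
    """Run cmd with stdout and stderr sent to log_path.

    Returns the exit code, or None when the timeout was hit.
    """
    with open(log_path, "wb") as log:
        try:
            proc = subprocess.run(
                cmd, stdout=log, stderr=subprocess.STDOUT, timeout=timeout
            )
            return proc.returncode
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child on timeout; processes it
            # forked (for example surefire forks) may need separate cleanup.
            log.write(b"\n[suite timed out]\n")
            return None
```

A call such as `run_suite(["mvn", "-Dcfg=<fqdn>", "test"], "aio-logs/auth-client.log", 2400)` would match the timeout named in the commit message; the log file name is illustrative.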
mysql was OOM-killed (exit 137) at the demo's 768M limit under the test-data load, so the AIO never became healthy and the run died in the secret step before any logs were captured. Make the demo's mysql memory limit env-overridable (default 768M unchanged) and have CI request 3G (the VM has 16GB). Add an EXIT trap that always dumps docker ps / per-container logs / OOM state to aio-logs/, so an early failure still produces a downloadable artifact. Signed-off-by: moauto <54212639+mo-auto@users.noreply.github.com>
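Exit code 137 follows the common 128 + signal-number convention, so it decodes to signal 9 (SIGKILL). A small sketch of that decoding, with the hypothetical helper name `killed_by`:

```python
import signal


def killed_by(exit_code):
    """Return the signal name for a 128+N exit code, or None otherwise."""
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None
```

SIGKILL alone does not prove an out-of-memory kill, which is why dumping the container's OOM state alongside the exit code (for example via `docker inspect` and its `State.OOMKilled` field) is the more reliable check.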
Summary
Part of the jenkins.jans.io → GitHub offboarding. Moves the Java integration test-suite onto GitHub Actions, running against an all-in-one (AIO) server with the integration test-data loaded, replacing the jenkins.jans.io / docker-jans-monolith test runner.

This is Phase D (A–C already merged: artifact publish, consumer repoint, pipeline chaining). Opened as draft: it is untestable locally and needs a workflow_dispatch run to validate + iterate.

D1: load integration test-data into the AIO (gated)

New CN_PERSISTENCE_LOAD_TEST_DATA (default false) loads the Janssen test dataset into the SQL backend.

- docker-jans-persistence-loader/scripts/test_data_setup.py (new) mirrors jans-linux-setup's test_data_loader for SQL: adds the custom test columns (parsed from the vendored test schema LDIFs), imports the auth/scim/fido2 test data, adds the password grant to the SCIM client, applies the jans-auth dynamic-config delta, enables the test scripts and promotes the test default scopes. Idempotent + lock-guarded.
- bootstrap.py runs it as a gated stage after the custom-ldif stage.
- Vendors templates/test into the persistence-loader image.
- docker-jans-all-in-one: CN_PERSISTENCE_LOAD_TEST_DATA=false default.
- docker-jans-config-api: honors CN_CONFIG_API_TEST_CLIENT_ID/SECRET/TRUSTED so the config-api test client is deterministic (it already receives every scope on upgrade, so no config-api test data is loaded by the loader).

Test-client secrets are derived as <inum>-<host-label> (matching upstream), so the suite recomputes them from the fixed inums + AIO FQDN; no shared secret file is needed.

D2: integration-test workflow
.github/workflows/test-integration.yml + .github/workflows/scripts/render_test_profiles.py:

- workflow_dispatch (persistence choice + optional prebuilt image) + nightly cron + path-filtered PR; MySQL/PGSQL matrix (MySQL-only on PRs).
- Extracts the live salt / SCIM / config-api secrets from the AIO through its own pycloudlib get_manager() (adapter-agnostic) and renders profiles/<fqdn>/ for each module from the canonical templates/test sources.
- Runs mvn -Dcfg=<fqdn> test for jans-auth/client, jans-scim/client, jans-config-api and jans-fido2/client (continue-on-error, gated at the end).
- Publishes results three ways: $GITHUB_STEP_SUMMARY table, uploaded artifact, and a SHA-pinned dorny/test-reporter PR check; dumps AIO logs on failure.

Scope / follow-ups

- Server-side persistence-coupled suites (jans-auth-server/server, jans-orm) and agama are a follow-up: they connect directly to persistence and were designed to run inside the monolith container. The auth/server profile is already rendered, so they are straightforward to enable once the HTTP suites are green.
- docker-jans-monolith and the final stale-jenkins cleanup are the remaining offboarding phases (E/F), handled separately.

Validation plan

- workflow_dispatch MySQL: AIO + DB up, test-data loaded, profiles rendered, the 4 suites build + test, results posted.
- workflow_dispatch both (MySQL + PGSQL).

Closes #14226,