fix(vm_workload_showroom): remap .ssh ownership for ansible-runner container (rootless podman uid mapping) #65
Merged
…ntainer

The ansible-runner-api container runs as uid=1001 inside rootless podman. In the showroom user's user namespace, uid=1001 maps to a higher host uid (showroom's first subuid + 1000, typically ~232072), not the showroom user uid (1888). The .ssh directory is created owned by showroom:showroom (700), which means the container process gets 'Permission denied' on /app/.ssh/config and cannot SSH to lab nodes.

Solver and validator both fail immediately with:

    Failed to connect to the host via ssh: Can't open user config file /app/.ssh/config: Permission denied

Fix: after writing the SSH config, run podman unshare chown -R 1001:1001 on the .ssh directory. podman unshare executes in the showroom user's user namespace, so uid=1001 inside the namespace translates to the host uid that the container process runs as, giving it read access.

Only runs when showroom_ansible_runner_api is enabled.
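The uid arithmetic in this commit message can be checked by hand. The following is a minimal Python sketch, assuming the default rootless podman layout: container uid 0 is the invoking user, and container uids from 1 upward map in order onto the user's subuid range. The helper name `host_uid` is hypothetical, and the subuid start of 231072 is inferred from the "~232072" figure, not read from `/etc/subuid`.

```python
def host_uid(container_uid: int, user_uid: int, subuid_start: int,
             subuid_count: int = 65536) -> int:
    """Translate a uid inside a rootless user namespace to a host uid.

    Assumes the default rootless mapping: container uid 0 is the invoking
    user, and container uids 1..subuid_count map onto the subuid range.
    """
    if container_uid == 0:
        return user_uid
    if 1 <= container_uid <= subuid_count:
        return subuid_start + container_uid - 1
    raise ValueError(f"uid {container_uid} is outside the mapped range")


# uid values from the commit message; 231072 is an ASSUMED subuid start.
SHOWROOM_UID = 1888
SUBUID_START = 231072

runner_host_uid = host_uid(1001, SHOWROOM_UID, SUBUID_START)
print(runner_host_uid)                   # 232072
print(runner_host_uid == SHOWROOM_UID)   # False: not the owner of .ssh
```

Under these assumptions the container process is neither the owner (1888) of the mode-700 directory nor root on the host, which is why the read fails. Running `chown 1001:1001` inside `podman unshare` applies the same translation, so the files end up owned by host uid 232072.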
…onment

Three fixes to the environment: block:

1. Merge f_user_data into runner env vars so lab credentials (satellite_password, bastion_ssh_password, guid, etc.) are passed to the zt-runner and injected as Ansible extravars into solve/validate playbooks. showroom_runtime_automation_environment_variables can still override individual keys.
2. Remove the | upper filter from key names: playbooks reference vars in lowercase (satellite_password, not SATELLITE_PASSWORD). Uppercasing silently broke Ansible extravar injection.
3. Handle list values by joining with commas so env vars remain strings.
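The rules in this commit message can be sketched outside the template. This Python sketch is illustrative only and is not the role's Jinja2 code: `build_runner_env`, `render_environment`, and the sample values are hypothetical, and it assumes overrides are applied after the user data.

```python
import json


def build_runner_env(user_data: dict, overrides: dict) -> dict:
    """Merge user data with overrides, keep key case, stringify values."""
    merged = {**user_data, **overrides}  # overrides win on key collisions
    env = {}
    for key, value in merged.items():
        if isinstance(value, (list, tuple)):
            # Lists become one comma-separated string.
            value = ",".join(str(item) for item in value)
        env[key] = str(value)  # keys are NOT uppercased
    return env


def render_environment(env: dict) -> str:
    """Emit one `key: "value"` line per variable.

    json.dumps produces a double-quoted, escaped string, which is also a
    valid YAML double-quoted scalar.
    """
    return "\n".join(f"{key}: {json.dumps(value)}" for key, value in env.items())


# Hypothetical sample data, for illustration only.
user_data = {
    "guid": "fg8gg",
    "satellite_password": "example",
    "nodes": ["node1", "node2"],
}
overrides = {"satellite_password": "override"}

env = build_runner_env(user_data, overrides)
print(render_environment(env))
# guid: "fg8gg"
# satellite_password: "override"
# nodes: "node1,node2"
```

By contrast, printing `str(user_data)` yields a single-quoted Python dict repr on one line, which is the shape a compose file cannot use as an `environment:` mapping.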
…tty v2.7.4

- Switch podman unshare chown from shell to command (lint: command-instead-of-shell)
- Add changed_when: true (lint: no-changed-when)
- Set wetty image tag to v2.7.4 instead of v3.0

Made-with: Cursor
What breaks
When `showroom_ansible_runner_api: true` is set, every lab that uses the ZT runner to execute solve/validate playbooks fails on every module with:

`Failed to connect to the host via ssh: Can't open user config file /app/.ssh/config: Permission denied`

Solver and validator cannot connect to any lab node. Every click of Check or Solve in the showroom UI fails immediately.
Commits in this PR
1. Fix `.ssh` ownership for ansible-runner container (rootless podman uid mapping)

The `.ssh` directory is created owned by `showroom:showroom` (uid=1888, mode `700`) and volume-mounted into the `ansible-runner-api` container at `/app/.ssh`.

The container runs as uid=1001 (`default` user in the UBI9 image). In rootless podman, the showroom user's user namespace maps container uid=1001 to host uid 232072 (showroom's first subuid + 1000), not to the showroom uid 1888.

The container process (uid=232072 on host) cannot read files owned by uid=1888 with `700` permissions. Every SSH attempt by Ansible fails before a connection is even opened.

Fix: after writing the SSH config, run `podman unshare chown -R 1001:1001` on the `.ssh` directory. `podman unshare` executes inside the showroom user's user namespace, so uid=1001 inside translates to the correct host uid (232072) that the container process actually runs as.

Gated on `showroom_ansible_runner_api`: no-op for deployments not using the runner.

2. Fix `volumes:` indent crash and `environment:` Python dict rendering in `ansible_runner_api_service.j2`

Two bugs in the template, introduced by the "add userdata to runner api" commit, crash `podman-compose` at startup.

Bug 1: `volumes:` wrong indentation

The `{# dns_search #}` Jinja2 comment left trailing whitespace that caused `volumes:` to be indented at the same level as the `ports:` list items (6 spaces instead of 4). YAML saw it as a non-sequence item inside a sequence and failed to parse.

Fix: remove the comment line entirely.

Bug 2: `environment:` rendered as Python dict repr

`{{ f_user_data | combine({...}) }}` is rendered by Jinja2 as a Python dict string, `"{'key': 'value', 'list': ['item']}"`, which is not valid YAML environment variable format, so podman-compose cannot parse it.

Fix: iterate the dict and emit proper YAML `key: "value"` pairs, converting any list values to comma-separated strings.

Why this regressed in v1.6.8
This was not a problem in v1.6.6 because with `showroom_ssh_method: password` the sshkey block in `22-showroom-users-security.yml` never ran, so no `.ssh` directory was explicitly created by the showroom role. The container either had an empty mount or no mount at all, and labs used password auth, so the permission issue never triggered.

v1.6.7 (PR #63) replaced `ansible_runner_api_service.j2` with `service_runtime_automation.j2` and dropped the `.ssh` mount entirely, fully breaking any lab that needs SSH key auth in the runner.

v1.6.8 (PR #64) restored the `.ssh` mount and added a new task in `20-showroom-user-setup.yml` to explicitly create the `.ssh` directory when `showroom_ansible_runner_api: true`, but created it owned by `showroom:showroom`, which the container cannot read.

Tested on `ocpvdev01.rhdp.net` ZeroTouch single-pod deployments, GUIDs `fg8gg` and `7xl28`.

cc @andrjone @miteshget