playbook: add pickupFirst mode for shared-certificate distribution (TPP)#650
Open
jmeldrum76 wants to merge 1 commit into
Adds an opt-in `pickupFirst: true` field on the playbook `request:` block.
When enabled (TPP only for v1), `vcert run` queries the cert object's
current metadata first and installs whatever the platform holds rather
than enrolling a new cert on every follower host. This matches the
common "one cert, many endpoints" pattern (wildcards, load-balancer
pools) where one team renews centrally and many followers need to
converge to the same cert+key on their own maintenance windows.
Decision flow on each run:
- locate (TPP RetrieveCertificateMetaData) - cheap O(1) metadata GET
- thumbprint matches installed -> defer to renewBefore check
- platform cert newer than installed -> full pickup + install, no enroll
- platform cert older than installed -> refuse downgrade (safety guard)
- platform cert not found -> fall through to existing enroll
The change is purely additive: 5 files, +289 lines, 0 deletions, 0
modifications to existing logic. Existing playbooks without
`pickupFirst` are byte-identical to current behavior. On VCP/Firefly/
NGTS the feature silently no-ops (ErrLocateNotSupported); VCP-native
support is a planned follow-up that needs a different locator strategy
(cert-object DN model differs).
Files:
- pkg/playbook/app/domain/playbookRequest.go: PickupFirst, PickupID fields
- pkg/playbook/app/vcertutil/vcertutil.go: LocateLatestCN, locateTPP,
PickupCertificateByLocator
- pkg/playbook/app/installer/crypto.go: LoadInstalledPEM (export)
- pkg/playbook/app/service/pickup_first.go: orchestrator (new file)
- pkg/playbook/app/service/service.go: Execute() hook
Verified end-to-end against a live TPP lab across seven scenarios:
backwards-compat / hot-path match / install-newer-pickup / refuse-
downgrade / in-renew-window-defer-to-enroll / initial-enroll / non-TPP
silent-noop.
Signed-off-by: Jeremy Meldrum <21229220+jmeldrum76@users.noreply.github.com>
Author
For context: I'm submitting this as a Venafi (CyberArk) colleague, and I'm happy to follow whatever internal review process the playbook engine maintainers want before this gets merged. The commit is from my personal GitHub identity but the work is internal to the company. Let me know if there's an internal Jira/design-doc step I should do, a reviewer to ping, or anything else (README-PLAYBOOK.md update, tests, CHANGELOG) you'd like added to this PR before review. Happy to push follow-up commits on the same branch.
Add `pickupFirst` mode to vcert playbook for shared-certificate distribution (TPP)

BUSINESS PROBLEM

Many customers operate the "one cert, many endpoints" pattern: a single TLS certificate (often a wildcard) is installed on dozens to hundreds of heterogeneous endpoints (Apache servers, NGINX, F5/NetScaler load balancers, Imperva, etc.), all serving the same FQDN(s). When the cert is renewed in TPP (manually via Aperture, automatically via a renewal policy, or via `vcert` on a designated leader host), every follower needs to install that exact same cert + key during its own maintenance window, which may be days or weeks after the renewal happens.

`vcert`'s current playbook (`vcert run -f apache.yaml`) is built around the assumption that the host running the playbook owns the enrollment: it always tries to enroll / renew through the `request` block. That means:

- [...] `vcert pickup` to bridge this gap. We did exactly this for our customer: ~250 lines of bash that drives `vcert pickup`, compares thumbprints, and decides whether to install or defer to the existing renewal path.

Business impact: every customer with shared / wildcard certs across multiple endpoints either accepts staggered-renewal pain, builds bespoke distribution scripts, or pushes the cert manually. The pattern is common enough that `vcert` should support it natively.

PROPOSED SOLUTION
Add an opt-in `pickupFirst` mode to the playbook `request` block. With one new boolean field (and one optional override), a follower host's playbook becomes a self-healing converger to whatever the platform currently holds at a given cert object.

When `pickupFirst: true`:

- `RetrieveCertificateMetaData(dn)`: one cheap GET, returns thumbprint + `ValidTo` with no PEM / key payload.
- [...] `NotAfter`.
- [...] `renewBefore` window check (normal playbook flow takes over).
- [...] `RetrieveCertificate` for cert + chain + key, install at the playbook's paths via the existing installer chain, run `afterInstallAction`. No enrollment.

Backwards compatibility: absent `pickupFirst` (or `pickupFirst: false`), the playbook behaves byte-identically to today. Existing customer playbooks are unaffected.

Architectural notes from a working prototype

- `pkg/playbook/app/service/pickup_first.go` (~150 lines) plus three small public helpers in `vcertutil` and `installer`. The patch is purely additive: zero deletions, zero modifications to existing logic. The new field defaults make every untouched code path identical to current behavior.
- `RetrieveCertificateMetaData` is O(1) by DN.
- `runInstaller`, `CreateX509Cert` (handles PKCS#8 encrypted-key decryption), `afterInstallAction`, backup / rollback. No new installer code.

Diffstat against `v5.13.2`

Scope for v1: TPP only
VCP support would require a different locator strategy. Its cert-object model is fundamentally different:

- `versionType` (CURRENT/OLD) and `certificateStatus` (ACTIVE/RETIRED) are independent state machines.
- `managedCertificateId` (the lineage identifier) is not currently a server-side searchable field.

The proposed implementation silently no-ops on non-TPP backends so VCP / Firefly / NGTS playbooks see zero behavior change and zero error noise. VCP-native support is a clean follow-up issue once the locator abstraction lands.
CURRENT ALTERNATIVES
In production for a customer today, we have evaluated or are doing all of the following:
- [...] `vcert pickup` and `vcert run`. Reads install paths and `renewBefore` from the playbook YAML, drives the four-branch decision tree (newer pickup → install / match in window → renew / match outside window → no-op / nothing in TPP → initial enroll), handles PKCS#8 key decryption before write (because Apache without `SSLPassPhraseDialog` can't load encrypted keys), filters stderr noise, writes timestamped backups. Roughly 250 lines of bash that every customer in this situation ends up writing variants of.
- [...] `vcert pickup` driven by cron with custom diffing. Same pattern, different language.
- [...] `vcert` entirely; the cert object in TPP becomes informational rather than authoritative.
- [...] `vcert run --force-renew` on every host on a coordinated maintenance window, even though only one of them actually needed to enroll.

All four approaches reinvent the same logic and put the burden on the operator. Native support in `vcert` would replace all of them with one YAML flag.

VENAFI EXPERIENCE
- [...] `vcert v5` (currently `v5.12.3` in the customer environment; verified the proposed implementation also compiles and tests cleanly against `v5.13.2` / master). Daily use of the playbook engine, `vcert pickup`, `vcert run`, and the standalone `vcert enroll` / `vcert renew` commands.
- [...] `vcert`. Mix of enrollment patterns: user-provided CSR, service-generated, mixed key-retrieval policies across folders.
- [...] `pickupFirst` field)

A working prototype patch (`pickupFirst.patch`) is attached. Five files, +290 lines, zero deletions, zero modifications to existing code paths. Apply with `git apply pickupFirst.patch` from the `vcert` repo root.