Skip to content

feat(sync): TAM-6863: facility sync-host setup workflow#10124

Draft
dannash100 wants to merge 31 commits into
feat/TAM-6862/tamanu-manages-reporting-rolesfrom
feat/TAM-6863/sync-host-setup
Draft

feat(sync): TAM-6863: facility sync-host setup workflow#10124
dannash100 wants to merge 31 commits into
feat/TAM-6862/tamanu-manages-reporting-rolesfrom
feat/TAM-6863/sync-host-setup

Conversation

@dannash100

@dannash100 dannash100 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Changes

Adds a deliberate first-run setup workflow for pointing a facility server at its sync host, held in local system facts rather than originating in config (TAM-6863).

Boot resolution. A new serverConfig resolver (mirroring initDeviceId) resolves the sync host/credentials and facility ids once at boot, precedence SYNC_URL/SYNC_FACILITY_IDS env → local system fact → legacy config. The runtime reads getSyncConfig() / getServerFacilityIds() / isServerConfigured() instead of config.sync.* / selectFacilityIds(config), and the boot paths skip timesync/central/sync setup when unconfigured so a fresh server still boots to serve the wizard.

First-run wizard. When the facility server is unconfigured the web app shows a setup wizard instead of login (GET /api/public/setup/status gate): sync server URL + central administrator credentials + facility id(s). On submit the facility authenticates the admin against central, requires manage:all, then has central provision a dedicated sync user and stores the returned credentials.

Endpoints. Public GET /api/public/setup/status + POST /api/public/setup/sync (facility); POST /api/admin/syncCredentials (central, manage:all) find-or-rotates a System: <facilities> sync user.

Secrets. The sync password is stored encrypted in local_system_secrets; host/username/facility ids are plain facts.

Also: resetSyncConfig facility subcommand to clear config and re-run setup; the default sync.host config is emptied (kept as a deprecated one-version fallback). The SYNC_URL env var keeps the single connection-string form for headless/k8; the wizard uses separate fields.

⚙️ Config note for deployment: sync.host / sync.email / sync.password and serverFacilityId(s) are now a deprecated fallback — sync target is set via the wizard or SYNC_URL / SYNC_FACILITY_IDS.

Verified end-to-end locally (wizard → admin auth → manage:all gate → central-provisioned sync user → encrypted secret). Endpoint integration tests are a follow-up.

Auto-Deploy

  • Deploy
Options
  • Artillery load test
  • Seed from closest snapshot
  • Generate fake data
  • More data (20Gi)
  • No facility servers (central-only)
  • No sync (facility tasks scaled to zero)
  • AMD64 architecture (default is arm64)
  • Skip mobile build
  • Always build mobile
  • Stay up for 8 hours
  • Stay up for 24 hours
  • Stay up (no TTL)
  • Build images only (don't deploy)
  • Pause this deploy

Tests

  • Run E2E tests
  • Run DAST scan

Review Hero

  • Run Review Hero
  • Auto-fix review suggestions Wait for Review Hero to finish, resolve any comments you disagree with or want to fix manually, then check this to auto-fix the rest.
  • Auto-fix CI failures Check this to auto-fix lint errors, test failures, and other CI issues.
  • Auto-merge upstream Check this to merge the base branch into this PR, with AI conflict resolution if needed.
  • Save suppressions Check this to capture 👎 reactions on Review Hero comments as suppression rules in .github/review-hero/suppressions.yml. Also runs automatically at the end of any auto-fix run.

Remember to...

  • ...write or update tests
  • ...add UI screenshots and testing notes to the Linear issue
  • ...add any manual upgrade steps to the Linear issue
  • ...update the config reference, settings reference, or any relevant runbook(s)
  • ...call out additions or changes to config files for the deployment team to take note of

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit c3e6f9d. Configure here.

const cancelTasks = startScheduledTasks(context);
const cancelSyncTask = startScheduledTasks(context, syncTaskClass);
// SyncTask needs a sync manager, which only exists when the server is configured.
const cancelSyncTask = isConfigured ? startScheduledTasks(context, [SyncTask]) : () => {};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sync stays off after wizard

Medium Severity

When startAll boots unconfigured, it skips creating syncManager and SyncTask based on a one-time isConfigured flag. The setup wizard only refreshes in-memory serverConfig; it does not create those components. After wizard + page reload, the process still has no outbound sync until a full restart.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit c3e6f9d. Configure here.

const configuredFacilities = getDeclaredFacilityIds();
// A wizard-configured server has no external (env/config) declaration to
// drift-check against — the recorded fact is the source of truth.
if (!configuredFacilities?.length) return;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integrity check without sync host

Medium Severity

If SYNC_FACILITY_IDS or legacy config declares facility IDs but no sync host/credentials are available, ensureFacilityMatches still calls CentralServerConnection and connect(). With an empty default sync.host, startup throws before the server can serve the setup wizard.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit c3e6f9d. Configure here.

Comment thread packages/facility-server/app/serverConfig.js Outdated
Comment thread packages/facility-server/app/routes/apiv1/setup.js Outdated
Comment thread packages/central-server/app/admin/syncCredentials.js
Comment thread packages/facility-server/app/serverConfig.js Outdated
Comment thread packages/facility-server/__tests__/serverConfig.test.js Outdated
Comment thread packages/central-server/app/admin/syncCredentials.js
Comment thread packages/facility-server/app/routes/apiv1/setup.js
Comment thread packages/facility-server/app/routes/apiv1/setup.js
Comment thread packages/central-server/app/admin/syncCredentials.js Outdated
Comment thread packages/facility-server/app/routes/apiv1/setup.js
@review-hero

review-hero Bot commented Jun 23, 2026

Copy link
Copy Markdown

🦸 Review Hero Summary
15 agents reviewed this PR | 6 critical | 8 suggestions | 0 nitpicks | Filtering: consensus 3 voters, 6 below threshold

Below consensus threshold (6 unique issues not confirmed by majority)
Location Agent Severity Comment
packages/central-server/app/admin/syncCredentials.js:29 BES Requirements suggestion The findOne-then-create/update pattern is a TOCTOU race. Two concurrent setup requests for the same facility set could both find no existing user and both attempt to create, causing a unique constr...
packages/central-server/app/admin/syncCredentials.js:35 BES Requirements suggestion When creating a new User with User.create({ ..., password }), the password is passed as a plain string. If User.create doesn't automatically hash the password (unlike the setPassword instance...
packages/facility-server/app/routes/apiv1/setup.js:27 Security suggestion isProduction is evaluated once at module load time using process.env.NODE_ENV. If NODE_ENV is unset (common in some deployment scenarios), the check defaults to non-production, silently allowin...
packages/facility-server/app/routes/apiv1/setup.js:47 Design & Architecture suggestion The setup handler mixes three distinct responsibilities in one 80-line function: input validation + host probing, credential provisioning via central, and local fact persistence. Splitting these in...
packages/facility-server/app/routes/apiv1/setup.js:86 Security suggestion Credential-validity oracle: a failed login returns 422 while valid credentials without admin permission return 403. An attacker can distinguish valid-but-unprivileged credentials from invalid ones ...
packages/facility-server/app/subCommands/startApp.js:48 Design & Architecture nitpick Line 49 destructures getSyncConfig().host unconditionally even in the unconfigured case (where host is null/empty). It's only used inside the if (isConfigured) block on line 57. Move the destru...
Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.


packages/facility-server/app/serverConfig.js:152: Uses || instead of ?? to coalesce the config value. Per project preference (important-project-rules.md), prefer ?? over |||| would incorrectly coalesce a legitimate falsy value (though unlikely here, the convention exists to prevent that class of bug). Change to return config.sync?.[key] ?? null;


packages/facility-server/app/routes/apiv1/setup.js:107: Four sequential fact/secret writes (lines 107–112) are not wrapped in a transaction. If a middle write fails the server is left in a half-configured state: isServerConfigured() will return false (it requires all four values), but stale facts remain. Wrap in a managed transaction so the writes are atomic. This also closes the narrow TOCTOU window between the isServerConfigured() guard at line 52 and the writes here.


packages/central-server/app/admin/syncCredentials.js:31: setPassword is async (it awaits User.hashPassword — see packages/database/src/models/User.ts:74), but it's called without await. The subsequent await existing.save() will persist the user before the password hash has been computed, leaving the password field unchanged (or set to a pending Promise's string coercion). Fix: await existing.setPassword(password);


packages/facility-server/app/serverConfig.js:148: configHost() calls new URL(config.sync.host.trim()).origin without a try/catch. If the config value is non-empty but not a valid URL (e.g. a bare hostname like central.example.com), this throws an unhandled error and crashes the server at startup. The readFacts function is defensive about table-not-existing errors, but this config fallback path isn't. Consider wrapping in try/catch and returning null on failure, or logging a warning.


packages/facility-server/__tests__/serverConfig.test.js:27: initWith passes facts.LocalSystemFact (a { get, set } object) as context.models, but readFacts destructures { LocalSystemFact, LocalSystemSecret } from models — so both are undefined. Every LocalSystemFact.get(...) call throws, is caught by the try/catch, and falls through silently to env/config fallbacks. The tests don't actually exercise the fact-reading path at all. Additionally, makeFacts never provides a LocalSystemSecret mock, so the password fact is never readable even with correct wiring. Fix initWith to { context: { models: { LocalSystemFact: facts.LocalSystemFact, LocalSystemSecret: { getSecret: async (key) => facts.store.get(key) ?? null } } } } and store the test password under FACT_SYNC_PASSWORD in the LocalSystemSecret mock rather than in LocalSystemFact.


packages/facility-server/app/serverConfig.js:18: Module-level mutable singleton (let resolved) makes this module's state invisible to the rest of the application and hard to test (tests must carefully sequence initServerConfig calls and can't run in parallel). Since the resolved config is already attached to context.serverConfig (line 45), consider making that the canonical owner and having the getters accept context (or at least a fallback) rather than reading a hidden global. The current() fallback (line 118-129) that silently re-derives from env/config when resolved is null masks bugs where initServerConfig was never called.


packages/facility-server/app/subCommands/startAll.js:42: The 'if configured → init sync infra, timesync, timezone checks; else log warning' block is copy-pasted almost identically across startAll.js, startApp.js, and startTasks.js (three copies). Extract a shared helper like initSyncInfraIfConfigured(context) that conditionally wires up centralServer, syncManager, timesync, and timezone checks, and returns isConfigured. This would eliminate the duplication and ensure future changes (e.g. adding a new sync-related init step) are applied once.


packages/central-server/app/admin/syncCredentials.js:35: The provisioned sync user gets role: 'admin', granting full system privileges. A sync account only needs to push/pull data — giving it admin role violates least privilege and means a leaked sync password is a full admin compromise. Consider a dedicated sync role with only the permissions the sync protocol requires.


packages/facility-server/app/routes/apiv1/setup.js:22: GET /public/setup/status has no integration test. While simpler than the sync endpoint, it's a public unauthenticated endpoint that reveals server configuration state. A basic test confirming it returns { configured: false } on a fresh server and { configured: true } after setup would verify the contract the frontend relies on. Can be added alongside the POST /public/setup/sync tests in a shared Setup.test.js file.


packages/central-server/app/admin/syncCredentials.js:15: POST /admin/syncCredentials creates/rotates a sync user with admin role and returns plaintext credentials. It has a permission gate (manage:all) but no integration test verifying that: (1) unauthenticated requests are rejected, (2) non-admin roles are forbidden, (3) admin roles succeed and a user is created/rotated, (4) duplicate calls rotate the same account. Since this endpoint mints credentials, missing auth coverage is high-risk. Add a test file at __tests__/apiv1/SyncCredentials.test.js following the createTestContext / baseApp.asRole() pattern used in nearby admin route tests.


packages/facility-server/app/routes/apiv1/setup.js:47: POST /public/setup/sync is a new unauthenticated endpoint that writes sync credentials and facility IDs to the database. It has a critical security invariant: it must reject requests once the server is already configured (line 53, 409 response), otherwise any unauthenticated caller could repoint a live server's sync target. There is no integration test covering this invariant, nor covering validation failures (missing/invalid body fields), the 409 already-configured guard, or the 403 non-admin path. The outbound central-server probe can be stubbed. This is high-risk: an untested unauthenticated endpoint that writes security-sensitive state. Add an integration test at packages/facility-server/__tests__/apiv1/Setup.test.js — see packages/facility-server/__tests__/apiv1/Appointments.test.js for the pattern.


packages/facility-server/app/routes/apiv1/setup.js:53: TOCTOU race condition on the isServerConfigured() gate. This reads a module-level variable; two concurrent requests can both see false, both pass the check, and both proceed to write facts and call initServerConfig. The last writer wins, so an attacker racing a legitimate setup could silently redirect the server to a malicious sync host. Use a database-level lock (e.g. advisory lock or a serialised transaction that checks-and-sets the facts atomically) to ensure only one setup request succeeds.


packages/central-server/app/admin/syncCredentials.js:25: The deterministic email sync.${facilityIds.join('-')}@sync.tamanu is built from unsanitised user input. While Sequelize parameterises the query, excessively long or adversarial facility ID strings (e.g. thousands of IDs) could produce an unbounded email/displayName. Consider capping the number of facility IDs (the zod schema only enforces .min(1), not a max) and validating that each ID matches an expected format (e.g. alphanumeric + hyphens, bounded length) to prevent abuse.


packages/facility-server/app/routes/apiv1/setup.js:67: SSRF: The unauthenticated /public/setup/sync endpoint makes an outbound HTTP request to a user-supplied host URL (via probe.login()). The only validation is that the protocol is https: (or http: in non-prod). There is no restriction on private/internal IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, 169.254.169.254 cloud metadata, IPv6 loopback). An attacker can use this unconfigured server as a proxy to probe internal network services. Consider adding an allowlist/blocklist for resolved IP addresses before making the outbound connection — e.g. resolve the hostname first and reject RFC 1918/link-local/loopback addresses.

import { useQuery } from '@tanstack/react-query';
import { useApi } from '../useApi';

// Unauthenticated check used pre-login to decide whether to show the first-run

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a dedicated endpoint and a new request, we could have the existing status endpoint return a special value if setup is required.

Comment thread packages/facility-server/app/routes/apiv1/setup.js
@dannash100 dannash100 marked this pull request as draft June 24, 2026 22:00
… resolver

Foundation for moving the facility sync target out of static config into local
system facts:

- FACT_SYNC_EMAIL / FACT_SYNC_PASSWORD constants (syncHost + facilityIds facts
  already existed)
- parseSyncUrl: parse a SYNC_URL connection string (creds embedded, à la
  DATABASE_URL) into host/email/password
- initSyncConfig: resolve sync host, creds and facility ids once at boot with
  precedence env (SYNC_URL/SYNC_FACILITY_IDS) -> facts -> legacy config, seeding
  facts from legacy config for back-compat

Not yet wired into the boot sequence; consumers still read config.
…t boot

Add serverConfig.js: a boot-time resolver (mirroring initDeviceId) that resolves
the sync connection (host/email/password) and facility identity (facilityIds)
once, with precedence env (SYNC_URL / SYNC_FACILITY_IDS) > local system fact >
legacy config. It writes nothing — facts are written out of band by the setup
wizard (and FACT_FACILITY_IDS by the existing facility-match integrity check).

Runtime now reads getSyncConfig() / getServerFacilityIds() / isServerConfigured()
instead of config.sync.* / selectFacilityIds(config): CentralServerConnection,
ApplicationContext settings, auth login, integrity drift-check, mSupply tasks,
shell, and the startApi/startSync/startAll/startTasks boot paths. The boot paths
now skip timesync/central/sync setup when unconfigured so a fresh server still
boots to serve the setup wizard.

The default sync.host config is emptied (kept as a deprecated one-version
fallback) so a fresh install is unconfigured and shows the wizard.
Public (unauthenticated, pre-sync there are no users) facility endpoints:

- GET /api/public/setup/status -> { configured } — gates the web setup wizard
- POST /api/public/setup/sync — validates the submitted SYNC_URL + facilityIds by
  doing an outbound login probe against the host, then records them to local
  system facts and refreshes the in-memory holder

Security: the sync endpoint only works while UNCONFIGURED (409 once configured,
so a live server can't be repointed by an unauthenticated call); https-only
(http allowed in non-production); rate-limited; generic validation errors and no
syncUrl/password in logs.

TODO: endpoint integration tests (need a central login mock + controlled
resolver state).
When the facility web app loads against an unconfigured server, show a setup
wizard instead of the login screen:

- useSetupStatusQuery: unauthenticated GET /public/setup/status; treats 404
  (central) / errors as configured so the wizard never blocks login
- SetupWizardView: single SYNC_URL field + a single/multiple (omniserver)
  facility-id selector (multiple uses add/remove inputs), client-side trim +
  dedupe; POSTs to /public/setup/sync and reloads into login on success
- App.jsx renders the wizard after the server-alive check, before login
…uperadmin gate

Two design corrections from review:

- The wizard (and POST /public/setup/sync) now take separate host + username +
  password fields rather than a single combined SYNC_URL (that connection-string
  form stays only for the SYNC_URL env var, for k8 automation).
- The setup endpoint is now gated on central super-admin credentials: it logs in
  with the supplied creds and requires ability.can('manage', 'all') before
  writing facts. This both authorises setup (a fresh server has no local user to
  authenticate, so we prove super-admin on central — which is the same identity
  as the facility super admin) and resolves the bootstrap problem.

Next pass: have central provision a dedicated sync user and return its
credentials once the admin has authenticated, instead of storing the admin's.
…al labelling

- equal-width facility-mode radios with a gap (fullWidth)
- remove button is now a minus icon button; facility-id inputs stay equal width
  via a fixed remove-slot
- smaller heading so it clears the Tamanu logo
- relabel the credential fields as central Administrator username/password
  (they authorise setup as a central super-admin; see manage:all gate)
- note: sync password should move to the encrypted local_system_secrets table
  once that lands, and is superseded by central-provisioned sync creds later
Add POST /api/admin/syncCredentials (gated manage:all): find-or-rotates a
dedicated per-server sync user ("System: <facilities> sync", role admin,
generated password) and returns its credentials. Mirrors the sync users the
provision subcommand creates.

The facility setup endpoint now, after authenticating the admin and checking
manage:all, calls this to provision sync credentials and stores those — rather
than persisting the admin's own credentials.
Clears the sync host/credentials/facilities facts so the server returns to the
unconfigured state and can be set up again. Host-level access only; restart to
take effect.
- ensureHostMatches now compares the externally declared host (env/config) via
  getDeclaredHost and skips when there's none, instead of constructing a
  CentralServerConnection (which throws when unconfigured) — so a fresh server
  boots cleanly to serve the setup wizard.
- resetSyncConfig hard-deletes the facts (force: true): local_system_facts is
  paranoid, and a soft-deleted row keeps occupying the unique key constraint,
  so a later set() of the same key would collide.
Now that local_system_secrets exists, hold the sync password as an encrypted
secret there (LocalSystemSecret.setSecret/getSecret) rather than plaintext in
local_system_facts. Host, username and facility ids stay as plain facts. The
resolver reads the secret separately and tolerantly, so a missing/unconfigured
crypto key file doesn't blank out the non-secret facts. resetSyncConfig clears
the secret too.
…wport

AuthFlowView vertically centres its content, so a tall wizard (omniserver / many
facilities) clipped off the top and bottom. Bound the form to the viewport
height and scroll internally instead.
Pad the scroll area's top so a full-height wizard doesn't render its heading
under the fixed top-left Tamanu logo.
Use margin-top (not padding) so the scroll viewport starts below the fixed
logo; content then scrolls within that frame instead of sliding up behind it.
Stop reusing AuthFlowView for the wizard: its vertically-centred body and fixed
top-left logo don't suit a tall scrolling form (content scrolled behind the
logo, uneven top/bottom gaps). The wizard now has its own layout — logo in
normal flow atop a scrollable form column, splash image alongside — so the form
scrolls cleanly beneath the logo regardless of height.
…ty list

A server with one facility id filled in is a single-facility server; no need for
a mode toggle. Always render the facility-id list with add/remove (the first row
can't be removed).
…L field

yup's .url() rejects hosts without a TLD (e.g. localhost); validate with the URL
parser instead, matching the server-side check.
- syncCredentials: await async setPassword (was persisting unhashed); derive the
  sync user email from a hash of the facility ids (fixed length, collision-free)
  and cap facility id count
- integrity: skip the facility-match central verify when the server has no host
  (was constructing CentralServerConnection and crashing unconfigured boot)
- serverConfig: guard configHost against malformed URLs; use ?? not ||
- setup: gate POST /public/setup/sync to trusted source networks (loopback /
  RFC1918 / link-local / Tailscale, v4+v6 via ipaddr.js) per review — restrict
  the request source rather than the target host; wrap fact/secret writes in a
  transaction so a mid-write failure can't half-configure the server
- fold setup status into the public ping response (setupRequired) instead of a
  dedicated endpoint + extra request; drop useSetupStatusQuery
- fix serverConfig test wiring (mock LocalSystemSecret; password under the secret)
…d facts

Emptying the default sync.host broke two consumers still reading config directly,
failing every facility test at createTestContext:
- websocketClientService did new URL(config.sync.host) — now reads the resolved
  host and no-ops when the server is unconfigured
- FacilitySyncManager read sync email/password from config — now from getSyncConfig
…rdening

(continuation — these files were missed by an earlier mis-staged commit)
- syncCredentials: await setPassword, hash-derived email, cap facility ids
- integrity: skip facility-match central verify when unconfigured
- serverConfig: guard configHost, ?? over ||
- setup: trusted-source-IP gate, atomic writes
- ping carries setupRequired; serverConfig test wiring fixed
- central POST /admin/syncCredentials: unauthenticated rejected, non-admin
  forbidden, empty ids rejected, admin provisions a sync user (hashed password),
  repeat calls rotate the same account
- facility isTrustedSetupSource: trusts loopback/RFC1918/link-local/Tailscale
  (v4+v6), rejects public + malformed
- facility setup: GET /public/ping reports setupRequired; POST /public/setup/sync
  rejects missing host / empty facilities / non-https
Emptying the default sync.host left the test env with no host, so every test
constructing CentralServerConnection threw. The test config supplies sync
host/email/password + facilities (a configured env, as before), so CSC builds.
Setup/serverConfig tests adjusted for the now-configured test server.
The base's lock was stale under the node 26 bump's npm (dev->devOptional flips,
nested duplicate dedup), failing the CI lockfile check on any PR off this base.
Regenerated with npm 11 so `npm install` is a no-op.
@dannash100 dannash100 force-pushed the feat/TAM-6863/sync-host-setup branch from 134a7a0 to fae9f64 Compare June 24, 2026 23:07
The mSupply task tests mocked selectFacilityIds to control the server's facility
ids per-test, but the tasks now read getServerFacilityIds. Mock that instead
(partial serverConfig mock, keeping the real initServerConfig so createTestContext
still works).
- LocalSystemSecret.getSecret/setSecret were renamed to get/set by the
  TAM-6862 crypto merge; serverConfig and the setup endpoint still called the
  old names, so the wizard 500'd and the boot password read silently failed.
- Extract the triplicated configured/unconfigured sync boot block (timesync +
  central connection + sync manager + timezone checks, plus the warn) into
  setupSyncRuntime(); use it from startAll/startTasks and the sync path of
  startApp.
- mSupply task tests mocked ../../dist/serverConfig, but the tasks load
  ../serverConfig from source under jest — point the mock at ../../app/serverConfig.
…echeck

The determinism harness builds its baseline DB at the pre-migration commit,
which can pre-date the build-less switch and so records `.js` migration names.
At HEAD, areMigrationsAvailable() consulted migrations directly, tripping
createMigrationInterface's `.js`-vs-`.ts` guard before migrateAndHash() could
run the upgrade that normalises them. Normalise the storage in the precheck
too (exporting normaliseMigrationStorageExtensions from @tamanu/upgrade),
matching what upgrade() does before any migration state is read.
import { log } from '@tamanu/shared/services/logging';

// { sync: { host, email, password }, facilityIds }, resolved once at boot.
let resolved = null;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BES Requirements] suggestion

Module-level mutable singleton let resolved = null means getSyncConfig() / getServerFacilityIds() return stale data if initServerConfig is called again (e.g. after the setup wizard writes new facts). The setup handler does call initServerConfig after saving, but other callers that captured the old return value (e.g. CentralServerConnection constructor stored host in its constructor) won't pick up the change until the process restarts. The code comment in setup.js says "the sync process picks it up on its next (re)start" which is correct, but isServerConfigured() in the same process will flip mid-flight — worth documenting that the sync runtime is NOT hot-reloaded.

if (isServerConfigured()) {
context.timesync = await initTimesync({
models: context.models,
url: `${getSyncConfig().host.replace(/\/*$/, '')}/api/timesync`,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BES Requirements] critical

performTimeZoneChecks is called with remote: context.centralServer, but context.centralServer is never set in the API branch (the comment on line 48 even says "doesn't open a central connection"). While performTimeZoneChecks handles undefined remote gracefully today, this is fragile — the intent is clearly wrong and will break if the timezone check is ever tightened. Either skip the remote check entirely when there's no central connection, or instantiate one.

const password = crypto.randomBytes(24).toString('base64url');

const existing = await User.findOne({ where: { email } });
if (existing) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BES Requirements] suggestion

The displayName is built as System: ${uniqueFacilityIds.join(', ')} sync. With up to 100 facility IDs (the schema max), this could produce a very long string. Consider truncating or summarising when the list exceeds a reasonable length, or lowering the .max(100) limit to match realistic use.

email: z.string().trim().min(1),
password: z.string().min(1),
facilityIds: z
.array(z.string().trim().min(1))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BES Requirements] suggestion

isProduction is evaluated once at module load time (const isProduction = process.env.NODE_ENV === 'production'). This is fine for the running server but will bind to whatever NODE_ENV is set at import time, which can cause surprises in tests where NODE_ENV may change. Consider inlining the check inside the refine callback: process.env.NODE_ENV === 'production'.

.max(100),
});

export const setupSyncHandler = asyncHandler(async (req, res) => {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Integration tests] suggestion

POST /public/setup/sync is a security-sensitive unauthenticated endpoint but the integration test in __tests__/apiv1/Setup.test.js only covers the 409 (already-configured) path. The trusted-source IP gate (line 78) is only unit-tested via isTrustedSetupSource, and there is no integration test for request validation (missing/invalid fields returning 4xx). Consider adding at least a validation-error integration test (e.g. missing host or facilityIds) to exercise the Zod schema through the HTTP layer — the unit test for isTrustedSetupSource doesn't prove the middleware wires req.ip correctly. The comment in Setup.test.js acknowledges the gap ('happy path is verified manually'), which is reasonable given the test harness always boots a configured server, but the validation path can be tested even against a configured server (it runs before the isServerConfigured check) — actually no, the trusted-source check runs first. Still, a test confirming the 403 from a non-trusted IP (if the test runner's loopback counts as trusted, test that it's accepted) would strengthen confidence in the wiring.

}

// Reconfiguring a live server goes through env / direct DB, not this endpoint.
if (isServerConfigured()) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Security] critical

TOCTOU race condition: isServerConfigured() is checked outside the transaction (line 85), but the config is written inside a transaction (line 135). Two concurrent requests from trusted sources can both pass the check and race to configure the server. The second overwrites the first's credentials silently — an attacker on the local network could re-point a server that was just legitimately configured. Move the configured check inside the transaction (or use a DB-level advisory lock / unique constraint) so only one request can win.

export const setupSyncHandler = asyncHandler(async (req, res) => {
req.flagPermissionChecked();

if (!isTrustedSetupSource(req.ip)) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Security] critical

IP-based trust is bypassable when a reverse proxy is configured. req.ip honours Express's trust proxy setting (set from config.proxy.trusted in addFacilityMiddleware.js:22). If the proxy list is misconfigured or overly broad, an external attacker can spoof X-Forwarded-For to inject a private-range IP and bypass isTrustedSetupSource. Since this is the primary gate on an unauthenticated endpoint that makes outbound requests (SSRF) and can claim a fresh server, consider a defence-in-depth measure: e.g. a one-time setup token generated at install time that must accompany the request, so network position alone is not sufficient.


const email = syncUserEmail(uniqueFacilityIds);
const displayName = `System: ${uniqueFacilityIds.join(', ')} sync`;
const password = crypto.randomBytes(24).toString('base64url');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Security] suggestion

The plaintext password is returned in the HTTP response body (line 46). If central is behind a TLS-terminating proxy and the downstream leg is plain HTTP, or if response logging/caching is enabled at any layer, the password is exposed. Consider whether the credential can be exchanged via a short-lived token or at least ensure the response carries Cache-Control: no-store to prevent intermediary caching.

@review-hero

review-hero Bot commented Jun 26, 2026

Copy link
Copy Markdown

🦸 Review Hero Summary
12 agents reviewed this PR | 3 critical | 5 suggestions | 1 nitpick | Filtering: consensus 3 voters, 4 below threshold

Below consensus threshold (4 unique issues not confirmed by majority)
Location Agent Severity Comment
packages/central-server/app/admin/syncCredentials.js:43 Integration tests suggestion POST /api/admin/syncCredentials — the new-user path passes password as a plain string to User.create(), while the existing-user path explicitly calls setPassword() to hash it. The test verifi...
packages/facility-server/app/routes/apiv1/setup.js:91 Security suggestion SSRF via attacker-controlled host: any machine on a trusted network (all RFC1918, CGNAT, Tailscale ranges) can supply an arbitrary URL as host, and the server will make an outbound HTTP(S) requ...
packages/facility-server/app/routes/apiv1/setup.js:121 BES Requirements critical The setup handler checks loginResult.ability?.can('manage', 'all') to verify admin access, but TamanuApi.login() returns the raw login response from the server. Verify that the login response a...
packages/web/app/views/SetupWizardView.jsx:117 BES Requirements suggestion The yup validation message 'Sync server URL must be a valid URL' is a hardcoded English string shown to users. Per Tamanu convention all user-facing strings must use TranslatedText or `getTrans...

Nitpicks

File Line Agent Comment
packages/central-server/app/admin/syncCredentials.js 24 Integration tests POST /api/admin/syncCredentials — the integration test doesn't verify that the returned password actually authenticates (i.e. comparePassword succeeds). The test checks user.password !== result.body.password (proving hashing happened) but doesn't verify the hash is valid. A single `expect(await...
Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.


packages/facility-server/app/serverConfig.js:14: Module-level mutable singleton let resolved = null means getSyncConfig() / getServerFacilityIds() return stale data if initServerConfig is called again (e.g. after the setup wizard writes new facts). The setup handler does call initServerConfig after saving, but other callers that captured the old return value (e.g. CentralServerConnection constructor stored host in its constructor) won't pick up the change until the process restarts. The code comment in setup.js says "the sync process picks it up on its next (re)start" which is correct, but isServerConfigured() in the same process will flip mid-flight — worth documenting that the sync runtime is NOT hot-reloaded.


packages/facility-server/app/subCommands/startApp.js:55: performTimeZoneChecks is called with remote: context.centralServer, but context.centralServer is never set in the API branch (the comment on line 48 even says "doesn't open a central connection"). While performTimeZoneChecks handles undefined remote gracefully today, this is fragile — the intent is clearly wrong and will break if the timezone check is ever tightened. Either skip the remote check entirely when there's no central connection, or instantiate one.


packages/central-server/app/admin/syncCredentials.js:37: The displayName is built as System: ${uniqueFacilityIds.join(', ')} sync. With up to 100 facility IDs (the schema max), this could produce a very long string. Consider truncating or summarising when the list exceeds a reasonable length, or lowering the .max(100) limit to match realistic use.


packages/facility-server/app/routes/apiv1/setup.js:70: isProduction is evaluated once at module load time (const isProduction = process.env.NODE_ENV === 'production'). This is fine for the running server but will bind to whatever NODE_ENV is set at import time, which can cause surprises in tests where NODE_ENV may change. Consider inlining the check inside the refine callback: process.env.NODE_ENV === 'production'.


packages/central-server/app/admin/syncCredentials.js:24: POST /api/admin/syncCredentials — the integration test doesn't verify that the returned password actually authenticates (i.e. comparePassword succeeds). The test checks user.password !== result.body.password (proving hashing happened) but doesn't verify the hash is valid. A single expect(await user.comparePassword(result.body.password)).toBe(true) assertion would close that loop, catching a bug where setPassword is skipped or the wrong value is hashed.


packages/facility-server/app/routes/apiv1/setup.js:75: POST /public/setup/sync is a security-sensitive unauthenticated endpoint but the integration test in __tests__/apiv1/Setup.test.js only covers the 409 (already-configured) path. The trusted-source IP gate (line 78) is only unit-tested via isTrustedSetupSource, and there is no integration test for request validation (missing/invalid fields returning 4xx). Consider adding at least a validation-error integration test (e.g. missing host or facilityIds) to exercise the Zod schema through the HTTP layer — the unit test for isTrustedSetupSource doesn't prove the middleware wires req.ip correctly. The comment in Setup.test.js acknowledges the gap ('happy path is verified manually'), which is reasonable given the test harness always boots a configured server, but the validation path can be tested even against a configured server (it runs before the isServerConfigured check) — actually no, the trusted-source check runs first. Still, a test confirming the 403 from a non-trusted IP (if the test runner's loopback counts as trusted, test that it's accepted) would strengthen confidence in the wiring.


packages/facility-server/app/routes/apiv1/setup.js:85: TOCTOU race condition: isServerConfigured() is checked outside the transaction (line 85), but the config is written inside a transaction (line 135). Two concurrent requests from trusted sources can both pass the check and race to configure the server. The second overwrites the first's credentials silently — an attacker on the local network could re-point a server that was just legitimately configured. Move the configured check inside the transaction (or use a DB-level advisory lock / unique constraint) so only one request can win.


packages/facility-server/app/routes/apiv1/setup.js:78: IP-based trust is bypassable when a reverse proxy is configured. req.ip honours Express's trust proxy setting (set from config.proxy.trusted in addFacilityMiddleware.js:22). If the proxy list is misconfigured or overly broad, an external attacker can spoof X-Forwarded-For to inject a private-range IP and bypass isTrustedSetupSource. Since this is the primary gate on an unauthenticated endpoint that makes outbound requests (SSRF) and can claim a fresh server, consider a defence-in-depth measure: e.g. a one-time setup token generated at install time that must accompany the request, so network position alone is not sufficient.


packages/central-server/app/admin/syncCredentials.js:34: The plaintext password is returned in the HTTP response body (line 46). If central is behind a TLS-terminating proxy and the downstream leg is plain HTTP, or if response logging/caching is enabled at any layer, the password is exposed. Consider whether the credential can be exchanged via a short-lived token or at least ensure the response carries Cache-Control: no-store to prevent intermediary caching.

- setup.js: close the TOCTOU between the isServerConfigured() check and the write
  with an advisory lock + re-check inside the transaction, so concurrent trusted
  requests can't both configure the server. Inline the NODE_ENV check so tests
  that set it are respected.
- syncCredentials.js: summarise the display name for servers with many facilities;
  send the credential response with Cache-Control: no-store.
- startApp.js: don't pass a non-existent central connection as the timezone-check
  remote on the API server.
- serverConfig.js: document that the sync runtime is not hot-reloaded.
- test: assert the provisioned sync password actually authenticates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants