Skip to content

refactor(database): linear-time mergeMultiJoin (plus type hints)#6631

Open
jaskfla wants to merge 3 commits into
devfrom
mmj
Open

refactor(database): linear-time mergeMultiJoin (plus type hints)#6631
jaskfla wants to merge 3 commits into
devfrom
mmj

Conversation

@jaskfla

@jaskfla jaskfla commented Feb 5, 2026

Copy link
Copy Markdown
Contributor

🦸 Review Hero

  • Run Review Hero
  • Auto-fix review suggestions
  • Auto-fix CI failures

@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @jaskfla, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on optimizing the mergeMultiJoin utility for better performance and enhancing the overall type safety and consistency across the codebase. It introduces linear-time complexity for a critical database operation and improves developer experience through better type hinting and standardized constant management.

Highlights

  • Performance Optimization: The mergeMultiJoin utility function has been refactored to improve its time complexity from quadratic (O(N*M)) to linear (O(N+M)) by utilizing a Set for efficient lookup of existing join tables. This significantly enhances performance when merging large arrays of join configurations.
  • Type Safety and Documentation: Comprehensive JSDoc type hints have been added to the mergeMultiJoin function, including @typedef for MultiJoinItem and @template, @param, @returns for the function itself. This improves code readability, maintainability, and enables better static analysis.
  • TypeScript Consistency: Type definitions for SingleSurveyResponseRequest and SurveyResponsesRequest in web server routes have been updated from type aliases to interface declarations, aligning with common TypeScript practices for object type extensibility.
  • Constant Centralization: Constants like BES_ADMIN_PERMISSION_GROUP and SURVEY_RESPONSE_DEFAULT_FIELDS have been moved from local definitions to imports from @tupaia/constants, promoting a single source of truth and reducing duplication.
  • Query Hook Refinement: The useSurveyResponses query hook has been refined to use a consistent params object for its query key and API call, and as const assertion was added to DEFAULT_FIELDS for stronger type inference.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • packages/database/src/core/permissions/assertSurveyResponsePermission.js
    • Refactored SQL query string formatting for improved readability without functional changes.
  • packages/database/src/core/utilities/mergeMultiJoin.js
    • Implemented a performance optimization for mergeMultiJoin by replacing Array.prototype.find with a Set for checking unique joins, reducing time complexity from quadratic to linear.
    • Added extensive JSDoc type definitions for MultiJoinItem and the mergeMultiJoin function, including @typedef, @template, @param, and @returns.
  • packages/datatrak-web-server/src/routes/SingleSurveyResponseRoute.ts
    • Converted SingleSurveyResponseRequest from a type alias to an interface for better TypeScript extensibility.
    • Centralized constants BES_ADMIN_PERMISSION_GROUP and SURVEY_RESPONSE_DEFAULT_FIELDS by importing them from @tupaia/constants.
  • packages/datatrak-web-server/src/routes/SurveyResponsesRoute.ts
    • Converted SurveyResponsesRequest from a type alias to an interface for better TypeScript extensibility.
    • Added as const assertion to the DEFAULT_FIELDS array for stronger type inference and immutability.
  • packages/datatrak-web/src/api/queries/useSurveyResponses.ts
    • Refactored useQuery key and parameter passing to use a consistent params object.
    • Updated the useQuery callback to use async/await syntax and added explicit type assertion for the return type.
    • Changed boolean conversion for the enabled option from !! to Boolean() for stylistic consistency.
Activity
  • No specific review comments or activity have been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread packages/datatrak-web/src/api/queries/useSurveyResponses.ts Outdated

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant performance improvement to the mergeMultiJoin utility by refactoring it to a linear-time algorithm. This is an excellent optimization that will improve performance where this utility is used. The pull request also includes several other valuable refactorings, such as adding JSDoc type hints for better developer experience, centralizing constants for improved consistency, and strengthening TypeScript types with interface and as const for better type safety. The changes are well-implemented and improve the overall quality of the codebase. I have no further comments.

return baseMultiJoin;
}
const checkJoinIsUnique = join => !baseMultiJoin.find(j => j.joinWith === join.joinWith);
return [...baseMultiJoin].concat(...multiJoinToMerge.filter(checkJoinIsUnique));

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.prototype.concat already returns a new array without modifying either of the originals

Both these spreads are redundant

const existingJoinTables = new Set(base.map(j => j.joinWith));
const uniques = source.filter(j => !existingJoinTables.has(j.joinWith));

return [...base, ...uniques];

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively:

Suggested change
return [...base, ...uniques];
return base.concat(uniques);

@jaskfla jaskfla force-pushed the mmj branch 2 times, most recently from 0d91e57 to 0c611da Compare February 24, 2026 01:36
Base automatically changed from rn-1717-test-four to rn-1545-epic-datatrak-offline March 2, 2026 04:44
Base automatically changed from rn-1545-epic-datatrak-offline to dev March 2, 2026 05:46
@jaskfla jaskfla force-pushed the mmj branch 2 times, most recently from f9f07bf to 7bea986 Compare March 15, 2026 21:02
@review-hero

review-hero Bot commented Mar 15, 2026

Copy link
Copy Markdown

🦸 Review Hero Summary
2 agents reviewed this PR | 0 critical | 0 suggestions | 1 nitpick

Nitpicks

File Line Agent Comment
packages/database/src/core/utilities/mergeMultiJoin.js 24 Performance When all source items are duplicates, uniques will be empty but [...base, ...uniques] still allocates a new array. Adding if (uniques.length === 0) return base; before the spread avoids this unnecessary allocation in the common hot-path where source joins already exist.
Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.


packages/database/src/core/utilities/mergeMultiJoin.js:24: When all source items are duplicates, uniques will be empty but [...base, ...uniques] still allocates a new array. Adding if (uniques.length === 0) return base; before the spread avoids this unnecessary allocation in the common hot-path where source joins already exist.

@jaskfla jaskfla force-pushed the mmj branch 2 times, most recently from 8012274 to 28d59e7 Compare March 30, 2026 00:34
@jaskfla

jaskfla commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

When all source items are duplicates, uniques will be empty but [...base, ...uniques] still allocates a new array. Adding if (uniques.length === 0) return base; before the spread avoids this unnecessary allocation in the common hot-path where source joins already exist.

Ignoring this performance nitpick because it feels wrong for it to only sometimes return a new array.

@jaskfla

jaskfla commented May 7, 2026

Copy link
Copy Markdown
Contributor Author

@claude review

Comment thread packages/database/src/core/utilities/mergeMultiJoin.js
Comment thread packages/database/src/core/utilities/mergeMultiJoin.js

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — small, well-scoped change; the remaining typedef nit is minor and non-blocking.

Extended reasoning...

Overview

Single-file change to packages/database/src/core/utilities/mergeMultiJoin.js: switches duplicate detection from Array.prototype.find (O(N·M)) to a Set lookup (O(N+M)), and adds JSDoc type hints (a MultiJoinItem typedef and a generic-templated signature on the function).

Security risks

None. Pure utility function operating on plain JS objects with no I/O, no string interpolation into SQL, no authentication or authorization surface. The set of inputs and outputs is unchanged from the previous implementation.

Level of scrutiny

Low. The runtime behaviour change is mechanical (find → Set.has) and preserves order: base is emitted first, then unique items from source in their original order — matching the previous [...base].concat(...filtered) semantics. The renames (baseMultiJoinbase, multiJoinToMergesource) are local to the function body. The empty-source short-circuit was already present in spirit (filter of empty array yields empty array). All remaining differences are JSDoc, which doesn't affect runtime.

Other factors

The author has already responded to and incorporated prior typedef feedback (d17bff7 fixed the joinCondition optionality and added joinConditions plural). The one new nit being filed inline (joinType should be JoinType | null since JOIN_TYPES.DEFAULT === null and getQueryOptionsForColumns produces joinType: null items that flow into mergeMultiJoin) is the same class of typedef-accuracy issue and is non-blocking — no checkJs, no current TS importers of MultiJoinItem. Approving on the merits of the runtime change; the author can decide whether to fold the nit into a follow-up.

* @typedef {{
* joinWith: string;
* joinAs?: string;
* joinType?: JoinType;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 🟡 The new MultiJoinItem typedef declares joinType?: JoinType, but JoinType (BaseDatabase.js:37) is a literal union of strings only — it does not include null. However, null is the canonical default joinType in this codebase: JOIN_TYPES.DEFAULT === null (BaseDatabase.js:48), addJoin defaults joinType to that value (BaseDatabase.js:873), and getQueryOptionsForColumns in packages/central-server/src/apiV2/GETHandler/helpers.js:117 produces multiJoin items with joinType: null (asserted across many cases in helpers.test.js) that flow straight into mergeMultiJoin. The same package already uses the correct JoinType | null form at FeedItem.js:99. Severity is low (no checkJs, no current TS importers of MultiJoinItem), but this typedef ships into the emitted .d.ts (declaration: true, emitDeclarationOnly: true) and the PR explicitly advertises adding type hints — same class of accuracy nit as the two already-corrected items in this block. Fix: joinType?: JoinType | null.

Extended reasoning...

What's wrong

The new MultiJoinItem typedef at packages/database/src/core/utilities/mergeMultiJoin.js:6 declares:

joinType?: JoinType;

But JoinType is imported from packages/database/src/core/BaseDatabase.js:37:

/** @typedef {'cross' | 'fullOuter' | 'inner' | 'left' | 'leftOuter' | 'outer' | 'right' | 'rightOuter'} JoinType */

This is a string-only literal union. null is not assignable. So the typedef effectively says: joinType may be one of those eight strings or undefined — but never null.

Why that's wrong

null is the canonical default joinType throughout this codebase:

  1. BaseDatabase.js:48: JOIN_TYPES.DEFAULT: null. The whole JOIN_TYPES object is annotated @satisfies {Record<string, JoinType | null>} (line 38) — the surrounding code already treats JoinType | null as the real domain.
  2. BaseDatabase.js:870-873: addJoin destructures joinType = JOIN_TYPES.DEFAULT, i.e. defaults to null.
  3. packages/central-server/src/apiV2/GETHandler/helpers.js:117: getQueryOptionsForColumns defaults joinType = null and threads it into constructJoinCondition (line 90: { joinWith, joinCondition, joinType }), producing multiJoin items literally shaped { ..., joinType: null }. These items then flow through dbOptions.multiJoin into mergeMultiJoin as the source argument via permission helpers like assertSurveyResponsePermission.js, assertAnswerPermissions.js, assertSyncGroupPermissions.js, etc.
  4. packages/central-server/src/tests/apiV2/helpers.test.js asserts joinType: null on the produced items at lines 18, 39, 44, 70, 75, 113, 118, 123 — confirming this is the real production shape.

The convention to write JoinType | null already exists in the same package: packages/database/src/core/modelClasses/FeedItem.js:99 documents @param {JoinType | null} [dbOptions.joinType]. The new typedef just doesn't follow it.

Step-by-step proof

  1. Apply this PR; MultiJoinItem is now declared.
  2. In a TS file, write const item: MultiJoinItem = { joinWith: 'foo', joinType: JOIN_TYPES.DEFAULT, joinCondition: ['a','b'] }; — i.e. the canonical addJoin input shape.
  3. tsc reports Type 'null' is not assignable to type '"cross" | "fullOuter" | "inner" | "left" | "leftOuter" | "outer" | "right" | "rightOuter" | undefined'.
  4. Equivalently, take any item produced by getQueryOptionsForColumns (which always sets joinType: null) and try to assign it to MultiJoinItem — same error.

Why existing code doesn't break today

packages/database has allowJs: true but no checkJs, and grep confirms no .ts file currently imports MultiJoinItem from mergeMultiJoin.js. So nothing fails to build today. But packages/database/tsconfig.json extends tsconfig-js.json which sets declaration: true and emitDeclarationOnly: true, so the inaccurate typedef is emitted into the .d.ts shipped to consumers. The PR explicitly advertises adding type hints, so this is a fresh accuracy regression rather than a pre-existing issue, and is the same class as the two analogous typedef items already corrected in this very PR (joinCondition optionality and joinConditions plural — fixed in d17bff7).

How to fix

- *   joinType?: JoinType;
+ *   joinType?: JoinType | null;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant