Skip to content

Architecture refactor scout: github-28087513500 #7089

Description

@github-actions

cc @algolia/frontend-experiences-web

Architecture Refactor Scout

Run: github-28087513500

Summary

I inspected the whole monorepo, concentrating on the InstantSearch.js core (packages/instantsearch.js/src/lib and src/connectors), the algoliasearch-helper, the React/Vue flavors, and the shared instantsearch-ui-components. No CONTEXT.md or docs/adr/ exists, so candidates are grounded only in files and caller behavior. The recurring friction is shallow utility modules and stamped-out orchestration: a single concept (highlight transformation, "show more" toggling, route↔state mapping, coordinate parsing) is spread across several tiny exports or copy-pasted into many connectors, so callers must learn an interface nearly as large as the implementation and bugs hide in the orchestration glue rather than in any one tested pure function. The strongest candidates pull such a concept behind one deep module without changing public widget APIs.

Candidate Shortlist

candidate-1: Deepen highlight-parts transformation into one module
  • Recommendation strength: Strong
  • Files:
    • packages/instantsearch.js/src/lib/utils/getHighlightedParts.ts
    • packages/instantsearch.js/src/lib/utils/concatHighlightedParts.ts
    • packages/instantsearch.js/src/lib/utils/reverseHighlightedParts.ts
    • packages/instantsearch.js/src/lib/utils/getHighlightFromSiblings.ts
    • packages/instantsearch.js/src/lib/utils/escape-highlight.ts (owns TAG_REPLACEMENT)
    • barrel: packages/instantsearch.js/src/lib/utils/index.ts:6,25,26,50
    • callers across flavors: packages/react-instantsearch/src/widgets/{Highlight,ReverseHighlight,Snippet}.tsx, packages/react-instantsearch/src/ui/RefinementList.tsx, packages/vue-instantsearch/src/components/Highlighter.js, packages/instantsearch.js/src/helpers/*
  • Problem: One concept — "parse an Algolia highlighted string into parts, and re-serialize/reverse those parts" — is split across four tiny exports that all couple on TAG_REPLACEMENT from a fifth file. getHighlightedParts (parse) and concatHighlightedParts (serialize) are exact inverses yet live apart; reverseHighlightedParts delegates to getHighlightFromSiblings, whose sibling-indexing logic carries a latent bug: parts[i + 1]?.isHighlighted || true and parts[i - 1]?.isHighlighted || true (getHighlightFromSiblings.ts:9-10) collapse to true whenever the neighbor is false. A caller wanting "highlighted parts" must currently learn four entry points and which constant they share — the interface is as wide as the implementation, and the real edge-case logic is only reachable through the full pipeline.
  • Proposed change: Group these inverse operations behind one deep highlight-parts module so callers import a single concept ("get parts / join parts / reverse parts") instead of four loose functions, with the tag constant an internal detail rather than a shared import. Keep the existing named exports as thin re-exports for backward compatibility. Fix the || true sibling default as part of the consolidation, since the corrected behavior becomes locally testable.
  • Benefits: Locality — parse/serialize/reverse and the tag contract concentrate in one place reused by all three flavors; today a tag-format change touches five files. Leverage — callers learn one small surface. Testability — the interface becomes the natural test seam, and the sibling-default bug is verifiable at that surface rather than only through reverseHighlightedParts.
  • Risks: Public re-exports must stay stable (these symbols are consumed by react/vue and the helpers/ components). Fixing the || true default changes observable reverse-highlight output in degenerate cases; the PR should snapshot current vs. corrected behavior. Low migration cost — internal callers only.
  • Verification: Run packages/instantsearch.js/src/lib/utils/__tests__/{getHighlightedParts,concatHighlightedParts,reverseHighlightedParts,getHighlightFromSiblings}-test.ts; add a round-trip test (concat(get(x)) === x); type-check that the barrel still exports every original name; run react/vue Highlight/Snippet unit tests.

Before:

flowchart LR
  Caller[Highlight / Snippet widgets] --> getParts[getHighlightedParts]
  Caller --> concat[concatHighlightedParts]
  Caller --> reverse[reverseHighlightedParts]
  reverse --> siblings[getHighlightFromSiblings]
  getParts --> Tag[TAG_REPLACEMENT in escape-highlight]
  concat --> Tag
Loading

After:

flowchart LR
  Caller[Highlight / Snippet widgets] --> Deep[highlight-parts module]
  Deep --> siblings[sibling logic hidden]
  Deep --> Tag[TAG_REPLACEMENT hidden]
Loading
candidate-2: Extract the facet "show more" state machine from connectors
  • Recommendation strength: Strong
  • Files:
    • packages/instantsearch.js/src/connectors/refinement-list/connectRefinementList.ts:240-259,422-456
    • packages/instantsearch.js/src/connectors/menu/connectMenu.ts:185-201,292-328
    • packages/instantsearch.js/src/connectors/hierarchical-menu/connectHierarchicalMenu.ts:218-238,307-413
  • Problem: Three facet connectors hand-roll the identical "show more" state machine inside their closures: a mutable isShowingMore flag, a toggleShowMore placeholder reassigned via createToggleShowMore(renderOptions, this) that flips the flag and calls widget.render, a cachedToggleShowMore indirection so the exposed callback stays stable across renders, a getLimit() returning isShowingMore ? showMoreLimit : limit, and a canToggleShowMore derivation. This is orchestration, not pure logic — it is never unit-tested in isolation, only through each connector's full init/render cycle, which is exactly where the rubric says real bugs hide. Every new facet connector must re-learn and re-implement this ordering (placeholder → reassign in render → cache indirection) correctly.
  • Proposed change: Lift the show-more behavior into one deep module that a connector creates once and reads from, owning the flag, the toggle, the stable cached callback, the effective limit, and the canToggleShowMore rule. Connectors would request the current values instead of re-declaring the mechanism. No widget option or render-state field changes — the exposed toggleShowMore, isShowingMore, and canToggleShowMore keep their current shapes.
  • Benefits: Locality — the toggle/limit/cache ordering lives in one place instead of three near-identical copies. Leverage — a connector author learns one small interface rather than a five-variable dance. Testability — the state machine becomes directly testable at its own seam (toggle flips limit, cached callback identity is stable), instead of only via full connector renders.
  • Risks: The cached-callback indirection exists to preserve referential stability for react-instantsearch/vue-instantsearch consumers; the extraction must preserve that identity contract. connectNumericMenu and connectHits have related-but-different lazy-init shapes — scope this PR to the three true show-more connectors and leave the others. Medium review surface across three files.
  • Verification: Run the connector test suites for refinement-list, menu, and hierarchical-menu; assert toggleShowMore reference identity is stable across renders; add focused unit tests for the extracted module (toggle flips getLimit, canToggleShowMore boundaries); run E2E_FLAVOR=react E2E_BROWSER=chromium yarn test:e2e for refinement-list show-more.

Before:

flowchart LR
  RL[connectRefinementList] --> SM1[isShowingMore + toggle + cache + getLimit]
  Menu[connectMenu] --> SM2[isShowingMore + toggle + cache + getLimit]
  HM[connectHierarchicalMenu] --> SM3[isShowingMore + toggle + cache + getLimit]
Loading

After:

flowchart LR
  RL[connectRefinementList] --> Deep[showMore state module]
  Menu[connectMenu] --> Deep
  HM[connectHierarchicalMenu] --> Deep
  Deep --> Detail[flag/cache/limit hidden]
Loading
candidate-3: Own the route↔state lifecycle behind the router middleware seam
  • Recommendation strength: Worth exploring
  • Files:
    • packages/instantsearch.js/src/middlewares/createRouterMiddleware.ts:46-124
    • packages/instantsearch.js/src/lib/stateMappings/simple.ts
    • packages/instantsearch.js/src/lib/stateMappings/singleIndex.ts
    • packages/instantsearch.js/src/lib/routers/history.ts
  • Problem: The router middleware leaks knowledge across the router/stateMapping seam. It manually pairs stateMapping.stateToRoute / routeToState with router.write / read / onUpdate, holds its own lastRouteState and dedupes writes via isEqual (:90-96), and reaches into InstantSearch internals to reconstruct previous UI state — using mainIndex.getWidgets().length === 0 as a proxy for "we don't yet know the index widgets, so fall back to _initialUiState" in both topLevelCreateURL (:55-59) and onUpdate (:111). The caller must understand widget-initialization ordering and route-equality semantics to read this middleware at all; the round-trip mapping concept is smeared across the middleware rather than owned by a module.
  • Proposed change: Move the route↔state round-trip (write-deduplication, the "widgets-not-yet-known → initialUiState" fallback, and the read/write/onUpdate pairing) behind a single deep coordinator the middleware drives, so the middleware reads as lifecycle hooks delegating to one object instead of inlined ordering rules. Router and StateMapping stay as the public seams users override; only the internal gluing concentrates.
  • Benefits: Locality — the load-bearing ordering rule and write-dedup live in one tested place instead of being duplicated between topLevelCreateURL and onUpdate. Leverage — a reader/extender of routing learns one coordinator, not the interaction of three objects plus InstantSearch internals. Testability — round-trip stability and the initial-state fallback become testable without standing up a full InstantSearch instance.
  • Risks: Routing is the highest-blast-radius subsystem here; behavior must be byte-identical for URL output. The $$internal default-middleware path and the initialUiState-with-routing warning must be preserved. Higher review scrutiny; keep the diff to extraction-without-behavior-change and lean on existing routing tests.
  • Verification: Run middleware tests (src/middlewares/__tests__/createRouterMiddleware*) and routing integration tests; assert identical createURL output and identical router.write call counts before/after; run e2e routing specs in tests/e2e/playwright.

Before:

flowchart LR
  MW[createRouterMiddleware] --> Router[router read/write/onUpdate]
  MW --> Mapping[stateMapping stateToRoute/routeToState]
  MW --> Internals[mainIndex widgets + _initialUiState + lastRouteState]
Loading

After:

flowchart LR
  MW[createRouterMiddleware] --> Coord[route-state coordinator]
  Coord --> Router[router]
  Coord --> Mapping[stateMapping]
  Coord --> Internals[ordering + dedup hidden]
Loading
candidate-4: Deepen geo coordinate parsing behind one parser
  • Recommendation strength: Worth exploring
  • Files:
    • packages/instantsearch.js/src/lib/utils/geo-search.ts
    • sole caller: packages/instantsearch.js/src/connectors/geo-search/connectGeoSearch.ts
  • Problem: geo-search.ts exposes aroundLatLngToPosition plus insideBoundingBoxToBoundingBox, where the latter is a string | LatLng union dispatcher (:68-74) over two near-identical private functions (:20-43, :45-66) that each repeat the same "destructure four numbers → validate truthiness → throw a format-specific error → build {northEast, southWest}" sequence. The truthiness check (!neLat || ...) also silently rejects a legitimate 0 coordinate. Callers must know which input format they hold and that two parallel error messages exist for one concept. It is a shallow module: the interface (a union plus two near-duplicate behaviors) is about as complex as the implementation.
  • Proposed change: Collapse the array/string variants into one parser that normalizes either input to a bounding box with a single validation-and-error path, so the connector hands raw input to one entry point rather than choosing a format-specific branch. Fix the 0-coordinate rejection as part of unifying validation. Keep aroundLatLngToPosition and the public insideBoundingBoxToBoundingBox names for compatibility.
  • Benefits: Locality — one validation rule and one error-construction path instead of two copies. Leverage — the caller stops reasoning about the input union. Testability — the normalization/validation seam is small and exercised directly (it already has __tests__/geo-search-test.ts).
  • Risks: Smallest candidate; payoff is modest. Changing the 0-coordinate behavior alters which inputs throw — call it out explicitly and add a regression test. Error-message wording may be asserted in tests; preserve or update deliberately.
  • Verification: Run src/lib/utils/__tests__/geo-search-test.ts; add cases for 0 bounds and for array-vs-string equivalence; run connectGeoSearch connector tests.

Before:

flowchart LR
  Geo[connectGeoSearch] --> Dispatch[insideBoundingBoxToBoundingBox union]
  Dispatch --> ArrFn[array variant: validate/throw/build]
  Dispatch --> StrFn[string variant: validate/throw/build]
Loading

After:

flowchart LR
  Geo[connectGeoSearch] --> Parser[bounding-box parser]
  Parser --> One[single validate/normalize hidden]
Loading
candidate-5: Consolidate init/render argument builders
  • Recommendation strength: Speculative
  • Files:
    • packages/instantsearch.js/src/lib/utils/render-args.ts:9-57
    • callers: packages/instantsearch.js/src/widgets/index/index.ts, packages/instantsearch.js/src/middlewares/createMetadataMiddleware.ts
  • Problem: createInitArgs and createRenderArgs build overlapping ~10-field argument objects, and the caller must know the differences by memory: parent/uiState are init-only, results/scopedResults are render-only, and state is silently results._state when present "to make the UI optimistic" (:47) but helper.state otherwise. That optimistic-state rule and the init-vs-render field split are knowledge callers (the index widget, the metadata middleware) must hold rather than ask for. It is a mild shallow-module smell: two builders sharing most of their surface with a subtle, undocumented divergence.
  • Proposed change: Express the shared argument shape and the init/render divergence (including the optimistic state rule) behind one builder concept so callers ask for "init args" or "render args" without re-deriving which fields apply or why state switches sources. No change to the args objects widgets receive.
  • Benefits: Locality — the optimistic-state rule and field split sit in one place. Leverage — a connector/middleware author stops memorizing init-only vs render-only fields. Testability — the divergence becomes assertable at one seam.
  • Risks: These objects are the InitOptions/RenderOptions passed to every connector, so the field shapes are effectively public; this must be pure refactor with zero shape change. Thinner payoff than candidates 1–2 — flagged Speculative because the two builders may legitimately stay separate; verify the divergence is worth unifying before committing.
  • Verification: Type-check the full package (the args types flow into every connector); run index-widget and metadata-middleware tests; snapshot a representative connector's received init/render args before and after.

Before:

flowchart LR
  Caller[index widget / metadata middleware] --> Init[createInitArgs]
  Caller --> Render[createRenderArgs]
  Caller --> Rule[knows init-only vs render-only + optimistic _state]
Loading

After:

flowchart LR
  Caller[index widget / metadata middleware] --> Builder[render-args builder]
  Builder --> Rule[field split + optimistic _state hidden]
Loading

Top Recommendation

Implement candidate-1 (deepen highlight-parts transformation) first. It is the cleanest fit for the rubric: a single concept currently fragmented across four shallow exports plus a shared constant, consumed identically by all three flavors, with an interface as wide as its implementation and a genuine latent bug (|| true) that is only reachable through the full pipeline today. The locality is tight (a handful of lib/utils files plus the barrel), the public surface can be preserved via re-exports so migration risk is low, and the deepened module is exactly the natural test seam — making it a small, high-leverage, single PR.

Next Step

To implement a selected candidate, trigger:

/implement candidate-1

Replace candidate-1 with the id of the candidate you want to implement (candidate-1 through candidate-5).

Non-Candidates

  • Decompose InstantSearch.ts lifecycle (lib/InstantSearch.ts) — real friction, but the exploration estimated 3–4 PRs; too broad for one reviewable change and not a single small-interface improvement.
  • A generic init/render orchestration factory across all ~35 connectors — high theoretical leverage but touches the entire connector surface in one PR; rejected as a broad rewrite with outsized review/migration risk. Candidate-2 carves out the well-bounded show-more slice instead.
  • Extract/restructure the insights middleware (createInsightsMiddleware.ts) — genuinely tangled (token resolution + client loading + param patching), but large (~400 lines, multiple state machines) and better treated as its own multi-PR effort, not a small deepening.
  • Unify SearchParameters' three refinement trees / hierarchical-facet path logic (packages/algoliasearch-helper) — strong "scattered concept" smell, but it lives in a separately versioned, backward-compatibility-sensitive package and spans SearchParameters, requestBuilder, SearchResults, and tree generation; locality and compat risk make it a poor first PR.
  • Reorganize the connectors/index.ts barrel — discoverability churn, not module depth; explicitly out of scope per the cleanup/formatting exclusion.
  • Collapse the React/Vue useX connector wrappers — they look like pass-throughs, but they are the per-flavor seam adapting instantsearch.js connectors to React/Vue lifecycles; deleting them spreads complexity into callers (the deletion test says they earn their keep), so this is not a deepening opportunity.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions