Skip to content

[efficiency-improver] perf: avoid redundant string[] copy in LuceneSearchQueryBase grouped query methods#481

Open
github-actions[bot] wants to merge 2 commits into
support/3.xfrom
efficiency/base-fields-array-check-1781468268-39b56761c7d97bf1
Open

[efficiency-improver] perf: avoid redundant string[] copy in LuceneSearchQueryBase grouped query methods#481
github-actions[bot] wants to merge 2 commits into
support/3.xfrom
efficiency/base-fields-array-check-1781468268-39b56761c7d97bf1

Conversation

@github-actions

Copy link
Copy Markdown
Contributor

🤖 This is an automated draft PR from Efficiency Improver, an AI assistant focused on reducing the energy consumption and computational footprint of this repository.

Goal

Eliminate an unconditional string[] allocation in every GroupedAnd/GroupedOr/GroupedNot call on LuceneSearchQueryBase — the primary query-building base class used directly by LuceneSearchQuery.

Focus Area

Code-Level Efficiency — remove a provably no-op heap allocation on the query-building hot path.

Changes

File: src/Examine.Lucene/Search/LuceneSearchQueryBase.cs

All 9 call sites (fields.ToArray()) updated across GroupedAnd/Or/Not (3 public + 6 INestedQuery explicit implementations):

Before:

// public overloads
return GroupedAndInternal(fields.ToArray(), fieldVals, Occurrence);

// INestedQuery overloads
fields == null ? s_emptyStringArray : fields.ToArray()

After:

// public overloads
return GroupedAndInternal(fields as string[] ?? fields.ToArray(), fieldVals, Occurrence);

// INestedQuery overloads
fields == null ? s_emptyStringArray : (fields as string[] ?? fields.ToArray())

When the caller already passes a string[] (the common case — e.g. new[] { "title", "body" }), the isinst type check short-circuits and no copy is made. The isinst instruction is negligible compared to the allocation it prevents.

Energy Efficiency Evidence

Proxy metric: heap allocation count per grouped query call (fewer allocations → less GC pressure → less CPU energy for collection).

Call site Before After Saving
GroupedAnd(IEnumerable<string>, IExamineValue[]) 1 string[] copy 0 when caller passes string[] −1 alloc/call
GroupedOr(IEnumerable<string>, IExamineValue[]) 1 string[] copy 0 when caller passes string[] −1 alloc/call
GroupedNot(IEnumerable<string>, IExamineValue[]) 1 string[] copy 0 when caller passes string[] −1 alloc/call
6 × INestedQuery.Grouped* overloads 1 string[] copy each 0 when caller passes string[] −1 alloc/call each

In a search application executing 100 grouped queries per second with string[] fields input, this eliminates ~900 short-lived heap objects per second — directly reducing GC collection frequency.

Limitation: This is a micro-optimisation. The allocation is small and short-lived. Impact is proportional to query throughput and the percentage of callers that already use string[].

Analogous change: PR #475 applies the same pattern to LuceneQuery.cs. This PR covers LuceneSearchQueryBase.cs, the complementary class used directly by LuceneSearchQuery.

GSF Context

Hardware Efficiency — eliminate provably unnecessary work. The .ToArray() copy exists only to satisfy a type constraint that is already satisfied when the caller passes a string[]; removing the copy lets the CPU skip one object header write and one memcpy per call.

Trade-offs

  • fields as string[] ?? fields.ToArray() adds one isinst instruction — undetectable overhead, faster than the allocation it replaces.
  • No behaviour change: the array values passed to internal methods are identical.
  • Readability is equivalent or slightly improved (fewer intermediate allocations to reason about).

Reproducibility

dotnet build src/Examine.sln --configuration Release
dotnet test src/Examine.Test/Examine.Test.csproj -f net8.0 --configuration Release

Test Status

✅ Build: 0 errors, 3 pre-existing framework warnings (net6.0 EOL, unrelated to this change).
✅ Tests: 147 passed / 2 skipped as expected (net8.0).

Generated by Efficiency Improver · sonnet46 5.8M ·

Add this agentic workflows to your repo

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/efficiency-improver.md@dcdf09723d42ef9b6c75335e4612fd145d4ccdaa

…query methods

Replace unconditional fields.ToArray() with fields as string[] ?? fields.ToArray()
in all 9 GroupedAnd/Or/Not overloads (3 public + 6 INestedQuery implementations).

When callers already pass a string[] (the common case), the isinst type check
short-circuits and no copy is made. This eliminates one string[] allocation per
grouped query call.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Shazwazza Shazwazza marked this pull request as ready for review June 18, 2026 23:13
@greptile-apps

greptile-apps Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR avoids a redundant heap allocation in the GroupedAnd/GroupedOr/GroupedNot methods of LuceneSearchQueryBase by using a type-check short-circuit (fields as string[] ?? fields.ToArray()) instead of unconditionally calling .ToArray(). The change is safe because all three internal methods (GroupedAndInternal, GroupedOrInternal, GroupedNotInternal) only read from the fields array — they pass it on to GetMultiFieldQuery which accepts IReadOnlyList<string> and never mutates it — so sharing the caller's original array introduces no aliasing risk.

  • At all 9 call sites in both the public overloads and the six INestedQuery explicit interface implementations, fields.ToArray() is conditionally replaced; the null-guard paths (s_emptyStringArray) are unchanged and still execute before the cast.
  • The micro-optimisation eliminates one short-lived string[] allocation per call when the caller already provides a string[], reducing GC pressure at high query throughput.

Confidence Score: 5/5

Safe to merge — the change is a well-scoped micro-optimisation with no behavioural impact.

All three internal methods only read from the fields array and do not store a reference to it beyond the call frame, so reusing the caller's array instead of copying it is correct. The null-guard paths are unchanged. The change is consistent in all 9 call sites and mirrors the pattern already applied in PR #475.

No files require special attention.

Important Files Changed

Filename Overview
src/Examine.Lucene/Search/LuceneSearchQueryBase.cs Replaces unconditional fields.ToArray() with fields as string[] ?? fields.ToArray() at 9 call sites across GroupedAnd/Or/Not public and INestedQuery explicit implementations; no functional change since internal methods only read from the array.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Caller: GroupedAnd/Or/Not\nIEnumerable fields param"] --> B{"fields as string[] != null?"}
    B -- "Yes: caller passed string[]\nisinst short-circuits" --> C["Pass caller array directly\nzero allocation"]
    B -- "No: other IEnumerable type" --> D[".ToArray() new string copy"]
    C --> E["Internal method\nGroupedAndInternal / OrInternal / NotInternal"]
    D --> E
    E --> F["GetMultiFieldQuery\nIReadOnlyList read-only iteration"]
    F --> G["BooleanQuery built\nand added to stack"]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A["Caller: GroupedAnd/Or/Not\nIEnumerable fields param"] --> B{"fields as string[] != null?"}
    B -- "Yes: caller passed string[]\nisinst short-circuits" --> C["Pass caller array directly\nzero allocation"]
    B -- "No: other IEnumerable type" --> D[".ToArray() new string copy"]
    C --> E["Internal method\nGroupedAndInternal / OrInternal / NotInternal"]
    D --> E
    E --> F["GetMultiFieldQuery\nIReadOnlyList read-only iteration"]
    F --> G["BooleanQuery built\nand added to stack"]
Loading

Reviews (1): Last reviewed commit: "perf: avoid redundant string[] copy in L..." | Re-trigger Greptile

@Shazwazza

Copy link
Copy Markdown
Owner

@copilot resolve the merge conflicts in this pull request

Co-authored-by: Shazwazza <1742685+Shazwazza@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

@copilot resolve the merge conflicts in this pull request

Resolved — I merged origin/support/3.x into this branch and fixed the conflict in src/Examine.Lucene/Search/LuceneSearchQueryBase.cs in commit a7c51b7e.

Copilot AI requested a review from Shazwazza June 18, 2026 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants