[efficiency-improver] perf: avoid redundant string[] copy in LuceneSearchQueryBase grouped query methods #481
…query methods Replace unconditional fields.ToArray() with fields as string[] ?? fields.ToArray() in all 9 GroupedAnd/Or/Not overloads (3 public + 6 INestedQuery implementations). When callers already pass a string[] (the common case), the isinst type check short-circuits and no copy is made. This eliminates one string[] allocation per grouped query call. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Greptile Summary
This PR avoids a redundant heap allocation in the
Confidence Score: 5/5
Safe to merge: the change is a well-scoped micro-optimisation with no behavioural impact. All three internal methods only read from the
No files require special attention.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Caller: GroupedAnd/Or/Not\nIEnumerable fields param"] --> B{"fields as string[] != null?"}
B -- "Yes: caller passed string[]\nisinst short-circuits" --> C["Pass caller array directly\nzero allocation"]
B -- "No: other IEnumerable type" --> D[".ToArray() new string[] copy"]
C --> E["Internal method\nGroupedAndInternal / OrInternal / NotInternal"]
D --> E
E --> F["GetMultiFieldQuery\nIReadOnlyList read-only iteration"]
F --> G["BooleanQuery built\nand added to stack"]
Reviews (1): Last reviewed commit: "perf: avoid redundant string[] copy in L..."
@copilot resolve the merge conflicts in this pull request
Co-authored-by: Shazwazza <1742685+Shazwazza@users.noreply.github.com>
Resolved — I merged |
🤖 This is an automated draft PR from Efficiency Improver, an AI assistant focused on reducing the energy consumption and computational footprint of this repository.
Goal
Eliminate an unconditional `string[]` allocation in every `GroupedAnd`/`GroupedOr`/`GroupedNot` call on `LuceneSearchQueryBase`, the primary query-building base class used directly by `LuceneSearchQuery`.
Focus Area
Code-Level Efficiency: remove a provably no-op heap allocation on the query-building hot path.
Changes
File: `src/Examine.Lucene/Search/LuceneSearchQueryBase.cs`
All 9 call sites (`fields.ToArray()`) updated across `GroupedAnd`/`Or`/`Not` (3 public + 6 `INestedQuery` explicit implementations):
Before: `fields.ToArray()`
After: `fields as string[] ?? fields.ToArray()`
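The before/after expressions can be tried in isolation with a minimal sketch like the one below. `AsArrayOrCopy` is a hypothetical helper written only for illustration; it is not part of Examine, and the actual call sites in `LuceneSearchQueryBase` pass the expression straight to their internal methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an Examine API): same expression as the PR.
// Reuses the caller's array when the input already is a string[],
// otherwise falls back to copying via ToArray().
static string[] AsArrayOrCopy(IEnumerable<string> fields)
    => fields as string[] ?? fields.ToArray();

string[] arrayInput = new[] { "title", "body" };
List<string> listInput = new List<string> { "title", "body" };

string[] fromArray = AsArrayOrCopy(arrayInput);
string[] fromList = AsArrayOrCopy(listInput);

// The array input comes back as the very same instance (no copy).
Console.WriteLine(object.ReferenceEquals(arrayInput, fromArray)); // True

// The list input is materialised into a new array with the same items.
Console.WriteLine(fromList.SequenceEqual(listInput)); // True
```

Because the caller's array is now passed through by reference, this pattern is only safe when the receiving code does not mutate the array, which is what the review summary states for the internal methods.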
When the caller already passes a `string[]` (the common case, e.g. `new[] { "title", "body" }`), the `isinst` type check succeeds, the `??` operator short-circuits, and no copy is made. The `isinst` instruction is negligible compared to the allocation it prevents.
Energy Efficiency Evidence
Proxy metric: heap allocation count per grouped query call (fewer allocations → less GC pressure → less CPU energy for collection).
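One rough way to observe this proxy metric locally is to compare allocated bytes for the two expressions. The sketch below is an assumption-laden illustration, not the measurement used for this PR: `AllocatedBy` is a hypothetical helper, the loop count of 1000 is arbitrary, and it relies on `GC.GetAllocatedBytesForCurrentThread()`, which is available on .NET Core 3.0 and later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: bytes allocated on the current thread while running the action.
static long AllocatedBy(Action action)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    action();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

IEnumerable<string> fields = new[] { "title", "body" };
string[] sink = Array.Empty<string>();

// Old expression: always copies, even when the input is already an array.
Action copyAlways = () =>
{
    for (int i = 0; i < 1000; i++) sink = fields.ToArray();
};

// New expression: reuses the array when the input is a string[].
Action reuseWhenArray = () =>
{
    for (int i = 0; i < 1000; i++) sink = fields as string[] ?? fields.ToArray();
};

// Warm up both paths so one-time runtime work is not counted.
copyAlways();
reuseWhenArray();

long copyBytes = AllocatedBy(copyAlways);
long reuseBytes = AllocatedBy(reuseWhenArray);

Console.WriteLine($"ToArray() always:       {copyBytes} bytes");
Console.WriteLine($"as string[] ?? ToArray: {reuseBytes} bytes");
```

For rigorous numbers, a benchmarking harness with a memory diagnoser would be a better fit than this hand-rolled loop.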
`GroupedAnd(IEnumerable<string>, IExamineValue[])` | `string[]` copy | `string[]`
`GroupedOr(IEnumerable<string>, IExamineValue[])` | `string[]` copy | `string[]`
`GroupedNot(IEnumerable<string>, IExamineValue[])` | `string[]` copy | `string[]`
`INestedQuery.Grouped*` overloads | `string[]` copy each | `string[]`
In a search application executing 100 grouped queries per second with `string[]` fields input, this eliminates ~900 short-lived heap objects per second, directly reducing GC collection frequency.
Limitation: This is a micro-optimisation. The allocation is small and short-lived. Impact is proportional to query throughput and the percentage of callers that already use `string[]`.
Analogous change: PR #475 applies the same pattern to `LuceneQuery.cs`. This PR covers `LuceneSearchQueryBase.cs`, the complementary class used directly by `LuceneSearchQuery`.
GSF Context
Hardware Efficiency: eliminate provably unnecessary work. The `.ToArray()` copy exists only to satisfy a type constraint that is already satisfied when the caller passes a `string[]`; removing the copy lets the CPU skip one object header write and one memcpy per call.
Trade-offs
`fields as string[] ?? fields.ToArray()` adds one `isinst` instruction: undetectable overhead, faster than the allocation it replaces.
Reproducibility
dotnet build src/Examine.sln --configuration Release
dotnet test src/Examine.Test/Examine.Test.csproj -f net8.0 --configuration Release
Test Status
✅ Build: 0 errors, 3 pre-existing framework warnings (net6.0 EOL, unrelated to this change).
✅ Tests: 147 passed / 2 skipped as expected (net8.0).