⚡ Bolt: [performance improvement] Optimize calculate_urgency and categories in priority_engine#828
⚡ Bolt: [performance improvement] Optimize calculate_urgency and categories in priority_engine#828RohanExploit wants to merge 1 commit into
Conversation
…atching - Rename _calculate_urgency to calculate_urgency to maintain public accessibility. - Remove arbitrary early break in _detect_categories to ensure proper sorting flow. - Replace slow manual `for k in keywords:` looping with a faster `any(k in text for k in keywords)` substring check before invoking pre-compiled regular expressions in `calculate_urgency`.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
✅ Deploy Preview for fixmybharat canceled.
|
🙏 Thank you for your contribution, @RohanExploit!PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken. |
📝 WalkthroughWalkthroughThe PR exposes ChangesPriority Engine Urgency Method Exposure and Optimization
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR refactors and optimizes PriorityEngine urgency/category scoring to improve hot-path performance and fix category ranking accuracy, and updates local benchmark scripts to call the renamed urgency method.
Changes:
- Renames
_calculate_urgencytocalculate_urgencyand updates in-repo call sites. - Optimizes urgency scoring by pre-filtering with
any(k in text for k in keywords)before runningregex.search(...). - Removes the category keyword-match count cap so category sorting reflects full counts.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| backend/priority_engine.py | Renames urgency method, adds substring pre-filter before regex search, and removes category count cap for more accurate ranking. |
| backend/tests/benchmark_urgency.py | Updates benchmark script to call calculate_urgency instead of _calculate_urgency. |
| backend/tests/benchmark_urgency_unoptimized.py | Updates unoptimized benchmark script to call calculate_urgency instead of _calculate_urgency. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def benchmark(iterations=10000): | ||
| start_time = time.perf_counter() | ||
| for _ in range(iterations): | ||
| # We only benchmark _calculate_urgency. We give it a base severity of 10. | ||
| priority_engine._calculate_urgency(sample_text, 10) | ||
| # We only benchmark calculate_urgency. We give it a base severity of 10. | ||
| priority_engine.calculate_urgency(sample_text, 10) | ||
| end_time = time.perf_counter() |
| @@ -43,7 +43,7 @@ def benchmark(iterations=10000): | |||
| pr = cProfile.Profile() | |||
| pr.enable() | |||
| for _ in range(20000): | |||
| priority_engine._calculate_urgency(sample_text, 10) | |||
| priority_engine.calculate_urgency(sample_text, 10) | |||
| pr.disable() | |||
| def benchmark(iterations=10000): | ||
| start_time = time.perf_counter() | ||
| for _ in range(iterations): | ||
| priority_engine._calculate_urgency(sample_text, 10) | ||
| priority_engine.calculate_urgency(sample_text, 10) | ||
| end_time = time.perf_counter() |
| if __name__ == "__main__": | ||
| # Force the engine to clear its cache and simulate the old unoptimized behavior | ||
| # where the keywords list is empty and regex.search is always called. | ||
| from backend.adaptive_weights import adaptive_weights | ||
| priority_engine._regex_cache = [] | ||
| for pattern, weight in adaptive_weights.get_urgency_patterns(): | ||
| priority_engine._regex_cache.append((re.compile(pattern), weight, pattern, [])) | ||
|
|
||
| # Warm up | ||
| priority_engine._calculate_urgency(sample_text, 10) | ||
| priority_engine.calculate_urgency(sample_text, 10) | ||
|
|
||
| print("--- Running Unoptimized Benchmark ---") | ||
| benchmark() | ||
|
|
||
| # Profile to show where time is spent | ||
| print("\n--- Running Profiler ---") | ||
| pr = cProfile.Profile() | ||
| pr.enable() | ||
| for _ in range(5000): | ||
| priority_engine._calculate_urgency(sample_text, 10) | ||
| priority_engine.calculate_urgency(sample_text, 10) | ||
| pr.disable() |
| if any(k in text for k in keywords): | ||
| if regex.search(text): | ||
| urgency += weight | ||
| reasons.append( | ||
| f"Urgency increased by context matching pattern: '{original_pattern}'" | ||
| ) |
| for k in keywords: | ||
| if k in text: | ||
| count += 1 | ||
| # Optimization: Cap count at 5 for sorting to avoid excessive string matching | ||
| if count >= 5: | ||
| break | ||
|
|
||
|
|
||
| if count > 0: |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/priority_engine.py (1)
181-206:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake the new public method self-initializing.
calculate_urgency()is now part of the public surface, but it still relies onanalyze()to lowercasetextand populate_regex_cache. In a fresh process, direct callers currently skip all urgency regex checks because the cache is empty, and mixed-case input can score differently than theanalyze()path. That makes the new API return inconsistent results and also skews the updated benchmarks.💡 Proposed fix
def calculate_urgency(self, text: str, severity_score: int): + self._ensure_weights_cache() + text = text.lower() # Base urgency follows severity urgency = severity_score🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/priority_engine.py` around lines 181 - 206, calculate_urgency currently assumes analyze() has been run (to lowercase input and populate self._regex_cache), so in fresh processes it skips regex checks and gives inconsistent scores; update calculate_urgency to self-initialize by lowercasing the input (text = text.lower()) and ensuring the regex cache is populated (e.g., call self.analyze(text) or a private initializer like self._build_regex_cache() when self._regex_cache is empty) before iterating over self._regex_cache, keeping the existing weighting, reason messages, and final min(100, urgency) behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/tests/benchmark_urgency_unoptimized.py`:
- Line 28: Replace the no-op f-string used in the print call to satisfy Ruff
F541: locate the print statement currently written as print(f"Benchmark:
calculate_urgency") (in the benchmark_urgency_unoptimized.py test) and change it
to a normal string print("Benchmark: calculate_urgency") so the f prefix is
removed.
In `@backend/tests/benchmark_urgency.py`:
- Line 28: Replace the no-op f-string in the test output by removing the
unnecessary f prefix: change the call to print in
backend/tests/benchmark_urgency.py that currently uses print(f"Benchmark:
calculate_urgency") to a normal string literal print("Benchmark:
calculate_urgency") so Ruff F541 is resolved; locate the print statement in the
benchmark_urgency module and update it accordingly.
---
Outside diff comments:
In `@backend/priority_engine.py`:
- Around line 181-206: calculate_urgency currently assumes analyze() has been
run (to lowercase input and populate self._regex_cache), so in fresh processes
it skips regex checks and gives inconsistent scores; update calculate_urgency to
self-initialize by lowercasing the input (text = text.lower()) and ensuring the
regex cache is populated (e.g., call self.analyze(text) or a private initializer
like self._build_regex_cache() when self._regex_cache is empty) before iterating
over self._regex_cache, keeping the existing weighting, reason messages, and
final min(100, urgency) behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 662ba6a3-3534-4481-8d3b-8421ba3fa1eb
📒 Files selected for processing (3)
backend/priority_engine.pybackend/tests/benchmark_urgency.pybackend/tests/benchmark_urgency_unoptimized.py
| avg_time_ms = (total_time / iterations) * 1000 | ||
|
|
||
| print(f"Benchmark: _calculate_urgency") | ||
| print(f"Benchmark: calculate_urgency") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show relevant lines from the target file
if [ -f backend/tests/benchmark_urgency_unoptimized.py ]; then
echo "== backend/tests/benchmark_urgency_unoptimized.py (lines 15-40) =="
nl -ba backend/tests/benchmark_urgency_unoptimized.py | sed -n '15,40p'
else
echo "File not found: backend/tests/benchmark_urgency_unoptimized.py"
fi
# Check Ruff config files
echo "== Ruff config candidates =="
ls -1 pyproject.toml ruff.toml .ruff.toml setup.cfg tox.ini 2>/dev/null || true
# Search for F541 mentions in repo
echo "== Searching for F541 in Ruff config =="
rg -n "F541" -S . || true
# If pyproject.toml exists, show the Ruff settings block area (small)
if [ -f pyproject.toml ]; then
echo "== pyproject.toml: searching for Ruff section =="
rg -n "\\[tool\\.ruff\\]" pyproject.toml || true
echo "== pyproject.toml: searching for rule selection =="
rg -n "select\\s*=|ignore\\s*=|lint\\.select|lint\\.ignore|F541" pyproject.toml || true
fiRepository: RohanExploit/VishwaGuru
Length of output: 195
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="backend/tests/benchmark_urgency_unoptimized.py"
echo "== Existence check =="
ls -l "$FILE" 2>/dev/null || echo "File not found: $FILE"
echo "== Showing lines 20-35 (or nearest available) =="
if [ -f "$FILE" ]; then
# Add line numbers without nl
awk 'NR>=20 && NR<=35 {printf "%5d\t%s\n", NR, $0}' "$FILE" || true
echo "== Exact line 28 =="
awk 'NR==28 {print "28\t"$0}' "$FILE" || true
fi
echo "== Ruff config candidates =="
ls -1 pyproject.toml ruff.toml .ruff.toml setup.cfg tox.ini 2>/dev/null || true
echo "== Search for F541 in repo =="
rg -n "F541" -S . || true
# If pyproject.toml exists, look for ruff config and select/ignore rules
if [ -f pyproject.toml ]; then
echo "== pyproject.toml: ruff sections (tool.ruff) =="
rg -n "\\[tool\\.ruff(\\.|\\])" pyproject.toml || true
echo "== pyproject.toml: select/ignore/extend-select/extend-ignore occurrences =="
rg -n "select\\s*=|ignore\\s*=|extend-select\\s*=|extend-ignore\\s*=" pyproject.toml || true
fi
# Also check ruff.toml if present
if [ -f ruff.toml ]; then
echo "== ruff.toml: select/ignore occurrences =="
rg -n "select\\s*=|ignore\\s*=|extend-select\\s*=|extend-ignore\\s*=" ruff.toml || true
fi
if [ -f .ruff.toml ]; then
echo "== .ruff.toml: select/ignore occurrences =="
rg -n "select\\s*=|ignore\\s*=|extend-select\\s*=|extend-ignore\\s*=" .ruff.toml || true
fiRepository: RohanExploit/VishwaGuru
Length of output: 1315
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== ruff.toml (relevant sections) =="
if [ -f ruff.toml ]; then
# Show first ~80 lines to include select list and ignore list
sed -n '1,120p' ruff.toml
echo "== Search ruff.toml for F541/F5 =="
rg -n "F541|F5" ruff.toml || true
else
echo "ruff.toml not found"
fiRepository: RohanExploit/VishwaGuru
Length of output: 1253
🌐 Web query:
Ruff rule F541 f-string without placeholders enabled by select ["F"] flake8-fstrings
💡 Result:
Ruff rule F541, titled f-string-missing-placeholders, is a diagnostic rule derived from Pyflakes that identifies f-strings (literal string interpolation) that do not contain any placeholder expressions (e.g., f"Hello, world!") [1]. Key details about rule F541: What it does: It flags f-strings that lack placeholders, as these are unnecessary and may confuse readers who expect dynamic content, or might indicate a programmer error where a placeholder was intended but forgotten [1]. Recommended fix: Replace the f-string with a standard string literal (e.g., change f"Hello" to "Hello") [1]. Context and Behavior: - Origin: The rule is a first-party implementation in Ruff that matches the behavior of the same-named rule found in Pyflakes [1]. - Autofix status: While Ruff provides an automatic fix for this rule, there has been significant internal discussion regarding whether this fix should be considered unsafe [2][3]. Some users and contributors have argued that an f-string without placeholders might be intentional (though stylistic) and that an automatic fix could be risky if the developer had specific, albeit misplaced, intent [2][3]. - Concatenation: To maintain compatibility with Pyflakes, Ruff does not flag f-strings without placeholders if they are part of an implicit string concatenation that includes other f-strings that do contain placeholders [1][4]. Configuration: As part of the "F" (Pyflakes) category, F541 is enabled by default in Ruff [5]. You do not need to specifically select it if you have "F" enabled [5]. If you wish to manage its behavior (e.g., regarding the safety of its autofix), you can configure this in your project's Ruff configuration file using the extend-unsafe-fixes setting [2].
Citations:
- 1: https://docs.astral.sh/ruff/rules/f-string-missing-placeholders/
- 2: Do not autofix F541
f-string without any placeholders/ mark as risky astral-sh/ruff#22620 - 3: [
pyflakes] mark fix as unsafe (F541) astral-sh/ruff#22650 - 4: F541 does not find violations in concatenations astral-sh/ruff#10885
- 5: https://docs.astral.sh/ruff/rules/
Remove no-op f prefix to fix Ruff F541
backend/tests/benchmark_urgency_unoptimized.py:28 uses print(f"Benchmark: calculate_urgency") with no placeholders; Ruff’s F selection enables F541, so lint can fail—change it to print("Benchmark: calculate_urgency").
🧰 Tools
🪛 Ruff (0.15.14)
[error] 28-28: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/tests/benchmark_urgency_unoptimized.py` at line 28, Replace the no-op
f-string used in the print call to satisfy Ruff F541: locate the print statement
currently written as print(f"Benchmark: calculate_urgency") (in the
benchmark_urgency_unoptimized.py test) and change it to a normal string
print("Benchmark: calculate_urgency") so the f prefix is removed.
| avg_time_ms = (total_time / iterations) * 1000 | ||
|
|
||
| print(f"Benchmark: _calculate_urgency") | ||
| print(f"Benchmark: calculate_urgency") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="backend/tests/benchmark_urgency.py"
echo "== File: $file =="
echo "-- Lines around 28 --"
nl -ba "$file" | sed -n '15,40p'
echo "-- Exact line 28 --"
nl -ba "$file" | sed -n '28,28p'
echo "-- Search for other F541-like occurrences (f\" with no {{) --"
rg -n 'print\(f"[^"{]*"\)' backend/tests/benchmark_urgency.py || trueRepository: RohanExploit/VishwaGuru
Length of output: 197
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="backend/tests/benchmark_urgency.py"
echo "== File: $file =="
echo "-- Lines 20-35 (with line numbers) --"
awk 'NR>=20 && NR<=35 {printf "%d:%s\n", NR, $0}' "$file"
echo "-- Exact line 28 --"
awk 'NR==28 {print NR ":" $0}' "$file"
echo "-- Does Ruff F541 mention this file/line? (search ruff config or cached reports) --"
rg -n "F541" -S . || true
echo "-- Search for print(f\"...\" pattern in this file --"
rg -n 'print\(f"[^"]*"\)' "$file" || trueRepository: RohanExploit/VishwaGuru
Length of output: 1159
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for Ruff config files =="
fd -t f -e toml -e cfg -e ini -e pyproject.toml -a 'ruff' . 2>/dev/null || true
fd -t f -a 'pyproject.toml' . || true
echo "== Locate likely Ruff config (pyproject.toml/ruff.toml) =="
fd -t f 'pyproject.toml' .
fd -t f 'ruff.toml' .
fd -t f 'setup.cfg' .
fd -t f 'tox.ini' .
echo "== Search for F541 in the repo =="
rg -n "F541" . || true
echo "== Search for 'ruff' tool config mentioning F541/unsafe rules =="
rg -n "tool\.ruff|ruff\.|select|ignore|extend-select|per-file-ignores" pyproject.toml ruff.toml setup.cfg tox.ini 2>/dev/null || true
echo "== Show relevant Ruff config sections (if pyproject.toml exists) =="
if [ -f "pyproject.toml" ]; then
rg -n "tool\.ruff" pyproject.toml || true
sed -n '1,200p' pyproject.toml | head -n 200
fiRepository: RohanExploit/VishwaGuru
Length of output: 652
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="ruff.toml"
echo "== ruff.toml head =="
sed -n '1,120p' "$file"
echo "== ruff.toml lint section (select/ignore/per-file-ignores) =="
rg -n "^(select|ignore|lint\.per-file-ignores|\\[lint\\.per-file-ignores\\])" "$file" || true
sed -n '1,80p' "$file"
sed -n '80,140p' "$file" || true
echo "== Does ruff.toml explicitly mention F541? =="
rg -n "F541" "$file" || trueRepository: RohanExploit/VishwaGuru
Length of output: 2679
🏁 Script executed:
#!/bin/bash
set -euo pipefail
if command -v ruff >/dev/null 2>&1; then
ruff --version
echo "== Ruff findings (exact file) =="
ruff check backend/tests/benchmark_urgency.py --config ruff.toml
echo "== Ruff findings (force F541 only) =="
ruff check backend/tests/benchmark_urgency.py --config ruff.toml --select F541
else
echo "ruff not installed in sandbox"
fiRepository: RohanExploit/VishwaGuru
Length of output: 753
Remove the no-op f prefix flagged by Ruff (F541)
Ruff reports F541 (f-string without placeholders) at backend/tests/benchmark_urgency.py:28; change print(f"Benchmark: calculate_urgency") to print("Benchmark: calculate_urgency").
🧰 Tools
🪛 Ruff (0.15.14)
[error] 28-28: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/tests/benchmark_urgency.py` at line 28, Replace the no-op f-string in
the test output by removing the unnecessary f prefix: change the call to print
in backend/tests/benchmark_urgency.py that currently uses print(f"Benchmark:
calculate_urgency") to a normal string literal print("Benchmark:
calculate_urgency") so Ruff F541 is resolved; locate the print statement in the
benchmark_urgency module and update it accordingly.
There was a problem hiding this comment.
1 issue found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/tests/benchmark_urgency.py">
<violation number="1" location="backend/tests/benchmark_urgency.py:22">
P2: This benchmark calls `calculate_urgency` without first populating `_regex_cache` (production loads adaptive weights before this path), so it effectively measures an empty-loop no-op rather than the optimized regex matching. Additionally, production normalizes text to lowercase in `analyze()` before calling this method — the benchmark should do the same to produce representative timings.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| # We only benchmark _calculate_urgency. We give it a base severity of 10. | ||
| priority_engine._calculate_urgency(sample_text, 10) | ||
| # We only benchmark calculate_urgency. We give it a base severity of 10. | ||
| priority_engine.calculate_urgency(sample_text, 10) |
There was a problem hiding this comment.
P2: This benchmark calls calculate_urgency without first populating _regex_cache (production loads adaptive weights before this path), so it effectively measures an empty-loop no-op rather than the optimized regex matching. Additionally, production normalizes text to lowercase in analyze() before calling this method — the benchmark should do the same to produce representative timings.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/benchmark_urgency.py, line 22:
<comment>This benchmark calls `calculate_urgency` without first populating `_regex_cache` (production loads adaptive weights before this path), so it effectively measures an empty-loop no-op rather than the optimized regex matching. Additionally, production normalizes text to lowercase in `analyze()` before calling this method — the benchmark should do the same to produce representative timings.</comment>
<file context>
@@ -18,22 +18,22 @@
- # We only benchmark _calculate_urgency. We give it a base severity of 10.
- priority_engine._calculate_urgency(sample_text, 10)
+ # We only benchmark calculate_urgency. We give it a base severity of 10.
+ priority_engine.calculate_urgency(sample_text, 10)
end_time = time.perf_counter()
</file context>
Optimize data structures and logically prevent app flow breakage in PriorityEngine by fixing micro-optimizations that violated rules, and replacing a manual loop over keywords with an
any()pre-filter before executing the compiled regex withre.search(). Removes an artificial count truncation that impacted accurate priority sorting.PR created automatically by Jules for task 537533420952539318 started by @RohanExploit
Summary by cubic
Optimize the urgency calculation by pre-filtering keywords with
any()before running compiled regex, and remove the category count cap for accurate sorting. Also rename_calculate_urgencytocalculate_urgencyand update benchmarks to speed up the hot path and fix category ranking.Refactors
_calculate_urgencytocalculate_urgency; updated all call sites and benchmarks.any(k in text for k in keywords)beforeregex.search().Bug Fixes
_detect_categoriesto ensure correct category counts and sorting.Written for commit 62ab070. Summary will update on new commits.
Summary by CodeRabbit
Performance
Improvements