perf(scraper): speed up gamelist scrape progress and writes#868
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughThis PR adds scraped-count caching in the API, batch scrape-write contracts and a bulk implementation in MediaDB, refactors sentinel-based scraped-media queries, introduces SQL tracing and benchmarks, and rewrites the gamelist scraper to pre-parse files and perform index-driven companion matching with batched writes and updated tests. ChangesScrape Write Batching and Query Optimization
GameList Scraper Index-Driven Companion Optimization
Build Configuration
🎯 4 (Complex) | ⏱️ ~75 minutes Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
pkg/database/scraper/gamelistxml/scraper.go (1)
314-323: ⚡ Quick winMove the new type/const declarations into the top declaration block.
Lines 314-323 and Lines 517-520 add new types/constants after function bodies, which makes the file layout inconsistent with the repo rule and harder to scan.
As per coding guidelines, "Define Go types and consts near the top of the file, before functions and methods".
Also applies to: 517-520
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/database/scraper/gamelistxml/scraper.go` around lines 314 - 323, The new types parsedGamelistFile and parsedGamelistSystem (and the other new type/const group added later) are declared after function bodies; move those type/const declarations up into the file's top declaration block where other types and consts are defined (i.e., before any functions/methods) so they follow the repo guideline of placing Go types/consts near the top and keep the file layout consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/database/mediadb/sql_trace_bench_test.go`:
- Around line 38-41: Replace the use of sync.Mutex for the mu field with
syncutil.Mutex (i.e., change "mu sync.Mutex" to "mu syncutil.Mutex") and update
imports to reference the syncutil package instead of the stdlib sync; keep the
rest of the struct (byStmt, stmtByHandle) unchanged so existing code using
mu.Lock/Unlock still compiles against syncutil.Mutex.
In `@pkg/database/scraper/gamelistxml/scraper_test.go`:
- Around line 2562-2572: The helper drainBufferedUpdates currently reads from ch
without checking the receive-ok flag, which causes an infinite append of
zero-value scraper.ScrapeUpdate when the channel is closed; update the loop in
drainBufferedUpdates to use the two-value receive (e.g., update, ok := <-ch) and
return the collected updates immediately if ok is false, otherwise append the
received update, preserving the existing select/default behavior to still return
when no buffered items exist.
In `@pkg/database/scraper/gamelistxml/scraper.go`:
- Around line 1858-1864: When handling .slug files in the slug-detection block,
normalize the derived slug (currently computed as slug :=
strings.TrimSuffix(filename, filepath.Ext(filename))) to the same case as the
map keys before lookup; replace lookups against indexes.AllTitlesBySlug and
indexes.TitlesBySlug with a lowercased (or otherwise normalized) slug variable
(e.g., slugKey := strings.ToLower(slug)) so filenames like "MySlug.slug"
correctly match and avoid spuriously adding entries to MissingTitleSlugs.
---
Nitpick comments:
In `@pkg/database/scraper/gamelistxml/scraper.go`:
- Around line 314-323: The new types parsedGamelistFile and parsedGamelistSystem
(and the other new type/const group added later) are declared after function
bodies; move those type/const declarations up into the file's top declaration
block where other types and consts are defined (i.e., before any
functions/methods) so they follow the repo guideline of placing Go types/consts
near the top and keep the file layout consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e7bd55f6-0c85-4dba-8442-c455ce66b552
📒 Files selected for processing (14)
pkg/api/methods/media_scrape.gopkg/api/methods/media_scrape_test.gopkg/database/database.gopkg/database/mediadb/mediadb.gopkg/database/mediadb/sql_plan_test.gopkg/database/mediadb/sql_scraper.gopkg/database/mediadb/sql_scraper_bench_test.gopkg/database/mediadb/sql_scraper_test.gopkg/database/mediadb/sql_trace_bench_test.gopkg/database/mediadb/sql_trace_runtime.gopkg/database/mediadb/sql_trace_runtime_stub.gopkg/database/scraper/gamelistxml/scraper.gopkg/database/scraper/gamelistxml/scraper_test.goscripts/tasks/mister.yml
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Summary
Validation
Summary by CodeRabbit
New Features
Performance
Refactor
Tests
Chores