Implement getAnalyzedEntityIds() for efficient batch assembly #29

Open
jjroelofs wants to merge 2 commits into feature/centralized-batch-processing from jur/feature/centralized-batch-processing/#28-implement-getAnalyzedEntityIds

Conversation

@jjroelofs

Contributor

Summary

Fixes #28. Implements the new getAnalyzedEntityIds() method from BatchableAnalyzerInterface (dxpr/analyze#18).

The centralized batch system previously called hasResults() on every entity during batch assembly. Each call went through generateContentHash(), which renders the full entity, so assembly cost grew with the number of entities and caused 504 timeouts on large content sets.

The new method returns entity IDs with existing results via a direct DB query on analyze_ai_sentiments_results, avoiding entity rendering entirely during assembly.

Changes

  • Add getAnalyzedEntityIds() to AISentimentsAnalyzer plugin, delegating to storage
  • Add getAnalyzedEntityIds() to SentimentsStorageService with a DISTINCT entity_id query
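As a rough illustration of the two changes above, here is a minimal, language-neutral sketch in Python with SQLite. It is not the PR's PHP code. The table name and the `entity_id` column come from the description; the `entity_type`, `sentiment_id`, and `score` columns, the `entity_type` parameter, and the Python class names are assumptions made only for this example.

```python
import sqlite3


class SentimentsStorage:
    """Hypothetical stand-in for SentimentsStorageService."""

    def __init__(self, conn):
        self.conn = conn

    def get_analyzed_entity_ids(self, entity_type):
        # One DISTINCT query replaces a per-entity hasResults() check,
        # so no entity is loaded or rendered here.
        rows = self.conn.execute(
            "SELECT DISTINCT entity_id FROM analyze_ai_sentiments_results "
            "WHERE entity_type = ? ORDER BY entity_id",
            (entity_type,),
        ).fetchall()
        return [row[0] for row in rows]


class AISentimentsAnalyzer:
    """Hypothetical stand-in for the analyzer plugin; it only delegates."""

    def __init__(self, storage):
        self.storage = storage

    def get_analyzed_entity_ids(self, entity_type):
        return self.storage.get_analyzed_entity_ids(entity_type)


# Assumed schema: one row per (entity, sentiment) result.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analyze_ai_sentiments_results ("
    "entity_type TEXT, entity_id INTEGER, sentiment_id TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO analyze_ai_sentiments_results VALUES (?, ?, ?, ?)",
    [
        ("node", 1, "trust", 0.4),
        ("node", 1, "joy", 0.7),  # second row for the same entity
        ("node", 3, "trust", -0.2),
        ("media", 9, "trust", 0.1),
    ],
)

analyzer = AISentimentsAnalyzer(SentimentsStorage(conn))
analyzed_ids = analyzer.get_analyzed_entity_ids("node")
print(analyzed_ids)
```

DISTINCT matters if the results table holds several rows per entity (for example, one per sentiment, as assumed here): each entity ID is then returned once regardless of how many results it has.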

Test plan

  • Run batch on a large content set, verify no timeout during form submission
  • Verify already-analyzed entities are correctly excluded from non-force batch runs
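The second check in the test plan can be pictured with a small sketch of how batch assembly might use the returned IDs. This is an assumption about the centralized batch system's behavior, not code from dxpr/analyze; `assemble_batch` and its `force` parameter are hypothetical names.

```python
def assemble_batch(candidate_ids, analyzed_ids, force=False):
    """Return the entity IDs that should be queued for analysis.

    Hypothetical helper: a force run re-queues everything, while a
    non-force run skips entities that already have stored results.
    """
    if force:
        return list(candidate_ids)
    analyzed = set(analyzed_ids)  # set lookup keeps the filter linear
    return [eid for eid in candidate_ids if eid not in analyzed]


candidates = [1, 2, 3, 4, 5]
already_analyzed = [1, 3]  # e.g. the result of getAnalyzedEntityIds()

pending = assemble_batch(candidates, already_analyzed)
forced = assemble_batch(candidates, already_analyzed, force=True)
print(pending, forced)
```

Under these assumptions, a non-force run queues only entities 2, 4, and 5, and a force run queues all five.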

Jurriaan Roelofs added 2 commits May 11, 2026 13:05
Add getAnalyzedEntityIds() to AISentimentsAnalyzer and storage service,
returning entity IDs with existing results via a direct DB query on
analyze_ai_sentiments_results. This avoids rendering entities during
batch assembly, preventing 504 timeouts on large content sets.

Closes #28
