feat: add DB indexes and feed caching for performance#55
Conversation
- DB indexes on (status, visibility) and (-published_at) — migration 0018 - LocMemCache configured with 30s TTL for unfiltered public feed page 1 - select_related + prefetch_related on public feed queryset Closes #31
sadykovIsmail
left a comment
There was a problem hiding this comment.
Code review:
- Composite index on
(status, visibility)is exactly what the feed query filters on — this should cut query time significantly with a large posts table - Index on
(-published_at)supports the default ordering — good, this is the most-hit sort - Cache TTL of 30s is conservative — short enough to feel near-real-time, long enough to absorb traffic spikes
- Cache only applies to
not request.query_params— so search/ordering/pagination bypass it. Correct, those can't be cached with a single key prefetch_related('reactions')for reaction counts — prevents N+1 per post in the list- LocMemCache is per-process — fine for single server, would need Redis for multi-instance. Acceptable for now
The cache key "public_feed_page_1" is hardcoded but only ever used in one place, so that's fine.
✅ Good performance investment.
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 29314140 | Triggered | Generic Password | f3d2c3e | app/author/tests/test_feed_performance.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
There was a problem hiding this comment.
Pull request overview
This PR introduces performance optimizations for the public feed by adding database indexes to support common feed queries and adding a short-TTL in-memory cache for the first page of the feed response.
Changes:
- Add
LocMemCacheconfiguration and aPUBLIC_FEED_CACHE_TTLsetting. - Cache the public feed’s first page response when no query params are present.
- Add DB indexes for
(status, visibility)and(-published_at); prefetchreactionsfor reaction counts; add performance-oriented tests.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| app/core/settings.py | Adds local-memory cache backend config and a feed cache TTL setting. |
| app/author/views.py | Adds feed caching in PublicPostListView.list() and prefetches reactions in the feed queryset. |
| app/author/tests/test_feed_performance.py | Adds new tests intended to cover feed indexing/caching/pagination behavior. |
| app/author/models.py | Declares new Meta.indexes on BlogPostModel for feed query performance. |
| app/author/migrations/0018_blogpost_indexes.py | Applies DB indexes via migrations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| CACHES = { | ||
| "default": { | ||
| "BACKEND": "django.core.cache.backends.locmem.LocMemCache", | ||
| "LOCATION": "blog-app-cache", |
There was a problem hiding this comment.
LocMemCache is process-local and won’t be shared across multiple Gunicorn workers or multiple pods/instances, so the public feed cache may provide little benefit (and inconsistent hit rates) in typical production deployments. Consider using a shared cache backend (e.g., Redis/Memcached) or making the cache backend configurable via environment so production can opt into a shared cache.
| CACHES = { | |
| "default": { | |
| "BACKEND": "django.core.cache.backends.locmem.LocMemCache", | |
| "LOCATION": "blog-app-cache", | |
| CACHE_BACKEND = os.getenv( | |
| "DJANGO_CACHE_BACKEND", | |
| "django.core.cache.backends.locmem.LocMemCache", | |
| ) | |
| CACHE_LOCATION = os.getenv( | |
| "DJANGO_CACHE_LOCATION", | |
| "blog-app-cache", | |
| ) | |
| CACHES = { | |
| "default": { | |
| "BACKEND": CACHE_BACKEND, | |
| "LOCATION": CACHE_LOCATION, |
| cache_key = "public_feed_page_1" | ||
| cached = cache.get(cache_key) | ||
| if cached is not None: | ||
| return Response(cached) | ||
| response = super().list(request, *args, **kwargs) | ||
| ttl = getattr(dj_settings, 'PUBLIC_FEED_CACHE_TTL', 30) | ||
| cache.set(cache_key, response.data, ttl) | ||
| return response |
There was a problem hiding this comment.
Caching response.data from a paginated DRF response will also cache the already-rendered pagination links (e.g., the absolute next URL built from the first request’s host/scheme). Subsequent requests with a different Host header or HTTPS scheme can receive incorrect pagination URLs. Consider including host/scheme in the cache key, or caching only the serialized results and rebuilding pagination links per request.
| return BlogPostModel.objects.filter( | ||
| status=BlogPostModel.Status.PUBLISHED, | ||
| visibility=BlogPostModel.Visibility.PUBLIC, | ||
| ).select_related('author', 'user') | ||
| ).select_related('author', 'user').prefetch_related('reactions') |
There was a problem hiding this comment.
prefetch_related('reactions') fetches all Reaction rows for the page, which can be expensive for highly-reacted posts. Since the API only needs reaction_count, consider annotating a Count('reactions') on the queryset and serializing that count, avoiding materializing the full reaction objects.
| def test_feed_response_correct_after_cache(self): | ||
| create_post(self.user, self.author, title="Cached Post") | ||
| res = self.client.get(PUBLIC_FEED_URL) | ||
| self.assertEqual(res.status_code, status.HTTP_200_OK) | ||
| self.assertEqual(res.data["count"], 1) | ||
|
|
||
| def test_cursor_pagination_returns_results(self): | ||
| for i in range(5): | ||
| create_post(self.user, self.author, title=f"Post {i}") | ||
| res = self.client.get(PUBLIC_FEED_URL) | ||
| self.assertEqual(res.status_code, status.HTTP_200_OK) | ||
| self.assertGreaterEqual(len(res.data["results"]), 1) |
There was a problem hiding this comment.
PR description mentions asserting query counts with assertNumQueries, but this test module doesn’t currently assert query counts or verify that the second request is served from cache. If query-count regression protection is a goal, add a test that makes two requests and asserts the cached path avoids DB queries (or at least reduces them).
| models.Index(fields=['status', 'visibility']), | ||
| models.Index(fields=['-published_at']), |
There was a problem hiding this comment.
The model Meta.indexes don’t specify name=..., but the migration adds indexes with explicit names. This mismatch can cause makemigrations to detect changes later (index renames/drops/re-adds) and generate extra churn migrations. Align by specifying the same name on the models.Index(...) declarations in BlogPostModel.Meta, or regenerate the migration using makemigrations so the model and migration agree on index names.
| models.Index(fields=['status', 'visibility']), | |
| models.Index(fields=['-published_at']), | |
| models.Index(fields=['status', 'visibility'], name='blogpost_status_visibility_idx'), | |
| models.Index(fields=['-published_at'], name='blogpost_published_at_idx'), |
| models.Index(fields=['status', 'visibility']), | ||
| models.Index(fields=['-published_at']), | ||
| ] | ||
|
|
There was a problem hiding this comment.
For the feed query (status + visibility filter with ORDER BY -published_at), the current pair of indexes may still require an extra sort. A composite index that includes published_at after (status, visibility) (optionally as a partial index for published/public) is more likely to support an index-only/order-preserving scan for this endpoint.
| models.Index(fields=['status', 'visibility']), | |
| models.Index(fields=['-published_at']), | |
| ] | |
| models.Index(fields=['status', 'visibility', '-published_at']), | |
| ] |
| if not request.query_params: | ||
| cache_key = "public_feed_page_1" | ||
| cached = cache.get(cache_key) | ||
| if cached is not None: | ||
| return Response(cached) | ||
| response = super().list(request, *args, **kwargs) | ||
| ttl = getattr(dj_settings, 'PUBLIC_FEED_CACHE_TTL', 30) | ||
| cache.set(cache_key, response.data, ttl) | ||
| return response | ||
| return super().list(request, *args, **kwargs) | ||
|
|
There was a problem hiding this comment.
The cache is only used when there are no query params. This bypasses caching for requests like ?page=1 or ?page_size=..., even though these are still the first page, and it also diverges from the PR description (“bypass when search/ordering are present”). If the intent is “cache first page unless search/ordering change the result”, consider checking specifically for those params (and optionally treating page=1 as cacheable, with the cache key varying by page_size).
| @@ -0,0 +1,56 @@ | |||
| from django.test import TestCase, override_settings | |||
There was a problem hiding this comment.
override_settings is imported but never used, which will fail linting (flake8 F401) and break CI. Remove the unused import or use it (e.g., to isolate cache configuration for these tests).
| from django.test import TestCase, override_settings | |
| from django.test import TestCase |
| class FeedPerformanceTests(TestCase): | ||
| def setUp(self): | ||
| self.client = APIClient() | ||
| self.user = create_user("perfuser") | ||
| self.author = create_author(self.user) | ||
|
|
There was a problem hiding this comment.
These tests can become order-dependent because the locmem cache persists across test methods: one test may populate public_feed_page_1, causing later tests to receive cached data for a different DB state. Clear the cache in setUp/tearDown (or use override_settings with a unique cache LOCATION) so each test runs in isolation.
| def test_cursor_pagination_returns_results(self): | ||
| for i in range(5): | ||
| create_post(self.user, self.author, title=f"Post {i}") | ||
| res = self.client.get(PUBLIC_FEED_URL) | ||
| self.assertEqual(res.status_code, status.HTTP_200_OK) | ||
| self.assertGreaterEqual(len(res.data["results"]), 1) |
There was a problem hiding this comment.
test_cursor_pagination_returns_results is misleading: PublicPostListView uses PageNumberPagination (StandardPagination), not cursor pagination. Rename the test to match the actual pagination style, or switch the view to cursor pagination if that’s the intended change.
What this does
Two performance improvements for the public feed: database indexes on the most-queried columns, and in-memory caching of the first page.
Changes
(status, visibility)and(-published_at)— directly serves the feed queryLocMemCacheconfigured in settings withPUBLIC_FEED_CACHE_TTL = 30sPublicPostListView.list()caches the first page (no query params) under keypublic_feed_page_1?search=or?ordering=params are presentprefetch_related('reactions')added for reaction countsdjango.test.utils.assertNumQueriesCloses #31