Skip to content
172 changes: 172 additions & 0 deletions adr/003-attendees-endpoint-n-plus-1-elimination.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,172 @@
# ADR-003: Eliminate N+1 Queries in `/attendees` Endpoint

- **Status:** Accepted
- **Date:** 2026-05-25
- **Endpoint:** `GET /api/v1/summits/{id}/attendees`
- **Branch:** `perf/attendees-n-plus-1` (stacked on `hotfix/cache-optimizations`)
- **Related:** ADR-002 (events endpoint, same methodology)

## Context

Profiling `/attendees` with the admin UI's typical expand set
(`expand=tags,notes,manager&relations=member,manager,tags,tickets,notes`)
showed:

- **83 queries / request** (10-row page)
- **DB time 1061ms** dominating a 1182ms total — DB is the bottleneck here,
not the serializer (only 120ms)

Applied the same methodology as ADR-002:

1. Enabled `Server-Timing` instrumentation + `QueryTimingMiddleware` on the
attendees route — both already shipped from the events PR.
2. Re-enabled the SQL pattern logger temporarily, identified the top N+1s,
fixed one pattern per commit.

## Decision

### Reusable infra added in this branch

#### `ParametrizedGetAll` trait — optional `afterQuery` hook

The trait that wraps every `_getAll` endpoint now accepts an optional
`callable $afterQuery = null` parameter. When present, the hook fires
between the data-load step and `$response->toArray()`, receiving the
`PagingResponse` so callers can pre-populate caches or batch-load related
entities before serialization. Backward-compatible; existing callers pass
nothing.

### Targeted fixes

#### Fix 1 — Memoize + batch-preload `Summit::getSpeakerByMember`

**Symptom:** Three `SELECT DISTINCT PresentationSpeaker` patterns firing
~8 times each (≈24 queries).

**Root cause:** `SummitAttendeeSerializer:133` calls
`$summit->getSpeakerByMember($member)` per attendee. Inside,
`getSpeakerByMemberId()` runs THREE separate DQLs (moderator check,
speaker check, assistance check) per call.

**Fix:**

- `Summit` gains `private array $speakerByMemberIdCache` (unannotated;
Doctrine ignores it). `getSpeakerByMemberId()` reads and writes the
cache at every return point.
- New `Summit::preloadSpeakersByMemberIds(array $ids, bool $filter)` runs
the same 3 lookup steps but with `WHERE mb.id IN (:ids)` and populates
the cache for every id (with `null` for members not found). Three batch
queries instead of `N × 3` per-attendee queries.
- Each batch query also `addSelect('mb')` so the speaker's Member is
fetch-joined (avoids a follow-on N+1 once the speaker is loaded).
- `OAuth2SummitAttendeesApiController::getAttendeesBySummit` passes an
`afterQuery` closure that collects the page's attendee member ids and
invokes the preload.

**Files:**
- `app/Models/Foundation/Summit/Summit.php`
- `app/Http/Controllers/Apis/Protected/Summit/OAuth2SummitAttendeesApiController.php`
- `app/Http/Controllers/Apis/Protected/Summit/Traits/ParametrizedGetAll.php`

**Impact:** ~24 queries → 3.

---

#### Fix 2 — Batch-preload Notes, Tickets+Badges, Tags, Member

**Symptom:**
- `SummitAttendeeNote WHERE OwnerID = ?` × 10
- `SummitAttendeeTicket WHERE ... = ?` × 10
- `Tag JOIN SummitAttendee_Tags WHERE SummitAttendeeID = ?` × 10
- `SummitAttendeeBadge WHERE TicketID = ?` × 12 (exposed after tickets loaded)
- `Member WHERE id = ?` × 8

**Root cause:** Each is an EXTRA_LAZY collection / association on the
attendee or its ticket. Iteration during serialization fires one DB load
per attendee or per ticket.

**Fix:** Five batch fetch-join queries in the `afterQuery` closure:

```dql
SELECT a, n FROM SummitAttendee a LEFT JOIN a.notes n WHERE a.id IN (:ids)
SELECT a, t, b FROM SummitAttendee a LEFT JOIN a.tickets t LEFT JOIN t.badge b WHERE a.id IN (:ids)
SELECT a, tg FROM SummitAttendee a LEFT JOIN a.tags tg WHERE a.id IN (:ids)
SELECT a, m FROM SummitAttendee a LEFT JOIN a.member m WHERE a.id IN (:ids)
```

Doctrine's fetch-join (`SELECT a, X`) populates the inverse-side
collection / association so subsequent serializer iterations read from
memory.

**Files:** `app/Http/Controllers/Apis/Protected/Summit/OAuth2SummitAttendeesApiController.php`

**Impact:** ~50 queries → 4.

## Consequences

### Performance — measured on `api2.dev.fnopen.com`

| Metric | Baseline | After all fixes | Δ |
| --------------- | -------- | --------------- | -------- |
| Queries | 83 | **36** | **-57%** |
| DB time | 1061 ms | ~930 ms | -13% |
| Serializer time | 120 ms | ~22 ms | **-82%** |
| Total | 1182 ms | ~1040 ms | -12% |

The query-count reduction is dramatic; the wall-clock reduction is more
modest because per-query latency on the model database is ~25-30 ms.
Each remaining query "costs" almost the same as before, and we still
fire ~36 of them. **DB latency, not N+1 count, is now the dominant
component** — this would be the natural next investigation
(connection-pooling, query batching at the DB level, read replicas).

### What we kept

- Server-Timing header + `QueryTimingMiddleware` from ADR-002.
- The new `afterQuery` hook in `ParametrizedGetAll::_getAll` (reusable
for future endpoints — same pattern would apply to e.g.
`/orders`, `/tickets`).
- Entity-level caches (`Summit::$speakerByMemberIdCache`).

### What we did *not* fix (and why)

- **`PresentationSpeaker` SELECT × 7** — comes from a deeper path
inside the serializer chain after the speakers are loaded. Each
saves ~7 queries / ~6 ms; diminishing returns.
- **`COUNT(MemberID)` × 6** — `belongsToGroup()` checks against
Members other than the current user. Out of scope for this branch.
- **PromoCode-like × 4** — per-ticket discount code lookup. 4-query
pattern; would need a 5th fetch-join in the tickets preload.
- **`SET TRANSACTION` × 3** — connection-level isolation, not
application code.

The wall-clock gain from each of these would be ~10-50 ms. The
remaining DB-latency cost is the limiting factor and is infrastructure,
not application code.

### Risks

- Same risk class as ADR-002: the transient cache properties on entities
(`Summit::$speakerByMemberIdCache`) are correct only as long as
instances stay request-scoped. None of the current code reuses an EM
across requests without `clear()`, so it's safe.
- The `afterQuery` hook is opt-in per caller; the trait remains
backward-compatible for the dozens of other endpoints that use it
without passing the parameter.

## Methodology summary

Same as ADR-002 — see that file for the full write-up. Highlights specific
to this branch:

1. **Different bottleneck profile.** Events was serializer-bound;
attendees is DB-bound. Same instrumentation revealed both — the
methodology is endpoint-agnostic.
2. **Cascading lazy loads.** Removing one N+1 (tickets) exposed another
one downstream (badges per ticket). The pattern logger caught both;
we extended the same preload to fetch-join the badge alongside the
ticket in one query.
3. **The `afterQuery` hook is the right abstraction.** Each endpoint
knows what its serializer will touch; the hook lets it warm those
exact caches without touching the shared trait body or the repository
layer.
Original file line number Diff line number Diff line change
Expand Up @@ -701,7 +701,81 @@ function () {
null,
null,
null,
$params
$params,
// afterQuery hook: batch-preload everything the serializer is about
// to touch per-attendee. Collapses ~38 lazy-load queries per request
// (Notes/Tickets/Tags/Member/Speaker lookups) into 4 batch queries.
function ($data) use ($summit) {
$items = $data->getItems();
$attendeeIds = [];
$memberIds = [];
foreach ($items as $attendee) {
if (method_exists($attendee, 'getId') && $attendee->getId()) {
$attendeeIds[] = $attendee->getId();
}
if (method_exists($attendee, 'getMember')) {
$m = $attendee->getMember();
if ($m !== null && method_exists($m, 'getId') && $m->getId()) {
$memberIds[] = $m->getId();
Comment on lines +716 to +719
}
}
}

// Speaker lookup cache (moderator/speaker/assistance × N members).
if (!empty($memberIds)) {
$summit->preloadSpeakersByMemberIds($memberIds);
}

if (!empty($attendeeIds)) {
$em = \LaravelDoctrine\ORM\Facades\Registry::getManager(
\models\utils\SilverstripeBaseModel::EntityManager
);

// Notes (SummitAttendeeNote) — one batch query covers all attendees.
try {
$em->createQuery(
'SELECT a, n FROM ' . \models\summit\SummitAttendee::class . ' a ' .
'LEFT JOIN a.notes n WHERE a.id IN (:ids)'
)->setParameter('ids', $attendeeIds)->getResult();
} catch (\Exception $ex) {
\Illuminate\Support\Facades\Log::warning('attendees notes preload failed', ['error' => $ex->getMessage()]);
}

// Tickets — fetch-join collection AND the badge on each ticket
// (badges fire one query per ticket otherwise — was 12 per request).
try {
$em->createQuery(
'SELECT a, t, b FROM ' . \models\summit\SummitAttendee::class . ' a ' .
'LEFT JOIN a.tickets t ' .
'LEFT JOIN t.badge b ' .
'WHERE a.id IN (:ids)'
)->setParameter('ids', $attendeeIds)->getResult();
} catch (\Exception $ex) {
\Illuminate\Support\Facades\Log::warning('attendees tickets preload failed', ['error' => $ex->getMessage()]);
}

// Tags ManyToMany — fetch-join populates the inverse collection.
try {
$em->createQuery(
'SELECT a, tg FROM ' . \models\summit\SummitAttendee::class . ' a ' .
'LEFT JOIN a.tags tg WHERE a.id IN (:ids)'
)->setParameter('ids', $attendeeIds)->getResult();
} catch (\Exception $ex) {
\Illuminate\Support\Facades\Log::warning('attendees tags preload failed', ['error' => $ex->getMessage()]);
}

// Member fetch-join so $attendee->getMember() returns an
// initialized entity instead of triggering a per-attendee lazy load.
try {
$em->createQuery(
'SELECT a, m FROM ' . \models\summit\SummitAttendee::class . ' a ' .
'LEFT JOIN a.member m WHERE a.id IN (:ids)'
)->setParameter('ids', $attendeeIds)->getResult();
} catch (\Exception $ex) {
\Illuminate\Support\Facades\Log::warning('attendees member preload failed', ['error' => $ex->getMessage()]);
}
}
}
);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,7 @@ public function _getAll
callable $afterQuery = null
)
{
Session::put('timing.controller_start', microtime(true));
return $this->processRequest(function () use (
$getFilterRules,
$getFilterValidatorRules,
Expand Down Expand Up @@ -150,6 +151,7 @@ public function _getAll
}

$transformStart = microtime(true);
Session::put('timing.serializer_start', microtime(true));
$serializerParams['filter'] = $filter;
$res = $data->toArray
(
Expand All @@ -159,13 +161,15 @@ public function _getAll
$serializerParams,
$serializerType && is_callable($serializerType) ? call_user_func($serializerType) : SerializerRegistry::SerializerType_Public
);
Session::put('timing.serializer_end', microtime(true));
$transformEnd = (microtime(true)-$transformStart)*1000;
$encodeStart = microtime(true);
$json_response = $this->ok($res);
$encodeEnd = (microtime(true)-$encodeStart)*1000;
Session::put("db_time", $dbEnd );
Session::put("transform_time", $transformEnd );
Session::put("encode_time", $encodeEnd );
Session::put('timing.controller_end', microtime(true));
Session::save();
return $json_response;
});
Expand Down
52 changes: 50 additions & 2 deletions app/Http/Middleware/Doctrine/QueryTimingCollector.php
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,63 @@ class QueryTimingCollector
public static float $totalMs = 0.0;
public static int $count = 0;

public static function record(float $startedAt): void
/**
* Per-pattern bucket for finding N+1s during profiling.
*
* @var array<string, array{count:int, totalMs:float, sample:string}>
*/
public static array $patterns = [];

public static function record(float $startedAt, ?string $sql = null): void
{
self::$totalMs += (microtime(true) - $startedAt) * 1000.0;
$ms = (microtime(true) - $startedAt) * 1000.0;
self::$totalMs += $ms;
self::$count++;

if ($sql !== null) {
$pattern = self::normalize($sql);
if (!isset(self::$patterns[$pattern])) {
self::$patterns[$pattern] = ['count' => 0, 'totalMs' => 0.0, 'sample' => $sql];
}
self::$patterns[$pattern]['count']++;
self::$patterns[$pattern]['totalMs'] += $ms;
}
Comment on lines +18 to +38
}

public static function reset(): void
{
self::$totalMs = 0.0;
self::$count = 0;
self::$patterns = [];
}

/**
* Top N most-repeated patterns (count >= 2), sorted by count desc.
*
* @return array<int, array{pattern:string, count:int, totalMs:float, sample:string}>
*/
public static function topPatterns(int $limit = 10): array
{
$rows = [];
foreach (self::$patterns as $pattern => $stats) {
if ($stats['count'] < 2) continue;
$rows[] = [
'pattern' => $pattern,
'count' => $stats['count'],
'totalMs' => round($stats['totalMs'], 1),
'sample' => $stats['sample'],
];
}
usort($rows, fn($a, $b) => $b['count'] <=> $a['count']);
return array_slice($rows, 0, $limit);
}

private static function normalize(string $sql): string
{
$s = preg_replace('/\?|:[a-zA-Z_][a-zA-Z0-9_]*/', '?', $sql);
$s = preg_replace("/'[^']*'/", '?', $s);
$s = preg_replace('/\b\d+\b/', '?', $s);
$s = preg_replace('/\s+/', ' ', $s);
return trim($s);
}
}
16 changes: 12 additions & 4 deletions app/Http/Middleware/Doctrine/QueryTimingMiddleware.php
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ public function query(string $sql): DBALResult
try {
return parent::query($sql);
} finally {
QueryTimingCollector::record($start);
QueryTimingCollector::record($start, $sql);
}
}

Expand All @@ -53,25 +53,33 @@ public function exec(string $sql): int
try {
return parent::exec($sql);
} finally {
QueryTimingCollector::record($start);
QueryTimingCollector::record($start, $sql);
}
}

public function prepare(string $sql): DBALStatement
{
return new QueryTimingStatement(parent::prepare($sql));
return new QueryTimingStatement(parent::prepare($sql), $sql);
}
}

class QueryTimingStatement extends AbstractStatementMiddleware
{
private string $sql;

public function __construct($wrapped, string $sql)
{
parent::__construct($wrapped);
$this->sql = $sql;
}

public function execute($params = null): DBALResult
{
$start = microtime(true);
try {
return parent::execute($params);
} finally {
QueryTimingCollector::record($start);
QueryTimingCollector::record($start, $this->sql);
}
}
}
Loading
Loading