5 changes: 2 additions & 3 deletions docs/plans/trt-1989-partitioning-prep.md
@@ -83,9 +83,8 @@ When `prow_job_runs` is partitioned:
`(id, prow_job_release, timestamp)`).
2. All tables with FKs into `prow_job_runs` must reference the full
composite key — meaning they need the partition key columns too.
-3. Tables with FKs **from** `prow_job_runs` to non-partitioned tables
-(annotations, pull request join table) must either be co-partitioned
-or have their FKs dropped.
+3. Tables with FKs **to** `prow_job_runs` (annotations, pull request
+join table) must either be co-partitioned or have their FKs dropped.

This means `prow_job_runs`, `prow_job_run_annotations`, and
`prow_job_run_prow_pull_requests` must all migrate to partitioned in a
128 changes: 128 additions & 0 deletions docs/plans/trt-1989-phase2-indexes.md
@@ -0,0 +1,128 @@
# TRT-1989 Phase 2: Composite Indexes on Denormalized Columns

**Date:** 2026-05-19
**JIRA:** [TRT-1989](https://redhat.atlassian.net/browse/TRT-1989)
**Depends on:** Phase 1 — column prep (`trt-1989-partitioning-prep.md`)

## Purpose

Phase 1 added denormalized `release` and `timestamp` columns to every table
that will be partitioned or holds a FK into a partitioned table. Phase 2
adds composite indexes on those columns so the query planner can use them
immediately — before partitioning is applied.

These indexes mirror the future partition key `(release, timestamp)`. Once
the tables are partitioned, each partition inherits a local copy of the
index, and the planner uses partition pruning instead. The indexes added
here serve two purposes:

1. **Immediate benefit** — queries migrated in Phase 3 to filter on the
denormalized columns will use these indexes on the current
non-partitioned tables.
2. **Validation** — exercising the indexes under production workload
confirms the column data is correct before committing to partitioning.

## Changes

All changes are GORM index tags on model structs in
`pkg/db/models/prow.go`. GORM `AutoMigrate` creates the indexes
automatically on the next migration run.
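
As a rough illustration of what such tags look like, the sketch below declares a composite index on two fields using GORM's `index:<name>,priority:<n>` tag form. The struct name `exampleProwJobRun`, the `gormTag` helper, and the field types are assumptions for illustration; only the index name and the `ProwJobRelease` / `Timestamp` field names come from this plan. `main` just reads the tags back with `reflect`, so the block runs without a database or the GORM module.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// exampleProwJobRun is a hypothetical, trimmed-down model; the real struct in
// pkg/db/models/prow.go has many more fields.
type exampleProwJobRun struct {
	ID uint `gorm:"primaryKey"`
	// Both fields name the same index, so GORM treats them as one composite
	// index. priority sets the column order within that index.
	ProwJobRelease string    `gorm:"index:idx_prow_job_runs_release_timestamp,priority:1"`
	Timestamp      time.Time `gorm:"index:idx_prow_job_runs_release_timestamp,priority:2"`
}

// gormTag returns the raw gorm struct tag for the named field, or "" if the
// field does not exist.
func gormTag(model any, field string) string {
	f, ok := reflect.TypeOf(model).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("gorm")
}

func main() {
	for _, name := range []string{"ProwJobRelease", "Timestamp"} {
		fmt.Printf("%s: %s\n", name, gormTag(exampleProwJobRun{}, name))
	}
}
```

Swapping the two `priority` values would put `Timestamp` first in the index, which is the order the `prow_job_run_tests` index uses.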

### prow_job_runs

Added composite index `idx_prow_job_runs_release_timestamp` across
`ProwJobRelease` and `Timestamp`.

Also added a standalone index on `ProwJobRunTest.ProwJobID` (table
`prow_job_run_tests`, index `idx_prow_job_run_tests_prow_job_id`) to support
variant queries that previously required joining through `prow_job_runs`.
Comment on lines +36 to +37

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the model/table reference for the standalone index.

The text points to ProwJobRunTest.ProwJobID, but this index is on prow_job_run_tests (ProwJobRunTests model field), not prow_job_runs.

Proposed doc fix
-Also added a standalone index on `ProwJobRunTest.ProwJobID` to support
+Also added a standalone index on `ProwJobRunTests.ProwJobID` to support
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-Also added a standalone index on `ProwJobRunTest.ProwJobID` to support
-variant queries that previously required joining through `prow_job_runs`.
+Also added a standalone index on `ProwJobRunTests.ProwJobID` to support
+variant queries that previously required joining through `prow_job_runs`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/trt-1989-phase2-indexes.md` around lines 36 - 37, Update the doc
line that incorrectly references ProwJobRunTest.ProwJobID: it should point to
the prow_job_run_tests table / ProwJobRunTests model field instead of
prow_job_runs; replace the reference with either "prow_job_run_tests" or
"ProwJobRunTests.ProwJobID" so the standalone index is correctly attributed to
the prow_job_run_tests model.

Comment on lines +31 to +37

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Move the standalone index documentation to the correct section.

Lines 36-37 describe an index on prow_job_run_tests.prow_job_id, but they appear under the ### prow_job_runs section header. This creates confusion about which table is being indexed.

📝 Proposed reorganization

Move lines 36-37 to appear under the ### prow_job_run_tests section (after line 38), or create a separate subsection if this index deserves its own explanation distinct from the composite release-timestamp index.

The SQL at lines 79-80 correctly shows this index on prow_job_run_tests, so the documentation sections should match the SQL organization.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/trt-1989-phase2-indexes.md` around lines 31 - 37, The
documentation incorrectly places the standalone index note for
prow_job_run_tests.prow_job_id under the prow_job_runs section; move the two
lines describing the index on prow_job_run_tests.prow_job_id (and the index name
if mentioned) out from under the "### prow_job_runs" header and place them under
the "### prow_job_run_tests" header (or create a short subsection there) so the
descriptive text matches the SQL shown later for the idx on
prow_job_run_tests.prow_job_id.


### prow_job_run_tests

Added composite index `idx_prow_job_run_tests_release_timestamp` across
`ProwJobRunTimestamp` and `ProwJobRunRelease`.

### prow_job_run_test_outputs

Added composite index `idx_prow_job_run_test_outputs_release_timestamp`
across `ProwJobRunTestTimestamp` and `ProwJobRunTestRelease`.

### prow_job_run_prow_pull_requests

Added composite index
`idx_prow_job_run_prow_pull_requests_release_timestamp` across
`ProwJobRunRelease` and `ProwJobRunTimestamp`.

### prow_job_run_annotations

Added composite index `idx_prow_job_run_annotations_release_timestamp`
across `ProwJobRunRelease` and `ProwJobRunTimestamp`.

## Explicit SQL

GORM `AutoMigrate` will create these indexes on the next migration run.
If you prefer to create them manually — for example, using `CONCURRENTLY`
to avoid locking production tables — run these statements directly.

`CREATE INDEX CONCURRENTLY` cannot run inside a transaction, so each
statement must be executed individually (not wrapped in `BEGIN`/`COMMIT`).
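
One way to script this from Go, sketched under assumptions: the helper names `createIndexSQL` and `applyIndexes` are hypothetical, and the block uses only the standard `database/sql` package rather than any Sippy code. The point it illustrates is that each statement goes through its own `ExecContext` call on `*sql.DB`, never through a `*sql.Tx`.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// createIndexSQL builds one CREATE INDEX CONCURRENTLY statement. Identifiers
// are inserted verbatim, so callers must pass trusted names and quote any
// column that needs it (for example `"timestamp"`).
func createIndexSQL(index, table string, columns ...string) string {
	return fmt.Sprintf("CREATE INDEX CONCURRENTLY IF NOT EXISTS %s ON %s (%s)",
		index, table, strings.Join(columns, ", "))
}

// applyIndexes runs each statement with its own ExecContext call on *sql.DB,
// never on a *sql.Tx, so no statement is wrapped in BEGIN/COMMIT.
func applyIndexes(ctx context.Context, db *sql.DB, stmts []string) error {
	for _, stmt := range stmts {
		if _, err := db.ExecContext(ctx, stmt); err != nil {
			return fmt.Errorf("executing %q: %w", stmt, err)
		}
	}
	return nil
}

func main() {
	stmts := []string{
		createIndexSQL("idx_prow_job_runs_release_timestamp", "prow_job_runs",
			"prow_job_release", `"timestamp"`),
		createIndexSQL("idx_prow_job_run_tests_prow_job_id", "prow_job_run_tests",
			"prow_job_id"),
	}
	// Printing only: opening a real connection needs a PostgreSQL driver and
	// a DSN, which are outside this sketch. With a live *sql.DB, call
	// applyIndexes(context.Background(), db, stmts) instead.
	for _, s := range stmts {
		fmt.Println(s + ";")
	}
}
```

Stopping at the first error is a deliberate choice here: a failed concurrent build should be inspected before the remaining statements run.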

### prow_job_runs

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_runs_release_timestamp
ON prow_job_runs (prow_job_release, "timestamp");
```

### prow_job_run_tests

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_run_tests_prow_job_id
ON prow_job_run_tests (prow_job_id);

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_run_tests_release_timestamp
ON prow_job_run_tests (prow_job_run_timestamp, prow_job_run_release);
```

### prow_job_run_test_outputs

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_run_test_outputs_release_timestamp
ON prow_job_run_test_outputs (prow_job_run_test_timestamp, prow_job_run_test_release);
```

### prow_job_run_prow_pull_requests

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_run_prow_pull_requests_release_timestamp
ON prow_job_run_prow_pull_requests (prow_job_run_release, prow_job_run_timestamp);
```

### prow_job_run_annotations

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_prow_job_run_annotations_release_timestamp
ON prow_job_run_annotations (prow_job_run_release, prow_job_run_timestamp);
```

## Notes

- **Safe to create before deploying model updates.** GORM `AutoMigrate`
only adds — it never drops indexes, columns, or tables it doesn't
recognize. Indexes created manually will persist through any number of
`AutoMigrate` runs on the old model. Once the updated model with index
tags is deployed, `AutoMigrate` sees the indexes already exist and
skips them. There is no rollback risk.
- `CONCURRENTLY` avoids the write-blocking lock that a plain
`CREATE INDEX` holds, so reads and writes continue during index
creation. The build is slower but safe for production use.
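- If the index already exists (e.g., GORM created it during a prior
migration), `IF NOT EXISTS` makes the statement a no-op. The check is by
name only, so an invalid index left behind by a failed `CONCURRENTLY`
build also counts as existing and must be dropped before retrying.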
- GORM `AutoMigrate` does **not** use `CONCURRENTLY`. Its index build
holds a lock that blocks writes until the build finishes, which on large
tables can take a long time. For production deployments, prefer creating
the indexes manually with the SQL above ahead of the code deploy, so that
`AutoMigrate` finds them already in place.
- Column order in the index matches the expected query pattern: most
queries filter on release first (equality), then timestamp (range).
The `prow_job_run_tests` index leads with timestamp because the
materialized view queries filter primarily on timestamp ranges. The
`prow_job_run_test_outputs` index uses the same timestamp-first order.
172 changes: 172 additions & 0 deletions docs/plans/trt-1989-phase3-query-optimization.md
@@ -0,0 +1,172 @@
# TRT-1989 Phase 3: Query Optimization Using Denormalized Columns

**Date:** 2026-05-19
**JIRA:** [TRT-1989](https://redhat.atlassian.net/browse/TRT-1989)
**Depends on:** Phase 1 (column prep), Phase 2 (indexes)

## Purpose

Phase 1 added denormalized `release` and `timestamp` columns to child
tables (`prow_job_run_tests`, `prow_job_run_test_outputs`,
`prow_job_run_prow_pull_requests`, `prow_job_run_annotations`). Phase 2
added composite indexes on those columns.

Nearly every significant query in sippy filters on
`prow_job_runs.timestamp` and/or `prow_jobs.release` via joins. Once these
tables are partitioned, those join-based filters **won't help the planner
prune child table partitions** — the planner needs WHERE clauses on each
partitioned table's own partition key columns.

This phase adds filters on the denormalized columns and drops joins where
all referenced columns have local replacements. This is safe to ship
before partitioning — the extra WHERE clauses let the planner use the
composite indexes from Phase 2. After partitioning, they become the
primary mechanism for partition pruning.

## Guiding Principles

1. **Add filters first, then replace when validated** — keep existing
join-based filters alongside new local filters during rollout.
After local denormalized columns are validated, replace old filters
and drop no-longer-needed joins where safe.

2. **Drop joins only when safe** — a join can be dropped only if *every*
column it provides (in SELECT, WHERE, GROUP BY, ORDER BY, FILTER) has
a local replacement.

3. **Materialized views use `|||TIMENOW|||` templates** — filters added
to mat views must use the same template tokens, not `$1`-style params.

4. **SQL functions use `$N` params** — new WHERE clauses reuse existing
params from the function signature.

5. **GORM queries use `?` placeholders** — pass the same Go variables
already available in the function scope.

## Changes by Query

### Group A: Queries starting from `prow_job_run_tests`

#### A1. `prowJobFailedTestsMatView` — `pkg/db/views.go`

Rewritten to start from `prow_job_run_tests`. Replaced
`prow_job_runs."timestamp"` with `pjrt.prow_job_run_timestamp` and
`prow_job_runs.prow_job_id` with `pjrt.prow_job_id`. **Dropped JOIN
`prow_job_runs`**.

#### A2. `testAnalysisByJobMatView` — `pkg/db/views.go`

Replaced all `prow_job_runs."timestamp"` references with
`prow_job_run_tests.prow_job_run_timestamp`. Replaced `prow_jobs.release`
with `prow_job_run_tests.prow_job_run_release`. **Dropped JOIN
`prow_job_runs`**. Rewired JOIN `prow_jobs` via
`prow_job_run_tests.prow_job_id`.

#### A3. `testReportMatView` — `pkg/db/views.go`

Replaced all `prow_job_runs."timestamp"` in WHERE and FILTER clauses with
`prow_job_run_tests.prow_job_run_timestamp`. Replaced `prow_jobs.release`
with `prow_job_run_tests.prow_job_run_release`. **Dropped JOIN
`prow_job_runs`**. Rewired JOIN `prow_jobs` via
`prow_job_run_tests.prow_job_id`. JOIN `prow_jobs` kept for
`prow_jobs.variants`.

#### A4. `test_results()` function — `pkg/db/functions.go`

Added `WHERE prow_job_run_tests.prow_job_run_timestamp BETWEEN $1 AND $3`
to limit the scan. Replaced `timestamp` in all CASE expressions with
`prow_job_run_tests.prow_job_run_timestamp`. Replaced `prow_jobs.release`
with `prow_job_run_tests.prow_job_run_release`. **Dropped JOINs
`prow_job_runs` and `prow_jobs`**.

#### A5. `ProwJobHistoricalTestCounts` — `pkg/db/query/job_queries.go`

Replaced `prow_job_runs.prow_job_id` with
`prow_job_run_tests.prow_job_id` and `prow_job_runs.timestamp` with
`prow_job_run_tests.prow_job_run_timestamp`. **Dropped JOIN
`prow_job_runs`**.

#### A6. `GetRecentTestFailures` — `pkg/api/recent_test_failures.go`

Added redundant local filters (`prow_job_run_tests.prow_job_run_timestamp`
and `prow_job_run_tests.prow_job_run_release`) to all four queries:
main query, NOT EXISTS subquery, last-pass lookback, and failure outputs.
Joins kept — `prow_job_runs` still needed for `timestamp` in SELECT and
`url`; `prow_jobs` still needed for `name`.

#### A7. `testStatusQuery` (CR) — `pkg/api/componentreadiness/.../provider.go`

Added `pjrt.prow_job_run_release = ?` and `pjrt.prow_job_run_timestamp`
range filters to the CTE WHERE clause. Joins kept — `prow_job_runs`
needed for `labels`, `prow_jobs` needed for variant lookup.

#### A8. `testDetailQuery` (CR) — `pkg/api/componentreadiness/.../provider.go`

Same pattern as A7 — added local release and timestamp filters. Joins
kept — `pjr.url`, `pjr.timestamp`, `pjr.labels`, `pj.name`, `pj.id`
still needed in SELECT.

#### A9. `payloadTestFailuresMatView` — `pkg/db/views.go`

Added `pjrt.prow_job_run_timestamp > (|||TIMENOW||| - '14 days'::interval)`
to WHERE. Joins kept — `release_tags`, `release_job_runs`, `prow_jobs`,
`prow_job_runs` still needed for other columns.

### Group B: Queries starting from `prow_job_run_test_outputs`

#### B1. `TestOutputs` — `pkg/db/query/test_queries.go`

Added `prow_job_run_test_outputs.prow_job_run_test_timestamp`,
`prow_job_run_test_outputs.prow_job_run_test_release`,
`prow_job_run_tests.prow_job_run_timestamp`, and
`prow_job_run_tests.prow_job_run_release` filters. Joins kept —
`prow_job_runs` for URL, `prow_jobs` for variants.

#### B2. `TestDurations` — `pkg/db/query/test_queries.go`

Replaced `prow_job_runs.timestamp` filter with
`prow_job_run_tests.prow_job_run_timestamp`. Replaced ambiguous
`"timestamp"` in SELECT/GROUP BY/ORDER BY with explicit
`prow_job_run_tests.prow_job_run_timestamp`. Replaced `prow_jobs.release`
with `prow_job_run_tests.prow_job_run_release`. **Dropped JOIN
`prow_job_runs`**. JOIN `prow_jobs` rewired via
`prow_job_run_tests.prow_job_id` (needed for variants).

### Group C: Queries on `prow_job_runs` directly

#### C1. `BuildClusterHealth` — `pkg/db/query/build_clusters.go`

Added `WHERE prow_job_runs.timestamp BETWEEN @start AND @end` so the
planner can use the timestamp index to limit the scan. The `@start` and
`@end` params already exist in the function signature.

#### C2-C4. No changes

- `BuildClusterAnalysis` — already has timestamp in WHERE, cross-release
- `HasBuildClusterData` — existence check, timestamp bound would be wrong
- `ProwJobRunIDs` — simple lookup, already indexed

### Group D: SQL functions with PR join tables

#### D1. `job_results()` — `pkg/db/functions.go`

- **`repo_org_jobs` CTE**: Added `WHERE prow_job_runs.prow_job_release = $1`
- **`merged_prs` CTE**: Added `AND prow_job_runs.timestamp BETWEEN $2 AND $4`
- **`results` CTE**: Added `WHERE prow_job_runs.prow_job_release = $1`
- **`last_pass` CTE**: No change — intentionally cross-release

#### D2. `jobRunsReportMatView` — No change

The CTEs materialize all data. Adding filters would require
parameterizing the mat view. Deferred to a future change.

## Joins Dropped Summary

| Query | Join dropped | Columns replaced |
|-------|-------------|-----------------|
| `prowJobFailedTestsMatView` | `prow_job_runs` | `timestamp` → `pjrt.prow_job_run_timestamp`, `prow_job_id` → `pjrt.prow_job_id` |
| `testAnalysisByJobMatView` | `prow_job_runs` | `timestamp` → local, `prow_job_id` → local; `prow_jobs` rewired via `prow_job_run_tests.prow_job_id` |
| `testReportMatView` | `prow_job_runs` | `timestamp` → local; `prow_jobs` rewired via `prow_job_run_tests.prow_job_id` |
| `test_results()` | `prow_job_runs` + `prow_jobs` | `timestamp` → local, `release` → local |
| `ProwJobHistoricalTestCounts` | `prow_job_runs` | `prow_job_id` → local, `timestamp` → local |
| `TestDurations` | `prow_job_runs` | `timestamp` → local; `prow_jobs` rewired via `prow_job_run_tests.prow_job_id` |
13 changes: 10 additions & 3 deletions pkg/api/componentreadiness/dataprovider/postgres/provider.go
@@ -310,6 +310,9 @@ WITH deduped AS (
JOIN prow_jobs pj ON pj.id = pjr.prow_job_id
WHERE pj.release = ?
AND pjr.timestamp >= ? AND pjr.timestamp < ?
+AND pjr.prow_job_release = ?
+AND pjrt.prow_job_run_release = ?
+AND pjrt.prow_job_run_timestamp >= ? AND pjrt.prow_job_run_timestamp < ?
AND pjrt.deleted_at IS NULL AND pjr.deleted_at IS NULL AND pj.deleted_at IS NULL
AND (pjr.labels IS NULL OR NOT pjr.labels @> ARRAY['InfraFailure'])
ORDER BY pjrt.prow_job_run_id, pjrt.test_id, pjrt.suite_id,
@@ -340,7 +343,7 @@ func (p *PostgresProvider) queryTestStatus(ctx context.Context, release string,
dbGroupBy map[string]bool) (map[string]crstatus.TestStatus, []error) {

var rows []testStatusRow
-if err := p.dbc.DB.WithContext(ctx).Raw(testStatusQuery, release, start, end).Scan(&rows).Error; err != nil {
+if err := p.dbc.DB.WithContext(ctx).Raw(testStatusQuery, release, start, end, release, release, start, end).Scan(&rows).Error; err != nil {
return nil, []error{fmt.Errorf("querying test status: %w", err)}
}

@@ -517,6 +520,9 @@ JOIN test_ownerships tow ON tow.test_id = pjrt.test_id
AND (tow.suite_id = pjrt.suite_id OR (tow.suite_id IS NULL AND pjrt.suite_id IS NULL))
WHERE pj.release = ?
AND pjr.timestamp >= ? AND pjr.timestamp < ?
+AND pjr.prow_job_release = ?
+AND pjrt.prow_job_run_release = ?
+AND pjrt.prow_job_run_timestamp >= ? AND pjrt.prow_job_run_timestamp < ?
AND pjrt.deleted_at IS NULL AND pjr.deleted_at IS NULL AND pj.deleted_at IS NULL
AND tow.staff_approved_obsolete = false
AND (pjr.labels IS NULL OR NOT pjr.labels @> ARRAY['InfraFailure'])
@@ -528,7 +534,7 @@ func (p *PostgresProvider) queryTestDetails(ctx context.Context, release string,
includeVariants map[string][]string) (map[string][]crstatus.TestJobRunRows, []error) {

var rows []testDetailRow
-if err := p.dbc.DB.WithContext(ctx).Raw(testDetailQuery, release, start, end).Scan(&rows).Error; err != nil {
+if err := p.dbc.DB.WithContext(ctx).Raw(testDetailQuery, release, start, end, release, release, start, end).Scan(&rows).Error; err != nil {
return nil, []error{fmt.Errorf("querying test details: %w", err)}
}

@@ -680,11 +686,12 @@ func (p *PostgresProvider) QueryJobRuns(ctx context.Context, reqOptions reqopts.
JOIN prow_job_runs pjr ON pjr.prow_job_id = pj.id
WHERE pj.release = ?
AND pjr.timestamp >= ? AND pjr.timestamp < ?
+AND pjr.prow_job_release = ?
AND pj.deleted_at IS NULL AND pjr.deleted_at IS NULL
AND (pj.name LIKE 'periodic-%%' OR pj.name LIKE 'release-%%' OR pj.name LIKE 'aggregator-%%')
GROUP BY pj.name
ORDER BY pj.name
-`, release, start, end).Scan(&rows).Error
+`, release, start, end, release).Scan(&rows).Error
if err != nil {
return nil, fmt.Errorf("querying job runs: %w", err)
}
2 changes: 2 additions & 0 deletions pkg/api/job_analysis.go
@@ -62,6 +62,8 @@ func PrintJobAnalysisJSONFromDB(
sum(case when overall_result = 'A' then 1 else 0 end) AS "A"`, period).
Joins("INNER JOIN prow_jobs ON prow_job_runs.prow_job_id = prow_jobs.id").
Where("prow_jobs.id IN ?", jobs).
+Where("prow_job_runs.prow_job_release = ?", release).
+Where("prow_job_runs.timestamp BETWEEN ? AND ?", start, end).
Group("period")

sumResults.Scan(&sums)
6 changes: 5 additions & 1 deletion pkg/api/job_runs.go
@@ -144,7 +144,11 @@ func JobsRunsReportFromDB(dbc *db.DB, filterOpts *filter.FilterOptions, release
ids[i] = jr.ID
}
var annotations []models.ProwJobRunAnnotation
-if err := dbc.DB.Where("prow_job_run_id IN ?", ids).Find(&annotations).Error; err != nil {
+annotationQuery := dbc.DB.Where("prow_job_run_id IN ?", ids)
+if len(release) > 0 {
+annotationQuery = annotationQuery.Where("prow_job_run_release = ?", release)
+}
+if err := annotationQuery.Find(&annotations).Error; err != nil {
return nil, err
}
annotationsByRun := make(map[string]apitype.AnnotationMap)