CI: Share Gradle cache and select PR matrices #16566

Open
ajantha-bhat wants to merge 2 commits into apache:main from ajantha-bhat:codex/incremental-pr-ci

@ajantha-bhat
Member

@ajantha-bhat ajantha-bhat commented May 26, 2026

Summary

  • add reusable composite actions to prepare and save incremental Gradle build-cache state across CI jobs
  • add a shared PR CI planner that uses merge-base changed files, full-ci labels, and global build/workflow changes to select PR matrices
  • keep full Java/Spark/Flink/Hive/Kafka/Delta/CVE coverage for main, release branches, tags, full-ci, and global Gradle/workflow changes
  • run a primary JVM on ordinary PRs and keep the full Java 17/21 matrix for full CI paths
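
The split between full and selective coverage described above can be sketched roughly as follows. This is an illustration only, not the PR's planner script: the function names, the `global_change` flag, and the choice of 17 as the primary JVM are assumptions made for the example, and treating every non-PR event as a full run is a simplification of "main, release branches, tags".

```python
from typing import Iterable, List

FULL_CI_LABEL = "full-ci"  # label name taken from the PR description


def is_full_ci(event_name: str, labels: Iterable[str], global_change: bool) -> bool:
    """Return True when the run should keep full coverage.

    Assumption: any non-PR event (push to main, release branch, tag) is full.
    """
    if event_name != "pull_request":
        return True
    # On PRs, the label or a global Gradle/workflow change forces full CI.
    return FULL_CI_LABEL in set(labels) or global_change


def jvm_matrix(full_ci: bool, primary_jvm: int = 17) -> List[int]:
    """Full paths keep Java 17 and 21; ordinary PRs run one JVM.

    The default primary JVM is a placeholder, not stated in the PR.
    """
    return [17, 21] if full_ci else [primary_jvm]
```

For example, `jvm_matrix(is_full_ci("pull_request", [], False))` yields a single-JVM matrix, while adding the `full-ci` label restores both JVMs.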

Selective PR behavior

  • Spark-only changes run Spark jobs, not Flink/Hive/Kafka jobs
  • spark/v4.1/** selects Spark 4.1 only; spark/v4.0/** and spark/v3.5/** behave similarly
  • flink/v2.0/** selects Flink 2.0 only; other versioned Flink paths behave similarly
  • API/Core/Data/file-format changes run Java checks plus latest Spark and latest Flink canaries
  • runtime/bundle CVE scans are limited to affected runtime artifacts, while dependency/global changes run the full CVE matrix
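
A minimal sketch of the path-to-matrix selection listed above, written in Python for illustration (the PR itself uses `.github/scripts/plan-pr-ci.sh`). The prefix lists, the `Plan` type, and the default "latest" versions are hypothetical; the PR says the real versions come from `gradle.properties`. CVE selection is left out. Paths the sketch does not recognize fall back to full CI, which is a conservative choice of this example rather than something the PR states.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, Set

# Hypothetical prefixes; the real planner may classify paths differently.
GLOBAL_PREFIXES = (".github/workflows/", "gradle/", "build.gradle",
                   "settings.gradle", "gradle.properties")
CORE_PREFIXES = ("api/", "core/", "data/", "parquet/", "orc/")
VERSIONED = re.compile(r"^(spark|flink)/v(\d+\.\d+)/")


@dataclass
class Plan:
    full: bool = False                             # run every matrix unfiltered
    java: bool = False                             # run the Java checks
    spark: Set[str] = field(default_factory=set)   # selected Spark versions
    flink: Set[str] = field(default_factory=set)   # selected Flink versions


def plan_pr(changed_files: Iterable[str], labels: Iterable[str] = (),
            latest_spark: str = "4.1", latest_flink: str = "2.0") -> Plan:
    """Map changed paths to a PR plan. Default versions are placeholders."""
    files = list(changed_files)
    if "full-ci" in set(labels) or any(f.startswith(GLOBAL_PREFIXES) for f in files):
        return Plan(full=True)
    plan = Plan()
    for path in files:
        match = VERSIONED.match(path)
        if match:
            engine, version = match.groups()
            getattr(plan, engine).add(version)     # e.g. spark/v4.1/** -> Spark 4.1 only
        elif path.startswith(CORE_PREFIXES):
            plan.java = True                       # Java checks plus engine canaries
            plan.spark.add(latest_spark)
            plan.flink.add(latest_flink)
        else:
            return Plan(full=True)                 # unknown path: be conservative
    return plan
```

Calling `plan_pr(["spark/v4.1/spark/src/main/java/Foo.java"])` selects only Spark 4.1 and leaves the Flink set empty, which matches the "Spark-only changes run Spark jobs" rule.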

Rationale

  • follows the Project Nessie/Polaris-style Gradle cache approach to reduce repeated Gradle work
  • keeps the selective planner small and driven by changed paths plus versions already declared in gradle.properties
  • avoids changing Gradle project definitions or adding a new third-party dependency

Validation

  • YAML parse for the new composite actions and edited workflows
  • git diff --check
  • bash -n .github/scripts/plan-pr-ci.sh
  • synthetic planner assertions for Spark 4.1, Flink 2.0, core canaries, and full-ci
  • ./gradlew -h

@github-actions github-actions Bot added the INFRA label May 26, 2026
@ajantha-bhat ajantha-bhat changed the title CI: Add incremental PR build planner [WIP] CI: Add incremental PR build planner May 26, 2026
@ajantha-bhat ajantha-bhat marked this pull request as draft May 26, 2026 07:11
@ajantha-bhat ajantha-bhat changed the title [WIP] CI: Add incremental PR build planner CI: Add incremental PR build planner May 26, 2026
@ajantha-bhat ajantha-bhat force-pushed the codex/incremental-pr-ci branch from 8871ed8 to 0f085f8 Compare May 26, 2026 10:14
@ajantha-bhat ajantha-bhat marked this pull request as ready for review May 26, 2026 12:15
@ajantha-bhat ajantha-bhat changed the title CI: Add incremental PR build planner [WIP] CI: Add incremental PR build planner May 26, 2026
@ajantha-bhat ajantha-bhat changed the title [WIP] CI: Add incremental PR build planner CI: Add incremental PR build planner (WIP) May 26, 2026
@ajantha-bhat ajantha-bhat changed the title CI: Add incremental PR build planner (WIP) CI: Share Gradle build cache across jobs May 26, 2026
@ajantha-bhat ajantha-bhat force-pushed the codex/incremental-pr-ci branch from dd3c1c0 to 43ffb7b Compare May 26, 2026 13:22
@ajantha-bhat ajantha-bhat changed the title CI: Share Gradle build cache across jobs CI: Share Gradle cache and select PR matrices May 26, 2026
@ajantha-bhat ajantha-bhat force-pushed the codex/incremental-pr-ci branch from 73789b5 to 5b1993c Compare May 26, 2026 13:44
@ajantha-bhat ajantha-bhat force-pushed the codex/incremental-pr-ci branch from 5b1993c to 4682b11 Compare May 26, 2026 13:48
@kevinjqliu
Contributor

Thanks for the PR, @ajantha-bhat.

I've done some work on the Gradle cache recently (#16356) and made it so that there's only one canonical writer. Before that change, multiple cache writers were constantly thrashing the cache, and cache utilization was really low.

There's also a security component to this: we should only write to the cache on push to the main branch. We should not allow PRs to write to the shared cache, since that's a cache-poisoning vulnerability.

I'm curious how this change affects when the cache is saved and how it's reused.
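
The write-on-main-only policy described in this comment could be expressed as a small decision function. This is a sketch of the stated policy, not code from #16356 or from this PR. The function name and return values are hypothetical; the inputs mirror the `github.event_name` and `github.ref` values that GitHub Actions exposes to workflows.

```python
def cache_mode(event_name: str, ref: str, default_branch: str = "main") -> str:
    """Decide whether a job may write to the shared Gradle cache.

    Only a push to the default branch is a trusted writer; pull requests
    and every other event may restore the cache but never save it.
    """
    if event_name == "push" and ref == f"refs/heads/{default_branch}":
        return "read-write"
    return "read-only"
```

Under this policy a PR job always gets `"read-only"`, so untrusted PR code cannot place entries in the cache that later runs on main would restore.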
