Was v1alpha2 batch scheduler #4962

Open
marosset wants to merge 4 commits into ray-project:master from marosset:was-v1alpha2-batch-scheduler

@marosset

Contributor

Why are these changes needed?

These changes implement batch scheduling on top of the Kubernetes-native workload-aware scheduling APIs, exposed through the KubeRay batch scheduler interface.
They re-implement #4723, following several suggestions to see what this would look like as a batch scheduler plugin.

I think there are a few pros and a few cons to this approach:

Pros

  • Easy to target specific K8s alpha/beta API versions. In K8s v1.36, batch scheduling is supported through the v1alpha2 scheduling.k8s.io APIs. Implementing workload-aware scheduling against this API as a dedicated plugin lets us support future alpha or beta APIs as individual plugins, which greatly improves maintainability
  • Changes are much less invasive in the main ray-controller
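
To illustrate the "one plugin per API version" idea, here is a minimal sketch of a plugin registry. The `BatchScheduler` interface below is a trimmed-down, hypothetical stand-in (not the actual KubeRay interface), and the plugin name `was-v1alpha2` is an assumption borrowed from the branch name.

```go
package main

import "fmt"

// BatchScheduler is a hypothetical, simplified plugin interface used only
// for illustration; it is NOT the real KubeRay batch scheduler interface.
type BatchScheduler interface {
	Name() string       // value a user would put in the operator configuration
	APIVersion() string // Kubernetes API group/version the plugin targets
}

// wasV1alpha2 is an assumed plugin targeting the v1alpha2 scheduling APIs.
type wasV1alpha2 struct{}

func (wasV1alpha2) Name() string       { return "was-v1alpha2" }
func (wasV1alpha2) APIVersion() string { return "scheduling.k8s.io/v1alpha2" }

// Registry maps a configured scheduler name to its plugin.
type Registry map[string]BatchScheduler

// Register adds a plugin and rejects duplicate names.
func (r Registry) Register(p BatchScheduler) error {
	if _, ok := r[p.Name()]; ok {
		return fmt.Errorf("batch scheduler %q already registered", p.Name())
	}
	r[p.Name()] = p
	return nil
}

// Lookup returns the plugin configured under name, if any.
func (r Registry) Lookup(name string) (BatchScheduler, error) {
	p, ok := r[name]
	if !ok {
		return nil, fmt.Errorf("unknown batch scheduler %q", name)
	}
	return p, nil
}

func main() {
	r := Registry{}
	if err := r.Register(wasV1alpha2{}); err != nil {
		panic(err)
	}
	p, err := r.Lookup("was-v1alpha2")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name(), "targets", p.APIVersion())
}
```

Under this assumed layout, a future beta API would be supported by registering another plugin under its own name, leaving the existing plugin untouched.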

Cons

  • Could lead to some configuration confusion. Normally the batch scheduler name is the name of the external scheduler, but in this case there is no external scheduler
  • Less observability. Batch scheduler plugins do not allow for observability as detailed as, for example, node conditions for each phase of scheduling. Support could be added to each batch scheduler plugin, but the decoupling between the operator and the batch plugins makes it harder to ensure everything is correct

Related issue number

Part of #4344

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

marosset added 4 commits June 29, 2026 19:57
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
@marosset

Contributor Author

@Future-Outlier - Here is the Kubernetes workload-aware scheduling work implemented as a batch scheduler plugin.
Please take a look at this and at some of the comments I made in the PR description!
Thanks!

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 90d2a29.

	if existing.DeletionTimestamp != nil {
		return fmt.Errorf("PodGroup %s/%s is being deleted (finalizer pending), will retry", podGroup.Namespace, podGroup.Name)
	}
}

Stale PodGroup kept on exists

High Severity

After a stale Workload is recreated, syncSchedulingResources treats an existing PodGroup as success when Create returns AlreadyExists and the object is not terminating. It never compares or updates SchedulingPolicy (e.g. gang MinCount). A pre-delete PodGroup can keep an old policy while the new Workload templates match the cluster, so gang sizing stays wrong until manual intervention.
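
One possible shape of a fix, as a minimal sketch: when Create reports AlreadyExists and the object is not terminating, compare the existing policy with the desired one and reconcile it. The `PodGroup`, `Store`, and `syncPodGroup` names below are hypothetical in-memory stand-ins, not the PR's code or the real Kubernetes types. The sketch also assumes the policy can be updated in place; if the real field is immutable, delete-and-recreate would be needed instead.

```go
package main

import (
	"errors"
	"fmt"
)

// PodGroup is a hypothetical, simplified stand-in for the real API object.
type PodGroup struct {
	Namespace, Name string
	MinCount        int  // stand-in for the gang SchedulingPolicy
	Terminating     bool // stand-in for DeletionTimestamp != nil
}

// ErrAlreadyExists mimics the API server's AlreadyExists error.
var ErrAlreadyExists = errors.New("already exists")

// Store is a hypothetical in-memory stand-in for the API server.
type Store map[string]*PodGroup

func key(pg *PodGroup) string { return pg.Namespace + "/" + pg.Name }

// Create stores a copy of pg, or fails if the key is already taken.
func (s Store) Create(pg *PodGroup) error {
	if _, ok := s[key(pg)]; ok {
		return ErrAlreadyExists
	}
	c := *pg
	s[key(pg)] = &c
	return nil
}

// syncPodGroup creates the desired PodGroup, or reconciles an existing one
// instead of treating AlreadyExists as unconditional success.
func syncPodGroup(s Store, desired *PodGroup) error {
	err := s.Create(desired)
	if err == nil {
		return nil
	}
	if !errors.Is(err, ErrAlreadyExists) {
		return err
	}
	existing := s[key(desired)]
	if existing.Terminating {
		return fmt.Errorf("PodGroup %s is being deleted, will retry", key(desired))
	}
	// Compare the policy and fix drift left over from the stale object.
	if existing.MinCount != desired.MinCount {
		existing.MinCount = desired.MinCount
	}
	return nil
}

func main() {
	s := Store{}
	// A stale PodGroup with an old gang size is already present.
	_ = s.Create(&PodGroup{Namespace: "default", Name: "rc", MinCount: 2})
	err := syncPodGroup(s, &PodGroup{Namespace: "default", Name: "rc", MinCount: 4})
	fmt.Println(err, s["default/rc"].MinCount)
}
```

The design point is that AlreadyExists only says an object with that name is present, not that its spec matches what the controller wants.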

@Future-Outlier Future-Outlier self-assigned this Jun 30, 2026

2 participants