Adjust the minimum reschedule delay for failed allocations

Today, when a task process exits or its health check fails, Nomad attempts to restart the task, up to the configured maximum number of restart attempts.  Once these attempts are exhausted, the reschedule block takes effect and the entire allocation is rescheduled to a new client node.  The lower limit for the reschedule delay is currently 5 seconds; the delay for subsequent attempts may increase (back off) depending on the configured delay function.  https://developer.hashicorp.com/nomad/docs/job-specification/reschedule#delay
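To make the effect of the delay floor concrete, here is a minimal sketch of how the wait before each reschedule attempt could grow under the documented `constant`, `exponential`, and `fibonacci` delay functions. The `ReschedulePolicy` type and `delayForAttempt` function are hypothetical names for illustration only and are not Nomad's actual implementation; the exact fibonacci sequence (delay, delay, 2×delay, ...) is also an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// ReschedulePolicy is a hypothetical, simplified stand-in for the settings
// in a reschedule block. It is not Nomad's actual struct.
type ReschedulePolicy struct {
	Delay         time.Duration // base delay before the first reschedule
	DelayFunction string        // "constant", "exponential", or "fibonacci"
	MaxDelay      time.Duration // upper bound on the delay; 0 means no cap
}

// delayForAttempt returns the wait before the given 1-based reschedule
// attempt. This is an illustrative approximation, not Nomad's code.
func delayForAttempt(p ReschedulePolicy, attempt int) time.Duration {
	d := p.Delay
	switch p.DelayFunction {
	case "exponential":
		// Double the delay for each attempt after the first.
		for i := 1; i < attempt; i++ {
			d *= 2
			if p.MaxDelay > 0 && d >= p.MaxDelay {
				return p.MaxDelay
			}
		}
	case "fibonacci":
		// Assumed sequence: delay, delay, 2*delay, 3*delay, 5*delay, ...
		prev, cur := p.Delay, p.Delay
		for i := 2; i < attempt; i++ {
			prev, cur = cur, prev+cur
			if p.MaxDelay > 0 && cur >= p.MaxDelay {
				return p.MaxDelay
			}
		}
		d = cur
	}
	// "constant" (and any unrecognized value) falls through with the base delay.
	if p.MaxDelay > 0 && d > p.MaxDelay {
		d = p.MaxDelay
	}
	return d
}

func main() {
	// Compare the current 5s floor with a 1s base delay over five attempts.
	for _, base := range []time.Duration{5 * time.Second, 1 * time.Second} {
		p := ReschedulePolicy{Delay: base, DelayFunction: "exponential", MaxDelay: time.Hour}
		fmt.Printf("base delay %v:", base)
		for attempt := 1; attempt <= 5; attempt++ {
			fmt.Printf(" %v", delayForAttempt(p, attempt))
		}
		fmt.Println()
	}
}
```

With a backing-off delay function, a lower base delay shortens the first few waits while later attempts still spread out; with `constant`, every attempt uses the base delay, which is where a very low value is most likely to cause rapid retries.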

When rescheduling attempts are unlimited and the task has an underlying issue that prevents it from starting even on a different client, the delay and delay_function settings in the reschedule block keep the scheduler from thrashing through rapid, repeated reschedule attempts.  However, in certain scenarios the 5 second minimum adds an unwanted delay to the initial reschedule.

With additional documentation calling out the risks of runaway rescheduling events, we should allow adjusting this minimum lower than 5 seconds to reduce the recovery window for tasks that would otherwise start successfully on new clients.

Currently this is controlled via an internal constant:
https://github.com/hashicorp/nomad/blob/v2.0.3/nomad/structs/structs.go#L6606
https://github.com/hashicorp/nomad/blob/v2.0.3/nomad/structs/structs.go#L6671
https://github.com/hashicorp/nomad/blob/v2.0.3/nomad/structs/structs.go#L4732

We should consider allowing this to be set as low as 1 second while including warnings around unbounded rescheduling attempts.
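As a rough sketch of what the proposal could look like, the check below accepts delays down to a configurable floor and returns a warning when a sub-5-second delay is combined with unlimited rescheduling. Every name here (`validateRescheduleDelay`, `currentMinDelay`, `proposedMinDelay`) is hypothetical and chosen for illustration; it does not reflect Nomad's existing validation code.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// currentMinDelay is the 5 second floor described in this issue.
	currentMinDelay = 5 * time.Second
	// proposedMinDelay is the 1 second floor this issue proposes.
	proposedMinDelay = 1 * time.Second
)

// validateRescheduleDelay is a hypothetical validation helper. It rejects
// delays below minDelay and returns a non-empty warning when a delay shorter
// than the current floor is used together with unlimited rescheduling.
func validateRescheduleDelay(delay, minDelay time.Duration, unlimited bool) (warning string, err error) {
	if delay < minDelay {
		return "", fmt.Errorf("reschedule delay %v is below the minimum of %v", delay, minDelay)
	}
	if unlimited && delay < currentMinDelay {
		return fmt.Sprintf(
			"reschedule delay %v with unlimited attempts may cause runaway rescheduling", delay), nil
	}
	return "", nil
}

func main() {
	// A 1s delay is rejected under the current floor, accepted under the proposed one.
	_, err := validateRescheduleDelay(1*time.Second, currentMinDelay, true)
	fmt.Println("current floor:", err)

	warning, err := validateRescheduleDelay(1*time.Second, proposedMinDelay, true)
	fmt.Println("proposed floor:", err, "| warning:", warning)
}
```

Returning the warning separately from the error lets the job still be accepted while the risk of unbounded rescheduling is surfaced to the operator.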


Adjust the minimum reschedule delay for failed allocations #28185
