Our runners often deal with very long-running jobs that can take as long as 24h to complete.
For that reason, when deploying a new version it is not reasonable to manually pause the runner and wait for all the jobs to complete before updating and unpausing. Doing so may require keeping the runner paused while a single job takes many hours to complete, preventing other jobs from being processed in the meantime.
Kubernetes provides a way to handle this situation automatically via graceful shutdowns: when a pod is terminated, its containers receive SIGTERM and are only force-killed once terminationGracePeriodSeconds has elapsed. In short, we can set a very long terminationGracePeriodSeconds value and, on receiving SIGTERM from Kubernetes, stop polling for new jobs and exit once the last running job completes.
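To make the intended drain behaviour concrete, here is a minimal sketch using only the Rust standard library. It is not the gitlab-runner-rs API: `run_until_drained` is a hypothetical name, jobs are simulated as sleeps, and the SIGTERM handler is assumed to do nothing more than set the shared `shutdown` flag (here a helper thread sets it instead).

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical runner loop: accepts jobs until `shutdown` is set, then
/// waits for every job that was already started.
/// Returns (jobs started, jobs that finished without panicking).
fn run_until_drained(
    shutdown: Arc<AtomicBool>,
    mut pending: VecDeque<Duration>,
) -> (usize, usize) {
    let mut running: Vec<JoinHandle<()>> = Vec::new();

    // Polling phase: pick up new jobs only while no shutdown was requested.
    while !shutdown.load(Ordering::SeqCst) {
        match pending.pop_front() {
            // A "job" is simulated by sleeping for its duration.
            Some(job_duration) => {
                running.push(thread::spawn(move || thread::sleep(job_duration)));
            }
            // Nothing queued: wait a little before polling again.
            None => thread::sleep(Duration::from_millis(5)),
        }
    }

    // Drain phase: no new jobs are accepted; wait for the running ones.
    let started = running.len();
    let finished = running
        .into_iter()
        .map(|handle| handle.join())
        .filter(|result| result.is_ok())
        .count();
    (started, finished)
}

fn main() {
    let shutdown = Arc::new(AtomicBool::new(false));

    // Stand-in for the SIGTERM handler (assumption): set the flag after 50 ms.
    let flag = Arc::clone(&shutdown);
    let signal = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        flag.store(true, Ordering::SeqCst);
    });

    let jobs = VecDeque::from(vec![Duration::from_millis(100); 2]);
    let (started, finished) = run_until_drained(shutdown, jobs);
    signal.join().unwrap();
    println!("started {started} job(s), drained {finished} before exiting");
}
```

The process exits only after the drain phase returns, so terminationGracePeriodSeconds has to be at least as long as the longest job we are willing to wait for; otherwise Kubernetes will kill the pod mid-job.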
Once all the right APIs and examples are available in gitlab-runner-rs (see collabora/gitlab-runner-rs#129), we need to set a large terminationGracePeriodSeconds by default (while letting people customize it) and wire up the logic to drain jobs on termination, providing seamless, zero-downtime upgrades.
See also collabora/obs-gitlab-runner#125