When a node is evacuated, a user might run something like kubectl taint node <node> piraeus.io/remove-satellite:NoSchedule and then delete the LinstorSatellite resource. The taint ensures that the Satellite is not recreated after evacuation.
However, there is a chance that critical Pods (linstor-csi-node, linstor-satellite, etc.) get removed first. This blocks the evacuation: the DaemonSets try to recreate the Pods, but because of the taint above, the new Pods cannot be scheduled. We end up with a Satellite that should be evacuated but cannot be, because its Pod was removed and cannot be scheduled again.
We should consider updating the Pod tolerations so that in this situation, critical Pods can get recreated.
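As a rough sketch of what such a toleration could look like, assuming the taint key and effect from the command above, and leaving open which Pod templates it should be added to:

```yaml
# Hypothetical fragment for the Pod spec of a critical DaemonSet
# (e.g. linstor-satellite or linstor-csi-node). The key and effect are
# taken from the taint in the command above; placement is an assumption.
tolerations:
  - key: piraeus.io/remove-satellite
    operator: Exists
    effect: NoSchedule
```

With operator: Exists the toleration matches the taint regardless of its value, so a Pod that is deleted during evacuation could still be scheduled onto the tainted node.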