river service resilience to machine failure #1

Description

@smorovic

Improve the robustness of the river service when the Elasticsearch machine is down:
Investigate and, if needed, adjust the default DNS timeout in the river Java service (should be relevant to es-local).
Implement a heartbeat from the river (Java). Exit if the number of failed heartbeat attempts exceeds a configured number of tries. This is a good opportunity to develop against the new Java REST API.
Implement the same functionality in the Python and JavaScript services. Add a river doc entry to enable the new checks by the daemon.
river-daemon: detect stale services and modify them so that they are restarted. Check for and terminate the existing service if a duplicate entry is found (this can happen when Elasticsearch goes down and comes back).
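
A rough sketch of the heartbeat and stale-service checks for the Python service and the daemon might look like the following. All names here (`heartbeat_loop`, `find_stale`, `ping`, `max_tries`, `interval_s`, `stale_after_s`) are hypothetical and not taken from the existing river code. The sketch assumes that "tries" means consecutive failed heartbeats, and that the daemon can read a last-heartbeat timestamp per service from the river doc.

```python
import time
from typing import Callable, Dict, List


def heartbeat_loop(ping: Callable[[], bool], max_tries: int, interval_s: float,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Send heartbeats until max_tries consecutive attempts have failed.

    ping is a caller-supplied function (hypothetical) that returns True when
    the heartbeat reached Elasticsearch. Network errors count as failures.
    Returns the number of consecutive failures that ended the loop.
    """
    failures = 0
    while failures < max_tries:
        try:
            ok = ping()
        except OSError:  # includes ConnectionError and socket timeouts
            ok = False
        # Assumption: a successful heartbeat resets the failure counter.
        failures = 0 if ok else failures + 1
        if failures < max_tries:
            sleep(interval_s)
    return failures


def find_stale(last_seen: Dict[str, float], now: float,
               stale_after_s: float) -> List[str]:
    """Return the names of services whose last heartbeat is too old.

    last_seen maps a service name to the timestamp (in seconds) of its last
    heartbeat, as the daemon might read it from the river doc (assumed layout).
    """
    return sorted(name for name, ts in last_seen.items()
                  if now - ts > stale_after_s)
```

In a service, the caller would typically exit with a non-zero status once `heartbeat_loop` returns, so that the daemon can restart it. The daemon side could call `find_stale` periodically and restart whatever it returns. Injecting `sleep` and `ping` keeps both pieces testable without a running Elasticsearch instance.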
