Increase Kavenegar API status SMS timeout #29

@SalehBorhani

Summary

We need to increase the timeout/retry window for SMS status requests to the Kavenegar API. Currently, the status-checking logic either times out or treats delayed delivery reports as failures, so messages that are eventually delivered can be recorded as failed. Increasing the timeout should reduce these false negatives and improve message status accuracy.

Background

  • Our system uses Kavenegar to send SMS and relies on their delivery status responses.
  • Some carriers and network conditions cause delivery reports to arrive later than our current timeout.
  • This results in messages being marked as failed or triggering unnecessary retries/alerts.

Proposed change

  • Increase the timeout and/or extend the time window we consider for a final delivery status from Kavenegar.
  • Possible options:
    • Increase HTTP request timeout when polling the Kavenegar API (if currently low).
    • Increase the application-level wait period before marking a message as permanently failed (e.g., from X minutes → Y minutes).
    • Add configurable retry/backoff policy for status checks (with sensible defaults).
  • Make the new timeout configurable via environment variable or config file (e.g., KAVENEGAR_STATUS_TIMEOUT_MINUTES / KAVENEGAR_STATUS_MAX_AGE).
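A minimal sketch of how the configurable timeout could be read, assuming the KAVENEGAR_STATUS_TIMEOUT_MINUTES name proposed above. The KAVENEGAR_HTTP_TIMEOUT_SECONDS variable, the StatusConfig class, and both helper functions are hypothetical names introduced only for illustration, and the defaults reuse the example values from "Suggested default values".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Example defaults taken from this issue; adjust per product needs.
DEFAULT_STATUS_TIMEOUT_MINUTES = 30
DEFAULT_HTTP_TIMEOUT_SECONDS = 30


@dataclass(frozen=True)
class StatusConfig:
    status_timeout_minutes: int  # how long to wait for a final delivery status
    http_timeout_seconds: int    # per-request timeout when polling for status


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Read a positive integer from env, falling back to default if unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


def load_status_config(env: Optional[Mapping[str, str]] = None) -> StatusConfig:
    """Build the config from os.environ, or from a mapping passed in by tests."""
    env = os.environ if env is None else env
    return StatusConfig(
        status_timeout_minutes=_positive_int(
            env, "KAVENEGAR_STATUS_TIMEOUT_MINUTES", DEFAULT_STATUS_TIMEOUT_MINUTES
        ),
        # Hypothetical variable name, not proposed anywhere in this issue.
        http_timeout_seconds=_positive_int(
            env, "KAVENEGAR_HTTP_TIMEOUT_SECONDS", DEFAULT_HTTP_TIMEOUT_SECONDS
        ),
    )
```

Accepting the environment as a mapping keeps the config override easy to unit test without touching the real process environment. Rejecting zero or negative values at load time means a bad setting fails at startup instead of silently marking every message as failed.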

Acceptance criteria

  • Messages that receive late delivery reports within the new window are updated to the correct status instead of being marked permanently failed.
  • No significant increase in resource usage from extended polling (if polling frequency changes, include a mitigation plan).
  • Timeout value is configurable and documented.
  • Unit and integration tests added to cover the extended timeout behavior and config override.
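To make the resource-usage criterion concrete, here is a rough sketch of a capped exponential backoff schedule for status checks. The function name and the 15 s base, factor of 2, and 300 s cap are assumptions chosen for illustration, not values from the current code.

```python
from typing import List


def backoff_schedule(window_s: float, base_s: float = 15.0,
                     factor: float = 2.0, cap_s: float = 300.0) -> List[float]:
    """Return the offsets (seconds after send) at which to poll for status.

    Delays grow geometrically from base_s, are capped at cap_s, and stop
    once the next poll would fall outside the finalization window.
    """
    if window_s <= 0 or base_s <= 0 or cap_s <= 0 or factor < 1:
        raise ValueError("window_s, base_s, cap_s must be > 0 and factor >= 1")
    offsets: List[float] = []
    elapsed = 0.0
    delay = min(base_s, cap_s)
    while elapsed + delay <= window_s:
        elapsed += delay
        offsets.append(elapsed)
        delay = min(delay * factor, cap_s)
    return offsets


five_min = backoff_schedule(5 * 60)     # polls at 15, 45, 105, 225 seconds
thirty_min = backoff_schedule(30 * 60)  # same first four polls, then every 300 s
fixed = backoff_schedule(30 * 60, factor=1.0)  # fixed 15 s polling, for comparison
print(len(five_min), len(thirty_min), len(fixed))
```

Under these assumed parameters, extending the window from 5 to 30 minutes raises the number of status checks per message from 4 to 9, whereas fixed 15-second polling over 30 minutes would need 120. The schedule only covers polling; the final failed/keep-waiting decision at window expiry would still happen separately.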

Impact & Risks

  • Pending messages are retained in memory/DB for longer (minimal impact if the window is limited to reasonable values).
  • If the timeout is set too long, alerts and retries may be delayed, so the default should be conservative and configurable.
  • May require coordination with monitoring/alerting to avoid false positives.

Implementation notes

  • Identify where we currently mark SMS as failed due to timeout or where we poll Kavenegar for status.
  • Add a config entry and honor it in that code path.
  • Update documentation and any operational runbooks.
  • Add tests:
    • Unit test for code that decides when to mark a message failed vs. keep waiting.
    • Integration test (mocking Kavenegar) verifying late delivery report within new window updates status.
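The decision that the unit test should cover could look roughly like the sketch below. ProviderStatus is a hypothetical, simplified stand-in for whatever the status check returns (it does not mirror Kavenegar's actual status codes), and decide and Decision are illustrative names, not existing code.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class ProviderStatus(Enum):
    """Hypothetical, simplified result of a status check."""
    DELIVERED = "delivered"
    UNDELIVERED = "undelivered"  # provider reported a final failure
    IN_PROGRESS = "in_progress"  # no final report yet


class Decision(Enum):
    MARK_DELIVERED = "mark_delivered"
    MARK_FAILED = "mark_failed"
    KEEP_WAITING = "keep_waiting"


def decide(sent_at: datetime, now: datetime,
           status: Optional[ProviderStatus], window: timedelta) -> Decision:
    """Decide whether to finalize a message or keep waiting.

    status is None when the status check itself failed (e.g. an HTTP timeout);
    that is treated like "no final report yet" rather than as a failure.
    """
    # A final report from the provider always wins, even if it arrives late.
    if status is ProviderStatus.DELIVERED:
        return Decision.MARK_DELIVERED
    if status is ProviderStatus.UNDELIVERED:
        return Decision.MARK_FAILED
    # No final report: only give up once the finalization window has elapsed.
    if now - sent_at >= window:
        return Decision.MARK_FAILED
    return Decision.KEEP_WAITING


sent = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window = timedelta(minutes=30)
# 12 minutes in with no report: a 5-minute window would already have failed this.
print(decide(sent, sent + timedelta(minutes=12), ProviderStatus.IN_PROGRESS, window))
```

Keeping the rule a pure function of timestamps and status means the unit test can pass in fixed datetimes instead of sleeping or mocking the clock, and the integration test only needs the mocked Kavenegar response to flip from in-progress to delivered inside the window.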

Suggested default values

  • HTTP request timeout: 10s → 30s (if currently lower)
  • Application-level finalization window: 5 minutes → 30 minutes (example — adjust per product needs)


Labels: enhancement
