Sync main: bulk kanban create tool + FD-leak zombie-shutdown fixes #1
Merged
When asked to create several tasks at once, the model reaches for a single bulk call. Without such a tool it hallucinates `kanban_bulk_create_tasks`; the quality monitor rejects the call, and after the 2-correction cap the model often gives up and falsely reports a "board/registry error server-side", so the tasks never get created (observed with the 9B/9C/9D plan).

Provide the real tool. It wraps `kanban_create_task.execute` per item (identical permission/validation), keeps the tasks that succeed when others fail (partial-on-error), and supports intra-batch dependencies via `depends_on_index` (1-based index of an earlier task in the same batch), so that e.g. 9C/9D can depend on 9B before its id exists.
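The per-item wrapping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: `create_one` is a hypothetical stand-in for `kanban_create_task.execute`, and the `depends_on` payload field and the shape of the returned result are made up for the example.

```python
from typing import Any, Callable, Dict, List, Optional


def bulk_create_tasks(
    items: List[Dict[str, Any]],
    create_one: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Create tasks one by one, resolving depends_on_index to real ids.

    `create_one` is a hypothetical stand-in for the single-task tool; it is
    assumed to return a dict with an "id" key or raise on failure.
    """
    created_ids: List[Optional[str]] = []  # one slot per batch position
    results: List[Dict[str, Any]] = []

    for position, item in enumerate(items, start=1):
        payload = {k: v for k, v in item.items() if k != "depends_on_index"}
        dep_index = item.get("depends_on_index")
        error: Optional[str] = None

        if dep_index is not None:
            # 1-based, and it must point at an earlier task in this batch.
            if not isinstance(dep_index, int) or not (1 <= dep_index < position):
                error = "depends_on_index must reference an earlier task"
            elif created_ids[dep_index - 1] is None:
                error = f"dependency at index {dep_index} was not created"
            else:
                # "depends_on" is an assumed field name for the real task id.
                payload["depends_on"] = created_ids[dep_index - 1]

        if error is None:
            try:
                task = create_one(payload)
                created_ids.append(task["id"])
                results.append({"index": position, "ok": True, "id": task["id"]})
                continue
            except Exception as exc:  # partial-on-error: record and move on
                error = str(exc)

        created_ids.append(None)
        results.append({"index": position, "ok": False, "error": error})

    created = sum(1 for r in results if r["ok"])
    return {"created": created, "failed": len(results) - created, "results": results}
```

Because each item goes through the same single-task call, permission and validation behaviour stays identical to creating the tasks one at a time; only the dependency bookkeeping is new.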
Three fixes for the recurring "evonic mati sendiri" (evonic dies on its own) issue, where the FD watchdog SIGTERMs at fd>400 and the process survived as a half-dead zombie:

1. `app.py` `teardown_request`: also close the thread-local SQLite connections for `api_rate_limit` and `rate_limit`. The `before_request` rate-limit check opened a per-thread connection (3 FDs each in WAL mode) on every `/api/*` request; with Flask's thread-per-request model these accumulated until GC (~180 FDs on `api_rate_limit.db` alone). This is the dominant FD-leak source now that the SFTP loop is fixed. Mirrors the existing `db.close()` pattern. Verified flat at 8 handles across 70+ requests (was growing unbounded).
2. `runtime._signal_handler`: arm a daemon hard-exit backstop before `sys.exit`. `sys.exit()` only raises `SystemExit` in the main thread; when SIGTERM lands while the threaded WSGI server blocks in its accept loop, the server swallows `SystemExit`, so the runtime drains but the process keeps serving. systemd still sees it as active, so `Restart=always` never fires. The `os._exit` backstop guarantees the restart after the graceful attempt.
3. `ssh_backend`: resolve the `~` in `_REMOTE_EVONIC_DIR` against the REMOTE `$HOME` instead of `os.path.expanduser` (local HOME). This was the original SFTP permission-denied retry loop that leaked sockets (root-cause fix, previously uncommitted).
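The hard-exit backstop from the second fix can be sketched like this. It is a minimal illustration, assuming a `threading.Timer` is an acceptable way to arm the daemon backstop; the function names and the 10-second grace period are hypothetical and not taken from the actual `runtime` module.

```python
import os
import signal
import sys
import threading

# Assumed grace period before the process is forcibly terminated.
HARD_EXIT_DELAY_SECONDS = 10.0


def arm_hard_exit_backstop(delay=HARD_EXIT_DELAY_SECONDS, exit_fn=os._exit):
    """Start a daemon timer that calls exit_fn(0) once the delay elapses.

    os._exit ends the process immediately without raising SystemExit, so a
    server loop that swallows SystemExit cannot keep the process alive.
    """
    timer = threading.Timer(delay, exit_fn, args=(0,))
    timer.daemon = True  # never blocks a clean interpreter shutdown
    timer.start()
    return timer


def signal_handler(signum, frame):
    # Arm the backstop first, then attempt the graceful path.
    arm_hard_exit_backstop()
    sys.exit(0)


def install_sigterm_handler():
    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, signal_handler)
```

If the graceful `sys.exit(0)` works, the process is gone before the timer fires; if `SystemExit` is swallowed, the timer ends the process anyway so the service manager can restart it.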
…ERM hard-exit backstop + ssh remote-HOME)
Brings the integrated work from local main into the fork:
- feat: kanban_bulk_create_tasks tool (batch task creation)
- fix: FD-leak self-shutdown zombie (rate-limit conn close, SIGTERM hard-exit backstop, ssh remote-HOME resolution)
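The remote-HOME resolution mentioned in the fix can be illustrated with a small helper. This is only a sketch: `resolve_remote_path` is a hypothetical name, and how the remote home directory is obtained (for example by asking the remote shell for `$HOME`) is left out and assumed to be handled elsewhere.

```python
import posixpath


def resolve_remote_path(path: str, remote_home: str) -> str:
    """Expand a leading ~ against the remote home, not the local one.

    os.path.expanduser would substitute the local HOME, which can point at a
    directory the remote user does not own.
    """
    if path == "~":
        return remote_home
    if path.startswith("~/"):
        # Remote paths are treated as POSIX paths regardless of the local OS.
        return posixpath.join(remote_home, path[2:])
    return path
```

For example, `resolve_remote_path("~/.evonic", "/home/deploy")` yields `/home/deploy/.evonic`; both the directory name and the home path here are illustrative values only.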
Summary
Brings the integrated work from local `main` into the fork's `main`. Combines two independent changes (also proposed upstream as anvie#80 and anvie#81):

feat: `kanban_bulk_create_tasks` tool
Create multiple Kanban tasks in one call. Removes the failure where the model hallucinated a non-existent bulk-create tool, got rejected by the quality monitor, and then falsely reported a "board/registry error". Supports partial-on-error and intra-batch dependencies via `depends_on_index`.

fix: FD-leak self-shutdown becoming a zombie
- `teardown_request` now also closes the thread-local `api_rate_limit` / `rate_limit` SQLite connections (3 FDs each in WAL mode). These leaked per request thread (~180 FDs on `api_rate_limit.db`), tripping the FD watchdog. Verified flat at 8 handles across 70+ requests.
- `_signal_handler` arms an `os._exit(0)` daemon backstop before `sys.exit(0)`. `sys.exit` only raises `SystemExit` in the main thread, which the threaded WSGI accept loop swallows, leaving a half-dead process that systemd never restarts.
- `ssh_backend` resolves the `~` in `_REMOTE_EVONIC_DIR` against the remote `$HOME` instead of `os.path.expanduser` (local HOME), fixing the original SFTP permission-denied retry loop that leaked sockets.

Testing
- `tools.json` is valid JSON
- `/api/*` requests, confirmed FD count stays bounded and queue workers/scheduler recover
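For reference, the thread-local connection cleanup behind the `teardown_request` fix might look roughly like the sketch below. It uses only `sqlite3` and `threading.local`; the helper names are hypothetical, and wiring `close_thread_conns` into Flask's `teardown_request` hook is assumed rather than shown.

```python
import sqlite3
import threading

_local = threading.local()  # each request thread gets its own connection map


def get_thread_conn(name: str, path: str) -> sqlite3.Connection:
    """Return this thread's connection for `name`, opening it on first use."""
    conns = getattr(_local, "conns", None)
    if conns is None:
        conns = _local.conns = {}
    if name not in conns:
        conns[name] = sqlite3.connect(path)
    return conns[name]


def close_thread_conns(names=("api_rate_limit", "rate_limit")) -> int:
    """Close and forget this thread's connections; return how many were closed.

    Intended to run at the end of every request so file descriptors are
    released immediately instead of waiting for garbage collection.
    """
    conns = getattr(_local, "conns", {})
    closed = 0
    for name in names:
        conn = conns.pop(name, None)
        if conn is not None:
            conn.close()
            closed += 1
    return closed
```

Calling `close_thread_conns()` a second time in the same thread is a no-op, which keeps the teardown safe for requests that never touched the rate-limit databases.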