What's not done / known gaps #5
Open
ArksherX wants to merge 2 commits into
Summary
Adds per-identity byte-rate limiting to the tunnelled data proxied through the gateway. Each agent identity (extracted from the mTLS client certificate extension) gets its own token bucket rate limiter. If an identity exceeds its configured throughput limit, the copy loop is throttled automatically.
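The per-identity throttling idea can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes a hand-rolled token bucket in place of `governor` and a `Mutex<HashMap>` in place of `DashMap`, so that it needs only the standard library. The names `PerIdentityLimiter` and `reserve` are hypothetical, and the current time is passed in explicitly to keep the behaviour deterministic.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::Duration;

/// Hand-rolled token bucket (stand-in for the `governor` limiter).
struct TokenBucket {
    tokens: f64,    // bytes currently available; negative means "in debt"
    last: Duration, // timestamp of the last refill
}

/// Hypothetical registry holding one bucket per agent identity.
struct PerIdentityLimiter {
    bytes_per_second: f64,
    burst_bytes: f64,
    buckets: Mutex<HashMap<String, TokenBucket>>,
}

impl PerIdentityLimiter {
    fn new(bytes_per_second: u32, burst_bytes: u32) -> Self {
        // Assumption: a rate of 0 means "limiting disabled" and never reaches this type.
        assert!(bytes_per_second > 0, "rate must be positive");
        Self {
            bytes_per_second: f64::from(bytes_per_second),
            burst_bytes: f64::from(burst_bytes),
            buckets: Mutex::new(HashMap::new()),
        }
    }

    /// Charges `n` bytes to `identity` at time `now` and returns how long the
    /// copy loop should pause before sending them (`Duration::ZERO` if none).
    fn reserve(&self, identity: &str, n: u32, now: Duration) -> Duration {
        let mut buckets = self.buckets.lock().unwrap();
        // A new identity starts with a full bucket.
        let bucket = buckets.entry(identity.to_string()).or_insert(TokenBucket {
            tokens: self.burst_bytes,
            last: now,
        });
        // Refill for the elapsed time, capped at the burst size.
        let elapsed = now.saturating_sub(bucket.last).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.bytes_per_second).min(self.burst_bytes);
        bucket.last = now;
        // Spend the bytes; a negative balance is converted into a wait time.
        bucket.tokens -= f64::from(n);
        if bucket.tokens >= 0.0 {
            Duration::ZERO
        } else {
            Duration::from_secs_f64(-bucket.tokens / self.bytes_per_second)
        }
    }
}
```

In a copy loop, the caller would sleep for the returned duration before forwarding each chunk, so one identity's debt never delays another identity's traffic.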
Components of the Gateway (task item 1)
The gateway is composed of the following parts:
- `main.rs`: Entry point. Loads config, sets up TLS, starts the TCP listener, and dispatches connections to `MakeProxyService`.
- `proxy.rs`: Core request handling. Implements the `Service` trait for hyper, extracts the destination from the CONNECT request, calls the policy engine, opens the upstream TCP connection, and spawns the bidirectional tunnel.
- `policy.rs`: Authorization logic. Extracts the agent identity from the custom X.509 certificate extension, queries PostgreSQL to check for a valid signed permission row, and returns Allow or Deny.
- `config.rs`: Typed configuration structs deserialized from `config.toml`.
- `tls.rs`: Sets up the rustls server config for mTLS, requiring and verifying client certificates.
- `rate_limit.rs` (new): Per-identity token bucket rate limiters backed by `governor` and stored in a `DashMap`.

What the feature does
- `[rate_limit]` from `config.toml`: `bytes_per_second` and `burst_bytes`
- `DashMap<String, Arc<RateLimiter>>` keyed by agent identity
- When `bytes_per_second > 0`, the bidirectional copy is wrapped with rate limiter checks per identity

Design choices and tradeoffs
In-memory over persistent storage
State lives in a `DashMap` on the heap. This means limits reset on gateway restart and are not shared across multiple gateway instances. This is the right tradeoff for a single-node SL5 weight enclave deployment: restarts are controlled events, and the SL5 threat model assumes a single-facility enclave. Adding distributed state (e.g. a Postgres counter per identity per time window) would be the correct next step for multi-node deployments, at the cost of a database round trip per data chunk.

Global config, not per-identity config
All identities share the same `bytes_per_second` and `burst_bytes` values from config. Per-identity limits would require either a config map keyed by identity string or a new database column. That is straightforward to add but out of scope for this implementation given the simplicity priority.

What it protects against and what it doesn't
The rate limit addresses sustained bulk exfiltration — an agent continuously streaming large volumes of data will be throttled. It does not address short bursts below the window duration, and it does not address an adversary who controls multiple distinct identities. It is a bandwidth control, not a session control.
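To make the two gaps concrete, the worst case can be bounded with simple arithmetic. The sketch below assumes a standard token bucket that starts full, so one identity can pass at most `burst_bytes + bytes_per_second * window_secs` bytes in a window, and an adversary holding several identities multiplies that bound. The function name and the example figures are hypothetical.

```rust
/// Upper bound on the bytes that `identities` independent token buckets can
/// pass within `window_secs`, assuming each bucket starts full.
fn max_bytes_in_window(
    bytes_per_second: u64,
    burst_bytes: u64,
    window_secs: u64,
    identities: u64,
) -> u64 {
    // Per identity: the initial burst plus whatever refills during the window.
    let per_identity = burst_bytes.saturating_add(bytes_per_second.saturating_mul(window_secs));
    identities.saturating_mul(per_identity)
}
```

For example, with an illustrative 1 MB/s rate and a 10 MB burst, a single identity is bounded at 70 MB over 60 seconds, while five colluding identities are bounded at 350 MB over the same window.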
Crate choice: governor
`governor` provides a well-tested rate limiter with `no_std` support and minimal dependencies; it implements the Generic Cell Rate Algorithm (GCRA), which behaves like a token bucket. `RateLimiter::direct` with a `Quota::per_second` is the simplest correct primitive for bytes-per-second limiting.

Implementation process and tools used
- `spawn_tunnel` in `proxy.rs`)
- `tokio::spawn`, resolving clippy pedantic lints (`needless_pass_by_value`, `clone_on_copy`), and deriving `Copy` on `RateLimitConfig` to satisfy both the borrow checker and clippy simultaneously

What's not done / known gaps
- The rate limit code path is exercised and all 61 tests pass, but a timing-based throughput assertion was not added, to avoid flakiness in CI.
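One possible way to assert throughput without wall-clock sleeps is to drive the limiter from a virtual clock. The sketch below is only a model of a throttled copy loop, not the PR's `spawn_tunnel`: it assumes a token bucket that starts full and chunks no larger than the burst size, and the function name is hypothetical. Applying the same idea to the real code would require the limiter to accept an injectable clock (`governor` appears to ship a `FakeRelativeClock` for this purpose, though wiring it in is untested here).

```rust
/// Simulates a throttled copy loop on a virtual clock and returns the
/// virtual seconds needed to pass `total_bytes` in chunks of `chunk` bytes.
fn simulated_transfer_secs(
    total_bytes: u64,
    chunk: u64,
    bytes_per_second: u64,
    burst_bytes: u64,
) -> f64 {
    assert!(bytes_per_second > 0, "rate must be positive");
    assert!(chunk > 0 && chunk <= burst_bytes, "chunk must fit in the bucket");
    let rate = bytes_per_second as f64;
    let mut tokens = burst_bytes as f64; // bucket starts full
    let mut clock = 0.0_f64; // virtual seconds elapsed
    let mut sent = 0_u64;
    while sent < total_bytes {
        let n = chunk.min(total_bytes - sent);
        let need = n as f64;
        if tokens < need {
            // Advance the virtual clock until exactly `need` tokens are available.
            clock += (need - tokens) / rate;
            tokens = need;
        }
        tokens -= need;
        sent += n;
    }
    clock
}
```

Because no real time passes, an assertion such as "10,000 bytes at 1,000 B/s with a 2,000 byte burst takes 8 virtual seconds" is exact and cannot flake under CI load.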