diff --git a/src/content/docs/explanations/platform/data-flow.mdx b/src/content/docs/explanations/platform/data-flow.mdx new file mode 100644 index 000000000..851fd3c49 --- /dev/null +++ b/src/content/docs/explanations/platform/data-flow.mdx @@ -0,0 +1,101 @@ +--- +title: Platform Data Flow +--- + +This page explains what data flows between a +[Tenzir Node](/explanations/node) and the +[Tenzir Platform](/explanations/platform), so you can reason about +privacy, compliance, and what an operator of the Tenzir Platform can see. + +When a Tenzir Node starts, it opens a single outbound, TLS-encrypted +WebSocket connection to the Tenzir Platform. All communication between +the two travels over that connection. + +For the bigger picture on TLS termination and trust boundaries, see the +[FAQ entry on data privacy](/explanations/faqs#can-tenzir-see-my-data). + +## The connection is a tunnel, not a feed + +The Tenzir Node does not push state to the Tenzir Platform. It opens the +WebSocket, authenticates, and then waits. The Tenzir Platform — and +through it, the app — uses this connection as a reverse tunnel to call +the [node API](/reference/node/api) on demand. + +In other words: whatever the app shows you about a Tenzir Node was +pulled from that node at the moment you asked for it. The Tenzir +Platform does not maintain a continuously updated mirror of the +Tenzir Node's state. + +The Tenzir Node initiates the connection — the Tenzir Platform never +connects to it. For nodes registered with the hosted Tenzir Platform, +the endpoint defaults to `wss://ws.tenzir.app:443/production`. See +node-setup/configure-tls for the TLS options. + +## What the app pulls on demand + +Anything you can retrieve via the [node API](/reference/node/api), the +Tenzir Platform can retrieve on behalf of an authenticated user in your +workspace. In practice this means: + +- The list of **pipelines** on the Tenzir Node, their TQL source, + labels, and lifecycle state. +- **Pipeline metrics** such as ingress/egress event and byte counters. +- **Diagnostics** (warnings and errors) emitted by pipelines. +- **Contexts** — their names, types, and, if you open them in the app, + their contents. +- **Packages** installed on the Tenzir Node and their configuration. +- **Schemas** the Tenzir Node has seen. + +These are fetched when a page or component in the app needs them. Close +the page and the requests stop. + +## Pipeline data does not flow through the Tenzir Platform + +Pipelines run entirely on the Tenzir Node. The events flowing through a +pipeline's operators are **not** routed through the Tenzir Platform, +regardless of where the pipeline was created. + +The single exception is a pipeline you run interactively from the app +(for example, in the Explorer). In that case the results stream +Tenzir Node → Tenzir Platform → browser for as long as the view is +open, so the app can render them. They are not persisted on the +Tenzir Platform. + +## Secrets + +When a pipeline accesses a secret, the Tenzir Node requests its value +from the Tenzir Platform over the same encrypted channel, and the +Tenzir Platform replies with the resolved value. See +secrets. + +## What is *not* sent + +The Tenzir Platform does **not** receive: + +- The **events stored in the Tenzir Node** (Parquet partitions, the + catalog, or query results) — unless you explicitly query them from + the app. +- The **raw bytes** ingested from your sources. +- **Pipeline payloads** for pipelines that aren't being viewed live in + the app. +- **Configuration files**, environment variables, or secret values + defined locally on the Tenzir Node. +- **Operating system logs** or anything outside the `tenzir-node` + process. + +## Encryption and trust + +The WebSocket is TLS-encrypted and terminates at the Tenzir Platform. +In the editions where Tenzir hosts the Tenzir Platform, Tenzir is +therefore technically able to observe data that traverses it during +interactive use. If this is unacceptable, host the Tenzir Platform +yourself with the [Sovereign Edition](https://tenzir.com/pricing) — see +platform-setup. + +## See Also + +- node-setup/configure-tls +- platform +- node +- secrets +- faqs diff --git a/src/sidebar.ts b/src/sidebar.ts index b23babb63..b6f49e7d3 100644 --- a/src/sidebar.ts +++ b/src/sidebar.ts @@ -260,6 +260,7 @@ export const explanations = [ "explanations/pipeline", "explanations/node", "explanations/platform", + "explanations/platform/data-flow", "explanations/language", ], "architecture",