Design idea: pq – a general-purpose Rust-native jq pipeline processor #3574

@phlax

Motivation

We frequently use jq (and sometimes yq) across workflows, from ad-hoc Bash filters to Bazel rules to GitHub Actions. jq itself is robust, but yq is a pain to support, and jq module portability (currently tied to GH Actions) is sorely lacking. Existing tools like aspect's jq/yq rules are brittle and hard to maintain hermetically.

Proposal: pq

A new tool, pq ("pipeline queries"), implemented as a Rust-native pipeline processor. The core ideas are:

  • Use jq syntax and its core semantics (via jaq and/or by shelling out to the jq binary)
  • Hermetic: single Rust binary, integrates cleanly with Bazel, CI, or CLI
  • Modular: borrow jq's module system, but stateless and portable
  • Pipeline-first: allow staged processing where each step's output can feed subsequent steps (json/yaml/toml to json, process, exec, branch, post-process, etc)
  • Inputs: paths, env vars, URLs, stdin, literals
  • State: expose pipeline state efficiently to steps
  • Multi-format inputs: parse yaml, json, toml natively, minimize format glue
  • Optional strict jq compatibility: find the jq binary and shell out to it for obscure/exotic cases
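
To make the "Inputs" and "Multi-format inputs" bullets a bit more concrete, here is a rough sketch of how an input declaration could be modelled. Everything in it is an assumption for discussion, not a settled design: the `Format`, `Source` and `Input` types, the `resolve` and `classify` helpers, and the classification rules (a leading `$` is looked up in the environment first, `-` means stdin, `http(s)://` means URL, a value starting with `{`, `[` or `"` is a literal, anything else is a path). The environment is passed in as a map so the logic can be exercised without touching the real process environment.

```rust
use std::collections::HashMap;

/// Formats the proposal wants to parse natively.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Json,
    Yaml,
    Toml,
}

/// Where an input's content comes from (hypothetical model).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Source {
    Path(String),
    Url(String),
    Stdin,
    Literal(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Input {
    pub format: Format,
    pub source: Source,
}

pub fn parse_format(name: &str) -> Option<Format> {
    match name {
        "json" => Some(Format::Json),
        "yaml" | "yml" => Some(Format::Yaml),
        "toml" => Some(Format::Toml),
        _ => None,
    }
}

/// Classify an already-expanded value. These rules are assumptions, not a spec.
pub fn classify(value: &str) -> Source {
    let v = value.trim();
    if v == "-" {
        Source::Stdin
    } else if v.starts_with("http://") || v.starts_with("https://") {
        Source::Url(v.to_string())
    } else if matches!(v.chars().next(), Some('{') | Some('[') | Some('"')) {
        Source::Literal(v.to_string())
    } else {
        Source::Path(v.to_string())
    }
}

/// Resolve one `{format: value}` declaration, expanding a leading `$NAME`
/// from the supplied environment map before classifying the result.
pub fn resolve(
    format_name: &str,
    raw: &str,
    env: &HashMap<String, String>,
) -> Result<Input, String> {
    let format = parse_format(format_name)
        .ok_or_else(|| format!("unknown format: {}", format_name))?;
    let value = match raw.trim().strip_prefix('$') {
        Some(name) => env
            .get(name)
            .cloned()
            .ok_or_else(|| format!("unset variable: {}", name))?,
        None => raw.to_string(),
    };
    Ok(Input { format, source: classify(&value) })
}
```

Under these assumptions, `resolve("yaml", "$POLICY_PATH", &env)` with `POLICY_PATH=conf/policy.yaml` in `env` would yield a YAML input backed by `Source::Path("conf/policy.yaml")`. Whether env vars should be an indirection like this or a first-class source is still open.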

Example Pipeline (YAML)

inputs:
  manifest: {yaml: path/to/deployment.yaml}
  policy:   {yaml: $POLICY_PATH}
pipeline:
  - jq:      '. as $m | $policy | .items[] | select(.enabled)'
    inputs:  [manifest, policy]
    outputs:  filtered, ...
  - exec:    ["python", "-c", "import sys; print(sys.stdin.read().upper())"]
    inputs:   filtered, ...
    outputs:  transformed, ...
  - jq:      '{result: ., timestamp: now}'
    inputs:  transformed, ...
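
A strictly serial runner for a pipeline like the one above could look roughly like the following. This is a minimal sketch under several assumptions: slot values are plain strings, each step writes exactly one output slot, `exec` steps receive their inputs joined by newlines on stdin, and jq evaluation sits behind a hypothetical `JqBackend` trait. No real jaq API is used here; `IdentityOnly` is a toy backend that understands only the identity filter `.` and exists purely so the runner can be exercised.

```rust
use std::collections::HashMap;
use std::io::Write;
use std::process::{Command, Stdio};

/// Pluggable jq evaluator (hypothetical; jaq or a jq subprocess could sit behind it).
pub trait JqBackend {
    fn eval(&self, program: &str, inputs: &[&str]) -> Result<String, String>;
}

/// Toy backend: supports only the identity filter `.` on the first input.
pub struct IdentityOnly;

impl JqBackend for IdentityOnly {
    fn eval(&self, program: &str, inputs: &[&str]) -> Result<String, String> {
        match (program.trim(), inputs.first()) {
            (".", Some(first)) => Ok(first.to_string()),
            (".", None) => Err("jq: no input".to_string()),
            (other, _) => Err(format!("jq: unsupported program '{}'", other)),
        }
    }
}

pub enum Action {
    Jq(String),
    Exec(Vec<String>),
}

pub struct Step {
    pub inputs: Vec<String>,
    pub output: String,
    pub action: Action,
}

/// Run a command, feed `stdin_text` to it, and capture its stdout.
/// Large payloads would need concurrent read/write to avoid pipe deadlocks.
fn run_exec(argv: &[String], stdin_text: &str) -> Result<String, String> {
    let (program, args) = argv.split_first().ok_or("exec: empty argv")?;
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .map_err(|e| format!("exec: {}", e))?;
    // The temporary stdin handle is dropped after this statement, signalling EOF.
    child
        .stdin
        .take()
        .ok_or("exec: stdin unavailable")?
        .write_all(stdin_text.as_bytes())
        .map_err(|e| format!("exec: {}", e))?;
    let output = child.wait_with_output().map_err(|e| format!("exec: {}", e))?;
    if !output.status.success() {
        return Err(format!("exec: {} exited with {}", program, output.status));
    }
    String::from_utf8(output.stdout).map_err(|e| format!("exec: {}", e))
}

/// Execute steps in order; each step reads named slots and writes one slot.
pub fn run(
    steps: &[Step],
    mut slots: HashMap<String, String>,
    backend: &dyn JqBackend,
) -> Result<HashMap<String, String>, String> {
    for (index, step) in steps.iter().enumerate() {
        let result = {
            let mut args: Vec<&str> = Vec::new();
            for name in &step.inputs {
                let value = slots
                    .get(name)
                    .ok_or_else(|| format!("step {}: unknown input '{}'", index, name))?;
                args.push(value.as_str());
            }
            match &step.action {
                Action::Jq(program) => backend.eval(program, &args)?,
                Action::Exec(argv) => run_exec(argv, &args.join("\n"))?,
            }
        };
        slots.insert(step.output.clone(), result);
    }
    Ok(slots)
}
```

Keeping the backend behind a trait is one way the default jaq path and a strict jq pass-through could share the same runner; the trait shape itself is only a placeholder.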

Design Notes

  • Default backend: jaq (pure Rust, no FFI)
  • "Strict" mode: pass args directly to the jq binary for rare cases needing 100% compatibility
  • Designed for use in:
    • Bazel (hermetic, no Python/Go nonsense)
    • CI jobs (no need for GH-only Actions magic)
    • One-off CLI/data-wrangling
  • Everything keeps working exactly the same on a dev's laptop, CI machines, and remote runners
  • jq modules: allow portable module import path (project-local, per-user, per-step)
  • Steers clear of replicating every jq CLI quirk: the API stays clean, and jq-mode is always there as a fallback
  • Boring, reliable, documented, fast
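
For the portable module import path, one possible shape is an ordered list of search roots per scope. The sketch below is an assumption throughout: the `ModulePaths` type, the precedence (per-step first, then project-local, then per-user), and the rule that a module named `foo` maps to `foo.jq` under a root. jq itself also tries other layouts such as `foo/foo.jq`, which this sketch deliberately leaves out. Names that could escape a search root (`..`, absolute paths) are refused.

```rust
use std::path::{Component, Path, PathBuf};

/// Search roots grouped by scope (hypothetical structure).
pub struct ModulePaths {
    pub per_step: Vec<PathBuf>,
    pub project: Vec<PathBuf>,
    pub per_user: Vec<PathBuf>,
}

impl ModulePaths {
    /// Return the first `<root>/<module>.jq` that exists, searching
    /// per-step roots, then project roots, then per-user roots.
    pub fn resolve(&self, module: &str) -> Option<PathBuf> {
        let relative = Path::new(module);
        // Only plain name components are allowed, so "../x" or "/x" are rejected.
        let is_plain = relative
            .components()
            .all(|c| matches!(c, Component::Normal(_)));
        if module.is_empty() || !is_plain {
            return None;
        }
        self.per_step
            .iter()
            .chain(&self.project)
            .chain(&self.per_user)
            .map(|root| root.join(format!("{}.jq", module)))
            .find(|candidate| candidate.is_file())
    }
}
```

With a fixed, explicit order like this, the same module name resolves identically on a laptop, in CI, and on a remote runner as long as the configured roots match, which is the portability property the note is after.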

Open Questions

  • Should the pipeline spec allow for conditional/branching/validation features?
  • How should jq args and the module search path be handled: mirror jq's own environment behaviour, or use a simpler system (env vars, CLI flags, project config)?
  • What's the best ergonomic structure for importing modules per-pipeline or per-step?
  • Should pipelines explicitly support parallel/concurrent steps or be strictly serial? (parallel/async!!!)
  • Should the module system allow for sharing reusable jq snippets between unrelated pipelines?
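
On the serial vs parallel question, one option is to keep the spec serial-looking and derive concurrency from the declared inputs and outputs. The sketch below groups steps into "waves": every step in a wave has all of its inputs available, so the steps of one wave could in principle run concurrently. The `StepIo` type and `waves` function are hypothetical, and the sketch assumes each slot is written by exactly one step. It only plans the waves; it does not execute anything.

```rust
use std::collections::HashSet;

/// Just the data-flow part of a step (hypothetical).
pub struct StepIo {
    pub inputs: Vec<String>,
    pub outputs: Vec<String>,
}

/// Group step indices into waves. Steps in the same wave do not depend
/// on each other. Returns an error if some step's inputs can never be met.
pub fn waves(steps: &[StepIo], initial: &[&str]) -> Result<Vec<Vec<usize>>, String> {
    let mut available: HashSet<&str> = initial.iter().copied().collect();
    let mut pending: Vec<usize> = (0..steps.len()).collect();
    let mut plan = Vec::new();

    while !pending.is_empty() {
        // Split pending steps into those runnable now and those still blocked.
        let (ready, blocked): (Vec<usize>, Vec<usize>) = pending
            .iter()
            .copied()
            .partition(|&i| {
                steps[i]
                    .inputs
                    .iter()
                    .all(|name| available.contains(name.as_str()))
            });
        if ready.is_empty() {
            return Err(format!("steps {:?} have unsatisfiable inputs", blocked));
        }
        // Outputs become visible only to later waves.
        for &i in &ready {
            for out in &steps[i].outputs {
                available.insert(out.as_str());
            }
        }
        plan.push(ready);
        pending = blocked;
    }
    Ok(plan)
}
```

For the three-step example pipeline this degenerates to one step per wave, so a serial runner and a wave-based runner would behave the same there; the difference only shows up when two steps read disjoint, already-available inputs.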

Next Steps

  • Sketch the high-level code structure and draft the initial CLI UX
  • Work out minimal viable pipeline runner
  • Integrate jaq with input normalization and output marshalling
  • (Optional) Implement native/jq/compat backend with strict pass-through mode
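
As a starting point for the strict pass-through backend, the selection logic could be as small as the sketch below. The `Backend` enum and the `find_in_path` and `select_backend` functions are hypothetical names. The search string is passed in explicitly (callers would typically hand over the value of `PATH`) so the lookup stays testable and hermetic. The sketch only checks that a file named `jq` exists; it does not check the executable bit and does not handle `jq.exe` on Windows.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Which engine evaluates jq programs (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// Default: in-process jaq.
    Jaq,
    /// Strict mode: shell out to this jq binary.
    JqBinary(PathBuf),
}

/// Look for `program` in each directory of a PATH-style string.
pub fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

/// Pick a backend; strict mode fails loudly instead of silently
/// falling back to jaq when no jq binary can be found.
pub fn select_backend(strict: bool, path_var: &OsStr) -> Result<Backend, String> {
    if !strict {
        return Ok(Backend::Jaq);
    }
    find_in_path("jq", path_var)
        .map(Backend::JqBinary)
        .ok_or_else(|| "strict mode requested but no jq binary found".to_string())
}
```

Failing instead of falling back is a design choice made here for predictability: a pipeline that asked for 100% jq compatibility should not quietly run under different semantics.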

Labels: enhancement, rust
