BioFlow is an Apache 2.0 open-core workflow OS for life-science software stacks: connectors + deterministic orchestration + reproducible provenance.
This repo currently contains a local runner that simulates workflows end-to-end (schemas, hashing, audit chain, artifacts) so you can iterate on contracts and UX before wiring real external services.
Positioning, boundary, and Apache 2.0 core license intent: see docs/OPEN_CORE.md.
npm install
npm run check
npm run check:ci

Run the CI-enforced smoke and docs command checks locally:
npm run demo:killer:ci
npm run test -- test/docs-commands.test.ts

Create a sample workflow:
npm run bioflow -- init ./example.workflow.yaml

Validate and run it with synthetic inputs:
npm run bioflow -- validate ./example.workflow.yaml
npm run bioflow -- run ./example.workflow.yaml --input ./README.md

Or use the committed example:
npm run bioflow -- run ./examples/example.workflow.yaml --input ./README.md

Verify a run (audit chain + CAS integrity + deterministic replay):
npm run bioflow -- verify <runId>

Generate a deterministic GxP-style validation report (Markdown, content-addressed via CAS):
npm run bioflow -- report <runId> --format markdown --out ./reports

Note: new runs use UUID run IDs for compatibility with remote sync.
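
As an illustration of what an audit-chain check like the one `verify` performs might involve, here is a minimal sketch of a hash-chained log. The entry shape (`event`, `prevHash`, `hash`), the genesis value, and the helper names are all hypothetical and are not taken from BioFlow's `execution.json` schema.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real execution.json schema may differ.
interface AuditEntry {
  event: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first one
  hash: string; // SHA-256 over prevHash + event
}

// Assumed sentinel for the first entry in a chain.
const GENESIS = "0".repeat(64);

function entryHash(prevHash: string, event: string): string {
  return createHash("sha256").update(`${prevHash}\n${event}`).digest("hex");
}

// Return a new chain with one more entry linked to the current tail.
function appendEntry(chain: AuditEntry[], event: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { event, prevHash, hash: entryHash(prevHash, event) }];
}

// Walk the chain and recompute every link; any edit breaks a later check.
function verifyChain(chain: AuditEntry[]): boolean {
  let expectedPrev = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== expectedPrev) return false;
    if (entry.hash !== entryHash(entry.prevHash, entry.event)) return false;
    expectedPrev = entry.hash;
  }
  return true;
}
```

Because each hash covers the previous one, changing any recorded event invalidates that entry and every entry after it, which is what makes such a log tamper-evident.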
Run one command to execute the canonical deterministic path (normalize sample sheet -> score -> aggregate -> simulated ELN writeback), then verify and emit a deterministic Markdown report:
npm run demo:killer

Reference method and artifact contract: docs/KILLER_EXAMPLE.md.
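
One common way to make outputs comparable across replays is to hash a canonical serialization. The sketch below is a generic technique, not BioFlow's documented method: `canonicalJson` and `outputDigest` are hypothetical helpers that sort object keys before hashing, so two logically equal results yield the same digest.

```typescript
import { createHash } from "node:crypto";

// Serialize JSON-like data with object keys in sorted order.
// Assumes the input contains only JSON-compatible values.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalJson(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Digest of the canonical form; equal data gives an equal digest.
function outputDigest(value: unknown): string {
  return createHash("sha256").update(canonicalJson(value)).digest("hex");
}
```

A replay check can then run the same step twice on the same input and compare `outputDigest` of both results.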
Manual path using committed assets:
npm run bioflow -- validate ./examples/killer.workflow.yaml
npm run bioflow -- run ./examples/killer.workflow.yaml --input ./examples/synthetic/sample-sheet.messy.csv
npm run bioflow -- verify <runId>
npm run bioflow -- report <runId> --format markdown --out ./reports

Normalize messy sample metadata into a deterministic, verifiable bundle:
npm run bioflow -- tidy ./SampleSheet.csv --profile sample-sheet-v1 --out ./cleaned
# then verify locally
npm run bioflow -- verify <runId>

Outputs include a cleaned CSV, a data-only CSV, an ID mapping JSON, and a report JSON (all content-addressed under .bioflow/objects/...).
Artifacts and execution records are written under .bioflow/:
- .bioflow/objects/<shard>/<hash>: content-addressed bytes (SHA-256)
- .bioflow/runs/<runId>/manifest.json: logical names → sha256:<hash>
- .bioflow/runs/<runId>/execution.json: execution + audit chain + outputs
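
The layout above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the shard is assumed to be the first two hex characters of the hash, which this README does not confirm.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: compute a "sha256:<hash>" reference for some content.
function casRef(content: string | Uint8Array): string {
  const hex = createHash("sha256").update(content).digest("hex");
  return `sha256:${hex}`;
}

// Hypothetical helper: map a reference to its object path.
// ASSUMPTION: <shard> is the first two hex characters of the hash.
function objectPath(ref: string, root: string = ".bioflow"): string {
  const hex = ref.replace(/^sha256:/, "");
  return `${root}/objects/${hex.slice(0, 2)}/${hex}`;
}

// Build a manifest of logical names to references, in sorted name order
// so the manifest is the same regardless of insertion order.
function buildManifest(files: Record<string, string | Uint8Array>): Record<string, string> {
  const manifest: Record<string, string> = {};
  const entries = Object.entries(files).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  for (const [name, content] of entries) {
    manifest[name] = casRef(content);
  }
  return manifest;
}
```

Identical bytes always map to the same path, so storing the same artifact twice is a no-op and an integrity check only has to re-hash the stored bytes and compare against the name.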
With a compatible BioFlow Cloud endpoint, you can push/pull a completed local run (CAS objects + manifest + execution record):
export BIOFLOW_REMOTE_URL=http://localhost:8080
export BIOFLOW_REMOTE_ORG_ID=00000000-0000-0000-0000-000000000000
npm run bioflow -- push <runId>
npm run bioflow -- pull <runId>
npm run bioflow -- verify-remote <runId> --deep
npm run bioflow -- ls-remote --limit 20

If the API is configured with JWT auth, set BIOFLOW_REMOTE_TOKEN instead of BIOFLOW_REMOTE_ORG_ID.
If the API is configured with API key auth, set BIOFLOW_REMOTE_TOKEN to your bf_live_... key.
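
The choice between these variables could be modeled roughly as below. This is a sketch only: `resolveRemoteAuth` and the `RemoteAuth` type are hypothetical, and preferring the token when both variables are set is an assumption rather than documented CLI behavior.

```typescript
// Hypothetical result type describing which credential will be used.
type RemoteAuth =
  | { kind: "token"; remoteUrl: string; token: string }
  | { kind: "org"; remoteUrl: string; orgId: string };

// Pick credentials from an environment-like map.
// ASSUMPTION: BIOFLOW_REMOTE_TOKEN wins if both variables are present.
function resolveRemoteAuth(env: Record<string, string | undefined>): RemoteAuth {
  const remoteUrl = env["BIOFLOW_REMOTE_URL"];
  if (!remoteUrl) {
    throw new Error("BIOFLOW_REMOTE_URL is not set");
  }
  const token = env["BIOFLOW_REMOTE_TOKEN"];
  if (token) {
    return { kind: "token", remoteUrl, token };
  }
  const orgId = env["BIOFLOW_REMOTE_ORG_ID"];
  if (orgId) {
    return { kind: "org", remoteUrl, orgId };
  }
  throw new Error("Set BIOFLOW_REMOTE_TOKEN or BIOFLOW_REMOTE_ORG_ID");
}
```

In a Node script this could be called as `resolveRemoteAuth(process.env)`; failing early on missing variables avoids sending an unauthenticated request.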
npm run bioflow -- share <runId> --visibility org
npm run bioflow -- profiles put core-default ./profile.json
npm run bioflow -- profiles ls
npm run bioflow -- tidy ./SampleSheet.csv --profile core-default

Run the MCP server over stdio:
npm run mcp
# or
npm run bioflow -- mcp
# or, after install
bioflow-mcp

It exposes repository summary, repo search, docs/source resources, run manifest/execution/report resources, remote durable sync bundles when the session is bound to a remote org, the chained MCP session audit resource, remote run/profile sync tools, and local workflow validate/run/verify/report tools.
Set BIOFLOW_MCP_REPO_ROOT if you want to point it at a different checkout.
Remote tools use the same BIOFLOW_REMOTE_* defaults and saved CLI config as the main CLI, and each call can override remoteUrl, token, and orgId.
You can also bind the whole stdio session once by sending bioflow.remoteUrl, bioflow.token, and bioflow.orgId in the MCP initialize request.
See docs/MCP.md for the current scope and docs/MCP_CLIENTS.md for launch/auth examples.
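
The layering of remote settings described above (shared defaults, an optional session binding, and per-call overrides) can be pictured as a simple merge. The sketch below is hypothetical: `mergeRemoteSettings` is not a BioFlow API, and placing the session binding between defaults and per-call overrides is an assumption.

```typescript
// The three settings each layer may supply.
interface RemoteSettings {
  remoteUrl?: string;
  token?: string;
  orgId?: string;
}

// Merge layers from lowest to highest priority; later layers win,
// but only for the fields they actually define.
function mergeRemoteSettings(...layers: RemoteSettings[]): RemoteSettings {
  const merged: RemoteSettings = {};
  for (const layer of layers) {
    if (layer.remoteUrl !== undefined) merged.remoteUrl = layer.remoteUrl;
    if (layer.token !== undefined) merged.token = layer.token;
    if (layer.orgId !== undefined) merged.orgId = layer.orgId;
  }
  return merged;
}
```

For example, `mergeRemoteSettings(defaults, sessionBinding, callOverrides)` would keep the default URL while letting a single call swap in a different `orgId`.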
- Repo norms live in instructions.md.
- All examples and runtime behavior are simulation-first (deterministic, synthetic artifacts). Replace simulated connectors with real ones when you’re ready.
- Hosted control-plane docs and deployment notes live in OmnisGenomics/BioFlow-Cloud.
See CONTRIBUTING.md for the public-core contribution workflow and docs/CLA.md for the inbound license policy.