Add real stdin source by simonsmallchua · Pull Request #27 · Good-Native/paperbark

simonsmallchua · 2026-05-09T16:37:22Z

Summary

Replaces the stdin source stub with a real implementation that yields lines from sys.stdin (or an injected test stream). Supports format / format_keys with the same JSON-only conflict rule as flyctl and file. Intended for piping pre-captured logs into a one-shot paperbark monitor run; analyse and search read existing run artefacts and never consume stdin.
A piped stdin is single-use: the first iteration drains it, subsequent iterations yield nothing rather than re-raising. There is intentionally no encoding knob in v0.2 — use the file source if byte-level robustness is needed.
Drops the third remaining v0.2 stub (wrangler / kubectl / cloudwatch still pending).
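
The single-use behaviour described above can be illustrated with a minimal sketch. `StdinSourceSketch` is a hypothetical stand-in, not paperbark's actual `StdinSource`; stripping the trailing newline and resolving `sys.stdin` at capture time are assumptions made for the example.

```python
import io
import sys
from collections.abc import Iterator
from typing import Optional, TextIO


class StdinSourceSketch:
    """Hypothetical stand-in for a stdin source; not the real paperbark class."""

    def __init__(self, stream: Optional[TextIO] = None) -> None:
        # None means "use sys.stdin", looked up lazily so tests can inject a stream.
        self._stream = stream

    def capture(self, since: object = None) -> Iterator[str]:
        # `since` is accepted but ignored: a pipe has no history to seek into.
        stream = self._stream if self._stream is not None else sys.stdin
        for line in stream:
            # Assumption: callers want lines without the trailing newline.
            yield line.rstrip("\n")


# Inject an in-memory stream in place of a real pipe.
source = StdinSourceSketch(stream=io.StringIO("first\nsecond\n"))
first_pass = list(source.capture())   # drains the stream
second_pass = list(source.capture())  # stream is at EOF, so nothing is yielded
```

Because the stream is simply iterated until EOF, a second `capture()` finds nothing left to read and returns an empty iterator instead of raising.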

Drive-by

README's Plain files row was still labelled "stub (interface only, post-v1)" even though the file source landed in v0.1.4; fixed in the same source table.
docs/ROADMAP.md "Last verified" bumped to today.
uv.lock had paperbark 0.1.3 pinned despite three subsequent releases; uv sync reconciled it. The auto-release workflow doesn't refresh uv.lock when bumping pyproject.toml — worth a follow-up to fix at source rather than carrying the drift.

CHANGELOG.md carries an Added entry under [Unreleased] so the changelog-check gate is satisfied; this is a patch bump.

Test plan

uv run pytest -q — 419 passed (was 407, +12).
uv run ruff check . and uv run ruff format --check . — clean.
uv run mypy src/paperbark tests — clean.
Live smoke: printf '<JSON line with leading TS>\n' | paperbark monitor --iterations 1 writes a canonical run dir, captures the line, renders analysis.md with the severity rollup, exits 0.
Live smoke with format = "apache-combined" and Apache combined log lines: both lines captured via the format-aware cursor; HTTP status probe correctly identified the 200 / 500.
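
For readers unfamiliar with the Apache combined log format, the status extraction exercised by the second smoke test can be approximated as follows. This is a sketch only: the regex and `status_of` are illustrative assumptions, not paperbark's `apache-combined` preset, and the two sample lines are made up for the example rather than taken from the smoke run.

```python
import re
from typing import Optional

# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def status_of(line: str) -> Optional[int]:
    """Return the HTTP status of a combined-format line, or None if it does not match."""
    match = COMBINED.match(line)
    return int(match["status"]) if match else None


# Made-up sample lines for illustration.
lines = [
    '127.0.0.1 - - [09/May/2026:16:37:22 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"',
    '127.0.0.1 - - [09/May/2026:16:37:23 +0000] "GET /boom HTTP/1.1" 500 0 "-" "curl/8.0"',
]
statuses = [status_of(line) for line in lines]
```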

Summary by CodeRabbit

New Features
- stdin source now reads piped log lines, supports format patterns and JSON key-mapping, and is single-use per run (subsequent reads yield nothing).
Documentation
- Updated changelog, README and docs with quickstart, examples, supported options, one-shot behaviour and intentional lack of encoding option.
Tests
- Added focused tests for stdin behaviour, option validation and one-shot semantics.

coderabbitai · 2026-05-09T16:37:32Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ae3c85fd-eb33-4be3-b609-810694c85c3e

📥 Commits

Reviewing files that changed from the base of the PR and between 79d3b2f and 8157918.

📒 Files selected for processing (4)

CHANGELOG.md
docs/CONFIG.md
docs/SOURCES.md
src/paperbark/sources/stdin.py

📝 Walkthrough

Walkthrough

This pull request implements a real stdin source that replaces the previous stub implementation. The StdinSource class now reads lines from sys.stdin with test injection support, validates configuration options (format and format_keys) through the dispatcher, and includes comprehensive documentation and test coverage for the feature.

Changes

Real stdin Source Implementation

| Layer / File(s) | Summary |
| --- | --- |
| Core stdin Implementation `src/paperbark/sources/stdin.py` | StdinSource replaces `NotImplementedError` with a working implementation that yields lines from `sys.stdin` or an injected stream; adds `__init__` accepting optional `format_keys`, `line_format`, and `stream` parameters; `capture()` discards `since` and lazily yields lines until EOF. |
| Dispatcher Configuration `src/paperbark/dispatcher.py` | `build_source()` for stdin now validates allowed options (`format`, `format_keys`), resolves preset formats via registry, rejects `format` combined with `format_keys`, and constructs `StdinSource` with derived `line_format` and `format_keys` instead of an unconfigured instance. |
| Source Tests `tests/test_sources.py` | New test section covers StdinSource with injected stream yielding, empty stream handling, sys.stdin default, and draining behaviour (second capture after EOF yields nothing); removes StdinSource from stub test coverage. |
| Dispatcher Tests `tests/test_dispatcher.py` | Added stdin-specific tests for default construction, unknown-option rejection, format preset wiring, format_keys parsing, and format+format_keys conflict validation; removes stdin from generic stub test. |
| Documentation `CHANGELOG.md`, `README.md`, `docs/CONFIG.md`, `docs/SOURCES.md`, `docs/ROADMAP.md` | Updated status tables and CHANGELOG entry to mark stdin as implemented; added dedicated stdin configuration section in CONFIG.md and reference section in SOURCES.md; reorganised stub source sections; updated ROADMAP milestone and v0.2 status tracking. |
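
The option checks summarised for `build_source()` can be sketched as below. This is an illustration under stated assumptions, not the dispatcher's actual code: `validate_stdin_options` is a hypothetical helper, the options mapping is assumed to exclude the source's `type` key, and the use of `ValueError` with these messages is a guess.

```python
ALLOWED_STDIN_OPTIONS = {"format", "format_keys"}


def validate_stdin_options(options: dict[str, object]) -> None:
    """Reject unknown options and the format + format_keys combination."""
    unknown = set(options) - ALLOWED_STDIN_OPTIONS
    if unknown:
        # Assumption: the real dispatcher may raise a different exception type.
        raise ValueError(f"unknown stdin option(s): {', '.join(sorted(unknown))}")
    if "format" in options and "format_keys" in options:
        raise ValueError("'format' and 'format_keys' cannot be combined")


def is_valid(options: dict[str, object]) -> bool:
    """Small wrapper so the outcomes below are easy to inspect."""
    try:
        validate_stdin_options(options)
    except ValueError:
        return False
    return True


results = {
    "defaults": is_valid({}),
    "preset": is_valid({"format": "apache-combined"}),
    "conflict": is_valid({"format": "apache-combined", "format_keys": {"ts": "time"}}),
    "unknown": is_valid({"encoding": "utf-8"}),
}
```

Rejecting unknown keys up front means a typo or an unsupported knob such as `encoding` fails at configuration time instead of being silently ignored.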

Possibly Related PRs

Good-Native/paperbark#21: Main PR adds a real StdinSource and updates dispatcher to accept/validate format and format_keys and pass a line_format into StdinSource — which directly uses the same format/line_format plumbing introduced in the retrieved PR, so they are related.
Good-Native/paperbark#23: Both PRs implement real built-in sources and make parallel, code-level changes to dispatcher.build_source (adding per-type option validation and constructing concrete Source instances), so they are related.
Good-Native/paperbark#15: Both PRs modify dispatcher.py’s build_source logic (including parsing/validation of per-source options like format/format_keys) — they touch the same function and related option-handling code, so they are related.

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add real stdin source' directly and accurately summarises the primary change: replacing the stdin source stub with a functional implementation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions · 2026-05-09T16:37:37Z

Release Preview

patch: v0.1.6 → v0.1.7

Changelog

Added

The stdin source is now a real implementation: capture() yields lines from sys.stdin rather than raising NotImplementedError. Supports format / format_keys with the same conflict rules as flyctl and file. Intended for piping pre-captured logs into a one-shot paperbark monitor run, e.g. cat app.log | paperbark monitor --iterations 1 (analyse and search read existing run artefacts and never consume stdin). A piped stdin is single-use: the first iteration drains it, subsequent iterations yield nothing. There is intentionally no encoding knob; use the file source if you need byte-level robustness or a custom encoding. See docs/SOURCES.md and docs/CONFIG.md for the matrix.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/CONFIG.md`:
- Around line 292-294: Add a shell language tag to the stdin usage code fence
that currently contains "cat my.log | paperbark analyse --run latest" so the
fenced block is marked as shell/sh; locate the fenced block with that exact
command in CONFIG.md and change the opening triple backticks to include the
language identifier (e.g., ```sh) to satisfy markdown linting.

In `@docs/ROADMAP.md`:
- Line 8: Update the "**Last verified:**" entry in docs/ROADMAP.md from
2026-05-10 to 2026-05-09 so the recorded verification date is not in the future;
locate the bold "Last verified:" text and replace the date accordingly.

In `@docs/SOURCES.md`:
- Around line 159-161: The Markdown fenced code block containing the pipeline
command "cat app.log | paperbark monitor --iterations 1" is missing a language
tag; update that fence to include a shell language identifier (e.g., change the
opening "```" to "```sh") so the block becomes a shell snippet and satisfies
markdownlint MD040.

In `@README.md`:
- Around line 69-70: Update the README Quickstart section to include a short
non-Fly example using the newly-implemented stdin (or file) source so the docs
match the source table change; locate the "Quickstart" heading in README.md and
add one concise snippet demonstrating how to run the project with stdin (or a
file) as the data source (showing the minimal install/run commands and the
stdin/file invocation), and keep it parallel to the existing Fly quickstart
wording so both Fly and a non-Fly path are present.
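
The two MD040 findings above concern opening code fences with no language tag. A rough way to find them locally, without running markdownlint, is sketched below; `untagged_fences` is a hypothetical helper and handles only the common case of non-nested backtick or tilde fences.

```python
import re

# An opening or closing fence: three or more backticks or tildes, then optional info text.
FENCE = re.compile(r"^(`{3,}|~{3,})(.*)$")


def untagged_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that lack a language tag."""
    missing: list[int] = []
    open_marker = None
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line.strip())
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            # Opening fence: record it when no info string follows the marker.
            open_marker = marker
            if not info:
                missing.append(lineno)
        elif marker[0] == open_marker[0] and len(marker) >= len(open_marker) and not info:
            # Matching closing fence.
            open_marker = None
    return missing


fence = "`" * 3  # built at runtime so this example does not contain a literal fence
doc = "\n".join(
    ["Pipe a log in:", fence, "cat app.log | paperbark monitor --iterations 1", fence,
     fence + "sh", "cat app.log | paperbark monitor --iterations 1", fence]
)
flagged = untagged_fences(doc)
```

Only the first block is flagged; the second already carries the `sh` tag the review asks for.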

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 1574135b-a4fb-47fe-b88f-a73f5f2443eb

📥 Commits

Reviewing files that changed from the base of the PR and between b57b1fb and 92c45c8.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

CHANGELOG.md
README.md
docs/CONFIG.md
docs/ROADMAP.md
docs/SOURCES.md
src/paperbark/dispatcher.py
src/paperbark/sources/stdin.py
tests/test_dispatcher.py
tests/test_sources.py

coderabbitai · 2026-05-09T16:41:12Z

 ## Current state

-- **Last verified:** 2026-05-05
+- **Last verified:** 2026-05-10


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Last verified is set to a future date.

Line 8 says 2026-05-10, but today is 2026-05-09. Please set it to 2026-05-09 now, or only update to 2026-05-10 after that date is actually verified.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/ROADMAP.md` at line 8, Update the "**Last verified:**" entry in docs/ROADMAP.md from 2026-05-10 to 2026-05-09 so the recorded verification date is not in the future; locate the bold "Last verified:" text and replace the date accordingly.

simonsmallchua · 2026-05-09T21:33:08Z

Pushed 79d3b2f addressing the CodeRabbit findings:

✅ docs/CONFIG.md:292 — added sh language tag to the stdin code fence.
✅ docs/SOURCES.md:159 — same MD040 fix.
✅ README.md — added a stdin quickstart snippet after the Fly block (with a one-liner pointer to the file source as the on-disk equivalent), per the README.md maintenance standard in CLAUDE.md. Smoke-tested the minimal [[sources]] config locally — cat app.log | paperbark monitor --iterations 1 produces a clean run dir with the default [paperbark] / [monitor] settings.
⏭️ docs/ROADMAP.md:8 — skipped. The Last verified: 2026-05-10 date is correct as of the commit time; the bot's clock appears to be a day behind.
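
The disagreement over the date may come down to timezones, not a wrong clock on either side. The sketch below converts the timestamp of the PR's opening comment from UTC into a hypothetical author offset of UTC+10; the thread does not state the author's real timezone, so that offset is purely an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone

# Timestamp of the PR's opening comment, as shown in the thread (UTC).
opened_utc = datetime(2026, 5, 9, 16, 37, 22, tzinfo=timezone.utc)

# Hypothetical author offset (UTC+10); not stated anywhere in the thread.
author_tz = timezone(timedelta(hours=10))

bot_date = opened_utc.date()                           # calendar date in UTC
author_date = opened_utc.astimezone(author_tz).date()  # calendar date at UTC+10
```

Under that assumption the same instant falls on 2026-05-09 in UTC and on 2026-05-10 locally, which would make both the bot's finding and the author's reply internally consistent.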

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/CONFIG.md`:
- Around line 292-294: The example pipes stdin into the wrong CLI command:
`paperbark analyse` does not consume source input, so replace the piped command
usage with the capture flow using `paperbark monitor` (e.g., change `cat my.log
| paperbark analyse --run latest` to use `cat my.log | paperbark monitor
--iterations 1`) so stdin is actually captured; update the example in CONFIG.md
to reference `paperbark monitor` and the `--iterations` flag accordingly.

In `@docs/SOURCES.md`:
- Around line 155-157: Update the documentation text that currently states
"Reads lines from `sys.stdin` and yields them" to limit stdin usage to the
capture step: clarify that `sys.stdin` input is consumed only by `paperbark
monitor` (or the monitor capture mode) for one-shot piping of pre-captured logs,
and that `analyse` and `search` operate on existing run artifacts rather than
reading from `sys.stdin`; keep references to `paperbark monitor`, `analyse`,
`search`, and `sys.stdin` so readers can locate the relevant sentences to edit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 12900ad1-1838-4238-a0f7-9b35173e4b5b

📥 Commits

Reviewing files that changed from the base of the PR and between 92c45c8 and 79d3b2f.

📒 Files selected for processing (3)

README.md
docs/CONFIG.md
docs/SOURCES.md

simonsmallchua · 2026-05-09T21:37:48Z

Pushed 8157918 — both findings valid, fixed at the root:

✅ docs/CONFIG.md:294 — example now uses paperbark monitor --iterations 1 (was incorrectly paperbark analyse --run latest).
✅ docs/SOURCES.md:155-157 — wording now scopes stdin capture to monitor and explicitly notes that analyse / search never consume stdin.
➕ Same misleading wording also appeared in CHANGELOG.md (the Added entry) and src/paperbark/sources/stdin.py (module docstring) — fixed both for consistency.

Add real stdin source

92c45c8

coderabbitai Bot requested changes May 9, 2026

Address CodeRabbit feedback on stdin PR

79d3b2f

coderabbitai Bot requested changes May 9, 2026

Comment thread docs/CONFIG.md

Comment thread docs/SOURCES.md Outdated

Clarify stdin only feeds monitor

8157918

simonsmallchua merged commit 187ad8f into main May 9, 2026
6 checks passed

coderabbitai Bot mentioned this pull request May 10, 2026

Add real wrangler source #28

Merged

5 tasks

simonsmallchua merged 3 commits into main from work/hungry-fermat-533b02
