
file-progress

Chinese documentation (中文文档)

file-progress is a small progress indicator library that writes the latest progress state to a text file instead of continuously refreshing stdout.

It is built for cases where terminal progress bars are inconvenient or fragile:

  • In tmux, Slurm jobs, background tasks, and log collectors, dynamic terminal progress bars can corrupt the terminal view or produce noisy logs.
  • In multi-threaded and multi-process workloads, several workers writing to the same terminal produce interleaved, hard-to-read output.

Each update rewrites the progress file with a fresh snapshot of the current state. You can inspect it with cat progress/xxx.txt, keep a live view with watch cat progress/xxx.txt, or open it from another terminal. In multi-process mode, child processes do not write the file directly; they send progress events to the parent process, which renders the file.

Example multi-worker progress snapshot:

worker 1 pid=12345 thread worker 1 [########################] 100.00% 3/3 elapsed=00:00:05 eta=00:00:00 task=3 phase=sleep
worker 2 pid=12345 thread worker 2 [##################------]  75.00% 3/4 elapsed=00:00:05 eta=00:00:01 task=3 phase=sleep
worker 3 pid=12345 thread worker 3 [############------------]  50.00% 3/6 elapsed=00:00:05 eta=00:00:05 task=3 phase=sleep
total thread pool demo [################--------]  69.23% 9/13 elapsed=00:00:05 eta=00:00:02
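
The one-line format above can be approximated in a few lines of plain Python. This is a sketch of the rendering idea only, not the library's actual implementation:

```python
def render_snapshot(desc, current, total, width=24):
    """Render one progress line in roughly the format shown above."""
    frac = current / total
    filled = int(frac * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"{desc} [{bar}] {frac * 100:6.2f}% {current}/{total}"


def write_snapshot(path, lines):
    """Overwrite the progress file with the latest snapshot only."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

Opening the file in "w" mode on every update means the file always holds exactly the latest state, which is what makes cat and watch cat behave well.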

Install

From this repository:

pip install ./file_progress      # standard install
pip install -e ./file_progress   # editable (development) install

From GitHub:

pip install "git+https://github.com/ErwinLiYH/file_progress.git"

Use #subdirectory=... only when the Python package lives inside a repository subdirectory instead of the repository root.

Minimal Examples

Single Worker

Main execution block from test_single_worker.py. See that file for the full runnable example.

import time

from file_progress import FileProgress  # import path assumed from the package name

steps = 5

with FileProgress(
    interval_seconds=0.0,
    cleanup_on_success=False,
    verbose=2,
) as progress:
    for step in range(1, steps + 1):
        progress.update(
            "single worker demo",
            step,
            steps,
            extra=f"task={step} phase=sleep",
        )
        time.sleep(1)

Multi-Thread

Main execution block from test_multi_threads.py. The _run_worker function is omitted here; see the linked file for the full runnable example.

from concurrent.futures import ThreadPoolExecutor

from file_progress import MultiWorkerFileProgress  # import path assumed

worker_steps = [3, 4, 5]

with MultiWorkerFileProgress(
    desc="thread pool demo",
    total=sum(worker_steps),
    interval_seconds=0.0,
    cleanup_on_success=False,
    verbose=2,
) as progress:
    with ThreadPoolExecutor(max_workers=len(worker_steps)) as executor:
        futures = [
            executor.submit(_run_worker, progress, worker_id, steps)
            for worker_id, steps in enumerate(worker_steps, start=1)
        ]
        for future in futures:
            future.result()

Multi-Process

Main execution block from test_multi_processes.py. The _run_worker function is omitted here; see the linked file for the full runnable example. Keep the if __name__ == "__main__" guard when using the spawn start method: spawned children re-import the main module, and unguarded top-level code would run again in every child.

import multiprocessing
from concurrent.futures import ProcessPoolExecutor

from file_progress import MultiWorkerFileProgress  # import path assumed

if __name__ == "__main__":
    worker_steps = [3, 4, 5]
    ctx = multiprocessing.get_context("spawn")
    with MultiWorkerFileProgress(
        desc="process pool demo",
        total=sum(worker_steps),
        interval_seconds=0.0,
        cleanup_on_success=False,
        verbose=2,
    ) as progress:
        initializer, initargs = progress.process_initializer(ctx)
        with ProcessPoolExecutor(
            max_workers=len(worker_steps),
            mp_context=ctx,
            initializer=initializer,
            initargs=initargs,
        ) as executor:
            list(executor.map(_run_worker, enumerate(worker_steps, start=1)))

API

FileProgress

FileProgress(
    path=None,
    *,
    progress_dir="progress",
    interval_seconds=5.0,
    width=24,
    cleanup_on_success=True,
    verbose=2,
)
  • path: Full progress file path. If omitted, a randomly named progress file is created under progress_dir.
  • progress_dir: Directory for the auto-generated progress file when path is omitted. Defaults to ./progress.
  • interval_seconds: Minimum interval between file writes. The first and final updates are always written.
  • width: Character width of the progress bar.
  • cleanup_on_success: Delete the progress file when the context exits successfully.
  • verbose: Terminal output level. 0 prints nothing, 1 prints the progress file path, and 2 prints the path plus the final progress state.
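
The interval_seconds behaviour can be pictured with a small throttle. This is an illustrative sketch rather than the library's code: the first write always passes, later writes pass only once the interval has elapsed (the library additionally forces the final update through):

```python
import time


class WriteThrottle:
    """Allow a file write at most once every `interval` seconds."""

    def __init__(self, interval):
        self.interval = interval
        self.last_write = None  # no write has happened yet

    def should_write(self, now=None):
        now = time.monotonic() if now is None else now
        if self.last_write is None or now - self.last_write >= self.interval:
            self.last_write = now
            return True
        return False
```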

Methods:

  • update(desc, current, total, *, extra=""): Update the single progress bar. Start time is tracked internally.
  • close(success=True): Close the file. Usually handled by the with block.

MultiWorkerFileProgress

MultiWorkerFileProgress(
    path=None,
    *,
    progress_dir="progress",
    desc="total",
    total=1,
    interval_seconds=5.0,
    width=24,
    cleanup_on_success=True,
    verbose=2,
)
  • path: Full multi-worker progress file path. If omitted, a randomly named progress file is created under progress_dir.
  • progress_dir: Directory for the auto-generated progress file when path is omitted. Defaults to ./progress.
  • desc: Description for the aggregate total progress bar.
  • total: Total work amount, usually the sum of all worker steps.
  • interval_seconds: Minimum interval between file writes.
  • width: Character width of each progress bar.
  • cleanup_on_success: Delete the progress file when the context exits successfully.
  • verbose: Terminal output level. Same meaning as FileProgress.

Thread helpers:

  • get_sub_progress_thread(worker_id): Create a worker progress proxy for a thread.
  • thread_worker(worker_id): Alias for get_sub_progress_thread().

Process helpers:

  • process_initializer(ctx): Return (initializer, initargs) for ProcessPoolExecutor.
  • process_queue(ctx): Return the internal process progress queue.
  • get_process_progress_queue(ctx): Low-level method used by process_queue(ctx).

Low-level update methods:

  • update_worker(worker_id, desc, current, total, start_time, *, extra="", pid=None): Update a worker state in the parent process. Most users should use worker proxies instead.
  • increment_total(amount=1): Increment the aggregate total progress.
  • close(success=True): Close and optionally print the final snapshot. Usually handled by the with block.

Worker Progress

Objects returned by get_sub_progress_thread() and get_sub_progress_process() support:

  • update(desc, current, total, *, extra=""): Update this worker's progress.
  • increment_total(amount=1): Increment the aggregate total progress.
  • pid: PID of the process that owns this worker proxy.

Process Helpers

  • configure_process_sub_progress(progress_queue): Install the progress queue in a child process.
  • get_sub_progress_process(worker_id=None): Return a process worker progress proxy. If worker_id is omitted, the current PID is used.
  • configure_process_worker_progress: Alias for configure_process_sub_progress.
  • get_process_worker_progress: Alias for get_sub_progress_process.

