A robust, cross-platform Python script that converts .cbz and .cbr comic book archives into high-quality PDF format. After successful conversion, the original files are safely moved to an old_files directory to keep your workspace organized.
Anti-Piracy Notice: This software is developed solely as a personal utility to help users manage, format, and read their legally obtained, DRM-free digital comic book collections on devices that primarily support PDF formats. The authors and contributors of this repository do not endorse, promote, or condone the piracy of copyrighted materials. We are not responsible for how you use this tool or for the legality of the files you choose to process. Please support the comic book industry and original creators by purchasing official releases.
- Cross-Platform: Works seamlessly on Windows, macOS, and Linux.
- Intelligent Resizing: Automatically scales down oversized pages so that neither side exceeds 2560 pixels, preserving the aspect ratio. Uses Lanczos resampling to reduce Moiré patterns in comic halftones.
- Format Normalization: Automatically handles transparent images (RGBA) and unsupported formats (like WebP or GIF) by flattening them onto a white background and converting them to RGB JPEGs.
- Natural Sorting: Sorts pages logically (e.g., `page_2.jpg` comes before `page_10.jpg`).
- Memory Management: Includes explicit garbage collection to process multiple heavy archives without crashing.
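
The resizing rule in the feature list comes down to a little arithmetic. The sketch below shows one way to compute the target size. The helper name `fit_within` is hypothetical and is not taken from `cb2pdf.py`, which may implement this differently (for example through Pillow's `Image.thumbnail` with a Lanczos filter).

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 2560) -> Tuple[int, int]:
    """Return (width, height) scaled down to fit a max_side x max_side box.

    Images that already fit are returned unchanged; nothing is ever upscaled.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # One scale factor for both axes keeps the aspect ratio intact.
    scale = min(max_side / width, max_side / height, 1.0)
    if scale == 1.0:
        return width, height
    # Round to whole pixels, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # already small enough, returned as-is
print(fit_within(5120, 2880))  # landscape page limited by its width
print(fit_within(3000, 6000))  # tall page limited by its height
```

Using the smaller of the two ratios guarantees that both sides end up within the limit, whichever side was the larger one.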
- Python 3.x must be installed.
- Required pip packages: `img2pdf`, `natsort`, `rarfile`, `Pillow`, `tqdm`.
To process .cbr (RAR) files, your system must have an UnRAR executable available.
- Windows: The script automatically looks for WinRAR in the standard installation paths. Alternatively, if you don't have WinRAR installed, you can download the Windows Release ZIP from this repository and extract the portable `UnRAR.exe` into the same folder as the script or executable.
- Ubuntu/Debian:
    sudo apt update
    sudo apt install unrar
- Arch Linux:
    sudo pacman -S unrar
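
To check whether an UnRAR executable is reachable before running a conversion, something like the following could be used. It is a minimal sketch, not the lookup logic of `cb2pdf.py`: the function name `find_unrar` and the order of the checks are assumptions, and the WinRAR installation paths the script searches are not reproduced here.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_unrar(script_dir: Path, names=("unrar", "UnRAR.exe")) -> Optional[str]:
    """Return a path to an UnRAR executable, or None if none is found."""
    # 1. A portable copy placed next to the script takes priority.
    for name in names:
        candidate = script_dir / name
        if candidate.is_file():
            return str(candidate)
    # 2. Otherwise fall back to whatever is on the PATH.
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


tool = find_unrar(Path.cwd())
print(tool or "No UnRAR executable found; .cbr files cannot be extracted.")
```

The `rarfile` library exposes a module-level `rarfile.UNRAR_TOOL` setting, so a path found this way can be assigned to it before opening an archive.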
- Clone the Repository: Open a terminal and clone the repository from GitHub:
    git clone https://github.com/GabrielMaida/CBtoPDF.git
    cd CBtoPDF
- Install Required Libraries:
    pip install img2pdf natsort rarfile pillow tqdm
- Prepare Your Directory:
  - Drop the `.cbz` or `.cbr` files into the same directory as the `cb2pdf.py` script (along with `UnRAR.exe` if you are using the portable Windows release).
- Run the Script:
  - Open a terminal, navigate to the directory, and execute:
        python cb2pdf.py
    (On some Linux distributions, you might need to use `python3 cb2pdf.py`.)

The script will process all archives, display a progress bar, create the PDFs, and move the originals to the `old_files` folder. Any errors are logged to `conversion_error_log.txt`.
We are constantly looking to improve this tool. Here are some features planned for future releases:
- Smart Archive Detection: Automatically detect and correctly process misnamed archives (e.g., a true ZIP file renamed to `.cbr` or a RAR file renamed to `.cbz`) so that they extract cleanly instead of failing.
- Reverse Conversion (PDF to CBZ): Introduce a feature to convert comic PDFs back into `.cbz` archives. (Note: Converting back to `.cbr` is not planned, as RAR archive creation relies on a proprietary algorithm.)
- Command-Line Interface (CLI): Add CLI arguments to allow advanced users to customize the conversion process (e.g., setting custom max resolutions, specifying input/output directories, or bypassing image resizing) directly from the terminal.
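
The planned Smart Archive Detection could be approached by inspecting file contents instead of trusting the extension. The sketch below is one possible approach under that assumption, not the planned implementation: `detect_archive_type` is a hypothetical name, and it only checks the signature at the start of the file, so self-extracting RAR archives would be reported as unknown.

```python
import tempfile
import zipfile
from pathlib import Path

# Prefix shared by RAR 4.x ("Rar!\x1a\x07\x00") and RAR 5.x ("Rar!\x1a\x07\x01\x00").
RAR_MAGIC = b"Rar!\x1a\x07"


def detect_archive_type(path: Path) -> str:
    """Return "zip", "rar", or "unknown" based on file contents, not the extension."""
    with open(path, "rb") as handle:
        header = handle.read(len(RAR_MAGIC))
    if header == RAR_MAGIC:
        return "rar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"


# Demo: a ZIP archive that was given a .cbr extension by mistake.
with tempfile.TemporaryDirectory() as tmp:
    misnamed = Path(tmp) / "issue_01.cbr"
    with zipfile.ZipFile(misnamed, "w") as archive:
        archive.writestr("page_1.jpg", b"placeholder")
    print(misnamed.name, "is really a", detect_archive_type(misnamed), "archive")
```

With the detected type in hand, a converter could dispatch to `zipfile` or `rarfile` regardless of how the file is named.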
- Directory Setup: Creates the `old_files` output directory and initializes the logging system.
- Extraction Phase: Extracts the contents of the CBZ (ZIP) or CBR (RAR) archive into a temporary folder provided by the OS.
- Deep Scan Phase: Recursively searches the temporary folder for supported images (`.jpg`, `.png`, `.webp`, etc.). It filters out hidden system folders like `__MACOSX` or `.git` to prevent compilation errors and applies natural sorting to the file names.
- Image Processing Phase:
  - Checks the dimensions of each image. If an image exceeds 2560 pixels in width or height, it is proportionally downscaled.
  - Cleans up alpha channels (transparency) by pasting the image over a solid white background.
- PDF Generation Phase: Uses `img2pdf` to embed the processed images directly into a PDF byte stream without re-encoding them, which is both lossless and efficient.
- Cleanup Phase: The temporary folder is automatically deleted by the system. If the PDF is generated successfully, the original archive is moved to the `old_files` folder.
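
The Deep Scan Phase can be illustrated with the standard library alone. This is a sketch under assumptions, not the script's code: the script itself relies on `natsort`, whereas `natural_key` below is a simplified stand-in, and the function names, the extension set, and the rule of skipping every dot-prefixed folder are hypothetical.

```python
import os
import re
from pathlib import Path
from typing import List

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}  # assumed set
SKIPPED_FOLDERS = {"__MACOSX"}  # dot-folders such as .git are skipped separately


def natural_key(text: str) -> list:
    """Split "page_10.jpg" into ["page_", 10, ".jpg"] so digits compare as numbers."""
    parts = re.split(r"(\d+)", text)
    # re.split with a capture group alternates text, digits, text, digits, ...
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]


def collect_images(root: Path) -> List[Path]:
    """Recursively gather image files under root in natural reading order."""
    found = []
    for folder, subfolders, files in os.walk(root):
        # Prune unwanted folders in place so os.walk never descends into them.
        subfolders[:] = [name for name in subfolders
                         if name not in SKIPPED_FOLDERS and not name.startswith(".")]
        for name in files:
            if Path(name).suffix.lower() in IMAGE_EXTENSIONS:
                found.append(Path(folder) / name)
    # Sort on the path relative to root so pages in sub-folders stay grouped.
    return sorted(found, key=lambda p: natural_key(p.relative_to(root).as_posix()))
```

Calling `collect_images(Path(tmp_dir))` on the extraction folder would return the page list in the order it should appear in the PDF.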
- ModuleNotFoundError: You are missing one of the Python libraries. Run the installation command:
    pip install img2pdf natsort rarfile pillow tqdm
- WARNING: UnRAR.exe could not be found / rarfile.RarCannotExec: The script cannot extract `.cbr` files because it cannot find the UnRAR tool.
  - Windows: Install WinRAR or place `UnRAR.exe` (from the Releases page) in the same folder as the script.
  - Linux: Install the `unrar` package using your system's package manager (e.g., `pacman` or `apt`).
- PDF pages are out of order: Ensure the internal files of your archive are numbered consistently. While the script uses `natsort` to handle numbers logically, wildly different naming schemes inside the same archive might cause issues.
- Corrupted Image Warnings in the Log: Sometimes downloaded archives contain broken image files. The script is designed to ignore them and log a warning in `conversion_error_log.txt`, allowing the rest of the comic to be converted safely.
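
For the `ModuleNotFoundError` case above, a quick way to see which libraries are missing is to probe for them without importing anything. The helper below is a hypothetical diagnostic, not part of `cb2pdf.py`. Note that Pillow is installed as `pillow` but imported as `PIL`.

```python
import importlib.util

# Maps import name -> pip package name (they differ only for Pillow).
REQUIRED = {
    "img2pdf": "img2pdf",
    "natsort": "natsort",
    "rarfile": "rarfile",
    "PIL": "pillow",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages. Install them with:")
    print("  pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```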
This project is licensed under the MIT License. See the LICENSE file for details.