aaqwesas/Automate1

PDF Splitter and Organizer

This script is version 1 of a certificate and receipt splitter. It automates the processing of certificate and receipt PDFs by splitting them into individual files named after the participants listed in a CSV file. It first merges the input PDFs, then splits the merged document page by page, and finally packages the results into ZIP archives.

Note: This script was built specifically for my part-time job. The use case is narrow and tied to one specific task, so modify the code to fit your own needs.

Features

  • Reads participant names from a CSV file
  • Merges multiple certificate and receipt PDFs into single documents
  • Splits merged PDFs into individual pages named with participant information
  • Organizes output into course-specific folders
  • Creates ZIP archives of the output folders
  • Optional cleanup of intermediate directories
  • Comprehensive logging with rotation (max 10 MB per log file, 5 backups)
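The merge and split features listed above could be sketched roughly as follows. This is a minimal illustration, not the code in main.py: it assumes PyPDF2 2.0 or later (where PdfReader and PdfWriter are available), merges inputs in file-name order, and uses hypothetical function names and a hypothetical `{prefix}_{name}.pdf` naming pattern.

```python
from pathlib import Path

# Characters that are unsafe in file names on common platforms.
_UNSAFE = '\\/:*?"<>|'


def page_filename(prefix: str, name: str) -> str:
    """Build an output file name (hypothetical pattern, not taken from main.py)."""
    safe = "".join("_" if ch in _UNSAFE else ch for ch in name.strip())
    return f"{prefix}_{safe}.pdf"


def merge_pdfs(input_dir: Path, merged_path: Path) -> int:
    """Append every PDF in input_dir, in file-name order; return the page count."""
    # Imported lazily so the naming helper above works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter  # assumes PyPDF2 >= 2.0

    writer = PdfWriter()
    total = 0
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        for page in PdfReader(str(pdf_path)).pages:
            writer.add_page(page)
            total += 1
    with open(merged_path, "wb") as fh:
        writer.write(fh)
    return total


def split_pdf(merged_path: Path, names: list[str], out_dir: Path, prefix: str) -> None:
    """Write page i of merged_path to a file named after names[i]."""
    from PyPDF2 import PdfReader, PdfWriter  # assumes PyPDF2 >= 2.0

    reader = PdfReader(str(merged_path))
    if len(reader.pages) != len(names):
        raise ValueError(
            f"{len(reader.pages)} pages but {len(names)} participant names"
        )
    out_dir.mkdir(parents=True, exist_ok=True)
    for page, name in zip(reader.pages, names):
        writer = PdfWriter()  # one writer per page, so each output is a single page
        writer.add_page(page)
        with open(out_dir / page_filename(prefix, name), "wb") as fh:
            writer.write(fh)
```

Pairing pages with names by position only works if the PDFs and the CSV list participants in the same order, which is why the page count is checked before any file is written.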

Requirements

Python Dependencies

Install required packages:

pip install PyPDF2

Directory Structure

The script expects the following directory layout:

project/
├── data/
│   ├── participants.csv        # Participant names (first column)
│   ├── Certificate/            # Input certificate PDFs
│   └── Receipt/                # Input receipt PDFs
└── main.py                     # This script
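A quick pre-flight check of this layout might look like the sketch below. The helper name `missing_inputs` is hypothetical and not part of main.py; it only reports which of the three expected inputs are absent.

```python
from pathlib import Path


def missing_inputs(project_root: Path) -> list[str]:
    """Return the expected input paths (relative to project_root) that do not exist."""
    data = project_root / "data"
    expected = [
        data / "participants.csv",  # participant names, first column
        data / "Certificate",       # input certificate PDFs
        data / "Receipt",           # input receipt PDFs
    ]
    return [str(p.relative_to(project_root)) for p in expected if not p.exists()]
```

Calling `missing_inputs(Path("."))` from the project folder returns an empty list when everything is in place.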

Usage

  1. Place your participant names in data/participants.csv with one name per row (first column only)
  2. Place certificate PDFs in data/Certificate/
  3. Place receipt PDFs in data/Receipt/
  4. Run the script:
python main.py
  5. Enter the course code when prompted

Output

The script generates:

  • Certificate_{course_code}/ containing individual certificate PDFs
  • Receipt_{course_code}/ containing individual receipt PDFs
  • ZIP archives: Certificate_{course_code}.zip and Receipt_{course_code}.zip

By default, the script retains the output folders after creating ZIP files. To automatically remove the folders after zipping, call main(remove=True).
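The zip-then-optionally-remove behaviour described above can be sketched with the standard library alone. The function name `zip_output` is hypothetical, and whether the real archives store files at the archive root (as here) or inside a top-level folder is an assumption.

```python
import shutil
from pathlib import Path


def zip_output(folder: Path, remove: bool = False) -> Path:
    """Create <folder>.zip next to the folder; delete the folder if remove is True."""
    # base_name has no extension; make_archive appends ".zip" itself.
    archive = shutil.make_archive(str(folder), "zip", root_dir=str(folder))
    if remove:
        shutil.rmtree(folder)  # only runs after the archive was written successfully
    return Path(archive)
```

For example, `zip_output(Path("Certificate_ABC123"), remove=True)` would leave only `Certificate_ABC123.zip` behind, where `ABC123` stands in for a course code.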

CSV Format

The CSV file must contain participant names in the first column. The first row is treated as a header and skipped.

Example participants.csv:

Name
John Doe
Jane Smith
Robert Johnson
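Reading a file in this format could look like the following sketch. The function name `read_names` is hypothetical; skipping blank rows and opening the file as `utf-8-sig` (to tolerate a byte-order mark from spreadsheet exports) are assumptions rather than documented behaviour of the script.

```python
import csv
from pathlib import Path


def read_names(csv_path: Path) -> list[str]:
    """Return the first-column values of csv_path, skipping the header row."""
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # discard the header row, if any
        # Keep only rows whose first cell is non-empty after trimming whitespace.
        return [row[0].strip() for row in reader if row and row[0].strip()]
```

For the example file above, this returns the three names in file order, which is the order used to match them to PDF pages.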

Logging

Logs are written to:

  • Console (stdout)
  • app.log file with automatic rotation (10 MB max size, 5 backup files)

Log messages include timestamps, logger name, severity level, and message content.
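A logging setup matching this description might be configured as in the sketch below. The logger name, the exact format string, and the reading of "10 MB" as 10 * 1024 * 1024 bytes are assumptions; only the handlers, the rotation limits, and the log fields come from the description above.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler


def build_logger(log_path: str = "app.log", name: str = "pdf_splitter") -> logging.Logger:
    """Return a logger that writes to stdout and to a rotating log file."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured; avoid duplicate output
        return logger
    logger.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")

    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(fmt)

    # Rotate at roughly 10 MB and keep 5 old files (app.log.1 ... app.log.5).
    rotating = RotatingFileHandler(log_path, maxBytes=10 * 1024 * 1024, backupCount=5)
    rotating.setFormatter(fmt)

    logger.addHandler(console)
    logger.addHandler(rotating)
    return logger
```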

Error Handling

The script includes error handling for:

  • Missing CSV or PDF files
  • Mismatched page counts between PDFs and participant names
  • File system errors during read/write operations
  • PDF processing errors

Errors are logged with descriptive messages and the script exits gracefully on critical failures.
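One way to get this behaviour is to wrap each processing step so that failures are logged and turned into an exit code. The sketch below is illustrative only: `run_step`, `check_page_count`, the logger name, and the exit code of 1 are all hypothetical and not taken from main.py.

```python
import logging

logger = logging.getLogger("pdf_splitter")  # logger name is an assumption


def check_page_count(page_count: int, names: list[str], label: str) -> None:
    """Raise ValueError when a merged PDF and the name list disagree in length."""
    if page_count != len(names):
        raise ValueError(f"{label}: {page_count} pages but {len(names)} names")


def run_step(step, *args) -> int:
    """Run one processing step; return 0 on success, 1 after logging a failure."""
    try:
        step(*args)
    except FileNotFoundError as exc:  # must come before OSError, its parent class
        logger.error("Missing input file: %s", exc)
    except ValueError as exc:
        logger.error("Invalid input: %s", exc)
    except OSError as exc:
        logger.error("File system error: %s", exc)
    except Exception as exc:  # e.g. errors raised while parsing a PDF
        logger.error("Unexpected processing error: %s", exc)
    else:
        return 0
    return 1
```

A caller could then stop at the first step that returns a non-zero code and pass that code to `sys.exit`.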

Cleanup

Temporary merged PDF files (temp_combined_certificates.pdf and temp_combined_receipts.pdf) are automatically deleted after processing.

Output folders can be automatically removed after ZIP creation by setting remove=True in the main() call.

License

This project is licensed under the MIT License. See the LICENSE file for details.
