📚🎧 EPUB to Audiobook Generator

A modern desktop application that converts EPUB ebooks into high-quality audiobooks using Kokoro TTS with GPU acceleration. It features an Electron-based UI with real-time progress tracking.

✨ Features

🚀 GPU Accelerated - Utilizes NVIDIA GPUs for fast processing
🎨 Modern Desktop UI - Beautiful Electron app with React frontend
📖 EPUB Support - Direct upload and conversion from EPUB files
✏️ Text Editing - Review and edit extracted text before conversion
📊 Real-Time Progress - Track conversion progress with live updates
🔧 Smart Chunking - Handles large books by splitting into manageable chunks
🎵 MP3 Output - Creates compressed audiobook files
📥 File Management - Download audio files or open file location
🧹 Auto Cleanup - Removes temporary files after processing

🏗️ Architecture

Backend: FastAPI (Python) - Handles EPUB extraction and TTS conversion
Frontend: Electron + React - Modern desktop application
TTS Engine: Kokoro TTS with PyTorch
Audio Processing: FFmpeg for audio combination

📋 Requirements

System Requirements

Windows (tested), Linux, or macOS
NVIDIA GPU (recommended) with CUDA support
FFmpeg installed and in PATH
Python 3.8+
Node.js 16+ (for Electron frontend)

Hardware Recommendations

GPU: NVIDIA RTX 3060 or better
RAM: 8GB+ (16GB+ recommended for large books)
Storage: 2-3GB free space per book

⚙️ Installation

1. Clone or Download

git clone <repository-url>
cd audiobook_generator

2. Backend Setup

Install Python Dependencies

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install requirements
pip install -r requirements.txt

Install PyTorch with CUDA (for GPU acceleration)

# For CUDA 12.1 (check your CUDA version with: nvidia-smi)
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

# For CUDA 11.8
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

Install FFmpeg

Windows:

Option A: Install with winget install ffmpeg
Option B: Download a build from ffmpeg.org, extract it, and add its bin folder to PATH
Test in a new terminal: ffmpeg -version

Linux:

sudo apt update
sudo apt install ffmpeg

macOS:

brew install ffmpeg

3. Frontend Setup

cd frontend
npm install

🚀 Usage

Starting the Application

Step 1: Start the Backend Server

# Activate virtual environment if not already active
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Start FastAPI server
uvicorn main:app --reload --port 8000

The backend will be available at http://localhost:8000

Step 2: Start the Electron Frontend

cd frontend
npm start

Using the Application

Step 1: Upload EPUB File

Click "Choose EPUB File" to select your EPUB book
Click "Extract Text" to extract and process the text content
The extracted text will be saved to the output/ folder

Step 2: Review and Edit Text

Review the extracted text in the text editor
Make any edits or corrections as needed
Optionally download the .txt file
Click "Continue to Convert" when ready

Step 3: Convert to Audiobook

Watch the real-time progress bar as your audiobook is generated
Progress updates show:
- Current chunk being processed
- Overall completion percentage
- Status messages
When complete, you can:
- Download Audio: Download the MP3 file directly
- Open File Location: Open the file in your system file manager
- Convert Another File: Start a new conversion

📁 Project Structure

audiobook_generator/
├── main.py                 # FastAPI backend server
├── requirements.txt        # Python dependencies
├── frontend/               # Electron frontend
│   ├── src/
│   │   ├── main.js        # Electron main process
│   │   ├── preload.js     # Preload script (IPC)
│   │   ├── renderer.jsx   # React entry point
│   │   ├── pages/
│   │   │   └── Home.jsx   # Main application page
│   │   ├── components/    # React components
│   │   └── css/           # Stylesheets
│   └── package.json       # Node.js dependencies
├── uploads/               # Temporary EPUB storage
└── output/               # Generated files (txt, mp3)

🔌 API Endpoints

POST `/extract`

Upload and extract text from an EPUB file.

Request:

file: EPUB file (multipart/form-data)

Response:

{
  "message": "EPUB extracted successfully",
  "output": "book_name.txt",
  "text": "Full extracted text content..."
}
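
For scripting against the backend without the Electron UI, the upload can be sketched with the Python standard library. This is a minimal sketch, not the project's own client: it assumes the server runs at http://localhost:8000 and accepts the EPUB in a multipart field named file, as described above. The function names are hypothetical.

```python
import json
import uuid
import urllib.request
from pathlib import Path

BASE_URL = "http://localhost:8000"  # backend address from the Usage section


def build_extract_request(epub_path, base_url=BASE_URL):
    """Build a multipart/form-data POST for /extract with the EPUB in the 'file' field."""
    path = Path(epub_path)
    boundary = uuid.uuid4().hex  # random separator between multipart sections
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        "Content-Type: application/epub+zip\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/extract",
        data=head + path.read_bytes() + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def extract_text(epub_path, base_url=BASE_URL):
    """Send the upload and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_extract_request(epub_path, base_url)) as resp:
        return json.load(resp)
```

With the backend running, extract_text("book.epub")["text"] should hold the extracted text shown in the response above.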

POST `/convert`

Convert text to audiobook.

Request:

{
  "text": "Text content to convert...",
  "filename": "book_name.epub"
}

Response:

{
  "message": "Conversion started",
  "task_id": "uuid-here"
}
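
A conversion can be started from a script in the same way. The sketch below only assumes the JSON request and response shapes shown above; build_convert_request and start_conversion are hypothetical helper names, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the Usage section


def build_convert_request(text, filename, base_url=BASE_URL):
    """Build the JSON POST for /convert using the documented request fields."""
    payload = json.dumps({"text": text, "filename": filename}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/convert",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def start_conversion(text, filename, base_url=BASE_URL):
    """Start a conversion and return the task_id from the response."""
    with urllib.request.urlopen(build_convert_request(text, filename, base_url)) as resp:
        return json.load(resp)["task_id"]
```

The returned task_id is what the progress endpoint expects.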

GET `/progress/{task_id}`

Get conversion progress.

Response:

{
  "status": "processing",
  "progress": 45,
  "current_chunk": 10,
  "total_chunks": 25,
  "message": "Processing chunk 10 of 25...",
  "output_file": null
}
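
A client would typically poll this endpoint until the task finishes. The loop below is a sketch under stated assumptions: only the "processing" status appears in this README, so the terminal status names "completed" and "error" are guesses and should be checked against main.py. The fetch function is injectable so the loop can be exercised without a running server.

```python
import json
import time
import urllib.request

# Assumed terminal statuses; only "processing" is documented above.
TERMINAL_STATUSES = {"completed", "error"}


def fetch_progress(task_id, base_url="http://localhost:8000"):
    """GET /progress/{task_id} and return the decoded JSON dict."""
    with urllib.request.urlopen(f"{base_url}/progress/{task_id}") as resp:
        return json.load(resp)


def poll_progress(task_id, fetch=fetch_progress, interval=1.0,
                  sleep=time.sleep, on_update=None):
    """Call fetch(task_id) repeatedly until a terminal status is reported."""
    while True:
        state = fetch(task_id)
        if on_update is not None:
            on_update(state)  # e.g. refresh a progress bar
        if state.get("status") in TERMINAL_STATUSES:
            return state
        sleep(interval)
```

For example, poll_progress(task_id, on_update=lambda s: print(s["progress"], s["message"])) would print each update until the task ends.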

GET `/download/{filename}`

Download the generated audio file.
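
Generated filenames may contain spaces or other characters that need URL encoding. The sketch below shows one way to build the download URL and save the file; the helper names are hypothetical, and the base URL is the one from the Usage section.

```python
import shutil
import urllib.request
from urllib.parse import quote


def build_download_url(filename, base_url="http://localhost:8000"):
    """Return the /download URL with the filename percent-encoded."""
    return f"{base_url}/download/{quote(filename, safe='')}"


def download_audio(filename, dest_path, base_url="http://localhost:8000"):
    """Stream the generated file to dest_path and return that path."""
    url = build_download_url(filename, base_url)
    with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```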

⚡ Performance Tips

GPU Optimization

The application automatically detects and uses your GPU
Larger chunks use more GPU memory but process faster
RTX 4070 users can process chunk sizes of up to 150,000 characters
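
How main.py detects the GPU is not shown in this README. A common PyTorch pattern, given here only as a sketch, is to ask torch.cuda.is_available() and fall back to the CPU; pick_device is a hypothetical name.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"TTS would run on: {pick_device()}")
```

If this prints cpu on a machine with an NVIDIA card, the installed PyTorch build is probably CPU-only.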

Speed Expectations

| Hardware | Processing Speed | 2.4M Character Book |
|----------|------------------|---------------------|
| RTX 4070 | ~400+ chars/sec | ~19 minutes |
| RTX 3070 | ~300+ chars/sec | ~25 minutes |
| CPU only | ~50-100 chars/sec | 2-4 hours |

🔧 Configuration

Backend Configuration

Edit main.py to customize:

# Voice selection (line ~213)
voice = 'af_sarah'    # Options: 'af_sarah', 'af_heart', etc.

# Speech speed (line ~214)
speed = 1.0           # 0.8 = slower, 1.2 = faster

# Chunk size (line ~216)
chunk_size = 100000   # Larger = fewer files, more GPU memory usage

Frontend Configuration

Edit frontend/src/pages/Home.jsx to change the backend URL:

const CONST_BASE_URL = "http://localhost:8000";

🐛 Troubleshooting

Common Issues

"CUDA not available"

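# Check whether PyTorch can see the GPU (prints True when CUDA is usable)
python -c "import torch; print(torch.cuda.is_available())"

# Reinstall PyTorch with CUDA (uninstall first so pip does not keep a CPU-only build)
pip uninstall -y torch torchaudio
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121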

"FFmpeg not found"

Ensure FFmpeg is installed and in your system PATH
Test: ffmpeg -version
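
The same check can be run from Python, which looks FFmpeg up on the PATH of the environment the backend actually runs in. This is a small diagnostic sketch with hypothetical function names, not code from main.py.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path of the ffmpeg executable on PATH, or None."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(executable):
    """Run `ffmpeg -version` and return the first line of its output."""
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg not found on PATH")
    else:
        print(path, "->", ffmpeg_version_line(path))
```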

Backend connection errors

Ensure the FastAPI server is running on port 8000
Check that CORS is properly configured
Verify the backend URL in the frontend matches your server
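
To rule out the simplest cause first, the following sketch checks whether anything is listening on the backend port. It only tests the TCP connection, not CORS or the API itself; backend_reachable is a hypothetical helper.

```python
import socket


def backend_reachable(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host not resolvable
        return False


if __name__ == "__main__":
    print("Backend reachable:", backend_reachable())
```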

Progress bar not updating

Check the DevTools console in the Electron window for errors
Verify the task_id is being received from the convert endpoint
Ensure the polling interval is working (check Network tab)

File download not working

Check that the file exists in the output/ folder
Verify file permissions
Check Electron IPC handlers are properly set up

Performance Issues

Slow processing: Ensure GPU is being used (check backend console output)
High memory usage: Reduce chunk_size in main.py
Frontend lag: Close DevTools if open, reduce polling frequency

📖 How It Works

1. Text Extraction:
- User uploads EPUB file via the frontend
- Backend parses EPUB and extracts clean text
- Text is cleaned and formatted
- Extracted text is returned to frontend and saved to output/ folder

2. Text Review:
- User can review and edit the extracted text
- Edits are stored in memory (not saved to file)
- User can download the original extracted text

3. TTS Conversion:
- User submits text for conversion
- Backend creates a unique task ID
- Text is split into GPU-manageable chunks
- Each chunk is processed with Kokoro TTS
- Progress is tracked and updated in real-time

4. Audio Combination:
- Individual audio chunks are combined using FFmpeg
- Final MP3 file is created in output/ folder
- Temporary files are cleaned up

5. File Access:
- User can download the MP3 file directly
- Or open the file location in system file manager (Electron only)
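
The Audio Combination step can be illustrated with FFmpeg's concat demuxer. This README does not show how main.py actually calls FFmpeg, so the sketch below is one plausible approach: the chunk file names, the WAV chunk format, the 128k bitrate, and the function names are all assumptions.

```python
import subprocess
from pathlib import Path


def write_concat_list(chunk_paths, list_path):
    """Write the list file read by FFmpeg's concat demuxer, one entry per chunk."""
    lines = []
    for chunk in chunk_paths:
        # Inside single quotes, a literal quote is written as '\''
        escaped = str(chunk).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path


def build_ffmpeg_command(list_path, output_mp3, bitrate="128k"):
    """Return the argument list that joins the chunks and encodes them as MP3."""
    return [
        "ffmpeg", "-y",              # overwrite the output file if it exists
        "-f", "concat", "-safe", "0",
        "-i", str(list_path),
        "-c:a", "libmp3lame", "-b:a", bitrate,
        str(output_mp3),
    ]


def combine_chunks(chunk_paths, output_mp3, list_path="chunks.txt"):
    """Write the list file and run FFmpeg; raises CalledProcessError on failure."""
    write_concat_list(chunk_paths, list_path)
    subprocess.run(build_ffmpeg_command(list_path, output_mp3), check=True)
```

Relative chunk paths in the list file are resolved relative to the list file's own location, so writing the list next to the chunks keeps the entries short.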

🎛️ Customization

Voice and Speed Settings

Edit main.py around line 213:

# Voice options: keep exactly one assignment active
voice = 'af_sarah'      # Default female voice
# voice = 'af_heart'    # Alternative female voice

# Speed control: keep exactly one assignment active
speed = 1.0             # Normal speed (default)
# speed = 0.8           # Slower, more deliberate
# speed = 1.2           # Faster narration

GPU Memory Optimization

Edit main.py around line 216:

# Chunk size (characters per processing chunk): keep exactly one assignment active
# chunk_size = 50000    # Conservative (4GB+ GPU)
chunk_size = 100000     # Balanced (8GB+ GPU, like RTX 4070)
# chunk_size = 150000   # Maximum (12GB+ GPU, like RTX 4080/4090)
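
To make the effect of chunk_size concrete, here is a sketch of a sentence-aware splitter. It is not the splitter from main.py, which this README does not show. It assumes a greedy strategy that packs whole sentences into each chunk and hard-splits only sentences longer than the limit; split_into_chunks is a hypothetical name.

```python
import re


def split_into_chunks(text, chunk_size=100000):
    """Greedily pack whole sentences into chunks of at most chunk_size characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit has to be cut mid-sentence.
        while len(sentence) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:chunk_size])
            sentence = sentence[chunk_size:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)  # current chunk is full, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries keeps the TTS engine from starting or ending a chunk mid-sentence, which would otherwise be audible at the joins.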

Text Cleaning

Edit the clean_text() function in main.py:

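import re

def clean_text(text):
    """Normalize extracted EPUB text before it is sent to the TTS engine."""
    # Basic cleanup: collapse runs of whitespace (including newlines) into single spaces
    text = re.sub(r'\s+', ' ', text.strip())

    # Handle censored words so they are read aloud as words
    text = re.sub(r'F\s*\*\s*ck', 'Fuck', text)

    # Custom replacements: expand abbreviations (\b avoids matching inside longer words)
    text = re.sub(r'\bDr\.', 'Doctor', text)
    text = re.sub(r'\bMr\.', 'Mister', text)

    return text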
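
When the list of custom replacements grows, a lookup table can be easier to maintain than one re.sub call per abbreviation. The following is a sketch of that idea, not code from main.py; ABBREVIATIONS and expand_abbreviations are hypothetical names, and the two entries are the ones used above.

```python
import re

# Hypothetical table; extend it with any abbreviations your books use.
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "Mr.": "Mister",
}

# One compiled pattern matching any key as a whole token followed by
# whitespace or the end of the text.
_ABBREVIATION_RE = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in ABBREVIATIONS) + r")(?=\s|$)"
)


def expand_abbreviations(text):
    """Replace each known abbreviation with its spoken form in a single pass."""
    return _ABBREVIATION_RE.sub(lambda m: ABBREVIATIONS[m.group(1)], text)
```

For example, expand_abbreviations("Dr. Smith met Mr. Jones.") returns "Doctor Smith met Mister Jones."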

📄 License

This project uses:

Kokoro TTS: Apache 2.0 License
FastAPI: MIT License
Electron: MIT License
React: MIT License
Other dependencies: Various open-source licenses

🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your improvements
4. Submit a pull request

📞 Support

For issues:

Check the troubleshooting section
Ensure all dependencies are installed
Verify GPU/CUDA setup
Check FFmpeg installation
Review browser/Electron console for errors

🙏 Acknowledgments

Kokoro TTS team for the excellent TTS model
PyTorch for GPU acceleration framework
FastAPI for the modern Python web framework
Electron for cross-platform desktop app framework
FFmpeg for audio processing
ebooklib for EPUB parsing

Happy audiobook generation! 🎧📚
