Project Auralis

Auralis is a speaker identification system that uses voice biometrics to identify speakers in audio files. It is particularly well-suited for scenarios where speakers are known and recur, such as earnings calls, meetings, or podcasts.

The system works by generating a unique "voiceprint" (a speaker embedding) for each person and storing it in a reference database. When given a new audio clip, Auralis compares the voice in the clip to the database to find a match.

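This project is fully containerized with Docker, so it can be set up and run on any system where Docker is available. See the Compatibility section for the environments that have actually been tested.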

How it Works

The core of Auralis is a deep learning model (speechbrain/spkrec-ecapa-voxceleb) that has been trained to extract the unique characteristics of a person's voice. The process is as follows:

Audio Processing: Raw audio files are sliced into smaller, labeled clips for each speaker.
Embedding Generation: A speaker embedding (a vector of numbers) is generated for each clip. These embeddings, along with an average embedding for each speaker, are stored in a JSON database.
Speaker Matching: To identify a speaker in a new audio clip, an embedding is generated for the clip and compared against the average embeddings in the database using cosine similarity. The speaker with the highest similarity score is identified as the match.
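
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the "clips" and "average" keys, and the toy 3-dimensional vectors are all assumptions made for the example. Real embeddings come from the speechbrain model and are much longer.

```python
import json
import math


def mean_embedding(embeddings):
    """Element-wise average of equal-length embedding vectors."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def build_database(clip_embeddings):
    """Map each speaker to their per-clip embeddings plus an average.

    The "clips"/"average" layout is hypothetical, chosen for this sketch.
    """
    return {
        speaker: {"clips": clips, "average": mean_embedding(clips)}
        for speaker, clips in clip_embeddings.items()
    }


def identify_speaker(query, database):
    """Return (speaker, score) for the highest-scoring average embedding."""
    scores = {
        speaker: cosine_similarity(query, entry["average"])
        for speaker, entry in database.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy 3-dimensional vectors stand in for real model embeddings.
clip_embeddings = {
    "alice": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "bob": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
}
database = build_database(clip_embeddings)

# The database is plain lists and dicts, so it round-trips through JSON.
database = json.loads(json.dumps(database))

speaker, score = identify_speaker([0.9, 0.1, 0.0], database)
print(speaker, round(score, 3))
```

Because the match is always the highest-scoring speaker, a clip from someone who is not in the database will still be assigned to whoever is closest. A similarity threshold below which no match is reported would be one way to handle that case, though the source does not say whether Auralis applies one.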

Getting Started

Prerequisites

Docker installed and running on your system.
A Hugging Face account and an access token.

1. Hugging Face Authentication

This project requires downloading a pre-trained model from the Hugging Face Hub. The model used is speechbrain/spkrec-ecapa-voxceleb, which is a gated repository.

Create a Hugging Face Account: If you don't have one, create an account at huggingface.co.
Accept the Model's Terms: Visit the model's page at https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb and accept the license agreement.
Generate an Access Token: In your Hugging Face account settings, create an access token with "read" permissions.

This token will be passed to the Docker container as an environment variable.
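
As a rough sketch of what reading that variable inside the container could look like, the snippet below fetches the token and fails early with a clear message when it is missing. The variable name HF_TOKEN and the helper read_hf_token are assumptions made for this example; check EXAMPLES.md for the name the project's scripts actually expect.

```python
import os


def read_hf_token(env=None, name="HF_TOKEN"):
    """Return the token stored under `name`, or raise if it is missing.

    `name` defaults to HF_TOKEN, which is an assumed variable name.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set; pass it to the container with "
            f"docker run -e {name}=<your token>"
        )
    return token


# Demonstrated with a stand-in mapping rather than the real environment.
print(read_hf_token({"HF_TOKEN": "hf_example"}))
```

Failing at startup is a deliberate choice here: a missing token otherwise tends to surface later as a less obvious download error.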

2. Build the Docker Image

You can build the Docker image with or without GPU support.

For CPU:

docker build -t auralis-cpu -f docker/Dockerfile.cpu .

For GPU:

docker build -t auralis-gpu -f docker/Dockerfile.gpu .

3. Usage

For a complete, step-by-step walkthrough on how to process audio, generate embeddings, and test speaker matching, please refer to the EXAMPLES.md file.

This guide will walk you through the entire workflow, from raw audio to speaker identification, with copy-paste-friendly commands.

Compatibility

This project has been tested on an Intel-based Mac (macOS). The auralis-cpu Docker image and all scripts have been confirmed to work in this environment.

The Dockerfile.gpu is provided for users with NVIDIA GPUs, but it has not been tested.

Project Structure

Auralis/
├── docker/               # Dockerfiles for CPU and GPU environments
│   ├── Dockerfile.cpu
│   ├── Dockerfile.gpu
│   └── requirements.txt
├── src/                  # Python source code
│   ├── process_audio.py
│   ├── generate_embeddings.py
│   └── test_matching.py
├── data/                 # Data directory (ignored by git)
│   ├── raw_audio/        # Place your raw audio files here
│   ├── processed_audio/  # Processed clips will be saved here
│   └── test_audio/       # Place audio files for testing here
├── .gitignore
├── README.md             # This file
├── EXAMPLES.md           # Step-by-step usage examples
└── LICENSE               # MIT License

Next Steps

This proof-of-value release provides a solid foundation. Future enhancements could include:

A more robust database for storing embeddings.
A user interface for easier interaction.
Real-time transcription and speaker identification.

License

This project is licensed under the MIT License. See the LICENSE file for details.
