Professor Profiler

An advanced hierarchical multi-agent system that reverse-engineers exam papers using NVIDIA NIM and Google Gemini to decode topic weights and cognitive complexity, outputting optimized study recommendations.

Overview

Professor Profiler is a Hierarchical Multi-Agent System (HMAS) designed to mimic the cognitive process of an expert academic coach. It orchestrates specialized worker agents powered by NVIDIA NIM (with an optional Google Gemini fallback) to ingest exam PDFs, classify each question by topic and Bloom's Taxonomy level, track statistical trends, and produce a prioritized study plan.

This project serves as a reference implementation for:

Hub-and-Spoke Agent Architecture
Hosted LLM Integration via NVIDIA NIM (OpenAI-compatible) and Google Gemini (fallback)
Long-term Memory Management (JSON-persisted memory banks)
Production-grade Observability (tracing, request count, and latency metrics)
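
As an illustration of the JSON-persisted memory bank idea, here is a minimal sketch. The `MemoryBank` class, its method names, and the file name are hypothetical and are not taken from the project's code; the sketch only shows the general pattern of writing state to a JSON file and reloading it later.

```python
import json
import tempfile
from pathlib import Path


class MemoryBank:
    """Hypothetical JSON-backed key/value store (not the project's actual class)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load prior state if the file exists, otherwise start empty.
        self.data: dict = json.loads(path.read_text()) if path.exists() else {}

    def remember(self, key: str, value) -> None:
        self.data[key] = value
        # Persist after every write so a crash loses at most one update.
        self.path.write_text(json.dumps(self.data, indent=2))

    def recall(self, key: str, default=None):
        return self.data.get(key, default)


# Usage: state written by one instance is visible to a fresh instance.
bank_path = Path(tempfile.mkdtemp()) / "memory_bank.json"
MemoryBank(bank_path).remember("last_exam", {"topic": "graphs", "questions": 12})
restored = MemoryBank(bank_path).recall("last_exam")
```

Writing on every update is the simplest choice; a real implementation might batch writes or write atomically through a temporary file.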

System Architecture

Agent execution forms a directed acyclic graph (DAG) managed by a central orchestrator, the root agent.

High-Level Design

flowchart TD
    subgraph External_Layer [" External Layer"]
        User([User / Client])
        PDF_File[Exam PDF]
    end

    subgraph Orchestration_Layer [" Orchestration Layer"]
        Runner[<b>Runner</b><br><i>State Management</i>]
        Memory[(<b>Memory Bank</b><br><i>JSON Persistence</i>)]
        Session[<b>Session Service</b>]
    end

    subgraph Agent_Layer [" Agent Hierarchy"]
        Root[<b>ROOT AGENT</b><br><i>Llama 3.3 70B (NIM)</i><br>The Orchestrator]

        subgraph Workers ["Specialized Sub-Agents"]
            Taxonomist[<b>Taxonomist</b><br><i>Llama 3.1 70B (NIM)</i><br>Topic & Bloom's Classification]
            Trend[<b>Trend Spotter</b><br><i>Llama 3.3 70B (NIM)</i><br>Statistical Analysis]
            Strat[<b>Strategist</b><br><i>Llama 3.3 70B (NIM)</i><br>Study Planning]
        end
    end

    subgraph Tool_Layer [" Tool Layer"]
        Reader[PDF Ingestion]
        Plotter[Matplotlib Viz]
        Calc[Stats Engine]
    end

    User --> Runner
    PDF_File --> Reader
    Runner <--> Session
    Runner <--> Memory
    Runner --> Root

    Root --Delegates--> Taxonomist
    Root --Delegates--> Trend
    Root --Delegates--> Strat

    Root --Calls--> Reader
    Root --Calls--> Plotter
    Trend --Calls--> Calc
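
The delegation edges in the diagram can be sketched in a few lines of async Python. This is an illustrative stand-in under stated assumptions: `RootAgent`, `WorkerAgent`, and the `handled_by` field are hypothetical names, and the workers here only record that they ran instead of calling a model.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class WorkerAgent:
    name: str

    async def run(self, payload: dict) -> dict:
        # Stand-in for an LLM call: record that this worker handled the payload.
        return {**payload, "handled_by": payload.get("handled_by", []) + [self.name]}


@dataclass
class RootAgent:
    workers: list[WorkerAgent] = field(default_factory=list)

    async def run(self, payload: dict) -> dict:
        # Hub-and-spoke: every hand-off goes through the root agent.
        for worker in self.workers:
            payload = await worker.run(payload)
        return payload


root = RootAgent(
    [WorkerAgent("taxonomist"), WorkerAgent("trend_spotter"), WorkerAgent("strategist")]
)
result = asyncio.run(root.run({"pdf": "exam.pdf"}))
```

Because each worker's output is passed back through the root before the next worker runs, the workers stay independent of one another and can be swapped or reordered in one place.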

Tech Stack

Component	Technology	Description
Core Logic	Python 3.10+	Type-hinted, async-native codebase.
LLM Provider	NVIDIA NIM	OpenAI-compatible endpoint hosting high-performance open models.
Fallback Engine	Google Gemini	Fallback path used when NIM requests fail, or for side-by-side comparison testing.
Document Processing	`pypdf`	Robust text extraction from exam PDFs.
Visualization	`matplotlib`	Generates distribution bar charts and pie charts.
Observability	Custom Logging	Structured logging with latency and success metrics.
Configuration	`python-dotenv`	Environment variable validation and typed configurations.
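
The primary/fallback relationship between the two providers can be sketched as follows. The function `complete_with_fallback` and the two fake providers are hypothetical; a real setup would wrap the NIM and Gemini clients behind the same async callable shape, and might catch narrower exception types.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed shape: a provider takes a prompt and returns the completion text.
Provider = Callable[[str], Awaitable[str]]


async def complete_with_fallback(
    prompt: str, primary: Provider, fallback: Provider
) -> tuple[str, bool]:
    """Return (text, used_fallback). Try the primary first, fall back on any error."""
    try:
        return await primary(prompt), False
    except Exception:
        return await fallback(prompt), True


async def failing_nim(prompt: str) -> str:
    # Simulates the primary provider being unavailable.
    raise ConnectionError("simulated NIM outage")


async def fake_gemini(prompt: str) -> str:
    return f"gemini:{prompt}"


text, used_fallback = asyncio.run(
    complete_with_fallback("classify Q1", failing_nim, fake_gemini)
)
```

Returning the `used_fallback` flag alongside the text makes it straightforward to count fallbacks in the request metrics.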

Agent Personas

The system splits the cognitive load across three distinct worker agents:

1. The Taxonomist (Classifier)

Model: meta/llama-3.1-70b-instruct (NIM) | gemini-2.0-flash-exp (Gemini)
Role: The meticulous grader. It reads every question and tags it with a topic and Bloom's Taxonomy Level (Remember, Understand, Apply, Analyze, Evaluate, Create).
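
One way to represent the Taxonomist's tags is an ordered enum for the six Bloom levels plus a small record per question. This is a sketch only: `BloomLevel`, `TaggedQuestion`, and `parse_bloom` are hypothetical names, and the numeric ordering is an assumption that follows the order in which the levels are listed above.

```python
from dataclasses import dataclass
from enum import IntEnum


class BloomLevel(IntEnum):
    # Ordered from lowest to highest cognitive complexity.
    REMEMBER = 1
    UNDERSTAND = 2
    APPLY = 3
    ANALYZE = 4
    EVALUATE = 5
    CREATE = 6


@dataclass(frozen=True)
class TaggedQuestion:
    text: str
    topic: str
    bloom: BloomLevel


def parse_bloom(label: str) -> BloomLevel:
    """Map a model-produced label such as ' evaluate ' to a BloomLevel."""
    return BloomLevel[label.strip().upper()]


question = TaggedQuestion(
    text="Justify the choice of Dijkstra over BFS for this graph.",
    topic="Graph Theory",
    bloom=parse_bloom(" evaluate "),
)
```

Using an `IntEnum` lets later stages compare levels directly, for example to count how many questions sit above `APPLY`.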

2. The Trend Spotter (Analyst)

Model: meta/llama-3.3-70b-instruct (NIM) | gemini-2.0-pro-exp (Gemini)
Role: The data scientist. It looks at the frequency and cognitive complexity distribution to isolate shifts and outliers.
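
A simple form of the frequency analysis described here is to compare each topic's share of questions between two papers. The helper names and the sample exams below are made up for illustration; the project's stats engine may compute this differently.

```python
from collections import Counter


def topic_shares(topics: list[str]) -> dict[str, float]:
    """Fraction of questions devoted to each topic."""
    counts = Counter(topics)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def share_shifts(old: list[str], new: list[str]) -> dict[str, float]:
    """Change in topic share from the old paper to the new one."""
    old_shares, new_shares = topic_shares(old), topic_shares(new)
    all_topics = sorted(set(old_shares) | set(new_shares))
    return {t: new_shares.get(t, 0.0) - old_shares.get(t, 0.0) for t in all_topics}


# Hypothetical per-question topic tags for two exam papers.
exam_2023 = ["sorting", "sorting", "graphs", "dp"]
exam_2024 = ["graphs", "graphs", "graphs", "dp"]
shifts = share_shifts(exam_2023, exam_2024)
```

A large positive shift marks a topic gaining weight, and a large negative shift marks one that is fading from recent papers.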

3. The Strategist (Coach)

Model: meta/llama-3.3-70b-instruct (NIM) | gemini-2.0-pro-exp (Gemini)
Role: The academic coach. It aggregates findings into a study recommendation containing a Hit List, Safe Zone, and Drop List.
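
The README does not define how topics are assigned to the three lists, so the following is a sketch under an explicit assumption: topics with a high share of exam weight go to the Hit List, topics with a very low share go to the Drop List, and everything in between lands in the Safe Zone. The function name and both thresholds are hypothetical.

```python
def bucket_topics(
    weights: dict[str, float], hit: float = 0.25, drop: float = 0.10
) -> dict[str, list[str]]:
    """Split topics into three lists by weight (thresholds are assumptions)."""
    plan: dict[str, list[str]] = {"hit_list": [], "safe_zone": [], "drop_list": []}
    # Visit topics from heaviest to lightest so each list is already ranked.
    for topic, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        if weight >= hit:
            plan["hit_list"].append(topic)
        elif weight < drop:
            plan["drop_list"].append(topic)
        else:
            plan["safe_zone"].append(topic)
    return plan


plan = bucket_topics(
    {"graphs": 0.40, "dp": 0.30, "sorting": 0.15, "hashing": 0.10, "tries": 0.05}
)
```

In the actual system this judgment is made by the Strategist model; a fixed-threshold rule like this one is only a baseline for reasoning about the output format.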

Getting Started

For full instructions on setting up the core Python backend, configuring your API keys, and installing the optional interactive Node.js CLI, please see our dedicated Installation Guide (INSTALL.md).

Usage

1. Interactive Execution (Recommended)

Place your exam PDF inside the input/ folder, then run the startup runner:

Linux / macOS

python3 run.py

Windows

python run.py

This script will:

Check for existing PDFs in input/. If empty, it automatically generates three sample exams using create_sample_exams.py.
Prompt you to select a PDF for analysis.
Execute the orchestrator and all sub-agents sequentially.
Save a structured Markdown study report to output/reports/.
Print request metrics (total requests, average latency, and fallback counts).
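
The three printed metrics can be accumulated with a small counter object. This is a minimal sketch; `RequestMetrics` and its field names are hypothetical and do not reflect the project's logging code.

```python
from dataclasses import dataclass


@dataclass
class RequestMetrics:
    total_requests: int = 0
    total_latency_s: float = 0.0
    fallback_count: int = 0

    def record(self, latency_s: float, used_fallback: bool = False) -> None:
        """Register one completed request."""
        self.total_requests += 1
        self.total_latency_s += latency_s
        if used_fallback:
            self.fallback_count += 1

    @property
    def average_latency_s(self) -> float:
        # Guard against division by zero before any request is recorded.
        if self.total_requests == 0:
            return 0.0
        return self.total_latency_s / self.total_requests


metrics = RequestMetrics()
metrics.record(1.0)
metrics.record(2.0, used_fallback=True)
metrics.record(3.0)
summary = (metrics.total_requests, metrics.average_latency_s, metrics.fallback_count)
```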

2. Running the Benchmarks

To compare latency and execution between NVIDIA NIM and Google Gemini:

Linux / macOS

python3 scripts/benchmark_nim_vs_gemini.py

Windows

python scripts/benchmark_nim_vs_gemini.py

3. Verification & Testing

To execute the automated test suites:

Linux / macOS

# Run unit and migration tests
pytest tests/test_nim_migration.py

# Run integration tests
pytest tests/test_nim_full_integration.py

# Run agent routing tests
pytest tests/test_agent.py

Windows

# Run unit and migration tests
pytest tests/test_nim_migration.py

# Run integration tests
pytest tests/test_nim_full_integration.py

# Run agent routing tests
pytest tests/test_agent.py

Project Structure

Professor_Profiler/
├── input/                      # Exam PDF inputs
├── output/                     # Generated artifacts
│   ├── charts/                 # Visualizations (.png)
│   ├── logs/                   # Execution log files
│   └── reports/                # Markdown study recommendations
├── google/adk/                 # Custom Agent Development Kit (ADK)
│   ├── agents/                 # Base Agent & tool execution
│   ├── clients/                # NIM Client wrapping AsyncOpenAI
│   └── runners/                # Orchestrator Runner
├── profiler_agent/             # App-specific agents and configurations
│   ├── sub_agents/             # Taxonomist, Trend Spotter, Strategist
│   ├── tools.py                # Ingestion, Stats, and Viz tools
│   └── config.py               # Provider settings
├── run.py                      # Interactive startup runner
├── demo.py                     # Feature demo runner
└── tests/                      # Automated test suite

Troubleshooting

Issue	Cause	Solution
KeyError / 404 Not Found	Model not active on tier	Switch NIM models in `.env` to meta family (e.g. `meta/llama-3.1-70b-instruct`).
TypeError: ARC4	Cryptography warning raised by `pypdf`	Safe to ignore, or update the `cryptography` package.
asyncio.TimeoutError	Slow hosted API endpoint	Increase `NIM_TIMEOUT` inside `.env` to `120` or higher.
401 Unauthorized	Invalid key	Verify `NIM_API_KEY` or `GOOGLE_API_KEY` is loaded correctly in `.env`.
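
To show how a timeout setting such as `NIM_TIMEOUT` typically takes effect, here is a sketch that reads the value from the environment and applies it with `asyncio.wait_for`. The helper names and the 60-second default are assumptions, and `slow_endpoint` merely simulates a hosted API call.

```python
import asyncio
import os


def nim_timeout(default: float = 60.0) -> float:
    """Read NIM_TIMEOUT (seconds) from the environment; the default is an assumption."""
    return float(os.environ.get("NIM_TIMEOUT", default))


async def call_with_timeout(coro_factory, timeout_s: float):
    """Await the call, returning None if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(coro_factory(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def slow_endpoint() -> str:
    # Simulated hosted endpoint that takes 0.2 seconds to answer.
    await asyncio.sleep(0.2)
    return "ok"


os.environ["NIM_TIMEOUT"] = "120"
timeout_s = nim_timeout()
within_limit = asyncio.run(call_with_timeout(slow_endpoint, timeout_s))
timed_out = asyncio.run(call_with_timeout(slow_endpoint, 0.01))
```

With a generous limit the call completes normally; with a limit shorter than the endpoint's response time the call is cancelled, which is the situation that surfaces as `asyncio.TimeoutError` in the table above.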

License

Distributed under the MIT License. See LICENSE for more details.

Maintained by uffamit
Website: amitdevx
