PSquare-Lab/LLM-TCS-HPT

LLM Hyperparameter Tuning

A comprehensive, beginner-friendly framework for optimizing machine learning model hyperparameters using Large Language Models. This system automatically generates, tests, and analyzes hyperparameter configurations to improve your model's performance.

Overview

This framework uses a three-component pipeline to intelligently optimize hyperparameters:

  1. Hyperparameter Generator
  2. Model Training Script
  3. Results Analyzer

The system works by having an LLM analyze your model's training results and suggest better hyperparameter values for the next iteration. It supports both deep learning (PyTorch, TensorFlow) and classic ML (scikit-learn) models.

How It Works at a High Level

  1. Start with your model - Create a training script that follows our simple interface
  2. Configure search space - Define which hyperparameters to optimize and their ranges
  3. Run optimization - The system automatically generates hyperparameters, trains your model, and analyzes results
  4. Get better performance - The LLM learns from each iteration to suggest improvements

The optimizer communicates with your model through simple JSON files, making it framework-agnostic and easy to integrate.
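The JSON handshake described above can be sketched in a few lines. This is only an illustration of the file exchange, not the project's own loop (that lives in run.sh and the optimizer scripts); `run_iteration` and `fake_train` are hypothetical names, and `fake_train` stands in for a real model.py.

```python
import json
import os
import tempfile


def run_iteration(temp_dir, hyperparams, train_fn):
    """One optimizer/model round trip through JSON files (illustrative only)."""
    hp_path = os.path.join(temp_dir, "hyperparameters.json")
    results_path = os.path.join(temp_dir, "results.json")

    # Optimizer side: publish the hyperparameters to try next.
    with open(hp_path, "w") as f:
        json.dump(hyperparams, f, indent=2)

    # Model side: read them, train, and publish per-epoch results.
    with open(hp_path) as f:
        hp = json.load(f)
    with open(results_path, "w") as f:
        json.dump(train_fn(hp), f, indent=2)

    # Optimizer side: read the results back for analysis.
    with open(results_path) as f:
        return json.load(f)


def fake_train(hp):
    """Hypothetical stand-in for model.py; returns a fixed trajectory."""
    return {"metrics": {"val_accuracy": [0.80, 0.85]}, "epochs": [1, 2]}


with tempfile.TemporaryDirectory() as temp_dir:
    results = run_iteration(temp_dir, {"learning_rate": 0.001}, fake_train)
    print(results["metrics"]["val_accuracy"][-1])
```

Because both sides only touch files, the training function can be written in any framework as long as it produces the same results structure.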

Requirements and Installation

Prerequisites

# Python 3.7+ required
python --version

# Install core dependencies
pip install ollama numpy matplotlib

# For deep learning examples (optional)
pip install torch transformers scikit-learn pandas

# For classic ML examples (optional)  
pip install scikit-learn pandas numpy

LLM Setup

# Install and run Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a recommended model (choose one)
ollama pull qwen2.5-coder:32b    # Best performance
ollama pull llama3.1:8b          # Balanced option
ollama pull qwen2.5-coder:7b     # Lightweight option

Quick Start

Step 1: Create Your Configuration

Copy info_template.json to info.json and customize:

{
  "model_info": "Your model description",
  "optimization_goal": "Maximize validation accuracy",
  "metrics": {
    "primary_metric": "val_accuracy",
    "description": "Validation accuracy on held-out data"
  },
  "hyperparameters": {
    "learning_rate": {
      "type": "float",
      "range": [1e-5, 0.1],
      "default": 0.001
    },
    "batch_size": {
      "type": "ordinal", 
      "values": [16, 32, 64, 128],
      "default": 32
    }
  }
}
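Before running the optimizer, it can help to sanity-check the configuration. The sketch below is a hypothetical helper (`validate_info` is not part of the framework) that assumes the schema shown in this README: `range` for float and integer types, `values` for ordinal and categorical types, and an optional `default`.

```python
def validate_info(config):
    """Return a list of problems found in an info.json-style dict."""
    problems = []
    if "primary_metric" not in config.get("metrics", {}):
        problems.append("metrics.primary_metric is missing")

    for name, spec in config.get("hyperparameters", {}).items():
        kind = spec.get("type")
        default = spec.get("default")
        if kind in ("float", "integer"):
            bounds = spec.get("range")
            if not bounds or len(bounds) != 2 or bounds[0] >= bounds[1]:
                problems.append(f"{name}: 'range' must be [low, high] with low < high")
            elif default is not None and not bounds[0] <= default <= bounds[1]:
                problems.append(f"{name}: default {default} is outside {bounds}")
        elif kind in ("ordinal", "categorical"):
            values = spec.get("values")
            if not values:
                problems.append(f"{name}: 'values' must be a non-empty list")
            elif default is not None and default not in values:
                problems.append(f"{name}: default {default!r} is not in 'values'")
        else:
            problems.append(f"{name}: unknown type {kind!r}")
    return problems


config = {
    "metrics": {"primary_metric": "val_accuracy"},
    "hyperparameters": {
        "learning_rate": {"type": "float", "range": [1e-5, 0.1], "default": 0.001},
        "batch_size": {"type": "ordinal", "values": [16, 32, 64, 128], "default": 48},
    },
}
for problem in validate_info(config):
    print(problem)  # prints: batch_size: default 48 is not in 'values'
```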

Step 2: Create Your Model Script

Copy model_template.py to model.py and replace the template sections with your training code.

Step 3: Run Optimization

# Set your LLM model
export LLM_MODEL="qwen2.5-coder:32b"

# Run optimization (default: 10 iterations)
bash run.sh

File Layout and What Each File Does

Core System Files

Configuration Files

  • info.json - Main Configuration: Defines hyperparameters, metrics, and optimization goals
  • info_template.json - Template: Starting point for creating your configuration

Example Models

Working Directory

  • temp/ - Temporary Files: Contains hyperparameters.json, results.json, and analysis files during optimization

Integrating Your Own Model

The optimizer works with any model through a simple file-based interface. Your model script must:

Required Functions

Read Configuration:

import json
import os

# Get temp directory (default: "temp")  
TEMP_DIR = os.environ.get("TEMP_DIR", "temp")

# Read optimization target
with open(f"{TEMP_DIR}/info.json", "r") as f:
    config = json.load(f)
primary_metric = config["metrics"]["primary_metric"]

Read Hyperparameters:

# Load hyperparameters generated by optimizer
with open(f"{TEMP_DIR}/hyperparameters.json", "r") as f:
    hyperparams = json.load(f)

learning_rate = hyperparams["learning_rate"] 
batch_size = hyperparams["batch_size"]

Save Results:

# Save results in exact required format
results = {
    "metrics": {
        "val_accuracy": [0.85, 0.87, 0.89],  # Per-epoch values
        "train_accuracy": [0.92, 0.94, 0.95]  # Optional
    },
    "epochs": [1, 2, 3]
}

with open(f"{TEMP_DIR}/results.json", "w") as f:
    json.dump(results, f, indent=2)

Key Integration Points

Environment Variables:

  • TEMP_DIR - Directory for temporary files (default: "temp")
  • ITERATION - Current optimization iteration number
  • LLM_MODEL - LLM model name for the optimizer

Required File Paths:

  • ${TEMP_DIR}/info.json - Configuration (copied from main info.json)
  • ${TEMP_DIR}/hyperparameters.json - Generated hyperparameters to use
  • ${TEMP_DIR}/results.json - Training results you must save

Critical JSON Keys:

  • metrics.primary_metric in info.json - Must match your results key
  • metrics.{primary_metric} in results.json - Must contain per-epoch values
  • epochs in results.json - Must contain corresponding epoch numbers

Importing Hyperparameters Inside Model Scripts

Basic Pattern

import json
import os

def load_hyperparameters():
    """Load hyperparameters with safe defaults"""
    temp_dir = os.environ.get("TEMP_DIR", "temp")
    
    # Set safe defaults first
    defaults = {
        "learning_rate": 0.001,
        "batch_size": 32,
        "epochs": 20,
        "weight_decay": 0.0
    }
    
    # Try to load provided hyperparameters
    try:
        with open(f"{temp_dir}/hyperparameters.json", "r") as f:
            provided = json.load(f)
        # Merge with defaults
        return {**defaults, **provided}
    except (FileNotFoundError, json.JSONDecodeError):
        print("Using default hyperparameters")
        return defaults

# Use in your model
hyperparams = load_hyperparameters()
model = create_model(lr=hyperparams["learning_rate"])

Type-Safe Loading

def get_hyperparameter(hyperparams, key, default, param_type):
    """Safely extract and convert hyperparameter"""
    value = hyperparams.get(key, default)
    
    if param_type == "int":
        return int(float(value))  # Handle "64.0" -> 64
    elif param_type == "float": 
        return float(value)
    elif param_type == "bool":
        return str(value).lower() in ["true", "1", "yes"]
    else:
        return value

# Example usage
hyperparams = load_hyperparameters()
learning_rate = get_hyperparameter(hyperparams, "learning_rate", 0.001, "float")
batch_size = get_hyperparameter(hyperparams, "batch_size", 32, "int")
epochs = get_hyperparameter(hyperparams, "epochs", 20, "int")
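Type conversion does not guard against a suggested value that falls outside the search space. The sketch below shows one possible guard, assuming the `range`/`values`/`default` fields from info.json; `clamp_to_spec` is a hypothetical helper, and whether the optimizer already enforces these bounds is not stated in this README.

```python
def clamp_to_spec(value, spec):
    """Coerce a suggested value into the search space described by spec."""
    kind = spec["type"]
    if kind in ("float", "integer"):
        low, high = spec["range"]
        number = min(max(float(value), low), high)  # clip into [low, high]
        return int(round(number)) if kind == "integer" else number
    if kind in ("ordinal", "categorical"):
        # Fall back to the default when the suggestion is not an allowed value.
        return value if value in spec["values"] else spec["default"]
    return value


lr_spec = {"type": "float", "range": [1e-5, 0.1], "default": 0.001}
bs_spec = {"type": "ordinal", "values": [16, 32, 64, 128], "default": 32}
print(clamp_to_spec(0.5, lr_spec))  # prints: 0.1
print(clamp_to_spec(48, bs_spec))   # prints: 32
```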

Handling Unknown Parameters

def apply_hyperparameters(hyperparams, known_params):
    """Apply only known hyperparameters, ignore others"""
    applied = {}
    
    for key, default_value in known_params.items():
        if key in hyperparams:
            applied[key] = hyperparams[key]
            print(f"Using {key}: {applied[key]}")
        else:
            applied[key] = default_value
            print(f"Using default {key}: {applied[key]}")
    
    # Warn about unknown parameters
    unknown = set(hyperparams.keys()) - set(known_params.keys())
    if unknown:
        print(f"Ignoring unknown parameters: {unknown}")
    
    return applied

Exporting Training Trajectories from Model Scripts

Required Results Format

Your model must save results in this exact format:

import json
import os

# Collect metrics during training
val_accuracies = []
train_accuracies = []
epochs_list = []

for epoch in range(num_epochs):
    # Your training code here (train_one_epoch and validate_model are placeholders)
    train_acc = train_one_epoch()
    val_acc = validate_model()
    
    # Collect results
    val_accuracies.append(float(val_acc))
    train_accuracies.append(float(train_acc))
    epochs_list.append(epoch + 1)

# Save in required format
results = {
    "metrics": {
        "val_accuracy": val_accuracies,     # Primary metric MUST match info.json
        "train_accuracy": train_accuracies  # Optional additional metrics
    },
    "epochs": epochs_list  # Must match length of metric arrays
}

# Save to required location
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)

Schema Requirements

Critical Rules:

  • Primary metric name must exactly match metrics.primary_metric from info.json
  • All metric arrays must have same length as epochs array
  • Values must be JSON-serializable (use float() for numpy values)
  • File must be saved to ${TEMP_DIR}/results.json
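The rules above can be checked mechanically before the results file is written. This is a minimal sketch; `validate_results` is a hypothetical helper, not something the framework provides.

```python
def validate_results(results, primary_metric):
    """Return a list of problems in a results.json-style dict."""
    metrics = results.get("metrics")
    epochs = results.get("epochs")
    if not isinstance(metrics, dict):
        return ["'metrics' must be an object mapping names to lists"]
    if not isinstance(epochs, list) or not epochs:
        return ["'epochs' must be a non-empty top-level list"]

    problems = []
    if primary_metric not in metrics:
        problems.append(f"primary metric {primary_metric!r} is missing from 'metrics'")
    for name, values in metrics.items():
        if not isinstance(values, list):
            problems.append(f"{name}: expected a list of per-epoch values")
        elif len(values) != len(epochs):
            problems.append(f"{name}: {len(values)} values for {len(epochs)} epochs")
    return problems


results = {
    "metrics": {"val_accuracy": [0.85, 0.87, 0.89], "train_accuracy": [0.92, 0.94]},
    "epochs": [1, 2, 3],
}
print(validate_results(results, "val_accuracy"))
# prints: ['train_accuracy: 2 values for 3 epochs']
```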

Example for Different Metrics:

# For loss-based optimization
results = {
    "metrics": {
        "val_loss": [2.3, 1.8, 1.2],      # Primary metric (lower is better)
        "train_loss": [2.1, 1.5, 0.9]
    },
    "epochs": [1, 2, 3]
}

# For accuracy-based optimization  
results = {
    "metrics": {
        "val_accuracy": [0.6, 0.7, 0.85], # Primary metric (higher is better)
        "train_accuracy": [0.8, 0.9, 0.95]
    },
    "epochs": [1, 2, 3]
}

Optional: Print Final Metric for Humans

# Optional: Print final result for human readability
final_metric = results["metrics"]["val_accuracy"][-1]
print(f"Final val_accuracy: {final_metric:.6f}")

How to Run from Command Line and VS Code Integrated Terminal

Command Line Usage

Basic Run:

# Set your LLM model
export LLM_MODEL="qwen2.5-coder:32b"

# Run with defaults (10 iterations, temp/ directory)
bash run.sh

Custom Configuration:

# Custom settings
export LLM_MODEL="llama3.1:8b"
export MAX_ITERATIONS=20
export TEMP_DIR="my_experiment"

# Run optimization
bash run.sh

Single Components:

# Run individual steps manually
export TEMP_DIR="temp"
export ITERATION=1
export LLM_MODEL="qwen2.5-coder:32b"

# Step 1: Generate hyperparameters
python hyper_optimizer-latest.py

# Step 2: Train model  
python model.py

# Step 3: Analyze results
export PREVIOUS_HYPERPARAMETERS="$(cat "$TEMP_DIR/hyperparameters.json")"
python results_analyzer_latest.py

VS Code Integrated Terminal

Setup in VS Code:

  1. Open project in VS Code
  2. Open the integrated terminal (Ctrl+`)
  3. Ensure you're in project root directory
  4. Run commands as shown above

Recommended VS Code Settings:

# In VS Code terminal, set up your environment
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=10

# Run optimization
bash run.sh

Monitoring Progress:

  • Watch temp/ directory for generated files
  • Check temp/hyperparameters.json for current parameters
  • Monitor temp/results.json after each training run
  • View plots in temp/trajectory_plots.png
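For a quick look at the latest run without opening the JSON by hand, a short script can summarize results.json. This is a hand-rolled sketch for monitoring only; `summarize_results` is a hypothetical name, and the demo writes its own sample file to a temporary directory instead of reading a real temp/ directory.

```python
import json
import os
import tempfile


def summarize_results(temp_dir, primary_metric, higher_is_better=True):
    """Summarize the primary metric trajectory stored in results.json."""
    with open(os.path.join(temp_dir, "results.json")) as f:
        results = json.load(f)
    values = results["metrics"][primary_metric]
    epochs = results["epochs"]
    pick = max if higher_is_better else min
    best_index = pick(range(len(values)), key=values.__getitem__)
    return {
        "final": values[-1],
        "best": values[best_index],
        "best_epoch": epochs[best_index],
        "num_epochs": len(epochs),
    }


with tempfile.TemporaryDirectory() as demo_dir:
    sample = {"metrics": {"val_accuracy": [0.85, 0.89, 0.87]}, "epochs": [1, 2, 3]}
    with open(os.path.join(demo_dir, "results.json"), "w") as f:
        json.dump(sample, f)
    print(summarize_results(demo_dir, "val_accuracy"))
    # prints: {'final': 0.87, 'best': 0.89, 'best_epoch': 2, 'num_epochs': 3}
```

Pass `higher_is_better=False` for loss-style metrics so that the best epoch is the one with the lowest value.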

Example Usage with Short Commands

Quick Start Example

# 1. Clone and setup
git clone <repository>
cd llm_hyperopt

# 2. Setup LLM
ollama pull qwen2.5-coder:32b

# 3. Create simple model (copy from template)
cp model_template.py model.py
# Edit model.py with your training code

# 4. Create config (copy from template)  
cp info_template.json info.json
# Edit info.json with your hyperparameters

# 5. Run optimization
export LLM_MODEL="qwen2.5-coder:32b"
bash run.sh

Custom Experiments

# Deep learning experiment (more iterations)
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=20
export TEMP_DIR="dl_experiment"
bash run.sh

# Quick test (fewer iterations)
export MAX_ITERATIONS=3
export TEMP_DIR="quick_test"  
bash run.sh

# Different LLM model
export LLM_MODEL="llama3.1:8b"
export TEMP_DIR="llama_test"
bash run.sh

Results and Visualization

# After optimization completes
ls temp/                        # View generated files
cat temp/hyperparameters.json   # See final hyperparameters
python plot_trajectories.py     # Create optimization plots

Minimal Examples for Both Deep Learning and Classic ML Models

Deep Learning Example (PyTorch)

#!/usr/bin/env python3
import torch
import torch.nn as nn
import json
import os

# Load hyperparameters
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/hyperparameters.json", "r") as f:
    hp = json.load(f)

# Simple neural network
class Net(nn.Module):
    def __init__(self, input_size=784, hidden_size=128, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, num_classes)
        self.dropout = nn.Dropout(hp.get("dropout_rate", 0.1))
    
    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        return self.fc2(x)

# Training loop
model = Net(hidden_size=hp.get("hidden_size", 128))
optimizer = torch.optim.Adam(model.parameters(), lr=hp["learning_rate"])

val_accuracies = []
epochs_list = []

for epoch in range(hp.get("epochs", 10)):
    # Your training code here
    model.train()
    # ... training loop ...
    
    # Validation
    model.eval()
    val_acc = 0.85 + epoch * 0.01  # Placeholder - use real validation
    
    val_accuracies.append(float(val_acc))
    epochs_list.append(epoch + 1)

# Save results
results = {
    "metrics": {"val_accuracy": val_accuracies},
    "epochs": epochs_list
}

with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)

print(f"Final val_accuracy: {val_accuracies[-1]:.6f}")

Classic ML Example (Scikit-learn)

#!/usr/bin/env python3
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification
import json
import os
import numpy as np

# Load hyperparameters
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/hyperparameters.json", "r") as f:
    hp = json.load(f)

# Generate sample data (replace with your dataset)
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# Create model with hyperparameters
model = RandomForestClassifier(
    n_estimators=hp.get("n_estimators", 100),
    max_depth=hp.get("max_depth", 10),
    min_samples_split=hp.get("min_samples_split", 2),
    random_state=42
)

# Cross-validation to simulate training epochs
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')

# Record one entry per fold in place of per-epoch values (classic ML has no epochs)
val_accuracies = []
epochs_list = []

for i, score in enumerate(cv_scores):
    val_accuracies.append(float(score))
    epochs_list.append(i + 1)

# Save results
results = {
    "metrics": {"val_accuracy": val_accuracies},
    "epochs": epochs_list
}

with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)

print(f"Mean val_accuracy across folds: {np.mean(val_accuracies):.6f}")

Configuration and Search Space Basics

Hyperparameter Types

Define hyperparameters in info.json with these types:

Float (Continuous):

"learning_rate": {
  "type": "float",
  "range": [1e-6, 0.1],
  "default": 0.001,
  "log_scale": true
}

Integer (Discrete):

"epochs": {
  "type": "integer", 
  "range": [10, 200],
  "default": 50
}

Categorical (String Choices):

"optimizer": {
  "type": "categorical",
  "values": ["adam", "sgd", "rmsprop"],
  "default": "adam"
}

Ordinal (Ordered Discrete):

"batch_size": {
  "type": "ordinal",
  "values": [16, 32, 64, 128, 256],
  "default": 32
}
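One way to read these four type definitions is as a recipe for drawing a value. The sketch below is a plain random sampler, which is not how the framework proposes values (the LLM does that); it is only meant to illustrate what each type and the `log_scale` flag imply, and it can double as a way to smoke-test model.py without the LLM. `sample_value` is a hypothetical helper.

```python
import math
import random


def sample_value(spec, rng):
    """Draw one value from a hyperparameter spec (illustrative random search)."""
    kind = spec["type"]
    if kind == "float":
        low, high = spec["range"]
        if spec.get("log_scale"):
            # Sample the exponent uniformly so each decade is equally likely.
            return math.exp(rng.uniform(math.log(low), math.log(high)))
        return rng.uniform(low, high)
    if kind == "integer":
        low, high = spec["range"]
        return rng.randint(low, high)  # inclusive on both ends
    if kind in ("categorical", "ordinal"):
        return rng.choice(spec["values"])
    raise ValueError(f"unknown hyperparameter type: {kind!r}")


space = {
    "learning_rate": {"type": "float", "range": [1e-6, 0.1], "log_scale": True},
    "epochs": {"type": "integer", "range": [10, 200]},
    "optimizer": {"type": "categorical", "values": ["adam", "sgd", "rmsprop"]},
    "batch_size": {"type": "ordinal", "values": [16, 32, 64, 128, 256]},
}
rng = random.Random(0)  # fixed seed so the draw is repeatable
print({name: sample_value(spec, rng) for name, spec in space.items()})
```

With `log_scale` enabled, about as many samples land between 1e-6 and 1e-5 as between 0.01 and 0.1, which is usually what you want for learning rates.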

Search Space Design

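Good Practices (the # annotations are explanatory only; JSON does not allow comments, so remove them in a real info.json):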

{
  "hyperparameters": {
    "learning_rate": {
      "type": "float",
      "range": [1e-5, 0.1],        # Wide range for exploration
      "default": 0.001,
      "log_scale": true             # Use log scale for learning rates
    },
    "batch_size": {
      "type": "ordinal", 
      "values": [16, 32, 64, 128],  # Powers of 2 for efficiency
      "default": 32
    },
    "dropout_rate": {
      "type": "float",
      "range": [0.0, 0.5],          # Reasonable dropout range
      "default": 0.1
    }
  }
}

Avoid:

  • Ranges that are too narrow (limits exploration)
  • Too many hyperparameters at once (>6-8 can be overwhelming)
  • Unrealistic ranges (e.g., learning_rate up to 10.0)

Metrics and Objectives

Supported Primary Metrics

The system automatically recognizes these metrics and optimizes appropriately:

Classification Metrics (Higher is Better):

  • val_accuracy, accuracy - Classification accuracy
  • f1_score, precision, recall - Classification quality metrics
  • auc_roc, auc - Area under curve metrics

Loss Metrics (Lower is Better):

  • val_loss, loss - Training/validation loss
  • mse, rmse, mae - Regression error metrics
  • cross_entropy, binary_crossentropy - Cross-entropy losses

Advanced Metrics:

  • bleu, rouge (NLP) - Text generation quality
  • iou, miou (Computer Vision) - Segmentation quality
  • ndcg, map (Information Retrieval) - Ranking quality

Metric Configuration

{
  "metrics": {
    "primary_metric": "val_accuracy",
    "description": "Validation accuracy on held-out validation set"
  }
}

The optimizer will:

  • Maximize metrics like accuracy, f1_score, auc
  • Minimize metrics like loss, mse, mae
  • Set appropriate target values automatically

Custom Metrics

For custom metrics, the system assumes higher is better. To use a custom loss-style metric:

{
  "metrics": {
    "primary_metric": "custom_loss", 
    "description": "My custom loss function"
  }
}

Then ensure your metric name contains "loss", "error", or similar keywords for automatic detection.
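The keyword-based detection described above might look roughly like the sketch below. The keyword list and the substring matching are assumptions for illustration; the optimizer's actual rules may differ, so check its source before relying on a particular name.

```python
# Assumed keyword list, based on the lower-is-better metrics named in this README.
LOWER_IS_BETTER_KEYWORDS = (
    "loss", "error", "mse", "rmse", "mae", "crossentropy", "cross_entropy",
)


def infer_direction(metric_name):
    """Guess whether a metric should be minimized or maximized from its name."""
    name = metric_name.lower()
    if any(keyword in name for keyword in LOWER_IS_BETTER_KEYWORDS):
        return "minimize"
    return "maximize"  # default assumption: higher is better


for metric in ("val_accuracy", "custom_loss", "binary_crossentropy", "ndcg"):
    print(metric, "->", infer_direction(metric))
```

Under this scheme a custom metric named `custom_loss` is minimized, while a name such as `my_score` falls through to the higher-is-better default.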

Reproducibility and Seeding

Seed Pattern

Use this pattern in your model scripts for reproducible results:

import random
import numpy as np

def set_seed(seed=42):
    """Set seeds for reproducibility"""
    random.seed(seed)
    np.random.seed(seed)
    
    # PyTorch (if using)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
    
    # TensorFlow (if using) 
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass

# Call at the start of your script
set_seed(42)

Environment Variables for Reproducibility

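# Fix Python's hash randomization so hash-dependent ordering is stable across runs
export PYTHONHASHSEED=42

# Make CUDA kernel launches synchronous (helps debugging; does not by itself make results deterministic)
export CUDA_LAUNCH_BLOCKING=1

# For PyTorch on CUDA: needed by cuBLAS when torch.use_deterministic_algorithms(True) is enabled
export CUBLAS_WORKSPACE_CONFIG=:4096:8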

Hyperparameter Seeding

Include seed as a hyperparameter for full reproducibility:

{
  "hyperparameters": {
    "seed": {
      "type": "integer",
      "range": [0, 2147483647], 
      "default": 42
    }
  }
}

Then use in your model:

seed = hyperparams.get("seed", 42)
set_seed(seed)

Troubleshooting and Common Mistakes

File Path Issues

Problem: FileNotFoundError: hyperparameters.json not found

Solution:

# Always check if file exists and use defaults
import json
import os
temp_dir = os.environ.get("TEMP_DIR", "temp")
hp_path = os.path.join(temp_dir, "hyperparameters.json")

if os.path.exists(hp_path):
    with open(hp_path, "r") as f:
        hyperparams = json.load(f)
else:
    print("Using default hyperparameters") 
    hyperparams = {"learning_rate": 0.001}  # Your defaults

Results Format Errors

Problem: KeyError: 'val_accuracy' in results.json

Solution: Ensure metric names match exactly:

# In info.json
"primary_metric": "val_accuracy"

# In your model's results.json - MUST match exactly
results = {
    "metrics": {
        "val_accuracy": [0.85, 0.87, 0.89]  # Exact match required
    },
    "epochs": [1, 2, 3]  # Top-level key, not inside "metrics"
}

Empty Results

Problem: Optimizer says "No trajectory data found"

Solution: Check your results format:

# Wrong - missing required structure
results = {"accuracy": 0.85}

# Right - proper structure
results = {
    "metrics": {"val_accuracy": [0.85]},  # Must be arrays
    "epochs": [1]
}

LLM Connection Issues

Problem: Error in API call: connection refused

Solution:

# Start Ollama service
ollama serve

# In another terminal, verify model is available
ollama list
ollama pull qwen2.5-coder:32b  # If not present

# Check environment variable
echo $LLM_MODEL
export LLM_MODEL="qwen2.5-coder:32b"

Hyperparameter Type Errors

Problem: TypeError: 'str' object cannot be interpreted as an integer

Solution: Always convert types:

# Wrong - direct use
batch_size = hyperparams["batch_size"]  # Might be string "32"

# Right - type conversion
batch_size = int(float(hyperparams["batch_size"]))  # Handles "32.0" -> 32
learning_rate = float(hyperparams["learning_rate"])

Memory Issues

Problem: CUDA out of memory during optimization

Solution:

# Reduce allocator fragmentation and restrict the run to a single GPU
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
export CUDA_VISIBLE_DEVICES=0

# Or in your model script
import torch
torch.cuda.empty_cache()  # Clear memory between runs

Permission Errors

Problem: PermissionError: cannot write to temp/

Solution:

# Create temp directory with proper permissions
mkdir -p temp
chmod 755 temp

# Or use a different directory
export TEMP_DIR="my_temp"
mkdir -p my_temp

Common Integration Checklist

Before running optimization, verify:

  • info.json exists with correct primary_metric
  • model.py reads from ${TEMP_DIR}/hyperparameters.json
  • model.py saves to ${TEMP_DIR}/results.json
  • Results format: {"metrics": {...}, "epochs": [...]}
  • Metric names match exactly between info.json and results.json
  • Ollama is running and the model has been pulled (ollama list)
  • Environment variables set (LLM_MODEL, etc.)
  • All required Python packages installed
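Part of this checklist can be automated. The sketch below covers only the items that can be checked from the filesystem and the environment; `preflight` is a hypothetical helper, and it does not verify that Ollama is reachable or that model.py writes a valid results file.

```python
import json
import os


def preflight(project_dir=".", env=os.environ):
    """Return a list of checklist items that fail for this project directory."""
    problems = []
    info_path = os.path.join(project_dir, "info.json")
    if not os.path.exists(info_path):
        problems.append("info.json not found")
    else:
        try:
            with open(info_path) as f:
                config = json.load(f)
        except json.JSONDecodeError as exc:
            problems.append(f"info.json is not valid JSON: {exc}")
        else:
            if not config.get("metrics", {}).get("primary_metric"):
                problems.append("info.json has no metrics.primary_metric")
    if not os.path.exists(os.path.join(project_dir, "model.py")):
        problems.append("model.py not found")
    if not env.get("LLM_MODEL"):
        problems.append("LLM_MODEL is not set")
    return problems


for problem in preflight():
    print("FAIL:", problem)
```

An empty result means these static checks passed; the one-iteration run is still the real test of the integration.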

Quick Test:

# Test your integration
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=1
bash run.sh

This should complete one full cycle without errors. If successful, increase MAX_ITERATIONS for full optimization.

About

LLM_TCS_HPT: A lightweight framework for hyperparameter tuning with small LLMs. It introduces the Trajectory Context Summarizer (TCS), a deterministic module that converts raw training trajectories into structured summaries, enabling compact LLMs to rival GPT-3.5–scale models in hyperparameter optimization tasks.
