A comprehensive, beginner-friendly framework for optimizing machine learning model hyperparameters using Large Language Models. This system automatically generates, tests, and analyzes hyperparameter configurations to improve your model's performance.
This framework uses a three-component pipeline to intelligently optimize hyperparameters:
Hyperparameter Generator → Model Training Script → Results Analyzer
The system works by having an LLM analyze your model's training results and suggest better hyperparameter values for the next iteration. It supports both deep learning (PyTorch, TensorFlow) and classic ML (scikit-learn) models.
- Start with your model - Create a training script that follows our simple interface
- Configure search space - Define which hyperparameters to optimize and their ranges
- Run optimization - The system automatically generates hyperparameters, trains your model, and analyzes results
- Get better performance - The LLM learns from each iteration to suggest improvements
The optimizer communicates with your model through simple JSON files, making it framework-agnostic and easy to integrate.
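The file-based handshake described above can be sketched as a round trip through a temporary directory. This is a minimal illustration, not the framework's own code: the helper names (`write_hyperparameters`, `run_training`, `read_results`, `demo_round_trip`) are hypothetical, and the "training" step is a placeholder that fabricates a metric curve.

```python
import json
import os
import tempfile


def write_hyperparameters(temp_dir, hyperparams):
    """Optimizer side: publish the next configuration for the training script."""
    with open(os.path.join(temp_dir, "hyperparameters.json"), "w") as f:
        json.dump(hyperparams, f, indent=2)


def run_training(temp_dir):
    """Model side: read the configuration, 'train', and report per-epoch metrics."""
    with open(os.path.join(temp_dir, "hyperparameters.json"), "r") as f:
        hp = json.load(f)
    num_epochs = hp["epochs"]
    # Placeholder curve: a real script would train and validate a model here.
    val_accuracy = [0.5 + 0.1 * epoch for epoch in range(num_epochs)]
    results = {
        "metrics": {"val_accuracy": val_accuracy},
        "epochs": list(range(1, num_epochs + 1)),
    }
    with open(os.path.join(temp_dir, "results.json"), "w") as f:
        json.dump(results, f, indent=2)


def read_results(temp_dir):
    """Optimizer/analyzer side: load what the training script reported."""
    with open(os.path.join(temp_dir, "results.json"), "r") as f:
        return json.load(f)


def demo_round_trip():
    """Run one generate -> train -> read cycle in a throwaway directory."""
    with tempfile.TemporaryDirectory() as temp_dir:
        write_hyperparameters(temp_dir, {"learning_rate": 0.001, "epochs": 3})
        run_training(temp_dir)
        return read_results(temp_dir)
```

Because the two sides share only JSON files, the training script can use any framework as long as it honors the file names and keys.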
# Python 3.7+ required
python --version
# Install core dependencies
pip install ollama numpy matplotlib
# For deep learning examples (optional)
pip install torch transformers scikit-learn pandas
# For classic ML examples (optional)
pip install scikit-learn pandas numpy

# Install and run Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a recommended model (choose one)
ollama pull qwen2.5-coder:32b # Best performance
ollama pull llama3.1:8b # Balanced option
ollama pull qwen2.5-coder:7b # Lightweight option

Copy info_template.json to info.json and customize:
{
"model_info": "Your model description",
"optimization_goal": "Maximize validation accuracy",
"metrics": {
"primary_metric": "val_accuracy",
"description": "Validation accuracy on held-out data"
},
"hyperparameters": {
"learning_rate": {
"type": "float",
"range": [1e-5, 0.1],
"default": 0.001
},
"batch_size": {
"type": "ordinal",
"values": [16, 32, 64, 128],
"default": 32
}
}
}

Copy model_template.py to model.py and replace the template sections with your training code.
# Set your LLM model
export LLM_MODEL="qwen2.5-coder:32b"
# Run optimization (default: 10 iterations)
bash run.sh

- hyper_optimizer-latest.py - Hyperparameter Generator: LLM-powered optimizer that analyzes results and suggests new hyperparameters
- results_analyzer_latest.py - Results Analyzer: Analyzes training trajectories and provides insights for the optimizer
- run.sh - Main Pipeline: Orchestrates the optimization loop
- plot_trajectories.py - Visualization: Creates plots of optimization progress
- enhanced_plotter.py - Advanced Visualization: Detailed analysis plots
- info.json - Main Configuration: Defines hyperparameters, metrics, and optimization goals
- info_template.json - Template: Starting point for creating your configuration
- model_template.py - Template: Framework-agnostic template for any ML model
- model_example.py - Deep Learning Example: DistilBERT text classification
- model_exapmle2.py - Additional Example: Alternative implementation
- temp/ - Temporary Files: Contains hyperparameters.json, results.json, and analysis files during optimization
The optimizer works with any model through a simple file-based interface. Your model script must:
Read Configuration:
import json
import os
# Get temp directory (default: "temp")
TEMP_DIR = os.environ.get("TEMP_DIR", "temp")
# Read optimization target
with open(f"{TEMP_DIR}/info.json", "r") as f:
    config = json.load(f)
primary_metric = config["metrics"]["primary_metric"]

Read Hyperparameters:
# Load hyperparameters generated by optimizer
with open(f"{TEMP_DIR}/hyperparameters.json", "r") as f:
    hyperparams = json.load(f)
learning_rate = hyperparams["learning_rate"]
batch_size = hyperparams["batch_size"]

Save Results:
# Save results in exact required format
results = {
"metrics": {
"val_accuracy": [0.85, 0.87, 0.89], # Per-epoch values
"train_accuracy": [0.92, 0.94, 0.95] # Optional
},
"epochs": [1, 2, 3]
}
with open(f"{TEMP_DIR}/results.json", "w") as f:
    json.dump(results, f, indent=2)

Environment Variables:
- TEMP_DIR - Directory for temporary files (default: "temp")
- ITERATION - Current optimization iteration number
- LLM_MODEL - LLM model name for the optimizer

Required File Paths:
- ${TEMP_DIR}/info.json - Configuration (copied from main info.json)
- ${TEMP_DIR}/hyperparameters.json - Generated hyperparameters to use
- ${TEMP_DIR}/results.json - Training results you must save

Critical JSON Keys:
- metrics.primary_metric in info.json - Must match your results key
- metrics.{primary_metric} in results.json - Must contain per-epoch values
- epochs in results.json - Must contain corresponding epoch numbers
import json
import os
def load_hyperparameters():
    """Load hyperparameters with safe defaults"""
    temp_dir = os.environ.get("TEMP_DIR", "temp")
    # Set safe defaults first
    defaults = {
        "learning_rate": 0.001,
        "batch_size": 32,
        "epochs": 20,
        "weight_decay": 0.0
    }
    # Try to load provided hyperparameters
    try:
        with open(f"{temp_dir}/hyperparameters.json", "r") as f:
            provided = json.load(f)
        # Merge with defaults
        return {**defaults, **provided}
    except (FileNotFoundError, json.JSONDecodeError):
        print("Using default hyperparameters")
        return defaults

# Use in your model
hyperparams = load_hyperparameters()
model = create_model(lr=hyperparams["learning_rate"])

def get_hyperparameter(hyperparams, key, default, param_type):
    """Safely extract and convert hyperparameter"""
    value = hyperparams.get(key, default)
    if param_type == "int":
        return int(float(value))  # Handle "64.0" -> 64
    elif param_type == "float":
        return float(value)
    elif param_type == "bool":
        return str(value).lower() in ["true", "1", "yes"]
    else:
        return value

# Example usage
hyperparams = load_hyperparameters()
learning_rate = get_hyperparameter(hyperparams, "learning_rate", 0.001, "float")
batch_size = get_hyperparameter(hyperparams, "batch_size", 32, "int")
epochs = get_hyperparameter(hyperparams, "epochs", 20, "int")

def apply_hyperparameters(hyperparams, known_params):
    """Apply only known hyperparameters, ignore others"""
    applied = {}
    for key, default_value in known_params.items():
        if key in hyperparams:
            applied[key] = hyperparams[key]
            print(f"Using {key}: {applied[key]}")
        else:
            applied[key] = default_value
            print(f"Using default {key}: {applied[key]}")
    # Warn about unknown parameters
    unknown = set(hyperparams.keys()) - set(known_params.keys())
    if unknown:
        print(f"Ignoring unknown parameters: {unknown}")
    return applied

Your model must save results in this exact format:
# Collect metrics during training
val_accuracies = []
train_accuracies = []
epochs_list = []
for epoch in range(num_epochs):
    # Your training code here...
    train_acc = train_one_epoch()
    val_acc = validate_model()
    # Collect results
    val_accuracies.append(float(val_acc))
    train_accuracies.append(float(train_acc))
    epochs_list.append(epoch + 1)
# Save in required format
results = {
"metrics": {
"val_accuracy": val_accuracies, # Primary metric MUST match info.json
"train_accuracy": train_accuracies # Optional additional metrics
},
"epochs": epochs_list # Must match length of metric arrays
}
# Save to required location
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)

Critical Rules:
- Primary metric name must exactly match metrics.primary_metric from info.json
- All metric arrays must have the same length as the epochs array
- Values must be JSON-serializable (use float() for numpy values)
- File must be saved to ${TEMP_DIR}/results.json
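The rules above can be checked before the file is written. The following is a small sketch of such a check; `validate_results` is a hypothetical helper and is not part of the framework, so the optimizer's own validation may be stricter or looser.

```python
import math


def validate_results(results, primary_metric):
    """Return a list of problems in a results dict; an empty list means OK."""
    metrics = results.get("metrics")
    epochs = results.get("epochs")
    if not isinstance(metrics, dict):
        return ["'metrics' must be a dict mapping metric names to lists"]
    if not isinstance(epochs, list):
        return ["'epochs' must be a top-level list of epoch numbers"]

    problems = []
    if primary_metric not in metrics:
        problems.append(f"primary metric '{primary_metric}' is missing from 'metrics'")
    for name, values in metrics.items():
        if not isinstance(values, list):
            problems.append(f"metric '{name}' must be a list of per-epoch values")
            continue
        if len(values) != len(epochs):
            problems.append(
                f"metric '{name}' has {len(values)} values but there are {len(epochs)} epochs"
            )
        for value in values:
            # Plain Python numbers only; convert numpy scalars with float() first.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not math.isfinite(value):
                problems.append(f"metric '{name}' contains a non-finite or non-numeric value")
                break
    return problems
```

Calling it just before `json.dump` and aborting on a non-empty list tends to surface format mistakes earlier than waiting for the analyzer to reject the file.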
Example for Different Metrics:
# For loss-based optimization
results = {
"metrics": {
"val_loss": [2.3, 1.8, 1.2], # Primary metric (lower is better)
"train_loss": [2.1, 1.5, 0.9]
},
"epochs": [1, 2, 3]
}
# For accuracy-based optimization
results = {
"metrics": {
"val_accuracy": [0.6, 0.7, 0.85], # Primary metric (higher is better)
"train_accuracy": [0.8, 0.9, 0.95]
},
"epochs": [1, 2, 3]
}

# Optional: Print final result for human readability
final_metric = results["metrics"]["val_accuracy"][-1]
print(f"Final val_accuracy: {final_metric:.6f}")

Basic Run:
# Set your LLM model
export LLM_MODEL="qwen2.5-coder:32b"
# Run with defaults (10 iterations, temp/ directory)
bash run.sh

Custom Configuration:
# Custom settings
export LLM_MODEL="llama3.1:8b"
export MAX_ITERATIONS=20
export TEMP_DIR="my_experiment"
# Run optimization
bash run.sh

Single Components:
# Run individual steps manually
export TEMP_DIR="temp"
export ITERATION=1
export LLM_MODEL="qwen2.5-coder:32b"
# Step 1: Generate hyperparameters
python hyper_optimizer-latest.py
# Step 2: Train model
python model.py
# Step 3: Analyze results
export PREVIOUS_HYPERPARAMETERS="$(cat "${TEMP_DIR}/hyperparameters.json")"
python results_analyzer_latest.py

Setup in VS Code:
- Open project in VS Code
- Open integrated terminal (Ctrl+`)
- Ensure you're in project root directory
- Run commands as shown above
Recommended VS Code Settings:
# In VS Code terminal, set up your environment
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=10
# Run optimization
bash run.sh

Monitoring Progress:
- Watch the temp/ directory for generated files
- Check temp/hyperparameters.json for current parameters
- Monitor temp/results.json after each training run
- View plots in temp/trajectory_plots.png
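For a quick look at the latest run without opening the files by hand, a short helper can summarize results.json. This is a sketch under the file format documented above; `summarize_trajectory` and `summarize_file` are hypothetical names, not scripts shipped with the project.

```python
import json
import os


def summarize_trajectory(results, metric, higher_is_better=True):
    """Summarize one training run from a results.json-style dict."""
    values = results["metrics"][metric]
    epochs = results["epochs"]
    pick = max if higher_is_better else min
    # Index of the best epoch according to the optimization direction.
    best_index = pick(range(len(values)), key=lambda i: values[i])
    return {
        "metric": metric,
        "final": values[-1],
        "best": values[best_index],
        "best_epoch": epochs[best_index],
    }


def summarize_file(temp_dir="temp", metric="val_accuracy", higher_is_better=True):
    """Load <temp_dir>/results.json and summarize the given metric."""
    with open(os.path.join(temp_dir, "results.json"), "r") as f:
        return summarize_trajectory(json.load(f), metric, higher_is_better)
```

For example, `print(summarize_file("temp", "val_loss", higher_is_better=False))` would report the lowest validation loss and the epoch where it occurred.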
# 1. Clone and setup
git clone <repository>
cd llm_hyperopt
# 2. Setup LLM
ollama pull qwen2.5-coder:32b
# 3. Create simple model (copy from template)
cp model_template.py model.py
# Edit model.py with your training code
# 4. Create config (copy from template)
cp info_template.json info.json
# Edit info.json with your hyperparameters
# 5. Run optimization
export LLM_MODEL="qwen2.5-coder:32b"
bash run.sh

# Deep learning experiment (more iterations)
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=20
export TEMP_DIR="dl_experiment"
bash run.sh
# Quick test (fewer iterations)
export MAX_ITERATIONS=3
export TEMP_DIR="quick_test"
bash run.sh
# Different LLM model
export LLM_MODEL="llama3.1:8b"
export TEMP_DIR="llama_test"
bash run.sh

# After optimization completes
ls temp/ # View generated files
cat temp/hyperparameters.json # See final hyperparameters
python plot_trajectories.py # Create optimization plots

#!/usr/bin/env python3
import torch
import torch.nn as nn
import json
import os

# Load hyperparameters
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/hyperparameters.json", "r") as f:
    hp = json.load(f)

# Simple neural network
class Net(nn.Module):
    def __init__(self, input_size=784, hidden_size=128, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, num_classes)
        self.dropout = nn.Dropout(hp.get("dropout_rate", 0.1))

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        return self.fc2(x)

# Training loop
model = Net(hidden_size=hp.get("hidden_size", 128))
optimizer = torch.optim.Adam(model.parameters(), lr=hp["learning_rate"])
val_accuracies = []
epochs_list = []
for epoch in range(hp.get("epochs", 10)):
    # Your training code here
    model.train()
    # ... training loop ...
    # Validation
    model.eval()
    val_acc = 0.85 + epoch * 0.01  # Placeholder - use real validation
    val_accuracies.append(float(val_acc))
    epochs_list.append(epoch + 1)

# Save results
results = {
    "metrics": {"val_accuracy": val_accuracies},
    "epochs": epochs_list
}
with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)
print(f"Final val_accuracy: {val_accuracies[-1]:.6f}")

#!/usr/bin/env python3
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification
import json
import os
import numpy as np
# Load hyperparameters
temp_dir = os.environ.get("TEMP_DIR", "temp")
with open(f"{temp_dir}/hyperparameters.json", "r") as f:
    hp = json.load(f)
# Generate sample data (replace with your dataset)
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
# Create model with hyperparameters
model = RandomForestClassifier(
n_estimators=hp.get("n_estimators", 100),
max_depth=hp.get("max_depth", 10),
min_samples_split=hp.get("min_samples_split", 2),
random_state=42
)
# Cross-validation to simulate training epochs
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
# Simulate per-epoch improvement (for classic ML, this might be per-fold)
val_accuracies = []
epochs_list = []
for i, score in enumerate(cv_scores):
    val_accuracies.append(float(score))
    epochs_list.append(i + 1)
# Save results
results = {
"metrics": {"val_accuracy": val_accuracies},
"epochs": epochs_list
}
with open(f"{temp_dir}/results.json", "w") as f:
    json.dump(results, f, indent=2)
print(f"Final val_accuracy: {np.mean(val_accuracies):.6f}")

Define hyperparameters in info.json with these types:
Float (Continuous):
"learning_rate": {
  "type": "float",
  "range": [1e-6, 0.1],
  "default": 0.001,
  "log_scale": true
}

Integer (Discrete):
"epochs": {
  "type": "integer",
  "range": [10, 200],
  "default": 50
}

Categorical (String Choices):
"optimizer": {
  "type": "categorical",
  "values": ["adam", "sgd", "rmsprop"],
  "default": "adam"
}

Ordinal (Ordered Discrete):
"batch_size": {
  "type": "ordinal",
  "values": [16, 32, 64, 128, 256],
  "default": 32
}

Good Practices:
{
  "hyperparameters": {
    "learning_rate": {
      "type": "float",
      "range": [1e-5, 0.1],
      "default": 0.001,
      "log_scale": true
    },
    "batch_size": {
      "type": "ordinal",
      "values": [16, 32, 64, 128],
      "default": 32
    },
    "dropout_rate": {
      "type": "float",
      "range": [0.0, 0.5],
      "default": 0.1
    }
  }
}

Why these choices work (JSON does not allow comments, so keep notes like these out of info.json):
- learning_rate: wide range for exploration, searched on a log scale
- batch_size: powers of 2 for efficiency
- dropout_rate: reasonable dropout range

Avoid:
- Ranges that are too narrow (limits exploration)
- Too many hyperparameters at once (>6-8 can be overwhelming)
- Unrealistic ranges (e.g., learning_rate up to 10.0)
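Since an LLM can occasionally suggest a value outside the declared search space, a training script may want to coerce suggestions back into range before using them. The sketch below assumes the four hyperparameter types documented above; `coerce_to_search_space` is a hypothetical helper, and falling back to the default for unknown categorical or ordinal values is an illustrative choice, not the framework's documented behavior.

```python
def coerce_to_search_space(value, spec):
    """Coerce a suggested value into the space described by an info.json entry."""
    kind = spec["type"]
    if kind == "float":
        low, high = spec["range"]
        # Clamp into [low, high]; also accepts numeric strings such as "0.01".
        return float(min(max(float(value), low), high))
    if kind == "integer":
        low, high = spec["range"]
        # Handles inputs like "64.0" by going through float first.
        return int(min(max(int(round(float(value))), low), high))
    if kind in ("categorical", "ordinal"):
        # Unknown choices fall back to the declared default.
        return value if value in spec["values"] else spec["default"]
    raise ValueError(f"Unsupported hyperparameter type: {kind}")


# Example specs, copied from the type definitions in this section.
LEARNING_RATE = {"type": "float", "range": [1e-6, 0.1], "default": 0.001}
EPOCHS = {"type": "integer", "range": [10, 200], "default": 50}
OPTIMIZER = {"type": "categorical", "values": ["adam", "sgd", "rmsprop"], "default": "adam"}
BATCH_SIZE = {"type": "ordinal", "values": [16, 32, 64, 128, 256], "default": 32}
```

With these specs, a suggested learning rate of 10.0 is clamped to 0.1 and an unsupported optimizer name falls back to "adam".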
The system automatically recognizes these metrics and optimizes appropriately:
Classification Metrics (Higher is Better):
- val_accuracy, accuracy - Classification accuracy
- f1_score, precision, recall - Classification quality metrics
- auc_roc, auc - Area under curve metrics

Loss Metrics (Lower is Better):
- val_loss, loss - Training/validation loss
- mse, rmse, mae - Regression error metrics
- cross_entropy, binary_crossentropy - Cross-entropy losses

Advanced Metrics:
- bleu, rouge (NLP) - Text generation quality
- iou, miou (Computer Vision) - Segmentation quality
- ndcg, map (Information Retrieval) - Ranking quality
{
"metrics": {
"primary_metric": "val_accuracy",
"description": "Validation accuracy on held-out test set"
}
}

The optimizer will:
- Maximize metrics like accuracy, f1_score, auc
- Minimize metrics like loss, mse, mae
- Set appropriate target values automatically
For custom metrics, the system assumes higher is better. To use a custom loss-style metric:
{
"metrics": {
"primary_metric": "custom_loss",
"description": "My custom loss function"
}
}

Make sure the metric name contains "loss", "error", or a similar keyword so that it is detected automatically as lower-is-better; any other custom name is treated as higher-is-better.
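The keyword-based detection can be pictured roughly as follows. This is only a guess at the shape of the logic: `infer_direction` and the keyword list are assumptions assembled from the metric names listed in this section, and the optimizer's actual rules may differ.

```python
# Assumed keyword list, built from the lower-is-better metrics named above.
LOWER_IS_BETTER_KEYWORDS = ("loss", "error", "mse", "mae", "entropy")


def infer_direction(metric_name):
    """Guess whether a metric should be minimized or maximized from its name."""
    name = metric_name.lower()
    if any(keyword in name for keyword in LOWER_IS_BETTER_KEYWORDS):
        return "minimize"
    # Anything unrecognized is treated as higher-is-better.
    return "maximize"
```

Under this sketch, `custom_loss` would be minimized, while a name such as `custom_score` would be maximized.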
Use this pattern in your model scripts for reproducible results:
import random
import numpy as np
def set_seed(seed=42):
    """Set seeds for reproducibility"""
    random.seed(seed)
    np.random.seed(seed)
    # PyTorch (if using)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
    # TensorFlow (if using)
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass

# Call at the start of your script
set_seed(42)

# Set consistent seed across runs
export PYTHONHASHSEED=42
export CUDA_LAUNCH_BLOCKING=1
# For PyTorch
export PYTORCH_DETERMINISTIC=1

Include seed as a hyperparameter for full reproducibility:
{
"hyperparameters": {
"seed": {
"type": "integer",
"range": [0, 2147483647],
"default": 42
}
}
}

Then use in your model:
seed = hyperparams.get("seed", 42)
set_seed(seed)

Problem: FileNotFoundError: hyperparameters.json not found
Solution:
# Always check if file exists and use defaults
import json
import os

temp_dir = os.environ.get("TEMP_DIR", "temp")
hp_path = os.path.join(temp_dir, "hyperparameters.json")
if os.path.exists(hp_path):
    with open(hp_path, "r") as f:
        hyperparams = json.load(f)
else:
    print("Using default hyperparameters")
    hyperparams = {"learning_rate": 0.001}  # Your defaults

Problem: KeyError: 'val_accuracy' in results.json
Solution: Ensure metric names match exactly:
# In info.json
"primary_metric": "val_accuracy"

# In your model's results.json - MUST match exactly
results = {
    "metrics": {
        "val_accuracy": [0.85, 0.87, 0.89]  # Exact match required
    },
    "epochs": [1, 2, 3]  # Top-level key, not nested under "metrics"
}

Problem: Optimizer says "No trajectory data found"
Solution: Check your results format:
# Wrong - missing required structure
results = {"accuracy": 0.85}
# Right - proper structure
results = {
"metrics": {"val_accuracy": [0.85]}, # Must be arrays
"epochs": [1]
}

Problem: Error in API call: connection refused
Solution:
# Start Ollama service
ollama serve
# In another terminal, verify model is available
ollama list
ollama pull qwen2.5-coder:32b # If not present
# Check environment variable
echo $LLM_MODEL
export LLM_MODEL="qwen2.5-coder:32b"

Problem: TypeError: 'str' object cannot be interpreted as an integer
Solution: Always convert types:
# Wrong - direct use
batch_size = hyperparams["batch_size"] # Might be string "32"
# Right - type conversion
batch_size = int(float(hyperparams["batch_size"])) # Handles "32.0" -> 32
learning_rate = float(hyperparams["learning_rate"])

Problem: CUDA out of memory during optimization
Solution:
# Reduce allocator fragmentation and restrict the run to a single GPU
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
export CUDA_VISIBLE_DEVICES=0

# Or in your model script
import torch
torch.cuda.empty_cache()  # Release cached, unused GPU memory

Problem: PermissionError: cannot write to temp/
Solution:
# Create temp directory with proper permissions
mkdir -p temp
chmod 755 temp
# Or use a different directory
export TEMP_DIR="my_temp"
mkdir -p my_temp

Before running optimization, verify:
- info.json exists with correct primary_metric
- model.py reads from ${TEMP_DIR}/hyperparameters.json
- model.py saves to ${TEMP_DIR}/results.json
- Results format: {"metrics": {...}, "epochs": [...]}
- Metric names match exactly between info.json and results.json
- LLM model is running (ollama list)
- Environment variables set (LLM_MODEL, etc.)
- All required Python packages installed
Quick Test:
# Test your integration
export LLM_MODEL="qwen2.5-coder:32b"
export MAX_ITERATIONS=1
bash run.sh

This should complete one full cycle without errors. If successful, increase MAX_ITERATIONS for full optimization.
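Part of the checklist above can be automated before launching a run. The following sketch covers only the file and environment checks; `preflight_check` is a hypothetical helper, not a script included in the project, and it does not verify that Ollama is reachable or that Python packages are installed.

```python
import json
import os


def preflight_check(project_dir=".", environ=None):
    """Return a list of human-readable problems; an empty list means ready."""
    environ = os.environ if environ is None else environ
    problems = []

    info_path = os.path.join(project_dir, "info.json")
    if not os.path.exists(info_path):
        problems.append("info.json not found")
    else:
        try:
            with open(info_path, "r") as f:
                info = json.load(f)
        except json.JSONDecodeError as exc:
            problems.append(f"info.json is not valid JSON: {exc}")
        else:
            info = info if isinstance(info, dict) else {}
            metrics = info.get("metrics")
            if not isinstance(metrics, dict) or not metrics.get("primary_metric"):
                problems.append("info.json is missing metrics.primary_metric")
            if not info.get("hyperparameters"):
                problems.append("info.json defines no hyperparameters")

    if not os.path.exists(os.path.join(project_dir, "model.py")):
        problems.append("model.py not found")

    if not environ.get("LLM_MODEL"):
        problems.append("LLM_MODEL is not set")

    return problems


if __name__ == "__main__":
    # Print each problem on its own line, or confirm that the basics are in place.
    issues = preflight_check()
    print("\n".join(issues) if issues else "Basic checks passed")
```

Running it from the project root before `bash run.sh` catches the most common setup mistakes without spending an LLM call or a training run.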