diff --git a/.gitignore b/.gitignore
index fb40c4a..a694705 100644
--- a/.gitignore
+++ b/.gitignore
@@ -14,6 +14,16 @@ __pycache__/
 *.pdf
 *.pt
 *.mp4
+results/
+
+# Data files
+*.csv
+*.jsonl
+*.json
+*.dat
+*.pkl
+*.hdf5
+*.h5
 *.json
 
 # Sketch
diff --git a/README.md b/README.md
index 6c5363f..8913fab 100644
--- a/README.md
+++ b/README.md
@@ -1,102 +1,149 @@
 # Diffusion Evolution
 
-This repo is for our ICLR 2025 paper [Diffusion models are evolutionary algorithms](https://openreview.net/forum?id=xVefsBbG2O), which anayatically proves that diffusion models are a type of evolutionary algorithm. This equivalence allows us to leverage advancements in diffusion models for evolutionary algorithm tasks, including accelerated sampling and latent space diffusion.
+This repository contains the official implementation of the ICLR 2025 paper, "[Diffusion Models are Evolutionary Algorithms](https://openreview.net/forum?id=xVefsBbG2O)". This work analytically proves that diffusion models can be interpreted as a form of evolutionary algorithm. This equivalence allows us to leverage advancements in diffusion models for evolutionary tasks, including accelerated sampling and latent space exploration.
 
-![](./experiments/2d_models/two_peaks/images/framwork.jpg)
+The core idea of the Diffusion Evolution framework is to treat the reverse diffusion process as an evolutionary algorithm. Each individual in the population estimates the noise that was added to it (or, equivalently, its noise-free state) from the fitness of its neighbors. The population then "evolves" by taking a denoising step.
 
-The Diffusion Evolution framework treats inversed diffusion as evolutionary algorithm, where the population estimates its added noise (or their noise-free states) based on its neighbors' fitness then evolves via denoising. The following figure shows the process on optimizing a two-peak density function. The Diffusion Evolution initially has large neighbor range (shown as blue disk), calculating $x_0$ based on the fitness of its neighbors then move toward estimated $x_0$.
+## Project Structure
 
-![](./experiments/2d_models/figures/process.png)
+The core library is located in `src/diffevo`. Problems, optimizers, and callbacks are loaded through a plugin-based architecture, so each can be added or replaced without touching the core.
 
+- `src/diffevo`: The core library, containing the `Orchestrator` and base classes for problems, optimizers, and callbacks.
+- `plugins/`: Home for problems, optimizers, and callbacks.
+  - `plugins/problems/`: Problem definitions.
+  - `plugins/optimizers/`: Optimizer implementations.
+  - `plugins/callbacks/`: Callback implementations.
+- `configs/`: YAML configuration files for experiments.
+- `tests/`: Unit tests for the core library.
 
-## Install
+## Installation
 
-You can install the package via pip:
+Getting started is as simple as cloning the repository and running the initialization script. This will handle the creation of a virtual environment and the installation of all necessary dependencies.
 
 ```bash
-pip install diffevo
+# Clone the repository
+git clone https://github.com/Zhangyanbo/diffusion-evolution
+cd diffusion-evolution
+
+# Run the initialization script
+python init.py
 ```
 
-or manually install:
+The `init.py` script will create a local Python environment in a `.venv` directory and install all dependencies. If you run it again, it will prompt you to reset the environment.
+
+## Quick Start
+
+To verify that the installation works, run a quick "smoketest":
+
 ```bash
-clone https://github.com/Zhangyanbo/diffusion-evolution
-cd diffevo/
-pip install .
+# Make sure to activate the virtual environment first
+source .venv/bin/activate
+
+# Run the smoketest
+python run.py configs/smoketest.yaml --smoketest
 ```
 
-Some benchmark codes requires dependencies, can be installed via:
+This will run a minimal configuration and save the results to a timestamped directory in `results/`.
+
+## Running Experiments
+
+All experiments are run through `run.py`, which takes a YAML configuration file as an argument.
+
 ```bash
-pip install cma gym pygame tqdm matplotlib numpy==1.26.4 
+python run.py <path_to_config>.yaml
 ```
 
-Also Pytorch version 2.5 or above is required
+We provide several example configurations in the `configs/` directory.
+
+## Generating a Report
+
+To run a sequence of experiments and generate a report, you can use the `report.py` script.
+
 ```bash
-pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
+python report.py
 ```
 
+This runs a predefined sequence of experiments (currently only the smoketest). Generating a report from the results is planned but not yet implemented.
 
-The benchmark fitness functions can be found here: https://github.com/bhartl/foobench 
+## Graph-Based Optimizers
 
-## Typical Usage
+This framework includes two distinct approaches to using graphs in Diffusion Evolution:
 
-In most cases, tuning hyperparameters or adding custom operations is necessary to achieve higher performance. We recommend using the following form for the best balance between conciseness and versatility.
+1.  **`GraphDiffEvo`**: For problems *defined on* a graph.
+2.  **`GraphBasedDiffEvo`**: An optimizer where the population *is* a graph.
 
-```python
-from diffevo import DDIMScheduler, BayesianGenerator
-from diffevo.examples import two_peak_density
+### 1. `GraphDiffEvo`: Solving Problems on Graphs
 
-scheduler = DDIMScheduler(num_step=100)
+This is a specialized optimizer for graph-based optimization problems. The goal is to find a graph structure that maximizes a certain objective function.
 
-x = torch.randn(512, 2)
+#### Running `GraphDiffEvo` Experiments
 
-for t, alpha in scheduler:
-    fitness = two_peak_density(x, std=0.25)
-    generator = BayesianGenerator(x, fitness, alpha)
-    x = generator(noise=0)
+We provide two example graph-based experiments:
+
+-   **Graph Flow**: This experiment attempts to evolve a graph that maximizes the "flow" of capacity from a source node to a sink node. To run it:
+    ```bash
+    python run.py configs/graph_flow.yaml
+    ```
+
+-   **Max Clique**: This experiment attempts to find the largest clique (a fully connected subgraph) in a graph. To run it:
+    ```bash
+    python run.py configs/max_clique.yaml
+    ```
+
+### 2. `GraphBasedDiffEvo`: Structuring the Population as a Graph
+
+This optimizer introduces a novel approach where the population of candidate solutions is itself structured as a sparse graph. The evolutionary updates (diffusion) only occur between connected nodes (neighbors), rather than across the entire population. This can improve efficiency and exploration for large populations.
+
+#### Running `GraphBasedDiffEvo` Experiments
+
+To run an experiment with this optimizer, you can use the example configuration:
+```bash
+python run.py configs/graph_based_evolution.yaml
 ```
 
-The generator requires fitness values to be non-negative. If your objective
-returns negative values, please apply a mapping (see `diffevo.fitnessmapping`)
-to convert them before calling the generator.
+This will run the Rosenbrock benchmark problem, but the `GraphBasedDiffEvo` optimizer will use a k-Nearest Neighbors graph to structure its internal population.
 
-The following are two evolution trajectories of different fitness functions.
+This repository includes a specialized framework for graph-based optimization problems using Diffusion Evolution. This framework is designed to be extensible, allowing researchers to easily define new graph problems and apply the `GraphDiffEvo` optimizer to them.
 
-## Advanced Usage
+### Running Graph Evolution Experiments
 
-We also offer multiple choices for each component to accommodate more advanced use cases:
+We provide two example graph-based experiments:
 
-* In addition to the `DDIMScheduler`, we provide the `DDIMSchedulerCosine`, which features a different $\alpha$ scheduler.
-* We offer multiple fitness mapping functions that map the original fitness to a different value. These can be found in `diffevo.fitnessmapping`.
-* Currently, we have only one version of the generator.
+1.  **Graph Flow**: This experiment attempts to evolve a graph that maximizes the "flow" of capacity from a source node to a sink node. To run it:
+    ```bash
+    python run.py configs/graph_flow.yaml
+    ```
 
-Below is an example of how to change the diffusion process and conduct advanced experiments:
+2.  **Max Clique**: This experiment attempts to find the largest clique (a fully connected subgraph) in a graph. To run it:
+    ```bash
+    python run.py configs/max_clique.yaml
+    ```
 
-```python
-import torch
-from diffevo import DDIMScheduler, BayesianGenerator, DDIMSchedulerCosine
-from diffevo.examples import two_peak_density
-from diffevo.fitnessmapping import Power, Energy, Identity
+### Creating a New Plugin
 
-scheduler = DDIMSchedulerCosine(num_step=100) # use a different scheduler
+To create a new plugin, you need to:
 
-x = torch.randn(512, 2)
+1.  **Create a New Python File**: Create a new Python file in the appropriate plugin directory (`plugins/problems`, `plugins/optimizers`, or `plugins/callbacks`).
+2.  **Create a New Class**: In the new file, create a new class that inherits from the appropriate base class (`Problem`, `Optimizer`, or `Callback`).
+3.  **Implement the Required Methods**: Implement the required methods for the base class. For example, a `Problem` plugin needs to implement `evaluate`.
+4.  **Update Your Configuration File**: In your YAML configuration file, update the `problem.name`, `optimizer.class_name`, or `callbacks` list to match the name of your new class.
 
-trace = [] # store the trace of the population
+Here is a simple template for a new problem plugin:
 
-mapping_fn = Power(3) # setup the power mapping function
+```python
+# In plugins/problems/my_problem.py
+from diffevo.problems.base import Problem
+import torch
 
-for t, alpha in scheduler:
-    fitness = two_peak_density(x, std=0.25)
-    # apply the power mapping function
-    generator = BayesianGenerator(x, mapping_fn(fitness), alpha)
-    x = generator(noise=0.1)
-    trace.append(x)
+class MyProblem(Problem):
+    def __init__(self, dim):
+        super().__init__(dim=dim)
 
-trace = torch.stack(trace)
+    def evaluate(self, x):
+        return torch.sum(x ** 2, dim=-1)
 ```
 
-
-### Cite our work
+## Citing Our Work
 
 ```
 @inproceedings{
@@ -111,4 +158,4 @@ url={https://openreview.net/forum?id=xVefsBbG2O}
 
 ## License
 
-Our software is relased under modified Apache 2.0 License. We allow non-commercial usage for research, study, learning, etc., while limiting the commercial usage.
\ No newline at end of file
+This software is released under the Apache 2.0 License. See the [LICENSE](LICENSE) file for more details.
diff --git a/configs/base.yaml b/configs/base.yaml
new file mode 100644
index 0000000..38add85
--- /dev/null
+++ b/configs/base.yaml
@@ -0,0 +1,8 @@
+# Base configuration for all experiments
+seed: 42
+num_runs: 1
+
+callbacks:
+  - CSVLogger
+  - ConsoleLogger
+  - PlottingCallback
diff --git a/configs/graph_based_evolution.yaml b/configs/graph_based_evolution.yaml
new file mode 100644
index 0000000..7d7beb5
--- /dev/null
+++ b/configs/graph_based_evolution.yaml
@@ -0,0 +1,16 @@
+base: base.yaml
+
+name: graph_based_evolution_smoketest
+
+optimizer:
+  module: diffevo.optimizers.graph_based_diffevo
+  class_name: GraphBasedDiffEvo
+  params:
+    num_step: 100
+    popsize: 50
+    k: 5
+
+problem:
+  name: Rosenbrock
+  params:
+    dim: 2
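The README added in this change says that `GraphBasedDiffEvo` structures its population as a k-nearest-neighbors graph and that updates only happen between connected individuals. The optimizer source is not part of this excerpt, so the following is only a minimal sketch of that idea, using the `popsize: 50`, `k: 5`, and `dim: 2` values from the config above. The helper names `knn_graph` and `local_estimate` are hypothetical, the fitness function is a stand-in, and the estimate is simplified to a plain fitness-weighted mean (it omits the distance kernel of the full Diffusion Evolution update).

```python
import math
import random


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbours (self excluded)."""
    neighbours = []
    for i, p in enumerate(points):
        by_distance = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(by_distance[:k])
    return neighbours


def local_estimate(points, fitness, neighbours, i):
    """Fitness-weighted mean over point i and its graph neighbours only.

    Assumes non-negative fitness values with a positive sum.
    """
    idx = [i] + neighbours[i]
    total = sum(fitness[j] for j in idx)
    dim = len(points[i])
    return [sum(fitness[j] * points[j][d] for j in idx) / total for d in range(dim)]


random.seed(0)
# 50 individuals in 2 dimensions, matching popsize and dim in the config.
points = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(50)]
# Stand-in fitness: a single positive peak at the origin (an assumption).
fitness = [math.exp(-(x * x + y * y)) for x, y in points]

neighbours = knn_graph(points, k=5)
estimate = local_estimate(points, fitness, neighbours, 0)
```

Each estimate touches only k + 1 individuals instead of the whole population, which is where the efficiency claim for large populations would come from. The brute-force neighbour search here is quadratic in the population size and is used only to keep the sketch short.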
diff --git a/configs/graph_evolution.yaml b/configs/graph_evolution.yaml
new file mode 100644
index 0000000..0a0db38
--- /dev/null
+++ b/configs/graph_evolution.yaml
@@ -0,0 +1,20 @@
+# Configuration for the graph evolution experiment
+
+problem:
+  name: Graphflow
+  params:
+    num_nodes: 3
+
+optimizer:
+  name: GraphDiffEvo
+  params:
+    pop_size: 512
+    dim: 9
+    T: 100
+    sigma_m: 0.0
+    power: 3.0
+
+callbacks:
+  - name: CSVLogger
+
+n_runs: 1
diff --git a/configs/graph_flow.yaml b/configs/graph_flow.yaml
new file mode 100644
index 0000000..4c3ec7f
--- /dev/null
+++ b/configs/graph_flow.yaml
@@ -0,0 +1,17 @@
+base: base.yaml
+
+name: graph_flow_experiment
+
+problem:
+  name: Graphflow
+  params:
+    dim: 25
+    num_nodes: 5
+
+optimizer:
+  name: GraphDiffEvo
+  params:
+    pop_size: 512
+    num_steps: 100
+    sigma_m: 0.0
+    power: 3.0
diff --git a/configs/image_evolution.yaml b/configs/image_evolution.yaml
new file mode 100644
index 0000000..2461973
--- /dev/null
+++ b/configs/image_evolution.yaml
@@ -0,0 +1,18 @@
+# Configuration for the image evolution experiment
+
+problem:
+  name: Image
+  params:
+    dim_sqrt: 28
+    target_image_name: "mnist_7"
+
+optimizer:
+  name: DiffEvo
+  params:
+    pop_size: 64
+    max_iters: 100
+
+callbacks:
+  - name: CSVLogger
+
+n_runs: 1
diff --git a/configs/max_clique.yaml b/configs/max_clique.yaml
new file mode 100644
index 0000000..9cdb7ec
--- /dev/null
+++ b/configs/max_clique.yaml
@@ -0,0 +1,17 @@
+base: base.yaml
+
+name: max_clique_experiment
+
+problem:
+  name: Maxclique
+  params:
+    dim: 25
+    num_nodes: 5
+
+optimizer:
+  name: GraphDiffEvo
+  params:
+    pop_size: 512
+    num_steps: 100
+    sigma_m: 0.0
+    power: 3.0
diff --git a/configs/rl_diffevo.yaml b/configs/rl_diffevo.yaml
new file mode 100644
index 0000000..a050bfe
--- /dev/null
+++ b/configs/rl_diffevo.yaml
@@ -0,0 +1,17 @@
+base: smoketest.yaml
+
+problem:
+  name: RL
+  env_name: "CartPole-v1"
+  dim_hidden: 8
+  n_hidden_layers: 1
+
+optimizer:
+  name: RLEvo
+  num_step: 10
+  population_size: 256
+  T: 10
+  scaling: 100
+  latent_dim: null
+  noise: 1
+  weight_decay: 0
diff --git a/configs/rl_evolution.yaml b/configs/rl_evolution.yaml
new file mode 100644
index 0000000..b74a546
--- /dev/null
+++ b/configs/rl_evolution.yaml
@@ -0,0 +1,17 @@
+# Configuration for the RL evolution experiment
+
+problem:
+  name: RL
+  params:
+    env_name: "CartPole-v1"
+
+optimizer:
+  name: DiffEvo
+  params:
+    pop_size: 64
+    max_iters: 100
+
+callbacks:
+  - name: CSVLogger
+
+n_runs: 1
diff --git a/configs/smoketest.yaml b/configs/smoketest.yaml
new file mode 100644
index 0000000..593c5e2
--- /dev/null
+++ b/configs/smoketest.yaml
@@ -0,0 +1,15 @@
+base: base.yaml
+
+name: smoketest
+
+optimizer:
+  module: diffevo.optimizers.diffevo
+  class_name: DiffEvo
+  params:
+    num_step: 1
+    popsize: 10
+
+problem:
+  name: Rosenbrock
+  params:
+    dim: 2
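Several configs in this change begin with a `base:` key (`base: base.yaml`, and `base: smoketest.yaml` in `configs/rl_diffevo.yaml`). How `run.py` resolves that key is not shown in this excerpt. A minimal sketch, assuming the child file is deep-merged over its base (nested mappings merged key by key, scalars and lists replaced), could look like the following. `deep_merge` and `resolve_config` are hypothetical names, and the dictionaries stand in for the parsed YAML files.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with `override` merged over `base` (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the child replace the base value.
            merged[key] = deepcopy(value)
    return merged


def resolve_config(name: str, files: dict) -> dict:
    """Follow `base:` links recursively; `files` maps file name to parsed mapping."""
    config = dict(files[name])  # shallow copy so the input is not mutated
    parent = config.pop("base", None)
    if parent is None:
        return deepcopy(config)
    return deep_merge(resolve_config(parent, files), config)


# Parsed contents of configs/base.yaml and configs/smoketest.yaml from this diff.
files = {
    "base.yaml": {
        "seed": 42,
        "num_runs": 1,
        "callbacks": ["CSVLogger", "ConsoleLogger", "PlottingCallback"],
    },
    "smoketest.yaml": {
        "base": "base.yaml",
        "name": "smoketest",
        "optimizer": {
            "module": "diffevo.optimizers.diffevo",
            "class_name": "DiffEvo",
            "params": {"num_step": 1, "popsize": 10},
        },
        "problem": {"name": "Rosenbrock", "params": {"dim": 2}},
    },
}

resolved = resolve_config("smoketest.yaml", files)
```

Under this assumption the resolved smoketest config carries `seed`, `num_runs`, and `callbacks` from the base file plus its own `name`, `optimizer`, and `problem` sections. The sketch does not guard against circular `base:` chains.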
diff --git a/diffevo/__init__.py b/diffevo/__init__.py
deleted file mode 100644
index 7e0b47f..0000000
--- a/diffevo/__init__.py
+++ /dev/null
@@ -1,6 +0,0 @@
-from .optimizer import DiffEvo
-from .ddim import DDIMScheduler, DDIMSchedulerCosine, DDPMScheduler
-from .generator import BayesianGenerator, LatentBayesianGenerator
-from . import examples
-from . import fitnessmapping
-from .latent import RandomProjection
\ No newline at end of file
diff --git a/diffevo/kde.py b/diffevo/kde.py
deleted file mode 100644
index 782ade9..0000000
--- a/diffevo/kde.py
+++ /dev/null
@@ -1,27 +0,0 @@
-import torch
-
-
-def distance_matrix(x, y):
-    """Compute the pairwise distance matrix between x and y.
-    
-    Args:
-        x: (N, d) tensor.
-        y: (M, d) tensor.
-    Returns:
-        (N, M) tensor, the pairwise distance matrix.
-    """
-    return torch.cdist(x, y)
-
-def KDE(samples, h=0.1):
-    """Modified Kernel Density Estimation (KDE) method, which only estimate the density at the given samples.
-    
-    Args:
-        samples: (N, d) tensor, the samples to estimate the density.
-        h: float, the bandwidth.
-    Returns:
-        (N,) tensor, the estimated density at the given samples.
-    """
-    distances = distance_matrix(samples, samples) # (N, N)
-    weights = torch.exp(-(distances ** 2) / (2 * h**2)) # (N,)
-    weights = weights.sum(dim=-1)
-    return weights / sum(weights) * samples.shape[0]
\ No newline at end of file
diff --git a/diffevo/latent.py b/diffevo/latent.py
deleted file mode 100644
index 498b202..0000000
--- a/diffevo/latent.py
+++ /dev/null
@@ -1,20 +0,0 @@
-import torch
-import torch.nn as nn
-
-
-class RandomProjection(nn.Module):
-    def __init__(self, in_features, out_features, normalize=True):
-        super().__init__()
-        self.in_features = in_features
-        self.out_features = out_features
-        self.linear = nn.Linear(in_features, out_features, bias=False)
-        self.normalize = normalize
-        self.init_weight()
-    
-    def init_weight(self):
-        self.linear.weight.data = torch.randn_like(self.linear.weight.data) / (self.in_features ** 0.5)
-        if self.normalize:
-            self.linear.weight.data /= self.linear.weight.data.norm(dim=1, keepdim=True)
-
-    def forward(self, x):
-        return self.linear(x)
\ No newline at end of file
diff --git a/diffevo/optimizer.py b/diffevo/optimizer.py
deleted file mode 100644
index 845dd55..0000000
--- a/diffevo/optimizer.py
+++ /dev/null
@@ -1,95 +0,0 @@
-from .ddim import DDIMScheduler
-from .generator import BayesianGenerator
-from .fitnessmapping import Identity
-import torch
-from tqdm import tqdm
-
-
-class DiffEvo:
-    """Diffusion evolution algorithm for optimization.
-
-    Args:
-        - num_step: int, the number of steps to evolve the population.
-        - alpha: str or torch.Tensor, the alpha schedule for the diffusion process.
-        - density: str, the mode of the density function, only support 'uniform' and 'kde'.
-          This argument is kept for backward compatibility and is rarely used.
-        - sigma: str, the mode of the sigma, only support 'ddpm' and 'zero'.
-        - sigma_scale: float, the scaling factor for the sigma.
-        - sample_steps: list of int, the steps to evaluate the fitness.
-        - scaling: float, the scaling factor for the population.
-        - fitness_mapping: str, the mapping function from fitness to probability, only support 'identity' and 'energy'.
-        - temperature: float or list of float, the temperature for the fitness mapping.
-        - method: str, the method to estimate the density, only support 'bayesian' and 'nn'.
-        - kde_bandwidth: float, the bandwidth for the KDE density estimator.
-          Also a legacy option, defaults to 0.1.
-        - nn: nn.Module, the neural network for the density estimator.
-
-    Methods:
-        - step(gt, t): evolve the population by one step.
-            outputs:
-                - gt: torch.Tensor, the evolved population.
-                - density: torch.Tensor, the estimated density of the evolved population.
-
-        - optimize(fit_fn, initial_population, trace=False): optimize the population.
-            outputs:
-                - population: torch.Tensor, the optimized population.
-                - population_trace: list of torch.Tensor, the population trace during the optimization.
-                - fitness_count: list of float, the fitness count during the optimization.
-
-    Example:
-        ```python
-        optimizer = DiffEvo(num_step=100, sigma='ddpm')
-        sampled, trace, fitness = optimizer.optimize(fitness_function_2d, torch.randn(512, 2), trace=True)
-        ```
-    """
-
-    def __init__(self,
-                 num_step: int = 100,
-                 density='uniform',
-                 noise:float=1.0,
-                 scaling: float=1,
-                 fitness_mapping=None,
-                 kde_bandwidth=0.1):
-        self.num_step = num_step
-
-        if not density in ['uniform', 'kde']:
-            raise NotImplementedError(f'Density estimator {density} is not implemented.')
-        # legacy options kept for backward compatibility
-        self.density = density
-        self.kde_bandwidth = kde_bandwidth
-        self.scaling = scaling
-        self.noise = noise
-        if fitness_mapping is None:
-            self.fitness_mapping = Identity()
-        else:
-            self.fitness_mapping = fitness_mapping
-        self.scheduler = DDIMScheduler(self.num_step)
-    
-    def optimize(self, fit_fn, initial_population, trace=False):
-        x = initial_population
-
-        fitness_count = []
-        if trace:
-            population_trace = [initial_population]
-
-        for t, alpha in tqdm(self.scheduler):
-            fitness = fit_fn(x * self.scaling)
-            generator = BayesianGenerator(
-                x,
-                self.fitness_mapping(fitness),
-                alpha,
-                density=self.density,
-                h=self.kde_bandwidth,
-            )
-            x = generator(noise=self.noise)
-            if trace:
-                population_trace.append(x)
-            fitness_count.append(fitness)
-        
-        if trace:
-            population_trace = torch.stack(population_trace) * self.scaling
-        
-        if trace:
-            return x, population_trace, fitness_count
-        else:
-            return x
\ No newline at end of file
diff --git a/examples/01_classic_optimization.py b/examples/01_classic_optimization.py
new file mode 100644
index 0000000..48d8105
--- /dev/null
+++ b/examples/01_classic_optimization.py
@@ -0,0 +1,21 @@
+
+import subprocess
+import sys
+
+def main():
+    """
+    This example demonstrates how to run an experiment using the command-line interface.
+    It runs the smoketest configuration, which uses the DiffEvo optimizer on the Rosenbrock problem.
+    """
+    command = [
+        sys.executable,
+        "run.py",
+        "configs/smoketest.yaml",
+        "--smoketest"
+    ]
+
+    print(f"Running command: {' '.join(command)}")
+    subprocess.run(command, check=True)
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/02_graph_optimization.py b/examples/02_graph_optimization.py
new file mode 100644
index 0000000..b99890a
--- /dev/null
+++ b/examples/02_graph_optimization.py
@@ -0,0 +1,21 @@
+
+import subprocess
+import sys
+
+def main():
+    """
+    This example demonstrates how to run an experiment using the command-line interface.
+    It runs the graph_flow configuration.
+    """
+    command = [
+        sys.executable,
+        "run.py",
+        "configs/graph_flow.yaml",
+        "--smoketest"
+    ]
+
+    print(f"Running command: {' '.join(command)}")
+    subprocess.run(command, check=True)
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/03_image_reconstruction.py b/examples/03_image_reconstruction.py
new file mode 100644
index 0000000..7d7370c
--- /dev/null
+++ b/examples/03_image_reconstruction.py
@@ -0,0 +1,21 @@
+
+import subprocess
+import sys
+
+def main():
+    """
+    This example demonstrates how to run an experiment using the command-line interface.
+    It runs the image_evolution configuration.
+    """
+    command = [
+        sys.executable,
+        "run.py",
+        "configs/image_evolution.yaml",
+        "--smoketest"
+    ]
+
+    print(f"Running command: {' '.join(command)}")
+    subprocess.run(command, check=True)
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/04_reinforcement_learning.py b/examples/04_reinforcement_learning.py
new file mode 100644
index 0000000..19b9d46
--- /dev/null
+++ b/examples/04_reinforcement_learning.py
@@ -0,0 +1,21 @@
+
+import subprocess
+import sys
+
+def main():
+    """
+    This example demonstrates how to run an experiment using the command-line interface.
+    It runs the rl_diffevo configuration.
+    """
+    command = [
+        sys.executable,
+        "run.py",
+        "configs/rl_diffevo.yaml",
+        "--smoketest"
+    ]
+
+    print(f"Running command: {' '.join(command)}")
+    subprocess.run(command, check=True)
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/__init__.py b/examples/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/experiments/2d_models/experiment.py b/experiments/2d_models/experiment.py
deleted file mode 100644
index c64f2c7..0000000
--- a/experiments/2d_models/experiment.py
+++ /dev/null
@@ -1,115 +0,0 @@
-from two_peaks.experiment import plot_diffusion as plot_diffusion_two_peaks
-import torch
-import matplotlib.pyplot as plt
-from diffevo import DiffEvo, BayesianGenerator, DDIMScheduler
-from two_peaks.experiment import two_peak_density
-from two_peaks_step.experiment import two_peak_density as two_peak_density_step
-from tqdm import tqdm
-import numpy as np
-
-import matplotlib
-matplotlib.rcParams['mathtext.fontset'] = 'stix'
-matplotlib.rcParams['font.family'] = 'STIXGeneral'
-
-
-def optimizer(fit_fn, initial_population, scaling=1.0, noise=0.1, num_step=100):
-    x = initial_population
-
-    fitness_count = []
-    population_trace = [initial_population]
-    x0_trace = []
-    scheduler = DDIMScheduler(num_step)
-
-    for t, alpha in tqdm(scheduler):
-        fitness = fit_fn(x * scaling)
-        generator = BayesianGenerator(x, fitness, alpha)
-        x, x0 = generator(noise=noise, return_x0=True)
-        x0_trace.append(x0)
-        population_trace.append(x)
-        fitness_count.append(fitness)
-    
-    population_trace = torch.stack(population_trace) * scaling
-    x0_trace = torch.stack(x0_trace) * scaling
-    
-    return (x * scaling, population_trace, fitness_count), x0_trace, scheduler
-
-def make_plot(alpha, trace, fitnesses, method:str, focus_id=20, row=0, plot_diffusion=plot_diffusion_two_peaks, draw_circle=False):
-    time_steps = [20, 45, 70, 95]
-    for i, t in enumerate(time_steps):
-        plt.subplot(2, 4, i+1+row*4)
-        past_ts = time_steps[:i] if i > 0 else []
-        alpha_t = alpha[len(alpha) - t - 1]
-        plot_diffusion(alpha_t, trace, fitnesses, focus_id=focus_id, T=t, num_sample=100, dt=23, past_ts=past_ts)
-        # set aspect ratio to be equal
-        plt.gca().set_aspect('equal', adjustable='box')
-
-        if row == 0:
-            plt.title(f'$t={100-t}$')
-        
-        if i == 0:
-            plt.ylabel(method)
-        
-        # add a y = x line
-        plt.axline((-5, -5), (5, 5), color='black', linestyle='--', alpha=0.25)
-
-        if draw_circle:
-            circle = plt.Circle([-1, -1], 0.5, color='black', fill=False, zorder=2, linestyle='--', alpha=0.5)
-            plt.gca().add_artist(circle)
-            circle = plt.Circle([1, 1], 0.5, color='black', fill=False, zorder=2, linestyle='--', alpha=0.5)
-            plt.gca().add_artist(circle)
-
-def project_to_1d(trace):
-    return trace.mean(dim=-1) * (2 ** 0.5)
-
-def plot_distance_histogram(x0_trace, population_trace, t, ax=None, total_step=100, label=True, ylabel=True, title=False, xlabel=True):
-    if ax is None:
-        ax = plt.gca()
-    
-    T = total_step - t
-
-    plt.hist(project_to_1d(x0_trace[t]).numpy(), bins=32, density=True, alpha=0.75, range=(-3,3), label='$\hat{x}_0$', color='#E93A01')
-    plt.hist(project_to_1d(population_trace[t]).numpy(), bins=32, density=True, alpha=0.5, range=(-3,3), label='$x$', color='#6F6E6E')
-
-    # remove x, y ticks
-    ax.set_yticks([])
-    ax.set_xticks([])
-    ax.set_ylim(0, 1.5)
-
-    # add vertical line at +- sqrt(2)
-    plt.axvline(x=np.sqrt(2), color='black', linestyle='--')
-    plt.axvline(x=-np.sqrt(2), color='black', linestyle='--')
-
-    if label:
-        ax.legend()
-    if title:
-        ax.set_title(f't = {T}')
-    if ylabel:
-        ax.set_ylabel('(b) density')
-
-def make_plot_distance_histogram(x0_trace, population_trace, row=0, total_step=100):
-    time_steps = [20, 45, 70, 95]
-    for i, t in enumerate(time_steps):
-        plt.subplot(2, 4, i+1+row*4)
-        plot_distance_histogram(x0_trace, population_trace, t, total_step=total_step, label=(i==3), ylabel=(i==0))
-        # set aspect ratio to be a standard rectangle
-        plt.gca().set_aspect('auto', adjustable='box')
-
-if __name__ == '__main__':
-    torch.manual_seed(42)
-
-    x0 = torch.randn(512, 2)
-    result_two_peak, x0_trace, scheduler_two_peak = optimizer(two_peak_density, initial_population=x0, scaling=1.5, noise=0.1)
-
-    # save results
-    torch.save([result_two_peak, x0_trace, scheduler_two_peak.alpha], './data/two_peak.pt')
-
-    # make plots
-    plt.figure(figsize=(8, 4))
-    pop, trace, fitnesses = result_two_peak
-    make_plot(scheduler_two_peak.alpha, trace, fitnesses, '(a) evolution', row=0, focus_id=7, plot_diffusion=plot_diffusion_two_peaks)
-    make_plot_distance_histogram(x0_trace, trace, row=1, total_step=100)
-
-    plt.tight_layout()
-    plt.savefig(f'./figures/process.png')
-    plt.savefig(f'./figures/process.pdf')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/2d_models/figures/process.png b/experiments/2d_models/figures/process.png
deleted file mode 100644
index a34b0d8..0000000
Binary files a/experiments/2d_models/figures/process.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/diffusion.py b/experiments/2d_models/two_peaks/diffusion.py
deleted file mode 100644
index 2c7050b..0000000
--- a/experiments/2d_models/two_peaks/diffusion.py
+++ /dev/null
@@ -1,52 +0,0 @@
-# Making the figure of the similarity between diffusion and evolution
-import torch
-import torch.nn as nn
-import matplotlib.pyplot as plt
-from torch.distributions import MultivariateNormal
-from diffevo import DiffEvo
-from experiment import two_peak_density
-from matplotlib.colors import LinearSegmentedColormap
-
-colors = ["#C8C7C7", "#FF6D3D", "#E93A01"]
-custom_cmap = LinearSegmentedColormap.from_list("custom_cmap", colors)
-
-
-def diffuse(num_population, num_step):
-    optimizer_ddpm = DiffEvo(num_step=num_step, scaling=1.0)
-    x0 = torch.randn(num_population, 2)
-
-    fitness_func = lambda x: two_peak_density(x, std=0.5)
-
-    pop, trace, fitnesses = optimizer_ddpm.optimize(fitness_func, initial_population=x0, trace=True)
-
-    return pop, trace, fitnesses
-
-def make_plot(trace, fitnesses):
-    steps = [0, 80, 98]
-    fig, axes = plt.subplots(1, len(steps), figsize=(len(steps) * 3, 3))
-
-    for i, t in enumerate(steps):
-        ax = axes[i]
-        ax.scatter(trace[t, :, 0], trace[t, :, 1], s=1, c=fitnesses[t], cmap=custom_cmap, vmin=0, vmax=1)
-        ax.set_title(f'T={t}')
-        ax.set_xlim(-4, 4)
-        ax.set_ylim(-4, 4)
-        ax.set_aspect('equal', adjustable='box')
-        # remove ticks
-        ax.set_xticks([])
-        ax.set_yticks([])
-
-        cbar = plt.colorbar(ax.collections[0], orientation='vertical')
-        cbar.set_label('Fitness')
-
-if __name__ == '__main__':
-    torch.manual_seed(42)
-    num_population = 512
-    num_step = 100
-
-    pop, trace, fitnesses = diffuse(num_population, num_step)
-    make_plot(trace, fitnesses)
-    plt.tight_layout()
-    plt.savefig('./images/diffuse.png')
-    plt.savefig('./images/diffuse.pdf')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/2d_models/two_peaks/experiment.py b/experiments/2d_models/two_peaks/experiment.py
deleted file mode 100644
index 2c77c3d..0000000
--- a/experiments/2d_models/two_peaks/experiment.py
+++ /dev/null
@@ -1,123 +0,0 @@
-import torch
-import torch.nn as nn
-import matplotlib.pyplot as plt
-from torch.distributions import MultivariateNormal
-from diffevo import DiffEvo
-from diffevo.examples import two_peak_density
-
-def add_circle(mu, r, alpha=0.1):
-    circle = plt.Circle(mu, r, color='#46B3D5', alpha=alpha, zorder=2)
-    plt.gca().add_artist(circle)
-
-def plot_diffusion(alpha_t, trace, fitnesses, focus_id, T, num_sample=100, dt=25, past_ts=[]):
-    r = 3 * torch.sqrt((1 - alpha_t) / alpha_t)
-
-    # plot trace
-    for t in trace.transpose(0, 1)[:num_sample]:
-        plt.plot(t[:, 0], t[:, 1], '-', color='#E3E3E3', alpha=0.5)
-
-    # select samples in distance to the focus point, within r
-    selected_p = trace[T, focus_id, :]
-    mu = selected_p / alpha_t ** 0.5
-    d = torch.norm(trace[T, :num_sample] - mu, dim=1)
-    inrange = torch.where(d <= r)[0]
-    outrange = torch.where(d > r)[0]
-
-    next_t = min(T + dt, len(trace) - 1)
-    plt.plot(trace[:next_t, focus_id, 0], trace[:next_t, focus_id, 1], '-', color='#F5851E', zorder=3)
-
-    # plot selected
-
-    size = torch.stack(fitnesses)[T, :num_sample] ** 0.05 * 50 + 1
-    size = size / size.max() * 20
-    plt.scatter(trace[T, outrange, 0], trace[T, outrange, 1], color='#C6C6C6', s=size[outrange], alpha=1, zorder=10)
-    plt.scatter(trace[T, inrange, 0], trace[T, inrange, 1], color='#46B3D5', s=size[inrange] * 1, alpha=1, zorder=9)
-    plt.scatter(selected_p[0], selected_p[1], color='black', zorder=10, marker='*')
-    for pt in past_ts:
-        _sp = trace[pt, focus_id, :]
-        plt.scatter(_sp[0], _sp[1], color='gray', zorder=10, marker='*')
-
-    # draw nested disks around mu (the focus point rescaled by 1 / sqrt(alpha_t))
-    # filling each circle with transparent blue
-    add_circle(mu, r * 3/3, alpha=0.1)
-    add_circle(mu, r * 2/3, alpha=0.2)
-    add_circle(mu, r * 1/3, alpha=0.3)
-
-    # plt.scatter(pop[:num_sample, 0], pop[:num_sample, 1], color='#E93A01', s=1, zorder=11)
-    plt.scatter([-1, 1], [-1, 1], color='black', s=100, marker='+', zorder=12)
-
-    fit = torch.stack(fitnesses)[T].unsqueeze(1)
-    x = trace[T]
-    d = torch.norm(alpha_t.sqrt() * x - x[focus_id], dim=1).unsqueeze(1)
-    pd = torch.exp(-(d ** 2) / (1 - alpha_t) / 2)
-    w = fit * pd
-    w = w / w.sum()
-
-    x0 = torch.sum(x * w, dim=0)
-    x_next = trace[next_t, focus_id]
-    plt.scatter(x0[0], x0[1], color='#E93A01', zorder=13, marker='.')
-    # add text
-    plt.text(x0[0], x0[1], '$x_0$', fontsize=12, color='black', ha='left', va='top', zorder=13)
-    plt.scatter(x_next[0], x_next[1], color='#F5851E', zorder=15, marker='*')
-
-    # draw a dashed arrow from selected_p to x0
-    v = x0 - selected_p
-    u = v / torch.norm(v)
-    plt.arrow(selected_p[0] + 0.2 * u[0], selected_p[1] + 0.2 * u[1],
-            v[0] - 0.4 * u[0],
-            v[1] - 0.4 * u[1],
-            head_width=0.1, head_length=0.1, fc='black', ec='black', zorder=14, alpha=0.25)
-
-    # set limits
-    plt.xlim(-3, 3)
-    plt.ylim(-3, 3)
-    # remove ticks
-    plt.xticks([])
-    plt.yticks([])
-
-def make_plot(optimizer, trace, fitnesses, method:str):
-    plt.figure(figsize=(8, 2.5))
-    time_steps = [20, 45, 70, 95]
-    for i, t in enumerate(time_steps):
-        plt.subplot(1, 4, i+1)
-        past_ts = time_steps[:i] if i > 0 else []
-        alpha_t = optimizer.scheduler.alpha[optimizer.num_step - t - 1]
-        plot_diffusion(alpha_t, trace, fitnesses, focus_id=20, T=t, num_sample=100, dt=23, past_ts=past_ts)
-        # set aspect ratio to be equal
-        plt.gca().set_aspect('equal', adjustable='box')
-        plt.title(f'T={t}')
-
-    plt.tight_layout()
-    plt.savefig(f'./figures/process_bayesian_{method}.png')
-    plt.savefig(f'./figures/process_bayesian_{method}.pdf')
-    plt.close()
-
-    # do a simple scatter plot of the final population
-    plt.scatter(trace[-1, :, 0], trace[-1, :, 1], s=1)
-    plt.xlim(-2, 2)
-    plt.ylim(-2, 2)
-    plt.gca().set_aspect('equal', adjustable='box')
-    plt.savefig(f'./figures/final_population_{method}.png')
-    plt.savefig(f'./figures/final_population_{method}.pdf')
-    plt.close()
-
-
-if __name__ == '__main__':
-    torch.manual_seed(42)
-    optimizer_naive = DiffEvo(num_step=100, scaling=1.5, noise=0)
-    optimizer_ddpm = DiffEvo(num_step=100, scaling=1.5, noise=0.1)
-
-    x0 = torch.randn(512, 2)
-
-    result_naive = optimizer_naive.optimize(two_peak_density, initial_population=x0, trace=True)
-    result_ddpm = optimizer_ddpm.optimize(two_peak_density, initial_population=x0, trace=True)
-
-    x0 = torch.randn(512, 2) + torch.Tensor([[-1, 1]])
-    result_hard = optimizer_ddpm.optimize(two_peak_density, initial_population=x0, trace=True)
-
-    pop, trace, fitnesses = result_naive
-    make_plot(optimizer_naive, trace, fitnesses, 'zero')
-    pop, trace, fitnesses = result_ddpm
-    make_plot(optimizer_ddpm, trace, fitnesses, 'ddpm')
-    pop, trace, fitnesses = result_hard
-    make_plot(optimizer_ddpm, trace, fitnesses, 'hard')
\ No newline at end of file
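Note on the removed `plot_diffusion` above: the red `$x_0$` marker is a fitness-weighted average of the population, with weights given by fitness times a Gaussian kernel around the focus point. A minimal NumPy sketch of that estimate follows. The function name `estimate_x0`, the single-peak fitness, and the sample values are illustrative assumptions, not part of the repository.

```python
import numpy as np

def estimate_x0(x, fitness, focus_id, alpha_t):
    """Fitness-weighted estimate of the noise-free state for one individual."""
    # distance between the rescaled population and the focus individual
    d = np.linalg.norm(np.sqrt(alpha_t) * x - x[focus_id], axis=1)
    # Gaussian kernel with variance (1 - alpha_t)
    kernel = np.exp(-(d ** 2) / (2.0 * (1.0 - alpha_t)))
    # combine with fitness and normalize to a probability vector
    w = fitness * kernel
    w = w / w.sum()
    # convex combination of the population
    return (w[:, None] * x).sum(axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(512, 2))
# hypothetical single-peak fitness, used only for this illustration
fitness = np.exp(-np.sum((x - 1.0) ** 2, axis=1))
x0 = estimate_x0(x, fitness, focus_id=20, alpha_t=0.5)
```

Because the weights are non-negative and sum to one, the estimate always lies inside the bounding box of the current population.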
diff --git a/experiments/2d_models/two_peaks/figures/final_population_ddpm.png b/experiments/2d_models/two_peaks/figures/final_population_ddpm.png
deleted file mode 100644
index dc11e6b..0000000
Binary files a/experiments/2d_models/two_peaks/figures/final_population_ddpm.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/figures/final_population_hard.png b/experiments/2d_models/two_peaks/figures/final_population_hard.png
deleted file mode 100644
index 197110b..0000000
Binary files a/experiments/2d_models/two_peaks/figures/final_population_hard.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/figures/final_population_zero.png b/experiments/2d_models/two_peaks/figures/final_population_zero.png
deleted file mode 100644
index 4371bf2..0000000
Binary files a/experiments/2d_models/two_peaks/figures/final_population_zero.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/figures/process_bayesian_ddpm.png b/experiments/2d_models/two_peaks/figures/process_bayesian_ddpm.png
deleted file mode 100644
index 129d1c5..0000000
Binary files a/experiments/2d_models/two_peaks/figures/process_bayesian_ddpm.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/figures/process_bayesian_hard.png b/experiments/2d_models/two_peaks/figures/process_bayesian_hard.png
deleted file mode 100644
index 0761d58..0000000
Binary files a/experiments/2d_models/two_peaks/figures/process_bayesian_hard.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/figures/process_bayesian_zero.png b/experiments/2d_models/two_peaks/figures/process_bayesian_zero.png
deleted file mode 100644
index bba49d5..0000000
Binary files a/experiments/2d_models/two_peaks/figures/process_bayesian_zero.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/images/diffuse.png b/experiments/2d_models/two_peaks/images/diffuse.png
deleted file mode 100644
index d5fca3d..0000000
Binary files a/experiments/2d_models/two_peaks/images/diffuse.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/images/framwork.jpg b/experiments/2d_models/two_peaks/images/framwork.jpg
deleted file mode 100644
index 3a9f285..0000000
Binary files a/experiments/2d_models/two_peaks/images/framwork.jpg and /dev/null differ
diff --git a/experiments/2d_models/two_peaks/images/framwork.png b/experiments/2d_models/two_peaks/images/framwork.png
deleted file mode 100644
index 023c0c0..0000000
Binary files a/experiments/2d_models/two_peaks/images/framwork.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/experiment.py b/experiments/2d_models/two_peaks_step/experiment.py
deleted file mode 100644
index f5dc44a..0000000
--- a/experiments/2d_models/two_peaks_step/experiment.py
+++ /dev/null
@@ -1,142 +0,0 @@
-import torch
-import torch.nn as nn
-import matplotlib.pyplot as plt
-from torch.distributions import MultivariateNormal
-from diffevo import DiffEvo
-
-
-def two_peak_density(x, mu1=None, mu2=None, std=0.5):
-    if mu1 is None:
-        mu1 = torch.tensor([-1., -1.])
-    if mu2 is None:
-        mu2 = torch.tensor([1., 1.])
-
-    # compute the minimal distance to the two peaks
-    d1 = torch.norm(x - mu1, dim=-1)
-    d2 = torch.norm(x - mu2, dim=-1)
-    d = torch.min(d1, d2)
-
-    # return 1 if the nearest peak is closer than `std`, otherwise ~0 (clamped to 1e-9 below)
-    p = (d < std).float()
-    p = torch.clamp(p, 1e-9, 1)
-
-    return p
-
-def add_circle(mu, r, alpha=0.1):
-    circle = plt.Circle(mu, r, color='#46B3D5', alpha=alpha, zorder=2)
-    plt.gca().add_artist(circle)
-
-def plot_diffusion(alpha_t, trace, fitnesses, focus_id, T, num_sample=100, dt=25, past_ts=[]):
-    r = 3 * torch.sqrt((1 - alpha_t) / alpha_t)
-
-    # plot trace
-    for t in trace.transpose(0, 1)[:num_sample]:
-        plt.plot(t[:, 0], t[:, 1], '-', color='#E3E3E3', alpha=0.5)
-
-    # select samples in distance to the focus point, within r
-    selected_p = trace[T, focus_id, :]
-    mu = selected_p / alpha_t ** 0.5
-    d = torch.norm(trace[T, :num_sample] - mu, dim=1)
-    inrange = torch.where(d <= r)[0]
-    outrange = torch.where(d > r)[0]
-
-    next_t = min(T + dt, len(trace) - 1)
-    plt.plot(trace[:next_t, focus_id, 0], trace[:next_t, focus_id, 1], '-', color='#F5851E', zorder=3)
-
-    # plot selected
-
-    size = torch.stack(fitnesses)[T, :num_sample] ** 0.5 * 50 + 1
-    size = size / size.max() * 10
-    plt.scatter(trace[T, outrange, 0], trace[T, outrange, 1], color='#C6C6C6', s=size[outrange], alpha=1, zorder=10)
-    plt.scatter(trace[T, inrange, 0], trace[T, inrange, 1], color='#46B3D5', s=size[inrange] * 1, alpha=1, zorder=9)
-    plt.scatter(selected_p[0], selected_p[1], color='black', zorder=10, marker='*')
-    for pt in past_ts:
-        _sp = trace[pt, focus_id, :]
-        plt.scatter(_sp[0], _sp[1], color='gray', zorder=10, marker='*')
-
-    # draw nested disks around mu (the focus point rescaled by 1 / sqrt(alpha_t))
-    # filling each circle with transparent blue
-    add_circle(mu, r * 3/3, alpha=0.1)
-    add_circle(mu, r * 2/3, alpha=0.2)
-    add_circle(mu, r * 1/3, alpha=0.3)
-
-    fit = torch.stack(fitnesses)[T].unsqueeze(1)
-    x = trace[T]
-    d = torch.norm(alpha_t.sqrt() * x - x[focus_id], dim=1).unsqueeze(1)
-    pd = torch.exp(-(d ** 2) / (1 - alpha_t) / 2)
-    w = fit * pd
-    w = w / w.sum()
-
-    x0 = torch.sum(x * w, dim=0)
-    eps = (selected_p - alpha_t.sqrt() * x0) / torch.sqrt(1-alpha_t)
-    x_next = trace[next_t, focus_id]
-    plt.scatter(x0[0], x0[1], color='#E93A01', zorder=13, marker='.')
-    # add text
-    plt.text(x0[0], x0[1], '$x_0$', fontsize=12, color='black', ha='left', va='top', zorder=13)
-    plt.scatter(x_next[0], x_next[1], color='#F5851E', zorder=15, marker='*')
-
-    # draw a dashed arrow from selected_p to x0
-    v = x0 - selected_p
-    u = v / torch.norm(v)
-    plt.arrow(selected_p[0] + 0.2 * u[0], selected_p[1] + 0.2 * u[1],
-            v[0] - 0.4 * u[0],
-            v[1] - 0.4 * u[1],
-            head_width=0.1, head_length=0.1, fc='black', ec='black', zorder=14, alpha=0.25)
-
-    # set limits
-    plt.xlim(-3, 3)
-    plt.ylim(-3, 3)
-    # remove ticks
-    plt.xticks([])
-    plt.yticks([])
-
-def make_plot(optimizer, trace, fitnesses, method:str, time_steps = [20, 45, 70, 95]):
-    plt.figure(figsize=(8, 2.5))
-    for i, t in enumerate(time_steps):
-        plt.subplot(1, 4, i+1)
-        past_ts = time_steps[:i] if i > 0 else []
-        alpha_t = optimizer.scheduler.alpha[optimizer.num_step - t - 1]
-        plot_diffusion(alpha_t, trace, fitnesses, focus_id=20, T=t, num_sample=100, dt=23, past_ts=past_ts)
-        # set aspect ratio to be equal
-        plt.gca().set_aspect('equal', adjustable='box')
-        plt.title(f'T={t}')
-        ## draw two dashed circles, no filling
-        circle = plt.Circle([-1, -1], 0.5, color='black', fill=False, zorder=2, linestyle='--', alpha=0.5)
-        plt.gca().add_artist(circle)
-        circle = plt.Circle([1, 1], 0.5, color='black', fill=False, zorder=2, linestyle='--', alpha=0.5)
-        plt.gca().add_artist(circle)
-
-    plt.tight_layout()
-    plt.savefig(f'./figures/process_bayesian_{method}.png')
-    plt.savefig(f'./figures/process_bayesian_{method}.pdf')
-    plt.close()
-
-    # do a simple scatter plot of the final population
-    plt.scatter(trace[-1, :, 0], trace[-1, :, 1], s=1)
-    ## draw two circles
-    circle = plt.Circle([-1, -1], 0.5, color='black', alpha=0.1, zorder=2)
-    plt.gca().add_artist(circle)
-    circle = plt.Circle([1, 1], 0.5, color='black', alpha=0.1, zorder=2)
-    plt.gca().add_artist(circle)
-    plt.xlim(-2, 2)
-    plt.ylim(-2, 2)
-    plt.gca().set_aspect('equal', adjustable='box')
-    plt.savefig(f'./figures/final_population_{method}.png')
-    plt.savefig(f'./figures/final_population_{method}.pdf')
-    plt.close()
-
-
-if __name__ == '__main__':
-    torch.manual_seed(7)
-    optimizer_zero = DiffEvo(num_step=100, scaling=1.5, noise=0.0)
-    optimizer_ddpm = DiffEvo(num_step=100, scaling=1.5, noise=0.1)
-
-    x0 = torch.randn(512, 2)
-
-    result_zero = optimizer_zero.optimize(two_peak_density, initial_population=x0, trace=True)
-    result_ddpm = optimizer_ddpm.optimize(two_peak_density, initial_population=x0, trace=True)
-
-    pop, trace, fitnesses = result_zero
-    make_plot(optimizer_zero, trace, fitnesses, 'zero')
-    pop, trace, fitnesses = result_ddpm
-    make_plot(optimizer_ddpm, trace, fitnesses, 'ddpm')
\ No newline at end of file
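Note on the removed `two_peak_density` above: it is a step-shaped fitness that returns 1 inside a disk of radius `std` around either peak and a small floor value elsewhere. A NumPy sketch of the same idea is below. The name `step_fitness` and its parameter names are assumptions made for illustration.

```python
import numpy as np

def step_fitness(x, peaks=((-1.0, -1.0), (1.0, 1.0)), radius=0.5, floor=1e-9):
    """Return 1.0 for points within `radius` of any peak, else `floor`."""
    x = np.asarray(x, dtype=float)
    peaks = np.asarray(peaks, dtype=float)
    # distance from every point to every peak, then keep the nearest peak
    d = np.linalg.norm(x[..., None, :] - peaks, axis=-1).min(axis=-1)
    # the floor keeps the fitness strictly positive, as the clamp did
    return np.where(d < radius, 1.0, floor)

values = step_fitness([[1.0, 1.0], [0.0, 0.0], [-1.2, -0.9]])
```

The strictly positive floor matters because the fitness is later used as a weight that gets normalized; an all-zero neighborhood would otherwise divide by zero.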
diff --git a/experiments/2d_models/two_peaks_step/figures/final_population_ddpm.png b/experiments/2d_models/two_peaks_step/figures/final_population_ddpm.png
deleted file mode 100644
index 8ec0d9a..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/final_population_ddpm.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/final_population_hard.png b/experiments/2d_models/two_peaks_step/figures/final_population_hard.png
deleted file mode 100644
index 8c781c5..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/final_population_hard.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/final_population_hard_zero.png b/experiments/2d_models/two_peaks_step/figures/final_population_hard_zero.png
deleted file mode 100644
index 3183cf1..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/final_population_hard_zero.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/final_population_zero.png b/experiments/2d_models/two_peaks_step/figures/final_population_zero.png
deleted file mode 100644
index 4b0c0df..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/final_population_zero.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/process_bayesian_ddpm.png b/experiments/2d_models/two_peaks_step/figures/process_bayesian_ddpm.png
deleted file mode 100644
index 50f1151..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/process_bayesian_ddpm.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard.png b/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard.png
deleted file mode 100644
index b67f6f7..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard_zero.png b/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard_zero.png
deleted file mode 100644
index 353e258..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/process_bayesian_hard_zero.png and /dev/null differ
diff --git a/experiments/2d_models/two_peaks_step/figures/process_bayesian_zero.png b/experiments/2d_models/two_peaks_step/figures/process_bayesian_zero.png
deleted file mode 100644
index 1c20180..0000000
Binary files a/experiments/2d_models/two_peaks_step/figures/process_bayesian_zero.png and /dev/null differ
diff --git a/experiments/RL/.gitignore b/experiments/RL/.gitignore
deleted file mode 100644
index 319af91..0000000
--- a/experiments/RL/.gitignore
+++ /dev/null
@@ -1,4 +0,0 @@
-*.ipynb
-results/more_experiments/
-results/*.ipynb
-results/
\ No newline at end of file
diff --git a/experiments/RL/acrobot.sh b/experiments/RL/acrobot.sh
deleted file mode 100644
index a8b7a96..0000000
--- a/experiments/RL/acrobot.sh
+++ /dev/null
@@ -1,3 +0,0 @@
-python run.py --exp_name DiffEvoLatent --method diff_evo --env_name Acrobot-v1 --latent_dim 2 --dim_in 6 --dim_out 3 --num_experiment 1 --controller_type discrete --T 1 --scaling 1
-# python run.py --exp_name DiffEvoRaw --method diff_evo --env_name Acrobot-v1 --num_experiment 1 --dim_in 6 --dim_out 3 --controller_type discrete --T 1 --scaling 1
-# python run.py --exp_name CMAES --method cmaes --env_name Acrobot-v1 --num_experiment 1 --controller_type discrete --dim_in 6 --dim_out 3 --T 1 --scaling 1
\ No newline at end of file
diff --git a/experiments/RL/bipedalwalker.sh b/experiments/RL/bipedalwalker.sh
deleted file mode 100644
index e830aab..0000000
--- a/experiments/RL/bipedalwalker.sh
+++ /dev/null
@@ -1 +0,0 @@
-python run.py --exp_name DiffEvoLatent --method diff_evo --env_name BipedalWalker-v3 --latent_dim 2 --dim_in 24 --dim_out 4 --num_experiment 1 --controller_type continuous --factor 1 --scaling 10 --T 1
\ No newline at end of file
diff --git a/experiments/RL/cart_pole.sh b/experiments/RL/cart_pole.sh
deleted file mode 100644
index 4805a5e..0000000
--- a/experiments/RL/cart_pole.sh
+++ /dev/null
@@ -1,3 +0,0 @@
-python run.py --exp_name DiffEvoLatent --method diff_evo --env_name CartPole-v1 --latent_dim 2 --num_experiment 1 --scaling 1 --T 1
-# python run.py --exp_name DiffEvoRaw --method diff_evo --env_name CartPole-v1 --num_experiment 2 --scaling 1
-# python run.py --exp_name CMAES --method cmaes --env_name CartPole-v1 --num_experiment 2 --scaling 1
\ No newline at end of file
diff --git a/experiments/RL/diffRL/__init__.py b/experiments/RL/diffRL/__init__.py
deleted file mode 100644
index 6ed9f69..0000000
--- a/experiments/RL/diffRL/__init__.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from .models import ContinuousController, DiscreteController, ControllerMLP
-from .experiments import experiment, experiment_cmaes
-from .plots import make_plot, make_video
\ No newline at end of file
diff --git a/experiments/RL/diffRL/es/__init__.py b/experiments/RL/diffRL/es/__init__.py
deleted file mode 100644
index 2affc05..0000000
--- a/experiments/RL/diffRL/es/__init__.py
+++ /dev/null
@@ -1,2 +0,0 @@
-from .cmaes import CMAES
-from .pepg import PEPG
\ No newline at end of file
diff --git a/experiments/RL/diffRL/es/cmaes.py b/experiments/RL/diffRL/es/cmaes.py
deleted file mode 100644
index b4915a3..0000000
--- a/experiments/RL/diffRL/es/cmaes.py
+++ /dev/null
@@ -1,115 +0,0 @@
-import torch
-import numpy as np
-from torch import Tensor
-from . import utils
-
-
-class CMAES:
-    """
-    Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
-    """
-
-    def __init__(self, num_params,
-                 sigma_init=1.0,
-                 popsize=255,
-                 weight_decay=0.01,
-                 reg='l2',
-                 x0=None,
-                 inopts=None
-                 ):
-        """Constructs a CMA-ES solver, based on Hansen's `cma` module.
-
-        :param num_params: number of model parameters.
-        :param sigma_init: initial standard deviation.
-        :param popsize: population size.
-        :param weight_decay: weight decay coefficient.
-        :param reg: Choice between 'l2' or 'l1' norm for weight decay regularization.
-        :param inopts: dict-like CMAOptions, forwarded to the cma.CMAEvolutionStrategy constructor.
-        :param x0: (Optional) either (i) a single or (ii) several initial guesses for a good solution,
-                   defaults to None (initialize via `np.zeros(num_params)`).
-                   In case (i), the population is seeded with x0.
-                   In case (ii), the population is seeded with mean(x0, axis=0) and x0 is subsequently injected.
-        """
-
-        self.popsize = popsize
-
-        inopts = inopts or {}
-        inopts['popsize'] = self.popsize
-
-        self.num_params = num_params
-        self.sigma_init = sigma_init
-        self.weight_decay = weight_decay
-        self.reg = reg
-        self.solutions = None
-        self.fitness = None
-
-        # HANDLE INITIAL SOLUTIONS
-        inject_solutions = None
-        if x0 is None:
-            x0 = np.zeros(self.num_params)
-
-        elif isinstance(x0, np.ndarray):
-            x0 = np.atleast_2d(x0)
-            inject_solutions = x0
-            x0 = np.mean(x0, axis=0)
-
-        # INITIALIZE
-        import cma
-        self.cma = cma.CMAEvolutionStrategy(x0, self.sigma_init, inopts)
-
-        if inject_solutions is not None:
-            if len(inject_solutions) == self.popsize:
-                self.flush(inject_solutions)
-            else:
-                self.inject(inject_solutions)  # INJECT POTENTIALLY PROVIDED SOLUTIONS
-
-    def inject(self, solutions=None):
-        if solutions is not None:
-            self.cma.inject(solutions, force=True)
-
-    def flush(self, solutions):
-        self.cma.ary = solutions
-        self.solutions = solutions
-
-    def rms_stdev(self):
-        sigma = self.cma.result[6]
-        return np.mean(np.sqrt(sigma * sigma))
-
-    def ask(self):
-        '''returns a list of parameters'''
-        self.solutions = np.array(self.cma.ask())
-        return torch.tensor(self.solutions)
-
-    def tell(self, reward_table_result):
-        if not isinstance(reward_table_result, Tensor):
-            reward_table = torch.tensor(reward_table_result)
-        else:
-            reward_table = reward_table_result.clone()
-
-        if self.weight_decay > 0:
-            reg = utils.compute_weight_decay(self.weight_decay, self.solutions, reg=self.reg)
-            reward_table += reg
-
-        try:
-            reward_table = reward_table.numpy()
-        except TypeError:  # e.g. a CUDA tensor, which must be moved to the CPU first
-            reward_table = reward_table.cpu().numpy()
-
-        self.cma.tell(self.solutions, (-reward_table).tolist())  # convert minimizer to maximizer.
-
-        fitness_argsort = np.argsort(reward_table)[::-1]  # sort in descending order
-        self.fitness = reward_table[fitness_argsort]
-        self.solutions = self.solutions[fitness_argsort]
-
-    def current_param(self):
-        return self.cma.result[5]  # mean solution, presumably better with noise
-
-    def set_mu(self, mu):
-        pass
-
-    def best_param(self):
-        return self.cma.result[0]  # best evaluated solution
-
-    def result(self):  # return best params so far, along with historically best reward, curr reward, sigma
-        r = self.cma.result
-        return r[0], -r[1], -r[1], r[6]
\ No newline at end of file
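Note on the removed `CMAES.tell` above: the wrapper receives rewards (higher is better), passes their negation to the underlying minimizer, and then sorts its stored solutions by descending reward. The bookkeeping can be sketched with plain NumPy as follows. The arrays are made-up values for illustration, and no `cma` call is involved.

```python
import numpy as np

rewards = np.array([0.2, 1.5, -0.3])
solutions = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])

# a minimizer expects costs, so rewards are negated before being reported
costs = (-rewards).tolist()

# indices that sort rewards from best to worst
order = np.argsort(rewards)[::-1]
ranked_rewards = rewards[order]
ranked_solutions = solutions[order]
```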
diff --git a/experiments/RL/diffRL/es/pepg.py b/experiments/RL/diffRL/es/pepg.py
deleted file mode 100644
index 96b2d4a..0000000
--- a/experiments/RL/diffRL/es/pepg.py
+++ /dev/null
@@ -1,212 +0,0 @@
-import torch
-import numpy as np
-from torch import Tensor
-from . import utils
-
-
-class PEPG:
-    '''
-    Extension of PEPG with bells and whistles.
-    '''
-    def __init__(self, num_params,
-                 sigma_init=1.0,
-                 sigma_alpha=0.20,
-                 sigma_decay=0.999,
-                 sigma_limit=0.01,
-                 sigma_max_change=0.2,
-                 learning_rate=0.01,
-                 learning_rate_decay=0.9999,
-                 learning_rate_limit=0.01,
-                 elite_ratio=0,
-                 popsize=256,
-                 average_baseline=True,
-                 weight_decay=0.01,
-                 reg='l2',
-                 rank_fitness=True,
-                 forget_best=True,
-                 x0=None,
-                 ):
-        """ Constructs a `PEPG` solver instance.
-
-        :param num_params: number of model parameters.
-        :param sigma_init: initial standard deviation.
-        :param sigma_alpha: learning rate for standard deviation.
-        :param sigma_decay: anneal standard deviation.
-        :param sigma_limit: stop annealing if less than this.
-        :param sigma_max_change: clips adaptive sigma to 20%.
-        :param learning_rate: learning rate for the mean (step size of the optimizer).
-        :param learning_rate_decay: annealing the learning rate.
-        :param learning_rate_limit: stop annealing learning rate.
-        :param elite_ratio: if > 0, then ignore learning_rate.
-        :param popsize: population size.
-        :param average_baseline: set baseline to average of batch.
-        :param weight_decay: weight decay coefficient.
-        :param reg: Choice between 'l2' or 'l1' norm for weight decay regularization.
-        :param rank_fitness: use rank rather than fitness numbers.
-        :param forget_best: don't keep the historical best solution.
-        :param x0: initial guess for a good solution, defaults to None (initialize via np.zeros(num_params)).
-        """
-
-        self.num_params = num_params
-        self.sigma_init = sigma_init
-        self.sigma_alpha = sigma_alpha
-        self.sigma_decay = sigma_decay
-        self.sigma_limit = sigma_limit
-        self.sigma_max_change = sigma_max_change
-        self.learning_rate = learning_rate
-        self.learning_rate_decay = learning_rate_decay
-        self.learning_rate_limit = learning_rate_limit
-        self.popsize = popsize
-        self.average_baseline = average_baseline
-        if self.average_baseline:
-            assert (self.popsize % 2 == 0), "Population size must be even"
-            self.batch_size = int(self.popsize / 2)
-        else:
-            assert (self.popsize & 1), "Population size must be odd"
-            self.batch_size = int((self.popsize - 1) / 2)
-
-        # option to use greedy es method to select next mu, rather than using drift param
-        self.elite_ratio = elite_ratio
-        self.elite_popsize = int(self.popsize * self.elite_ratio)
-        self.use_elite = False
-        if self.elite_popsize > 0:
-            self.use_elite = True
-
-        self.forget_best = forget_best
-        self.batch_reward = np.zeros(self.batch_size * 2)
-
-        # BH: ADDING option to start from prior solution
-        self.mu = np.zeros(self.num_params) if x0 is None else np.asarray(x0)  # np.zeros(self.num_params)
-        self.best_mu = np.copy(self.mu[0])  # np.zeros(self.num_params)
-        self.curr_best_mu = np.copy(self.mu[0])  # np.zeros(self.num_params)
-
-        self.sigma = np.ones(self.num_params) * self.sigma_init
-        self.best_reward = 0
-        self.first_interation = True
-        self.weight_decay = weight_decay
-        self.reg = reg
-        self.rank_fitness = rank_fitness
-        if self.rank_fitness:
-            self.forget_best = True  # always forget the best one if we rank
-        # choose optimizer
-        self.optimizer = utils.Adam(mu=self.best_mu, num_params=num_params, stepsize=learning_rate)
-
-    def rms_stdev(self):
-        sigma = self.sigma
-        return np.mean(np.sqrt(sigma * sigma))
-
-    def ask(self):
-        '''returns a list of parameters'''
-        # antithetic sampling
-        self.epsilon = np.random.randn(self.batch_size, self.num_params) * self.sigma.reshape(1, self.num_params)
-        self.epsilon_full = np.concatenate([self.epsilon, - self.epsilon])
-        if self.average_baseline:
-            epsilon = self.epsilon_full
-        else:
-            # first population is mu, then positive epsilon, then negative epsilon
-            epsilon = np.concatenate([np.zeros((1, self.num_params)), self.epsilon_full])
-        solutions = self.mu.reshape(1, self.num_params) + epsilon
-        self.solutions = solutions
-        return solutions
-
-    def tell(self, reward_table_result):
-        # input must be a numpy float array
-        assert (len(reward_table_result) == self.popsize), "Inconsistent reward_table size reported."
-
-        reward_table = np.array(reward_table_result)
-
-        if self.rank_fitness:
-            reward_table = utils.compute_centered_ranks(reward_table)
-
-        if self.weight_decay > 0:
-            reg = utils.compute_weight_decay(self.weight_decay, self.solutions, reg=self.reg)
-            reward_table += reg
-
-        reward_offset = 1
-        if self.average_baseline:
-            b = np.mean(reward_table)
-            reward_offset = 0
-        else:
-            b = reward_table[0]  # baseline
-
-        reward = reward_table[reward_offset:]
-        if self.use_elite:
-            idx = np.argsort(reward)[::-1][0:self.elite_popsize]
-        else:
-            idx = np.argsort(reward)[::-1]
-
-        best_reward = reward[idx[0]]
-        if (best_reward > b or self.average_baseline):
-            best_mu = self.mu + self.epsilon_full[idx[0]]
-            best_reward = reward[idx[0]]
-        else:
-            best_mu = self.mu
-            best_reward = b
-
-        self.curr_best_reward = best_reward
-        self.curr_best_mu = best_mu
-
-        if self.first_interation:
-            self.sigma = np.ones(self.num_params) * self.sigma_init
-            self.first_interation = False
-            self.best_reward = self.curr_best_reward
-            self.best_mu = best_mu
-        else:
-            if self.forget_best or (self.curr_best_reward > self.best_reward):
-                self.best_mu = best_mu
-                self.best_reward = self.curr_best_reward
-
-        # short hand
-        epsilon = self.epsilon
-        sigma = self.sigma
-
-        # update the mean
-
-        # move mean to the average of the best idx means
-        if self.use_elite:
-            self.mu += self.epsilon_full[idx].mean(axis=0)
-        else:
-            rT = (reward[:self.batch_size] - reward[self.batch_size:])
-            change_mu = np.dot(rT, epsilon)
-            self.optimizer.stepsize = self.learning_rate
-            update_ratio = self.optimizer.update(-change_mu)  # adam, rmsprop, momentum, etc.
-            # self.mu += (change_mu * self.learning_rate) # normal SGD method
-
-        # adaptive sigma
-        # normalization
-        if (self.sigma_alpha > 0):
-            stdev_reward = 1.0
-            if not self.rank_fitness:
-                stdev_reward = reward.std()
-            S = ((epsilon * epsilon - (sigma * sigma).reshape(1, self.num_params)) / sigma.reshape(1, self.num_params))
-            reward_avg = (reward[:self.batch_size] + reward[self.batch_size:]) / 2.0
-            rS = reward_avg - b
-            delta_sigma = (np.dot(rS, S)) / (2 * self.batch_size * stdev_reward)
-
-            # adjust sigma according to the adaptive sigma calculation
-            # for stability, don't let sigma move by more than sigma_max_change (20% by default) of its current value
-            change_sigma = self.sigma_alpha * delta_sigma
-            change_sigma = np.minimum(change_sigma, self.sigma_max_change * self.sigma)
-            change_sigma = np.maximum(change_sigma, - self.sigma_max_change * self.sigma)
-            self.sigma += change_sigma
-
-        if (self.sigma_decay < 1):
-            self.sigma[self.sigma > self.sigma_limit] *= self.sigma_decay
-
-        if (self.learning_rate_decay < 1 and self.learning_rate > self.learning_rate_limit):
-            self.learning_rate *= self.learning_rate_decay
-
-    def flush(self, solutions):
-        self.solutions = solutions
-
-    def current_param(self):
-        return self.curr_best_mu
-
-    def set_mu(self, mu):
-        self.mu = np.array(mu)
-
-    def best_param(self):
-        return self.best_mu
-
-    def result(self):  # return best params so far, along with historically best reward, curr reward, sigma
-        return (self.best_mu, self.best_reward, self.curr_best_reward, self.sigma)
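Note on the removed `PEPG.ask` and `PEPG.tell` above: the population is sampled antithetically (each perturbation is paired with its mirror image), and the mean is moved along the reward differences of each pair projected onto the perturbations. A self-contained NumPy sketch of that idea is below. The helper names and the quadratic reward are assumptions for illustration, and the Adam step and sigma adaptation of the original are left out.

```python
import numpy as np

def antithetic_population(mu, sigma, batch_size, rng):
    """Sample mirrored perturbations around mu (average-baseline variant)."""
    eps = rng.normal(size=(batch_size, mu.size)) * sigma
    eps_full = np.concatenate([eps, -eps])
    return mu + eps_full, eps

def mean_update_direction(rewards, eps):
    """Reward difference of each mirrored pair, projected onto its perturbation."""
    batch_size = eps.shape[0]
    r_diff = rewards[:batch_size] - rewards[batch_size:]
    return r_diff @ eps

rng = np.random.default_rng(0)
mu = np.zeros(3)
sigma = np.ones(3)
solutions, eps = antithetic_population(mu, sigma, batch_size=4, rng=rng)

# hypothetical reward: negative squared distance to a target point
target = np.array([1.0, -2.0, 0.5])
rewards = -np.sum((solutions - target) ** 2, axis=1)
direction = mean_update_direction(rewards, eps)
```

For this quadratic reward the difference within each pair is `4 * (eps_i @ target)`, so the resulting direction has a positive component along `target - mu`.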
diff --git a/experiments/RL/diffRL/es/utils.py b/experiments/RL/diffRL/es/utils.py
deleted file mode 100644
index 2840b11..0000000
--- a/experiments/RL/diffRL/es/utils.py
+++ /dev/null
@@ -1,205 +0,0 @@
-import torch
-import numpy as np
-from functools import partial
-
-
-def tensor_to_numpy(t: torch.Tensor):
-    t = t.detach()
-    try:
-        return t.numpy()
-    except RuntimeError:  # grad
-        return t.detach().numpy()
-    except TypeError:  # gpu
-        return t.cpu().numpy()
-
-
-class Optimizer(object):
-    def __init__(self, mu, num_params, epsilon=1e-08):
-        self.mu = mu
-        self.dim = num_params
-        self.epsilon = epsilon
-        self.t = 0
-
-    def update(self, globalg):
-        self.t += 1
-        step = self._compute_step(globalg)
-        theta = self.mu
-        ratio = np.linalg.norm(step) / (np.linalg.norm(theta) + self.epsilon)
-        self.mu = theta + step
-        return ratio
-
-    def _compute_step(self, globalg):
-        raise NotImplementedError
-
-
-class SGD(Optimizer):
-    def __init__(self, mu, num_params, stepsize, momentum=0.9, epsilon=1e-08):
-        Optimizer.__init__(self, mu, num_params, epsilon=epsilon)
-        self.v = np.zeros(self.dim, dtype=np.float32)
-        self.stepsize, self.momentum = stepsize, momentum
-
-    def _compute_step(self, globalg):
-        self.v = self.momentum * self.v + (1. - self.momentum) * globalg
-        step = -self.stepsize * self.v
-        return step
-
-
-class Adam(Optimizer):
-    def __init__(self, mu, num_params, stepsize, beta1=0.99, beta2=0.999, epsilon=1e-08):
-        Optimizer.__init__(self, mu, num_params, epsilon=epsilon)
-        self.stepsize = stepsize
-        self.beta1 = beta1
-        self.beta2 = beta2
-        self.m = np.zeros(self.dim, dtype=np.float32)
-        self.v = np.zeros(self.dim, dtype=np.float32)
-
-    def _compute_step(self, globalg):
-        a = self.stepsize * np.sqrt(1 - self.beta2 ** self.t) / (1 - self.beta1 ** self.t)
-        self.m = self.beta1 * self.m + (1 - self.beta1) * globalg
-        self.v = self.beta2 * self.v + (1 - self.beta2) * (globalg * globalg)
-        step = -a * self.m / (np.sqrt(self.v) + self.epsilon)
-        return step
-
-
-def compute_ranks(x):
-    """
-    Returns ranks in [0, len(x))
-    Note: This is different from scipy.stats.rankdata, which returns ranks in [1, len(x)].
-    (https://github.com/openai/evolution-strategies-starter/blob/master/es_distributed/es.py)
-    """
-    assert x.ndim == 1
-    ranks = np.empty(len(x), dtype=int)
-    ranks[x.argsort()] = np.arange(len(x))
-    return ranks
-
-
-def compute_centered_ranks(x):
-    """
-    https://github.com/openai/evolution-strategies-starter/blob/master/es_distributed/es.py
-    """
-    y = compute_ranks(x.ravel()).reshape(x.shape).astype(np.float32)
-    y /= (x.size - 1)
-    y -= .5
-    return y
-
-
-def compute_weight_decay(weight_decay, model_param_list, reg='l2'):
-    if isinstance(model_param_list, torch.Tensor):
-        mean = partial(torch.mean, dim=1)
-    else:
-        mean = partial(np.mean, axis=1)
-
-    if reg == 'l1':
-        return - weight_decay * mean(torch.abs(model_param_list))
-
-    return - weight_decay * mean(model_param_list * model_param_list)
-
-
-class ScheduledSelectionPressure:
-    """ Scheduled Selection Pressure. """
-    def __init__(self, selection_pressure, num_steps, rate, mu, offset=1.):
-        """ Initialize the ScheduledSelectionPressure.
-
-        :param selection_pressure: float, final selection pressure value
-        :param num_steps: int, number of steps for the scheduling
-        :param rate: float, rate of the sigmoid function
-        :param mu: float, center of the sigmoid function
-        """
-        self.selection_pressure = selection_pressure
-        self.offset = offset
-        self.mu = mu
-        self.num_steps = num_steps
-        self.rate = rate
-
-        self.current_step = 0
-
-    def reset(self):
-        self.current_step = 0
-
-    @property
-    def scaling_factor(self):
-        """ return sigmoid scaling factor based on current step and total steps """
-        # alpha = self.current_step / self.num_steps
-        x_adjusted = (self.current_step - self.mu) / self.num_steps
-        return 1 / (1 + np.exp(-x_adjusted * self.rate))
-
-    def get_value(self):
-        value = (self.selection_pressure - self.offset) * self.scaling_factor + self.offset
-        self.current_step += 1
-        return value
-
-    # override multiplication with numpy array
-    def __mul__(self, other):
-        return self.get_value() * other
-
-    # override right-side multiplication with numpy array
-    def __rmul__(self, other):
-        return self.get_value() * other
-
-    # NOTE: __lmul__ is not a Python special method, so the * operator never invokes it
-    def __lmul__(self, other):
-        return self.get_value() * other
-
-
-def roulette_wheel(f, s=3., eps=1e-12, assume_sorted=False, normalize=False):
-    """ Roulette wheel fitness transformation.
-
-    We transform the fitness values f to probabilities p by applying the roulette wheel fitness transformation.
-    The roulette wheel fitness transformation is a monotonic transformation that maps the fitness values to
-    probabilities. The selection pressure s controls the degree of selection. The higher the selection pressure,
-    the more the probabilities are concentrated on the best solutions (s can be positive or negative).
-
-    :param f: torch.Tensor of shape (popsize,), fitness values of the sampled solutions
-    :param s: float, selection pressure
-    :param eps: float, epsilon to avoid division by zero
-    :param assume_sorted: bool, whether to disable sorting of the fitness values and assume that they are already sorted
-    :param normalize: bool, whether to normalize the probabilities to sum to 1 (default False, i.e., the sum over
-                      the returned scaled probabilities is equal to the sum over the fitness absolute values)
-    :return: torch.Tensor of shape (popsize,), transformed fitness values in the original order of f
-    """
-    if not isinstance(f, (torch.Tensor, np.ndarray)):
-        f = torch.tensor(f)
-
-    if isinstance(f, torch.Tensor):
-        exp = torch.exp
-        indices = torch.arange(len(f))
-    else:
-        exp = np.exp
-        indices = np.arange(len(f))
-
-    if not assume_sorted:
-        # sort fitness in ascending order
-        if isinstance(f, torch.Tensor):
-            asc = torch.argsort(f.flatten(), descending=False, dim=0)
-            where = torch.where
-        else:  # numpy
-            asc = f.flatten().argsort()
-            where = np.where
-
-        indices = where(asc[None, :] == indices[:, None])[1]  # original order
-        f = f[asc]
-
-    if isinstance(f, torch.Tensor):
-        total_weight = torch.abs(f).sum()
-    else:
-        total_weight = np.abs(f).sum()
-
-    fs = (f - f.min()) / (f.max() - f.min() + eps)  # normalize the (already sorted) fitness values to [0, 1]
-    fs = exp(s*fs)  # apply selection pressure, s can be positive or negative
-
-    if isinstance(f, torch.Tensor):
-        fs = fs.cumsum(dim=0)  # compute cumulative sum
-    else:
-        fs = np.cumsum(fs)  # compute cumulative sum
-
-    fs /= fs.sum()
-    if not normalize:
-        fs *= total_weight
-    return fs[indices]
-
-
-def parameter_crowding(parameters, weight=1., sharpness=1., similarity_metric="euclidean"):
-    from sklearn.metrics.pairwise import pairwise_distances
-    parameter_similarity_matrix = pairwise_distances(parameters.reshape(len(parameters), -1), metric=similarity_metric)
-    loss = np.exp(-parameter_similarity_matrix * sharpness)
-    return loss.mean(axis=-1) * weight
diff --git a/experiments/RL/diffRL/experiments.py b/experiments/RL/diffRL/experiments.py
deleted file mode 100644
index 4588719..0000000
--- a/experiments/RL/diffRL/experiments.py
+++ /dev/null
@@ -1,130 +0,0 @@
-from .models import ControllerMLP, DiscreteController, ContinuousController
-import torch
-import numpy as np
-import gym
-from tqdm import tqdm
-from .es import CMAES
-from diffevo import LatentBayesianGenerator, RandomProjection, DDIMSchedulerCosine, BayesianGenerator
-from .utils import normalize_observation
-
-
-def compute_rewards(dim_in, dim_out, dim_hidden, param, env_name, n_hidden_layers=1, controller_type="discrete", factor=1):
-    env = gym.make(env_name, render_mode='rgb_array')
-
-    seed = np.random.randint(0, np.iinfo(np.int32).max)
-    observation, info = env.reset(seed=seed)
-
-    model = ControllerMLP.from_parameter(dim_in, dim_out, dim_hidden, param, n_hidden_layers=n_hidden_layers)
-
-    if controller_type == "discrete":
-        controller = DiscreteController(model, env.action_space)
-    elif controller_type == "continuous":
-        controller = ContinuousController(model, env.action_space, factor=factor)
-
-    total_reward = 0
-    observations = []
-    ending = {'terminated': False, 'truncated': False}
-
-    for i in range(500):
-        action = controller(torch.from_numpy(normalize_observation(observation, env.observation_space)).float())
-        observation, reward, terminated, truncated, info = env.step(action)
-        observations.append(observation)
-
-        total_reward += reward
-
-        if terminated or truncated:
-            ending['terminated'] = terminated
-            ending['truncated'] = truncated
-            break
-
-    env.close()
-    observations = torch.from_numpy(np.stack(observations)).float()
-    return total_reward, observations, ending
-
-def compute_rewards_list(dim_in, dim_out, dim_hidden, params, env_name, n_hidden_layers=1, controller_type="discrete", factor=1):
-    rewards = []
-    observations = []
-    endings = []
-    for p in params:
-        reward, obs, ending = compute_rewards(dim_in, dim_out, dim_hidden, p, env_name, n_hidden_layers=n_hidden_layers, controller_type=controller_type, factor=factor)
-        rewards.append(reward)
-        observations.append(obs)
-        endings.append(ending)
-    return torch.Tensor(rewards), observations, endings
-
-def calculate_dim(dim_in, dim_out, dim_hidden, n_hidden_layers):
-    # calculate the total dimension of the controller
-    return (dim_in + 1) * dim_hidden + (dim_hidden + 1) * dim_hidden * (n_hidden_layers-1) + (dim_hidden + 1) * dim_out
-
-def experiment(num_step, T=1, population_size=512, latent_dim=None, scaling=0.1, noise=1, dim_in=4, dim_out=2, dim_hidden=8, n_hidden_layers=1, weight_decay=0, env_name="CartPole-v1", controller_type="discrete", factor=1):
-
-    # print all arguments
-    print(f"num_step: {num_step}, T: {T}, population_size: {population_size}, latent_dim: {latent_dim}, scaling: {scaling}, noise: {noise}, dim_in: {dim_in}, dim_out: {dim_out}, dim_hidden: {dim_hidden}, n_hidden_layers: {n_hidden_layers}, weight_decay: {weight_decay}, env_name: {env_name}, controller_type: {controller_type}, factor: {factor}")
-
-    scheduler = DDIMSchedulerCosine(num_step=num_step)
-
-    dim = calculate_dim(dim_in, dim_out, dim_hidden, n_hidden_layers)
-    x = torch.randn(population_size, dim)
-
-    reward_history = []
-    population_history = [x * scaling]
-    x0_population = [x * scaling]
-    observations = []
-
-    if latent_dim is not None:
-        random_map = RandomProjection(dim, latent_dim, normalize=True)
-
-    for t, alpha in tqdm(scheduler, total=scheduler.num_step-1):
-        rewards, obs, endings = compute_rewards_list(dim_in, dim_out, dim_hidden, x * scaling, env_name, n_hidden_layers=n_hidden_layers, controller_type=controller_type, factor=factor)
-        l2 = torch.norm(population_history[-1], dim=-1) ** 2
-        fitness = torch.exp((rewards - rewards.max()) / T - l2 * weight_decay)
-
-        reward_history.append(rewards)
-
-        if latent_dim is not None:
-            generator = LatentBayesianGenerator(x, random_map(x).detach(), fitness, alpha)
-        else:
-            generator = BayesianGenerator(x, fitness, alpha)
-
-        x, x0 = generator(noise=noise, return_x0=True)
-        population_history.append(x * scaling)
-        x0_population.append(x0 * scaling)
-        observations.append(obs)
-    
-    rewards, obs, endings = compute_rewards_list(dim_in, dim_out, dim_hidden, x * scaling, env_name, n_hidden_layers=n_hidden_layers, controller_type=controller_type, factor=factor)
-    reward_history.append(rewards)
-    observations.append(obs)
-
-    reward_history = torch.stack(reward_history)
-    population_history = torch.stack(population_history)
-    x0_population = torch.stack(x0_population)
-
-    if latent_dim is not None:
-        return x, reward_history, population_history, x0_population, observations, random_map, endings
-    else:
-        return x, reward_history, population_history, x0_population, observations, None, endings
-
-def experiment_cmaes(num_step, T=1, population_size=512, latent_dim=None, scaling=0.1, noise=1, sigma_init=1, dim_in=4, dim_out=2, dim_hidden=8, n_hidden_layers=1, weight_decay=0, env_name="CartPole-v1", controller_type="discrete", factor=1):
-
-    dim = calculate_dim(dim_in, dim_out, dim_hidden, n_hidden_layers)
-    es = CMAES(num_params=dim, popsize=population_size, weight_decay=weight_decay, sigma_init=sigma_init, inopts={'seed': np.nan, 'CMA_elitist': 2})
-
-    population_history = []
-    reward_history = []
-    observations = []
-
-    for _ in tqdm(range(num_step)):
-        x = es.ask()
-        population_history.append(x * scaling)
-
-        rewards, obs, endings = compute_rewards_list(dim_in, dim_out, dim_hidden, x * scaling, env_name, n_hidden_layers=n_hidden_layers, controller_type=controller_type, factor=factor)
-        fitness = rewards
-
-        es.tell(fitness)
-        reward_history.append(rewards)
-        observations.append(obs)
-
-    population_history = torch.from_numpy(np.stack(population_history)).float()
-    reward_history = torch.stack(reward_history)
-
-    return x, reward_history, population_history, None, observations, None, endings
\ No newline at end of file
diff --git a/experiments/RL/diffRL/models.py b/experiments/RL/diffRL/models.py
deleted file mode 100644
index 94fc5eb..0000000
--- a/experiments/RL/diffRL/models.py
+++ /dev/null
@@ -1,64 +0,0 @@
-import torch
-import torch.nn as nn
-
-
-class ControllerMLP(nn.Module):
-    def __init__(self, dim_in, dim_out, n_hidden, n_hidden_layers=1):
-        super().__init__()
-        hidden_layers = []
-        for _ in range(n_hidden_layers-1):
-            hidden_layers.append(nn.Linear(n_hidden, n_hidden))
-            hidden_layers.append(nn.ReLU())
-
-        self.mlp = nn.Sequential(
-            nn.Linear(dim_in, n_hidden),
-            nn.ReLU(),
-            *hidden_layers,
-            nn.Linear(n_hidden, dim_out)
-        )
-    
-    def forward(self, x):
-        return self.mlp(x)
-    
-    def __len__(self):
-        # return the number of parameters
-        return sum(p.numel() for p in self.parameters())
-    
-    def fill(self, params):
-        # fill the parameters with a flat tensor
-        if len(params) != len(self):
-            raise ValueError(f"The number of parameters does not match, expected {len(self)} but got {len(params)}")
-
-        for p in self.parameters():
-            n = p.numel()
-            p.data.copy_(params[:n].view_as(p))
-            params = params[n:]
-    
-    @classmethod
-    def from_parameter(cls, dim_in, dim_out, n_hidden, params, n_hidden_layers=1):
-        # create a new instance and fill it with the given parameters
-        instance = cls(dim_in, dim_out, n_hidden, n_hidden_layers=n_hidden_layers)
-        instance.fill(params)
-        return instance
-
-
-class DiscreteController:
-    def __init__(self, model, action_space):
-        self.model = model
-        self.action_space = action_space
-    
-    def __call__(self, x):
-        with torch.no_grad():
-            logits = self.model(x)
-            return torch.argmax(logits).item()
-
-class ContinuousController:
-    def __init__(self, model, action_space, factor=1):
-        self.model = model
-        self.action_space = action_space
-        self.factor = factor
-    
-    def __call__(self, x):
-        with torch.no_grad():
-            result = torch.tanh(self.model(x)).reshape(-1).numpy() * self.factor
-            return result
\ No newline at end of file
diff --git a/experiments/RL/diffRL/plots.py b/experiments/RL/diffRL/plots.py
deleted file mode 100644
index 7e42c9e..0000000
--- a/experiments/RL/diffRL/plots.py
+++ /dev/null
@@ -1,54 +0,0 @@
-import torch
-import gym
-import matplotlib
-import matplotlib.pyplot as plt
-import numpy as np
-from .models import ControllerMLP, DiscreteController, ContinuousController
-from .utils import normalize_observation
-
-matplotlib.rcParams['mathtext.fontset'] = 'stix'
-matplotlib.rcParams['font.family'] = 'STIXGeneral'
-
-def make_plot(reward_history):
-    plt.plot(reward_history.median(dim=-1).values, label="median", color='#46B3D5')
-    plt.fill_between(
-        range(reward_history.size(0)),
-        reward_history.quantile(0.1, dim=-1),
-        reward_history.quantile(0.9, dim=-1),
-        alpha=0.3, label=r"10%-90% quantile", color='#46B3D5')
-    plt.xlabel("generation")
-    plt.ylabel("rewards")
-    plt.legend()
-
-def make_video(folder, para, controller_type="discrete", env_name="CartPole-v1", dim_in=4, dim_out=2, dim_hidden=8, n_hidden_layers=1, factor=1):
-    env = gym.make(env_name, render_mode="rgb_array")
-    env = gym.wrappers.RecordVideo(env=env, video_folder=folder, name_prefix="test-video", episode_trigger=lambda x: x % 2 == 0)
-
-    model = ControllerMLP.from_parameter(dim_in, dim_out, dim_hidden, para, n_hidden_layers=n_hidden_layers)
-    if controller_type == "discrete":
-        controller = DiscreteController(model, env.action_space)
-    elif controller_type == "continuous":
-        controller = ContinuousController(model, env.action_space, factor=factor)
-    
-    seed = np.random.randint(0, np.iinfo(np.int32).max)
-    observation, info = env.reset(seed=seed)
-    rewards = []
-    infos = []
-
-    # Start the recorder
-    env.start_video_recorder()
-
-    for i in range(500):
-        action = controller(torch.from_numpy(normalize_observation(observation, env.observation_space)).float())
-        observation, reward, terminated, truncated, info = env.step(action)
-        rewards.append(reward)
-
-        if len(info) > 0:
-            infos.append(info)
-
-        if terminated or truncated:
-            observation, info = env.reset()
-            break
-
-    env.close_video_recorder()
-    env.close()
\ No newline at end of file
diff --git a/experiments/RL/diffRL/utils.py b/experiments/RL/diffRL/utils.py
deleted file mode 100644
index 23e10b1..0000000
--- a/experiments/RL/diffRL/utils.py
+++ /dev/null
@@ -1,10 +0,0 @@
-import numpy as np
-
-def normalize_observation(observation, observation_space, extreme_threshold=1e3):
-    # Replace inf/-inf with threshold values
-    low = np.where(observation_space.low < -extreme_threshold, -1, observation_space.low)
-    high = np.where(observation_space.high > extreme_threshold, 1, observation_space.high)
-    
-    # Normalize to [-1, 1] range
-    rescaled = 2 * (observation - low) / (high - low) - 1
-    return rescaled * np.sqrt(3) # scale to unit variance
\ No newline at end of file
diff --git a/experiments/RL/figures/cartpole.png b/experiments/RL/figures/cartpole.png
deleted file mode 100644
index fafee5c..0000000
Binary files a/experiments/RL/figures/cartpole.png and /dev/null differ
diff --git a/experiments/RL/fitness.md b/experiments/RL/fitness.md
deleted file mode 100644
index 1e618c6..0000000
--- a/experiments/RL/fitness.md
+++ /dev/null
@@ -1,40 +0,0 @@
-
-## Scaling = 0.1
-
-|                          | DiffEvoRaw        | DiffEvoLatent     | DiffEvoLargeLatent   | CMAES             |
-|:-------------------------|:------------------|:------------------|:---------------------|:------------------|
-| Acrobot-v1               | -279.49 (187.24)  | -148.98 (115.02)  | -157.55 (116.94)     | -486.89 (56.69)   |
-| CartPole-v1              | 447.38 (121.18)   | 447.02 (124.34)   | 407.55 (154.89)      | 32.64 (71.81)     |
-| MountainCar-v0           | -163.21 (39.39)   | -139.65 (35.63)   | -138.14 (37.58)      | -199.18 (7.29)    |
-| MountainCarContinuous-v0 | -0.89 (1.19)      | -0.02 (0.05)      | -0.01 (0.05)         | -0.14 (0.21)      |
-| Pendulum-v1              | -1230.81 (337.57) | -1223.76 (346.99) | -1227.12 (340.66)    | -1257.21 (302.57) |
-
-## Scaling = 1.0
-
-|                          | DiffEvoRaw        | DiffEvoLatent     | DiffEvoLargeLatent   | CMAES             |
-|:-------------------------|:------------------|:------------------|:---------------------|:------------------|
-| Acrobot-v1               | -199.86 (160.28)  | -127.03 (93.06)   | -147.55 (107.21)     | -471.03 (81.48)   |
-| CartPole-v1              | 482.90 (73.24)    | 491.61 (50.19)    | 445.03 (124.20)      | 77.67 (127.25)    |
-| MountainCar-v0           | -134.65 (34.79)   | -130.57 (33.28)   | -134.80 (37.49)      | -194.68 (18.30)   |
-| MountainCarContinuous-v0 | 55.97 (47.91)     | 78.59 (39.05)     | 88.58 (21.66)        | 33.94 (63.75)     |
-| Pendulum-v1              | -1261.96 (330.36) | -1186.61 (408.49) | -1094.40 (532.60)    | -1397.09 (217.38) |
-
-## Scaling = 10.0
-
-|                          | DiffEvoRaw        | DiffEvoLatent     | DiffEvoLargeLatent   | CMAES             |
-|:-------------------------|:------------------|:------------------|:---------------------|:------------------|
-| Acrobot-v1               | -191.80 (156.28)  | -121.02 (76.98)   | -149.77 (105.04)     | -469.18 (83.56)   |
-| CartPole-v1              | 478.71 (78.58)    | 488.64 (59.73)    | 428.89 (142.70)      | 79.47 (130.25)    |
-| MountainCar-v0           | -134.22 (34.93)   | -129.53 (32.78)   | -133.92 (36.60)      | -194.85 (17.89)   |
-| MountainCarContinuous-v0 | 79.44 (37.46)     | 91.66 (11.30)     | 83.41 (33.82)        | 10.90 (68.52)     |
-| Pendulum-v1              | -1131.67 (488.44) | -1076.78 (520.49) | -1100.99 (519.50)    | -1368.21 (246.34) |
-
-## Scaling = 100.0
-
-|                          | DiffEvoRaw        | DiffEvoLatent     | DiffEvoLargeLatent   | CMAES             |
-|:-------------------------|:------------------|:------------------|:---------------------|:------------------|
-| Acrobot-v1               | -190.72 (156.28)  | -120.63 (76.73)   | -151.72 (110.32)     | -469.58 (83.27)   |
-| CartPole-v1              | 477.87 (82.10)    | 489.47 (57.86)    | 443.12 (128.48)      | 77.72 (127.89)    |
-| MountainCar-v0           | -133.58 (34.99)   | -130.62 (33.74)   | -134.58 (37.45)      | -194.64 (18.22)   |
-| MountainCarContinuous-v0 | 78.56 (39.49)     | 90.67 (16.02)     | 82.38 (35.57)        | 12.97 (69.27)     |
-| Pendulum-v1              | -1118.88 (494.36) | -1066.00 (535.60) | -1102.28 (521.71)    | -1367.11 (243.03) |
diff --git a/experiments/RL/mountain_car.sh b/experiments/RL/mountain_car.sh
deleted file mode 100644
index 738f5bb..0000000
--- a/experiments/RL/mountain_car.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-# python run.py --exp_name DiffEvoLatent --method diff_evo --env_name MountainCar-v0 --latent_dim 2 --dim_in 2 --dim_out 3 --num_experiment 1 --controller_type discrete --T 1 --scaling 100
-python run.py --exp_name DiffEvoLargeLatent --method diff_evo --env_name MountainCar-v0 --latent_dim 2 --dim_in 2 --dim_out 3 --dim_hidden 128 --n_hidden_layers 2 --num_experiment 1 --controller_type discrete --T 1 --scaling 10
-# python run.py --exp_name DiffEvoRaw --method diff_evo --env_name MountainCar-v0 --num_experiment 1 --dim_in 2 --dim_out 3 --controller_type discrete --T 1 --scaling 10
-# python run.py --exp_name CMAES --method cmaes --env_name MountainCar-v0 --num_experiment 1 --controller_type discrete --dim_in 2 --dim_out 3 --T 1 --scaling 10
\ No newline at end of file
diff --git a/experiments/RL/mountain_car_continuous.sh b/experiments/RL/mountain_car_continuous.sh
deleted file mode 100644
index 0473180..0000000
--- a/experiments/RL/mountain_car_continuous.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-python run.py --exp_name DiffEvoLatent --method diff_evo --env_name MountainCarContinuous-v0 --latent_dim 2 --dim_in 2 --dim_out 1 --num_experiment 1 --controller_type continuous --T 1 --scaling 10
-# python run.py --exp_name DiffEvoRaw --method diff_evo --env_name MountainCarContinuous-v0 --num_experiment 1 --dim_in 2 --dim_out 1 --controller_type continuous --T 1 --scaling 10
-# python run.py --exp_name DiffEvoLargeLatent --method diff_evo --env_name MountainCarContinuous-v0 --latent_dim 2 --dim_in 2 --dim_out 1 --dim_hidden 128 --n_hidden_layers 2 --num_experiment 1 --controller_type continuous --T 1 --scaling 10
-# python run.py --exp_name CMAES --method cmaes --env_name MountainCarContinuous-v0 --num_experiment 1 --controller_type continuous --dim_in 2 --dim_out 1 --T 1 --scaling 10
\ No newline at end of file
diff --git a/experiments/RL/pendulum.sh b/experiments/RL/pendulum.sh
deleted file mode 100644
index ae01aaf..0000000
--- a/experiments/RL/pendulum.sh
+++ /dev/null
@@ -1,3 +0,0 @@
-python run.py --exp_name DiffEvoLatent --method diff_evo --env_name Pendulum-v1 --latent_dim 2 --dim_in 3 --dim_out 1 --num_experiment 1 --controller_type continuous --factor 2 --scaling 10 --T 1
-# python run.py --exp_name DiffEvoRaw --method diff_evo --env_name Pendulum-v1 --num_experiment 1 --dim_in 3 --dim_out 1 --controller_type continuous --factor 2 --scaling 1 --T 10
-# python run.py --exp_name CMAES --method cmaes --env_name Pendulum-v1 --num_experiment 1 --controller_type continuous --dim_in 3 --dim_out 1 --factor 2 --scaling 1 --T 10
\ No newline at end of file
diff --git a/experiments/RL/results/.gitignore b/experiments/RL/results/.gitignore
deleted file mode 100644
index f567d34..0000000
--- a/experiments/RL/results/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-/more_experiments
\ No newline at end of file
diff --git a/experiments/RL/run.py b/experiments/RL/run.py
deleted file mode 100644
index 6567642..0000000
--- a/experiments/RL/run.py
+++ /dev/null
@@ -1,123 +0,0 @@
-from diffRL import experiment, make_plot, make_video, experiment_cmaes
-import torch
-import numpy as np
-import matplotlib.pyplot as plt
-import os
-import argparse
-import json
-import gym
-
-
-def save_experiment_data(folder, population, x0_population, observations, random_map, reward_history, controller_params):
-
-    # Save experiment data
-    torch.save(population[-1].clone(), f"{folder}/population.pt") # only save the last step
-    if x0_population is not None:
-        torch.save(x0_population[-1].clone(), f"{folder}/x0_population.pt")
-    torch.save(observations, f"{folder}/observations.pt") # [num_step, population_size, (t_last, dim_in)]
-    if random_map is not None:
-        torch.save(random_map.state_dict(), f"{folder}/random_map.pt")
-
-    # Generate and save plots
-    make_plot(reward_history)
-    plt.savefig(f"{folder}/fitness.png")
-    plt.savefig(f"{folder}/fitness.pdf")
-    plt.close()
-
-    # Generate video with best parameters
-    best_para = population[-1][reward_history[-1].argmax().item()]
-    make_video(
-        folder,
-        best_para, 
-        controller_type=controller_params["controller_type"],
-        env_name=controller_params["env_name"],
-        dim_in=controller_params["dim_in"],
-        dim_out=controller_params["dim_out"],
-        dim_hidden=controller_params["dim_hidden"],
-        n_hidden_layers=controller_params["n_hidden_layers"],
-        factor=controller_params["factor"]
-    )
-
-if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='RL experiment runner')
-    parser.add_argument('--method', type=str, default='diff_evo', choices=['diff_evo', 'cmaes'],
-                      help='Training method to use')
-    parser.add_argument('--env_name', type=str, default='CartPole-v1',
-                      help='Environment name')
-    parser.add_argument('--latent_dim', type=int, default=None,
-                      help='Dimension of latent space')
-    parser.add_argument('--dim_in', type=int, default=4,
-                      help='Input dimension')
-    parser.add_argument('--dim_out', type=int, default=2,
-                      help='Output dimension')
-    parser.add_argument('--dim_hidden', type=int, default=8,
-                      help='Hidden layer dimension')
-    parser.add_argument('--n_hidden_layers', type=int, default=1,
-                      help='Number of hidden layers')
-    parser.add_argument('--factor', type=float, default=1.0,
-                      help='Factor parameter')
-    parser.add_argument('--controller_type', type=str, default='discrete',
-                      help='Type of controller, discrete or continuous')
-    parser.add_argument('--T', type=float, default=10,
-                      help='Temperature')
-    parser.add_argument('--scaling', type=float, default=100,
-                      help='Scaling factor')
-    parser.add_argument('--num_experiment', type=int, default=1,
-                      help='Number of experiments')
-    # required arguments
-    parser.add_argument('--exp_name', type=str, required=True,
-                      help='Experiment name')
-
-    args = parser.parse_args()
-
-    # print the arguments
-    for arg in vars(args):
-        print(f"{arg}: {getattr(args, arg)}")
-
-    # set seed
-    torch.manual_seed(42)
-    np.random.seed(42)
-
-    all_reward_history = []
-    all_endings = []
-
-
-    controller_params = {
-        "dim_in": args.dim_in,
-        "dim_out": args.dim_out,
-        "dim_hidden": args.dim_hidden,
-        "n_hidden_layers": args.n_hidden_layers,
-        "factor": args.factor,
-        "controller_type": args.controller_type,
-        "env_name": args.env_name,
-    }
-
-    if args.method == "diff_evo":
-        experiment_func = experiment
-    elif args.method == "cmaes":
-        experiment_func = experiment_cmaes
-
-    folder = f'./results/{str(args.scaling)}/{args.env_name}/{args.exp_name}'
-    os.makedirs(folder, exist_ok=True)
-
-    for i in range(args.num_experiment):
-        x, reward_history, population, x0_population, observations, random_map, endings = experiment_func(
-            num_step=10, 
-            population_size=256, 
-            T=args.T, 
-            scaling=args.scaling, 
-            latent_dim=args.latent_dim,
-            noise=1,
-            **controller_params
-        )
-        
-        all_reward_history.append(reward_history)
-        all_endings.append(endings)
-        if i == 0:
-            save_experiment_data(folder, population, x0_population, observations, random_map, reward_history, controller_params)
-    # save the data
-    torch.save(all_reward_history, f"{folder}/reward_history.pt")
-    torch.save(all_endings, f"{folder}/endings.pt")
-    # save all the arguments
-    with open(f"{folder}/args.json", "w") as f:
-        json.dump(vars(args), f, indent=4)
\ No newline at end of file
diff --git a/experiments/RL/success_rate.md b/experiments/RL/success_rate.md
deleted file mode 100644
index 61dd150..0000000
--- a/experiments/RL/success_rate.md
+++ /dev/null
@@ -1,36 +0,0 @@
-
-## Scaling = 0.1
-
-|                          |   DiffEvoRaw |   DiffEvoLatent |   DiffEvoLargeLatent |   CMAES |
-|:-------------------------|-------------:|----------------:|---------------------:|--------:|
-| Acrobot-v1               |         59.3 |            91.6 |                 91.1 |     6.2 |
-| CartPole-v1              |         80   |            80.8 |                 68.7 |     1.5 |
-| MountainCar-v0           |         53.8 |            86.6 |                 88.6 |     1.6 |
-| MountainCarContinuous-v0 |          0   |             0   |                  0   |     0   |
-
-## Scaling = 1.0
-
-|                          |   DiffEvoRaw |   DiffEvoLatent |   DiffEvoLargeLatent |   CMAES |
-|:-------------------------|-------------:|----------------:|---------------------:|--------:|
-| Acrobot-v1               |         79.1 |            95.2 |                 93   |    13.5 |
-| CartPole-v1              |         92.5 |            96.4 |                 79.4 |     5.9 |
-| MountainCar-v0           |         92   |            97.1 |                 91.9 |    10.2 |
-| MountainCarContinuous-v0 |         59.4 |            80.7 |                 96.9 |    56.7 |
-
-## Scaling = 10.0
-
-|                          |   DiffEvoRaw |   DiffEvoLatent |   DiffEvoLargeLatent |   CMAES |
-|:-------------------------|-------------:|----------------:|---------------------:|--------:|
-| Acrobot-v1               |         80.9 |            97.3 |                 93.6 |    14.4 |
-| CartPole-v1              |         90.4 |            95.6 |                 76.4 |     6.4 |
-| MountainCar-v0           |         93.1 |            98.2 |                 93.7 |    10.2 |
-| MountainCarContinuous-v0 |         92   |            99.3 |                 94   |    43.5 |
-
-## Scaling = 100.0
-
-|                          |   DiffEvoRaw |   DiffEvoLatent |   DiffEvoLargeLatent |   CMAES |
-|:-------------------------|-------------:|----------------:|---------------------:|--------:|
-| Acrobot-v1               |         81   |            97.2 |                 92.6 |    14.1 |
-| CartPole-v1              |         90.9 |            95.8 |                 80   |     6   |
-| MountainCar-v0           |         93.5 |            97.1 |                 92.1 |    10.5 |
-| MountainCarContinuous-v0 |         91.4 |            98.7 |                 93.3 |    45.3 |
diff --git a/experiments/RL/visualization.py b/experiments/RL/visualization.py
deleted file mode 100644
index 6384b47..0000000
--- a/experiments/RL/visualization.py
+++ /dev/null
@@ -1,246 +0,0 @@
-import torch
-import numpy as np
-import matplotlib.pyplot as plt
-from matplotlib.colors import LinearSegmentedColormap
-from mpl_toolkits.axes_grid1.inset_locator import inset_axes
-import matplotlib
-from matplotlib.lines import Line2D
-import matplotlib.gridspec as gridspec
-from diffevo import RandomProjection
-from matplotlib.ticker import LogLocator, LogFormatter
-
-matplotlib.rcParams['mathtext.fontset'] = 'stix'
-matplotlib.rcParams['font.family'] = 'STIXGeneral'
-
-# change default font size
-matplotlib.rcParams['font.size'] = 12
-matplotlib.rcParams['legend.fontsize'] = 10
-
-# colors = ["#C8C7C7", "#E93A01"]
-colors = ['#efefef', '#f6d8cd', '#f9c0ab', '#faa98b', '#f8906b', '#f5774c', '#f05c2c', '#e93a01']
-custom_cmap = LinearSegmentedColormap.from_list("custom_cmap", colors)
-
-
-def cartpole_plot(angles, positions, box_size=1, y0_shift=0, pole_length=1, cart_size=0.1, ang_scale=2.4, max_alpha=1, decay=10, color='black'):
-    total_time = len(angles)
-    x0 = torch.linspace(0, total_time * box_size, total_time)
-    x0 = x0 + positions
-    y0 = x0 * 0 + y0_shift
-
-    x1 = x0 + torch.sin(angles * ang_scale) * pole_length
-    y1 = y0 + torch.cos(angles * ang_scale) * pole_length
-
-    alpha = 1
-    i = len(x0) - 1
-    plt.arrow(x0[i], y0[i], x1[i] - x0[i], y1[i] - y0[i], head_width=0.0, head_length=0.0, alpha=alpha * max_alpha, color=color)
-    # add a line to represent the cart
-    plt.plot([x0[i]-cart_size,x0[i]+cart_size], [y0[i], y0[i]], color=color, alpha=alpha * max_alpha)
-
-def plot_cartpole(observations, generations, rewards, ax=None, box_size=1.5, box_size_y=2, dt=25, color_bar=False):
-    if ax is None:
-        ax = plt.gca()
-
-    for g, t in enumerate(generations):
-        ax.axhline(g * 2, color='black', alpha=0.05)
-        for i in range(len(observations[0])):
-            c = custom_cmap(rewards[t][i] / 500)
-            ang = observations[t][i][::dt, 2]
-            pos = observations[t][i][::dt, 0] / 4.8
-            cartpole_plot(ang, pos, max_alpha=0.5, decay=2, y0_shift=g * box_size_y, box_size=box_size, color=c)
-
-    x = np.arange(0, 501, 50)
-    x_corr = x / dt * box_size
-    ax.set_xticks(x_corr, x)
-    ax.set_yticks(np.arange(0, len(generations) * box_size_y, box_size_y), generations+1)
-    ax.set_ylim(-0.5, None)
-    ax.set_xlabel('time steps')
-    ax.set_ylabel('generation')
-
-    if color_bar:
-        # Adding the horizontal color bar inside the plot using inset_axes
-        cbar_ax = inset_axes(ax, width="20%", height="3%", loc='lower right',
-                            bbox_to_anchor=(0.05, 0.15, 0.9, 0.95),
-                            bbox_transform=ax.transAxes, borderpad=0)
-        
-        # Correctly referencing the figure associated with ax
-        cbar = ax.figure.colorbar(plt.cm.ScalarMappable(cmap=custom_cmap), cax=cbar_ax, orientation='horizontal')
-
-        # Remove color bar ticks
-        cbar.ax.set_xticks([])
-        cbar.ax.set_yticks([])
-
-        # Set color bar label
-        cbar.set_label('reward')
-
-def prepare_reward(rewards):
-    # rewards.shape = [num_experiment, num_generation, num_population]
-    # merge each experiment into one for each generation
-    if isinstance(rewards, list):
-        rewards = torch.stack(rewards)
-
-    rewards = rewards.permute(1, 0, 2).reshape(rewards.shape[1], -1)
-    return rewards
-
-def range_plot(x, color=None, label=None):
-    print(f'{len(x)} experiments, (num_generation, num_population)={x[0].shape}')
-    x = prepare_reward(x)
-    center = x.quantile(0.5, dim=-1)
-    lower = x.quantile(0.25, dim=-1)
-    upper = x.quantile(0.75, dim=-1)
-    X = np.arange(len(center)) + 1
-    plt.plot(X, center, color=color, label=label)
-    plt.fill_between(X, lower, upper, alpha=0.25, color=color, edgecolor='none')
-
-
-def reward_compare_plot(*rewards, labels=None, colors=None, ax=None):
-    if ax is None:
-        ax = plt.gca()
-    for i, w in enumerate(rewards):
-        range_plot(w, color=colors[i] if colors else f'C{i}', label=labels[i] if labels else None)
-
-    ax.axhline(500, color='gray', linestyle='--',
-                label='max reward', alpha=0.5)
-    ax.legend(fontsize='small')
-    ax.set_ylim(5, 570)
-    # ax.set_xlim(None, len(rewards[0]))
-    ax.set_xlabel('generation')
-    ax.set_ylabel('reward')
-
-    # set x-axis ticks with 2, 4, 6, 8, 10
-    ax.set_xticks([2, 4, 6, 8, 10])
-    ax.set_xticklabels([f'{tick}' for tick in [2, 4, 6, 8, 10]])
-    
-    # set y-axis as reversed log scale
-    ax.set_yscale('log')
-    
-    major_ticks = [10, 100, 300, 500]
-    plt.yticks(major_ticks, [f'{tick}' for tick in major_ticks])
-
-    # Set the minor ticks locator and formatter
-    ax.yaxis.set_minor_locator(LogLocator(base=10.0, subs='auto', numticks=10))
-    ax.yaxis.set_minor_formatter(LogFormatter(base=10.0, labelOnlyBase=False))
-
-
-def latent_plot(z, ax, color=None, alpha=1, label=None, zorder=1):
-    ax.scatter(z[:, 0], z[:, 1], zorder=zorder, marker='o', color=color, alpha=alpha, label=label, edgecolors='none')
-
-def compare_latent_plot(pop, pop_raw, pop_cmaes, random_map, pop_large, random_map_large, ax=None):
-    if ax is None:
-        ax = plt.gca()
-    latent_plot(random_map(pop).detach(), ax, color='#E93A01', label='latent diffusion evolution', alpha=0.5)
-    latent_plot(random_map(pop_raw).detach(), ax, color='#46B3D5', label='DiffEvo', alpha=0.25)
-    latent_plot(random_map(pop_cmaes).detach(), ax, color='#6F6E6E', alpha=0.5, label='CMA-ES')
-    latent_plot(random_map_large(pop_large).detach(), ax, color='#F5851E', alpha=0.25, label='latent DiffEvo (high-d)')
-
-    # calculate the range of the data
-    x = torch.cat([random_map(pop).detach(), random_map(pop_raw).detach(), random_map_large(pop_large).detach()], dim=0) # not include cmaes
-    x_mean = x.mean(dim=0)
-    x_std = x.std(dim=0)
-    n = 3.0
-
-    ax.set_xlabel('$z_1$')
-    ax.set_ylabel('$z_2$')
-    ax.set_xlim(x_mean[0]-n*x_std[0], x_mean[0]+n*x_std[0])
-    ax.set_ylim(x_mean[1]-n*x_std[1], x_mean[1]+n*x_std[1])
-
-def draw_cartpole_demo(ax):
-    ax.axhline(y=0, color='gray', linestyle='--')
-    # ax.axvline(x=0, color='gray', linestyle='--')
-
-    def draw_cart_and_pole(ax, center=(0, 0), angle=0.25, alpha=1):
-        width=1
-        height=0.25
-        length=1
-        # add a black rectangle representing the cart
-        ax.add_patch(plt.Rectangle((center[0]-width/2, center[1]-height/2), width, height, color='black', alpha=alpha, linewidth=0))
-
-        # draw a pole with the given angle
-        ax.plot(
-            [center[0], center[0]+length*np.sin(angle)], 
-            [center[1], center[1]+length*np.cos(angle)], 
-            color='#CC9965',
-            linewidth=4,
-            alpha=alpha)
-
-    # Call the sub-function
-    draw_cart_and_pole(ax) # main plot
-    # add left and right arrows
-    ax.arrow(0.1, -0.25, 1, 0, head_width=0.1, head_length=0.1, color='black', zorder=10)
-    ax.arrow(-0.1, -0.25, -1, 0, head_width=0.1, head_length=0.1, color='black', zorder=10)
-    for i in range(10):
-        # random x and angle
-        x = np.random.uniform(-3, 3)
-        angle = np.random.uniform(-np.pi/4, np.pi/4)
-        draw_cart_and_pole(ax, center=(x, 0), angle=angle, alpha=0.1)
-    
-    # remove y-axis
-    ax.set_yticks([])
-    ax.set_xlabel('$x$')
-
-    ax.set_xlim(-3, 3)
-    ax.set_ylim(-0.5, 1.7)
-
-
-if __name__ == '__main__':
-    # set random seed
-    np.random.seed(0)
-    torch.manual_seed(0)
-
-    path = './results/1.0/CartPole-v1'
-    obs = torch.load(f'{path}/DiffEvoLatent/observations.pt') # [time, num_pop, [T, 4]]
-
-    pop_latent = torch.load(f'{path}/DiffEvoLatent/population.pt')
-    rewards_latent = torch.load(f'{path}/DiffEvoLatent/reward_history.pt')
-    random_map = RandomProjection(58, 2, normalize=True)
-    random_map.load_state_dict(torch.load(f'{path}/DiffEvoLatent/random_map.pt'))
-
-    rewards_raw = torch.load(f'{path}/DiffEvoRaw/reward_history.pt')
-    pop_raw = torch.load(f'{path}/DiffEvoRaw/population.pt')
-
-    rewards_cmaes = torch.load(f'{path}/CMAES/reward_history.pt')
-    pop_cmaes = torch.load(f'{path}/CMAES/population.pt')
-
-    rewards_large = torch.load(f'{path}/DiffEvoLargeLatent/reward_history.pt')
-    pop_large = torch.load(f'{path}/DiffEvoLargeLatent/population.pt')
-    random_map_large = RandomProjection(17410, 2, normalize=True)
-    random_map_large.load_state_dict(torch.load(f'{path}/DiffEvoLargeLatent/random_map.pt'))
-
-    # generations = np.array([1, 40, 70, 90, 100])-1
-    generations = np.array([2, 4, 6, 8, 10])-1
-
-    # Create a figure
-    fig = plt.figure(figsize=(10, 6))
-
-    # Create a GridSpec with 2 rows and 3 columns
-    gs = gridspec.GridSpec(2, 3, height_ratios=[1, 1])
-
-    # Top plot, merged across both columns
-    ax1 = fig.add_subplot(gs[0, :])
-    plot_cartpole(obs, generations, rewards_latent[0], ax=ax1)
-    ax1.set_title('(a) evolution process')
-
-    # Bottom left plot
-    ax2 = fig.add_subplot(gs[1, 0])
-    reward_compare_plot(rewards_raw, rewards_latent, rewards_large, rewards_cmaes,
-                        labels=['DiffEvo', 'latent DiffEvo', 'latent DiffEvo (high-d)', 'CMA-ES'],
-                        colors=['#46B3D5', '#E93A01', '#F5851E', '#6F6E6E'], ax=ax2)
-    ax2.set_title('(b) reward comparison')
-
-    # Bottom middle plot
-    ax3 = fig.add_subplot(gs[1, 1])
-    compare_latent_plot(pop_latent, pop_raw, pop_cmaes, random_map, pop_large, random_map_large, ax=ax3)
-    ax3.set_title('(c) latent space comparison')
-
-    # Bottom right plot
-    ax4 = fig.add_subplot(gs[1, 2])
-    ## a placeholder plot
-    draw_cartpole_demo(ax4)
-    ax4.set_title('(d) cart-pole system')
-
-    # add margin between the subplots
-    plt.tight_layout()
-
-    plt.savefig('./figures/cartpole.png', bbox_inches='tight')
-    # save as pdf with transparent background
-    plt.savefig('./figures/cartpole.pdf', bbox_inches='tight', transparent=True)
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/.gitignore b/experiments/benchmarks/.gitignore
deleted file mode 100644
index 70fec74..0000000
--- a/experiments/benchmarks/.gitignore
+++ /dev/null
@@ -1,3 +0,0 @@
-data/
-*.pdf
-*.ipynb
\ No newline at end of file
diff --git a/experiments/benchmarks/alpha.sh b/experiments/benchmarks/alpha.sh
deleted file mode 100644
index c94b868..0000000
--- a/experiments/benchmarks/alpha.sh
+++ /dev/null
@@ -1,3 +0,0 @@
-python run_alphas.py --scheduler DDIMSchedulerCosine --num_experiments 100
-python run_alphas.py --scheduler DDPMScheduler --num_experiments 100
-python run_alphas.py --scheduler DDIMScheduler --num_experiments 100
\ No newline at end of file
diff --git a/experiments/benchmarks/figures/alpha.png b/experiments/benchmarks/figures/alpha.png
deleted file mode 100644
index 7827365..0000000
Binary files a/experiments/benchmarks/figures/alpha.png and /dev/null differ
diff --git a/experiments/benchmarks/figures/temperature_boxplot.png b/experiments/benchmarks/figures/temperature_boxplot.png
deleted file mode 100644
index 8cebe2a..0000000
Binary files a/experiments/benchmarks/figures/temperature_boxplot.png and /dev/null differ
diff --git a/experiments/benchmarks/figures/temperature_combined.png b/experiments/benchmarks/figures/temperature_combined.png
deleted file mode 100644
index 9855ad6..0000000
Binary files a/experiments/benchmarks/figures/temperature_combined.png and /dev/null differ
diff --git a/experiments/benchmarks/figures/temperature_entropy.png b/experiments/benchmarks/figures/temperature_entropy.png
deleted file mode 100644
index 1fe9c07..0000000
Binary files a/experiments/benchmarks/figures/temperature_entropy.png and /dev/null differ
diff --git a/experiments/benchmarks/figures/temperature_qd_scores.png b/experiments/benchmarks/figures/temperature_qd_scores.png
deleted file mode 100644
index 0a6d3b9..0000000
Binary files a/experiments/benchmarks/figures/temperature_qd_scores.png and /dev/null differ
diff --git a/experiments/benchmarks/images/MAPElite.png b/experiments/benchmarks/images/MAPElite.png
deleted file mode 100644
index 4621cc7..0000000
Binary files a/experiments/benchmarks/images/MAPElite.png and /dev/null differ
diff --git a/experiments/benchmarks/images/OpenES.png b/experiments/benchmarks/images/OpenES.png
deleted file mode 100644
index f8bc70a..0000000
Binary files a/experiments/benchmarks/images/OpenES.png and /dev/null differ
diff --git a/experiments/benchmarks/images/PEPG.png b/experiments/benchmarks/images/PEPG.png
deleted file mode 100644
index 7cf67ab..0000000
Binary files a/experiments/benchmarks/images/PEPG.png and /dev/null differ
diff --git a/experiments/benchmarks/images/benchmark.png b/experiments/benchmarks/images/benchmark.png
deleted file mode 100644
index ca1c140..0000000
Binary files a/experiments/benchmarks/images/benchmark.png and /dev/null differ
diff --git a/experiments/benchmarks/images/cmaes.png b/experiments/benchmarks/images/cmaes.png
deleted file mode 100644
index e0cd732..0000000
Binary files a/experiments/benchmarks/images/cmaes.png and /dev/null differ
diff --git a/experiments/benchmarks/images/diff_evo.png b/experiments/benchmarks/images/diff_evo.png
deleted file mode 100644
index c44b80b..0000000
Binary files a/experiments/benchmarks/images/diff_evo.png and /dev/null differ
diff --git a/experiments/benchmarks/images/latent_diff_evo.png b/experiments/benchmarks/images/latent_diff_evo.png
deleted file mode 100644
index 4041061..0000000
Binary files a/experiments/benchmarks/images/latent_diff_evo.png and /dev/null differ
diff --git a/experiments/benchmarks/methods/__init__.py b/experiments/benchmarks/methods/__init__.py
deleted file mode 100644
index 85e5ab0..0000000
--- a/experiments/benchmarks/methods/__init__.py
+++ /dev/null
@@ -1,6 +0,0 @@
-from .cmaes import CMAES_benchmark
-from .diff_evo import DiffEvo_benchmark
-from .openes import OpenES_benchmark
-from .map_elite import MAPElite_benchmark
-from .diff_evo_latent import LatentDiffEvo_benchmark
-from .pepg import PEPG_benchmark
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/benchmarks.py b/experiments/benchmarks/methods/benchmarks.py
deleted file mode 100644
index fc3d2bb..0000000
--- a/experiments/benchmarks/methods/benchmarks.py
+++ /dev/null
@@ -1,148 +0,0 @@
-"""
-This file contains basic functions for the benchmarks.
-
-Main functions used in the benchmarks:
-
-1. plot_background(obj:str, ax=None, title=None): 
-   Plots the background for a given objective function.
-
-2. get_obj(obj_name:str):
-   Returns the objective function and its rescaled version.
-
-3. get_cmap(obj_name:str):
-   Returns the appropriate colormap for the given objective function.
-
-"""
-import matplotlib.pyplot as plt
-from foobench import Objective
-import torch
-from .color_plate import *
-
-
-# set fitness target and distance scale to unify the scale and slope of the fitness
-fitness_target = {
-    "rosenbrock": 0,
-    "beale": 0,
-    "himmelblau": 0,
-    "ackley": -12.5401,
-    "rastrigin": -64.6249, # x_i = 3.51786 max=0 for all dimensions
-    "rastrigin_4d": -129.2498,
-    "rastrigin_32d": -1033.9980,
-    "rastrigin_256d": -8271.9844
-}
-
-distance_scale = {
-    "rosenbrock": 287.51,
-    "beale": 20,
-    "himmelblau": 17.01,
-    "ackley": 2,
-    "rastrigin": 30,
-    "rastrigin_4d": 60,
-    "rastrigin_32d": 500,
-    "rastrigin_256d": 4000
-}
-
-max_distances = { # maximum distance to the target in given parameter range
-    "rosenbrock": 40009,
-    "beale": 72769.2,
-    "himmelblau": 308.803,
-    "ackley": 12.5401,
-    "rastrigin": 64.6249,
-    "rastrigin_4d": 129.2498,
-    "rastrigin_32d": 1033.9980,
-    "rastrigin_256d": 8271.9844
-}
-
-
-def visualize_2D(objective, ax=None, n_points=100, parameter_range=None, title=None, **imshow_kwargs):
-    # get a list of points in the parameter range
-    if parameter_range is None:
-        parameter_range = [[-4, 4], [-4, 4]]
-    xy_points = torch.meshgrid(*[torch.linspace(pr[0], pr[1], n_points) for pr in parameter_range])
-    xy_points = torch.stack(xy_points, dim=-1).reshape(-1, len(parameter_range))
-    Z = objective(xy_points)
-    Z = Z.reshape(*[n_points for _ in parameter_range])
-
-    if ax is None:
-        fig, ax = plt.subplots(1, 1)
-
-    im = ax.imshow(torch.log(Z.T+1e-3), extent=(*parameter_range[0], *reversed(parameter_range[1])), **imshow_kwargs)
-    ax.invert_yaxis()
-
-    ax.set_title(title)
-
-    return im
-
-def get_cmap(obj_name:str):
-    return custom_cmap
-
-def rescale_wrapper(obj, vmin=None, vmax=None, **kwargs):
-    if vmin is None or vmax is None:
-        return obj
-    def rescaled_obj(x):
-        # return (obj(x) - vmin) / (vmax - vmin)
-        return obj(x) - vmin
-    return rescaled_obj
-
-def inverse_wrapper(obj, eps=1e-2, p=2, **kwargs):
-    def inverse_obj(x):
-        return eps / (obj(x) ** p + eps)
-    return inverse_obj
-
-def objective_wrapper(obj, target=0, scale=1, eps=1e-3, p=2, **kwargs):
-    def wrapped_obj(x):
-        d = abs(obj(x) - target) / scale
-        return eps / (d ** p + eps)
-    return wrapped_obj
-
-def energy_wrapper(obj, temperature=1, target=0, scale=1, max_distance=None, **kwargs):
-    def wrapped_obj(x):
-        minimal_p = torch.exp(-torch.tensor(max_distance) / (temperature * scale))
-        p = torch.exp(-abs(obj(x) - target) / (temperature * scale))
-        return (p - minimal_p) / (1 - minimal_p)
-    return wrapped_obj
-
-def exp_wrapper(obj, temperature=1, **kwargs):
-    def wrapped_obj(x):
-        return torch.exp(obj(x) / temperature)
-    return wrapped_obj
-
-def _original_name(obj_name:str):
-    if obj_name in ["rastrigin_4d", "rastrigin_32d", "rastrigin_256d"]:
-        return "rastrigin"
-    return obj_name
-
-def get_obj(obj_name:str, eps=1e-2, target=None, scale=None, wrapper=None, **kwargs):
-    if obj_name in ["rosenbrock", "beale", "himmelblau"]: # zero as the target
-        obj = Objective(foo=obj_name, maximize=False, limit_val=100)
-    else: # high values as the target
-        obj = Objective(foo=_original_name(obj_name), maximize=True, limit_val=1e-9)
-    
-    if target is None:
-        target = fitness_target[obj_name]
-    if scale is None:
-        scale = distance_scale[obj_name]
-    
-    max_distance = max_distances[obj_name]
-    
-    if wrapper is None:
-        wrapper = energy_wrapper
-    return obj, wrapper(obj, target=target, scale=scale, eps=eps, max_distance=max_distance, **kwargs)
-
-def get_visualize_obj(obj):
-    return Objective(foo=obj.foo_name)
-
-def plot_background(obj, ax=None, title=None):
-    # obj = get_visualize_obj(obj)
-    # _, obj = get_obj(obj)
-    _, obj_rescaled = get_obj(obj.foo_name)
-    cmap = get_cmap(obj.foo_name)
-    visualize_2D(obj_rescaled, ax=ax, cmap=cmap, title=title)
-
-    if ax is not None:
-        # remove x, y label and ticks
-        ax.set_xlabel('')
-        ax.set_ylabel('')
-        ax.set_xticks([])
-        ax.set_yticks([])
-        ax.set_aspect('equal', adjustable='box')
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/cmaes.py b/experiments/benchmarks/methods/cmaes.py
deleted file mode 100644
index e21200c..0000000
--- a/experiments/benchmarks/methods/cmaes.py
+++ /dev/null
@@ -1,124 +0,0 @@
-from .es import CMAES
-from matplotlib.patches import Ellipse
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from .benchmarks import plot_background, get_obj
-from .color_plate import *
-
-
-def CMAES_experiment(obj, num_steps=10, sigma_init=1):
-    es = CMAES(num_params=2, popsize=512, sigma_init=sigma_init, weight_decay=1e-3, inopts={'seed': np.nan}) # ensure reproducibility
-    
-    populations = []
-    fitnesses = []
-    mu = [np.zeros(2)]
-    cor = [np.eye(2) * sigma_init ** 2]
-    for i in range(num_steps):
-        pop = es.ask()
-        populations.append(pop)
-        mu.append(es.cma.mean)
-        cor.append(es.cma.C)
-        fitness = obj(pop)
-        es.tell(fitness)
-        fitnesses.append(fitness)
-
-    mu = np.stack(mu)
-    cor = np.stack(cor)
-    populations = np.stack(populations)
-    fitnesses = np.stack(fitnesses)
-
-    populations = torch.from_numpy(populations).float()
-    fitnesses = torch.from_numpy(fitnesses).float()
-
-    return es, mu, cor, populations, fitnesses
-
-def CMAES_plot(obj, es, mu, cor, ax=None):
-    if ax is None:
-        fig, ax = plt.subplots()
-
-    plot_background(obj, ax=ax, title='')
-
-    plt.plot(mu[:, 0], mu[:, 1], '.-', color=traj_color, label='Mean', zorder=5)
-
-    population = es.ask()
-    plt.scatter(population[:, 0], population[:, 1], c=x0_color, marker='o', alpha=0.25, zorder=10, edgecolors='none')
-    ax.set_xlabel('')
-    ax.set_ylabel('')
-    ax.set_xticks([])
-    ax.set_yticks([])
-
-    for i, (m, c) in enumerate(zip(mu, cor)):
-        eigenvalues, eigenvectors = np.linalg.eigh(c)
-        angle = np.arctan2(eigenvectors[1, 0], eigenvectors[0, 0])
-        width, height = np.sqrt(eigenvalues) * 2
-
-        alpha = (i + 1) / len(cor)
-        ellipse = Ellipse(
-            xy=m,
-            width=width * 1,
-            height=height * 1,
-            angle=np.degrees(angle),
-            linewidth=2,
-            edgecolor=traj_color,
-            facecolor='none',
-            alpha=alpha ** 0.5,
-            label='Covariance Ellipsoid'
-        )
-        if i % 2 == 0:
-            ax.add_patch(ellipse)
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def prepare_data(obj, trace, arg, fitnesses):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "fitnesses": fitnesses
-    }
-    return info
-
-def CMAES_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, sigma_init=4, limit_val=100, plot=False, **kwargs):
-    arg = {
-        "num_step": num_steps,
-        "sigma_init": sigma_init,
-        "limit_val": limit_val
-    }
-
-    trace = []
-    record = dict()
-
-    for i, foo_name in enumerate(objs):
-        obj, obj_rescaled = get_obj(foo_name, **kwargs)
-
-        es, mu, cor, trace, fitnesses = CMAES_experiment(obj_rescaled, num_steps=num_steps, sigma_init=sigma_init)
-        record[foo_name] = prepare_data(obj, trace, arg, fitnesses)
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + row * total_col)
-            CMAES_plot(obj, es, mu, cor, ax=ax)
-            if i == 0:
-                ax.set_ylabel('CMAES')
-    
-    return record
-
-
-if __name__ == '__main__':
-    import random
-
-    # set random seed for reproducibility
-    torch.manual_seed(0)
-    np.random.seed(0)
-    random.seed(0)
-
-    objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-
-    plt.figure(figsize=(12, 3))
-
-    record = CMAES_benchmark(objs, num_steps=10, row=0, total_row=1, limit_val=100, plot=True, sigma_init=4)
-    torch.save(record, './data/cmaes.pt')
-
-    # remove xy ticks and labels
-    plt.setp(plt.gcf().get_axes(), xticks=[], yticks=[], xlabel='', ylabel='')
-    plt.tight_layout()
-    plt.savefig('./images/cmaes.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/color_plate.py b/experiments/benchmarks/methods/color_plate.py
deleted file mode 100644
index f05f9e6..0000000
--- a/experiments/benchmarks/methods/color_plate.py
+++ /dev/null
@@ -1,11 +0,0 @@
-from matplotlib.colors import LinearSegmentedColormap
-
-traj_color = '#6F6E6E'
-x0_color = '#E93A01'
-
-# background color
-colors = ["#F9F9F9", "#7BCFEA"]
-custom_cmap = LinearSegmentedColormap.from_list("custom_cmap", colors)
-
-
-__all__ = ["x0_color", "traj_color", "custom_cmap"]
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/diff_evo.py b/experiments/benchmarks/methods/diff_evo.py
deleted file mode 100644
index ee2b048..0000000
--- a/experiments/benchmarks/methods/diff_evo.py
+++ /dev/null
@@ -1,119 +0,0 @@
-import matplotlib.pyplot as plt
-import torch
-from tqdm import tqdm
-from diffevo import DDIMScheduler, BayesianGenerator, DDIMSchedulerCosine, DDPMScheduler
-from .benchmarks import plot_background, get_obj
-from .color_plate import *
-
-
-
-def experiment(obj, num_pop=256, num_step=100, scaling=4.0, temperatures=None, disable_bar=False, dim=2, scheduler=None):
-    
-    if scheduler is None:
-        scheduler = DDIMSchedulerCosine(num_step=num_step)
-    else:
-        scheduler = scheduler(num_step=num_step)
-
-    x = torch.randn(num_pop, dim)
-
-    trace = []
-    x0_trace = []
-    fitnesses = []
-    x0_fitness = []
-
-    for t, alpha in tqdm(scheduler, total=num_step-1, disable=disable_bar):
-        fitness = obj(x * scaling)
-        fitnesses.append(fitness)
-        generator = BayesianGenerator(x, fitness, alpha, density='uniform')
-        x, x0 = generator(noise=0.1, return_x0=True)
-        x0_fit = obj(x0 * scaling)
-        x0_fitness.append(x0_fit)
-        trace.append(x.clone() * scaling)
-        x0_trace.append(x0.clone() * scaling)
-    fitness = obj(x * scaling)
-    fitnesses.append(fitness)
-    x0_fitness.append(x0_fit)
-    
-    pop = x * scaling
-    trace = torch.stack(trace)
-    x0_trace = torch.stack(x0_trace)
-    fitnesses = torch.stack(fitnesses)
-    x0_fitness = torch.stack(x0_fitness)
-    return pop, trace, x0_trace, fitnesses, x0_fitness
-
-def make_plot(obj, pop, ax=None, traj=None, x0_trace=None, num_trace=64, title=None):
-    plot_background(obj, ax=ax, title=title)
-
-    x0 = x0_trace[-1]
-    plt.scatter(x0[:, 0], x0[:, 1], c=x0_color, marker='o', alpha=0.1, zorder=10, edgecolors='none')
-
-    if traj is not None:
-        t = traj[:, :num_trace]
-        plt.plot(t[:, :, 0], t[:, :, 1], c=traj_color, alpha=0.25, zorder=5)
-
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness, benchmark_fitness=None):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "x0_trace": x0_trace,
-        "fitnesses": fitnesses,
-        "x0_fitness": x0_fitness,
-        "benchmark_fitness": benchmark_fitness
-    }
-    return info
-
-def DiffEvo_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, num_pop=256, scaling=4.0, plot=False, disable_bar=False, benchmark_temperature=1.0, dim=2, scheduler=None, **kwargs):
-    arg = {
-        "limit_val": 100,
-        "num_pop": num_pop,
-        "num_step": num_steps,
-        "scaling": scaling,
-    }
-
-    shift = row * total_col
-
-    record = dict()
-
-    for i, name in enumerate(objs):
-        obj, obj_rescaled = get_obj(name, **kwargs)
-        pop, trace, x0_trace, fitnesses, x0_fitness = experiment(
-            obj_rescaled, 
-            num_pop=num_pop, 
-            num_step=num_steps, 
-            scaling=scaling, 
-            disable_bar=disable_bar,
-            dim=dim,
-            scheduler=scheduler
-            )
-        
-        _, obj_benchmark = get_obj(name, temperature=benchmark_temperature)
-        benchmark_fitness = obj_benchmark(pop)
-
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + shift)
-            plot_name = name[0].upper() + name[1:]
-            make_plot(obj, pop, ax=ax, traj=trace, x0_trace=x0_trace, title=plot_name)
-            if i == 0:
-                ax.set_ylabel('DiffEvo')
-
-        arg['limit_val'] = obj.limit_val
-        record[name] = prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness, benchmark_fitness)
-    
-    return record
-
-if __name__ == '__main__':
-    # set random seed for reproducibility
-    torch.manual_seed(42)
-
-    obj_names = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-    
-    plt.figure(figsize=(12, 3))
-    record = DiffEvo_benchmark(obj_names, num_steps=100, row=0, total_row=1, num_pop=512, scaling=4, plot=True)
-    torch.save(record, './data/diff_evo.pt')
-
-    plt.tight_layout()
-    plt.savefig('./images/diff_evo.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/diff_evo_highd.py b/experiments/benchmarks/methods/diff_evo_highd.py
deleted file mode 100644
index 7fa27de..0000000
--- a/experiments/benchmarks/methods/diff_evo_highd.py
+++ /dev/null
@@ -1,124 +0,0 @@
-import matplotlib.pyplot as plt
-import torch
-from tqdm import tqdm
-from diffevo import DDIMScheduler, BayesianGenerator, DDIMSchedulerCosine, DDPMScheduler, RandomProjection, LatentBayesianGenerator
-from .benchmarks import plot_background, get_obj
-from .color_plate import *
-
-
-def energy_prob_mapping(fitness, temperature):
-    energy = fitness
-    power = -energy / temperature
-    power = power - power.max() # avoid overflow without changing the results
-    p = torch.exp(power)
-    return p
-
-def experiment(obj, num_pop=256, num_step=100, scaling=4.0, latent=True, temperatures=None, disable_bar=False, noise=0.1, dim=2):
-    
-    scheduler = DDIMSchedulerCosine(num_step=num_step)
-    if latent:
-        random_map = RandomProjection(dim, 2, normalize=True)
-
-    x = torch.randn(num_pop, dim)
-
-    trace = []
-    x0_trace = []
-    fitnesses = []
-    x0_fitness = []
-
-    for t, alpha in tqdm(scheduler, total=num_step-1, disable=disable_bar):
-        fitness = obj(x * scaling)
-        fitnesses.append(fitness)
-        if latent:
-            generator = LatentBayesianGenerator(x, random_map(x).detach(), fitness, alpha, density='uniform')
-        else:
-            generator = BayesianGenerator(x, fitness, alpha, density='uniform')
-        x, x0 = generator(noise=noise, return_x0=True)
-        x0_fit = obj(x0 * scaling)
-        x0_fitness.append(x0_fit)
-        trace.append(x.clone() * scaling)
-        x0_trace.append(x0.clone() * scaling)
-    fitness = obj(x * scaling)
-    fitnesses.append(fitness)
-    x0_fitness.append(x0_fit)
-    
-    pop = x * scaling
-    trace = torch.stack(trace)
-    x0_trace = torch.stack(x0_trace)
-    fitnesses = torch.stack(fitnesses)
-    x0_fitness = torch.stack(x0_fitness)
-    return pop, trace, x0_trace, fitnesses, x0_fitness
-
-def make_plot(obj, pop, ax=None, traj=None, x0_trace=None, num_trace=64, title=None):
-    plot_background(obj, ax=ax, title=title)
-
-    x0 = x0_trace[-1]
-    plt.scatter(x0[:, 0], x0[:, 1], c=x0_color, marker='o', alpha=0.1, zorder=10, edgecolors='none')
-
-    if traj is not None:
-        t = traj[:, :num_trace]
-        plt.plot(t[:, :, 0], t[:, :, 1], c=traj_color, alpha=0.25, zorder=5)
-
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "x0_trace": x0_trace,
-        "fitnesses": fitnesses,
-        "x0_fitness": x0_fitness
-    }
-    return info
-
-def DiffEvo_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, num_pop=256, scaling=4.0, plot=False, disable_bar=False, dim=2, eps=1e-3, latent=True, wrapper=None, noise=0.1, **kwargs):
-    arg = {
-        "limit_val": 100,
-        "num_pop": num_pop,
-        "num_step": num_steps,
-        "scaling": scaling,
-    }
-
-    shift = row * total_col
-
-    record = dict()
-
-    for i, name in enumerate(objs):
-        obj, obj_rescaled = get_obj(name, **kwargs)
-        pop, trace, x0_trace, fitnesses, x0_fitness = experiment(
-            obj_rescaled, 
-            num_pop=num_pop, 
-            num_step=num_steps, 
-            scaling=scaling, 
-            disable_bar=disable_bar,
-            dim=dim,
-            noise=noise,
-            latent=latent
-            )
-
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + shift)
-            plot_name = name[0].upper() + name[1:]
-            make_plot(obj, pop, ax=ax, traj=trace, x0_trace=x0_trace, title=plot_name)
-            if i == 0:
-                ax.set_ylabel('DiffEvo')
-
-        arg['limit_val'] = obj.limit_val
-        record[obj.foo_name] = prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness)
-    
-    return record
-
-if __name__ == '__main__':
-    # set random seed for reproducibility
-    torch.manual_seed(42)
-
-    obj_names = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-    
-    plt.figure(figsize=(12, 3))
-    record = DiffEvo_benchmark(obj_names, num_steps=100, row=0, total_row=1, num_pop=512, scaling=4, plot=True)
-    torch.save(record, './data/diff_evo.pt')
-
-    plt.tight_layout()
-    plt.savefig('./images/diff_evo.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/diff_evo_latent.py b/experiments/benchmarks/methods/diff_evo_latent.py
deleted file mode 100644
index 49f315d..0000000
--- a/experiments/benchmarks/methods/diff_evo_latent.py
+++ /dev/null
@@ -1,119 +0,0 @@
-import matplotlib.pyplot as plt
-import torch
-from tqdm import tqdm
-from diffevo import DDIMScheduler, BayesianGenerator, DDIMSchedulerCosine, DDPMScheduler, RandomProjection, LatentBayesianGenerator
-from .benchmarks import plot_background, get_obj
-from .color_plate import *
-
-
-
-def experiment(obj, num_pop=256, num_step=100, scaling=4.0, temperatures=None, disable_bar=False, dim=2):
-    
-    scheduler = DDIMSchedulerCosine(num_step=num_step)
-
-    x = torch.randn(num_pop, dim)
-    random_map = RandomProjection(dim, 2, normalize=True)
-
-    trace = []
-    x0_trace = []
-    fitnesses = []
-    x0_fitness = []
-
-    for t, alpha in tqdm(scheduler, total=num_step-1, disable=disable_bar):
-        fitness = obj(x * scaling)
-        fitnesses.append(fitness)
-        generator = LatentBayesianGenerator(x, random_map(x).detach(), fitness, alpha, density='uniform')
-        x, x0 = generator(noise=0.1, return_x0=True)
-        x0_fit = obj(x0 * scaling)
-        x0_fitness.append(x0_fit)
-        trace.append(x.clone() * scaling)
-        x0_trace.append(x0.clone() * scaling)
-    fitness = obj(x * scaling)
-    fitnesses.append(fitness)
-    x0_fitness.append(x0_fit)
-    
-    pop = x * scaling
-    trace = torch.stack(trace)
-    x0_trace = torch.stack(x0_trace)
-    fitnesses = torch.stack(fitnesses)
-    x0_fitness = torch.stack(x0_fitness)
-    return pop, trace, x0_trace, fitnesses, x0_fitness
-
-def make_plot(obj, pop, ax=None, traj=None, x0_trace=None, num_trace=64, title=None):
-    plot_background(obj, ax=ax, title=title)
-
-    x0 = x0_trace[-1]
-    plt.scatter(x0[:, 0], x0[:, 1], c=x0_color, marker='o', alpha=0.1, zorder=10, edgecolors='none')
-
-    if traj is not None:
-        t = traj[:, :num_trace]
-        plt.plot(t[:, :, 0], t[:, :, 1], c=traj_color, alpha=0.25, zorder=5)
-
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "x0_trace": x0_trace,
-        "fitnesses": fitnesses,
-        "x0_fitness": x0_fitness
-    }
-    return info
-
-def LatentDiffEvo_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, num_pop=256, scaling=4.0, plot=False, disable_bar=False, dim=2, **kwargs):
-    arg = {
-        "limit_val": 100,
-        "num_pop": num_pop,
-        "num_step": num_steps,
-        "scaling": scaling,
-    }
-
-    shift = row * total_col
-
-    record = dict()
-
-    for i, name in enumerate(objs):
-        obj, obj_rescaled = get_obj(name, **kwargs)
-        # if name has _4d, _32d, _256d, set dim to 4, 32, 256
-        if '_4d' in name:
-            dim = 4
-        elif '_32d' in name:
-            dim = 32
-        elif '_256d' in name:
-            dim = 256
-        pop, trace, x0_trace, fitnesses, x0_fitness = experiment(
-            obj_rescaled, 
-            num_pop=num_pop, 
-            num_step=num_steps, 
-            scaling=scaling, 
-            disable_bar=disable_bar,
-            dim=dim
-            )
-
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + shift)
-            plot_name = name[0].upper() + name[1:]
-            make_plot(obj, pop, ax=ax, traj=trace, x0_trace=x0_trace, title=plot_name)
-            if i == 0:
-                ax.set_ylabel('DiffEvo')
-
-        arg['limit_val'] = obj.limit_val
-        record[name] = prepare_data(obj, trace, x0_trace, arg, fitnesses, x0_fitness)
-    
-    return record
-
-if __name__ == '__main__':
-    # set random seed for reproducibility
-    torch.manual_seed(42)
-
-    obj_names = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-    
-    plt.figure(figsize=(12, 3))
-    record = LatentDiffEvo_benchmark(obj_names, num_steps=100, row=0, total_row=1, num_pop=512, scaling=4, plot=True)
-    torch.save(record, './data/latent_diff_evo.pt')
-
-    plt.tight_layout()
-    plt.savefig('./images/latent_diff_evo.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/map_elite.py b/experiments/benchmarks/methods/map_elite.py
deleted file mode 100644
index 7fc44b9..0000000
--- a/experiments/benchmarks/methods/map_elite.py
+++ /dev/null
@@ -1,124 +0,0 @@
-from .benchmarks import plot_background, get_obj
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from copy import deepcopy
-from .color_plate import *
-
-
-def MapEliteExperiment(obj, init_num_pop=100, num_iter=256, sigma_mut=0.1, sigma_init=1, grid_size=10):
-    # https://arxiv.org/pdf/1504.04909
-    assert num_iter > init_num_pop
-    populations = []
-    maps = dict()
-    def feature_descriptor(x):
-        cls = tuple(torch.round(x * grid_size).long().tolist())
-        return cls
-
-    # generate initial population
-    pop_init = torch.randn(init_num_pop, 2) * sigma_init
-    rewards = obj(pop_init)
-    for p, r in zip(pop_init, rewards):
-        cls = feature_descriptor(p)
-        if cls not in maps:
-            maps[cls] = (p, r)
-            populations.append(p)
-        elif r > maps[cls][1]:
-            maps[cls] = (p, r)
-            populations.append(p)
-    # iterate
-    for i in range(num_iter - init_num_pop):
-        # randomly select an elite from the map to mutate
-        idx = np.random.randint(0, len(maps))
-        p_old = list(maps.values())[idx][0]
-        p_new = p_old + torch.randn(2) * sigma_mut
-        r_new = obj(p_new.unsqueeze(0)).squeeze(0)
-        cls = feature_descriptor(p_new)
-        if cls not in maps:
-            maps[cls] = (p_new, r_new)
-            populations.append(p_new)
-        elif r_new > maps[cls][1]:
-            maps[cls] = (p_new, r_new)
-            populations.append(p_new)
-    
-    populations = torch.stack(populations)
-    fitnesses = torch.stack([r for p, r in maps.values()])
-    return populations, maps, fitnesses
-
-def MAPElite_plot(obj, maps, ax=None, grid_size=1):
-    if ax is None:
-        fig, ax = plt.subplots()
-
-    plot_background(obj, ax=ax, title='')
-    pop_elite = torch.stack([p for p, r in maps.values()])
-    rewards = torch.stack([r for p, r in maps.values()])
-
-    plt.scatter(pop_elite[:, 0], pop_elite[:, 1], c=x0_color, alpha=0.8, marker='.', zorder=10, edgecolors='none', s=(rewards + 0.1)*100)
-    # add grid to reflect the feature_descriptor
-    # Add grid lines for feature descriptor visualization
-    grid_step = grid_size  # Since plot range is -4 to 4
-    
-    # Vertical grid lines
-    for x in np.arange(-4, 4.1, grid_step):
-        plt.axvline(x=x+0.5, color='gray', linestyle=':', alpha=0.3, zorder=999)
-    
-    # Horizontal grid lines 
-    for y in np.arange(-4, 4.1, grid_step):
-        plt.axhline(y=y+0.5, color='gray', linestyle=':', alpha=0.3, zorder=999)
-
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def prepare_data(trace, arg, fitnesses, maps):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "fitnesses": fitnesses,
-        "maps": maps
-    }
-    return info
-
-def MAPElite_benchmark(objs, num_steps, row=0, grid_size=1, sigma_mut=0.5, total_row=4, total_col=5, sigma_init=4, plot=False, **kwargs):
-    init_num_pop = 256
-    arg = {
-        "num_step": num_steps,
-        "init_num_pop": init_num_pop,
-        "sigma_init": sigma_init,
-        "sigma_mut": sigma_mut,
-        "grid_size": grid_size
-    }
-
-    record = dict()
-
-    for i, foo_name in enumerate(objs):
-        obj, obj_rescaled = get_obj(foo_name, **kwargs)
-
-        # es, traj, mus, sigmas, fitnesses = PEPG_experiment(obj_rescaled, num_steps=num_steps, sigma_init=sigma_init)
-        populations, maps, fitnesses = MapEliteExperiment(obj_rescaled, 
-            init_num_pop=init_num_pop, 
-            num_iter=num_steps*init_num_pop, 
-            sigma_mut=sigma_mut, 
-            sigma_init=sigma_init, 
-            grid_size=grid_size
-        )
-        record[foo_name] = prepare_data(populations, arg, fitnesses, maps)
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + row * total_col)
-            MAPElite_plot(obj, maps, ax=ax, grid_size=grid_size)
-            if i == 0:
-                ax.set_ylabel("MAP-Elite")
-    
-    return record
-
-if __name__ == '__main__':
-
-    objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-
-    plt.figure(figsize=(12, 3))
-
-    record = MAPElite_benchmark(objs, 10, 0, total_row=1, plot=True)
-    torch.save(record, './data/map_elite.pt')
-    plt.tight_layout()
-
-    plt.savefig('./images/MAPElite.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/openes.py b/experiments/benchmarks/methods/openes.py
deleted file mode 100644
index dbbfb34..0000000
--- a/experiments/benchmarks/methods/openes.py
+++ /dev/null
@@ -1,122 +0,0 @@
-from .benchmarks import plot_background, get_obj
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from .color_plate import *
-
-
-class OpenES:
-    def __init__(self, num_params, popsize, sigma_init=1, learning_rate=1e-3, learning_rate_decay=1, sigma_decay=1, momentum=0.9):
-        self.num_params = num_params
-        self.popsize = popsize
-        self.sigma = np.ones(num_params) * sigma_init
-        self.learning_rate = learning_rate
-        self.sigma_decay = sigma_decay
-        self.learning_rate_decay = learning_rate_decay
-        self.momentum = momentum
-
-        self.theta = np.zeros(num_params)
-        self.velocity = np.zeros(num_params)
-        self.eps = None
-    
-    def ask(self):
-        self.eps = np.random.randn(self.popsize, self.num_params)
-        return self.theta + self.sigma * self.eps
-    
-    def tell(self, fitnesses):
-        fitnesses = np.array(fitnesses).reshape(-1, 1)
-        dmu = (fitnesses * self.eps).mean(axis=0) / self.sigma #* (self.popsize ** 0.5)
-        
-        # Apply momentum
-        self.velocity = self.momentum * self.velocity + (1 - self.momentum) * dmu
-        self.theta += self.learning_rate * self.velocity
-        
-        self.sigma = self.sigma * self.sigma_decay
-        self.learning_rate = self.learning_rate * self.learning_rate_decay
-
-
-def OpenES_experiment(obj, num_steps=100, sigma_init=1):
-    es = OpenES(
-        num_params=2, 
-        popsize=512, 
-        sigma_init=sigma_init, 
-        learning_rate=1000,
-        learning_rate_decay=0.00001**(1/num_steps), 
-        sigma_decay=0.01**(1/num_steps), 
-        )
-    
-    populations = []
-    fitnesses = []
-    mu = []
-
-    for i in range(num_steps):
-        pop = es.ask()
-        populations.append(pop)
-
-        fitness = obj(pop)
-        fitnesses.append(fitness)
-        mu.append(es.theta.copy())
-        es.tell(fitness)
-    
-    populations = torch.from_numpy(np.stack(populations)).float()
-    fitnesses = torch.from_numpy(np.stack(fitnesses)).float()
-    mu = torch.from_numpy(np.stack(mu)).float()
-    return es, populations, fitnesses, mu
-
-def prepare_data(trace, arg, fitnesses):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "fitnesses": fitnesses
-    }
-    return info
-
-def OpenES_plot(obj, es, traj, mu, ax=None, traces=32):
-    if ax is None:
-        fig, ax = plt.subplots()
-
-    plot_background(obj, ax=ax, title='')
-    pop = traj[-1]
-    plt.scatter(pop[:, 0], pop[:, 1], c=x0_color, alpha=0.5, marker='.', zorder=10, edgecolors='none')
-    for i, history in enumerate(traj[-10:]):
-        alpha = 0.5 * (i / len(traj))
-        plt.plot(history[:traces, 0], history[:traces, 1], '.', c='white', alpha=alpha, zorder=5)
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-    plt.plot(mu[::1, 0], mu[::1, 1], '-', color=traj_color, zorder=4, alpha=0.5)
-
-def OpenES_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, sigma_init=4, plot=False, **kwargs):
-    arg = {
-        "num_step": num_steps,
-        "sigma_init": sigma_init
-    }
-
-    record = dict()
-
-    for i, foo_name in enumerate(objs):
-        obj, obj_rescaled = get_obj(foo_name, **kwargs)
-
-        es, traj, fitnesses, mu = OpenES_experiment(obj_rescaled, num_steps=num_steps, sigma_init=sigma_init)
-        record[foo_name] = prepare_data(traj, arg, fitnesses)
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + row * total_col)
-            OpenES_plot(obj, es, traj, mu, ax=ax)
-            if i == 0:
-                ax.set_ylabel("OpenES")
-    
-    return record
-
-
-if __name__ == '__main__':
-
-    objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-
-    plt.figure(figsize=(12, 3))
-
-    record = OpenES_benchmark(objs, 1000, 0, total_row=1, plot=True, sigma_init=4)
-    torch.save(record, './data/OpenES.pt')
-    plt.setp(plt.gcf().get_axes(), xticks=[], yticks=[], xlabel='', ylabel='')
-    plt.tight_layout()
-
-    plt.savefig('./images/OpenES.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/methods/pepg.py b/experiments/benchmarks/methods/pepg.py
deleted file mode 100644
index 8d8e282..0000000
--- a/experiments/benchmarks/methods/pepg.py
+++ /dev/null
@@ -1,106 +0,0 @@
-from .benchmarks import plot_background, get_obj
-from .es import PEPG
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from copy import deepcopy
-from matplotlib.patches import Ellipse
-from .color_plate import *
-
-# https://www.sciencedirect.com/science/article/pii/S0893608009003220
-
-
-def PEPG_experiment(obj, num_steps=10, sigma_init=1):
-    es = PEPG(
-        num_params=2, 
-        popsize=512, 
-        sigma_init=sigma_init, 
-        sigma_decay=0.01**(1/num_steps), 
-        elite_ratio=0.1 # Elite ratio can lead to multiple solutions
-        )
-    
-    populations = []
-    mus = []
-    sigmas = []
-    fitnesses = []
-    for i in range(num_steps):
-        pop = es.ask()
-        mus.append(deepcopy(es.mu))
-        sigmas.append(deepcopy(es.sigma))
-        populations.append(deepcopy(pop))
-        fitness = obj(pop)
-        fitnesses.append(fitness)
-        es.tell(fitness)
-    
-    populations = torch.from_numpy(np.stack(populations)).float()
-    fitnesses = torch.from_numpy(np.stack(fitnesses)).float()
-
-    return es, populations, np.stack(mus), np.stack(sigmas), fitnesses
-
-def prepare_data(trace, arg, fitnesses):
-    info = {
-        "arguments": arg,
-        "trace": trace,
-        "fitnesses": fitnesses
-    }
-    return info
-
-def PEPG_plot(obj, es, mus, sigmas, ax=None, traces=16):
-    if ax is None:
-        fig, ax = plt.subplots()
-
-    plot_background(obj, ax=ax, title='')
-    pop = es.ask()
-    plt.scatter(pop[:, 0], pop[:, 1], c=x0_color, alpha=0.5, marker='.', zorder=10, edgecolors='none')
-    plt.plot(mus[:, 0], mus[:, 1], '.-', c=traj_color, alpha=1, zorder=0)
-
-    # plot sigma ranges
-    for i, (mu, sigma) in enumerate(zip(mus, sigmas)):
-        alpha = (i + 1) / len(mus)
-        ellipse = Ellipse(
-            xy=mu,
-            width=sigma[0] * 2,
-            height=sigma[1] * 2,
-            linewidth=2,
-            edgecolor=traj_color,
-            facecolor='none',
-            alpha=0.5 * alpha ** 0.5,
-        )
-        ax.add_patch(ellipse)
-    plt.xlim(-4, 4)
-    plt.ylim(-4, 4)
-
-def PEPG_benchmark(objs, num_steps, row=0, total_row=4, total_col=5, sigma_init=4, plot=False, **kwargs):
-    arg = {
-        "num_step": num_steps,
-        "sigma_init": sigma_init
-    }
-
-    record = dict()
-
-    for i, foo_name in enumerate(objs):
-        obj, obj_rescaled = get_obj(foo_name, **kwargs)
-
-        es, traj, mus, sigmas, fitnesses = PEPG_experiment(obj_rescaled, num_steps=num_steps, sigma_init=sigma_init)
-        record[foo_name] = prepare_data(traj, arg, fitnesses)
-        if plot:
-            ax = plt.subplot(total_row, total_col, i + 1 + row * total_col)
-            PEPG_plot(obj, es, mus, sigmas, ax=ax)
-            if i == 0:
-                ax.set_ylabel("PEPG")
-    
-    return record
-
-
-if __name__ == '__main__':
-
-    objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-
-    plt.figure(figsize=(12, 3))
-
-    record = PEPG_benchmark(objs, 10, 0, total_row=1, plot=True)
-    torch.save(record, './data/pepg.pt')
-    plt.tight_layout()
-
-    plt.savefig('./images/PEPG.png')
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/plot_alphas.py b/experiments/benchmarks/plot_alphas.py
deleted file mode 100644
index ae8efd7..0000000
--- a/experiments/benchmarks/plot_alphas.py
+++ /dev/null
@@ -1,120 +0,0 @@
-import torch
-import matplotlib.pyplot as plt
-import numpy as np
-import os
-
-name_table = {
-    'DDIMSchedulerCosine': 'Cosine',
-    'DDPMScheduler': 'DDPM',
-    'DDIMScheduler': 'Linear',
-}
-
-colors = ['#6F6E6E', '#F5851E', '#343434']
-
-import matplotlib
-matplotlib.rcParams['mathtext.fontset'] = 'stix'
-matplotlib.rcParams['font.family'] = 'STIXGeneral'
-
-def get_avg_fitness(record, idx_experiment, idx_step, top_n=1e9):
-    data = record['records'][idx_experiment][idx_step]['rastrigin']
-    num_steps = data['arguments']['num_step']
-    all_fitnesses = data['fitnesses'][-1]
-    all_fitnesses = all_fitnesses.sort().values
-    top_n = min(top_n, len(all_fitnesses))
-    top_n_fitnesses = all_fitnesses[-top_n:]
-    return top_n_fitnesses.mean().item(), num_steps
-
-def get_step_fitness(record, top_n=1e9):
-    total_step_fitness = []
-    x = []
-    for idx_exp in range(len(record['records'])):
-        y = []
-        for idx_step in range(len(record['records'][idx_exp])):
-            avg, num_steps = get_avg_fitness(record, idx_exp, idx_step, top_n)
-            if idx_exp == 0:  # Only need to collect x once
-                x.append(num_steps)
-            y.append(avg)
-        total_step_fitness.append(y)
-
-    total_step_fitness = np.array(total_step_fitness).mean(axis=0)
-    return total_step_fitness, x, record['scheduler']
-
-def get_step_fitness_std(record, top_n=1e9):
-    total_step_fitness = []
-    x = []
-    for idx_exp in range(len(record['records'])):
-        y = []
-        for idx_step in range(len(record['records'][idx_exp])):
-            avg, num_steps = get_avg_fitness(record, idx_exp, idx_step, top_n)
-            if idx_exp == 0:
-                x.append(num_steps)
-            y.append(avg)
-        total_step_fitness.append(y)
-    total_step_fitness = np.array(total_step_fitness).std(axis=0)
-    return total_step_fitness, x, record['scheduler']
-
-def main():
-    # Load data
-    folder = './data/schedulers'
-    schedulers = os.listdir(folder)
-    all_records = []
-
-    for scheduler in schedulers:
-        records = torch.load(f'{folder}/{scheduler}')
-        all_records.append(records)
-
-    # Create plot
-    plt.figure(figsize=(8, 3))
-    plt.subplot(1, 2, 2)
-    for idx_scheduler in range(len(all_records)):
-        total_step_fitness, x, scheduler = get_step_fitness(all_records[idx_scheduler], top_n=64)
-        total_step_fitness_std, x_std, scheduler_std = get_step_fitness_std(all_records[idx_scheduler])
-        # Add dots at center points
-        plt.plot(x, total_step_fitness, 'o',
-                color=colors[idx_scheduler], 
-                markersize=4)
-        # Add dashes at top and bottom of error bars
-        plt.plot(x, total_step_fitness + total_step_fitness_std, '_',
-                color=colors[idx_scheduler],
-                markersize=5)
-        plt.plot(x, total_step_fitness - total_step_fitness_std, '_', 
-                color=colors[idx_scheduler],
-                markersize=5)
-        # Plot error bars with caps
-        plt.errorbar(x, total_step_fitness, yerr=total_step_fitness_std,
-                    label=name_table[scheduler], color=colors[idx_scheduler],
-                    capsize=5, capthick=1, elinewidth=1,
-                    fmt='-', # Line only
-                    marker='None') # No markers on the line
-
-    # Configure plot
-    plt.semilogx()
-    plt.legend(loc='lower right')
-    plt.xlabel('Number of total steps')
-    plt.ylabel('Average fitness (top 64 elites)')
-    plt.title(r'(b) compare performance')
-    # Demonstrate the different alpha schedules
-    plt.subplot(1, 2, 1)
-    T = 100
-    t = torch.linspace(0, T, T)
-    alpha_linear = 1 - t / T
-    alpha_cosine = torch.cos(t * np.pi / T) / 2 + 0.5
-    beta0 = 0.0003
-    gamma = 0.069
-    alpha_ddpm = torch.exp(-beta0 * t - gamma * (t ** 2) / T)
-    plt.plot(t, alpha_linear, label='Linear', color=colors[0])
-    plt.plot(t, alpha_cosine, label='Cosine', color=colors[1])
-    plt.plot(t, alpha_ddpm, label='DDPM', color=colors[2])
-    plt.legend()
-    plt.xlabel('$t$')
-    plt.ylabel('$\\alpha$')
-    plt.title(r'(a) $\alpha$ schedule')
-    
-    plt.tight_layout()
-    # Save and show plot
-    os.makedirs('./figures', exist_ok=True)
-    plt.savefig('./figures/alpha.png', dpi=300)
-    plt.savefig('./figures/alpha.pdf', bbox_inches='tight')
-
-if __name__ == "__main__":
-    main()
diff --git a/experiments/benchmarks/plotbenchmark.py b/experiments/benchmarks/plotbenchmark.py
deleted file mode 100644
index 4a6ebe1..0000000
--- a/experiments/benchmarks/plotbenchmark.py
+++ /dev/null
@@ -1,40 +0,0 @@
-import matplotlib.pyplot as plt
-from methods import *
-import torch
-import numpy as np
-import random
-
-objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin"]
-
-if __name__ == '__main__':
-    # set random seed for reproducibility
-    seed = 1
-    torch.manual_seed(seed)
-    np.random.seed(seed)
-    random.seed(seed)
-
-    num_benchmark = 4
-    plt.figure(figsize=(12, 1 + num_benchmark * 2))
-
-    temperature = 0.25
-    # DiffEvo
-    record = DiffEvo_benchmark(objs, num_steps=25, row=0, total_row=num_benchmark, plot=True, num_pop=512, temperature=temperature)
-    torch.save(record, './data/diff_evo.pt')
-
-    # CMAES
-    record = CMAES_benchmark(objs, num_steps=25, row=1, total_row=num_benchmark, limit_val=100, plot=True, temperature=temperature)
-    torch.save(record, './data/cmaes.pt')
-
-    # OpenES
-    record = OpenES_benchmark(objs, num_steps=1000, row=2, total_row=num_benchmark, plot=True, temperature=temperature)
-    torch.save(record, './data/openes.pt')
-
-    # MAPElite
-    record = MAPElite_benchmark(objs, num_steps=25, row=3, total_row=num_benchmark, plot=True, temperature=temperature)
-    torch.save(record, './data/map_elite.pt')
-
-    # save the plot
-    plt.tight_layout()
-    plt.savefig('./images/benchmark.png')
-    plt.savefig('./images/benchmark.pdf', transparent=True)
-    plt.close()
\ No newline at end of file
diff --git a/experiments/benchmarks/readme.md b/experiments/benchmarks/readme.md
deleted file mode 100644
index 243977c..0000000
--- a/experiments/benchmarks/readme.md
+++ /dev/null
@@ -1,40 +0,0 @@
-# Benchmark Experiments on 2D Fitness
-
-To run the experiment:
-
-```bash
-python plotbenchmark.py
-```
-
-The plots are saved in `./images`, and the run data are saved in `./data/{obj.foo_name}.pt`. The data can be loaded with `torch.load`. An example format is:
-
-```json
-{
-    "name": "himmelblau",
-    "arguments": 
-        {
-            "limit_val": 100,
-            "num_pop": 256,
-            "num_step": 100,
-            "scaling": 4.0
-        },
-    "trace": "[[p_1, p_2, ..., p_n], [<generation 2>], ...]",
-    "fitnesses": "[f_1, f_2, ..., f_n]"
-}
-```
-
-To run the statistics (100 experiments for each algorithm):
-
-```bash
-python statistic.py
-```
-
-## Results
-
-This algorithm finds all optimal points in the benchmarks. In the following figures, white regions represent high fitness and blue regions represent low fitness. The gray lines are the trajectories of individuals in the population. For simplicity, only 64 trajectories are plotted.
-
-![](./images/benchmark.png)
-
-## Ideas
-
-- Using different probability mapping to get better results
\ No newline at end of file
diff --git a/experiments/benchmarks/run_alphas.py b/experiments/benchmarks/run_alphas.py
deleted file mode 100644
index 7809ba1..0000000
--- a/experiments/benchmarks/run_alphas.py
+++ /dev/null
@@ -1,68 +0,0 @@
-import matplotlib.pyplot as plt
-from methods import *
-import torch
-import os
-import numpy as np
-import random
-from tqdm import tqdm
-import pandas as pd
-import argparse
-from diffevo import DDIMScheduler, DDIMSchedulerCosine, DDPMScheduler
-
-# experiment with different alphas on different total steps
-
-obj = "rastrigin"
-steps = [10, 25, 50, 100, 250, 500, 1000]
-
-num_steps = 25
-
-
-def get_records(num_experiments, scheduler, scheduler_name):
-    all_records = dict()
-
-    pop_size = 512
-
-    records = []
-    print("Running DiffEvo_benchmark...")
-
-    for i in tqdm(range(num_experiments)):
-        records_per_exp = []
-        for step in steps:
-            r = DiffEvo_benchmark([obj], num_steps=step, disable_bar=True, limit_val=100, num_pop=pop_size, init_num_pop=pop_size, scheduler=scheduler)
-            records_per_exp.append(r)
-        records.append(records_per_exp)
-    
-    result = {
-        "scheduler": scheduler_name,
-        "num_experiments": num_experiments,
-        "records": records
-    }
-        
-    return result
-
-
-if __name__ == '__main__':
-
-    # Parse command line arguments
-    parser = argparse.ArgumentParser(description='Run optimization benchmarks')
-    parser.add_argument('--scheduler', type=str, default='DDIMSchedulerCosine')
-    parser.add_argument('--num_experiments', type=int, default=100)
-    args = parser.parse_args()
-
-    schedulers = {
-        "DDIMSchedulerCosine": DDIMSchedulerCosine,
-        "DDPMScheduler": DDPMScheduler,
-        "DDIMScheduler": DDIMScheduler
-    }
-
-    if args.scheduler not in schedulers:
-        raise ValueError(f"Scheduler {args.scheduler} not supported")
-
-    # set random seed
-    random.seed(42)
-    np.random.seed(42)
-    torch.manual_seed(42)
-
-    records = get_records(args.num_experiments, schedulers[args.scheduler], args.scheduler)
-    # save to ./data/schedulers/
-    torch.save(records, f'./data/schedulers/{args.scheduler}.pt')
diff --git a/experiments/benchmarks/run_benchmarks.py b/experiments/benchmarks/run_benchmarks.py
deleted file mode 100644
index 764e701..0000000
--- a/experiments/benchmarks/run_benchmarks.py
+++ /dev/null
@@ -1,96 +0,0 @@
-import matplotlib.pyplot as plt
-from methods import *
-import torch
-import os
-import numpy as np
-import random
-from tqdm import tqdm
-import pandas as pd
-import argparse
-
-
-objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin", "rastrigin_4d", "rastrigin_32d", "rastrigin_256d"]
-
-experiments = {
-    "diffevo":{
-        "method": DiffEvo_benchmark,
-        "num_steps": 25
-    },
-    "latentdiffevo": {
-        "method": LatentDiffEvo_benchmark,
-        "num_steps": 25
-    },
-    "cmaes": {
-        "method": CMAES_benchmark, 
-        "num_steps": 25
-    },
-    "openes": {
-        "method": OpenES_benchmark,
-        "num_steps": 1000
-    },
-    "pepg": {
-        "method": PEPG_benchmark,
-        "num_steps": 25
-    },
-    "mapelite": {
-        "method": MAPElite_benchmark,
-        "num_steps": 25
-    }
-}
-
-
-def get_all_records(num_experiments, exp_names):
-    all_records = dict()
-
-    pop_size = 512
-    methods = [experiments[name]['method'] for name in exp_names]
-    num_steps = [experiments[name]['num_steps'] for name in exp_names]
-
-    assert len(methods) == len(num_steps)
-
-    for method, step in zip(methods, num_steps):
-        name = method.__name__
-        records = []
-        print(f"Running {name}...")
-
-        for i in tqdm(range(num_experiments)):
-            r = method(objs, num_steps=step, disable_bar=True, limit_val=100, num_pop=pop_size, init_num_pop=pop_size)
-            records.append(r)
-        
-        # save to ./data/records/
-        torch.save(records, f'./data/records/{name}.pt')
-        all_records[name] = records
-    
-    return all_records
-
-
-if __name__ == '__main__':
-    num_experiments = 100
-    top_k = 64
-
-    # Parse command line arguments
-    parser = argparse.ArgumentParser(description='Run optimization benchmarks')
-    parser.add_argument('--experiments', nargs='+', default=['all'],
-                       help='List of experiments to run. Use experiment names or "all" for all experiments. '
-                            'Valid names: diffevo, latentdiffevo, cmaes, openes, pepg, mapelite')
-    args = parser.parse_args()
-
-    # Determine which experiments to run
-    if 'all' in args.experiments:
-        exp_names = list(experiments.keys())
-    else:
-        # Validate experiment names
-        valid_names = set(experiments.keys())
-        exp_names = []
-        for name in args.experiments:
-            if name not in valid_names:
-                raise ValueError(f'Invalid experiment name: {name}. '
-                               f'Valid names are: {", ".join(valid_names)}')
-            exp_names.append(name)
-
-    # set random seed
-    random.seed(42)
-    np.random.seed(42)
-    torch.manual_seed(42)
-
-    get_all_records(num_experiments, exp_names)
\ No newline at end of file
diff --git a/experiments/benchmarks/run_temperature.py b/experiments/benchmarks/run_temperature.py
deleted file mode 100644
index 525930b..0000000
--- a/experiments/benchmarks/run_temperature.py
+++ /dev/null
@@ -1,55 +0,0 @@
-import matplotlib.pyplot as plt
-from methods import *
-import torch
-import os
-import numpy as np
-import random
-from tqdm import tqdm
-import pandas as pd
-import argparse
-
-
-objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin", "rastrigin_4d", "rastrigin_32d", "rastrigin_256d"]
-
-method = DiffEvo_benchmark
-num_steps = 25
-
-
-def get_records(num_experiments, temperature):
-    all_records = dict()
-
-    pop_size = 512
-
-    name = method.__name__
-    records = []
-    print(f"Running {name}...")
-
-    for i in tqdm(range(num_experiments)):
-        r = method(objs, num_steps=num_steps, disable_bar=True, limit_val=100, num_pop=pop_size, init_num_pop=pop_size, temperature=temperature)
-        records.append(r)
-    
-    result = {
-        "temperature": temperature,
-        "num_experiments": num_experiments,
-        "records": records
-    }
-        
-    return result
-
-
-if __name__ == '__main__':
-
-    # Parse command line arguments
-    parser = argparse.ArgumentParser(description='Run optimization benchmarks')
-    parser.add_argument('--temperature', type=float, default=1.0)
-    parser.add_argument('--num_experiments', type=int, default=100)
-    args = parser.parse_args()
-
-    # set random seed
-    random.seed(42)
-    np.random.seed(42)
-    torch.manual_seed(42)
-
-    records = get_records(args.num_experiments, args.temperature)
-    # save to ./data/temperatures/
-    torch.save(records, f'./data/temperatures/temperature_{args.temperature}.pt')
diff --git a/experiments/benchmarks/statistic.py b/experiments/benchmarks/statistic.py
deleted file mode 100644
index fd93f67..0000000
--- a/experiments/benchmarks/statistic.py
+++ /dev/null
@@ -1,169 +0,0 @@
-import matplotlib.pyplot as plt
-import torch
-import os
-import numpy as np
-import random
-from tqdm import tqdm
-import pandas as pd
-
-
-objs = ["rosenbrock", "beale", "himmelblau", "ackley", "rastrigin", "rastrigin_4d", "rastrigin_32d", "rastrigin_256d"]
-
-
-def statistics(func):
-    """apply the func to each record of a list of experiments
-
-    Args of decorated function:
-        records: list of records of experiments
-            structure: experiments[experiment_1[fitness_func_1, ...], ...]
-    
-    Returns:
-        list of statistics of each experiment
-            structure: [num_experiments, num_fitness_funcs, *statistics]
-    """
-    def wrapper(records, *args, **kwargs):
-        results = []
-        for record in records:
-            result_temp = {}
-            for fitness_func in record.keys():
-                result_temp[fitness_func] = func(record[fitness_func], *args, **kwargs)
-            results.append(result_temp)
-        return results
-    return wrapper
-
-def group(statistics:list):
-    results = {}
-    for measure in statistics:
-        for fitness_func in measure.keys():
-            if fitness_func not in results:
-                results[fitness_func] = []
-            results[fitness_func].append(measure[fitness_func])
-    return results
-
-def avg_group(statistics:list):
-    grouped = group(statistics)
-    for k, v in grouped.items():
-        grouped[k] = np.mean(v, axis=0)
-    
-    return grouped
-
-def std_group(statistics:list):
-    grouped = group(statistics)
-    for k, v in grouped.items():
-        grouped[k] = np.std(v, axis=0)
-    
-    return grouped
-
-def get_top_values(fitness, x, n):
-    idx = np.argsort(-fitness)[:n]
-    return x[idx]
-
-@statistics
-def top_rewards(record, n=None, use_x0=False):
-    if use_x0:
-        fitnesses = record['x0_fitness']
-    else:
-        fitnesses = record['fitnesses']
-    
-    if n is not None:
-        if len(fitnesses.shape) == 1:
-            fitnesses = fitnesses.unsqueeze(0)
-        fitnesses = fitnesses[-1]
-        fitnesses = get_top_values(fitnesses, fitnesses, n)
-    else:
-        fitnesses = fitnesses[-1]
-    return fitnesses.mean().item()
-
-def prob(x, scale=10):
-    classification = torch.round(x * scale).long()
-    # count the number of points in each class, return [class, num]
-    classes, num = torch.unique(classification, return_counts=True, dim=0)
-    prob = num.float() / num.sum()
-    return prob
-
-def entropy(x, scale=10):
-    p = prob(x, scale)
-    return torch.sum(-p * torch.log2(p))
-
-@statistics
-def point_entropy(record, n=None, scale=10, use_x0=False, name=None):
-    if name != 'MAPElite_benchmark':
-        x = record['trace'][-1]
-        if use_x0:
-            fitnesses = record['x0_fitness']
-        else:
-            fitnesses = record['fitnesses']
-    else:
-        x = [p for p, r in record['maps'].values()]
-        # print(x)
-        x = torch.stack(x)
-        # print(record['fitnesses'])
-        fitnesses = record['fitnesses'].unsqueeze(0)
-    
-    if n is not None:
-        x = get_top_values(fitnesses[-1], x, n)
-    return entropy(x, scale).item()
-
-
-if __name__ == '__main__':
-    top_k = 64
-
-    methods = ['DiffEvo_benchmark', 'LatentDiffEvo_benchmark', 'CMAES_benchmark', 'PEPG_benchmark', 'OpenES_benchmark', 'MAPElite_benchmark']
-
-    print('Loading records...')
-    all_records = {}
-    for method_name in methods:
-        all_records[method_name] = torch.load(f'./data/records/{method_name}.pt')
-    print('Done!')
-
-    # add title
-    with open('./data/results.md', 'w') as f:
-        f.write('# Benchmark Results\n\n')
-    
-    # entropy
-    entropy_table = pd.DataFrame()
-    entropy_std = pd.DataFrame()
-    for method_name, records in all_records.items():
-        use_x0 = (method_name=='DiffEvo_benchmark' or method_name=='LatentDiffEvo_benchmark')
-        average_grouped = avg_group(point_entropy(records, n=top_k, use_x0=use_x0, name=method_name)).items()
-        std_grouped = std_group(point_entropy(records, n=top_k, use_x0=use_x0, name=method_name)).items()
-        for k, v in average_grouped:
-            entropy_table.loc[k, method_name.replace("_benchmark", "")] = v
-        for k, v in std_grouped:
-            entropy_std.loc[k, method_name.replace("_benchmark", "")] = v
-    
-    # save to ./data/entropy_top_<top_k>.csv
-    entropy_table.to_csv(f'./data/entropy_top_{top_k}.csv')
-    entropy_std.to_csv(f'./data/entropy_std_top_{top_k}.csv')
-    
-    # fitness
-    fitness_table = pd.DataFrame()
-    fitness_std = pd.DataFrame()
-    for method_name, records in all_records.items():
-        use_x0 = (method_name=='DiffEvo_benchmark' or method_name=='LatentDiffEvo_benchmark')
-        average_grouped = avg_group(top_rewards(records, n=top_k, use_x0=use_x0)).items()
-        std_grouped = std_group(top_rewards(records, n=top_k, use_x0=use_x0)).items()
-        for k, v in average_grouped:
-            fitness_table.loc[k, method_name.replace("_benchmark", "")] = v
-        for k, v in std_grouped:
-            fitness_std.loc[k, method_name.replace("_benchmark", "")] = v
-    
-    # save to ./data/fitness_top_<top_k>.csv
-    fitness_table.to_csv(f'./data/fitness_top_{top_k}.csv')
-    fitness_std.to_csv(f'./data/fitness_std_top_{top_k}.csv')
-
-    # merge two tables together, each cell is "entropy (fitness)"
-    # use string to format
-    merged_table = pd.DataFrame()
-    for i in range(len(entropy_table)):
-        for j in range(len(entropy_table.columns)):
-            # merged_table.loc[i, j] = f"{entropy_table.iloc[i, j]:.2f} ({entropy_std.iloc[i, j]:.2f}), {fitness_table.iloc[i, j]:.2f} ({fitness_std.iloc[i, j]:.2f})"
-            merged_table.loc[i, j] = f"{entropy_table.iloc[i, j]:.2f} ({fitness_table.iloc[i, j]:.2f})"
-    # add row and column index
-    merged_table.index = entropy_table.index
-    merged_table.columns = entropy_table.columns
-    
-    with open('./data/results.md', 'a') as f:
-        f.write('## Result Table\n\n')
-        f.write('Each cell is entropy (fitness)\n\n')
-        f.write(merged_table.to_markdown(floatfmt=".2f") + '\n\n')
\ No newline at end of file
diff --git a/experiments/benchmarks/temperatures.sh b/experiments/benchmarks/temperatures.sh
deleted file mode 100644
index 50bd693..0000000
--- a/experiments/benchmarks/temperatures.sh
+++ /dev/null
@@ -1,6 +0,0 @@
-python run_temperature.py --temperature 0.1 --num_experiments 10
-python run_temperature.py --temperature 0.5 --num_experiments 10
-python run_temperature.py --temperature 1.0 --num_experiments 10
-python run_temperature.py --temperature 2.0 --num_experiments 10
-python run_temperature.py --temperature 5.0 --num_experiments 10
-python run_temperature.py --temperature 10.0 --num_experiments 10
\ No newline at end of file
diff --git a/init.py b/init.py
new file mode 100644
index 0000000..214c229
--- /dev/null
+++ b/init.py
@@ -0,0 +1,64 @@
+import os
+import sys
+import subprocess
+import shutil
+
+# --- Environment Setup Script ---
+
+SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
+VENV_DIR = os.path.join(SCRIPT_DIR, ".venv")
+
+def prompt_reset():
+    """Prompts the user to reset the environment and returns their choice."""
+    while True:
+        response = input(f"A virtual environment already exists at '{VENV_DIR}'.\n"
+                         "Would you like to reset it? (y/n): ").lower().strip()
+        if response in ['y', 'yes']:
+            return True
+        if response in ['n', 'no']:
+            return False
+        print("Invalid input. Please enter 'y' or 'n'.")
+
+def setup_environment():
+    """
+    Sets up the virtual environment, handling creation, dependency installation,
+    and user prompts for resetting.
+    """
+    if os.path.exists(VENV_DIR):
+        if not prompt_reset():
+            print("Setup aborted. Using the existing environment.")
+            sys.exit(0)
+        print("Resetting the virtual environment...")
+        shutil.rmtree(VENV_DIR)
+
+    print("Creating a new virtual environment...")
+    try:
+        subprocess.run([sys.executable, "-m", "venv", VENV_DIR], check=True, cwd=SCRIPT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+
+        venv_python = os.path.join(VENV_DIR, "bin", "python") if sys.platform != "win32" else os.path.join(VENV_DIR, "Scripts", "python.exe")
+
+        print("Installing dependencies...")
+        subprocess.run([venv_python, "-m", "pip", "install", "--upgrade", "pip"], check=True, cwd=SCRIPT_DIR, capture_output=True)
+        subprocess.run([venv_python, "-m", "pip", "install", "-r", "requirements-dev.txt"], check=True, cwd=SCRIPT_DIR, capture_output=True)
+        subprocess.run([venv_python, "-m", "pip", "install", "-e", "."], check=True, cwd=SCRIPT_DIR, capture_output=True)
+
+        print("\n--- Environment setup complete! ---")
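+        print("To activate it, run: " + (os.path.join(os.path.basename(VENV_DIR), "Scripts", "activate") if sys.platform == "win32" else "source " + os.path.join(os.path.basename(VENV_DIR), "bin", "activate")))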
+        print("You can now run experiments using 'python run.py <config_file>'.")
+
+    except subprocess.CalledProcessError as e:
+        print("\n--- ERROR: Failed to set up the environment ---", file=sys.stderr)
+        print(f"--- Command '{' '.join(e.cmd)}' returned non-zero exit status {e.returncode}. ---", file=sys.stderr)
+        if e.stdout:
+            print("\n--- STDOUT ---", file=sys.stderr)
+            print(e.stdout.decode(), file=sys.stderr)
+        if e.stderr:
+            print("\n--- STDERR ---", file=sys.stderr)
+            print(e.stderr.decode(), file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"\nAn unexpected error occurred: {e}", file=sys.stderr)
+        sys.exit(1)
+
+if __name__ == '__main__':
+    setup_environment()
diff --git a/plugins/__init__.py b/plugins/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/plugins/callbacks/__init__.py b/plugins/callbacks/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/plugins/callbacks/callbacks.py b/plugins/callbacks/callbacks.py
new file mode 100644
index 0000000..0d2434f
--- /dev/null
+++ b/plugins/callbacks/callbacks.py
@@ -0,0 +1,97 @@
+# Concrete callbacks for console logging, CSV logging, and plotting.
+from abc import ABC, abstractmethod
+import os
+import pandas as pd
+import numpy as np
+import matplotlib.pyplot as plt
+import torch
+
+from diffevo.callbacks import Callback
+
+class ConsoleLogger(Callback):
+    """A simple callback to log progress to the console."""
+    def on_experiment_start(self, orchestrator):
+        print(f"--- Starting Experiment: {orchestrator.config.name} ---")
+        print(f"Problem: {orchestrator.problem.name}, Optimizer: {orchestrator.config.optimizer.class_name}")
+
+    def on_step_end(self, orchestrator, step_data):
+        step = step_data['step']
+        best_fitness = step_data['best_fitness']
+        print(f"Step {step}: Best Fitness = {best_fitness:.4f}")
+
+    def on_experiment_end(self, orchestrator):
+        print("--- Experiment Finished ---")
+
+
+class CSVLogger(Callback):
+    """Callback to log all fitness scores at each step to a CSV file."""
+    def on_experiment_start(self, orchestrator):
+        self.output_dir = orchestrator.output_dir
+        self.records = []
+
+    def on_step_end(self, orchestrator, step_data):
+        step = step_data['step']
+        fitnesses = step_data['fitnesses']
+        for i, fitness in enumerate(fitnesses):
+            self.records.append({
+                'step': step,
+                'run_id': orchestrator.run_id,
+                'individual_id': i,
+                'fitness': fitness.item()
+            })
+
+    def on_experiment_end(self, orchestrator):
+        df = pd.DataFrame(self.records)
+        report_path = os.path.join(self.output_dir, "fitness_log.csv")
+        df.to_csv(report_path, index=False)
+        print(f"Saved fitness log to {report_path}")
+
+
+class PlottingCallback(Callback):
+    """Callback to generate and save plots of fitness progression."""
+    def on_experiment_start(self, orchestrator):
+        self.output_dir = orchestrator.output_dir
+        self.all_runs_best_fitness = []
+
+    def on_experiment_end(self, orchestrator):
+        # Simplified plotting logic: one best-fitness curve is drawn per run_id
+        # found in the CSV log, so a single run yields a single curve.
+        # A more robust implementation would aggregate data across multiple
+        # Orchestrator runs.
+
+        # Read the data written by CSVLogger, which must therefore finish before this callback runs.
+        csv_path = os.path.join(self.output_dir, "fitness_log.csv")
+        if not os.path.exists(csv_path):
+            return # Cannot plot if there's no data
+
+        df = pd.read_csv(csv_path)
+
+        plt.figure()
+
+        for run_id in df['run_id'].unique():
+            run_df = df[df['run_id'] == run_id]
+            best_fitness_per_step = run_df.groupby('step')['fitness'].max()
+            plt.plot(best_fitness_per_step.index, best_fitness_per_step.values, alpha=0.5)
+
+        # Mean and std dev across runs of each run's best fitness per step
+        mean_fitness = df.groupby(['run_id', 'step'])['fitness'].max().groupby(level='step').mean()
+        std_fitness = df.groupby(['run_id', 'step'])['fitness'].max().groupby(level='step').std()
+
+        plt.plot(mean_fitness.index, mean_fitness.values, label="Mean Best Fitness", color='blue', linewidth=2)
+        plt.fill_between(
+            mean_fitness.index,
+            mean_fitness - std_fitness,
+            mean_fitness + std_fitness,
+            alpha=0.2,
+            color='blue',
+            label="Std Dev",
+        )
+
+        plt.xlabel("Step")
+        plt.ylabel("Best Fitness")
+        plt.title(f"Fitness Progression for {orchestrator.problem.name}")
+        plt.legend()
+        plot_path = os.path.join(self.output_dir, "fitness_progression.png")
+        plt.savefig(plot_path)
+        plt.close()
+        print(f"Saved plot to {plot_path}")
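Review note on `PlottingCallback`: the shaded band is labelled as the standard deviation of the best fitness, which only makes sense across runs. A minimal sketch of that aggregation in plain Python, assuming records shaped like the rows `CSVLogger` writes (`run_id`, `step`, `fitness`). The helper names `best_per_run_and_step` and `mean_std_across_runs` are hypothetical and not part of this patch.

```python
from collections import defaultdict
from statistics import mean, stdev

def best_per_run_and_step(records):
    """Best fitness for every (run_id, step) pair."""
    best = {}
    for r in records:
        key = (r["run_id"], r["step"])
        best[key] = max(best.get(key, float("-inf")), r["fitness"])
    return best

def mean_std_across_runs(best):
    """Per step: mean and sample std dev of the per-run best fitness."""
    by_step = defaultdict(list)
    for (_run_id, step), value in best.items():
        by_step[step].append(value)
    # A sample standard deviation needs at least two runs; fall back to 0.0.
    return {
        step: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for step, values in sorted(by_step.items())
    }

records = [
    {"run_id": 0, "step": 0, "fitness": 1.0},
    {"run_id": 0, "step": 0, "fitness": 3.0},
    {"run_id": 1, "step": 0, "fitness": 2.0},
    {"run_id": 1, "step": 0, "fitness": 5.0},
    {"run_id": 0, "step": 1, "fitness": 4.0},
    {"run_id": 1, "step": 1, "fitness": 6.0},
]
summary = mean_std_across_runs(best_per_run_and_step(records))
```

The key point is the two-stage reduction: first the maximum within each run and step, then the statistics across runs for each step.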
diff --git a/plugins/optimizers/__init__.py b/plugins/optimizers/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/plugins/optimizers/base_diffevo.py b/plugins/optimizers/base_diffevo.py
new file mode 100644
index 0000000..cde9d71
--- /dev/null
+++ b/plugins/optimizers/base_diffevo.py
@@ -0,0 +1,39 @@
+import torch
+from diffevo.optimizers.base import Optimizer
+from diffevo.schedulers import DDIMSchedulerCosine
+from diffevo.fitness_mappings import Identity
+from diffevo.generators import BayesianGenerator
+
+class BaseDiffEvo(Optimizer):
+    def __init__(self, problem, popsize, num_step=100, noise=1.0, scaling=1.0,
+                 fitness_mapping=None, scheduler=None, generator_class=BayesianGenerator,
+                 generator_config=None, **kwargs):
+
+        self.num_params = problem.dim
+        self.popsize = popsize
+        self.noise = noise
+        self.scaling = scaling
+        self.generator_class = generator_class
+        self.generator_config = generator_config or {}
+        self.lower_bound = problem.lower_bound
+        self.upper_bound = problem.upper_bound
+
+        self.fitness_mapping = fitness_mapping or Identity()
+
+        # Note: num_step is a named parameter, so it is never present in kwargs;
+        # subclasses such as GraphBasedDiffEvo pass it through **kwargs to this
+        # signature, where it binds to num_step directly.
+
+        self.scheduler = scheduler or DDIMSchedulerCosine(num_step=num_step)
+        self.alphas = [alpha for _, alpha in self.scheduler]
+
+        # Initialize population
+        self.population = torch.rand(self.popsize, self.num_params) * (self.upper_bound - self.lower_bound) + self.lower_bound
+        self.step_idx = 0
+
+    def ask(self):
+        """Returns the current population of candidate solutions."""
+        return self.population * self.scaling
+
+    def tell(self, fitnesses):
+        raise NotImplementedError
diff --git a/plugins/optimizers/diffevo.py b/plugins/optimizers/diffevo.py
new file mode 100644
index 0000000..f0cd788
--- /dev/null
+++ b/plugins/optimizers/diffevo.py
@@ -0,0 +1,17 @@
+from .base_diffevo import BaseDiffEvo
+
+class DiffEvo(BaseDiffEvo):
+    def tell(self, fitnesses):
+        if self.step_idx >= len(self.alphas):
+            return
+
+        alpha = self.alphas[self.step_idx]
+
+        generator = self.generator_class(
+            self.population,
+            self.fitness_mapping(fitnesses),
+            alpha,
+            **self.generator_config,
+        )
+        self.population = generator(noise=self.noise)
+        self.step_idx += 1
diff --git a/plugins/optimizers/graph_based_diffevo.py b/plugins/optimizers/graph_based_diffevo.py
new file mode 100644
index 0000000..0863a84
--- /dev/null
+++ b/plugins/optimizers/graph_based_diffevo.py
@@ -0,0 +1,60 @@
+import torch
+from .base_diffevo import BaseDiffEvo
+
+class GraphBasedDiffEvo(BaseDiffEvo):
+    """
+    An optimizer where the diffusion process is modeled as a sparse graph.
+    Each candidate solution is a node, and updates only happen between neighbors.
+    This implementation uses a k-Nearest Neighbors graph to structure the population.
+    """
+    def __init__(self, problem, popsize, k=5, **kwargs):
+        super().__init__(problem, popsize, **kwargs)
+        self.k = k
+        # Build the k-NN graph
+        self._build_graph()
+
+    def _build_graph(self):
+        """Builds a k-NN graph based on Euclidean distance between individuals."""
+        distances = torch.cdist(self.population, self.population)
+        # Get the indices of the k+1 nearest neighbors (including self)
+        _, indices = torch.topk(distances, self.k + 1, largest=False, sorted=True)
+        # The first column is the node itself, the rest are its neighbors
+        self.adj = indices[:, 1:]
+
+    def tell(self, fitnesses):
+        """Updates the population based on a sparse diffusion process on the graph."""
+        if self.step_idx >= len(self.alphas):
+            return
+
+        alpha = self.alphas[self.step_idx]
+        new_population = torch.zeros_like(self.population)
+
+        # Iterate through each individual to compute its update based on its local neighborhood
+        for i in range(self.popsize):
+            # Get the indices of the neighbors for the current individual
+            neighbor_indices = self.adj[i]
+
+            # Create the local population (the individual and its neighbors)
+            local_indices = torch.cat([torch.tensor([i]), neighbor_indices])
+            local_population = self.population[local_indices]
+            local_fitnesses = fitnesses[local_indices]
+
+            # Use the generator on the local population
+            generator = self.generator_class(
+                local_population,
+                self.fitness_mapping(local_fitnesses),
+                alpha,
+                **self.generator_config,
+            )
+
+            # The generator creates a new full local population. We only care about the new version of our target individual 'i',
+            # which is the first one in the local population.
+            new_local_population = generator(noise=self.noise)
+            new_population[i] = new_local_population[0]
+
+        self.population = new_population
+
+        # Optionally, the graph could be rebuilt periodically
+        # self._build_graph()
+
+        self.step_idx += 1
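Review note on `_build_graph`: the k-NN adjacency that the patch builds with `torch.cdist` and `torch.topk` can be sketched in plain Python as below. This is an illustrative stand-in, not the code in this patch. `knn_adjacency` is a hypothetical helper, and it excludes each point by index instead of assuming the point itself sorts first among its own nearest neighbors.

```python
import math

def knn_adjacency(points, k):
    """Return, for each point, the indices of its k nearest other points."""
    if k >= len(points):
        raise ValueError("k must be smaller than the number of points")
    adjacency = []
    for i, p in enumerate(points):
        # Distances to every other point; the point itself is excluded by index,
        # so exact duplicates of p cannot push it out of its own slot.
        others = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()
        adjacency.append([j for _, j in others[:k]])
    return adjacency

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
adj = knn_adjacency(points, k=2)
```

Because the population moves at every `tell` step, a graph built once in `__init__` can become stale; rebuilding it periodically, as the commented-out `self._build_graph()` call suggests, is one way to keep neighborhoods local.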
diff --git a/plugins/optimizers/rl_evo.py b/plugins/optimizers/rl_evo.py
new file mode 100644
index 0000000..0064602
--- /dev/null
+++ b/plugins/optimizers/rl_evo.py
@@ -0,0 +1,60 @@
+from diffevo.optimizers.base import Optimizer
+from diffevo import (
+    LatentBayesianGenerator,
+    RandomProjection,
+    DDIMSchedulerCosine,
+    BayesianGenerator,
+)
+import torch
+
+class RLEvo(Optimizer):
+    def __init__(
+        self,
+        problem,
+        popsize: int = 512,
+        num_step: int = 10,
+        T: float = 1.0,
+        latent_dim: int = None,
+        scaling: float = 0.1,
+        noise: float = 1.0,
+        weight_decay: float = 0.0,
+        *args,
+        **kwargs
+    ):
+        super().__init__(problem, *args, **kwargs)
+        self.num_step = num_step
+        self.T = T
+        self.popsize = popsize
+        self.latent_dim = latent_dim
+        self.scaling = scaling
+        self.noise = noise
+        self.weight_decay = weight_decay
+
+        self.scheduler = DDIMSchedulerCosine(num_step=self.num_step)
+        # The scheduler returns tuples of (t, (alpha, alpha_past)), so we need to unpack the inner tuple
+        self.alphas = [alpha_tuple[0] for _, alpha_tuple in self.scheduler]
+        self.population = torch.randn(self.popsize, self.problem.dim)
+        self.step_idx = 0
+
+    def ask(self):
+        return self.population * self.scaling
+
+    def tell(self, rewards):
+        if self.step_idx >= len(self.alphas):
+            return
+
+        alpha = self.alphas[self.step_idx]
+
+        l2 = torch.norm(self.population, dim=-1) ** 2
+        fitness = torch.exp((rewards - rewards.max()) / self.T - l2 * self.weight_decay)
+
+        if self.latent_dim is not None:
+            random_map = RandomProjection(self.problem.dim, self.latent_dim, normalize=True)
+            generator = LatentBayesianGenerator(
+                self.population, random_map(self.population).detach(), fitness, alpha
+            )
+        else:
+            generator = BayesianGenerator(self.population, fitness, alpha)
+
+        self.population, x0 = generator(noise=self.noise, return_x0=True)
+        self.step_idx += 1
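Review note on `RLEvo.tell`: the reward-to-fitness mapping `exp((r - max r) / T - weight_decay * ||x||^2)` can be checked in isolation. A minimal sketch in plain Python, assuming list inputs instead of tensors; `reward_to_fitness` is a hypothetical helper, not a function in this patch.

```python
import math

def reward_to_fitness(rewards, squared_norms, temperature=1.0, weight_decay=0.0):
    """Map raw rewards to positive weights: exp((r - max r) / T - wd * ||x||^2)."""
    best = max(rewards)
    return [
        math.exp((r - best) / temperature - weight_decay * n2)
        for r, n2 in zip(rewards, squared_norms)
    ]

rewards = [1.0, 2.0, 3.0]
squared_norms = [0.0, 0.0, 0.0]
weights = reward_to_fitness(rewards, squared_norms, temperature=1.0)
sharper = reward_to_fitness(rewards, squared_norms, temperature=0.5)
```

Subtracting the maximum reward keeps every exponent at or below zero, so the exponential cannot overflow. A lower temperature concentrates the weight on the best individuals, and a positive `weight_decay` penalizes parameter vectors with a large norm.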
diff --git a/plugins/problems/__init__.py b/plugins/problems/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/plugins/problems/classic.py b/plugins/problems/classic.py
new file mode 100644
index 0000000..87ca0e4
--- /dev/null
+++ b/plugins/problems/classic.py
@@ -0,0 +1,50 @@
+from foobench import Objective
+import torch
+
+from diffevo.problems.base import Problem
+from diffevo.problems.helpers import energy_wrapper, fitness_target, distance_scale, max_distances
+from diffevo.problems.utils import two_peak_density
+
+class TwoPeakDensity(Problem):
+    def __init__(self, dim: int = 2):
+        assert dim == 2, "TwoPeakDensity is only defined for 2 dimensions."
+        super().__init__(name="TwoPeakDensity", dim=dim, lower_bound=-2.0, upper_bound=2.0)
+
+    def _create_objective(self):
+        return two_peak_density
+
+
+class Rosenbrock(Problem):
+    def __init__(self, dim: int = 2):
+        super().__init__(name="rosenbrock", dim=dim, lower_bound=-100.0, upper_bound=100.0)
+
+    def _create_objective(self):
+        obj = Objective(foo="rosenbrock", maximize=False, limit_val=self.upper_bound)
+        target = fitness_target[self.name]
+        scale = distance_scale[self.name]
+        return energy_wrapper(
+            obj,
+            target=target,
+            scale=scale,
+            max_distance=max_distances[self.name]
+        )
+
+class Rastrigin(Problem):
+    def __init__(self, dim: int = 2):
+        name = "rastrigin"
+        if dim > 2:
+            name = f"rastrigin_{dim}d"
+        super().__init__(name=name, dim=dim, lower_bound=-100.0, upper_bound=100.0)
+
+    def _create_objective(self):
+        # The original name is needed for foobench
+        original_name = "rastrigin"
+        obj = Objective(foo=original_name, maximize=True, limit_val=self.upper_bound)
+        target = fitness_target[self.name]
+        scale = distance_scale[self.name]
+        return energy_wrapper(
+            obj,
+            target=target,
+            scale=scale,
+            max_distance=max_distances[self.name]
+        )
diff --git a/plugins/problems/graph.py b/plugins/problems/graph.py
new file mode 100644
index 0000000..feb90b3
--- /dev/null
+++ b/plugins/problems/graph.py
@@ -0,0 +1,45 @@
+import torch
+import networkx as nx
+import numpy as np
+
+from diffevo.problems.base import Problem
+
+
+class Maxclique(Problem):
+    def __init__(self, n_nodes: int = 10):
+        self.n_nodes = n_nodes
+        dim = n_nodes * n_nodes
+        super().__init__(name='Maxclique', dim=dim, lower_bound=0.0, upper_bound=1.0)
+
+    def _create_objective(self):
+        def objective(pop: torch.Tensor) -> torch.Tensor:
+            pop_size = pop.shape[0]
+            adj_matrices = (pop.view(pop_size, self.n_nodes, self.n_nodes) > 0.5).int()
+
+            fitness_values = []
+            for i in range(pop_size):
+                # Symmetrize the matrix to represent an undirected graph
+                adj_matrix = adj_matrices[i].numpy()
+                adj_matrix = np.maximum(adj_matrix, adj_matrix.T)
+                G = nx.from_numpy_array(adj_matrix)
+                max_clique = nx.max_weight_clique(G, weight=None)
+                fitness_values.append(len(max_clique))
+
+            return torch.tensor(fitness_values, dtype=torch.float32)
+        return objective
+
+
+class Graphflow(Problem):
+    def __init__(self, num_nodes: int = 3):
+        self.num_nodes = num_nodes
+        dim = num_nodes ** 2
+        super().__init__(name='Graphflow', dim=dim, lower_bound=-5.0, upper_bound=5.0)
+
+    def _create_objective(self):
+        def objective(pop: torch.Tensor) -> torch.Tensor:
+            pop_size = pop.shape[0]
+            cap = torch.clamp(pop.view(pop_size, self.num_nodes, self.num_nodes), min=0)  # Positive weights
+            # Max flow 0 -> 2 for the default 3-node graph (direct edge + bottleneck via node 1), plus a small epsilon for numerical stability
+            fitness = cap[:, 0, 2] + torch.min(cap[:, 0, 1], cap[:, 1, 2]) + 1e-9
+            return fitness
+        return objective
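Review note on `Graphflow`: the objective hard-codes the maximum flow from node 0 to node 2 in a three-node graph, namely the direct edge plus the bottleneck of the two-hop path through node 1. A minimal sketch in plain Python, assuming a nested list of capacities; `three_node_flow` is a hypothetical helper, and the `1e-9` epsilon from the patch is omitted.

```python
def three_node_flow(cap):
    """Max flow from node 0 to node 2 in a 3-node graph with capacity matrix cap."""
    # Negative entries are treated as zero capacity, mirroring torch.clamp(min=0).
    c = [[max(x, 0.0) for x in row] for row in cap]
    direct = c[0][2]
    via_node_1 = min(c[0][1], c[1][2])
    return direct + via_node_1

cap = [
    [0.0, 3.0, 1.0],
    [0.0, 0.0, 2.0],
    [4.0, -1.0, 0.0],
]
flow = three_node_flow(cap)
```

The closed form holds only for `num_nodes == 3`. With more nodes there are additional paths from source to sink, so a general max-flow routine would be needed.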
diff --git a/plugins/problems/image.py b/plugins/problems/image.py
new file mode 100644
index 0000000..2787cab
--- /dev/null
+++ b/plugins/problems/image.py
@@ -0,0 +1,38 @@
+from diffevo.problems.base import Problem
+import torch
+import torch.nn.functional as F
+
+
+def get_target_image(image_name: str, dim_sqrt: int = 28):
+    """
+    Returns a target image tensor based on the name.
+    For simplicity, we are not loading real datasets like MNIST.
+    """
+    if image_name == "mnist_7":
+        # Create a dummy 28x28 image of a 7
+        target = torch.zeros(dim_sqrt, dim_sqrt)
+        target[2:4, 5:20] = 1.0  # top bar
+        target[2:20, 18:20] = 1.0  # right bar
+        return target.flatten()
+    else:
+        # Default to random
+        return torch.rand(dim_sqrt**2)
+
+
+class Image(Problem):
+    def __init__(self, dim_sqrt: int = 28, target_image_name: str = "mnist_7"):
+        self.dim_sqrt = dim_sqrt
+        dim = dim_sqrt ** 2
+        self.target_image = get_target_image(target_image_name, self.dim_sqrt)
+
+        super().__init__(name="ImageReconstruction", dim=dim, lower_bound=0.0, upper_bound=1.0)
+
+    def _create_objective(self):
+        def objective(pop: torch.Tensor) -> torch.Tensor:
+            # We want to minimize the MSE, but optimizers maximize fitness.
+            # So, fitness is the negative of the MSE.
+            # The target_image needs to be expanded to match the population size for mse_loss.
+            target_expanded = self.target_image.unsqueeze(0).expand(pop.shape[0], -1)
+            mse_per_individual = F.mse_loss(pop, target_expanded, reduction='none').mean(dim=1)
+            return -mse_per_individual
+        return objective
diff --git a/plugins/problems/rl.py b/plugins/problems/rl.py
new file mode 100644
index 0000000..c674d11
--- /dev/null
+++ b/plugins/problems/rl.py
@@ -0,0 +1,131 @@
+from diffevo.problems.base import Problem
+from diffevo.utils import normalize_observation
+import torch
+import gymnasium as gym
+import torch.nn as nn
+
+class ControllerMLP(nn.Module):
+    def __init__(self, dim_in, dim_out, n_hidden, n_hidden_layers=1):
+        super().__init__()
+        hidden_layers = []
+        for _ in range(n_hidden_layers - 1):
+            hidden_layers.append(nn.Linear(n_hidden, n_hidden))
+            hidden_layers.append(nn.ReLU())
+
+        self.mlp = nn.Sequential(
+            nn.Linear(dim_in, n_hidden),
+            nn.ReLU(),
+            *hidden_layers,
+            nn.Linear(n_hidden, dim_out),
+        )
+
+    def forward(self, x):
+        return self.mlp(x)
+
+    def __len__(self):
+        # return the number of parameters
+        return sum(p.numel() for p in self.parameters())
+
+    def fill(self, params):
+        # fill the parameters with a flat tensor
+        if len(params) != len(self):
+            raise ValueError(
+                f"The number of parameters does not match, expected {len(self)} but got {len(params)}"
+            )
+
+        start_idx = 0
+        for p in self.parameters():
+            n = p.numel()
+            p.data.copy_(params[start_idx:start_idx+n].view_as(p))
+            start_idx += n
+
+    @classmethod
+    def from_parameter(cls, dim_in, dim_out, n_hidden, params, n_hidden_layers=1):
+        # create a new instance and fill it with the given parameters
+        instance = cls(dim_in, dim_out, n_hidden, n_hidden_layers=n_hidden_layers)
+        instance.fill(params)
+        return instance
+
+
+class DiscreteController:
+    def __init__(self, model, action_space):
+        self.model = model
+        self.action_space = action_space
+
+    def __call__(self, x):
+        with torch.no_grad():
+            logits = self.model(x)
+            return torch.argmax(logits).item()
+
+
+class ContinuousController:
+    def __init__(self, model, action_space, factor=1):
+        self.model = model
+        self.action_space = action_space
+        self.factor = factor
+
+    def __call__(self, x):
+        with torch.no_grad():
+            result = torch.tanh(self.model(x)).reshape(-1).numpy() * self.factor
+            return result
+
+
+class RL(Problem):
+    def __init__(self, env_name: str = "CartPole-v1", dim_hidden: int = 8, n_hidden_layers: int = 1, factor: float = 1.0):
+        self.env_name = env_name
+        self.env = gym.make(env_name)
+
+        obs_space_dims = self.env.observation_space.shape[0]
+
+        if isinstance(self.env.action_space, gym.spaces.Discrete):
+            self.controller_type = "discrete"
+            action_space_dims = self.env.action_space.n
+        elif isinstance(self.env.action_space, gym.spaces.Box):
+            self.controller_type = "continuous"
+            action_space_dims = self.env.action_space.shape[0]
+        else:
+            raise ValueError(f"Unsupported action space: {self.env.action_space}")
+
+        self.dim_hidden = dim_hidden
+        self.n_hidden_layers = n_hidden_layers
+        self.factor = factor
+
+        # The dimension is the number of parameters in the policy network
+        self.policy = ControllerMLP(obs_space_dims, action_space_dims, dim_hidden, n_hidden_layers)
+        dim = len(self.policy)
+
+        super().__init__(name="ReinforcementLearning", dim=dim, lower_bound=-2.0, upper_bound=2.0)
+
+    def _create_objective(self):
+        def objective(pop: torch.Tensor) -> torch.Tensor:
+            fitness_values = []
+            for individual in pop:
+                # Load the individual's parameters into the policy network
+                self.policy.fill(individual)
+
+                if self.controller_type == "discrete":
+                    controller = DiscreteController(self.policy, self.env.action_space)
+                else:
+                    controller = ContinuousController(self.policy, self.env.action_space, factor=self.factor)
+
+                # Run an episode and get the total reward
+                observation, info = self.env.reset()
+                done = False
+                total_reward = 0
+                while not done:
+                    action = controller(
+                        torch.from_numpy(
+                            normalize_observation(observation, self.env.observation_space)
+                        ).float()
+                    )
+                    observation, reward, terminated, truncated, info = self.env.step(action)
+                    total_reward += reward
+                    done = terminated or truncated
+                fitness_values.append(total_reward)
+
+            return torch.tensor(fitness_values, dtype=torch.float32)
+        return objective
+
+    def __del__(self):
+        if hasattr(self, 'env'):
+            self.env.close()
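
Review note: `DiscreteController` and `ContinuousController` differ only in how they turn the policy network's raw output into an action. The sketch below restates those two rules in plain Python (no torch) so the mapping can be checked by hand. The names `discrete_action` and `continuous_action` are illustrative and not part of this patch.

```python
import math


def discrete_action(logits):
    """Pick the index of the largest logit (first index wins on ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def continuous_action(outputs, factor=1.0):
    """Squash each output with tanh, then scale into [-factor, factor]."""
    return [math.tanh(o) * factor for o in outputs]


# Hypothetical raw network outputs, used only for illustration.
chosen = discrete_action([0.1, 2.0, -1.0])
scaled = continuous_action([0.0, 100.0, -100.0], factor=2.0)
print(chosen, scaled)
```

Because `tanh` saturates at ±1, `factor` is effectively the largest action magnitude the continuous controller can emit, so it should presumably be chosen to match the bounds of the environment's `Box` action space.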
diff --git a/report.py b/report.py
new file mode 100644
index 0000000..cba66e0
--- /dev/null
+++ b/report.py
@@ -0,0 +1,58 @@
+import subprocess
+import sys
+import os
+
+# --- Experiment Sequencing and Reporting Script ---
+
+SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
+VENV_DIR = os.path.join(SCRIPT_DIR, ".venv")
+RUN_PY = os.path.join(SCRIPT_DIR, "run.py")
+
+def get_venv_python():
+    """Returns the path to the python executable in the virtual environment."""
+    if sys.platform != "win32":
+        return os.path.join(VENV_DIR, "bin", "python")
+    return os.path.join(VENV_DIR, "Scripts", "python.exe")
+
+def run_experiment(config_path, smoketest=False):
+    """Runs a single experiment using run.py."""
+    venv_python = get_venv_python()
+    if not os.path.exists(venv_python):
+        print(f"Error: Virtual environment python not found at '{venv_python}'.")
+        print("Please run 'python init.py' first.")
+        sys.exit(1)
+
+    cmd = [venv_python, RUN_PY, config_path]
+    if smoketest:
+        cmd.append('--smoketest')
+
+    print(f"--- Running experiment: {' '.join(cmd)} ---")
+    try:
+        subprocess.run(cmd, check=True, cwd=SCRIPT_DIR)  # relative config paths resolve against the repo root
+        print(f"--- Experiment '{config_path}' completed successfully. ---")
+    except subprocess.CalledProcessError as e:
+        print(f"--- ERROR: Experiment '{config_path}' failed with exit code {e.returncode}. ---", file=sys.stderr)
+    except FileNotFoundError:
+        print(f"--- ERROR: Could not launch interpreter '{venv_python}'. ---", file=sys.stderr)
+
+
+def main():
+    """Runs a sequence of experiments and generates a report."""
+    # For now, we only run the smoketest as a demonstration.
+    # In the future, this could be expanded to run a full suite of experiments.
+    experiments = [
+        'configs/smoketest.yaml'
+    ]
+
+    print("--- Starting experiment sequence for report generation ---")
+
+    for experiment in experiments:
+        run_experiment(experiment, smoketest=True)
+
+    print("\n--- All experiments completed. ---")
+    # In the future, a report generation step would be added here.
+    print("Report generation is not yet implemented.")
+
+
+if __name__ == '__main__':
+    main()
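
Review note: `run_experiment` relies on `check=True` to detect failures, which only works when the child process exits with a non-zero status. `run.py`, as written in this patch, logs exceptions inside `main()` and then returns normally, so its exit status stays 0 even when an experiment fails. The sketch below shows the mechanism in isolation; `run_step` is a hypothetical helper, and the child commands are throwaway one-liners rather than real experiments.

```python
import subprocess
import sys


def run_step(args):
    """Run a child Python process and return (ok, returncode)."""
    try:
        subprocess.run([sys.executable, *args], check=True)
        return True, 0
    except subprocess.CalledProcessError as exc:
        # Raised only when the child exits with a non-zero status.
        return False, exc.returncode


# A child that finishes normally, and one that signals failure explicitly.
ok_result = run_step(["-c", "pass"])
failed_result = run_step(["-c", "import sys; sys.exit(3)"])
print(ok_result, failed_result)
```

One way to make failures visible to `report.py` would be for `run.py` to call `sys.exit(1)` after logging an error; that change is a suggestion and is not part of this patch.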
diff --git a/requirements-dev.txt b/requirements-dev.txt
new file mode 100644
index 0000000..f8d0894
--- /dev/null
+++ b/requirements-dev.txt
@@ -0,0 +1,3 @@
+pytest
+pydantic
+pyyaml
diff --git a/run.py b/run.py
new file mode 100644
index 0000000..36bdee0
--- /dev/null
+++ b/run.py
@@ -0,0 +1,88 @@
+import argparse
+import logging
+import sys
+import os
+import yaml
+from collections.abc import Mapping
+
+# --- Main Application Logic ---
+
+# Add the project root to the Python path
+sys.path.insert(0, os.path.abspath(os.path.dirname(__file__)))
+
+logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+
+def deep_merge(d1, d2):
+    """Recursively merges d2 into d1 in place and returns d1; values from d2 win on conflicts."""
+    for k, v in d2.items():
+        if k in d1 and isinstance(d1[k], Mapping) and isinstance(v, Mapping):
+            d1[k] = deep_merge(d1[k], v)
+        else:
+            d1[k] = v
+    return d1
+
+def load_config(config_path):
+    """Loads a YAML configuration, handling base configurations."""
+    with open(config_path, 'r') as f:
+        config_data = yaml.safe_load(f)
+
+    if 'base' in config_data:
+        base_path = os.path.join(os.path.dirname(config_path), config_data['base'])
+        base_config = load_config(base_path)
+        del config_data['base']
+        return deep_merge(base_config, config_data)
+
+    return config_data
+
+def main():
+    """Main function to run the evaluation script from a config file."""
+    from diffevo.config import ExperimentConfig
+    from diffevo.orchestrator import Orchestrator
+    parser = argparse.ArgumentParser(description='Run a Diffusion Evolution experiment from a configuration file.')
+    parser.add_argument('config_path', type=str, help='Path to the YAML configuration file for the experiment.')
+    parser.add_argument('--output_dir', type=str, default='results', help='The base directory to save experiment results.')
+    parser.add_argument('--smoketest', action='store_true', help='Run in smoketest mode, overriding with smoketest.yaml.')
+    args = parser.parse_args()
+
+    config_path = args.config_path
+    if args.smoketest:
+        config_path = 'configs/smoketest.yaml'
+
+    logging.info(f"Loading configuration from: {config_path}")
+
+    try:
+        config_data = load_config(config_path)
+
+        if args.smoketest:
+            smoketest_overrides = {
+                'optimizer': {
+                    'params': {
+                        'num_step': 1,
+                        'popsize': 10
+                    }
+                },
+                'problem': {
+                    'params': {
+                        'dim': 2
+                    }
+                }
+            }
+            config_data = deep_merge(config_data, smoketest_overrides)
+            config_data['name'] = f"smoketest_{config_data.get('name', 'experiment')}"
+
+        config = ExperimentConfig(**config_data)
+
+        logging.info(f"Successfully loaded configuration: {config.name}")
+
+        orchestrator = Orchestrator(config=config, output_dir=args.output_dir)
+        orchestrator.run()
+
+        logging.info("Experiment run completed successfully.")
+
+    except FileNotFoundError:
+        logging.error(f"Configuration file not found at: {config_path}")
+    except Exception as e:
+        logging.error(f"An error occurred: {e}", exc_info=True)
+
+if __name__ == '__main__':
+    main()
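
Review note: `deep_merge` drives both the `base:` inheritance in `load_config` and the `--smoketest` overrides. The sketch below is a non-mutating variant of the same idea, shown on a made-up config so the merge result is easy to inspect. The config keys and values are hypothetical; only the override shape mirrors `smoketest_overrides` in `run.py`.

```python
from collections.abc import Mapping
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict with `override` recursively merged into `base`."""
    merged = deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(merged.get(key), Mapping) and isinstance(value, Mapping):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the override value replaces whatever was there.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical experiment config (illustrative values only).
config = {
    "name": "rastrigin",
    "optimizer": {"params": {"num_step": 100, "popsize": 512, "temperature": 1.0}},
    "problem": {"params": {"dim": 32}},
}
overrides = {
    "optimizer": {"params": {"num_step": 1, "popsize": 10}},
    "problem": {"params": {"dim": 2}},
}

merged = deep_merge(config, overrides)
print(merged["optimizer"]["params"])
```

Keys that the override does not mention (here `temperature`) survive the merge, which is what lets a child config override a single parameter of its base. Unlike the version in `run.py`, this variant leaves its inputs untouched.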
diff --git a/scripts/generate_report.py b/scripts/generate_report.py
new file mode 100644
index 0000000..90af6af
--- /dev/null
+++ b/scripts/generate_report.py
@@ -0,0 +1,82 @@
+import sys
+import os
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+
+import torch
+import pandas as pd
+from diffevo.utils import top_rewards, point_entropy, avg_group, std_group
+
+
+if __name__ == "__main__":
+    top_k = 64
+
+    methods = [
+        "DiffEvo_benchmark",
+        "LatentDiffEvo_benchmark",
+        "CMAES_benchmark",
+        "PEPG_benchmark",
+        "OpenES_benchmark",
+        "MAPElite_benchmark",
+    ]
+
+    print("Loading records...")
+    all_records = {}
+    for method_name in methods:
+        all_records[method_name] = torch.load(f"./data/records/{method_name}.pt")
+    print("Done!")
+
+    # add title
+    with open("./data/results.md", "w") as f:
+        f.write("# Benchmark Results\n\n")
+
+    entropy_table = pd.DataFrame()
+    entropy_std = pd.DataFrame()
+    for method_name, records in all_records.items():
+        use_x0 = "DiffEvo" in method_name
+        avg = avg_group(
+            point_entropy(records, n=top_k, use_x0=use_x0, name=method_name)
+        )
+        std = std_group(
+            point_entropy(records, n=top_k, use_x0=use_x0, name=method_name)
+        )
+        for k, v in avg.items():
+            entropy_table.loc[k, method_name.replace("_benchmark", "")] = v
+        for k, v in std.items():
+            entropy_std.loc[k, method_name.replace("_benchmark", "")] = v
+
+    # save to ./data/entropy_top_<top_k>.csv
+    entropy_table.to_csv(f"./data/entropy_top_{top_k}.csv")
+    entropy_std.to_csv(f"./data/entropy_std_top_{top_k}.csv")
+
+    fitness_table = pd.DataFrame()
+    fitness_std = pd.DataFrame()
+    for method_name, records in all_records.items():
+        use_x0 = "DiffEvo" in method_name
+        avg = avg_group(top_rewards(records, n=top_k, use_x0=use_x0))
+        std = std_group(top_rewards(records, n=top_k, use_x0=use_x0))
+        for k, v in avg.items():
+            fitness_table.loc[k, method_name.replace("_benchmark", "")] = v
+        for k, v in std.items():
+            fitness_std.loc[k, method_name.replace("_benchmark", "")] = v
+
+    # save to ./data/fitness_top_<top_k>.csv
+    fitness_table.to_csv(f"./data/fitness_top_{top_k}.csv")
+    fitness_std.to_csv(f"./data/fitness_std_top_{top_k}.csv")
+
+    # merge two tables together, each cell is "entropy (fitness)"
+    # use string to format
+    merged_table = pd.DataFrame()
+    for i in range(len(entropy_table)):
+        for j in range(len(entropy_table.columns)):
+            # merged_table.loc[i, j] = f"{entropy_table.iloc[i, j]:.2f} ({entropy_std.iloc[i, j]:.2f}), {fitness_table.iloc[i, j]:.2f} ({fitness_std.iloc[i, j]:.2f})"
+            merged_table.loc[i, j] = (
+                f"{entropy_table.iloc[i, j]:.2f} ({fitness_table.iloc[i, j]:.2f})"
+            )
+    # add row and column index
+    merged_table.index = entropy_table.index
+    merged_table.columns = entropy_table.columns
+
+    with open("./data/results.md", "a") as f:
+        f.write("## Result Table\n\n")
+        f.write("Each cell is entropy (fitness)\n\n")
+        f.write(merged_table.to_markdown(floatfmt=".2f") + "\n\n")
diff --git a/scripts/plot_alphas.py b/scripts/plot_alphas.py
new file mode 100644
index 0000000..42eb909
--- /dev/null
+++ b/scripts/plot_alphas.py
@@ -0,0 +1,136 @@
+import sys
+import os
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+
+import torch
+import matplotlib.pyplot as plt
+import numpy as np
+import os
+
+name_table = {
+    "DDIMSchedulerCosine": "Cosine",
+    "DDPMScheduler": "DDPM",
+    "DDIMScheduler": "Linear",
+}
+
+colors = ["#6F6E6E", "#F5851E", "#343434"]
+
+import matplotlib
+
+matplotlib.rcParams["mathtext.fontset"] = "stix"
+matplotlib.rcParams["font.family"] = "STIXGeneral"
+
+
+def get_avg_fitness(record, idx_experiment, idx_step, top_n=1e9):
+    data = record["records"][idx_experiment][idx_step]["rastrigin"]
+    num_steps = data["arguments"]["num_step"]
+    all_fitnesses = data["fitnesses"][-1]
+    all_fitnesses = all_fitnesses.sort().values
+    top_n = int(min(top_n, len(all_fitnesses)))  # slicing below needs an int
+    top_n_fitnesses = all_fitnesses[-top_n:]
+    return top_n_fitnesses.mean().item(), num_steps
+
+
+def get_step_fitness(record, top_n=1e9):
+    total_step_fitness = []
+    x = []
+    for idx_exp in range(len(record["records"])):
+        y = []
+        for idx_step in range(len(record["records"][idx_exp])):
+            avg, num_steps = get_avg_fitness(record, idx_exp, idx_step, top_n)
+            if idx_exp == 0:  # Only need to collect x once
+                x.append(num_steps)
+            y.append(avg)
+        total_step_fitness.append(y)
+
+    total_step_fitness = np.array(total_step_fitness).mean(axis=0)
+    return total_step_fitness, x, record["scheduler"]
+
+
+def get_step_fitness_std(record, top_n=1e9):
+    total_step_fitness = []
+    x = []
+    for idx_exp in range(len(record["records"])):
+        y = []
+        for idx_step in range(len(record["records"][idx_exp])):
+            avg, num_steps = get_avg_fitness(record, idx_exp, idx_step, top_n)
+            if idx_exp == 0:  # Only need to collect x once
+                x.append(num_steps)
+            y.append(avg)
+        total_step_fitness.append(y)
+    total_step_fitness = np.array(total_step_fitness).std(axis=0)
+    return total_step_fitness, x, record["scheduler"]
+
+
+def main():
+    # Load data
+    folder = "./data/schedulers"
+    schedulers = sorted(os.listdir(folder))  # stable order keeps colors consistent
+    all_records = []
+
+    for scheduler in schedulers:
+        records = torch.load(f"{folder}/{scheduler}")
+        all_records.append(records)
+
+    # Create plot
+    plt.figure(figsize=(8, 3))
+    plt.subplot(1, 2, 2)
+    for idx_scheduler in range(len(all_records)):
+        total_step_fitness, x, scheduler = get_step_fitness(
+            all_records[idx_scheduler], top_n=64
+        )
+        total_step_fitness_std, x_std, scheduler_std = get_step_fitness_std(
+            all_records[idx_scheduler], top_n=64
+        )
+        # Add dots at center points
+        plt.plot(x, total_step_fitness, "o", color=colors[idx_scheduler], markersize=4)
+        # Add dashes at top and bottom of error bars
+        plt.plot(
+            x,
+            total_step_fitness + total_step_fitness_std,
+            "_",
+            color=colors[idx_scheduler],
+            markersize=5,
+        )
+        plt.plot(
+            x,
+            total_step_fitness - total_step_fitness_std,
+            "_",
+            color=colors[idx_scheduler],
+            markersize=5,
+        )
+        # Plot error bars with caps
+        plt.errorbar(
+            x,
+            total_step_fitness,
+            yerr=total_step_fitness_std,
+            label=name_table[scheduler],
+            color=colors[idx_scheduler],
+            capsize=5,
+            capthick=1,
+            elinewidth=1,
+            fmt="-",  # Line only
+            marker="None",
+        )  # No markers on the line
+
+    # Configure plot
+    plt.semilogx()
+    plt.legend(loc="lower right")
+    plt.xlabel("Number of total steps")
+    plt.ylabel("Average fitness (top 64 elites)")
+    plt.title(r"(b) compare performance")
+
+    # Demonstrate different alphas
+    from diffevo.plotting import plot_alpha_schedules
+    ax = plt.subplot(1, 2, 1)
+    plot_alpha_schedules(ax)
+
+    plt.tight_layout()
+    # Save plot
+    os.makedirs("./figures", exist_ok=True)
+    plt.savefig("./figures/alpha.png", dpi=300)
+    plt.savefig("./figures/alpha.pdf", bbox_inches="tight")
+
+
+if __name__ == "__main__":
+    main()
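
Review note: `get_avg_fitness` reports the mean of the `top_n` highest fitness values in the final population. The sketch below reproduces that statistic on a plain list, assuming higher fitness is better as the ascending sort in the script implies. `top_n_mean` is an illustrative name and the sample values are made up.

```python
def top_n_mean(fitnesses, top_n=None):
    """Mean of the `top_n` largest values; all values when `top_n` is None."""
    ranked = sorted(fitnesses)  # ascending, so the elites are at the end
    n = len(ranked) if top_n is None else min(int(top_n), len(ranked))
    if n <= 0:
        raise ValueError("need at least one fitness value and top_n >= 1")
    return sum(ranked[-n:]) / n


# Hypothetical final-step fitness values.
final_fitness = [1.0, 5.0, 3.0, 4.0]
elite_mean = top_n_mean(final_fitness, top_n=2)
overall_mean = top_n_mean(final_fitness)
print(elite_mean, overall_mean)
```

Clamping `top_n` to the population size mirrors the `min(top_n, len(all_fitnesses))` guard in the script, so asking for more elites than exist simply averages the whole population.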
diff --git a/experiments/benchmarks/plot_temperature.py b/scripts/plot_temperature.py
similarity index 55%
rename from experiments/benchmarks/plot_temperature.py
rename to scripts/plot_temperature.py
index 59a60f9..302ec90 100644
--- a/experiments/benchmarks/plot_temperature.py
+++ b/scripts/plot_temperature.py
@@ -1,34 +1,60 @@
+import sys
+import os
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+
 import torch
 import matplotlib.pyplot as plt
 import numpy as np
 import os
-from statistic import point_entropy, avg_group, std_group
+from diffevo.utils import point_entropy, avg_group, std_group
 
 import matplotlib
-matplotlib.rcParams['mathtext.fontset'] = 'stix'
-matplotlib.rcParams['font.family'] = 'STIXGeneral'
+
+matplotlib.rcParams["mathtext.fontset"] = "stix"
+matplotlib.rcParams["font.family"] = "STIXGeneral"
 
 # Constants
-experiment_names = ['rosenbrock', 'beale', 'himmelblau', 'ackley', 'rastrigin', 'rastrigin_4d', 'rastrigin_32d', 'rastrigin_256d']
+experiment_names = [
+    "rosenbrock",
+    "beale",
+    "himmelblau",
+    "ackley",
+    "rastrigin",
+    "rastrigin_4d",
+    "rastrigin_32d",
+    "rastrigin_256d",
+]
 name_display = {
-    'rosenbrock': 'Rosenbrock',
-    'beale': 'Beale',
-    'himmelblau': 'Himmelblau',
-    'ackley': 'Ackley',
-    'rastrigin': r'Rastrigin$^{2}$',
-    'rastrigin_4d': r'Rastrigin$^{4}$',
-    'rastrigin_32d': r'Rastrigin$^{32}$',
-    'rastrigin_256d': r'Rastrigin$^{256}$'
+    "rosenbrock": "Rosenbrock",
+    "beale": "Beale",
+    "himmelblau": "Himmelblau",
+    "ackley": "Ackley",
+    "rastrigin": r"Rastrigin$^{2}$",
+    "rastrigin_4d": r"Rastrigin$^{4}$",
+    "rastrigin_32d": r"Rastrigin$^{32}$",
+    "rastrigin_256d": r"Rastrigin$^{256}$",
 }
-colors = ['#F5851E', '#E93A01', '#6F6E6E', '#800080', '#2B9BBF', '#46B3D5', '#73C5DF', '#94D3E7']
+colors = [
+    "#F5851E",
+    "#E93A01",
+    "#6F6E6E",
+    "#800080",
+    "#2B9BBF",
+    "#46B3D5",
+    "#73C5DF",
+    "#94D3E7",
+]
+
 
 def QD_score(maps):
     return np.sum([p.item() for x, p in maps.values()])
 
+
 def feature_descriptor(x, grid_size=1):
     cls = tuple(torch.round(x * grid_size).long().tolist())
     return cls
 
+
 def QD_score_from_trace(trace, fitness, grid_size=1):
     maps = dict()
     for x, f in zip(trace, fitness):
@@ -41,32 +67,45 @@ def QD_score_from_trace(trace, fitness, grid_size=1):
             maps[cls] = (x, f)
     return QD_score(maps)
 
-def load_data(folder='./data/temperatures/'):
-    files = [f for f in os.listdir(folder) if f.endswith('.pt')]
+
+def load_data(folder="./data/temperatures/"):
+    files = [f for f in os.listdir(folder) if f.endswith(".pt")]
     return [torch.load(os.path.join(folder, f)) for f in files]
 
+
 def plot_boxplots(records, ax=None, savefig=True, legend=True):
     # Process data for boxplots
     temperature_data = {}
     for record in records:
-        temp = record['temperature']
+        temp = record["temperature"]
         if temp not in temperature_data:
             temperature_data[temp] = {}
-        for run in record['records']:
+        for run in record["records"]:
             for exp_name in experiment_names:
                 results = run[exp_name]
                 if exp_name not in temperature_data[temp]:
                     temperature_data[temp][exp_name] = []
-                temperature_data[temp][exp_name].append(results['benchmark_fitness'].mean().item())
+                temperature_data[temp][exp_name].append(
+                    results["benchmark_fitness"].mean().item()
+                )
     if ax is None:
         plt.figure(figsize=(8, 4))
         ax = plt.gca()
-    
+
     positions = []
     data = []
     spacing = 1.5
     curr_pos = 0
-    colors = ['#F5851E', '#E93A01', '#6F6E6E', '#800080', '#2B9BBF', '#46B3D5', '#73C5DF', '#94D3E7']
+    colors = [
+        "#F5851E",
+        "#E93A01",
+        "#6F6E6E",
+        "#800080",
+        "#2B9BBF",
+        "#46B3D5",
+        "#73C5DF",
+        "#94D3E7",
+    ]
 
     # Plot boxplots for each temperature
     for temp in sorted(temperature_data.keys()):
@@ -80,24 +119,24 @@ def plot_boxplots(records, ax=None, savefig=True, legend=True):
 
     # Color the boxplots
     num_experiments = len(experiment_names)
-    for i in range(len(bp['boxes'])):
+    for i in range(len(bp["boxes"])):
         color_idx = i % num_experiments
-        bp['boxes'][i].set_facecolor(colors[color_idx])
+        bp["boxes"][i].set_facecolor(colors[color_idx])
         # set line color to the same color
-        bp['boxes'][i].set_edgecolor(colors[color_idx])
+        bp["boxes"][i].set_edgecolor(colors[color_idx])
         # set whiskers color
-        bp['whiskers'][2*i].set_color(colors[color_idx])
-        bp['whiskers'][2*i+1].set_color(colors[color_idx])
+        bp["whiskers"][2 * i].set_color(colors[color_idx])
+        bp["whiskers"][2 * i + 1].set_color(colors[color_idx])
         # set cap color
-        bp['caps'][2*i].set_color(colors[color_idx])
-        bp['caps'][2*i+1].set_color(colors[color_idx])
+        bp["caps"][2 * i].set_color(colors[color_idx])
+        bp["caps"][2 * i + 1].set_color(colors[color_idx])
         # set outlier color
-        bp['fliers'][i].set_markeredgecolor(colors[color_idx])
+        bp["fliers"][i].set_markeredgecolor(colors[color_idx])
         # set median line color to white
-        bp['medians'][i].set_color('white')
+        bp["medians"][i].set_color("white")
 
     # ax.set_xlabel('Temperature')
-    ax.set_ylabel('(a) Final Fitness')
+    ax.set_ylabel("(a) Final Fitness")
 
     # Add temperature labels and vertical lines
     temp_positions = []
@@ -106,10 +145,14 @@ def plot_boxplots(records, ax=None, savefig=True, legend=True):
         center = curr_pos + (len(experiment_names) - 1) / 2
         temp_positions.append(center)
         if curr_pos > 0:  # Add vertical line before each temperature group except first
-            plt.axvline(x=curr_pos - spacing/2, color='gray', linestyle='--', alpha=0.5)
+            plt.axvline(
+                x=curr_pos - spacing / 2, color="gray", linestyle="--", alpha=0.5
+            )
         curr_pos += len(experiment_names) + spacing
 
-    ax.set_xticks(temp_positions, [f'$T={t:.1f}$' for t in sorted(temperature_data.keys())])
+    ax.set_xticks(
+        temp_positions, [f"$T={t:.1f}$" for t in sorted(temperature_data.keys())]
+    )
 
     # add joint lines for each experiment
     for exp_idx, exp_name in enumerate(experiment_names):
@@ -122,37 +165,47 @@ def plot_boxplots(records, ax=None, savefig=True, legend=True):
             exp_positions.append(exp_pos)
             exp_medians.append(np.median(temperature_data[temp][exp_name]))
             curr_pos += len(temperature_data[temp]) + spacing
-        ax.plot(exp_positions, exp_medians, color=colors[exp_idx], linestyle='-', alpha=0.5)
+        ax.plot(
+            exp_positions, exp_medians, color=colors[exp_idx], linestyle="-", alpha=0.5
+        )
 
     # Add legend for experiments
-    legend_elements = [plt.Rectangle((0,0),1,1, facecolor=colors[experiment_names.index(exp)]) 
-                      for exp in experiment_names]
+    legend_elements = [
+        plt.Rectangle((0, 0), 1, 1, facecolor=colors[experiment_names.index(exp)])
+        for exp in experiment_names
+    ]
     if legend:
-        ax.legend(legend_elements, [name_display[exp] for exp in experiment_names],
-                 loc='upper right', bbox_to_anchor=(1, 1), fontsize='small')
+        ax.legend(
+            legend_elements,
+            [name_display[exp] for exp in experiment_names],
+            loc="upper right",
+            bbox_to_anchor=(1, 1),
+            fontsize="small",
+        )
 
     # plt.tight_layout()
     if savefig:
-        plt.savefig('./figures/temperature_boxplot.png', dpi=300)
-        plt.savefig('./figures/temperature_boxplot.pdf', bbox_inches='tight')
+        plt.savefig("./figures/temperature_boxplot.png", dpi=300)
+        plt.savefig("./figures/temperature_boxplot.pdf", bbox_inches="tight")
         plt.close()
 
+
 def plot_qd_scores(records, ax=None, savefig=True, legend=True):
     if ax is None:
         plt.figure(figsize=(10, 5))
         ax = plt.gca()
-    
+
     qd_scores = {}
     for record in records:
-        temp = record['temperature']
+        temp = record["temperature"]
         if temp not in qd_scores:
             qd_scores[temp] = {}
-        for run in record['records']:
+        for run in record["records"]:
             for exp_name in experiment_names:
                 if exp_name not in qd_scores[temp]:
                     qd_scores[temp][exp_name] = []
-                X = run[exp_name]['trace'][-1]
-                fitness = run[exp_name]['benchmark_fitness']
+                X = run[exp_name]["trace"][-1]
+                fitness = run[exp_name]["benchmark_fitness"]
                 qd_score = QD_score_from_trace(X, fitness)
                 qd_scores[temp][exp_name].append(qd_score)
 
@@ -164,38 +217,46 @@ def plot_qd_scores(records, ax=None, savefig=True, legend=True):
         # worst_score = min(scores)
         # scores = [(s - worst_score) / (best_score - worst_score) for s in scores]
         std_scores = [np.std(qd_scores[t][exp_name]) for t in temps]
-        
-        ax.plot(temps, scores, '.-', label=name_display[exp_name],
-                color=colors[experiment_names.index(exp_name)])
-        ax.fill_between(temps, 
-                        [s - std for s, std in zip(scores, std_scores)],
-                        [s + std for s, std in zip(scores, std_scores)],
-                        color=colors[experiment_names.index(exp_name)],
-                        alpha=0.2)
-
-    ax.set_xlabel('Temperature')
-    ax.set_ylabel('(c) QD-Score')
+
+        ax.plot(
+            temps,
+            scores,
+            ".-",
+            label=name_display[exp_name],
+            color=colors[experiment_names.index(exp_name)],
+        )
+        ax.fill_between(
+            temps,
+            [s - std for s, std in zip(scores, std_scores)],
+            [s + std for s, std in zip(scores, std_scores)],
+            color=colors[experiment_names.index(exp_name)],
+            alpha=0.2,
+        )
+
+    ax.set_xlabel("Temperature")
+    ax.set_ylabel("(c) QD-Score")
     if legend:
         ax.legend()
     ax.semilogx()
     if savefig:
-        plt.savefig('./figures/temperature_qd_scores.png', dpi=300)
-        plt.savefig('./figures/temperature_qd_scores.pdf', bbox_inches='tight')
+        plt.savefig("./figures/temperature_qd_scores.png", dpi=300)
+        plt.savefig("./figures/temperature_qd_scores.pdf", bbox_inches="tight")
         plt.close()
 
+
 def plot_entropy(records, ax=None, savefig=True, legend=True):
     if ax is None:
         plt.figure(figsize=(10, 5))
         ax = plt.gca()
-    
+
     # Calculate entropy for each temperature
     entropy_table = []
     std_table = []
     temperature_list = []
     for record in records:
-        avg_entropy = avg_group(point_entropy(record['records'], n=64))
-        std_entropy = std_group(point_entropy(record['records'], n=64))
-        temperature_list.append(record['temperature'])
+        avg_entropy = avg_group(point_entropy(record["records"], n=64))
+        std_entropy = std_group(point_entropy(record["records"], n=64))
+        temperature_list.append(record["temperature"])
         entropy_table.append(list(avg_entropy.values()))
         std_table.append(list(std_entropy.values()))
 
@@ -211,24 +272,31 @@ def plot_entropy(records, ax=None, savefig=True, legend=True):
 
     # Create entropy plot
     for i in range(entropy_table.shape[1]):
-        ax.plot(temperature_list, entropy_table[:, i], '.-',
-                label=name_display[experiment_names[i]], 
-                color=colors[i])
-        ax.fill_between(temperature_list,
-                       entropy_table[:, i] - std_table[:, i],
-                       entropy_table[:, i] + std_table[:, i],
-                       color=colors[i],
-                       alpha=0.2)
+        ax.plot(
+            temperature_list,
+            entropy_table[:, i],
+            ".-",
+            label=name_display[experiment_names[i]],
+            color=colors[i],
+        )
+        ax.fill_between(
+            temperature_list,
+            entropy_table[:, i] - std_table[:, i],
+            entropy_table[:, i] + std_table[:, i],
+            color=colors[i],
+            alpha=0.2,
+        )
     if legend:
         ax.legend()
-    ax.set_xlabel('Temperature')
-    ax.set_ylabel('(b) Entropy')
+    ax.set_xlabel("Temperature")
+    ax.set_ylabel("(b) Entropy")
     ax.semilogx()
     if savefig:
-        plt.savefig('./figures/temperature_entropy.png', dpi=300)
-        plt.savefig('./figures/temperature_entropy.pdf', bbox_inches='tight')
+        plt.savefig("./figures/temperature_entropy.png", dpi=300)
+        plt.savefig("./figures/temperature_entropy.pdf", bbox_inches="tight")
         plt.close()
 
+
 def combined_plot(records, ax=None, savefig=True):
     """combine boxplot, entropy, and qd-score plots
     Structure:
@@ -238,56 +306,66 @@ def combined_plot(records, ax=None, savefig=True):
     # Create figure with 2x2 grid
     scale = 0.85
     fig = plt.figure(figsize=(10 * scale, 5 * scale))
-    
+
     # Import gridspec if not already imported
     from matplotlib import gridspec
-    
+
     # Create main gridspec with proper spacing
     gs = gridspec.GridSpec(2, 1, height_ratios=[1, 1], hspace=0.3)
-    
+
     # Create sub-gridspecs for each row
-    gs_top = gridspec.GridSpecFromSubplotSpec(1, 2, subplot_spec=gs[0], width_ratios=[6, 1], wspace=0.05)
-    gs_bottom = gridspec.GridSpecFromSubplotSpec(1, 2, subplot_spec=gs[1], width_ratios=[1, 1], wspace=0.2)
-    
+    gs_top = gridspec.GridSpecFromSubplotSpec(
+        1, 2, subplot_spec=gs[0], width_ratios=[6, 1], wspace=0.05
+    )
+    gs_bottom = gridspec.GridSpecFromSubplotSpec(
+        1, 2, subplot_spec=gs[1], width_ratios=[1, 1], wspace=0.2
+    )
+
     # First row - boxplot on left (75%)
     ax_box = fig.add_subplot(gs_top[0])
     plot_boxplots(records, ax=ax_box, savefig=False, legend=False)
-    
+
     # First row - legend on right (25%)
     ax_legend = fig.add_subplot(gs_top[1])
-    ax_legend.axis('off')
-    legend_elements = [plt.Rectangle((0,0),1,1, facecolor=colors[experiment_names.index(exp)]) 
-                      for exp in experiment_names]
-    ax_legend.legend(legend_elements, [name_display[exp] for exp in experiment_names],
-                    loc='center', fontsize='small')
-    
+    ax_legend.axis("off")
+    legend_elements = [
+        plt.Rectangle((0, 0), 1, 1, facecolor=colors[experiment_names.index(exp)])
+        for exp in experiment_names
+    ]
+    ax_legend.legend(
+        legend_elements,
+        [name_display[exp] for exp in experiment_names],
+        loc="center",
+        fontsize="small",
+    )
+
     # Second row - entropy plot on left (50%)
     ax_entropy = fig.add_subplot(gs_bottom[0])
     plot_entropy(records, ax=ax_entropy, savefig=False, legend=False)
-    
+
     # Second row - QD score plot on right (50%)
     ax_qd = fig.add_subplot(gs_bottom[1])
     plot_qd_scores(records, ax=ax_qd, savefig=False, legend=False)
 
     if savefig:
-        plt.savefig('./figures/temperature_combined.png', dpi=300, bbox_inches='tight')
-        plt.savefig('./figures/temperature_combined.pdf', bbox_inches='tight')
+        plt.savefig("./figures/temperature_combined.png", dpi=300, bbox_inches="tight")
+        plt.savefig("./figures/temperature_combined.pdf", bbox_inches="tight")
         plt.close()
 
+
 def main():
     # Create figures directory if it doesn't exist
-    os.makedirs('./figures', exist_ok=True)
-    
+    os.makedirs("./figures", exist_ok=True)
+
     # Load data
     records = load_data()
-    
-    
-    
+
     # Create all plots
     plot_boxplots(records)
     plot_entropy(records)
     plot_qd_scores(records)
     combined_plot(records)
 
+
 if __name__ == "__main__":
-    main()
\ No newline at end of file
+    main()
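
Review note: the QD-score in `plot_temperature.py` bins each individual into a grid cell via `feature_descriptor` and sums one fitness value per occupied cell. The body of `QD_score_from_trace` is partly outside the hunks shown here, so the sketch below assumes the elided lines keep the highest-fitness individual per cell. It uses plain tuples instead of tensors, and the sample points are made up.

```python
def feature_descriptor(x, grid_size=1):
    """Map a point to its grid cell by rounding each scaled coordinate."""
    return tuple(round(v * grid_size) for v in x)


def qd_score(points, fitness, grid_size=1):
    """Sum of the best fitness found in each occupied grid cell."""
    best = {}
    for x, f in zip(points, fitness):
        cell = feature_descriptor(x, grid_size)
        # ASSUMPTION: a cell keeps its highest-fitness occupant.
        if cell not in best or f > best[cell]:
            best[cell] = f
    return sum(best.values())


# Hypothetical 2-D population: two points fall in cell (0, 0), two in (1, 1).
points = [(0.1, 0.2), (0.4, -0.3), (1.2, 0.9), (0.9, 1.1)]
fitness = [0.5, 0.8, 0.3, 0.6]
score = qd_score(points, fitness)
print(score)
```

A larger `grid_size` makes the cells finer, so the same population occupies more cells and the score rewards spatial spread more strongly.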
diff --git a/setup.py b/setup.py
index d4bcd03..1719027 100644
--- a/setup.py
+++ b/setup.py
@@ -17,9 +17,30 @@
     },
     classifiers=[
         "Programming Language :: Python :: 3",
-        "License :: Other/Proprietary License",
+        "License :: OSI Approved :: Apache Software License",
         "Operating System :: OS Independent",
     ],
-    packages=['diffevo'],
+    package_dir={"": "src"},
+    packages=setuptools.find_packages(where="src"),
     python_requires=">=3.6",
-)
\ No newline at end of file
+    install_requires=[
+        'cma',
+        'gymnasium',
+        'pygame',
+        'tqdm',
+        'matplotlib',
+        'numpy==1.26.4',
+        'torch',
+        'torchvision',
+        'torchaudio',
+        'pandas',
+        'foobench @ git+https://github.com/bhartl/foobench.git',
+        'pydantic',
+        'pyyaml',
+    ],
+    entry_points={
+        "console_scripts": [
+            "diffevo-evaluate = run_evaluation:main",
+        ],
+    },
+)
diff --git a/src/diffevo/__init__.py b/src/diffevo/__init__.py
new file mode 100644
index 0000000..a524c03
--- /dev/null
+++ b/src/diffevo/__init__.py
@@ -0,0 +1,24 @@
+# __init__.py for the diffevo package
+
+from .config import ExperimentConfig
+from .orchestrator import Orchestrator
+from .generators import BayesianGenerator, LatentBayesianGenerator
+from .utils import RandomProjection
+from .schedulers import DDIMSchedulerCosine
+
+def run(config: ExperimentConfig, output_dir: str = "results"):
+    """
+    Programmatic API entry point for running an experiment.
+
+    Args:
+        config (ExperimentConfig): The experiment configuration object.
+        output_dir (str): The base directory to save experiment results.
+
+    Returns:
+        A tuple containing:
+        - output_dir (str): The path to the directory where results were saved.
+        - final_populations (list): A list of the final population from each run.
+    """
+    orchestrator = Orchestrator(config=config, output_dir=output_dir)
+    output_dir, final_populations = orchestrator.run()
+    return output_dir, final_populations
diff --git a/src/diffevo/callbacks.py b/src/diffevo/callbacks.py
new file mode 100644
index 0000000..83be68e
--- /dev/null
+++ b/src/diffevo/callbacks.py
@@ -0,0 +1,17 @@
+from abc import ABC
+
+class Callback(ABC):
+    """
+    Abstract base class for a callback.
+    """
+    def on_experiment_start(self, orchestrator):
+        """Called at the beginning of an experiment."""
+        pass
+
+    def on_step_end(self, orchestrator, step_data):
+        """Called at the end of each optimization step."""
+        pass
+
+    def on_experiment_end(self, orchestrator):
+        """Called at the end of an experiment."""
+        pass
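A minimal sketch of how a subclass might use these hooks. `Callback` is re-declared here only so the snippet runs on its own. `StepLogger`, `FakeOrchestrator`, and the `"best_fitness"` key in `step_data` are hypothetical, since this diff does not show what the orchestrator actually passes to `on_step_end`.

```python
from abc import ABC


class Callback(ABC):
    """Stand-in mirroring the hooks of src/diffevo/callbacks.py."""

    def on_experiment_start(self, orchestrator):
        pass

    def on_step_end(self, orchestrator, step_data):
        pass

    def on_experiment_end(self, orchestrator):
        pass


class StepLogger(Callback):
    """Hypothetical callback that records one value per optimization step."""

    def __init__(self):
        self.history = []

    def on_experiment_start(self, orchestrator):
        # Reset state so the same instance can be reused across runs.
        self.history = []

    def on_step_end(self, orchestrator, step_data):
        # "best_fitness" is an assumed key, not taken from the repository.
        self.history.append(step_data.get("best_fitness"))


class FakeOrchestrator:
    """Hypothetical driver that fires the hooks in the documented order."""

    def __init__(self, callbacks):
        self.callbacks = callbacks

    def run(self, fitness_per_step):
        for cb in self.callbacks:
            cb.on_experiment_start(self)
        for value in fitness_per_step:
            for cb in self.callbacks:
                cb.on_step_end(self, {"best_fitness": value})
        for cb in self.callbacks:
            cb.on_experiment_end(self)


logger = StepLogger()
FakeOrchestrator([logger]).run([0.1, 0.4, 0.9])
print(logger.history)
```

Because every hook has a no-op default, a subclass only overrides the events it cares about. `StepLogger` leaves `on_experiment_end` untouched.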
diff --git a/src/diffevo/config.py b/src/diffevo/config.py
new file mode 100644
index 0000000..47266ea
--- /dev/null
+++ b/src/diffevo/config.py
@@ -0,0 +1,19 @@
+from pydantic import BaseModel
+from typing import Dict, Any, List
+
+class OptimizerConfig(BaseModel):
+    module: str
+    class_name: str
+    params: Dict[str, Any] = {}
+
+class ProblemConfig(BaseModel):
+    name: str
+    params: Dict[str, Any] = {}
+
+class ExperimentConfig(BaseModel):
+    name: str
+    optimizer: OptimizerConfig
+    problem: ProblemConfig
+    seed: int
+    num_runs: int
+    callbacks: List[str]
diff --git a/experiments/benchmarks/methods/es/__init__.py b/src/diffevo/es/__init__.py
similarity index 52%
rename from experiments/benchmarks/methods/es/__init__.py
rename to src/diffevo/es/__init__.py
index 2affc05..b876bd2 100644
--- a/experiments/benchmarks/methods/es/__init__.py
+++ b/src/diffevo/es/__init__.py
@@ -1,2 +1,2 @@
 from .cmaes import CMAES
-from .pepg import PEPG
\ No newline at end of file
+from .pepg import PEPG
diff --git a/experiments/benchmarks/methods/es/cmaes.py b/src/diffevo/es/cmaes.py
similarity index 67%
rename from experiments/benchmarks/methods/es/cmaes.py
rename to src/diffevo/es/cmaes.py
index 032461f..23988d0 100644
--- a/experiments/benchmarks/methods/es/cmaes.py
+++ b/src/diffevo/es/cmaes.py
@@ -11,32 +11,37 @@ class CMAES:
     From HADES package, author: Benedikt Hartl
     """
 
-    def __init__(self, num_params,
-                 sigma_init=1.0,
-                 popsize=255,
-                 weight_decay=0.01,
-                 reg='l2',
-                 x0=None,
-                 inopts=None
-                 ):
+    def __init__(
+        self,
+        num_params,
+        sigma_init=1.0,
+        popsize=255,
+        weight_decay=0.01,
+        reg="l2",
+        x0=None,
+        inopts=None,
+    ):
-        """Constructs a CMA-ES solver, based on Hannsen's `cma` module.
+        """Constructs a CMA-ES solver, based on Hansen's `cma` module.
-
         :param num_params: number of model parameters.
         :param sigma_init: initial standard deviation.
         :param popsize: population size.
         :param weight_decay: weight decay coefficient.
-        :param reg: Choice between 'l2' or 'l1' norm for weight decay regularization.
-        :param inopts: dict-like CMAOptions, forwarded to cma.CMAEvolutionStrategy constructor).
-        :param x0: (Optional) either (i) a single or (ii) several initial guesses for a good solution,
-                   defaults to None (initialize via `np.zeros(num_parameters)`).
+        :param reg: Choice between 'l2' or 'l1' norm for weight decay
+                    regularization.
+        :param inopts: dict-like CMAOptions, forwarded to the
+                       cma.CMAEvolutionStrategy constructor.
+        :param x0: (Optional) either (i) a single or (ii) several initial
+                   guesses for a good solution, defaults to None
+                   (initialize via `np.zeros(num_parameters)`).
                    In case (i), the population is seeded with x0.
-                   In case (ii), the population is seeded with mean(x0, axis=0) and x0 is subsequently injected.
+                   In case (ii), the population is seeded with mean(x0, axis=0)
+                   and x0 is subsequently injected.
         """
 
         self.popsize = popsize
 
         inopts = inopts or {}
-        inopts['popsize'] = self.popsize
+        inopts["popsize"] = self.popsize
 
         self.num_params = num_params
         self.sigma_init = sigma_init
@@ -57,13 +62,14 @@ def __init__(self, num_params,
 
         # INITIALIZE
         import cma
+
         self.cma = cma.CMAEvolutionStrategy(x0, self.sigma_init, inopts)
 
         if inject_solutions is not None:
             if len(inject_solutions) == self.popsize:
                 self.flush(inject_solutions)
             else:
-                self.inject(inject_solutions)  # INJECT POTENTIALLY PROVIDED SOLUTIONS
+                self.inject(inject_solutions)
 
     def inject(self, solutions=None):
         if solutions is not None:
@@ -78,7 +84,7 @@ def rms_stdev(self):
         return np.mean(np.sqrt(sigma * sigma))
 
     def ask(self):
-        '''returns a list of parameters'''
+        """returns a list of parameters"""
         self.solutions = np.array(self.cma.ask())
         return torch.tensor(self.solutions)
 
@@ -89,29 +95,33 @@ def tell(self, reward_table_result):
             reward_table = reward_table_result.clone()
 
         if self.weight_decay > 0:
-            reg = utils.compute_weight_decay(self.weight_decay, self.solutions, reg=self.reg)
+            reg = utils.compute_weight_decay(
+                self.weight_decay, self.solutions, reg=self.reg
+            )
             reward_table += reg
 
         try:
             reward_table = reward_table.numpy()
-        except:
+        except Exception:
             reward_table = reward_table.cpu().numpy()
 
-        self.cma.tell(self.solutions, (-reward_table).tolist())  # convert minimizer to maximizer.
+        self.cma.tell(self.solutions, (-reward_table).tolist())
 
-        fitness_argsort = np.argsort(reward_table)[::-1]  # sort in descending order
+        fitness_argsort = np.argsort(reward_table)[::-1]
         self.fitness = reward_table[fitness_argsort]
         self.solutions = self.solutions[fitness_argsort]
 
     def current_param(self):
-        return self.cma.result[5]  # mean solution, presumably better with noise
+        return self.cma.result[5]
 
     def set_mu(self, mu):
         pass
 
     def best_param(self):
-        return self.cma.result[0]  # best evaluated solution
+        return self.cma.result[0]
 
-    def result(self):  # return best params so far, along with historically best reward, curr reward, sigma
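+    def result(self):
+        # Returns (best params, best reward, best reward again in the
+        # "current reward" slot, stds); cma minimizes, hence the negation.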
         r = self.cma.result
-        return r[0], -r[1], -r[1], r[6]
\ No newline at end of file
+        return r[0], -r[1], -r[1], r[6]
diff --git a/experiments/benchmarks/methods/es/pepg.py b/src/diffevo/es/pepg.py
similarity index 68%
rename from experiments/benchmarks/methods/es/pepg.py
rename to src/diffevo/es/pepg.py
index e69b543..fad97ff 100644
--- a/experiments/benchmarks/methods/es/pepg.py
+++ b/src/diffevo/es/pepg.py
@@ -1,35 +1,35 @@
-import torch
 import numpy as np
-from torch import Tensor
 from . import utils
 
 
 class PEPG:
-    '''
+    """
     Extension of PEPG with bells and whistles.
 
     From HADES package, author: Benedikt Hartl
-    '''
-    def __init__(self, num_params,
-                 sigma_init=1.0,
-                 sigma_alpha=0.20,
-                 sigma_decay=0.999,
-                 sigma_limit=0.01,
-                 sigma_max_change=0.2,
-                 learning_rate=0.01,
-                 learning_rate_decay=0.9999,
-                 learning_rate_limit=0.01,
-                 elite_ratio=0,
-                 popsize=256,
-                 average_baseline=True,
-                 weight_decay=0.01,
-                 reg='l2',
-                 rank_fitness=True,
-                 forget_best=True,
-                 x0=None,
-                 ):  #
-        """ Constructs a `PEPG` solver instance.
-
+    """
+
+    def __init__(
+        self,
+        num_params,
+        sigma_init=1.0,
+        sigma_alpha=0.20,
+        sigma_decay=0.999,
+        sigma_limit=0.01,
+        sigma_max_change=0.2,
+        learning_rate=0.01,
+        learning_rate_decay=0.9999,
+        learning_rate_limit=0.01,
+        elite_ratio=0,
+        popsize=256,
+        average_baseline=True,
+        weight_decay=0.01,
+        reg="l2",
+        rank_fitness=True,
+        forget_best=True,
+        x0=None,
+    ):
+        """Constructs a `PEPG` solver instance.
         :param num_params: number of model parameters.
         :param sigma_init: initial standard deviation.
         :param sigma_alpha: learning rate for standard deviation.
@@ -43,10 +43,12 @@ def __init__(self, num_params,
         :param popsize: population size.
         :param average_baseline: set baseline to average of batch.
         :param weight_decay: weight decay coefficient.
-        :param reg: Choice between 'l2' or 'l1' norm for weight decay regularization.
+        :param reg: Choice between 'l2' or 'l1' norm for weight decay
+                    regularization.
         :param rank_fitness: use rank rather than fitness numbers.
         :param forget_best: don't keep the historical best solution.
-        :param x0: initial guess for a good solution, defaults to None (initialize via np.zeros(num_parameters)).
+        :param x0: initial guess for a good solution, defaults to None
+                   (initialize via np.zeros(num_parameters)).
         """
 
         self.num_params = num_params
@@ -61,10 +63,10 @@ def __init__(self, num_params,
         self.popsize = popsize
         self.average_baseline = average_baseline
         if self.average_baseline:
-            assert (self.popsize % 2 == 0), "Population size must be even"
+            assert self.popsize % 2 == 0, "Population size must be even"
             self.batch_size = int(self.popsize / 2)
         else:
-            assert (self.popsize & 1), "Population size must be odd"
+            assert self.popsize & 1, "Population size must be odd"
             self.batch_size = int((self.popsize - 1) / 2)
 
         # option to use greedy es method to select next mu, rather than using drift param
@@ -78,7 +80,9 @@ def __init__(self, num_params,
         self.batch_reward = np.zeros(self.batch_size * 2)
 
         # BH: ADDING option to start from prior solution
-        self.mu = np.zeros(self.num_params) if x0 is None else np.asarray(x0)  # np.zeros(self.num_params)
+        self.mu = (
+            np.zeros(self.num_params) if x0 is None else np.asarray(x0)
+        )  # np.zeros(self.num_params)
         self.best_mu = np.copy(self.mu[0])  # np.zeros(self.num_params)
         self.curr_best_mu = np.copy(self.mu[0])  # np.zeros(self.num_params)
 
@@ -91,29 +95,37 @@ def __init__(self, num_params,
         if self.rank_fitness:
             self.forget_best = True  # always forget the best one if we rank
         # choose optimizer
-        self.optimizer = utils.Adam(mu=self.best_mu, num_params=num_params, stepsize=learning_rate)
+        self.optimizer = utils.Adam(
+            mu=self.best_mu, num_params=num_params, stepsize=learning_rate
+        )
 
     def rms_stdev(self):
         sigma = self.sigma
         return np.mean(np.sqrt(sigma * sigma))
 
     def ask(self):
-        '''returns a list of parameters'''
+        """returns a list of parameters"""
         # antithetic sampling
-        self.epsilon = np.random.randn(self.batch_size, self.num_params) * self.sigma.reshape(1, self.num_params)
-        self.epsilon_full = np.concatenate([self.epsilon, - self.epsilon])
+        self.epsilon = np.random.randn(
+            self.batch_size, self.num_params
+        ) * self.sigma.reshape(1, self.num_params)
+        self.epsilon_full = np.concatenate([self.epsilon, -self.epsilon])
         if self.average_baseline:
             epsilon = self.epsilon_full
         else:
             # first population is mu, then positive epsilon, then negative epsilon
-            epsilon = np.concatenate([np.zeros((1, self.num_params)), self.epsilon_full])
+            epsilon = np.concatenate(
+                [np.zeros((1, self.num_params)), self.epsilon_full]
+            )
         solutions = self.mu.reshape(1, self.num_params) + epsilon
         self.solutions = solutions
         return solutions
 
     def tell(self, reward_table_result):
         # input must be a numpy float array
-        assert (len(reward_table_result) == self.popsize), "Inconsistent reward_table size reported."
+        assert (
+            len(reward_table_result) == self.popsize
+        ), "Inconsistent reward_table size reported."
 
         reward_table = np.array(reward_table_result)
 
@@ -121,7 +133,9 @@ def tell(self, reward_table_result):
             reward_table = utils.compute_centered_ranks(reward_table)
 
         if self.weight_decay > 0:
-            reg = utils.compute_weight_decay(self.weight_decay, self.solutions, reg=self.reg)
+            reg = utils.compute_weight_decay(
+                self.weight_decay, self.solutions, reg=self.reg
+            )
             reward_table += reg
 
         reward_offset = 1
@@ -133,12 +147,12 @@ def tell(self, reward_table_result):
 
         reward = reward_table[reward_offset:]
         if self.use_elite:
-            idx = np.argsort(reward)[::-1][0:self.elite_popsize]
+            idx = np.argsort(reward)[::-1][0 : self.elite_popsize]
         else:
             idx = np.argsort(reward)[::-1]
 
         best_reward = reward[idx[0]]
-        if (best_reward > b or self.average_baseline):
+        if best_reward > b or self.average_baseline:
             best_mu = self.mu + self.epsilon_full[idx[0]]
             best_reward = reward[idx[0]]
         else:
@@ -168,34 +182,34 @@ def tell(self, reward_table_result):
         if self.use_elite:
             self.mu += self.epsilon_full[idx].mean(axis=0)
         else:
-            rT = (reward[:self.batch_size] - reward[self.batch_size:])
+            rT = reward[: self.batch_size] - reward[self.batch_size :]
             change_mu = np.dot(rT, epsilon)
             self.optimizer.stepsize = self.learning_rate
-            update_ratio = self.optimizer.update(-change_mu)  # adam, rmsprop, momentum, etc.
-            # self.mu += (change_mu * self.learning_rate) # normal SGD method
+            self.optimizer.update(-change_mu)
 
         # adaptive sigma
         # normalization
-        if (self.sigma_alpha > 0):
+        if self.sigma_alpha > 0:
             stdev_reward = 1.0
             if not self.rank_fitness:
                 stdev_reward = reward.std()
-            S = ((epsilon * epsilon - (sigma * sigma).reshape(1, self.num_params)) / sigma.reshape(1, self.num_params))
-            reward_avg = (reward[:self.batch_size] + reward[self.batch_size:]) / 2.0
+            S = (epsilon**2 - sigma**2) / sigma
+            reward_avg = (reward[: self.batch_size] + reward[self.batch_size :]) / 2.0
             rS = reward_avg - b
             delta_sigma = (np.dot(rS, S)) / (2 * self.batch_size * stdev_reward)
 
-            # adjust sigma according to the adaptive sigma calculation
-            # for stability, don't let sigma move more than 10% of orig value
             change_sigma = self.sigma_alpha * delta_sigma
             change_sigma = np.minimum(change_sigma, self.sigma_max_change * self.sigma)
-            change_sigma = np.maximum(change_sigma, - self.sigma_max_change * self.sigma)
+            change_sigma = np.maximum(change_sigma, -self.sigma_max_change * self.sigma)
             self.sigma += change_sigma
 
-        if (self.sigma_decay < 1):
+        if self.sigma_decay < 1:
             self.sigma[self.sigma > self.sigma_limit] *= self.sigma_decay
 
-        if (self.learning_rate_decay < 1 and self.learning_rate > self.learning_rate_limit):
+        if (
+            self.learning_rate_decay < 1
+            and self.learning_rate > self.learning_rate_limit
+        ):
             self.learning_rate *= self.learning_rate_decay
 
     def flush(self, solutions):
@@ -210,5 +224,7 @@ def set_mu(self, mu):
     def best_param(self):
         return self.best_mu
 
-    def result(self):  # return best params so far, along with historically best reward, curr reward, sigma
+    def result(self):
+        # Return the best params so far, along with the historically best
+        # reward, the current reward, and sigma.
         return (self.best_mu, self.best_reward, self.curr_best_reward, self.sigma)
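The antithetic sampling in `PEPG.ask` can be illustrated with a standard-library sketch. It assumes the `average_baseline=True` branch, where the population is exactly the mirrored perturbations around `mu`. The function name `antithetic_population` and the use of `random.Random` in place of `np.random.randn` are choices made for this illustration, not part of the package.

```python
import random


def antithetic_population(mu, sigma, batch_size, rng):
    """Return 2 * batch_size candidates mirrored around ``mu``.

    Each perturbation eps is paired with -eps, as in ``PEPG.ask`` when
    ``average_baseline`` is true.
    """
    # One row of scaled Gaussian noise per batch member.
    eps = [[rng.gauss(0.0, 1.0) * s for s in sigma] for _ in range(batch_size)]
    # Positive perturbations first, then their negations.
    eps_full = eps + [[-e for e in row] for row in eps]
    return [[m + e for m, e in zip(mu, row)] for row in eps_full]


rng = random.Random(0)
mu = [1.0, -2.0]
sigma = [0.5, 0.1]
solutions = antithetic_population(mu, sigma, batch_size=3, rng=rng)

# Mirrored pairs sit at index i and i + batch_size and average back to mu.
for i in range(3):
    midpoint = [(a + b) / 2 for a, b in zip(solutions[i], solutions[i + 3])]
    print(midpoint)
```

This pairing is what lets `tell` compute `reward[: self.batch_size] - reward[self.batch_size :]` as a per-direction reward difference.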
diff --git a/experiments/benchmarks/methods/es/utils.py b/src/diffevo/es/utils.py
similarity index 81%
rename from experiments/benchmarks/methods/es/utils.py
rename to src/diffevo/es/utils.py
index fff1525..2758fa2 100644
--- a/experiments/benchmarks/methods/es/utils.py
+++ b/src/diffevo/es/utils.py
@@ -4,6 +4,7 @@
 
 # From HADES package, author: Benedikt Hartl
 
+
 def tensor_to_numpy(t: torch.Tensor):
     t = t.detach()
     try:
@@ -40,13 +41,15 @@ def __init__(self, mu, num_params, stepsize, momentum=0.9, epsilon=1e-08):
         self.stepsize, self.momentum = stepsize, momentum
 
     def _compute_step(self, globalg):
-        self.v = self.momentum * self.v + (1. - self.momentum) * globalg
+        self.v = self.momentum * self.v + (1.0 - self.momentum) * globalg
         step = -self.stepsize * self.v
         return step
 
 
 class Adam(Optimizer):
-    def __init__(self, mu, num_params, stepsize, beta1=0.99, beta2=0.999, epsilon=1e-08):
+    def __init__(
+        self, mu, num_params, stepsize, beta1=0.99, beta2=0.999, epsilon=1e-08
+    ):
         Optimizer.__init__(self, mu, num_params, epsilon=epsilon)
         self.stepsize = stepsize
         self.beta1 = beta1
@@ -55,7 +58,7 @@ def __init__(self, mu, num_params, stepsize, beta1=0.99, beta2=0.999, epsilon=1e
         self.v = np.zeros(self.dim, dtype=np.float32)
 
     def _compute_step(self, globalg):
-        a = self.stepsize * np.sqrt(1 - self.beta2 ** self.t) / (1 - self.beta1 ** self.t)
+        a = self.stepsize * np.sqrt(1 - self.beta2**self.t) / (1 - self.beta1**self.t)
         self.m = self.beta1 * self.m + (1 - self.beta1) * globalg
         self.v = self.beta2 * self.v + (1 - self.beta2) * (globalg * globalg)
         step = -a * self.m / (np.sqrt(self.v) + self.epsilon)
@@ -79,27 +82,28 @@ def compute_centered_ranks(x):
     https://github.com/openai/evolution-strategies-starter/blob/master/es_distributed/es.py
     """
     y = compute_ranks(x.ravel()).reshape(x.shape).astype(np.float32)
-    y /= (x.size - 1)
-    y -= .5
+    y /= x.size - 1
+    y -= 0.5
     return y
 
 
-def compute_weight_decay(weight_decay, model_param_list, reg='l2'):
+def compute_weight_decay(weight_decay, model_param_list, reg="l2"):
     if isinstance(model_param_list, torch.Tensor):
         mean = partial(torch.mean, dim=1)
     else:
         mean = partial(np.mean, axis=1)
 
-    if reg == 'l1':
-        return - weight_decay * mean(torch.abs(model_param_list))
+    if reg == "l1":
+        return -weight_decay * mean(abs(model_param_list))
 
-    return - weight_decay * mean(model_param_list * model_param_list)
+    return -weight_decay * mean(model_param_list * model_param_list)
 
 
 class ScheduledSelectionPressure:
-    """ Scheduled Selection Pressure. """
-    def __init__(self, selection_pressure, num_steps, rate, mu, offset=1.):
-        """ Initialize the ScheduledSelectionPressure.
+    """Scheduled Selection Pressure."""
+
+    def __init__(self, selection_pressure, num_steps, rate, mu, offset=1.0):
+        """Initialize the ScheduledSelectionPressure.
 
         :param selection_pressure: float, final selection pressure value
         :param num_steps: int, number of steps for the scheduling
@@ -119,13 +123,15 @@ def reset(self):
 
     @property
     def scaling_factor(self):
-        """ return sigmoid scaling factor based on current step and total steps """
+        """return sigmoid scaling factor based on current step and total steps"""
         # alpha = self.current_step / self.num_steps
         x_adjusted = (self.current_step - self.mu) / self.num_steps
         return 1 / (1 + np.exp(-x_adjusted * self.rate))
 
     def get_value(self):
-        value = (self.selection_pressure - self.offset) * self.scaling_factor + self.offset
+        value = (
+            self.selection_pressure - self.offset
+        ) * self.scaling_factor + self.offset
         self.current_step += 1
         return value
 
@@ -142,8 +148,8 @@ def __lmul__(self, other):
         return self.get_value() * other
 
 
-def roulette_wheel(f, s=3., eps=1e-12, assume_sorted=False, normalize=False):
-    """ Roulette wheel fitness transformation.
+def roulette_wheel(f, s=3.0, eps=1e-12, assume_sorted=False, normalize=False):
+    """Roulette wheel fitness transformation.
 
     We transform the fitness values f to probabilities p by applying the roulette wheel fitness transformation.
     The roulette wheel fitness transformation is a monotonic transformation that maps the fitness values to
@@ -185,8 +191,10 @@ def roulette_wheel(f, s=3., eps=1e-12, assume_sorted=False, normalize=False):
     else:
         total_weight = np.abs(f).sum()
 
-    fs = (f - f.min()) / (f.max() - f.min() + eps)  # normalize fitness values to [0, 1], and sort
-    fs = exp(s*fs)  # apply selection pressure, s can be positive or negative
+    fs = (f - f.min()) / (
+        f.max() - f.min() + eps
+    )  # normalize fitness values to [0, 1], and sort
+    fs = exp(s * fs)  # apply selection pressure, s can be positive or negative
 
     if isinstance(f, torch.Tensor):
         fs = fs.cumsum(dim=0)  # compute cumulative sum
@@ -199,8 +207,13 @@ def roulette_wheel(f, s=3., eps=1e-12, assume_sorted=False, normalize=False):
     return fs[indices]
 
 
-def parameter_crowding(parameters, weight=1., sharpness=1., similarity_metric="euclidean"):
+def parameter_crowding(
+    parameters, weight=1.0, sharpness=1.0, similarity_metric="euclidean"
+):
     from sklearn.metrics.pairwise import pairwise_distances
-    parameter_similarity_matrix = pairwise_distances(parameters.reshape(len(parameters), -1), metric=similarity_metric)
+
+    parameter_similarity_matrix = pairwise_distances(
+        parameters.reshape(len(parameters), -1), metric=similarity_metric
+    )
     loss = np.exp(-parameter_similarity_matrix * sharpness)
     return loss.mean(axis=-1) * weight
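The sigmoid schedule behind `ScheduledSelectionPressure.get_value` can be checked by hand with a stateless sketch. The formula follows `scaling_factor` and `get_value` as shown in the hunks above. The standalone function `scheduled_pressure` and the parameter values are hypothetical and chosen only to make the shape of the curve visible.

```python
import math


def scheduled_pressure(step, selection_pressure, num_steps, rate, mu, offset=1.0):
    """Selection pressure at ``step``, following the sigmoid in ``scaling_factor``."""
    x_adjusted = (step - mu) / num_steps
    scaling = 1.0 / (1.0 + math.exp(-x_adjusted * rate))
    # Interpolate between ``offset`` (scaling near 0) and the final pressure.
    return (selection_pressure - offset) * scaling + offset


# Hypothetical settings: ramp from about 1 to about 5, with the midpoint at step 50.
values = [scheduled_pressure(t, 5.0, 100, 20.0, 50) for t in (0, 50, 100)]
print(values)
```

At `step == mu` the sigmoid equals 0.5, so the value is halfway between `offset` and `selection_pressure`. A larger `rate` makes the transition around that midpoint sharper.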
diff --git a/src/diffevo/fitness_mappings/__init__.py b/src/diffevo/fitness_mappings/__init__.py
new file mode 100644
index 0000000..f35f857
--- /dev/null
+++ b/src/diffevo/fitness_mappings/__init__.py
@@ -0,0 +1,3 @@
+from .fitness_mappings import Identity, Energy, Power
+
+__all__ = ["Identity", "Energy", "Power"]
diff --git a/src/diffevo/fitness_mappings/base.py b/src/diffevo/fitness_mappings/base.py
new file mode 100644
index 0000000..e6dc29a
--- /dev/null
+++ b/src/diffevo/fitness_mappings/base.py
@@ -0,0 +1,8 @@
+from abc import ABC, abstractmethod
+import torch
+
+
+class BaseFitnessMapping(ABC):
+    @abstractmethod
+    def __call__(self, x: torch.Tensor) -> torch.Tensor:
+        raise NotImplementedError
diff --git a/diffevo/fitnessmapping.py b/src/diffevo/fitness_mappings/fitness_mappings.py
similarity index 80%
rename from diffevo/fitnessmapping.py
rename to src/diffevo/fitness_mappings/fitness_mappings.py
index 58c3a8d..013bb08 100644
--- a/diffevo/fitnessmapping.py
+++ b/src/diffevo/fitness_mappings/fitness_mappings.py
@@ -1,17 +1,21 @@
 """
 This module contains classes of fitness mapping function.
 """
+
 import torch
 
+from .base import BaseFitnessMapping
+
 
-class Identity:
+class Identity(BaseFitnessMapping):
     """Identity fitness mapping function."""
+
     def __init__(self, l2_factor=0.0):
         self.l2_factor = l2_factor
 
     def l2(self, x):
         return torch.norm(x, dim=-1) ** 2
-    
+
     def forward(self, x):
         return x
 
@@ -28,13 +32,15 @@ class Energy(Identity):
     Returns:
         p: torch.Tensor, the probability of the fitness. Compute by exp(-x / temperature).
     """
-    def __init__(self, temperature=1.0, l2_factor=0.0):
+
+    def __init__(self, temperature=1.0, l2_factor=0.0, overflow_offset=5):
         super().__init__(l2_factor=l2_factor)
         self.temperature = temperature
-    
+        self.overflow_offset = overflow_offset
+
     def forward(self, x):
         power = -x / self.temperature
-        power = power - power.max() + 5 # avoid overflow
+        power = power - power.max() + self.overflow_offset  # avoid overflow
         p = torch.exp(power)
         return p
 
@@ -45,14 +51,15 @@ class Power(Identity):
     Args:
         power: float, the power of the fitness.
         temperature: float, the temperature of the system.
-    
+
     Returns:
         p: torch.Tensor, the probability of the fitness. Compute by (x / temperature) ** power.
     """
+
     def __init__(self, power=1.0, temperature=1.0, l2_factor=0.0):
         super().__init__(l2_factor=l2_factor)
         self.power = power
         self.temperature = temperature
-    
+
     def forward(self, x):
-        return torch.pow(x / self.temperature, self.power)
\ No newline at end of file
+        return torch.pow(x / self.temperature, self.power)
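The overflow guard in `Energy.forward` can be sketched without torch. The arithmetic mirrors the lines above: negate and scale by the temperature, subtract the maximum, add `overflow_offset`, then exponentiate. The plain-list function `energy_mapping` is a stand-in written for this illustration and is not part of the package.

```python
import math


def energy_mapping(energies, temperature=1.0, overflow_offset=5.0):
    """Map energies to unnormalized weights exp(-E / T), shifted for stability."""
    powers = [-e / temperature for e in energies]
    shift = max(powers)
    # After the shift the largest exponent equals overflow_offset, so exp()
    # cannot overflow no matter how large the raw energies are.
    return [math.exp(p - shift + overflow_offset) for p in powers]


weights = energy_mapping([0.0, 1.0, 1000.0], temperature=1.0)
print(weights)

# Very negative energies would overflow a naive exp(-E / T); here they do not.
safe = energy_mapping([-1e6, 0.0])
print(safe)
```

The shift multiplies every weight by the same constant, so the ratios between weights are unchanged. Only the absolute scale depends on `overflow_offset`.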
diff --git a/src/diffevo/generators/__init__.py b/src/diffevo/generators/__init__.py
new file mode 100644
index 0000000..7512e2b
--- /dev/null
+++ b/src/diffevo/generators/__init__.py
@@ -0,0 +1,4 @@
+from .generators import BayesianGenerator, LatentBayesianGenerator
+from .elite import EliteGenerator
+
+__all__ = ["BayesianGenerator", "LatentBayesianGenerator", "EliteGenerator"]
diff --git a/src/diffevo/generators/base.py b/src/diffevo/generators/base.py
new file mode 100644
index 0000000..3b643a8
--- /dev/null
+++ b/src/diffevo/generators/base.py
@@ -0,0 +1,10 @@
+from abc import ABC, abstractmethod
+
+
+class BaseGenerator(ABC):
+    @abstractmethod
+    def generate(self, noise: float = 1.0, return_x0: bool = False):
+        raise NotImplementedError
+
+    def __call__(self, noise: float = 1.0, return_x0: bool = False):
+        return self.generate(noise=noise, return_x0=return_x0)
diff --git a/src/diffevo/generators/elite.py b/src/diffevo/generators/elite.py
new file mode 100644
index 0000000..ae1acad
--- /dev/null
+++ b/src/diffevo/generators/elite.py
@@ -0,0 +1,25 @@
+import torch
+
+from .base import BaseGenerator
+
+
+class EliteGenerator(BaseGenerator):
+    """
+    A generator that selects the top `k` individuals from the population based on fitness.
+    """
+
+    def __init__(self, x, fitness, alpha, k: int = 1):
+        self.x = x
+        self.fitness = fitness
+        self.k = k
+
+    def generate(self, noise: float = 1.0, return_x0: bool = False):
+        _, indices = torch.topk(self.fitness, self.k)
+        elites = self.x[indices]
+        # For simplicity, we just repeat the elites to form the next generation
+        # A more sophisticated implementation could involve crossover and mutation
+        next_generation = elites.repeat(len(self.x) // self.k + 1, 1)[: len(self.x)]
+        if return_x0:
+            return next_generation, next_generation
+        else:
+            return next_generation
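The tiling in `EliteGenerator.generate` is easy to trace on a small population. The sketch below reproduces the same steps with plain lists: pick the `k` fittest, repeat that block, and truncate to the population size. The function name `elite_next_generation` and the sample data are illustrative. Tie-breaking among equal fitness values may differ from `torch.topk`.

```python
def elite_next_generation(population, fitness, k=1):
    """Fill the next generation by tiling the k fittest individuals."""
    n = len(population)
    # Indices of the k largest fitness values, best first (like torch.topk).
    top = sorted(range(n), key=lambda i: fitness[i], reverse=True)[:k]
    elites = [population[i] for i in top]
    # Repeat the elite block enough times to cover n, then truncate.
    tiled = elites * (n // k + 1)
    return tiled[:n]


population = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
fitness = [0.1, 0.9, 0.3, 0.7, 0.2]
next_gen = elite_next_generation(population, fitness, k=2)
print(next_gen)
```

With `k=2` and five individuals, the two elites alternate and the best one appears three times. The population size stays constant, but all diversity outside the elites is lost, as the in-code comment about crossover and mutation notes.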
diff --git a/diffevo/generator.py b/src/diffevo/generators/generators.py
similarity index 68%
rename from diffevo/generator.py
rename to src/diffevo/generators/generators.py
index 01cd8e9..e5df247 100644
--- a/diffevo/generator.py
+++ b/src/diffevo/generators/generators.py
@@ -1,29 +1,40 @@
 import torch
-import torch.nn as nn
-from .kde import KDE
+from ..utils import KDE
 
 
 class BayesianEstimator:
     """Bayesian Estimator of the origin points, based on current samples and fitness values."""
-    def __init__(self, x: torch.tensor, fitness: torch.tensor, alpha, density='uniform', h=0.1):
+
+    def __init__(
+        self,
+        x: torch.tensor,
+        fitness: torch.tensor,
+        alpha,
+        density="uniform",
+        h=0.1,
+        eps=1e-9,
+    ):
         self.x = x
         self.fitness = fitness
         self.alpha = alpha
         self.density_method = density
         self.h = h
-        if not density in ['uniform', 'kde']:
-            raise NotImplementedError(f'Density estimator {density} is not implemented.')
+        self.eps = eps
+        if density not in ["uniform", "kde"]:
+            raise NotImplementedError(
+                f"Density estimator {density} is not implemented."
+            )
 
     def append(self, estimator):
         self.x = torch.cat([self.x, estimator.x], dim=0)
         self.fitness = torch.cat([self.fitness, estimator.fitness], dim=0)
-    
+
     def density(self, x):
-        if self.density_method == 'uniform':
+        if self.density_method == "uniform":
             return torch.ones(x.shape[0]) / x.shape[0]
-        elif self.density_method == 'kde':
+        elif self.density_method == "kde":
             return KDE(x, h=self.h)
-    
+
     @staticmethod
     def norm(x):
         if x.shape[-1] == 1:
@@ -34,18 +45,18 @@ def norm(x):
 
     def gaussian_prob(self, x, mu, sigma):
         dist = self.norm(x - mu)
-        return torch.exp(-(dist ** 2) / (2 * sigma ** 2))
+        return torch.exp(-(dist**2) / (2 * sigma**2))
 
     def _estimate(self, x_t, p_x_t):
-        # diffusion proability, P = N(x_t; \sqrt{α_t}x,\sqrt{1-α_t})
+        # diffusion probability, P = N(x_t; \sqrt{α_t}x,\sqrt{1-α_t})
-        mu = self.x * (self.alpha ** 0.5)
+        mu = self.x * (self.alpha**0.5)
         sigma = (1 - self.alpha) ** 0.5
         p_diffusion = self.gaussian_prob(x_t, mu, sigma)
 
         # estimate the origin
-        prob = (self.fitness + 1e-9) * (p_diffusion + 1e-9) / (p_x_t + 1e-9)
+        prob = (self.fitness + self.eps) * (p_diffusion + self.eps) / (p_x_t + self.eps)
         z = torch.sum(prob)
-        origin = torch.sum(prob.unsqueeze(1) * self.x, dim=0) / (z + 1e-9)
+        origin = torch.sum(prob.unsqueeze(1) * self.x, dim=0) / (z + self.eps)
 
         return origin
 
@@ -58,26 +69,36 @@ def __call__(self, x_t):
         return self.estimate(x_t)
 
     def __repr__(self):
-        return f'<BayesianEstimator {len(self.x)} samples>'
+        return f"<BayesianEstimator {len(self.x)} samples>"
+
 
 class LatentBayesianEstimator(BayesianEstimator):
-    def __init__(self, x: torch.tensor, latent: torch.tensor, fitness: torch.tensor, alpha, density='uniform', h=0.1):
-        super().__init__(x, fitness, alpha, density=density, h=h)
+    def __init__(
+        self,
+        x: torch.tensor,
+        latent: torch.tensor,
+        fitness: torch.tensor,
+        alpha,
+        density="uniform",
+        h=0.1,
+        eps=1e-9,
+    ):
+        super().__init__(x, fitness, alpha, density=density, h=h, eps=eps)
         self.z = latent
 
     def _estimate(self, z_t, p_z_t):
-        # diffusion proability, P = N(x_t; \sqrt{α_t}x,\sqrt{1-α_t})
+        # diffusion probability, P = N(x_t; \sqrt{α_t}x,\sqrt{1-α_t})
-        mu = self.z * (self.alpha ** 0.5)
+        mu = self.z * (self.alpha**0.5)
         sigma = (1 - self.alpha) ** 0.5
         p_diffusion = self.gaussian_prob(z_t, mu, sigma)
 
         # estimate the origin
-        prob = (self.fitness + 1e-9) * (p_diffusion + 1e-9) / (p_z_t + 1e-9)
+        prob = (self.fitness + self.eps) * (p_diffusion + self.eps) / (p_z_t + self.eps)
         z = torch.sum(prob)
-        origin = torch.sum(prob.unsqueeze(1) * self.x, dim=0) / (z + 1e-9)
+        origin = torch.sum(prob.unsqueeze(1) * self.x, dim=0) / (z + self.eps)
 
         return origin
-    
+
     def estimate(self, z_t):
         p_z_t = self.density(self.z)
         origin = torch.vmap(self._estimate, (0, 0))(z_t, p_z_t)
@@ -97,33 +118,43 @@ def ddim_step(xt, x0, alphas: tuple, noise: float = None):
     """
     alphat, alphatp = alphas
     sigma = ddpm_sigma(alphat, alphatp) * noise
-    eps = (xt - (alphat ** 0.5) * x0) / (1.0 - alphat) ** 0.5
+    eps = (xt - (alphat**0.5) * x0) / (1.0 - alphat) ** 0.5
     if sigma is None:
         sigma = ddpm_sigma(alphat, alphatp)
-    x_next = (alphatp ** 0.5) * x0 + ((1 - alphatp - sigma ** 2) ** 0.5) * \
-        eps + sigma * torch.randn_like(x0)
+    x_next = (
+        (alphatp**0.5) * x0
+        + ((1 - alphatp - sigma**2) ** 0.5) * eps
+        + sigma * torch.randn_like(x0)
+    )
     return x_next
 
+
+from .base import BaseGenerator
+
+
 def ddpm_sigma(alphat, alphatp):
     """Compute the default sigma for the DDPM algorithm."""
     return ((1 - alphatp) / (1 - alphat) * (1 - alphat / alphatp)) ** 0.5
 
 
-class BayesianGenerator:
+class BayesianGenerator(BaseGenerator):
     """Bayesian Generator for the DDIM algorithm.
 
     Args:
         density: legacy option for changing the density estimator.
         h: bandwidth for KDE when ``density`` is ``'kde'``. Both rarely used.
     """
-    def __init__(self, x, fitness, alpha, density='uniform', h=0.1):
+
+    def __init__(self, x, fitness, alpha, density="uniform", h=0.1):
         self.x = x
         if torch.any(fitness < 0):
-            raise ValueError('fitness must be non-negative')
+            raise ValueError("fitness must be non-negative")
         self.fitness = fitness
         self.alpha, self.alpha_past = alpha
-        self.estimator = BayesianEstimator(self.x, self.fitness, self.alpha, density=density, h=h)
-    
+        self.estimator = BayesianEstimator(
+            self.x, self.fitness, self.alpha, density=density, h=h
+        )
+
     def generate(self, noise=1.0, return_x0=False):
         x0_est = self.estimator(self.x)
         x_next = ddim_step(self.x, x0_est, (self.alpha, self.alpha_past), noise=noise)
@@ -138,20 +169,23 @@ def __call__(self, noise=1.0, return_x0=False):
 
 class LatentBayesianGenerator(BayesianGenerator):
     """Bayesian Generator for the DDIM algorithm."""
-    def __init__(self, x, latent, fitness, alpha, density='uniform', h=0.1):
+
+    def __init__(self, x, latent, fitness, alpha, density="uniform", h=0.1):
         # density and h are legacy options kept for backward compatibility
         self.x = x
         self.latent = latent
         if torch.any(fitness < 0):
-            raise ValueError('fitness must be non-negative')
+            raise ValueError("fitness must be non-negative")
         self.fitness = fitness
         self.alpha, self.alpha_past = alpha
-        self.estimator = LatentBayesianEstimator(self.x, self.latent, self.fitness, self.alpha, density=density, h=h)
-    
+        self.estimator = LatentBayesianEstimator(
+            self.x, self.latent, self.fitness, self.alpha, density=density, h=h
+        )
+
     def generate(self, noise=1.0, return_x0=False):
         x0_est = self.estimator(self.latent)
         x_next = ddim_step(self.x, x0_est, (self.alpha, self.alpha_past), noise=noise)
         if return_x0:
             return x_next, x0_est
         else:
-            return x_next
\ No newline at end of file
+            return x_next
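The update in `ddim_step` can be sanity-checked on scalars. The sketch below is a plain-Python stand-in that mirrors the formula in the hunk above; `ddim_step_scalar` and its `rng` parameter are illustrative names, not part of the patch. Assuming `xt` was built from a known `x0` and noise `e`, a deterministic step (`noise=0.0`) should carry the same `x0` and `e` to the next noise level.

```python
import math
import random


def ddpm_sigma(alphat, alphatp):
    # Same expression as the patched ddpm_sigma, on floats instead of tensors.
    return ((1 - alphatp) / (1 - alphat) * (1 - alphat / alphatp)) ** 0.5


def ddim_step_scalar(xt, x0, alphat, alphatp, noise=1.0, rng=random):
    """Scalar stand-in for ddim_step (illustrative, not the patched function)."""
    sigma = ddpm_sigma(alphat, alphatp) * noise
    # Noise implied by the current sample and the estimated origin.
    eps = (xt - math.sqrt(alphat) * x0) / math.sqrt(1 - alphat)
    return (
        math.sqrt(alphatp) * x0
        + math.sqrt(1 - alphatp - sigma**2) * eps
        + sigma * rng.gauss(0.0, 1.0)
    )


alphat, alphatp = 0.5, 0.8          # current and next (less noisy) alpha
x0, e = 2.0, -0.3                   # assumed origin and noise
xt = math.sqrt(alphat) * x0 + math.sqrt(1 - alphat) * e

x_next = ddim_step_scalar(xt, x0, alphat, alphatp, noise=0.0)
expected = math.sqrt(alphatp) * x0 + math.sqrt(1 - alphatp) * e
```

In the patched function `sigma` is multiplied by `noise` before the `if sigma is None` check, so the sketch takes a numeric `noise` and does not try to reproduce the `noise=None` path.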
diff --git a/src/diffevo/optimizers/__init__.py b/src/diffevo/optimizers/__init__.py
new file mode 100644
index 0000000..93b17a6
--- /dev/null
+++ b/src/diffevo/optimizers/__init__.py
@@ -0,0 +1,3 @@
+from .base import Optimizer
+
+__all__ = ["Optimizer"]
diff --git a/src/diffevo/optimizers/base.py b/src/diffevo/optimizers/base.py
new file mode 100644
index 0000000..521d623
--- /dev/null
+++ b/src/diffevo/optimizers/base.py
@@ -0,0 +1,44 @@
+from abc import ABC, abstractmethod
+import torch
+
+class Optimizer(ABC):
+    """
+    Abstract base class for an optimizer.
+    """
+
+    @abstractmethod
+    def ask(self) -> torch.Tensor:
+        """
+        Returns a population of candidate solutions.
+        """
+        raise NotImplementedError
+
+    @abstractmethod
+    def tell(self, fitnesses: torch.Tensor):
+        """
+        Updates the optimizer's state with the fitnesses of the candidate solutions.
+        """
+        raise NotImplementedError
+
+    def optimize(self, problem, orchestrator):
+        """
+        Runs the full optimization loop.
+        This method is for optimizers that manage their own loop.
+        """
+        num_steps = orchestrator.config.optimizer.params.get('num_steps', 100)
+        for step in range(num_steps):
+            population = self.ask()
+            fitnesses = problem.evaluate(population)
+            self.tell(fitnesses)
+
+            step_data = {
+                'step': step,
+                'population': population,
+                'fitnesses': fitnesses,
+                'best_fitness': torch.max(fitnesses).item()
+            }
+
+            for callback in orchestrator.callbacks:
+                callback.on_step_end(orchestrator, step_data)
+
+        return population
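The ask/tell contract introduced by `Optimizer` can be illustrated without the rest of the package. The sketch below is a stand-in under stated assumptions: `AskTellOptimizer`, `RandomSearch`, and `sphere_fitness` are hypothetical names, plain lists replace tensors, and the loop mimics the body of `optimize` without the orchestrator or callbacks.

```python
import random
from abc import ABC, abstractmethod


class AskTellOptimizer(ABC):
    """Stand-in for the ask/tell interface; not the project's Optimizer class."""

    @abstractmethod
    def ask(self): ...

    @abstractmethod
    def tell(self, fitnesses): ...


class RandomSearch(AskTellOptimizer):
    """Hypothetical optimizer: samples around the best candidate seen so far."""

    def __init__(self, dim, popsize, sigma=0.5, seed=0):
        self.rng = random.Random(seed)
        self.dim, self.popsize, self.sigma = dim, popsize, sigma
        self.best = [0.0] * dim
        self.best_fitness = float("-inf")
        self._last = None

    def ask(self):
        # Propose a population of Gaussian perturbations of the current best.
        self._last = [
            [b + self.rng.gauss(0.0, self.sigma) for b in self.best]
            for _ in range(self.popsize)
        ]
        return self._last

    def tell(self, fitnesses):
        # Keep the best candidate; higher fitness is better, as in optimize().
        for candidate, fitness in zip(self._last, fitnesses):
            if fitness > self.best_fitness:
                self.best, self.best_fitness = candidate, fitness


def sphere_fitness(population):
    # Maximum fitness 0 at x = (1, 1).
    return [-sum((xi - 1.0) ** 2 for xi in x) for x in population]


opt = RandomSearch(dim=2, popsize=32)
history = []
for step in range(50):
    population = opt.ask()
    fitnesses = sphere_fitness(population)
    opt.tell(fitnesses)
    history.append(opt.best_fitness)
```

Keeping evaluation outside the optimizer is what lets the same loop drive any `Problem`: the optimizer only ever sees candidates going out and fitness values coming back.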
diff --git a/src/diffevo/orchestrator.py b/src/diffevo/orchestrator.py
new file mode 100644
index 0000000..21e5ed9
--- /dev/null
+++ b/src/diffevo/orchestrator.py
@@ -0,0 +1,141 @@
+import torch
+import os
+import numpy as np
+import random
+import importlib
+import datetime
+import subprocess
+import yaml
+import pkgutil
+import inspect
+from .config import ExperimentConfig
+from .problems.base import Problem
+from .optimizers.base import Optimizer
+from .callbacks import Callback
+
+def _load_plugins(plugin_dir, base_class):
+    """Dynamically loads plugins from a specified directory."""
+    plugins = {}
+    if not os.path.exists(plugin_dir):
+        return plugins
+
+    plugin_package = "plugins." + os.path.basename(plugin_dir)
+
+    for _, name, _ in pkgutil.iter_modules([plugin_dir]):
+        module = importlib.import_module(f"{plugin_package}.{name}")
+        for attribute_name in dir(module):
+            attribute = getattr(module, attribute_name)
+            if inspect.isclass(attribute) and issubclass(attribute, base_class) and attribute is not base_class:
+                plugins[attribute.__name__] = attribute
+    return plugins
+
+class Orchestrator:
+    """
+    Orchestrates an experiment based on a given configuration.
+    It handles the initialization of the problem, optimizer, and callbacks,
+    and executes the main optimization loop.
+    """
+    def __init__(self, config: ExperimentConfig, output_dir: str):
+        self.config = config
+        self.base_output_dir = output_dir
+        self.output_dir = None  # Will be set in run()
+
+        self.problem_registry = _load_plugins("plugins/problems", Problem)
+        self.optimizer_registry = _load_plugins("plugins/optimizers", Optimizer)
+        self.callback_registry = _load_plugins("plugins/callbacks", Callback)
+
+        self.problem = self._init_problem()
+        self.callbacks = self._init_callbacks()
+        self.optimizer = self._init_optimizer()
+        self.run_id = 0
+
+    def _init_problem(self):
+        """Instantiates the problem from the registry."""
+        problem_class = self.problem_registry.get(self.config.problem.name)
+        if not problem_class:
+            raise ValueError(f"Problem '{self.config.problem.name}' not found in registry.")
+        return problem_class(**self.config.problem.params)
+
+    def _init_optimizer(self):
+        """Instantiates the optimizer from the registry."""
+        optimizer_class = self.optimizer_registry.get(self.config.optimizer.class_name)
+        if not optimizer_class:
+            raise ValueError(f"Optimizer '{self.config.optimizer.class_name}' not found in registry.")
+
+        params = self.config.optimizer.params.copy()
+
+        sig = inspect.signature(optimizer_class.__init__)
+        if 'callbacks' in sig.parameters:
+            params['callbacks'] = self.callbacks
+
+        return optimizer_class(
+            problem=self.problem,
+            popsize=params.pop('popsize', 512),
+            **params
+        )
+
+    def _init_callbacks(self):
+        """Instantiates the callbacks from the registry."""
+        callbacks = []
+        for callback_name in self.config.callbacks:
+            callback_class = self.callback_registry.get(callback_name)
+            if not callback_class:
+                raise ValueError(f"Callback '{callback_name}' not found in registry.")
+            callbacks.append(callback_class())
+        return callbacks
+
+    def _save_artifacts(self):
+        """Saves reproducibility artifacts to the output directory."""
+        # Save config
+        config_path = os.path.join(self.output_dir, "config.yaml")
+        with open(config_path, 'w') as f:
+            yaml.dump(self.config.dict(), f, default_flow_style=False)
+
+        # Save environment
+        env_path = os.path.join(self.output_dir, "environment.txt")
+        with open(env_path, 'w') as f:
+            subprocess.run(["pip", "freeze"], stdout=f, check=True)
+
+        # Save git hash
+        git_hash_path = os.path.join(self.output_dir, "git_hash.txt")
+        try:
+            with open(git_hash_path, 'w') as f:
+                subprocess.run(["git", "rev-parse", "HEAD"], stdout=f, check=True, stderr=subprocess.PIPE)
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            with open(git_hash_path, 'w') as f:
+                f.write("Not a git repository or git not found.")
+
+    def run(self):
+        """Runs the full experiment, creating a unique directory for the results."""
+        timestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
+        experiment_name = self.config.name
+        self.output_dir = os.path.join(self.base_output_dir, f"{experiment_name}_{timestamp}")
+        os.makedirs(self.output_dir, exist_ok=True)
+
+        self._save_artifacts()
+
+        final_populations = []
+        for i in range(self.config.num_runs):
+            self.run_id = i
+            final_population = self._run_single()
+            final_populations.append(final_population)
+
+        return self.output_dir, final_populations
+
+    def _run_single(self):
+        """Executes a single run of the experiment."""
+        seed = self.config.seed + self.run_id
+        random.seed(seed)
+        np.random.seed(seed)
+        torch.manual_seed(seed)
+
+        for callback in self.callbacks:
+            callback.on_experiment_start(self)
+
+        # All optimizers now have an `optimize` method
+        final_population = self.optimizer.optimize(self.problem, self)
+
+        for callback in self.callbacks:
+            callback.on_experiment_end(self)
+
+        return final_population
diff --git a/src/diffevo/plotting.py b/src/diffevo/plotting.py
new file mode 100644
index 0000000..55fda4e
--- /dev/null
+++ b/src/diffevo/plotting.py
@@ -0,0 +1,28 @@
+import matplotlib.pyplot as plt
+import torch
+import numpy as np
+
+
+def plot_alpha_schedules(ax=None, T=100):
+    """Plots common alpha schedules."""
+    if ax is None:
+        ax = plt.gca()
+
+    t = torch.linspace(0, T, T)
+    alpha_linear = 1 - t / T
+    alpha_cosine = torch.cos(t * np.pi / T) / 2 + 0.5
+    beta0 = 0.0003
+    gamma = 0.069
+    alpha_ddpm = torch.exp(-beta0 * t - gamma * (t**2) / T)
+
+    # Colors used in the original plot
+    colors = ["#6F6E6E", "#F5851E", "#343434"]
+
+    ax.plot(t, alpha_linear, label="Linear", color=colors[0])
+    ax.plot(t, alpha_cosine, label="Cosine", color=colors[1])
+    ax.plot(t, alpha_ddpm, label="DDPM", color=colors[2])
+    ax.legend()
+    ax.set_xlabel("$t$")
+    ax.set_ylabel("$\\alpha$")
+    ax.set_title(r"$\alpha$ schedule")
+    return ax
diff --git a/src/diffevo/problems/__init__.py b/src/diffevo/problems/__init__.py
new file mode 100644
index 0000000..a5817f9
--- /dev/null
+++ b/src/diffevo/problems/__init__.py
@@ -0,0 +1,3 @@
+from .base import Problem
+
+__all__ = ["Problem"]
diff --git a/src/diffevo/problems/base.py b/src/diffevo/problems/base.py
new file mode 100644
index 0000000..72ff59b
--- /dev/null
+++ b/src/diffevo/problems/base.py
@@ -0,0 +1,41 @@
+# Defines the `Problem` abstraction.
+from abc import ABC, abstractmethod
+from typing import Tuple
+import torch
+
+# --- Problem Abstraction ---
+
+class Problem(ABC):
+    """
+    Abstract base class for an optimization problem.
+
+    This class defines the interface for a problem to be solved by an optimizer.
+    It encapsulates the objective function, the dimensionality of the problem,
+    and the solution space (domain).
+    """
+    def __init__(self, name: str, dim: int, lower_bound: float, upper_bound: float):
+        self.name = name
+        self.dim = dim
+        self.lower_bound = lower_bound
+        self.upper_bound = upper_bound
+        self.objective_function = self._create_objective()
+
+    @property
+    def domain(self) -> Tuple[float, float]:
+        """Returns the domain of the search space as (lower_bound, upper_bound)."""
+        return (self.lower_bound, self.upper_bound)
+
+    @abstractmethod
+    def _create_objective(self):
+        """
+        Initializes and returns the objective function.
+        The objective function should take a PyTorch tensor of shape (pop_size, dim)
+        and return a tensor of shape (pop_size,) with fitness values.
+        """
+        raise NotImplementedError
+
+    def evaluate(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Evaluates the objective function for a given population of solutions x.
+        """
+        return self.objective_function(x)
diff --git a/src/diffevo/problems/helpers.py b/src/diffevo/problems/helpers.py
new file mode 100644
index 0000000..aa69db8
--- /dev/null
+++ b/src/diffevo/problems/helpers.py
@@ -0,0 +1,43 @@
+import torch
+
+
+fitness_target = {
+    "rosenbrock": 0,
+    "beale": 0,
+    "himmelblau": 0,
+    "ackley": -12.5401,
+    "rastrigin": -64.6249,
+    "rastrigin_4d": -129.2498,
+    "rastrigin_32d": -1033.9980,
+    "rastrigin_256d": -8271.9844,
+}
+distance_scale = {
+    "rosenbrock": 287.51,
+    "beale": 20,
+    "himmelblau": 17.01,
+    "ackley": 2,
+    "rastrigin": 30,
+    "rastrigin_4d": 60,
+    "rastrigin_32d": 500,
+    "rastrigin_256d": 4000,
+}
+max_distances = {
+    "rosenbrock": 40009,
+    "beale": 72769.2,
+    "himmelblau": 308.803,
+    "ackley": 12.5401,
+    "rastrigin": 64.6249,
+    "rastrigin_4d": 129.2498,
+    "rastrigin_32d": 1033.9980,
+    "rastrigin_256d": 8271.9844,
+}
+
+def energy_wrapper(obj, temperature=1, target=0, scale=1, max_distance=None, **kwargs):
+    def wrapped_obj(x):
+        minimal_p = torch.exp(-torch.tensor(max_distance) / (temperature * scale))
+        p = torch.exp(-abs(obj(x) - target) / (temperature * scale))
+        return (p - minimal_p) / (1 - minimal_p)
+    return wrapped_obj
+
+def _original_name(obj_name: str):
+    return "rastrigin" if "rastrigin" in obj_name else obj_name
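The mapping applied by `energy_wrapper` is easier to follow with numbers. The sketch below recomputes it on floats with `math` instead of torch; `energy_fitness` is an illustrative name, and the parameter values are the `rosenbrock` entries from the tables above. It assumes `max_distance` is always supplied, since the wrapper passes it straight to `torch.tensor` and the default of `None` appears unusable there.

```python
import math


def energy_fitness(value, target=0.0, scale=1.0, max_distance=1.0, temperature=1.0):
    """Float stand-in for the wrapped objective (illustrative, not the patch)."""
    # Boltzmann-style weight of the worst expected objective value.
    minimal_p = math.exp(-max_distance / (temperature * scale))
    # Weight of this objective value, by distance from the target.
    p = math.exp(-abs(value - target) / (temperature * scale))
    # Rescale so the worst case maps to 0 and the target maps to 1.
    return (p - minimal_p) / (1 - minimal_p)


params = dict(target=0.0, scale=287.51, max_distance=40009.0)
at_target = energy_fitness(0.0, **params)       # objective equals the target
at_worst = energy_fitness(40009.0, **params)    # objective at max_distance
one_scale = energy_fitness(287.51, **params)    # one `scale` away from target
```

The result is non-negative for objective values within `max_distance` of the target, which matches the `fitness must be non-negative` check in the generators.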
diff --git a/diffevo/examples.py b/src/diffevo/problems/utils.py
similarity index 84%
rename from diffevo/examples.py
rename to src/diffevo/problems/utils.py
index c395044..2cc337e 100644
--- a/diffevo/examples.py
+++ b/src/diffevo/problems/utils.py
@@ -1,18 +1,19 @@
 from torch.distributions import MultivariateNormal
 import torch
 
+
 def two_peak_density(x, mu1=None, mu2=None, std=0.1):
     if mu1 is None:
-        mu1 = torch.tensor([-1., -1.])
+        mu1 = torch.tensor([-1.0, -1.0])
     if mu2 is None:
-        mu2 = torch.tensor([1., 1.])
+        mu2 = torch.tensor([1.0, 1.0])
 
     # Checking if the input tensor x has shape (2,) and unsqueeze to make it (*N, 2)
     if len(x.shape) == 1:
         x = x.unsqueeze(0)
 
     # Covariance matrix for the Gaussian distributions (identity matrix, since it's a standard Gaussian)
-    covariance_matrix = torch.eye(2) * (std ** 2)
+    covariance_matrix = torch.eye(2) * (std**2)
 
     # Create two multivariate normal distributions
     dist1 = MultivariateNormal(mu1, covariance_matrix)
@@ -28,9 +29,9 @@ def two_peak_density(x, mu1=None, mu2=None, std=0.1):
 
 def two_peak_density_step(x, mu1=None, mu2=None, std=0.5):
     if mu1 is None:
-        mu1 = torch.tensor([-1., -1.])
+        mu1 = torch.tensor([-1.0, -1.0])
     if mu2 is None:
-        mu2 = torch.tensor([1., 1.])
+        mu2 = torch.tensor([1.0, 1.0])
 
     # compute the minimal distance to the two peaks
     d1 = torch.norm(x - mu1, dim=-1)
@@ -41,4 +42,4 @@ def two_peak_density_step(x, mu1=None, mu2=None, std=0.5):
     p = (d < std).float()
     p = torch.clamp(p, 1e-9, 1)
 
-    return p
\ No newline at end of file
+    return p
diff --git a/src/diffevo/schedulers/__init__.py b/src/diffevo/schedulers/__init__.py
new file mode 100644
index 0000000..d6c81f4
--- /dev/null
+++ b/src/diffevo/schedulers/__init__.py
@@ -0,0 +1,3 @@
+from .schedulers import DDIMScheduler, DDIMSchedulerCosine, DDPMScheduler
+
+__all__ = ["DDIMScheduler", "DDIMSchedulerCosine", "DDPMScheduler"]
diff --git a/src/diffevo/schedulers/base.py b/src/diffevo/schedulers/base.py
new file mode 100644
index 0000000..98a794a
--- /dev/null
+++ b/src/diffevo/schedulers/base.py
@@ -0,0 +1,19 @@
+from abc import ABC, abstractmethod
+
+
+class BaseScheduler(ABC):
+    @abstractmethod
+    def __init__(self, num_step: int, **kwargs):
+        self.num_step = num_step
+
+    @abstractmethod
+    def __next__(self):
+        raise NotImplementedError
+
+    @abstractmethod
+    def __len__(self):
+        raise NotImplementedError
+
+    @abstractmethod
+    def __iter__(self):
+        raise NotImplementedError
diff --git a/diffevo/ddim.py b/src/diffevo/schedulers/schedulers.py
similarity index 81%
rename from diffevo/ddim.py
rename to src/diffevo/schedulers/schedulers.py
index e8b23b0..181b5b4 100644
--- a/diffevo/ddim.py
+++ b/src/diffevo/schedulers/schedulers.py
@@ -1,43 +1,49 @@
 import torch
 import numpy as np
 
+from .base import BaseScheduler
 
-class DDIMScheduler:
+
+class DDIMScheduler(BaseScheduler):
     """
     DDIMScheduler is a scheduler for the DDIM algorithm.
 
     Args:
         num_step: int, the number of steps for the DDIM algorithm
-    
+
     Iters:
         t: int, the current time step.
         alpha: float, the current value of alpha.
         alpha_past: float, the previous value of alpha
-    
+
     Example:
         scheduler = DDIMScheduler(num_step=100)
         for t, alpha, alpha_past in scheduler:
             # do something with t, alpha, and alpha_past
     """
+
     def __init__(self, num_step, power=1, eps=1e-4):
         self.num_step = num_step
         self.power = power
-        self.alpha = torch.linspace(1 - eps, (eps*eps) ** (1 / self.power), num_step) ** self.power
+        self.alpha = (
+            torch.linspace(1 - eps, (eps * eps) ** (1 / self.power), num_step)
+            ** self.power
+        )
         self.index = 0
-    
+
     def __next__(self):
         if self.index >= self.num_step - 1:
             raise StopIteration
-        
+
         t = self.num_step - self.index - 1
         alpha = self.alpha[t]
         alpha_past = self.alpha[t - 1]
         self.index += 1
         return t, (alpha, alpha_past)
-    
+
     def __len__(self):
         return self.num_step - 1
-    
+
     def __iter__(self):
         self.index = 0
         return self
@@ -47,32 +53,34 @@ class DDIMSchedulerCosine(DDIMScheduler):
     """
     DDIMSchedulerCosine is a scheduler for the DDIM algorithm with cosine alpha schedule.
     Ref: https://arxiv.org/abs/2102.09672
-    
+
     Args:
         num_step: int, the number of steps for the DDIM algorithm
-    
+
     Iters:
         t: int, the current time step.
         alpha: float, the current value of alpha.
         alpha_past: float, the previous value of alpha
-    
+
     Example:
         scheduler = DDIMSchedulerCosine(num_step=100)
         for t, alpha, alpha_past in scheduler:
             # do something with t, alpha, and alpha_past
     """
 
-    def __init__(self, num_step):
+    def __init__(self, num_step, eps_min=1e-3, eps_max=1 - 1e-3):
         super().__init__(num_step)
         alpha = torch.cos(torch.linspace(0, torch.pi, num_step)) + 1
         self.alpha = alpha / 2
-        # rescaling alpha to [1e-3, 1-1e-3]
-        self.alpha = (self.alpha + 1e-3) * (1 - 1e-3) / (1 + 1e-3)
+        # rescaling alpha to [eps_min, eps_max]
+        self.alpha = (self.alpha + eps_min) * (eps_max) / (1 + eps_min)
+
 
 class DDPMScheduler(DDIMScheduler):
     """
     DDPMScheduler is a scheduler for the DDPM algorithm.
     """
+
     def __init__(self, num_step, eps=1e-4):
         r"""Approximate the alpha schedule of DDPM.
 
@@ -92,8 +100,11 @@ def __init__(self, num_step, eps=1e-4):
         """
         super().__init__(num_step)
         # ensure alpha[0] = 1 - eps, and alpha[-1] = eps
-        beta = ((num_step ** 2) * np.log(1 / (1 - eps)) + np.log(eps)) / (num_step - 1)
-        gamma = - num_step * (num_step * np.log(1 / (1-eps)) + np.log(eps)) / (num_step - 1)
+        beta = ((num_step**2) * np.log(1 / (1 - eps)) + np.log(eps)) / (num_step - 1)
+        gamma = (
+            -num_step
+            * (num_step * np.log(1 / (1 - eps)) + np.log(eps))
+            / (num_step - 1)
+        )
         t = torch.linspace(1.0 / num_step, 1.0, num_step)
         self.alpha = torch.exp(-beta * t - gamma * t.square())
-        
\ No newline at end of file
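The iteration protocol of the schedulers is worth spelling out. The class docstrings show `for t, alpha, alpha_past in scheduler`, but `__next__` returns `t, (alpha, alpha_past)`, so the loop target seemingly needs the nested form. The sketch below is a plain-Python stand-in for `DDIMScheduler` with `power=1`; `LinearAlphaSchedule` is a hypothetical name and lists replace tensors.

```python
class LinearAlphaSchedule:
    """Plain-Python stand-in mirroring DDIMScheduler with power=1."""

    def __init__(self, num_step, eps=1e-4):
        self.num_step = num_step
        start, stop = 1 - eps, eps * eps
        stride = (stop - start) / (num_step - 1)
        # alpha[0] is close to 1 (clean), alpha[-1] is close to 0 (noisy).
        self.alpha = [start + i * stride for i in range(num_step)]
        self.index = 0

    def __iter__(self):
        self.index = 0
        return self

    def __next__(self):
        if self.index >= self.num_step - 1:
            raise StopIteration
        # Walk from the noisiest step back toward the clean end.
        t = self.num_step - self.index - 1
        item = (t, (self.alpha[t], self.alpha[t - 1]))
        self.index += 1
        return item

    def __len__(self):
        return self.num_step - 1


# Note the nested unpacking of the (alpha, alpha_past) pair.
steps = [(t, alpha, alpha_past) for t, (alpha, alpha_past) in LinearAlphaSchedule(5)]
```

Each item pairs the current alpha with the next, less noisy one, which is the tuple the generators unpack as `self.alpha, self.alpha_past`.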
diff --git a/src/diffevo/utils.py b/src/diffevo/utils.py
new file mode 100644
index 0000000..19285bc
--- /dev/null
+++ b/src/diffevo/utils.py
@@ -0,0 +1,172 @@
+import torch
+import torch.nn as nn
+import numpy as np
+
+
+def distance_matrix(x, y):
+    """Compute the pairwise distance matrix between x and y.
+
+    Args:
+        x: (N, d) tensor.
+        y: (M, d) tensor.
+    Returns:
+        (N, M) tensor, the pairwise distance matrix.
+    """
+    return torch.cdist(x, y)
+
+
+def KDE(samples, h=0.1):
+    """Modified Kernel Density Estimation (KDE) that estimates the density only at the given samples.
+
+    Args:
+        samples: (N, d) tensor, the samples to estimate the density.
+        h: float, the bandwidth.
+    Returns:
+        (N,) tensor, the estimated density at the given samples.
+    """
+    distances = distance_matrix(samples, samples)  # (N, N)
+    weights = torch.exp(-(distances**2) / (2 * h**2))  # (N, N)
+    weights = weights.sum(dim=-1)  # (N,)
+    return weights / weights.sum() * samples.shape[0]
+
+
+class RandomProjection(nn.Module):
+    def __init__(self, in_features, out_features, normalize=True):
+        super().__init__()
+        self.in_features = in_features
+        self.out_features = out_features
+        self.linear = nn.Linear(in_features, out_features, bias=False)
+        self.normalize = normalize
+        self.init_weight()
+
+    def init_weight(self):
+        self.linear.weight.data = torch.randn_like(self.linear.weight.data) / (
+            self.in_features**0.5
+        )
+        if self.normalize:
+            self.linear.weight.data /= self.linear.weight.data.norm(dim=1, keepdim=True)
+
+    def forward(self, x):
+        return self.linear(x)
+
+objs = [
+    "rosenbrock",
+    "beale",
+    "himmelblau",
+    "ackley",
+    "rastrigin",
+    "rastrigin_4d",
+    "rastrigin_32d",
+    "rastrigin_256d",
+]
+
+
+def statistics(func):
+    """Apply ``func`` to each record in a list of experiments.
+
+    Args of decorated function:
+        records: list of records of experiments
+            structure: experiments[experiment_1[fitness_func_1, ...], ...]
+
+    Returns:
+        list of statistics of each experiment
+            structure: [num_experiments, num_fitness_funcs, *statistics]
+    """
+
+    def wrapper(records, *args, **kwargs):
+        results = []
+        for record in records:
+            result_temp = {}
+            for fitness_func in record.keys():
+                result_temp[fitness_func] = func(record[fitness_func], *args, **kwargs)
+            results.append(result_temp)
+        return results
+
+    return wrapper
+
+
+def group(statistics: list):
+    results = {}
+    for measure in statistics:
+        for fitness_func in measure.keys():
+            if fitness_func not in results:
+                results[fitness_func] = []
+            results[fitness_func].append(measure[fitness_func])
+    return results
+
+
+def avg_group(statistics: list):
+    grouped = group(statistics)
+    for k, v in grouped.items():
+        grouped[k] = np.mean(v, axis=0)
+
+    return grouped
+
+
+def std_group(statistics: list):
+    grouped = group(statistics)
+    for k, v in grouped.items():
+        grouped[k] = np.std(v, axis=0)
+
+    return grouped
+
+
+def get_top_values(fitness, x, n):
+    idx = np.argsort(-fitness)[:n]
+    return x[idx]
+
+
+@statistics
+def top_rewards(record, n=None, use_x0=False):
+    fitnesses = record["x0_fitness"] if use_x0 else record["fitnesses"]
+
+    if n is not None:
+        if len(fitnesses.shape) == 1:
+            fitnesses = fitnesses.unsqueeze(0)
+        fitnesses = fitnesses[-1]
+        fitnesses = get_top_values(fitnesses, fitnesses, n)
+    else:
+        fitnesses = fitnesses[-1]
+    return fitnesses.mean().item()
+
+
+def prob(x, scale=10):
+    classification = torch.round(x * scale).long()
+    # count the number of points in each class, return [class, num]
+    classes, num = torch.unique(classification, return_counts=True, dim=0)
+    prob = num.float() / num.sum()
+    return prob
+
+
+def entropy(x, scale=10):
+    p = prob(x, scale)
+    return torch.sum(-p * torch.log2(p))
+
+
+@statistics
+def point_entropy(record, n=None, scale=10, use_x0=False, name=None):
+    if name != "MAPElite_benchmark":
+        x = record["trace"][-1]
+        fitnesses = record["x0_fitness"] if use_x0 else record["fitnesses"]
+    else:
+        x = [p for p, r in record["maps"].values()]
+        x = torch.stack(x)
+        fitnesses = record["fitnesses"].unsqueeze(0)
+
+    if n is not None:
+        x = get_top_values(fitnesses[-1], x, n)
+    return entropy(x, scale).item()
+
+
+def normalize_observation(observation, observation_space, extreme_threshold=1e3):
+    # Replace inf/-inf with threshold values
+    low = np.where(
+        observation_space.low < -extreme_threshold, -1, observation_space.low
+    )
+    high = np.where(
+        observation_space.high > extreme_threshold, 1, observation_space.high
+    )
+
+    # Normalize to [-1, 1] range
+    rescaled = 2 * (observation - low) / (high - low) - 1
+    return rescaled * np.sqrt(3)  # scale to unit variance
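The diversity measure built from `prob` and `entropy` bins each point onto a grid of cell size `1 / scale` and takes the Shannon entropy of the cell occupancy. A minimal sketch of the same idea in plain Python follows; `grid_entropy` is an illustrative name, tuples replace tensors, and the sample points are made up for the example.

```python
import math
from collections import Counter


def grid_entropy(points, scale=10):
    """Shannon entropy (bits) of points binned onto a grid of cell size 1/scale."""
    # Round each coordinate to its grid cell and count points per cell.
    cells = Counter(tuple(round(c * scale) for c in p) for p in points)
    total = sum(cells.values())
    return -sum((n / total) * math.log2(n / total) for n in cells.values())


# All four points fall into the same cell: no diversity.
collapsed = [(0.50, 0.50), (0.51, 0.49), (0.49, 0.52), (0.52, 0.51)]
# Four points in four different cells: maximal diversity for four points.
spread = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.5, 1.5)]

collapsed_bits = grid_entropy(collapsed)
spread_bits = grid_entropy(spread)
```

A population that has collapsed onto a single optimum scores zero bits, while one spread evenly over `k` cells scores `log2(k)`, so `scale` effectively sets the resolution at which two solutions count as distinct.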
diff --git a/tests/unit/evaluation/test_experiment.py b/tests/unit/evaluation/test_experiment.py
new file mode 100644
index 0000000..82cb33e
--- /dev/null
+++ b/tests/unit/evaluation/test_experiment.py
@@ -0,0 +1,62 @@
+import unittest
+import torch
+import os
+import numpy as np
+import pandas as pd
+from src.diffevo.evaluation.experiment import Experiment
+
+class TestExperiment(unittest.TestCase):
+
+    def setUp(self):
+        self.output_dir = 'test_results'
+        os.makedirs(self.output_dir, exist_ok=True)
+        self.mock_method = lambda objs, **kwargs: {obj: {'fitnesses': [np.random.rand(10) for _ in range(5)]} for obj in objs}
+
+    def tearDown(self):
+        import shutil
+        if os.path.exists(self.output_dir):
+            shutil.rmtree(self.output_dir)
+
+    def test_experiment_run(self):
+        experiment = Experiment(name='test_exp', method=self.mock_method, num_steps=5)
+        records = experiment.run(num_experiments=2, output_dir=self.output_dir)
+        self.assertIn('test_exp', records)
+        self.assertEqual(len(records['test_exp']), 2)
+        self.assertTrue(os.path.exists(os.path.join(self.output_dir, 'test_exp.pt')))
+        self.assertTrue(os.path.exists(os.path.join(self.output_dir, 'test_exp_report.csv')))
+
+    def test_generate_report_with_data(self):
+        experiment = Experiment(name='test_report', method=self.mock_method, num_steps=5)
+        records = [{'rosenbrock': {'fitnesses': [[0.1, 0.2], [0.3, 0.4]]}}]
+        experiment.generate_report(records, self.output_dir)
+        report_path = os.path.join(self.output_dir, 'test_report_report.csv')
+        self.assertTrue(os.path.exists(report_path))
+        df = pd.read_csv(report_path)
+        self.assertEqual(len(df), 1)
+        self.assertAlmostEqual(df['mean_best_fitness'][0], 0.4)
+
+    def test_generate_report_with_empty_data(self):
+        experiment = Experiment(name='test_empty_report', method=self.mock_method, num_steps=5)
+        records = []
+        experiment.generate_report(records, self.output_dir)
+        report_path = os.path.join(self.output_dir, 'test_empty_report_report.csv')
+        self.assertTrue(os.path.exists(report_path))
+        df = pd.read_csv(report_path)
+        self.assertEqual(len(df), 0)
+
+    def test_generate_plots_with_data(self):
+        experiment = Experiment(name='test_plots', method=self.mock_method, num_steps=5)
+        records = [{'rosenbrock': {'fitnesses': [np.array([0.1, 0.2]), np.array([0.3, 0.4])]}}]
+        experiment.generate_plots(records, self.output_dir)
+        plot_path = os.path.join(self.output_dir, 'test_plots_rosenbrock_plot.png')
+        self.assertTrue(os.path.exists(plot_path))
+
+    def test_generate_plots_with_no_fitness_data(self):
+        experiment = Experiment(name='test_no_fitness_plots', method=self.mock_method, num_steps=5)
+        records = [{'rosenbrock': {'fitnesses': []}}]
+        experiment.generate_plots(records, self.output_dir)
+        plot_path = os.path.join(self.output_dir, 'test_no_fitness_plots_rosenbrock_plot.png')
+        self.assertFalse(os.path.exists(plot_path))
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/tests/unit/optimizers/test_optimizer.py b/tests/unit/optimizers/test_optimizer.py
new file mode 100644
index 0000000..71385db
--- /dev/null
+++ b/tests/unit/optimizers/test_optimizer.py
@@ -0,0 +1,12 @@
+import torch
+import pytest
+from src.diffevo.optimizer import DiffEvo
+
+def fitness_function(x):
+    return torch.sum(x ** 2, dim=-1)
+
+def test_diffevo_optimizer():
+    optimizer = DiffEvo(num_step=10)
+    initial_population = torch.randn(100, 2)
+    optimized_population = optimizer.optimize(fitness_function, initial_population)
+    assert optimized_population.shape == initial_population.shape
diff --git a/tests/unit/test_orchestrator.py b/tests/unit/test_orchestrator.py
new file mode 100644
index 0000000..25c40cb
--- /dev/null
+++ b/tests/unit/test_orchestrator.py
@@ -0,0 +1,53 @@
+import unittest
+import os
+import yaml
+from collections.abc import Mapping
+from diffevo.config import ExperimentConfig
+from diffevo.orchestrator import Orchestrator
+
+def deep_merge(d1, d2):
+    """Recursively merges d2 into d1."""
+    for k, v in d2.items():
+        if k in d1 and isinstance(d1[k], Mapping) and isinstance(v, Mapping):
+            d1[k] = deep_merge(d1[k], v)
+        else:
+            d1[k] = v
+    return d1
+
+def load_config_for_test(config_path):
+    """Loads a YAML configuration for testing, handling base configurations."""
+    with open(config_path, 'r') as f:
+        config_data = yaml.safe_load(f)
+
+    if 'base' in config_data:
+        base_path = os.path.join(os.path.dirname(config_path), config_data['base'])
+        base_config = load_config_for_test(base_path)
+        del config_data['base']
+        return deep_merge(base_config, config_data)
+
+    return config_data
+
+class TestOrchestrator(unittest.TestCase):
+
+    def test_smoketest_run(self):
+        """
+        Tests a full run of the orchestrator with the smoketest config.
+        """
+        config_path = "configs/smoketest.yaml"
+        output_dir = "results/test_run"
+
+        config_data = load_config_for_test(config_path)
+
+        config = ExperimentConfig(**config_data)
+
+        orchestrator = Orchestrator(config=config, output_dir=output_dir)
+        orchestrator.run()
+
+        # Verify that the output directory and some artifact files were created
+        self.assertTrue(os.path.exists(orchestrator.output_dir))
+        self.assertTrue(os.path.exists(os.path.join(orchestrator.output_dir, "config.yaml")))
+        self.assertTrue(os.path.exists(os.path.join(orchestrator.output_dir, "fitness_log.csv")))
+        self.assertTrue(os.path.exists(os.path.join(orchestrator.output_dir, "fitness_progression.png")))
+
+if __name__ == '__main__':
+    unittest.main()