A self-tuning scheduler for streaming data workflows on heterogeneous clusters.
Research project — M1 Informatique, Université Claude Bernard Lyon 1. Targets the intersection of three Master's specializations: DiPaC (distributed scheduling), DataScale (data-aware pipelines), Autonomic Systems (MAPE-K control loop).
The scheduler covers these three angles with four components:
| Component | Role |
|---|---|
| HEFT-LC | Static baseline — load-aware HEFT with tie-break tolerance ε |
| DLS | Data-Locality Scheduler — locality-aware variant |
| RLScheduler | Tabular Q-learning agent that learns per-task placements |
| MAPE-K controller | Autonomic loop that adjusts ε online from observed metrics |
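To make the role of ε concrete, here is a minimal sketch of one plausible tie-break rule. It is an assumption, not the code in `schedulers.py`: nodes whose estimated finish time is within a factor (1 + ε) of the best are treated as tied, and the tie goes to the least-loaded node. The function name `pick_node` and the sample numbers are hypothetical.

```python
from typing import Dict


def pick_node(finish_time: Dict[str, float], load: Dict[str, float], eps: float) -> str:
    """Pick a node for one task (illustrative rule, assumed for this sketch).

    Nodes whose estimated finish time is within (1 + eps) of the best
    count as tied; the tie is broken in favour of the least-loaded node.
    """
    best = min(finish_time.values())
    tied = [n for n, t in finish_time.items() if t <= best * (1.0 + eps)]
    # Least loaded first; finish time and name keep the choice deterministic.
    return min(tied, key=lambda n: (load[n], finish_time[n], n))


# Hypothetical estimates for a single task on three nodes.
eft = {"n0": 10.0, "n1": 10.4, "n2": 12.0}
load = {"n0": 0.9, "n1": 0.2, "n2": 0.1}

print(pick_node(eft, load, eps=0.05))  # n1: within 5% of the best and less loaded
print(pick_node(eft, load, eps=0.0))   # n0: pure earliest-finish-time choice
```

With ε = 0 the rule degenerates to plain earliest-finish-time placement, which is the value the MAPE-K controller is reported to converge to on heterogeneous clusters.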
Workflows arrive over time as a Poisson stream (50% MapReduce, 30% ETL, 20% random DAGs), each task carrying compute cost, input/output partitions and data volumes. The cluster models per-node bandwidth, data residency, and node failures.
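The arrival process described above can be sketched as follows. This is an illustration under stated assumptions rather than the `WorkflowStream` class itself: the function name, the arrival rate and the seed are hypothetical. A Poisson stream is generated by drawing exponential inter-arrival gaps, and each workflow type is sampled with the 50/30/20 mix.

```python
import numpy as np

KINDS = ["mapreduce", "etl", "random_dag"]
MIX = [0.5, 0.3, 0.2]  # workload mix stated in this README


def workflow_stream(rate: float, n: int, seed: int):
    """Return n (arrival_time, kind) pairs for a Poisson stream.

    rate is the mean number of arrivals per second (assumed unit).
    """
    rng = np.random.default_rng(seed)  # explicit seed for reproducibility
    # Poisson process: inter-arrival gaps are exponential with mean 1/rate.
    gaps = rng.exponential(scale=1.0 / rate, size=n)
    arrivals = np.cumsum(gaps)
    kinds = rng.choice(KINDS, size=n, p=MIX)
    return [(float(t), str(k)) for t, k in zip(arrivals, kinds)]


for t, kind in workflow_stream(rate=2.0, n=5, seed=7):
    print(f"{t:6.2f}s  {kind}")
```

Calling the function twice with the same seed yields the same stream, which is the property that makes runs repeatable.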
Stable-workload scheduler comparison (108 runs):
| Scheduler | Makespan (s) | Imbalance (CV) | Locality |
|---|---|---|---|
| HEFT-LC | 23.8 | 0.030 | 0.552 |
| DLS | 17.3 | 0.130 | 0.747 |
| RL | 46.7 | 0.049 | 0.492 |
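For readers who want to recompute the derived figures, here is a small sketch. It assumes that imbalance is the coefficient of variation (population standard deviation divided by mean) of per-node busy time; the quantity actually measured by the simulator may differ. The helper names are hypothetical.

```python
import numpy as np


def imbalance_cv(node_busy_time) -> float:
    """Coefficient of variation of per-node busy time (assumed definition)."""
    x = np.asarray(node_busy_time, dtype=float)
    return float(x.std() / x.mean())


def reduction(baseline: float, value: float) -> float:
    """Relative reduction of value with respect to baseline."""
    return (baseline - value) / baseline


# Makespans from the table above: HEFT-LC 23.8 s, DLS 17.3 s.
print(f"{reduction(23.8, 17.3):.1%}")  # 27.3%
```

A perfectly balanced cluster gives a CV of 0; larger values mean the load is spread less evenly.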
- DLS reduces makespan by 27% vs HEFT-LC by exploiting data locality (75% vs 55%).
- MAPE-K controller reduces mean makespan by 31–39% vs static ε=0.05 on heterogeneous clusters by detecting that the default is too aggressive and lowering ε to zero (see Scenario D and the non-stationary three-phase workload, Scenario F).
- RL agent converges from 2.57× to 1.57× the HEFT-LC baseline within 350 episodes (39% improvement). Honest result: it does not beat the strong baseline; the paper discusses why (coarse tabular state).
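The controller behaviour described above (start at ε = 0.05, detect that it is too aggressive, lower it) can be illustrated with a very small MAPE-K style loop. This is a sketch under assumptions, not the logic in `controller.py`: the class name, the hill-climbing rule, the step size, the window length and the toy cost model are all hypothetical.

```python
from collections import deque
from statistics import mean


class EpsilonController:
    """Toy MAPE-K loop that tunes eps by windowed hill climbing (assumed rule)."""

    def __init__(self, eps: float = 0.05, step: float = 0.025, window: int = 5):
        self.eps = eps
        self.step = step
        self.window = deque(maxlen=window)  # Monitor: recent makespans
        self.prev_mean = None               # Knowledge: mean of the last window
        self.direction = -1                 # first move lowers eps

    def observe(self, makespan: float) -> float:
        self.window.append(makespan)
        if len(self.window) < self.window.maxlen:
            return self.eps                 # not enough evidence yet
        current = mean(self.window)         # Analyze
        if self.prev_mean is not None and current > self.prev_mean:
            self.direction = -self.direction  # Plan: last move hurt, reverse it
        self.prev_mean = current
        # Execute: apply the step, clamped to the range [0, 0.2].
        self.eps = min(0.2, max(0.0, self.eps + self.direction * self.step))
        self.window.clear()
        return self.eps


# Toy cost model (hypothetical): makespan grows linearly with eps.
ctrl = EpsilonController()
for _ in range(20):
    ctrl.observe(20.0 + 100.0 * ctrl.eps)
print(ctrl.eps)  # 0.0
```

On this toy model every decrease of ε improves the windowed mean, so the controller walks ε down to zero and stays there.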
Repository layout:

    src/
        cluster.py            # heterogeneous cluster + data residency + failures
        dag_generator.py      # DataDAG model + WorkflowStream (Poisson arrivals)
        schedulers.py         # HEFT-LC, DLS, RLScheduler
        rl_agent.py           # tabular Q-learning agent (4-D discrete state)
        controller.py         # MAPE-K autonomic controller
        simulator.py          # event-driven streaming simulator
    experiments/
        run_experiments.py    # 190 streaming runs across 5 scenarios
        generate_figures.py   # 7 publication-quality figures
    paper/
        paper.tex             # IEEE-format research paper (~7 pages)
    data/
        results.csv           # main results table
        adaptations.csv       # controller adaptation trace
        learning_curve.csv    # per-episode RL convergence data
        shift_timeline.csv    # per-DAG timeline for the shift scenario
    figures/                  # fig1–fig7 (PNG + PDF)
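`rl_agent.py` is described as a tabular Q-learning agent over a 4-D discrete state. The sketch below shows what such an agent typically looks like; it is not the project's code. The class name, the hyperparameters and the meaning of the four state components are assumptions, and the exploration rate is named `explore` to avoid confusion with the scheduler's ε.

```python
import random
from collections import defaultdict


class TabularQ:
    """Minimal tabular Q-learning agent (illustrative, hyperparameters assumed)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, explore=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.explore = alpha, gamma, explore
        self.rng = random.Random(seed)  # explicit seed
        # One row of action values per discrete state, created on first use.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state):
        """Epsilon-greedy choice of a node index for the current task."""
        if self.rng.random() < self.explore:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Hypothetical 4-D state, e.g. (load bin, locality bin, task size bin, queue bin).
agent = TabularQ(n_actions=3)
state = (0, 1, 2, 0)
action = agent.act(state)
agent.update(state, action, reward=-1.0, next_state=state)
```

A table keyed by a 4-tuple of bins stays small and learns quickly, but it cannot distinguish situations that fall into the same bin, which is consistent with the coarse-state limitation noted in the results.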
To install the dependencies and reproduce the results:

    pip install networkx numpy pandas matplotlib scipy seaborn
    python experiments/run_experiments.py     # writes data/*.csv (~3 min)
    python experiments/generate_figures.py    # writes figures/*.pdf + .png

To compile the paper:

    cd paper && pdflatex paper.tex

Every result in the paper traces back to a row in data/results.csv or data/learning_curve.csv. All RNG seeds are explicit. The benchmark runs in under three minutes on a laptop CPU.
Sidali. (2026). AdaptiveFlow: A Self-Tuning Scheduler for Streaming Data
Workflows on Heterogeneous Clusters. Research report, Université Lyon 1.