feat: runnable deployment examples (Dagster, Airflow, Celery)#943
Open
egordm wants to merge 3 commits into
Add an optional artifact_location field to MLFlowStorage and pass it to create_experiment. This lets experiments created against a database tracking backend (e.g. sqlite:///...) store artifacts at an explicit absolute location instead of MLflow's default CWD-relative ./mlruns, keeping a local setup self-contained and cross-process loadable. Backward compatible (defaults to None). Signed-off-by: Egor Dmitriev <egordmitriev2@gmail.com>
…elery)
Add examples/deployment: self-contained, runnable examples for the DAG-based
(Dagster, Airflow) and queued (Celery) deployment patterns. They simulate data
integration with the Liander benchmark dataset, so they run with no external
infrastructure, and demonstrate the production-correct cross-process model
handoff via a local SQLite-backed MLflow store.
- common/: shared pydantic-settings config, mocked external services
(metering, weather, publish) speaking OpenSTEF types, and the pipeline glue.
- dagster_app/: three partitioned assets (input_data, trained_model, forecast),
one partition per target, with a CLI entrypoint and the dagster dev UI.
- airflow_app/: TaskFlow train/forecast DAGs with dynamic task mapping.
- celery_app/: eager train/forecast CLI plus a real worker on a filesystem
broker (no Redis) and the Flower UI.
Wire it into the uv workspace, root ruff/pyright/pytest config, and poe tasks
(deploy-<framework>-{ui,train,forecast}, included from the example's pyproject).
Signed-off-by: Egor Dmitriev <egordmitriev2@gmail.com>
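For readers unfamiliar with the Redis-free Celery setup mentioned for celery_app/, the following is a minimal sketch of what such settings might look like. It is not the example's actual code: the helper name `filesystem_broker_settings` and the `broker` folder layout are hypothetical, and it assumes "eager" refers to Celery's `task_always_eager` setting. The `filesystem://` broker URL and the `data_folder_in`/`data_folder_out` transport options come from Kombu's filesystem transport.

```python
from __future__ import annotations

from pathlib import Path


def filesystem_broker_settings(root: Path, eager: bool = False) -> dict:
    """Build Celery settings for a broker backed by a local folder (hypothetical helper)."""
    queue_dir = root.resolve() / "broker"
    queue_dir.mkdir(parents=True, exist_ok=True)
    return {
        "broker_url": "filesystem://",
        "broker_transport_options": {
            # Same folder for in and out, so one machine both sends and receives.
            "data_folder_in": str(queue_dir),
            "data_folder_out": str(queue_dir),
        },
        # Eager mode runs tasks inline in the calling process, with no worker.
        "task_always_eager": eager,
    }
```

With Celery installed, a dict like this could be applied with `app.conf.update(filesystem_broker_settings(Path(".")))`; the eager flag would then distinguish a CLI run from a real worker run.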
Add a 'Runnable examples' admonition to the deployment guide pointing at the new Dagster/Airflow/Celery examples, and list the deployment examples in the examples README. Signed-off-by: Egor Dmitriev <egordmitriev2@gmail.com>
46ee1c4 to b5b22df
What
Adds `examples/deployment/`: self-contained, runnable examples for deploying OpenSTEF on three orchestrators, implementing the DAG-based (Dagster, Airflow) and queued (Celery) patterns from the deployment guide.
They simulate data integration with the Liander 2024 benchmark dataset, so they run with no external infrastructure, and demonstrate the production-correct cross-process model handoff via a self-contained, SQLite-backed local MLflow store.
Structure
- `common/`: one `Settings` (pydantic-settings) embedding OpenSTEF's `ForecastingWorkflowConfig`; mocked external systems (`services.py`: metering, weather, publish) speaking OpenSTEF's `TimeSeriesDataset`/`ForecastDataset`; and the real pipeline glue (`pipeline.py`).
- `dagster_app/`: three partitioned assets (`input_data` → `trained_model` → `forecast`), one partition per target, with the built-in IO manager between assets and MLflow for the model handoff. UI via `dagster dev`, CLI via `run.py`.
- `airflow_app/`: TaskFlow train/forecast DAGs with dynamic task mapping.
- `celery_app/`: eager train/forecast CLI plus a real worker on a filesystem broker (no Redis) and the Flower UI.

Every framework exposes the same commands: `uv run poe deploy-<framework>-{ui,train,forecast}`.
Library change
Adds an optional `artifact_location` to `MLFlowStorage` (passed to `create_experiment`), so a database tracking backend (e.g. `sqlite:///...`) can store artifacts at an explicit absolute location instead of MLflow's CWD-relative `./mlruns`. Backward compatible (defaults to `None`), covered by a new unit test.
Wiring
uv workspace member; root ruff/pyright/pytest config; poe tasks co-located in the example's pyproject and pulled into root via `[tool.poe] include`.
Verified
All three apps run end-to-end (CLI + UI where applicable); `lint`, `format`, `type` (0 errors), `lock`, `reuse`, and tests all green.
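
To illustrate the idea behind the `artifact_location` library change, here is a minimal sketch of forwarding an explicit absolute artifact location to an MLflow-style `create_experiment` call. This is not the `MLFlowStorage` implementation from this PR: the helper `ensure_experiment` and the `ExperimentClient` protocol are hypothetical, and the client is duck-typed so the sketch runs without MLflow installed.

```python
from __future__ import annotations

from pathlib import Path
from typing import Protocol


class ExperimentClient(Protocol):
    """Duck type for the single MLflow client call used in this sketch."""

    def create_experiment(self, name: str, artifact_location: str | None = None) -> str: ...


def ensure_experiment(
    client: ExperimentClient,
    name: str,
    artifact_root: Path | None = None,
) -> str:
    """Create an experiment, optionally pinning where its artifacts are stored."""
    # Resolve to an absolute file:// URI so the location does not depend on
    # the working directory of whichever process later loads the model.
    location = artifact_root.resolve().as_uri() if artifact_root is not None else None
    # None keeps the tracking backend's default behaviour (backward compatible).
    return client.create_experiment(name, artifact_location=location)
```

With MLflow installed, an `MlflowClient(tracking_uri="sqlite:///mlflow.db")` could presumably be passed as `client`, since its `create_experiment` accepts the same `artifact_location` keyword; the database file name here is only an example.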