cyrizon/m2_enedis

⚡ m2-enedis: Master SISE Project

🚀 Try our services

🧠 Introduction

This repository hosts a complete web application that provides an interface for data analysis, dataset management, and prediction using pretrained machine learning models. The application predicts:

  • Total annual energy cost of a house or apartment
  • DPE classification (Energy Performance Diagnosis)

The DPE classification is a 7-level label ranging from A (most energy efficient) to G (least efficient). It is a crucial criterion when selling or renting properties, with regulatory restrictions for low-performing buildings. Our goal is to predict this classification using easily accessible input data through a simple web form, avoiding the need for detailed technical measurements.

Since the DPE label is closely linked to total energy costs, the project also includes a regression model to predict the annual energy expenditure. Together, these form a two-model prediction pipeline.
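As a rough illustration of how the two models could be combined, the sketch below uses stand-in objects in place of the pretrained pipelines. The `surface_m2` feature, the `StubClassifier` and `StubRegressor` classes, their arbitrary rules, and the `predict_dwelling` helper are all assumptions made for the example, not the project's actual code.

```python
from dataclasses import dataclass

DPE_LABELS = ["A", "B", "C", "D", "E", "F", "G"]  # A = most efficient, G = least


class StubClassifier:
    """Stand-in for the classification pipeline (arbitrary rule, not a trained model)."""

    def predict(self, rows):
        # Returns one encoded class index (0..6) per row, as a label-encoded model would.
        return [min(6, int(r["surface_m2"] // 40)) for r in rows]


class StubRegressor:
    """Stand-in for the regression pipeline (arbitrary rule, not a trained model)."""

    def predict(self, rows):
        # Returns one annual cost estimate in euros per row.
        return [12.5 * r["surface_m2"] for r in rows]


@dataclass
class Prediction:
    dpe_label: str
    annual_cost_eur: float


def predict_dwelling(features, classifier, regressor):
    """Run both models on one dwelling and decode the class index to a letter."""
    class_index = classifier.predict([features])[0]
    cost = regressor.predict([features])[0]
    return Prediction(DPE_LABELS[class_index], float(cost))


result = predict_dwelling({"surface_m2": 80}, StubClassifier(), StubRegressor())
print(result)
```

Keeping both calls behind one helper means the web form only has to collect the inputs once, and any object exposing a `predict` method can be swapped in.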

This project was developed by four Master SISE students and concludes the Python and Machine Learning lessons of the program.

Note

All preliminary data exploration, model testing, and development can be found in the following repository: 📊 Data exploration and models building


⚙️ Installation

1️⃣ Clone the repository

git clone https://github.com/cyrizon/ml-enedis.git
cd ml-enedis

2️⃣ Install dependencies

Option A: Run with Docker Compose (Recommended)

Prerequisites: Docker and Docker Compose installed.

  1. Create a .env file in the project root:
MAPBOX_API_KEY=your_mapbox_api_key_here
  2. Build and run the application:
docker-compose up -d
  3. Access the application:

  4. Stop the application:

docker-compose down
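To check that the .env file is well formed before starting the containers, a small parser along these lines may help. It is only a sketch: `parse_env_text` and `get_mapbox_key` are hypothetical helpers, and how the application itself reads the key inside the container is not documented here.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_mapbox_key(env_text=""):
    """Prefer the process environment, then fall back to the .env content."""
    return os.environ.get("MAPBOX_API_KEY") or parse_env_text(env_text).get("MAPBOX_API_KEY")


sample = "# local settings\nMAPBOX_API_KEY=your_mapbox_api_key_here\n"
print(parse_env_text(sample))
```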

Option B: Run locally

Prerequisite: Python 3.13 installed.

  • Using UV package manager
uv sync
  • Without UV
pip install -r requirements.txt

Before running:

  1. Rename the example file below:
m2_enedis/
  └── .streamlit/
       └── secrets_example.toml -> rename it to "secrets.toml"
  2. Then replace its content with your API key:
MAPBOX_API_KEY="your_api_key"
  3. Run the web app:
  • Using UV package manager
uv run streamlit run home.py
  • Without UV
streamlit run home.py
  4. (Optional) Run the FastAPI backend in a separate terminal:
  • Using UV package manager
uv run uvicorn backend.main:app --host 0.0.0.0 --port 8000
  • Without UV
uvicorn backend.main:app --host 0.0.0.0 --port 8000

📊 Features

  • 🏠 Predict DPE labels (A–G)
  • 💰 Predict total annual energy cost
  • 📈 Interactive web interface with dashboard and map
  • 🔍 Data exploration and visualization built-in
  • 🛜 Download and update datasets

🛠 Tech Stack

This project leverages the following technologies and libraries:

  • Python – Core programming language for the application.
  • Streamlit – Web application framework for interactive UI.
  • FastAPI – Backend API framework for handling requests and predictions.
  • Pydantic – Data validation and schema definition for API inputs.
  • Pandas – Data manipulation and preprocessing.
  • Scikit-learn – Machine learning models, pipelines, and preprocessing.
  • Plotly – Interactive visualizations and dynamic plots.

🗂 Project Structure

ml-enedis/
├─ home.py                   # Streamlit app launcher
├─ pages/                    # Streamlit multi-page interface
│  ├─ data.py
│  ├─ context.py
│  ├─ datasets.py
│  ├─ intro.py
│  ├─ map.py
│  ├─ prediction.py
│  └─ retrain_models.py
├─ assets/                   # Images and icons for app
├─ MLModels/                 # Pretrained machine learning models and encoders
│  ├─ features_target_columns_classification.pkl
│  ├─ features_target_columns_regression.pkl
│  ├─ label_encoder_target.pkl
│  ├─ pipeline_best_regression.pkl
│  └─ pipeline_sgboost_classification.pkl
├─ data/                     # Raw and processed data
│  ├─ climate_zones.csv
│  ├─ communes-france-2025.csv
│  └─ datasets/              # Specific datasets
│     └─ data_69.csv
├─ doc/                      # Documentation
│  ├─ DOC_FONCTIONNELLE.md
│  ├─ DOC_TECHNIQUE.md
│  └─ RAPPORT.md
├─ backend/                  # FastAPI backend
│  ├─ main.py                # API launcher
│  ├─ models/                # Pydantic input validation models
│  │  └─ input_model.py
│  └─ services/              # Backend services for data prep and predictions
│     ├─ data_preparation.py
│     └─ prediction.py
└─ src/                      # Supporting Python modules
   ├─ data_requesters/      # Data fetching modules
   │  ├─ ademe.py
   │  ├─ base_api.py        # ABC class for API requests
   │  ├─ elevation.py
   │  ├─ enedis.py
   │  ├─ geo_features.py
   │  └─ helper.py
   ├─ processing/           # Data processing modules
   │  └─ data_cleaner.py
   └─ utils/                # Utilities for loading and selecting files
      └─ dataloader.py
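The files in MLModels/ carry a .pkl extension, which suggests they can be read with `pickle`; that is an assumption, since such files are sometimes written with joblib instead. The sketch below shows one possible way to load them together. `load_artifacts` is a hypothetical helper, and the demonstration writes placeholder objects to a temporary directory so that it runs without the real models.

```python
import pickle
import tempfile
from pathlib import Path

ARTIFACT_NAMES = [
    "features_target_columns_classification.pkl",
    "features_target_columns_regression.pkl",
    "label_encoder_target.pkl",
    "pipeline_best_regression.pkl",
    "pipeline_sgboost_classification.pkl",
]


def load_artifacts(models_dir):
    """Unpickle every expected artifact, keyed by file stem; fail early if one is missing."""
    models_dir = Path(models_dir)
    missing = [name for name in ARTIFACT_NAMES if not (models_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing artifacts in {models_dir}: {missing}")
    artifacts = {}
    for name in ARTIFACT_NAMES:
        with open(models_dir / name, "rb") as handle:
            artifacts[Path(name).stem] = pickle.load(handle)
    return artifacts


# Demonstration with placeholder objects instead of the real pipelines.
with tempfile.TemporaryDirectory() as tmp:
    for name in ARTIFACT_NAMES:
        with open(Path(tmp) / name, "wb") as handle:
            pickle.dump({"placeholder_for": name}, handle)
    loaded = load_artifacts(tmp)

print(sorted(loaded))
```

Unpickling executes code embedded in the file, so only load artifacts from a trusted source. Loading the real pipelines also requires scikit-learn to be installed.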

📈 Data sources

  • ADEME API opendata:
  • datagouv opendata:
    • French cities database: Coordinates and information for all cities in France.
    • Elevation API: Provides the altitude of given coordinates.
    • Climate Zones: Provides the climate zone of each department in France.
    • Cities Geolocation: Geocodes cities by their INSEE code.
    • City information: Open API providing complementary information about cities, mainly used to search by name and retrieve the INSEE code for geolocation and altitude.
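The project tree describes base_api.py as an ABC class for API requests. The sketch below shows how such a base class might be shaped so that each data source only defines its own request and parsing logic. Every name in it (`BaseRequester`, `ElevationRequester`, `fake_transport`), the placeholder URL, and the response shape are assumptions for illustration; the real modules may be organised differently.

```python
from abc import ABC, abstractmethod


class BaseRequester(ABC):
    """Hypothetical base class: subclasses say what to fetch, the base runs the call."""

    def __init__(self, transport):
        # transport is any callable taking (url, params) and returning decoded JSON;
        # injecting it keeps the sketch runnable without network access.
        self._transport = transport

    @abstractmethod
    def build_request(self, **kwargs):
        """Return a (url, params) tuple for the remote service."""

    @abstractmethod
    def parse_response(self, payload):
        """Extract the useful value from the decoded response."""

    def fetch(self, **kwargs):
        url, params = self.build_request(**kwargs)
        return self.parse_response(self._transport(url, params))


class ElevationRequester(BaseRequester):
    """Illustrative subclass; the URL and response shape are placeholders."""

    def build_request(self, lat, lon):
        return "https://example.invalid/elevation", {"lat": lat, "lon": lon}

    def parse_response(self, payload):
        return float(payload["elevation_m"])


def fake_transport(url, params):
    # Canned response standing in for a real HTTP call.
    return {"elevation_m": 173}


altitude = ElevationRequester(fake_transport).fetch(lat=45.76, lon=4.84)
print(altitude)
```

Passing the transport in as a callable is a design choice of this sketch: it lets each requester be exercised offline with a canned response.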

🪪 License

This project is distributed under the MIT License.
See the LICENSE file for more information.


© 2025
