aj1no/datacenter-efficiency-eda

Data Center Energy Efficiency Analysis (EDA)

Read in Portuguese

Python · Pandas · Seaborn · License: MIT · CI

This project is an Exploratory Data Analysis (EDA) of Data Center energy efficiency. It examines how workload, temperature, and electrical power consumption relate to one another.

The analysis began as an academic project and was later converted into a single, clean Python script. This repository demonstrates data cleaning, statistical inference, and graphical visualization applied to infrastructure metrics.


Key Features of the Script

  1. Automatic Extraction: Downloads the programmer3/data-center-cold-source-control-dataset dataset through the Kaggle API, with no manual step.
  2. Treatment and Normalization: Renames the extracted columns to clear English names (workload, temperature, power_consumption, cooling_parameters), converts the data to numeric types, and replaces null values with the median, a statistic that is robust to outliers.
  3. Outlier Analysis: Flags extreme peaks in temperature and processing load with the Interquartile Range (IQR) rule and removes them as anomalies or sensor noise.
  4. Feature Engineering:
    • esforco_energia ("energy effort"): Energy cost relative to workload.
    • risco_termico ("thermal risk"): Temperature evaluated against the cooling parameters.
  5. Visual Insights (Charts):
    • Histograms of the variable distributions.
    • Comparative boxplots of the data before and after outlier removal.
    • Correlation heatmap of the variables.
    • Scatterplot of workload against power consumption.
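
The cleaning, outlier, and feature steps listed above can be sketched roughly as follows. This is an illustration only, not the code in main.py: the function names, the 1.5 multiplier on the IQR, the choice of workload and temperature as the filtered columns, and both feature formulas (simple ratios) are assumptions made for the example.

```python
import pandas as pd

# Column names as listed in this README.
COLUMNS = ["workload", "temperature", "power_consumption", "cooling_parameters"]


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce every column to numeric and fill gaps with that column's median."""
    out = df[COLUMNS].apply(pd.to_numeric, errors="coerce")
    return out.fillna(out.median())


def remove_iqr_outliers(df: pd.DataFrame, cols: list, k: float = 1.5) -> pd.DataFrame:
    """Keep rows that lie inside [Q1 - k*IQR, Q3 + k*IQR] for every column in cols."""
    q1 = df[cols].quantile(0.25)
    q3 = df[cols].quantile(0.75)
    iqr = q3 - q1
    inside = df[cols].ge(q1 - k * iqr) & df[cols].le(q3 + k * iqr)
    return df[inside.all(axis=1)]


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add the two engineered columns; both formulas are assumed, not taken from main.py."""
    out = df.copy()
    # Assumed: power drawn per unit of workload (zero workload becomes NaN).
    workload = out["workload"].where(out["workload"] != 0)
    out["esforco_energia"] = out["power_consumption"] / workload
    # Assumed: temperature relative to the cooling setting (zero becomes NaN).
    cooling = out["cooling_parameters"].where(out["cooling_parameters"] != 0)
    out["risco_termico"] = out["temperature"] / cooling
    return out


# Tiny made-up frame, only to show the call order.
sample = pd.DataFrame({
    "workload": ["10", "20", "30", "40", "50", "x"],
    "temperature": [20.0, 21.0, 22.0, 23.0, 24.0, 95.0],
    "power_consumption": [100, 200, 300, 400, 500, 600],
    "cooling_parameters": [2, 2, 2, 2, 2, 2],
})
result = add_features(remove_iqr_outliers(clean(sample), ["workload", "temperature"]))
print(result)
```

Filtering before computing the ratios keeps a single sensor spike from dominating the engineered columns, which is one plausible reason to order the steps this way.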

Project Structure

├── main.py        # Main Python script: data loading and cleaning, feature engineering, and plotting
└── .github/       # CI workflows and issue templates

How to Run Locally

Install the main dependencies and run the script:

pip install pandas numpy matplotlib seaborn scikit-learn kagglehub
python main.py
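
For orientation, the download step that the kagglehub dependency supports might look like the sketch below. It assumes the dataset ships at least one CSV file and simply reads the first one found; the helper names `first_csv` and `load_dataset` are hypothetical and may not match what main.py does.

```python
from pathlib import Path

import pandas as pd

# Dataset handle as named in this README.
DATASET = "programmer3/data-center-cold-source-control-dataset"


def first_csv(folder) -> Path:
    """Return the first CSV file (sorted by path) found anywhere under folder."""
    csvs = sorted(Path(folder).rglob("*.csv"))
    if not csvs:
        raise FileNotFoundError(f"no CSV file found under {folder}")
    return csvs[0]


def load_dataset(handle: str = DATASET) -> pd.DataFrame:
    """Download the dataset with kagglehub and load its first CSV (assumed layout)."""
    import kagglehub  # imported lazily so first_csv stays usable offline

    folder = kagglehub.dataset_download(handle)  # returns the local download path
    return pd.read_csv(first_csv(folder))
```

Calling `load_dataset()` needs network access, so it is kept separate from the file lookup, which can be exercised on any local folder.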

This analysis was developed for exploratory purposes, with a focus on green IT management and operational stability.
