This tool performs exploratory data analysis on diamond prices to understand which features (carat, cut, color, clarity) influence the price.
The script processes the diamonds dataset and:
- Loads and inspects the data - Checks shape, data types, missing values, and summary statistics
- Cleans and prepares the data - Converts categorical columns to efficient types
- Generates visualizations - Creates six key plots saved in the /result/ folder
- Performs grouping analysis - Calculates average price and carat by cut and color, and identifies the most expensive diamonds
- Prints key insights - Summarizes findings in the console
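The inspection, cleaning, and grouping steps listed above could look roughly like the following sketch. It is not the script's actual code: the helper names (`load_diamonds`, `inspect_data`, `prepare`, `average_by`, `most_expensive`) are hypothetical, and the column names assume the standard seaborn diamonds schema (`carat`, `cut`, `color`, `clarity`, `price`). A tiny hand-made sample stands in for the real dataset so the example runs without downloading anything.

```python
import pandas as pd

# Assumption: these are the categorical columns of the seaborn diamonds dataset.
CATEGORICAL_COLUMNS = ["cut", "color", "clarity"]


def load_diamonds() -> pd.DataFrame:
    """Load seaborn's diamonds dataset (hypothetical helper, not called below)."""
    import seaborn as sns  # imported lazily so the other helpers work without it

    return sns.load_dataset("diamonds")


def inspect_data(df: pd.DataFrame) -> dict:
    """Collect shape, data types, missing-value counts, and summary statistics."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        "summary": df.describe(),
    }


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with categorical columns stored as the pandas 'category' dtype."""
    out = df.copy()
    for col in CATEGORICAL_COLUMNS:
        if col in out.columns:
            out[col] = out[col].astype("category")
    return out


def average_by(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Average price and carat for each level of `key` (for example 'cut' or 'color')."""
    return df.groupby(key, observed=True)[["price", "carat"]].mean()


def most_expensive(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Return the n rows with the highest price."""
    return df.nlargest(n, "price")


# Tiny illustrative sample (made-up values, not taken from the real dataset).
sample = pd.DataFrame({
    "carat": [0.3, 0.5, 1.0, 1.5],
    "cut": ["Ideal", "Ideal", "Premium", "Premium"],
    "color": ["E", "E", "G", "H"],
    "clarity": ["SI1", "VS2", "VS1", "SI2"],
    "price": [400, 1200, 5000, 9000],
})

report = inspect_data(sample)
clean = prepare(sample)
by_cut = average_by(clean, "cut")
top = most_expensive(clean, 2)
print(by_cut)
```

To try this against the real data, replace `sample` with the result of `load_diamonds()`, which requires seaborn to be installed.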
After running the analysis, you will find the generated charts in the /result/ folder.
- Ensure you have the required dependencies (see Requirements below)
- Run the analysis script using uv: uv run python diamond_analysis.py
- Check the console for summary statistics and insights
- Find all generated plots in the /result/ folder
This script uses uv for Python package management. Make sure you have uv installed:
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate
uv run main.py
Generated by the seaborn team.
Input: Built‑in diamonds dataset from seaborn (no external file needed)
Output:
- Console output: summary statistics, grouped results, and key insights
- /result/ folder containing six PNG images (as shown above)
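As an illustration of how one of the PNG outputs might be produced, here is a minimal sketch that writes a single scatter plot into a result folder. The function name `save_price_vs_carat`, the file name `price_vs_carat.png`, the relative `result` directory, and the choice of a price-versus-carat plot are all assumptions; the README does not say which six plots the script generates or how it names them.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def save_price_vs_carat(carat, price, out_dir="result"):
    """Save a price-vs-carat scatter plot as a PNG and return its path.

    Hypothetical helper: the directory and file names are illustrative only.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the folder if missing

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.scatter(carat, price, alpha=0.5)
    ax.set_xlabel("carat")
    ax.set_ylabel("price")
    ax.set_title("Price vs. carat")

    target = out_path / "price_vs_carat.png"
    fig.savefig(target, dpi=150, bbox_inches="tight")
    plt.close(fig)  # release the figure so repeated calls do not accumulate memory
    return target


# Made-up values for demonstration; the real script would pass dataset columns.
saved = save_price_vs_carat([0.3, 0.5, 1.0, 1.5], [400, 1200, 5000, 9000])
print(saved)
```

Closing each figure after saving keeps memory use flat when several plots are generated in one run.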





