I'm a Sports Data Analyst and Scientist based in Milan, Italy. I specialize in turning complex spatial, event, and tracking data into actionable tactical insights and predictive models. I have a strong academic background in statistical modeling, and my primary focus is advanced football analytics.
- Contextual Football Scouting (In Progress): A data-driven scouting platform designed to mitigate "Team Bias" by isolating individual talent from team tactical structures. Built on UEFA Euro 2024 StatsBomb 360° and Transfermarkt data, the project combines convex-hull spatial metrics, Expected Possession Value (EPV) to spot line-breakers, decision-quality evaluation under pressure, off-ball movement tracking, and a within-role similarity model ("Spatial DNA").
- Serie A CB Scouting Engine: An interactive Streamlit dashboard built on PCA and clustering algorithms to categorize and profile Italian top-flight center-backs.
- Technical Scouting & Match Analysis (Girona FC Recruitment Pipeline): Developed professional scouting deliverables and tactical assessments tailored to the specific recruitment data standards of Girona FC's analytical department.
- Expected Goals (xG) Pipeline: End-to-end development and calibration of xG models using Logistic Regression, Random Forests, and XGBoost.
- Postgraduate in Sports Analytics | Barça Innovation Hub & Universitat Central de Catalunya (2026 - Present)
  - Focusing on processing open, event, and tracking data to solve specific tactical and performance problems using Python and SQL.
- M.Sc. in Statistics, Business Analytics | University of Bologna
  - Graduated 110/110 cum laude.
  - Master's Thesis focused on multivariate time-series forecasting benchmarks (VAR, Random Forest, XGBoost, LightGBM, CatBoost, SVM, LSTM), which later led to a research collaboration and a co-authored scientific paper.
- B.Sc. in Statistical and Economic Sciences | University of Milano-Bicocca
  - Thesis project: Predictive modeling for NBA player salaries.
- University of Bologna (Research Fellow): Managed the full data pipeline—from curation and machine learning modeling to final software implementation and visualization—for a scientific publication.
- Nomisma (Market & Data Analyst): Designed and deployed a predictive model in R that was officially presented at Vinitaly 2025.
- Languages & Databases: Python (Pandas, NumPy, Scikit-learn, XGBoost), R, SQL (MySQL), SAS; MongoDB
- Analytics: Time Series Forecasting, Predictive Modeling, Machine Learning (Regression & Classification), Clustering, Web Scraping
- Tools: Git/GitHub, Power BI, Excel, Streamlit
- LinkedIn: linkedin.com/in/matteo-vezzoli83
- Email: matt.vezzoli@gmail.com