This course is an introduction to the field of machine learning (ML). ML is a branch of artificial intelligence (AI) that enables computers to learn from data and make decisions or predictions without being explicitly programmed. It is used in applications such as speech recognition, internet search engines, and recommendation systems. The course presents methods for solving supervised regression and classification problems, including linear and logistic regression, support vector machines, and decision trees. It then discusses compositions of ML algorithms, the bias-variance decomposition, and methods such as random forest and gradient boosting. The course also covers selection of relevant features, unsupervised ML problems, and ranking. Alongside the ML material, the course discusses the basic mathematical tools actively used in ML: probability and statistics, optimization algorithms, and vector-matrix calculus.
Instructor: Dmitry Kropotov
Teaching assistants: Maksim Nakhodnov, Ivan Shchekotov
Timetable: classes are held on Fridays, 8:15 - 11:00, in SAC, Hall 3. The first lecture is on the 7th of February.
Telegram chat for questions and discussion: link
Assignments: all assignments are issued and checked in the corresponding Teams space.
Written examination, Duration: 120 min, Weight: 100 %
Completion: To pass this module, the exam must be passed with at least 45%.
The course includes several home assignments in the form of Jupyter notebooks, as well as theoretical assignments on vector-matrix calculus. Completing these assignments is fully optional. However, they earn a small bonus: 5% is added to the final course grade if the total assignments grade is between 30% and 65%, and 10% if the total assignments grade is higher than 65%.
A written mid-term exam is planned for the middle of the course. It helps students better understand the types of problems to expect in the final course exam. Passing the mid-term exam also earns a small bonus: 5% is added to the final course grade if the mid-term grade is between 45% and 70%, and 10% if the mid-term grade is higher than 70%.
Taken together, these bonuses cannot exceed 10%, and they are awarded only if the grade for the final exam itself is higher than 45%.
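The bonus rules above can be sketched as a short calculation. This is only an illustrative reading of the rules, not an official grading script. The function names are hypothetical, the range boundaries (30%, 65%, 45%, 70%) are assumed to fall into the 5% tier, and the bonus is assumed to be added as percentage points to the exam grade without any further cap.

```python
def assignments_bonus(assignments_pct: float) -> float:
    """Bonus (percentage points) from the total assignments grade."""
    if assignments_pct > 65:
        return 10.0
    if assignments_pct >= 30:  # assumption: 30% and 65% count as the 5% tier
        return 5.0
    return 0.0


def midterm_bonus(midterm_pct: float) -> float:
    """Bonus (percentage points) from the mid-term exam grade."""
    if midterm_pct > 70:
        return 10.0
    if midterm_pct >= 45:  # assumption: 45% and 70% count as the 5% tier
        return 5.0
    return 0.0


def final_course_grade(exam_pct: float,
                       assignments_pct: float = 0.0,
                       midterm_pct: float = 0.0) -> float:
    """Exam grade plus bonuses, following the rules described above."""
    bonus = 0.0
    # Bonuses apply only if the exam grade itself is higher than 45%.
    if exam_pct > 45:
        total = assignments_bonus(assignments_pct) + midterm_bonus(midterm_pct)
        bonus = min(10.0, total)  # combined bonuses cannot exceed 10%
    # Assumption: the result is not capped at 100%.
    return exam_pct + bonus


if __name__ == "__main__":
    # Exam 60%, assignments 50% (+5), mid-term 80% (+10): bonus capped at 10.
    print(final_course_grade(60, assignments_pct=50, midterm_pct=80))
    # Exam 40%: no bonus is applied, whatever the other grades are.
    print(final_course_grade(40, assignments_pct=100, midterm_pct=100))
```

Under this reading, a student cannot raise a failing exam grade with bonuses, and earning the maximum bonus from either the assignments or the mid-term alone already reaches the 10% cap.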
The final written exam is scheduled for the 23rd of May (Friday) at 9:00. Sample exams from the previous year: test mid-term, test mid-term reference solution, test final exam, test final exam reference solution.
| Date | Number | Topic | Materials |
|---|---|---|---|
| 07.02.25 | 01 | Introduction to the course. Basic terminology in ML, feature types, standard ML problem types. Overfitting and cross-validation. Pandas library and exploratory data analysis. | Presentation ipynb Videos: 1, 2, 3, 4 |
| 14.02.25 | 02 | Linear regression. Loss functions for regression. Regularization. Matrix-vector manipulation and differentiation. | Presentation Videos: 1, 2, 3, 4, 5, 6, 7 |
| 21.02.25 | 03 | Linear regression: numerical optimization methods and data normalization. Stochastic optimization. | Presentation Videos: 1, 2, 3, 4, 5, 6 |
| 28.02.25 | 04 | Linear regression: probabilistic view. Linear classification. | Presentation Videos: 1, 2, 3, 4 |
| 07.03.25 | 05 | Constrained optimization. Support Vector Machine (SVM). | Presentation Videos: 1, 2, 3, 4, 5 |
| 14.03.25 | 06 | Linear regression: Bayesian view. Logistic Regression, probability calibration. Multi-class classification. | Presentation Videos: 1, 2, 3, 4 |
| 21.03.25 | 07 | Decision Trees for regression and classification. Working with missing values and categorical features. | Presentation Videos: 1, 2, 3, 4 |
| 28.03.25 | 08 | Mid-term exam | |
| 04.04.25 | 09 | Bias-Variance Decomposition. Random Forest, Gradient Boosting. | Presentation Videos: 1, 2, 3, 4, 5, 6 |
| 11.04.25 | 10 | Efficient implementation of gradient boosting: XGBoost, LightGBM. Blending and Stacking. | Presentation Videos: 1, 2, 3 |
| 18.04.25 | -- | Semester break. No classes. | |
| 25.04.25 | 11 | Unsupervised learning: clustering, dimension reduction, data visualization. | Presentation Videos: 1, 2, 3, 4 |
| 02.05.25 | 12 | Mid-term exam discussion | Presentation |
| 09.05.25 | 13 | Learning to rank | Presentation Videos: 1, 2, 3, 4 |
- T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition, Springer, 2008.
- S. Shalev-Shwartz, S. Ben-David, Understanding Machine Learning, Cambridge University Press, 2014.
- C. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
- T. M. Mitchell, Machine Learning, McGraw Hill India, 2017.