This repository contains materials for the Interpretability of Large Language Models course (0368.4264) at Tel Aviv University. It is a graduate-level, active-learning course in which students learn about interpretability of LLMs in the style of a collaborative research group. The course is structured around weekly paper readings, in-class discussions, role-playing, and hands-on exercises.[1] Students are assumed to have prior background in natural language processing and machine learning.
In this repository, you will find:
- Schedule and reading lists
- Coding exercises and challenges
The course was developed by Dr. Mor Geva and Daniela Gottesman at Tel Aviv University. We also thank Amit Elhelo, Or Shafran, and Yoav Gur-Arieh for their contributions. We share these materials and hope they serve as a useful resource for anyone curious about or working on the interpretability of large language models.
The schedule is subject to minor changes.
If you have questions or suggestions, please open an issue in this repository.
Footnotes
[1] The course format draws inspiration from the paper-reading seminar by Alec Jacobson and Colin Raffel and The Science of Large Language Models course by Robin Jia.