diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000..09a507c
Binary files /dev/null and b/.DS_Store differ
diff --git a/RL_introduction.pdf b/1. RL_introduction.pdf
similarity index 100%
rename from RL_introduction.pdf
rename to 1. RL_introduction.pdf
diff --git a/2. Multi-Armed Bandits/.DS_Store b/2. Multi-Armed Bandits/.DS_Store
new file mode 100644
index 0000000..37ef980
Binary files /dev/null and b/2. Multi-Armed Bandits/.DS_Store differ
diff --git a/2. Multi-Armed Bandits/2. Multi-Armed Bandits.pdf b/2. Multi-Armed Bandits/2. Multi-Armed Bandits.pdf
new file mode 100755
index 0000000..240e582
Binary files /dev/null and b/2. Multi-Armed Bandits/2. Multi-Armed Bandits.pdf differ
diff --git "a/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/17_A Tutorial on Thompson Sampling.pdf" "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/17_A Tutorial on Thompson Sampling.pdf"
new file mode 100644
index 0000000..41b5e93
Binary files /dev/null and "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/17_A Tutorial on Thompson Sampling.pdf" differ
diff --git "a/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Finite-time Analysis of the Multiarmed Bandit Problem.pdf" "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Finite-time Analysis of the Multiarmed Bandit Problem.pdf"
new file mode 100644
index 0000000..6ea87ca
Binary files /dev/null and "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Finite-time Analysis of the Multiarmed Bandit Problem.pdf" differ
diff --git "a/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Introduction to Multi-Armed Bandits .pdf" "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Introduction to Multi-Armed Bandits .pdf"
new file mode 100644
index 0000000..0b69175
Binary files /dev/null and "b/2. Multi-Armed Bandits/\345\217\202\350\200\203\346\226\207\347\214\256/Introduction to Multi-Armed Bandits .pdf" differ
diff --git a/README.md b/README.md
index 22788f9..947f5ff 100644
--- a/README.md
+++ b/README.md
@@ -4,4 +4,6 @@ slides and other materials
 | Title | Detail | Author | link |
 | ------------------------------------------------------------ | ------------------------------------------------ | ------ | ------------------------------------------------------------ |
 | Introduction about RL | An introduction to reinforcement learning, covering its basic elements, a taxonomy of RL methods, and some examples | 李娜 | [slide](https://github.com/ECNUdase/Reinforcement-Learning-2020/blob/master/RL_introduction.pdf) |
+| Chapter 2: Multi-Armed Bandits | Classic solutions to the multi-armed bandit problem, which mainly address the exploration-exploitation (EE) problem in reinforcement learning. | 韩程程 | [slide](https://github.com/ECNUdase/Reinforcement-Learning-2020/blob/master/2. Multi-Armed Bandits/2. Multi-Armed Bandits.pdf) |
 | Chapter 5: Monte Carlo Methods | Policy evaluation and policy improvement with Monte Carlo methods, covering the different approaches under on-policy and off-policy settings | 刘婷婷 | [slide](https://github.com/ECNUdase/Reinforcement-Learning-2020/blob/master/5.%20MC%20Learning.pdf) |
+