===================
This project covers two main topics: multi-armed bandit theory, with extensive treatment of the Thompson Sampling approach, and the application of multi-armed bandits to active learning. A comprehensive introduction to the theory underlying multi-armed bandits is developed gradually, covering the concepts necessary for understanding the basic bandit strategies. The latter part of the paper presents what is, to the best of our knowledge, the first application of a Thompson Sampling-inspired multi-armed bandit algorithm to the computer science task of learning with insufficient information.
The reader does not require any prior knowledge of this field; only the basics of statistics and probability theory are necessary to follow the text smoothly.
Keywords: multi-armed bandits, active learning, semi-supervised learning, exploration, exploitation, Thompson Sampling