1. Introduction to the Artificial Intelligence and Probabilistic Reasoning (AIPR) Lab
The AIPR Lab at KAIST focuses on probabilistic models and algorithms for sequential decision making, a problem that humans and animals face in daily life. In more detail, we perform theoretical research on reinforcement learning (RL), modelled as a Markov decision process (MDP), and on inverse RL (IRL) for discovering the fundamental behavioral mechanisms of agents. We also perform applied research on topics such as spoken dialogue systems and autonomous virtual military modelling and simulation technology.
2. Research interests
2.1 Theoretical research:
* Reinforcement learning
Reinforcement learning is an area of machine learning concerned with how agents ought to take actions in an environment so as to maximize some notion of cumulative reward. The objective of reinforcement learning is to compute the optimal behavioral policy by balancing between taking the best known action for achieving the goal (exploitation) and learning about the surrounding environment (exploration). In our lab, we aim to develop advanced reinforcement learning algorithms.
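The exploration–exploitation balance described above can be sketched with tabular Q-learning on a hypothetical toy MDP (a five-state chain with a single goal reward); the environment, hyperparameters, and epsilon-greedy rule below are illustrative assumptions, not the lab's algorithms.

```python
import random

# Hypothetical toy environment: a 5-state chain. The agent starts at state 0
# and receives reward 1 only upon reaching the goal state 4.
N_STATES = 5
ACTIONS = [-1, +1]  # step left / step right

def step(state, action):
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Exploration vs. exploitation: mostly act greedily, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda b: q[(state, b)])
            next_state, reward, done = step(state, ACTIONS[a])
            best_next = max(q[(next_state, b)] for b in range(len(ACTIONS)))
            # Temporal-difference update toward reward + discounted best next value.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = next_state
    return q

q = q_learning()
# Greedy policy after learning: action index per state (1 = "step right").
policy = [max(range(len(ACTIONS)), key=lambda b: q[(s, b)]) for s in range(N_STATES)]
print(policy[:4])
```

After training, the greedy policy steps right from every non-goal state, i.e. it has learned to head toward the rewarding goal.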
* Inverse reinforcement learning
Inverse reinforcement learning is the problem of recovering the behavioral principles of agents, such as the underlying reward function, from behavioral history data. It can be applied to analyzing the collective behavior of humans and animals, and has a significant influence on neuroscience and social science.
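The core idea can be illustrated with a deliberately tiny sketch (a hypothetical three-state chain, two candidate reward functions): among candidate rewards, keep the ones under which the expert's demonstrated actions are optimal. Real IRL methods search a continuous reward space; this enumeration is only an assumption-laden toy.

```python
# Hypothetical toy IRL check: which candidate reward explains the expert?
N_STATES = 3
ACTIONS = [-1, +1]
GAMMA = 0.9

def transition(s, a):
    return min(max(s + a, 0), N_STATES - 1)

def greedy_policy(reward):
    # Solve the small deterministic MDP by value iteration, then act greedily.
    v = [0.0] * N_STATES
    for _ in range(100):
        v = [max(reward[transition(s, a)] + GAMMA * v[transition(s, a)]
                 for a in ACTIONS) for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a: reward[transition(s, a)]
                + GAMMA * v[transition(s, a)]) for s in range(N_STATES)]

# Expert demonstration: from states 0 and 1, the expert always steps right.
expert = {0: +1, 1: +1}

candidates = {
    "goal_left":  [1.0, 0.0, 0.0],   # reward at the left end
    "goal_right": [0.0, 0.0, 1.0],   # reward at the right end
}
# Keep candidates whose optimal policy matches every demonstrated action.
consistent = [name for name, r in candidates.items()
              if all(greedy_policy(r)[s] == a for s, a in expert.items())]
print(consistent)
```

Only the reward placed at the right end makes the expert's rightward behavior optimal, so it is the one "recovered" from the demonstrations.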
2.2 Applied research:
* Spoken dialogue system
Dialogues between a user and a system can be modelled as a reinforcement learning problem, where the system reacts to the given context and receives credit (reward) from the user. The space of possible situations and behaviors is huge, so intelligent approximation techniques are required for fast and accurate responses.
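One standard approximation technique for such huge state spaces is linear value-function approximation: represent Q(s, a) by a weighted sum of features instead of a table. The sketch below is a minimal illustration with invented dialogue-context feature names (`slot_filled`, `confirm` are hypothetical), not the lab's system.

```python
# Minimal sketch: linear Q-function approximation with a TD update,
# assuming hand-crafted binary features of the dialogue context.
def features(state, action):
    return [1.0,
            float(state["slot_filled"]),
            float(action == "confirm"),
            float(state["slot_filled"] and action == "confirm")]

def q_value(weights, state, action):
    return sum(w * f for w, f in zip(weights, features(state, action)))

def td_update(weights, state, action, reward, next_q, alpha=0.1, gamma=0.9):
    # One temporal-difference step on the linear parameters.
    err = reward + gamma * next_q - q_value(weights, state, action)
    return [w + alpha * err * f for w, f in zip(weights, features(state, action))]

w = [0.0] * 4
s = {"slot_filled": True}
for _ in range(50):
    # Hypothetical user feedback: reward 1 for confirming a filled slot
    # at the end of the dialogue (so next_q = 0).
    w = td_update(w, s, "confirm", 1.0, next_q=0.0)
print(round(q_value(w, s, "confirm"), 2))
```

The learned Q-value converges toward the observed reward of 1.0, while the parameter vector stays tiny regardless of how many raw dialogue states exist.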
* Autonomous virtual military modelling and simulation technology research
In virtual combat training, we introduce reinforcement learning and optimization techniques to model the virtual military system, moving beyond previous rule-based approaches that can only perform simple behaviors. In particular, we investigate virtual military technology that is robust to uncertain situations by making use of partially observable MDPs (POMDPs).
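What distinguishes a POMDP from an ordinary MDP is that the agent maintains a belief (a probability distribution over hidden states) and updates it with a Bayes filter after each observation. A minimal sketch, assuming a hypothetical two-state scenario (enemy "near" or "far") with invented transition and observation probabilities:

```python
# Bayes-filter belief update at the core of POMDP-based planning.
def belief_update(belief, obs, trans, obs_model):
    states = list(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = {s2: sum(belief[s1] * trans[s1][s2] for s1 in states)
                 for s2 in states}
    # Correction step: weight by the observation likelihood, then normalize.
    unnorm = {s: predicted[s] * obs_model[s][obs] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Hypothetical numbers for illustration only.
trans = {"near": {"near": 0.8, "far": 0.2},
         "far":  {"near": 0.3, "far": 0.7}}
obs_model = {"near": {"contact": 0.9, "silence": 0.1},
             "far":  {"contact": 0.2, "silence": 0.8}}

belief = {"near": 0.5, "far": 0.5}
belief = belief_update(belief, "contact", trans, obs_model)
print(round(belief["near"], 3))
```

Starting from a uniform belief, a single "contact" observation sharply raises the probability that the enemy is near; planning over such beliefs, rather than over raw states, is what makes POMDP policies robust to uncertainty.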