Abstract:
A Reinforcement Learning (RL) agent selects an action based on its observation at each step and receives a reward signal; according to this reward, it improves its behavior over the long run. But as the environment's dimensionality grows, the number of decision-making parameters also grows, and learning time therefore increases. One solution to this problem is for the agent to learn skills automatically. A skill is a set of primitive actions. The main advantage of skills is reusability: after learning a skill, the agent can transfer it or use it elsewhere.
Hierarchical frameworks help the learning agent to acquire skills more efficiently. In traditional methods, these hierarchical frameworks are assumed to be supplied by the designer as prior knowledge, but this is impractical for large and unknown environments. The reinforcement learning agent should therefore be able to learn skills automatically. One method of skill learning is subgoal discovery, in which skills are created based on the discovered subgoals.
In this thesis, we use recent achievements in deep reinforcement learning to identify and extract subgoals. We develop a deep reinforcement learning algorithm that learns the agent's policy in the environment, and based on this policy we construct a policy graph. Finally, subgoals are extracted using bridge centrality.
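The graph-based extraction step can be illustrated with a small sketch (purely illustrative; the toy two-room graph and all helper names below are assumptions for this example, not the thesis's actual implementation). Bridging centrality here follows the common definition of betweenness multiplied by a degree-based bridging coefficient; the doorway node connecting the two rooms is expected to score highest, matching the intuition that subgoals are bottleneck states in the policy graph.

```python
from collections import deque
from itertools import combinations

# Toy "policy graph": two rooms (nodes 0-2 and 4-6) joined by a doorway node 3.
# In the thesis's setting the nodes would come from states visited under the
# learned policy; this graph is hand-made for illustration.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}

def shortest_paths(s, t):
    """Enumerate all shortest paths from s to t (brute force; fine for tiny graphs)."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    paths = []
    def walk(u, path):
        if u == t:
            paths.append(path)
            return
        for v in adj[u]:
            if dist.get(v) == dist[u] + 1:
                walk(v, path + [v])
    walk(s, [s])
    return paths

def betweenness(v):
    """Fraction of shortest paths between other node pairs that pass through v."""
    total = 0.0
    for s, t in combinations(adj, 2):
        if v in (s, t):
            continue
        paths = shortest_paths(s, t)
        total += sum(v in p for p in paths) / len(paths)
    return total

def bridging_coefficient(v):
    """High for low-degree nodes whose neighbours have relatively high degree."""
    deg = lambda u: len(adj[u])
    return (1.0 / deg(v)) / sum(1.0 / deg(u) for u in adj[v])

bridging_centrality = {v: betweenness(v) * bridging_coefficient(v) for v in adj}
subgoal = max(bridging_centrality, key=bridging_centrality.get)
print(subgoal)  # the doorway node 3 scores highest
```

On this toy graph the doorway carries every cross-room shortest path and has low degree relative to its neighbours, so both factors peak there; this is why bottleneck states such as doorways emerge as subgoal candidates.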
The results of the proposed algorithm in the taxi and room-to-room environments (standard benchmarks for skill learning) show that it correctly identifies and extracts subgoals. The results also show that, even without skill acquisition, the proposed algorithm is able to accelerate learning.
Keywords:
Reinforcement Learning, Deep Reinforcement Learning, Hierarchical Framework, Subgoal, Deep Network, Policy Graph, Convolutional Network, Bridge Centrality