Abstract
Succeeding in multi-agent environments is one of the main challenges in artificial intelligence. Such environments are considerably more difficult and complex than single-agent environments because of the dynamism caused by interactions among autonomous agents.
The 2D soccer simulation platform, a complex multi-agent environment, has long served as an important tool for researchers in machine learning. Several properties make it a realistic and demanding environment for learning: it is dynamic, continuous, uncertain, noisy, and only partially observable.
Reinforcement learning (RL) is a machine learning method based on continuous interaction between an agent and its environment. It uses guided trial and error, with feedback from the environment in the form of rewards or punishments. The agent performs a specific action in a specific state of the environment and receives positive or negative feedback (a reward) based on the result of that action. From the rewards received for the actions chosen in each state, the agent learns the optimal policy.
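The interaction loop described above can be illustrated with tabular Q-learning, which is listed among the keywords. The following is only a minimal sketch under stated assumptions, not the method proposed in this work: the one-dimensional corridor environment, the function names `q_learning` and `greedy_policy`, and all parameter values are hypothetical and chosen purely for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustration only).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state yields reward +1 and ends the episode;
    every other step yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Environment step: deterministic move, clipped at the left wall.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update toward the reward plus the discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


def greedy_policy(q):
    """Return the highest-valued action for every state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

In this toy setting the greedy policy for every non-terminal state is to move right, toward the rewarded state. The table has only ten entries here; the same table-based approach becomes impractical once the state is described by many continuous variables.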
Despite its successes across many problems and environments, RL has serious difficulties in environments with many dimensions (the curse of dimensionality). This makes it challenging to apply RL in large and complex environments such as 2D soccer simulation.
Furthermore, in many situations several robots must cooperate to achieve a shared goal. As the number of manufacturers and labs building different robots grows, it becomes increasingly difficult for these robots to cooperate with one another; it would therefore be very helpful to find a way for a robot to cooperate with unknown robots.
In this paper, a two-layer reinforcement learning method is proposed that lets a 2D soccer simulation agent learn to cooperate with an unknown, unfamiliar teammate in order to succeed in the "half field offense" problem. In the lower layer, the agent learns how to cooperate with each of several known teammates; in the upper layer, it uses what it learned in the lower layer to adapt quickly to an unknown teammate.
The main challenge in this work is adapting reinforcement learning to the large and complex environment of 2D soccer simulation. The proposed mechanism copes with the high-dimensional state space of this environment by defining appropriate features for the environment's states and by abstracting the data. The results show a clear improvement over previously proposed methods.
Keywords: Reinforcement Learning, RoboCup 2D Soccer Simulation, Multi-Agent Environment, Cooperation with Unknown Agent, Abstraction, Q-Learning, Half Field Offense