In this paper, a multi-player sequential game with an unknown, non-stationary irrational player is investigated for cooperative decision-making in autonomous robots. In practice, the irrationality of agents can seriously degrade the effectiveness of decision-making, especially in distributed cooperative tasks for multi-robot systems. Specifically, the irrationality can be caused by a cooperating agent's mechanical failure or sensor flaw. To handle this issue, a novel dynamic evaluation system with two important parameters, a cooperation index and a competitive flag, is first designed to efficiently quantify a player's level of cooperation or competition. Then, a continuous deep Q-network space is proposed to predict the action value with respect to a continuous cooperation index.
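One way to read "action value with respect to a continuous cooperation index" is a Q-function that takes the index as an extra input alongside the state. The sketch below illustrates only that idea and is not the paper's architecture: the network size, the function names (`init_q_network`, `q_values`, `greedy_action`), and the treatment of the cooperation index as a single scalar are all assumptions. The weights are random and untrained, and the competitive flag is not modeled.

```python
import math
import random


def init_q_network(state_dim, n_actions, hidden=8, seed=0):
    """Randomly initialise a one-hidden-layer network (untrained, illustration only)."""
    rng = random.Random(seed)
    in_dim = state_dim + 1  # assumption: one extra input carries the cooperation index

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden, in_dim),
        "b1": [0.0] * hidden,
        "W2": matrix(n_actions, hidden),
        "b2": [0.0] * n_actions,
    }


def _affine(weights, bias, x):
    """Compute weights @ x + bias for plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def q_values(params, state, cooperation_index):
    """Return one Q-value per action for the given state and cooperation index."""
    x = list(state) + [float(cooperation_index)]
    hidden = [math.tanh(v) for v in _affine(params["W1"], params["b1"], x)]
    return _affine(params["W2"], params["b2"], hidden)


def greedy_action(params, state, cooperation_index):
    """Pick the action with the largest predicted Q-value."""
    q = q_values(params, state, cooperation_index)
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical usage: the same state evaluated under different cooperation indices.
params = init_q_network(state_dim=4, n_actions=3)
state = [0.1, -0.2, 0.0, 0.5]
for index in (0.0, 0.5, 1.0):
    print(index, q_values(params, state, index))
```

Because the index is an ordinary continuous input, the predicted action values vary smoothly with it, so a single network can cover the whole range of cooperation levels without a separate Q-table per level.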
In this paper, a novel decentralized intelligent adaptive optimal strategy is developed to solve the pursuit-evasion game for massive Multi-Agent Systems (MAS) in uncertain environments. Existing strategies for pursuit-evasion games are neither efficient nor practical for large-population multi-agent systems because of the notorious "curse of dimensionality" and the communication limits that arise when the agent population is large. To overcome these challenges, the emerging mean field game theory is adopted and integrated with reinforcement learning to develop a novel decentralized intelligent adaptive strategy with a new type of adaptive dynamic programming architecture named the Actor-Critic-Mass (ACM). By approximating the solution of the coupled mean field equations online, the developed strategy can obtain the optimal pursuit-evasion policy even for massive MAS in uncertain environments.
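For orientation, the coupled mean field equations in the general mean field game literature are usually a backward Hamilton-Jacobi-Bellman (HJB) equation for the value function and a forward Fokker-Planck-Kolmogorov (FPK) equation for the population density. The generic form below is a sketch taken from that literature, not necessarily the formulation used in this paper; the symbols $V$, $m$, $H$, $f$, $g$, $\nu$, $m_0$, and $T$ are assumptions introduced here.

```latex
\begin{align}
  -\partial_t V(x,t) - \nu \Delta V(x,t) + H\big(x, \nabla V(x,t)\big)
    &= f\big(x, m(\cdot,t)\big), & V(x,T) &= g\big(x, m(\cdot,T)\big), \\
  \partial_t m(x,t) - \nu \Delta m(x,t)
    - \operatorname{div}\!\Big( m(x,t)\, \partial_p H\big(x, \nabla V(x,t)\big) \Big)
    &= 0, & m(x,0) &= m_0(x).
\end{align}
```

The two equations are coupled because the HJB cost $f$ depends on the density $m$, while the FPK drift depends on $\nabla V$. Under this reading, one would expect the critic to approximate $V$, the mass component to approximate $m$, and the actor to approximate the resulting control policy, although the abstract itself does not spell out this assignment.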
In this paper, the finite-horizon intelligent decision-making problem is investigated for self-organized autonomous systems, especially in unstructured environments. According to recent studies, environmental uncertainty seriously affects the effectiveness of decision-making, especially for autonomous systems. To handle these issues, transfer learning and deep reinforcement learning have recently been proposed. However, these existing learning algorithms commonly require a large state space, which makes them time-consuming and unsuitable for real-time applications. Therefore, in this paper, a library of policies trained using Deep Q-Learning under similar environments is built and implemented.