Automatic construction of Markov decision process models for multi-agent reinforcement learning (21 April 2020)
Abstract
This paper describes our current multi-agent reinforcement learning concepts, which complement or replace classic operational planning techniques. A neural planner generates many possible paths. Training the neural planner is a one-time task that uses a physics-based model to create the training data. The outputs of the neural planner are achievable paths. Path intersections are represented as decision-waypoint nodes in a graph, and the graph is interpreted as a Markov Decision Process (MDP). Because only high-level decision waypoints are considered, multi-agent reinforcement learning algorithms train much faster on the resulting MDP than on non-discretized spaces. The technique is applicable to multiple domains, including air, space, land, sea, and cyber-physical domains.
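The graph-construction step described above can be sketched in a few lines. The sketch below is an illustrative assumption, not the authors' implementation: paths are represented as lists of grid waypoints, decision waypoints are points shared by more than one path (plus path endpoints), and transitions between waypoints are taken to be deterministic.

```python
from collections import defaultdict

def build_waypoint_mdp(paths):
    """Build a simple MDP skeleton from a set of achievable paths.

    paths: list of paths, each a list of hashable waypoints (e.g. (x, y)).
    Returns (states, transitions), where states is the set of
    decision waypoints and transitions maps each state to the list of
    next decision waypoints reachable in one action.
    """
    # Record which paths visit each waypoint.
    visited_by = defaultdict(set)
    for i, path in enumerate(paths):
        for p in path:
            visited_by[p].add(i)

    # Decision waypoints: path intersections, plus each path's endpoints.
    states = set()
    for path in paths:
        states.add(path[0])
        states.add(path[-1])
        for p in path:
            if len(visited_by[p]) > 1:
                states.add(p)

    # Actions: follow a path from one decision waypoint to the next.
    transitions = defaultdict(list)
    for path in paths:
        prev = None
        for p in path:
            if p in states:
                if prev is not None and p != prev and p not in transitions[prev]:
                    transitions[prev].append(p)
                prev = p
    return states, dict(transitions)
```

With two crossing paths, the crossing point becomes a state with two available actions, which is the high-level decision structure a multi-agent RL algorithm would train over.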
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Darrell L. Young and Chris Eccles "Automatic construction of Markov decision process models for multi-agent reinforcement learning", Proc. SPIE 11413, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications II, 114130Y (21 April 2020); https://doi.org/10.1117/12.2557823
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Process modeling, Satellites, Data modeling, Computer programming, Robotics, Surveillance, Space operations