Proceedings Article | 6 June 2022
KEYWORDS: Artificial intelligence, Sensors, Evolutionary algorithms, Ranging, Fiber optic gyroscopes, Data modeling, Analytical research, Video, Signal processing, Probability theory
Artificial intelligence (AI), and more specifically deep learning approaches to AI, have led to astonishing results in recent times, making these methods prime candidates for guiding agent actions in military domains. However, it is often difficult to train multiple agents with deep learning approaches when a task is sufficiently complex or the state space is very large, as is often the case in military domains. One possible way to alleviate these difficulties is to leverage military doctrine to assist in the guidance of multi-agent systems. Military doctrine is a guide to action rather than a set of hard rules for the execution of military campaigns, operations, exercises, and engagements. Doctrine, written by experts in their respective domains, is used to ensure that each task associated with an engagement, for example, is executed according to military standards. Such standards ensure coordination between different tasks, resulting in a greater likelihood of mission success. In addition, the efficacy of combining doctrine with deep learning must be tested to determine whether it yields any realized benefit for AI-driven military engagements under adversarial conditions. Further, the inherent complexities of military engagements demand coordination between heterogeneous resources and teams of Soldiers that are often geospatially separated. In this work, we establish a baseline of doctrine-based maneuvers for a military engagement scenario with a multi-agent system (MAS) embedded in the StarCraft Multi-Agent Challenge (SMAC) simulation environment, now a standard test environment for Multi-Agent Reinforcement Learning (MARL). We introduce a hybrid training approach that combines MARL with doctrine (MARDOC) to test whether doctrine-informed MARL policies produce more realistic behaviors and/or improved performance in a simple military engagement task. We compare this hybrid approach to both a doctrine-only approach (i.e., supervised learning to mimic doctrine) and a MARL-only approach to evaluate the efficacy of the proposed MARDOC approach. Our experiments show that MARDOC produces the desired behaviors and improved performance over supervised learning or MARL alone. In summary, the experimental results suggest that MARDOC provides a clear advantage over MARL alone, as doctrinal guidance of MARL exploration helps agents overcome the complexities of military domains.
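For context, the abstract does not include code; the following is a minimal sketch of the SMAC episode loop that serves as the substrate for experiments of this kind, based on the public smac package (github.com/oxwhirl/smac). The "3m" map and the uniformly random action choice are illustrative assumptions, not the paper's actual scenario or policy.

    # Minimal SMAC episode loop (sketch). Requires the `smac` package and a
    # StarCraft II installation. The "3m" map and random action selection are
    # placeholders, not the paper's engagement scenario or trained policy.
    import numpy as np
    from smac.env import StarCraft2Env

    env = StarCraft2Env(map_name="3m")
    n_agents = env.get_env_info()["n_agents"]

    env.reset()
    terminated = False
    episode_reward = 0.0
    while not terminated:
        actions = []
        for agent_id in range(n_agents):
            # Each agent may only choose among its currently available actions.
            avail = env.get_avail_agent_actions(agent_id)
            actions.append(int(np.random.choice(np.nonzero(avail)[0])))
        reward, terminated, _info = env.step(actions)
        episode_reward += reward
    print(f"episode reward: {episode_reward}")
    env.close()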
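The abstract attributes MARDOC's gains to doctrinal guidance of MARL exploration but does not specify the mechanism. One plausible reading, sketched below under that assumption, is to bias epsilon-greedy exploration toward a doctrine-recommended action rather than a uniformly random one; select_action and the doctrine_prob parameter are hypothetical names for illustration, not identifiers from the paper.

    # Doctrine-biased epsilon-greedy action selection (sketch, not the
    # paper's method). Assumes a scripted doctrine policy that recommends
    # one action per agent per step.
    import random
    from typing import List

    def select_action(
        q_values: List[float],
        avail_actions: List[int],   # 1 if the action is available, else 0 (SMAC convention)
        doctrine_action: int,       # action recommended by the doctrine policy (assumed)
        epsilon: float,
        doctrine_prob: float,       # hypothetical knob: how often exploration follows doctrine
    ) -> int:
        available = [a for a, ok in enumerate(avail_actions) if ok]
        if random.random() >= epsilon:
            # Exploit: best available action under the current Q-estimates.
            return max(available, key=lambda a: q_values[a])
        if doctrine_action in available and random.random() < doctrine_prob:
            # Explore along the doctrine-recommended maneuver.
            return doctrine_action
        # Otherwise explore uniformly over the available actions.
        return random.choice(available)

    # Example: doctrine recommends action 2, so exploration is biased toward it.
    a = select_action([0.1, 0.4, 0.3, 0.0], [1, 1, 1, 0],
                      doctrine_action=2, epsilon=0.3, doctrine_prob=0.8)

Under this reading, annealing doctrine_prob toward zero over training would recover standard epsilon-greedy MARL as a limiting case.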