The performance of algorithmic decision-making approaches depends on many factors, including hyperparameters, which makes some solutions difficult to find. In our previous work (MARDOC paradigm),1 we found that even in a simple environment2 (i.e., a small map with few obstacles), merely changing the initial conditions or the doctrine-guided policy (MARDOC) had a significant impact on the converged behavior. Further, we found that not all policies were useful or desirable for military applications. In this paper, we focus on a complex environment (i.e., a larger map with a greater number of heterogeneous assets and a stronger adversarial force) to analyze the impact of different doctrinal control parameters on the performance and behavior of fixed doctrinal policies. In particular, we prioritize the Red force assets for targeted maneuvers and attacks. We hypothesize that asset type and the corresponding coordination among assets will have a significant impact on the performance of the Blue force. Our preliminary experiments in this complex environment showed that performance varies substantially depending on asset capability and coordination between teams.