From Event: SPIE Defense + Commercial Sensing, 2019
Deep reinforcement learning has been successful in training agents to play Atari games at human level. There, raw game images were fed into a deep neural network to compute optimal actions. Conceptually, reinforcement learning can be viewed as the intersection of planning, uncertainty, and learning. In this paper, a deep reinforcement learning method is applied to a problem formulated as a partially observable Markov decision process (POMDP). Specifically, the input images are perturbed to introduce imperfect knowledge. POMDP formulations assume uncertainty in the true state space and thus offer a more accurate representation of real-world scenarios. The deep Q-network is adopted to test whether an optimal sequence of actions can be learned when the inputs are not fully observable. Experimental results indicate that deep reinforcement learning discovered optimal strategies in the majority of test cases, albeit with slower convergence to the optimal solution.
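The core idea of the abstract, perturbing the agent's input images so that the true state is only partially observable, can be sketched as below. The noise-plus-occlusion scheme and its parameters are illustrative assumptions; the paper states only that the input images are perturbed.

```python
import numpy as np

def perturb_observation(frame, noise_std=0.1, occlusion=8, rng=None):
    """Degrade a game frame so the agent's state is only partially observable.

    Adds Gaussian pixel noise and blanks out a random square patch before the
    frame is passed to the Q-network. This particular perturbation scheme is
    an assumption for illustration, not the paper's exact setup.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = frame + rng.normal(0.0, noise_std, size=frame.shape)
    # Zero out a random occlusion x occlusion patch to simulate missing data.
    h, w = frame.shape[:2]
    y = rng.integers(0, h - occlusion)
    x = rng.integers(0, w - occlusion)
    noisy[y:y + occlusion, x:x + occlusion] = 0.0
    # Keep pixel values in the valid [0, 1] range.
    return np.clip(noisy, 0.0, 1.0)

# Example: an 84x84 grayscale frame, the input size used by the original DQN.
frame = np.random.default_rng(0).random((84, 84))
obs = perturb_observation(frame, rng=np.random.default_rng(1))
```

In a DQN training loop, `obs` rather than `frame` would be stacked and fed to the network, so the learned policy must cope with the imperfect observations.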
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Song Jun Park and Dale R. Shires, "Learning optimal actions with imperfect images," Proc. SPIE 10996, Real-Time Image Processing and Deep Learning 2019, 109960F (Presented at SPIE Defense + Commercial Sensing: April 15, 2019; Published: 4 June 2019); https://doi.org/10.1117/12.2518921.