3 April 2000 Pattern-level temporal difference learning, data fusion, and chess
Abstract
Our research group is using chess as a vehicle for studying the fusion of adaptation, multiple representation, and search technologies for real-time decision making. Chess systems like Deep Blue have achieved Grandmaster-level play with a brute-force search of the game tree and human-supplied information, such as piece values and opening books. However, subtle aspects of chess, including positional features and advanced concepts, cannot be represented or processed efficiently with the standard method. Since 1989, Morph I-III have exhibited more autonomy and learning ability than traditional chess programs in 'adaptive pattern-oriented chess'. Like its predecessors, Morph IV is a reinforcement learner, but it also uses a new technique we call pattern-level TD and Q-learning to mathematically map the state space and effectively learn to classify situations. Its three knowledge sources include two traditional ones, material and a piece-square table, and a new method called Distance. These are combined using a simple genetic algorithm and a decision tree. This paper shows the effectiveness of fusing knowledge to replace search in real-time situations: an agent that combines all sources consistently beats an agent that employs any single knowledge source. Surprisingly, the pattern-level TD agent is slightly superior to the pattern-level Q-learning agent, even though the Q-learning agent updates more Q-values on each temporal step.
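For readers unfamiliar with the two update rules the abstract contrasts, the following is a minimal sketch of the textbook TD(0) and Q-learning updates applied to values keyed by a pattern. It is not Morph IV's implementation: the pattern names, the move label, the learning rate ALPHA, and the discount GAMMA are all assumptions chosen for illustration.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from the paper)
GAMMA = 0.9  # discount factor (assumed value, not from the paper)


def td0_update(values, pattern, next_pattern, reward):
    """Textbook TD(0): V(p) <- V(p) + alpha * (r + gamma * V(p') - V(p))."""
    target = reward + GAMMA * values[next_pattern]
    values[pattern] += ALPHA * (target - values[pattern])
    return values[pattern]


def q_update(q, pattern, action, next_pattern, next_actions, reward):
    """Textbook Q-learning: bootstrap from the best action in the next pattern."""
    # With no legal follow-up actions (terminal state), the bootstrap term is 0.
    best_next = max((q[(next_pattern, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * best_next
    q[(pattern, action)] += ALPHA * (target - q[(pattern, action)])
    return q[(pattern, action)]


# Hypothetical pattern and move labels, used only to exercise the updates.
values = defaultdict(float)
q = defaultdict(float)
v = td0_update(values, "rook_on_open_file", "terminal_win", reward=1.0)
qa = q_update(q, "rook_on_open_file", "Rd1", "terminal_win", [], reward=1.0)
```

The TD rule keeps one value per pattern, while the Q-learning rule keeps one value per (pattern, action) pair, which is one generic reason a Q-learner has more entries to update and estimate.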
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Robert Levinson and Ryan J. Weber "Pattern-level temporal difference learning, data fusion, and chess", Proc. SPIE 4051, Sensor Fusion: Architectures, Algorithms, and Applications IV, (3 April 2000); https://doi.org/10.1117/12.381657
PROCEEDINGS
12 PAGES