Factored temporal difference learning in the New Ties environment

Although reinforcement learning is a popular method for training an agent for decision making based on rewards, well studied tabular methods are not applicable for large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a l...

Bibliographic details
Authors: Gyenes Viktor
Bontovics Ákos
Lőrincz András
Corporate author: Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest)
Document type: Article
Published: 2008
Series: Acta cybernetica 18 No. 4
Keywords: Computer science, Cybernetics
Online Access: http://acta.bibl.u-szeged.hu/12840
LEADER 01674nab a2200253 i 4500
001 acta12840
005 20220616145103.0
008 161015s2008 hu o 0|| eng d
022 |a 0324-721X 
040 |a SZTE Egyetemi Kiadványok Repozitórium  |b hun 
041 |a eng 
100 1 |a Gyenes Viktor 
245 1 0 |a Factored temporal difference learning in the new ties environment  |h [elektronikus dokumentum] /  |c  Gyenes Viktor 
260 |c 2008 
300 |a 651-668 
490 0 |a Acta cybernetica  |v 18 No. 4 
520 3 |a Although reinforcement learning is a popular method for training an agent for decision making based on rewards, well studied tabular methods are not applicable for large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a linear function approximation scheme utilising natural features coming from the structure of the task. We conducted experiments in the New Ties environment, which is a novel platform for multi-agent simulations. We show that learning utilising a factored representation is effective even in large state spaces, furthermore it outperforms tabular methods even in smaller problems both in learning speed and stability, because of its generalisation capabilities. 
650 4 |a Természettudományok 
650 4 |a Számítás- és információtudomány 
695 |a Számítástechnika, Kibernetika 
700 0 1 |a Bontovics Ákos  |e aut 
700 0 1 |a Lőrincz András  |e aut 
710 |a Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest) 
856 4 0 |u http://acta.bibl.u-szeged.hu/12840/1/Gyenes_2008_ActaCybernetica.pdf  |z Dokumentum-elérés