Factored temporal difference learning in the New Ties environment
Although reinforcement learning is a popular method for training an agent to make decisions based on rewards, well-studied tabular methods are not applicable to large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a l...
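As an illustration of the idea named in the abstract, the following is a minimal sketch of TD(0) with a linear value function over factored features. It is not the authors' implementation: the toy two-variable grid task, the random policy, the one-hot-per-variable feature map, and all names (`features`, `td0_update`, `run`) and parameter values are assumptions made for this example only.

```python
import random


def features(state, sizes):
    """Hypothetical factored feature map: one one-hot block per state variable."""
    phi = []
    for value, size in zip(state, sizes):
        block = [0.0] * size
        block[value] = 1.0
        phi.extend(block)
    return phi


def value(w, phi):
    """Linear value estimate: dot product of weights and features."""
    return sum(wi * fi for wi, fi in zip(w, phi))


def td0_update(w, phi, reward, phi_next, alpha=0.1, gamma=0.9, terminal=False):
    """One TD(0) step: w <- w + alpha * delta * phi."""
    bootstrap = 0.0 if terminal else gamma * value(w, phi_next)
    delta = reward + bootstrap - value(w, phi)
    return [wi + alpha * delta * fi for wi, fi in zip(w, phi)]


def run(episodes=2000, sizes=(4, 4), seed=0):
    """Evaluate a random policy on a toy grid (assumed task, not New Ties)."""
    rng = random.Random(seed)
    w = [0.0] * sum(sizes)  # one weight per (variable, value) pair
    goal = (sizes[0] - 1, sizes[1] - 1)
    for _ in range(episodes):
        state = (0, 0)
        while state != goal:
            x, y = state
            # Random policy: try to increase x or y, clipped at the border.
            if rng.random() < 0.5:
                nxt = (min(x + 1, sizes[0] - 1), y)
            else:
                nxt = (x, min(y + 1, sizes[1] - 1))
            done = nxt == goal
            reward = 1.0 if done else 0.0
            w = td0_update(w, features(state, sizes), reward,
                           features(nxt, sizes), terminal=done)
            state = nxt
    return w
```

In this sketch the weight vector has 4 + 4 = 8 entries instead of the 16 a table over the same grid would need, so every update to one state also shifts the estimates of all states sharing a variable value. That sharing is the kind of generalisation a factored representation provides.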
Authors: | Gyenes Viktor, Bontovics Ákos, Lőrincz András |
---|---|
Corporate author: | Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest) |
Document type: | Article |
Published: | 2008 |
Series: | Acta cybernetica 18 No. 4 |
Keywords: | Computer science, Cybernetics |
Subjects: | Natural sciences, Computer and information science |
Online Access: | http://acta.bibl.u-szeged.hu/12840 |
LEADER | 01674nab a2200253 i 4500 | ||
---|---|---|---|
001 | acta12840 | ||
005 | 20220616145103.0 | ||
008 | 161015s2008 hu o 0|| eng d | ||
022 | |a 0324-721X | ||
040 | |a SZTE Egyetemi Kiadványok Repozitórium |b hun | ||
041 | |a eng | ||
100 | 1 | |a Gyenes Viktor | |
245 | 1 | 0 | |a Factored temporal difference learning in the new ties environment |h [electronic document] / |c Gyenes Viktor |
260 | |c 2008 | ||
300 | |a 651-668 | ||
490 | 0 | |a Acta cybernetica |v 18 No. 4 | |
520 | 3 | |a Although reinforcement learning is a popular method for training an agent to make decisions based on rewards, well-studied tabular methods are not applicable to large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a linear function approximation scheme utilising natural features coming from the structure of the task. We conducted experiments in the New Ties environment, which is a novel platform for multi-agent simulations. We show that learning with a factored representation is effective even in large state spaces; furthermore, it outperforms tabular methods even in smaller problems, in both learning speed and stability, because of its generalisation capabilities. | |
650 | 4 | |a Natural sciences | |
650 | 4 | |a Computer and information science | |
695 | |a Computer science, Cybernetics | ||
700 | 0 | 1 | |a Bontovics Ákos |e aut |
700 | 0 | 1 | |a Lőrincz András |e aut |
710 | |a Symposium of Young Scientists on Intelligent Systems (2.) (2007) (Budapest) | ||
856 | 4 | 0 | |u http://acta.bibl.u-szeged.hu/12840/1/Gyenes_2008_ActaCybernetica.pdf |z Document access |