dc.contributor.author: Jung, T.
dc.contributor.author: Polani, D.
dc.date.accessioned: 2011-10-20T11:01:11Z
dc.date.available: 2011-10-20T11:01:11Z
dc.date.issued: 2007
dc.identifier.citation: Jung, T. & Polani, D. 2007, Kernelizing LSPE λ. In Procs of the 2007 Symposium on Approximate Dynamic Programming & Reinforcement Learning (ADPRL 2007), vol. 2007, Institute of Electrical and Electronics Engineers (IEEE), pp. 338-345.
dc.identifier.isbn: 1-4244-0706-0
dc.identifier.other: PURE: 425284
dc.identifier.other: PURE UUID: 3c26bf2c-6982-44b3-95fa-72d79185bbf2
dc.identifier.other: dspace: 2299/1920
dc.identifier.other: Scopus: 34548765672
dc.identifier.other: ORCID: /0000-0002-3233-5847/work/86098100
dc.identifier.uri: http://hdl.handle.net/2299/6735
dc.description.abstract [en]: We propose the use of kernel-based methods as underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular we present the ‘kernelization’ of model-free LSPE(λ). The ‘kernelization’ is computationally made possible by using the subset of regressors approximation, which approximates the kernel using a vastly reduced number of basis functions. The core of our proposed solution is an efficient recursive implementation with automatic supervised selection of the relevant basis functions. The LSPE method is well-suited for optimistic policy iteration and can thus be used in the context of online reinforcement learning. We use the high-dimensional Octopus benchmark to demonstrate this.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof: Procs of the 2007 Symposium on Approximate Dynamic Programming & Reinforcement Learning (ADPRL 2007)
dc.title [en]: Kernelizing LSPE λ
dc.contributor.institution: Centre for Computer Science and Informatics Research
dc.contributor.institution: Adaptive Systems
dc.contributor.institution: Department of Computer Science
dc.contributor.institution: School of Physics, Engineering & Computer Science
dc.contributor.institution: Centre for Future Societies Research
rioxxterms.version: VoR
rioxxterms.type: Other
herts.preservation.rarelyaccessed: true

