Decoupling Exploration and Exploitation for Unsupervised Pre-training with Successor Features

Kim, J; Xuan, J; Liang, C; Hussain, F

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: 2024 International Joint Conference on Neural Networks (IJCNN), 2024, 00, pp. 1-8
Issue Date: 2024-09-09

Embargoed

Filename	Description	Size	Format
Decoupling Exploration and Exploitation for Unsupervised Pre-training with Successor Features.pdf	Accepted version	530.46 kB	Adobe PDF

This item is currently unavailable due to the publisher's embargo.

The embargo period expires on 9 Sep 2026

Full metadata record

Field	Value	Language
dc.contributor.author	Kim, J
dc.contributor.author	Xuan, J https://orcid.org/0000-0002-8367-6908
dc.contributor.author	Liang, C
dc.contributor.author	Hussain, F https://orcid.org/0000-0003-1513-8072
dc.date	2024-06-30
dc.date.accessioned	2025-01-15T22:15:29Z
dc.date.available	2025-01-15T22:15:29Z
dc.date.issued	2024-09-09
dc.identifier.citation	2024 International Joint Conference on Neural Networks (IJCNN), 2024, 00, pp. 1-8
dc.identifier.isbn	979-8-3503-5932-9
dc.identifier.issn	2161-4393
dc.identifier.uri	http://hdl.handle.net/10453/183680
dc.description.abstract	Unsupervised pre-training has been on the lookout for the virtue of a value function representation referred to as successor features (SFs), which decouples the dynamics of the environment from the rewards. It has a significant impact on the process of task-specific fine-tuning due to the decomposition. However, existing approaches struggle with local optima due to the unified intrinsic reward of exploration and exploitation without considering the linear regression problem and the discriminator supporting a small skill space. We propose a novel unsupervised pre-training model with SFs based on a non-monolithic exploration methodology. Our approach pursues the decomposition of exploitation and exploration of an agent built on SFs, which requires separate agents for the respective purpose. The idea will leverage not only the inherent characteristics of SFs, such as a quick adaptation to new tasks, but also the exploratory and task-agnostic capabilities. Our suggested model is termed Non-Monolithic unsupervised Pretraining with Successor features (NMPS), which improves the performance of the original monolithic exploration method of pre-training with SFs. NMPS outperforms Active Pre-training with Successor Features (APS) in a comparative experiment.
dc.language	en
dc.publisher	IEEE
dc.relation	http://purl.org/au-research/grants/arc/DE200100245
dc.relation	http://purl.org/au-research/grants/arc/LP210301046
dc.relation.ispartof	2024 International Joint Conference on Neural Networks (IJCNN)
dc.relation.ispartof	2024 International Joint Conference on Neural Networks
dc.relation.isbasedon	10.1109/ijcnn60899.2024.10651424
dc.rights	info:eu-repo/semantics/embargoedAccess
dc.title	Decoupling Exploration and Exploitation for Unsupervised Pre-training with Successor Features
dc.type	Conference Proceeding
utslib.citation.volume	00
utslib.location.activity	Yokohama, Japan
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	University of Technology Sydney/UTS Groups
pubs.organisational-group	University of Technology Sydney/UTS Groups/Australian Artificial Intelligence Institute (AAII)
utslib.copyright.status	embargoed	*
utslib.copyright.embargo	2026-09-09T00:00:00+1000Z
dc.date.updated	2025-01-15T22:15:28Z
pubs.finish-date	2024-07-05
pubs.place-of-publication	Piscataway, USA
pubs.publication-status	Published
pubs.start-date	2024-06-30
pubs.volume	00
dc.location	Piscataway, USA

Abstract:

Unsupervised pre-training has been on the lookout for the virtue of a value function representation referred to as successor features (SFs), which decouples the dynamics of the environment from the rewards. It has a significant impact on the process of task-specific fine-tuning due to the decomposition. However, existing approaches struggle with local optima due to the unified intrinsic reward of exploration and exploitation without considering the linear regression problem and the discriminator supporting a small skill space. We propose a novel unsupervised pre-training model with SFs based on a non-monolithic exploration methodology. Our approach pursues the decomposition of exploitation and exploration of an agent built on SFs, which requires separate agents for the respective purpose. The idea will leverage not only the inherent characteristics of SFs, such as a quick adaptation to new tasks, but also the exploratory and task-agnostic capabilities. Our suggested model is termed Non-Monolithic unsupervised Pretraining with Successor features (NMPS), which improves the performance of the original monolithic exploration method of pre-training with SFs. NMPS outperforms Active Pre-training with Successor Features (APS) in a comparative experiment.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/183680