Optimization-driven Hierarchical Deep Reinforcement Learning for Hybrid Relaying Communications
- Publisher:
- IEEE
- Publication Type:
- Conference Proceeding
- Citation:
- 2020 IEEE Wireless Communications and Networking Conference (WCNC), May 2020, pp. 1-6
- Issue Date:
- 2020-06-19
Closed Access
Filename | Description | Size
---|---|---
09120470.pdf | Published version | 219.56 kB
This item is closed access and not available.
In this paper, we employ multiple wireless-powered user devices as wireless relays to assist information transmission from a multi-antenna access point to a single-antenna receiver. To improve energy efficiency, we design a hybrid relaying communication strategy in which wireless relays are allowed to operate in either the passive mode via backscatter communications or the active mode via RF communications, depending on their channel conditions and energy states. We aim to maximize the overall signal-to-noise ratio (SNR) by jointly optimizing the access point's beamforming strategy as well as the individual relays' radio modes and operating parameters. Due to the non-convex and combinatorial structure of the SNR maximization problem, we develop a deep reinforcement learning approach that adapts the beamforming and relaying strategies dynamically. In particular, we propose a novel optimization-driven hierarchical deep deterministic policy gradient (H-DDPG) approach that integrates model-based optimization into the framework of the conventional DDPG approach. It decomposes the problem hierarchically: an outer loop selects the discrete relay modes using a deep Q-network (DQN) algorithm, and an inner loop then optimizes the continuous beamforming and relays' operating parameters using the DDPG algorithm. Simulation results reveal that H-DDPG is robust to the hyperparameters and speeds up the learning process compared to the conventional DDPG approach.
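The hierarchical decomposition described in the abstract can be illustrated with a toy sketch: an outer loop chooses a discrete relay-mode combination (the role played by the DQN in the paper, simplified here to a tabular Q-update), and an inner loop optimizes the continuous parameters for that fixed mode (the role played by DDPG, simplified here to random search). The SNR surrogate, parameter names, and all constants below are illustrative assumptions, not the paper's actual model or implementation.

```python
import numpy as np

# Toy stand-in for the hierarchical structure: outer loop over discrete
# relay modes, inner loop over continuous parameters. This is NOT the
# paper's algorithm; DQN and DDPG are replaced by a tabular bandit-style
# Q-update and random search, purely to show the two-level decomposition.

rng = np.random.default_rng(0)
N_RELAYS = 3                  # each relay is passive (0) or active (1)
MODES = 2 ** N_RELAYS         # joint discrete mode combinations

def toy_snr(mode, theta):
    """Hypothetical SNR surrogate: mode-dependent gains times a
    concave function of the continuous parameters theta in [0, 1]."""
    gains = 1.0 + 0.5 * np.array([(mode >> i) & 1 for i in range(N_RELAYS)])
    return float(gains @ (1.0 - (theta - 0.5) ** 2))

def inner_loop(mode, iters=200):
    """Inner loop (DDPG's role in the paper): optimize the continuous
    parameters for a fixed discrete mode, here via random search."""
    best_theta, best_val = None, -np.inf
    for _ in range(iters):
        theta = rng.uniform(0.0, 1.0, size=N_RELAYS)
        val = toy_snr(mode, theta)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta, best_val

def outer_loop(episodes=100, eps=0.2, lr=0.5):
    """Outer loop (DQN's role in the paper): epsilon-greedy selection
    over discrete modes, rewarded by the inner loop's achieved SNR."""
    q = np.zeros(MODES)
    for _ in range(episodes):
        mode = int(rng.integers(MODES)) if rng.random() < eps else int(np.argmax(q))
        _, snr = inner_loop(mode)
        q[mode] += lr * (snr - q[mode])   # bandit-style Q-value update
    return int(np.argmax(q)), q

best_mode, q_values = outer_loop()
```

In the paper itself, the outer-loop DQN and inner-loop DDPG are trained jointly, with the model-based optimization guiding the DDPG's action search; the sketch above only conveys how the discrete and continuous decisions are separated across the two loops.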