DynaComm: Accelerating Distributed CNN Training between Edges and Clouds Through Dynamic Communication Scheduling

Cai, S; Wang, D; Wang, H; Lyu, Y; Xu, G; Zheng, X; Vasilakos, AV

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Journal on Selected Areas in Communications, 2022, 40, (2), pp. 611-625
Issue Date: 2022-02-01

Closed Access

Filename	Description	Size	Format
DynaComm_Accelerating_Distributed_CNN_Training_Between_Edges_and_Clouds_Through_Dynamic_Communication_Scheduling.pdf	Published version	2.47 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Cai, S
dc.contributor.author	Wang, D
dc.contributor.author	Wang, H
dc.contributor.author	Lyu, Y
dc.contributor.author	Xu, G
dc.contributor.author	Zheng, X
dc.contributor.author	Vasilakos, AV
dc.date.accessioned	2022-04-18T19:26:15Z
dc.date.available	2022-04-18T19:26:15Z
dc.date.issued	2022-02-01
dc.identifier.citation	IEEE Journal on Selected Areas in Communications, 2022, 40, (2), pp. 611-625
dc.identifier.issn	0733-8716
dc.identifier.issn	1558-0008
dc.identifier.uri	http://hdl.handle.net/10453/156337
dc.description.abstract	To reduce uploading bandwidth and address privacy concerns, deep learning at the network edge has been an emerging topic. Typically, edge devices collaboratively train a shared model using real-time generated data through the Parameter Server framework. Although all the edge devices can share the computing workloads, the distributed training processes over edge networks are still time-consuming due to the parameters and gradients transmission procedures between parameter servers and edge devices. Focusing on accelerating distributed Convolutional Neural Networks (CNNs) training at the network edge, we present DynaComm, a novel scheduler that dynamically decomposes each transmission procedure into several segments to achieve optimal layer-wise communications and computations overlapping during run-time. Through experiments, we verify that DynaComm manages to achieve optimal layer-wise scheduling for all cases compared to competing strategies while the model accuracy remains untouched.
dc.language	English
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Journal on Selected Areas in Communications
dc.relation.isbasedon	10.1109/JSAC.2021.3118419
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0805 Distributed Computing, 0906 Electrical and Electronic Engineering, 1005 Communications Technologies
dc.subject.classification	Networking & Telecommunications
dc.title	DynaComm: Accelerating Distributed CNN Training between Edges and Clouds Through Dynamic Communication Scheduling
dc.type	Journal Article
utslib.citation.volume	40
utslib.for	0805 Distributed Computing
utslib.for	0906 Electrical and Electronic Engineering
utslib.for	1005 Communications Technologies
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2022-04-18T19:26:13Z
pubs.issue	2
pubs.publication-status	Published
pubs.volume	40
utslib.citation.issue	2

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/156337