Crowd counting via hierarchical scale recalibration network

Zou, Z; Liu, Y; Xu, S; Wei, W; Wen, S; Zhou, P


Publication Type: Conference Proceeding
Citation: Frontiers in Artificial Intelligence and Applications, 2020, 325, pp. 2864-2871
Issue Date: 2020-08-24

This item is open access.


Full metadata record

Field	Value	Language
dc.contributor.author	Zou, Z
dc.contributor.author	Liu, Y
dc.contributor.author	Xu, S
dc.contributor.author	Wei, W
dc.contributor.author	Wen, S https://orcid.org/0000-0001-8077-7001
dc.contributor.author	Zhou, P
dc.date.accessioned	2021-05-19T22:26:53Z
dc.date.available	2021-05-19T22:26:53Z
dc.date.issued	2020-08-24
dc.identifier.citation	Frontiers in Artificial Intelligence and Applications, 2020, 325, pp. 2864-2871
dc.identifier.isbn	9781643681009
dc.identifier.issn	0922-6389
dc.identifier.uri	http://hdl.handle.net/10453/148976
dc.language	en
dc.relation.ispartof	Frontiers in Artificial Intelligence and Applications
dc.relation.isbasedon	10.3233/FAIA200429
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Crowd counting via hierarchical scale recalibration network
dc.type	Conference Proceeding
utslib.citation.volume	325
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	open_access	*
dc.date.updated	2021-05-19T22:26:49Z
pubs.publication-status	Published
pubs.volume	325

Abstract:

Crowd counting is extremely challenging, largely because of the huge variation in scale. Previous works tend to tackle this with a naive concatenation of multi-scale information, ignoring the scale shifts between the feature maps. In this paper, we propose a novel Hierarchical Scale Recalibration Network (HSRNet), which addresses these issues by modeling rich contextual dependencies and recalibrating multiple scale-associated information. Specifically, a Scale Focus Module (SFM) first integrates global context into local features by modeling the semantic inter-dependencies along the channel and spatial dimensions sequentially. To reallocate channel-wise feature responses, a Scale Recalibration Module (SRM) adopts a step-by-step fusion to generate the final density maps. Furthermore, we propose a novel Scale Consistency loss that constrains the scale-associated outputs to be coherent with the ground truth at different scales. With the proposed modules, our approach can selectively ignore various kinds of noise and automatically focus on appropriate crowd scales. Extensive experiments on crowd counting datasets (ShanghaiTech, MALL, WorldEXPO'10, and UCSD) show that HSRNet delivers superior results over all state-of-the-art approaches. Moreover, we extend the experiments to an additional vehicle dataset, and the results indicate that the proposed model generalizes to other applications.
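
The abstract names a Scale Consistency loss but does not give its formula. The sketch below is one plausible reading, offered as an illustration only and not as the authors' definition: each scale-associated output is compared, by mean squared error, with the ground-truth density map reduced to the same resolution. The function names (`sum_pool`, `mse`, `scale_consistency_loss`, `total_count`), the use of sum pooling to build the lower-resolution ground truth, and the unweighted sum over scales are all assumptions made for this example.

```python
from typing import List

Grid = List[List[float]]  # a 2-D density map stored as rows of floats


def sum_pool(density: Grid, factor: int) -> Grid:
    """Downsample by summing each factor x factor block (assumed choice).

    Summing, rather than averaging, keeps the total count unchanged.
    """
    rows, cols = len(density), len(density[0])
    if rows % factor or cols % factor:
        raise ValueError("map size must be divisible by the pooling factor")
    return [
        [
            sum(density[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def mse(pred: Grid, target: Grid) -> float:
    """Mean squared error between two maps of identical shape."""
    if len(pred) != len(target) or len(pred[0]) != len(target[0]):
        raise ValueError("shape mismatch")
    n = len(pred) * len(pred[0])
    return sum((p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)) / n


def scale_consistency_loss(preds: List[Grid], ground_truth: Grid, factors: List[int]) -> float:
    """Hypothetical loss: sum of per-scale MSE terms.

    preds[i] is compared with the ground truth pooled by factors[i].
    """
    return sum(mse(p, sum_pool(ground_truth, f)) for p, f in zip(preds, factors))


def total_count(density: Grid) -> float:
    """The estimated count is the sum of all density values."""
    return sum(sum(row) for row in density)


if __name__ == "__main__":
    # Toy 4x4 ground-truth density map whose values sum to 3 objects.
    gt = [
        [0.0, 0.5, 0.0, 0.0],
        [0.5, 0.0, 0.0, 1.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]
    half_res_pred = [[1.0, 1.0], [0.0, 0.0]]  # misses one object at half resolution
    print(total_count(gt))
    print(scale_consistency_loss([gt, half_res_pred], gt, [1, 2]))
```

Sum pooling is used here because it preserves the integral of the density map, so the count implied by the ground truth is the same at every scale. An actual implementation would operate on network tensors and might weight the scales differently.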

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/148976