Learn to focus on objects for visual detection

Publication Type:
Journal Article
Citation:
Neurocomputing, 2019, vol. 348, pp. 27-39
Issue Date:
2019-07-05
File: 1-s2.0-S0925231218312785-main.pdf (Published Version, Adobe PDF, 4.19 MB)
© 2018 State-of-the-art visual detectors use object proposals as references to candidate objects in order to achieve higher efficiency. However, because proposals are generated somewhat indiscriminately, a large number of them is still needed to ensure full coverage of potential objects, which exposes proposal computation as a bottleneck. This paper presents a complementary technique designed to work with any existing proposal-generation system, amending the workflow from “propose-assess” to “propose-adjust-assess”. Inspired by biological visual processing, we propose to improve the quality of object proposals by analyzing visual context and gradually focusing proposals on their targets. In particular, the proposed method can be combined with existing proposal-generation algorithms based on both hand-crafted features and Convolutional Neural Network (CNN) features. For the former, we realize the focusing function with two learning-based transformation models trained to identify generic objects from image cues. For the latter, a Focus Proposal Net (FoPN) with cascaded layers, which can be injected directly into CNN models and trained end-to-end, is developed as the implementation of the focusing operation. Experiments on real-life image data sets demonstrate that the proposed technique improves proposal quality. Moreover, it reduces the number of proposals needed to achieve a high object recall rate with both hand-crafted and CNN features, and it boosts the performance of state-of-the-art detectors.
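To illustrate the “propose-adjust-assess” workflow described in the abstract, the following is a minimal sketch only: the function names, the (dx, dy, dw, dh) box-delta parameterization, and the cascade of focusing steps are assumptions for illustration, not the paper's actual transformation models or FoPN implementation.

```python
import numpy as np

def apply_deltas(boxes, deltas):
    """Shift and rescale proposal boxes (x1, y1, x2, y2) by predicted
    (dx, dy, dw, dh) deltas -- a common box-regression parameterization,
    assumed here purely for illustration."""
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    cx = boxes[:, 0] + 0.5 * w
    cy = boxes[:, 1] + 0.5 * h

    cx = cx + deltas[:, 0] * w        # translate the centre toward the object
    cy = cy + deltas[:, 1] * h
    w = w * np.exp(deltas[:, 2])      # rescale to tighten the box around the target
    h = h * np.exp(deltas[:, 3])

    return np.stack([cx - 0.5 * w, cy - 0.5 * h,
                     cx + 0.5 * w, cy + 0.5 * h], axis=1)

def propose_adjust_assess(image, proposer, focus_steps, assessor):
    """Hypothetical pipeline: generate proposals, apply cascaded focusing
    steps (standing in for the paper's transformation models or FoPN layers),
    then score the adjusted proposals with the detector."""
    boxes = proposer(image)                          # "propose"
    for focus in focus_steps:                        # "adjust": cascaded focusing
        boxes = apply_deltas(boxes, focus(image, boxes))
    return assessor(image, boxes)                    # "assess"
```

Here `proposer`, each element of `focus_steps`, and `assessor` are hypothetical callables standing in for an existing proposal generator, the learned focusing models, and the downstream detector, respectively; the point of the sketch is only that the adjustment stage slots between proposal generation and assessment without changing either.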