GigaDepth: Learning Depth from Structured Light with Branching Neural Networks
- Publisher: Springer Nature
- Publication Type: Conference Proceeding
- Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, 13693 LNCS, pp. 214-229
- Issue Date: 2022-01-01
- Access: Closed Access
Filename | Description | Size
---|---|---
GigaDepth Learning Depth from Structured Light with Branching Neural Networks.pdf | | 3.31 MB
This item is closed access and not available.
Structured light-based depth sensors provide accurate depth information independently of the scene appearance by extracting pattern positions from the captured pixel intensities. Spatial neighborhood encoding, in particular, is a popular structured light approach for off-the-shelf hardware. However, it suffers from the distortion and fragmentation of the projected pattern by the scene’s geometry in the vicinity of a pixel. This forces algorithms to find a delicate balance between depth prediction accuracy and robustness to pattern fragmentation or appearance change. While stereo matching provides more robustness at the expense of accuracy, we show that learning to regress a pixel’s position within the projected pattern is not only more accurate when combined with classification but can be made equally robust. We propose to split the regression problem into smaller classification sub-problems in a coarse-to-fine manner with the use of a weight-adaptive layer that efficiently implements branching per-pixel Multilayer Perceptrons applied to features extracted by a Convolutional Neural Network. As our approach requires full supervision, we train our algorithm on a rendered dataset sufficiently close to the real-world domain. On a separately captured real-world dataset, we show that our network outperforms the state of the art and is significantly more robust than other regression-based approaches.
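To make the coarse-to-fine idea concrete, the sketch below illustrates one way a weight-adaptive, branching per-pixel head could sit on top of CNN features: a 1x1 classifier assigns each pixel to a coarse branch, and the weights of a small MLP are gathered per pixel from that branch to regress the fine pattern position. This is a minimal illustration only, not the authors' implementation; the class name `BranchingPixelHead`, the layer sizes, the single coarse level, and the einsum-based weight gathering are all assumptions for exposition.

```python
import torch
import torch.nn as nn


class BranchingPixelHead(nn.Module):
    """Illustrative sketch (not the paper's code): a coarse per-pixel
    classifier picks one of `num_branches` small MLPs, and the selected
    MLP regresses the fine pattern position from the same pixel feature."""

    def __init__(self, feat_dim=64, num_branches=16, hidden=32):
        super().__init__()
        # 1x1 convolution = per-pixel classifier over coarse pattern cells.
        self.classifier = nn.Conv2d(feat_dim, num_branches, kernel_size=1)
        # One weight/bias set per branch for a two-layer per-pixel MLP.
        self.w1 = nn.Parameter(torch.randn(num_branches, hidden, feat_dim) * 0.05)
        self.b1 = nn.Parameter(torch.zeros(num_branches, hidden))
        self.w2 = nn.Parameter(torch.randn(num_branches, 1, hidden) * 0.05)
        self.b2 = nn.Parameter(torch.zeros(num_branches, 1))

    def forward(self, feats):
        # feats: (B, C, H, W) per-pixel features from a CNN backbone.
        B, C, H, W = feats.shape
        logits = self.classifier(feats)        # (B, K, H, W) coarse class scores
        branch = logits.argmax(dim=1)          # (B, H, W) selected branch per pixel

        x = feats.permute(0, 2, 3, 1).reshape(-1, C)   # (N, C), N = B*H*W
        idx = branch.reshape(-1)                        # (N,)

        # Weight-adaptive step: gather the chosen branch's MLP weights per pixel
        # and apply them with batched matrix-vector products.
        h = torch.relu(torch.einsum('nhc,nc->nh', self.w1[idx], x) + self.b1[idx])
        y = torch.einsum('noh,nh->no', self.w2[idx], h) + self.b2[idx]

        fine = y.view(B, H, W)   # fine position offset within the coarse cell
        return logits, branch, fine


# Hypothetical usage: classification logits would be supervised with the coarse
# cell labels and `fine` with the residual pattern position from the rendered dataset.
feats = torch.randn(2, 64, 120, 160)
logits, branch, fine = BranchingPixelHead()(feats)
```

In practice a hard argmax is not differentiable with respect to the branch choice, so training such a head would rely on supervising the classifier directly with coarse labels (or on a soft mixture over branches); the hard selection shown here is only the simplest way to express the per-pixel branching.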