Multimodal Compatibility Modeling via Exploring the Consistent and Complementary Correlations
- Publisher: ACM
- Publication Type: Conference Proceeding
- Citation: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp. 2299-2307
- Issue Date: 2021-10-17
Closed Access
Filename | Description | Size
---|---|---
3474085.3475392.pdf | Published version | 1.8 MB
This item is closed access and not available.
Existing methods for outfit compatibility modeling seldom explicitly consider multimodal correlations. In this work, we explore the consistent and complementary correlations for better compatibility modeling. This is, however, non-trivial due to the following challenges: 1) how to separate and model these two kinds of correlations; 2) how to leverage the derived complementary cues to strengthen the text- and vision-oriented representations of the given item; and 3) how to reinforce the compatibility modeling with text- and vision-oriented representations. To address these challenges, we present a comprehensive multimodal outfit compatibility modeling scheme. It first nonlinearly projects each modality into separable consistent and complementary spaces via multi-layer perceptrons, and then models the consistent and complementary correlations between the two modalities by parallel and orthogonal regularization. Thereafter, we strengthen the visual and textual representations of items with complementary information, and further conduct both text-oriented and vision-oriented outfit compatibility modeling. We ultimately employ a mutual learning strategy to reinforce the final performance of compatibility modeling. Extensive experiments demonstrate the superiority of our scheme.
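To make the projection and regularization idea concrete, below is a minimal sketch of how a modality embedding could be split into consistent and complementary subspaces with small MLPs, with a parallel term aligning the two modalities' consistent projections and an orthogonal term decorrelating each modality's consistent and complementary parts. All module names, dimensions, and loss forms here are illustrative assumptions; they are not the authors' released implementation.

```python
# Hedged sketch of consistent/complementary projection with parallel and
# orthogonal regularization. Names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConsistentComplementaryProjector(nn.Module):
    """Project one modality's embedding into a consistent and a complementary space via MLPs."""

    def __init__(self, in_dim: int, hid_dim: int, out_dim: int):
        super().__init__()
        self.consistent = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, out_dim)
        )
        self.complementary = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, out_dim)
        )

    def forward(self, x: torch.Tensor):
        return self.consistent(x), self.complementary(x)


def parallel_loss(v_cons: torch.Tensor, t_cons: torch.Tensor) -> torch.Tensor:
    # Pull the two modalities' consistent projections toward each other
    # (keep them "parallel") by maximizing cosine similarity.
    return (1.0 - F.cosine_similarity(v_cons, t_cons, dim=-1)).mean()


def orthogonal_loss(cons: torch.Tensor, comp: torch.Tensor) -> torch.Tensor:
    # Push a modality's complementary projection away from its consistent one
    # by driving their cosine similarity toward zero.
    return (F.cosine_similarity(cons, comp, dim=-1) ** 2).mean()


if __name__ == "__main__":
    vis_proj = ConsistentComplementaryProjector(2048, 512, 256)  # e.g. CNN image features
    txt_proj = ConsistentComplementaryProjector(768, 512, 256)   # e.g. text encoder features

    v = torch.randn(8, 2048)  # toy batch of visual embeddings
    t = torch.randn(8, 768)   # toy batch of textual embeddings

    v_cons, v_comp = vis_proj(v)
    t_cons, t_comp = txt_proj(t)

    reg = (
        parallel_loss(v_cons, t_cons)
        + orthogonal_loss(v_cons, v_comp)
        + orthogonal_loss(t_cons, t_comp)
    )
    print(float(reg))
```

In a full pipeline, a regularization term of this kind would be added to the outfit compatibility objective; the paper additionally fuses the complementary cues into the text- and vision-oriented representations and couples the two branches with mutual learning, which the sketch above does not cover.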