Multimodal Compatibility Modeling via Exploring the Consistent and Complementary Correlations
- Publisher: ACM
- Publication Type: Conference Proceeding
- Citation: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp. 2299-2307
- Issue Date: 2021-10-17
Closed Access
Filename | Description | Size
---|---|---
3474085.3475392.pdf | Published version | 1.8 MB
This item is closed access and not available.
Existing methods for outfit compatibility modeling seldom explicitly consider multimodal correlations. In this work, we explore the consistent and complementary correlations for better compatibility modeling. This is, however, non-trivial due to the following challenges: 1) how to separate and model these two kinds of correlations; 2) how to leverage the derived complementary cues to strengthen the text- and vision-oriented representations of the given item; and 3) how to reinforce the compatibility modeling with text- and vision-oriented representations. To address these challenges, we present a comprehensive multimodal outfit compatibility modeling scheme. It first nonlinearly projects each modality into separable consistent and complementary spaces via multi-layer perceptrons, and then models the consistent and complementary correlations between the two modalities by parallel and orthogonal regularization. Thereafter, we strengthen the visual and textual representations of items with complementary information, and further conduct both text-oriented and vision-oriented outfit compatibility modeling. We ultimately employ a mutual learning strategy to reinforce the final performance of compatibility modeling. Extensive experiments demonstrate the superiority of our scheme.
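To make the projection and regularization idea concrete, below is a minimal sketch of how a modality embedding could be split into consistent and complementary subspaces with small MLPs, with a parallel term aligning the two modalities' consistent projections and an orthogonal term decorrelating each modality's consistent and complementary parts. All module names, dimensions, and loss forms here are illustrative assumptions; they are not the authors' released implementation.

```python
# Hedged sketch of consistent/complementary projection with parallel and
# orthogonal regularization. Names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConsistentComplementaryProjector(nn.Module):
    """Project one modality's embedding into a consistent and a complementary space via MLPs."""

    def __init__(self, in_dim: int, hid_dim: int, out_dim: int):
        super().__init__()
        self.consistent = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, out_dim)
        )
        self.complementary = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, out_dim)
        )

    def forward(self, x: torch.Tensor):
        return self.consistent(x), self.complementary(x)


def parallel_loss(v_cons: torch.Tensor, t_cons: torch.Tensor) -> torch.Tensor:
    # Pull the two modalities' consistent projections toward each other
    # (keep them "parallel") by maximizing cosine similarity.
    return (1.0 - F.cosine_similarity(v_cons, t_cons, dim=-1)).mean()


def orthogonal_loss(cons: torch.Tensor, comp: torch.Tensor) -> torch.Tensor:
    # Push a modality's complementary projection away from its consistent one
    # by driving their cosine similarity toward zero.
    return (F.cosine_similarity(cons, comp, dim=-1) ** 2).mean()


if __name__ == "__main__":
    vis_proj = ConsistentComplementaryProjector(2048, 512, 256)  # e.g. CNN image features
    txt_proj = ConsistentComplementaryProjector(768, 512, 256)   # e.g. text encoder features

    v = torch.randn(8, 2048)  # toy batch of visual embeddings
    t = torch.randn(8, 768)   # toy batch of textual embeddings

    v_cons, v_comp = vis_proj(v)
    t_cons, t_comp = txt_proj(t)

    reg = (
        parallel_loss(v_cons, t_cons)
        + orthogonal_loss(v_cons, v_comp)
        + orthogonal_loss(t_cons, t_comp)
    )
    print(float(reg))
```

In a full pipeline, a regularization term of this kind would be added to the outfit compatibility objective; the paper additionally fuses the complementary cues into the text- and vision-oriented representations and couples the two branches with mutual learning, which the sketch above does not cover.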