Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation

Publisher:
IEEE (Institute of Electrical and Electronics Engineers)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34, (1), pp. 195-206
Issue Date:
2024-01-01
File:
Label_Semantic_Knowledge_Distillation_for_Unbiased_Scene_Graph_Generation.pdf (published version, Adobe PDF, 2.69 MB)
The Scene Graph Generation (SGG) task aims to detect all the objects and their pairwise visual relationships in a given image. Although SGG has achieved remarkable progress over the last few years, almost all existing SGG models follow the same training paradigm: they treat both object and predicate classification in SGG as a single-label classification problem, with one-hot target labels as ground truth. However, this prevalent training paradigm overlooks two characteristics of current SGG datasets: 1) for positive samples, a specific subject-object instance may have multiple reasonable predicates; 2) for negative samples, there are numerous missing annotations. When these two characteristics are ignored, SGG models are easily confused and make wrong predictions. To this end, we propose a novel model-agnostic Label Semantic Knowledge Distillation (LS-KD) for unbiased SGG. Specifically, LS-KD dynamically generates a 'soft' label for each subject-object instance by fusing a predicted Label Semantic Distribution (LSD) with its original one-hot target label. The LSD reflects the correlations between the instance and multiple predicate categories. Meanwhile, we propose two different strategies to predict the LSD: iterative self-KD and synchronous self-KD. Extensive ablations and results on three SGG tasks attest to the superiority and generality of the proposed LS-KD, which consistently achieves a good trade-off in performance across predicate categories.
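The core soft-labeling step described above can be sketched in a few lines. The fusion rule below (a convex combination with weight `alpha`, followed by renormalization) and the toy class values are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def fuse_soft_label(one_hot, lsd, alpha=0.5):
    """Blend a predicted Label Semantic Distribution (LSD) with the
    one-hot ground-truth label to form a 'soft' target.
    `alpha` is a hypothetical fusion weight, not taken from the paper."""
    soft = alpha * lsd + (1.0 - alpha) * one_hot
    return soft / soft.sum()  # renormalize to a valid distribution

# Toy example with 4 predicate classes: the ground truth is class 1
# (say, "on"), but the LSD also puts mass on a semantically related
# class 2 (say, "standing on"), so the fused target keeps that signal.
one_hot = np.array([0.0, 1.0, 0.0, 0.0])
lsd = np.array([0.05, 0.6, 0.3, 0.05])
soft = fuse_soft_label(one_hot, lsd, alpha=0.5)
```

Training against `soft` instead of `one_hot` (e.g. with a soft-target cross-entropy) lets the model retain probability mass on plausible alternative predicates rather than being penalized for them.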