Effective few-shot learning approaches for image semantic segmentation
- Publication Type: Thesis
- Issue Date: 2024
Open Access
This item is open access.
Semantic image segmentation has gained significant attention in computer vision due to its
wide range of applications, including visual understanding, medical image analysis, self-driving
vehicles, augmented reality, and video surveillance. While modern deep learning models have
achieved impressive performance on segmentation tasks, they rely heavily on a massive amount
of densely labeled training data. However, abundant high-quality labeled data are not always
available in real-world scenarios due to privacy or ethical concerns and safety issues. This
research aims to reduce the reliance of segmentation tasks on large data volumes by introducing few-shot learning (FSL) techniques. This empowers deep learning models to accurately segment
unseen classes from only a few labeled images, thereby relieving researchers and engineers of
intensive data labeling work.
This research first addresses the problem of few-shot semantic segmentation (FSS), which
requires segmenting objects of a novel class in a test image given only a few labeled examples.
To address the challenges of prototype bias and sub-optimal feature representation, this research proposes the
Masked Cross-image Encoding technique. This method captures shared information and mutual
dependencies between training data and testing data, enhancing the visual properties of novel
classes for improved prototype-feature matching. Then, we re-evaluate the standard binary
matching paradigm employed in FSS and identify its association with potential false-matching
and under-matching issues, which can significantly degrade segmentation performance. To
alleviate these issues, a Multi-Prototype Discrimination scheme is introduced to explicitly assign
each pixel-wise query feature to a specific class, reducing the class-matching ambiguity present
in conventional FSS methods. Building upon the FSS task, we tackle a more practical and
challenging task known as Incremental Few-Shot Semantic Segmentation (iFSS). It requires a
deep learning model to continuously learn new classes with scarce annotated examples, while
retaining the knowledge learned from previously encountered classes. We consider a meta-learning-based approach that simulates the incremental learning evaluation protocol during the
base training stage. This training-task alignment strategy encourages the model to learn how
to incrementally adapt to novel classes without forgetting previous ones.
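As a rough illustration of the prototype-feature matching mentioned above, the sketch below builds one prototype per class by masked average pooling over support features and then assigns each query pixel to its most similar prototype by cosine similarity. These are common building blocks in the FSS literature and are used here only as an assumption; the sketch is not the Masked Cross-image Encoding or Multi-Prototype Discrimination method of the thesis. All function names and the toy feature values are hypothetical.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors at pixels where mask == 1.

    features: H x W nested list of C-dimensional vectors (hypothetical layout).
    mask:     H x W nested list of 0/1 labels for one class.
    """
    total, count = None, 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                total = list(vec) if total is None else [t + v for t, v in zip(total, vec)]
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_pixels(query_features, prototypes):
    """Label each query pixel with the name of its most similar prototype."""
    names = list(prototypes)
    return [
        [max(names, key=lambda n: cosine(vec, prototypes[n])) for vec in row]
        for row in query_features
    ]


# Toy 2 x 2 support feature map with 2-dimensional features (made-up values).
support_features = [[[1.0, 0.0], [0.9, 0.1]],
                    [[0.0, 1.0], [0.1, 0.9]]]
novel_mask = [[1, 1], [0, 0]]
background_mask = [[0, 0], [1, 1]]

prototypes = {
    "novel": masked_average_prototype(support_features, novel_mask),
    "background": masked_average_prototype(support_features, background_mask),
}

# Toy 1 x 2 query feature map; each pixel gets exactly one class label.
query_features = [[[0.8, 0.2], [0.2, 0.8]]]
labels = assign_pixels(query_features, prototypes)
```

Assigning every pixel to exactly one of several class prototypes, rather than thresholding a single foreground score, is one simple way to picture how per-class assignment can reduce matching ambiguity. In practice the features would come from a deep network rather than hand-written lists.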
Overall, this research contributes valuable insights and methodologies that enhance the effectiveness of few-shot learning approaches for semantic image segmentation.