Distributed open-domain answer sentence selection by federated learning
- Publication Type: Thesis
- Issue Date: 2023
Open Access
This item is open access.
Natural Language Processing (NLP) has achieved remarkable success, largely attributable to large pre-trained language models. Open-Domain Question Answering (OD-QA), a task of significant importance in industry, has likewise advanced substantially through the application of these large-scale pre-trained models. A specialized subset of Open-Domain Question Answering, Open-Domain Answer Sentence Selection (OD-AS2), seeks to answer a query with a sentence drawn from a document collection. A compelling application of this technology is deploying OD-AS2 models on edge devices such as computers and smartphones, creating a personalized, intelligent question-answering assistant built from a user's personal documents. Recently, Dense Retrieval has garnered interest from both the academic and industrial communities as a novel approach to OD-QA/OD-AS2. Dense Retrieval models play an indispensable role by striking a balance between efficiency and performance across various solution paradigms. However, their effectiveness largely depends on the availability of ample labeled positive QA pairs and a diverse range of hard negative samples during training. Fulfilling these requirements is challenging in a privacy-preserving distributed scenario, where each client possesses fewer in-domain pairs and a relatively small collection, unsuitable for effective Dense Retrieval training. To address this issue, we introduce a new deep-learning framework for Privacy-preserving Distributed OD-AS2, dubbed PDD-AS2. Drawing on the principles of Federated Learning, this framework incorporates a client-customized query encoding method for personalization and a cross-client negative sampling method, called Fed-Negative, to enhance learning effectiveness. To assess our learning framework, we first construct a novel OD-AS2 dataset, termed FedNewsQA, built on NewsQA to simulate distributed clients with data from varying genres/domains. Experimental results indicate that our learning framework outperforms baseline models and demonstrates impressive personalization capabilities.
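The abstract names cross-client negative sampling (Fed-Negative) as a key ingredient for training a dense retriever when each client holds too few in-domain pairs. The sketch below is purely illustrative and is not the thesis implementation: it assumes a standard InfoNCE-style contrastive loss commonly used for dense retrieval, and the function and tensor names (contrastive_step, cross_client_neg_emb, etc.) are hypothetical placeholders for encoder outputs shared across federated clients.

```python
# Illustrative sketch (not the thesis code): one contrastive training step for a
# dense retriever where in-batch negatives are augmented with passage/sentence
# embeddings contributed by other federated clients ("cross-client negatives").
import torch
import torch.nn.functional as F

def contrastive_step(query_emb, pos_emb, cross_client_neg_emb, temperature=0.05):
    """
    query_emb:            (B, d) query embeddings from the local client's encoder
    pos_emb:              (B, d) embeddings of the matching positive answer sentences
    cross_client_neg_emb: (N, d) negative embeddings gathered from other clients
    Returns an InfoNCE-style loss; the gold answer for each query sits on the
    diagonal of the query-vs-positive similarity block.
    """
    local_scores = query_emb @ pos_emb.T              # (B, B): in-batch negatives
    cross_scores = query_emb @ cross_client_neg_emb.T # (B, N): cross-client negatives
    logits = torch.cat([local_scores, cross_scores], dim=1) / temperature
    labels = torch.arange(query_emb.size(0))          # gold index = diagonal position
    return F.cross_entropy(logits, labels)

# Toy usage with random vectors standing in for encoder outputs.
if __name__ == "__main__":
    B, N, d = 4, 16, 8
    q = F.normalize(torch.randn(B, d), dim=-1)
    p = F.normalize(torch.randn(B, d), dim=-1)
    neg = F.normalize(torch.randn(N, d), dim=-1)
    print(contrastive_step(q, p, neg).item())
```

In this hypothetical setup, sharing only embeddings (rather than raw documents) is what would keep the negative exchange compatible with a privacy-preserving federated scenario; the actual Fed-Negative protocol may differ.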