[Brief Paper Summary] Dense Retrieval Adaptation using Target Domain Description

oneonlee 2024. 10. 25. 20:17

Dense Retrieval Adaptation using Target Domain Description

Cited by 3 (as of 2024-10-22)
Publication Info: ACM ICTIR 2023
URL: https://arxiv.org/abs/2307.02740

Summary

Follow-up work to (NAACL 2022) GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
- Related post: [Brief Paper Review] GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
Proposes a method that performs Unsupervised Domain Adaptation using only a description of the target domain

Problem

Related Work
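- Most existing methods for domain adaptation of IR models assume that target domain data is accessible
In practice, however, the actual target domain data may not be accessible
- e.g., medical records or legally restricted data cannot be shared
Similar to a zero-shot setting, this paper improves the performance of a Dense Retrieval model using only a description of the target domain, without using any target data
- description: a high-level text description that outlines the task and the characteristics of the data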
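
To make the setting concrete, here is a minimal sketch of what the adaptation procedure is allowed to see. The `AdaptationInput` class, its field names, and the example description text are all hypothetical illustrations, not taken from the paper; the only point is that the description is present while target documents and queries are absent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdaptationInput:
    """Hypothetical container for what is available in the description-only setting."""

    # High-level text describing the target task and data characteristics.
    description: str
    # Assumed to stay empty: no access to the real target collection.
    target_documents: List[str] = field(default_factory=list)
    # Assumed to stay empty: no access to real target queries.
    target_queries: List[str] = field(default_factory=list)

    def is_description_only(self) -> bool:
        """True when a non-empty description is the only target-side signal."""
        has_description = bool(self.description.strip())
        has_target_data = bool(self.target_documents or self.target_queries)
        return has_description and not has_target_data


# Illustrative description text (made up for this sketch).
example = AdaptationInput(
    description="Retrieve scientific articles that answer questions about COVID-19."
)
print(example.is_description_only())
```

A conventional unsupervised domain adaptation setup would instead populate `target_documents`, which is exactly the access this paper assumes is unavailable.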

Methods

Experiments - Datasets

Target Retrieval Task 1: Bio-Medical IR
- TREC Covid Track in 2020 (TREC-COVID)
Target Retrieval Task 2: Financial Question Answering
- FiQA-2018 Task 2 (FiQA)
Target Retrieval Task 3: Argument Retrieval
- ArguAna
Target Retrieval Task 4: Duplicate Question Retrieval
- Quora
  - The aim of duplicate question retrieval is to detect repeated questions asked on community question-answering (CQA) forums
Target Retrieval Task 5: Fact Checking
- SciFact
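
The five task/dataset pairs above can be kept in a small lookup table, for example when looping over evaluation targets. This is only a sketch: the lowercase identifiers are an assumption (they follow the naming commonly used for these datasets in the BEIR benchmark) and are not stated in this post.

```python
# Target retrieval task -> dataset identifier.
# The identifiers are assumed BEIR-style names, not taken from the post.
TARGET_TASKS = {
    "Bio-Medical IR": "trec-covid",
    "Financial Question Answering": "fiqa",
    "Argument Retrieval": "arguana",
    "Duplicate Question Retrieval": "quora",
    "Fact Checking": "scifact",
}


def dataset_for(task: str) -> str:
    """Return the dataset identifier for a target task name."""
    return TARGET_TASKS[task]


for task, dataset in TARGET_TASKS.items():
    print(f"{task}: {dataset}")
```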
