<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>One Only</title>
    <link>https://oneonlee.tistory.com/</link>
    <description>Exploring fresh natural language.</description>
    <language>ko</language>
    <pubDate>Sun, 31 May 2026 14:39:09 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>oneonlee</managingEditor>
    <image>
      <title>One Only</title>
      <url>https://tistory1.daumcdn.net/tistory/4431349/attach/46fa0b04d680454d90218dd7c6abf367</url>
      <link>https://oneonlee.tistory.com</link>
    </image>
    <item>
      <title>[Paper Review] Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models</title>
      <link>https://oneonlee.tistory.com/176</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;arXiv: &lt;a href=&quot;https://arxiv.org/abs/2410.07176&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://arxiv.org/abs/2410.07176&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;OpenReview: &lt;a href=&quot;https://openreview.net/forum?id=xy6B5Fh2v7&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://openreview.net/forum?id=xy6B5Fh2v7&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Code: x&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Keywords:&amp;nbsp;Retrieval&amp;nbsp;Augmented&amp;nbsp;Generation,&amp;nbsp;Knowledge&amp;nbsp;Conflicts&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Motivation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
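&lt;li&gt;Because RAG depends on retrieval results, imperfect retrieval (irrelevant or misleading passages) can lead to inaccurate LLM responses&lt;/li&gt;
&lt;li&gt;When the retrieved results differ from what the LLM already knows, a knowledge conflict can arise, but most prior work does not take this into account&lt;/li&gt;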
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;problem_3.png&quot; data-origin-width=&quot;1016&quot; data-origin-height=&quot;334&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/XriIi/btsKOWkyj7C/7THzvfkCK85mkftddeZsC0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/XriIi/btsKOWkyj7C/7THzvfkCK85mkftddeZsC0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/XriIi/btsKOWkyj7C/7THzvfkCK85mkftddeZsC0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FXriIi%2FbtsKOWkyj7C%2F7THzvfkCK85mkftddeZsC0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1016&quot; height=&quot;334&quot; data-filename=&quot;problem_3.png&quot; data-origin-width=&quot;1016&quot; data-origin-height=&quot;334&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Preliminary&amp;nbsp;Experiment:&amp;nbsp;Imperfect&amp;nbsp;retrieval&amp;nbsp;is&amp;nbsp;common&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;retrieval.png&quot; data-origin-width=&quot;1810&quot; data-origin-height=&quot;465&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dz012u/btsKPHtpBcL/M21wx2QEkGGxjj06I14eMk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dz012u/btsKPHtpBcL/M21wx2QEkGGxjj06I14eMk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dz012u/btsKPHtpBcL/M21wx2QEkGGxjj06I14eMk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fdz012u%2FbtsKPHtpBcL%2FM21wx2QEkGGxjj06I14eMk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1810&quot; height=&quot;465&quot; data-filename=&quot;retrieval.png&quot; data-origin-width=&quot;1810&quot; data-origin-height=&quot;465&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\text{Retrieval Precision}=\frac{\#\ \text{retrieved passages containing the correct answer}}{\#\ \text{total retrieved passages}}$$&lt;/p&gt;
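&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of the formula above, here is a minimal Python sketch. It assumes that "containing the correct answer" is checked by case-insensitive substring matching; the function name &lt;code&gt;retrieval_precision&lt;/code&gt; and the sample passages are hypothetical and are not taken from the paper.&lt;/p&gt;

```python
def retrieval_precision(passages, answers):
    """Fraction of retrieved passages that contain at least one correct answer string.

    Assumption: a passage "contains" an answer if the answer appears in it
    as a case-insensitive substring.
    """
    if not passages:
        return 0.0
    lowered_answers = [a.lower() for a in answers]
    hits = sum(
        1 for passage in passages
        if any(answer in passage.lower() for answer in lowered_answers)
    )
    return hits / len(passages)


# Hypothetical example: 2 of the 4 retrieved passages mention the answer.
passages = [
    "Paris is the capital of France.",
    "Lyon is a city in France.",
    "The capital is Paris.",
    "France borders Spain.",
]
precision = retrieval_precision(passages, ["Paris"])
print(precision)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A precision near 0 for a question means that almost none of its retrieved passages are helpful, which is the regime the paper calls imperfect retrieval.&lt;/p&gt;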
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Problems&amp;nbsp;&amp;amp;&amp;nbsp;Previous&amp;nbsp;Works&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Problems&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Imperfect retrieval and knowledge conflicts are widespread, and both lead to RAG failures&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;According to prior work, when a knowledge conflict occurs, LLMs tend to answer based on incorrect information rather than comprehensively reconciling internal and external knowledge [1-3]&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Previous&amp;nbsp;Works&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&amp;nbsp;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Unlike prior work that focuses on the retrieval results [1,4], this paper focuses on strengthening the robustness of RAG by leveraging the LLM's internal knowledge at the post-retrieval stage, after the retrieved passages have been provided&lt;/li&gt;
&lt;li&gt;It also proposes a method that &lt;b&gt;directly resolves knowledge conflicts without training&lt;/b&gt; in a &lt;b&gt;black-box setting&lt;/b&gt;, combining the useful information from both sides to obtain more reliable answers&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li style=&quot;text-align: left;&quot;&gt;So, for trustworthy RAG, is there &lt;b&gt;a way to resolve conflicts between the LLM's internal and external knowledge&lt;/b&gt;?&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Methods&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Overview&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;method_2.png&quot; data-origin-width=&quot;1027&quot; data-origin-height=&quot;458&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/mWsfI/btsKNuiuYne/KWMdk5ryEdON9NmD8ZIQ61/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/mWsfI/btsKNuiuYne/KWMdk5ryEdON9NmD8ZIQ61/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/mWsfI/btsKNuiuYne/KWMdk5ryEdON9NmD8ZIQ61/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FmWsfI%2FbtsKNuiuYne%2FKWMdk5ryEdON9NmD8ZIQ61%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1027&quot; height=&quot;458&quot; data-filename=&quot;method_2.png&quot; data-origin-width=&quot;1027&quot; data-origin-height=&quot;458&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Step 1/3: Passages&amp;nbsp;Generation&amp;nbsp;of&amp;nbsp;Internal&amp;nbsp;Knowledge &lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Explicitly elicit the LLM's internal knowledge
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prompt the LLM to generate multiple passages based on the question $q$&lt;/li&gt;
&lt;li&gt;The purpose is to allow cross-checking between the LLM's internal knowledge and the external knowledge&lt;/li&gt;
&lt;/ul&gt;
&lt;img src=&quot;https://blog.kakaocdn.net/dn/b7s6z6/btsKNtxaaIC/WGHbcAA5D1wkYePPip1ep0/img.png&quot; width=&quot;694&quot; height=&quot;159&quot; data-filename=&quot;그림1.png&quot; data-origin-height=&quot;263&quot; data-origin-width=&quot;1148&quot; data-is-animation=&quot;false&quot; /&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;Step 2/3: Iterative Source-aware Knowledge Consolidation&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prompt the LLM to compare the internal and external knowledge all at once and explicitly consolidate the context
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Consistent information &amp;rarr; cluster &amp;amp; summarize&lt;/li&gt;
&lt;li&gt;Conflicting information &amp;rarr; separate&lt;/li&gt;
&lt;li&gt;Irrelevant information &amp;rarr; exclude&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;When the LLM consolidates knowledge, the source of each piece of knowledge is provided with it&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Memory (internal) or External&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Repeat the above process $t$ times to refine the contexts into more useful ones&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2233&quot; data-origin-height=&quot;803&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/3aX4i/btsKONgTWxg/VvlbFocQ2stuTcjFEQrlKk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/3aX4i/btsKONgTWxg/VvlbFocQ2stuTcjFEQrlKk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/3aX4i/btsKONgTWxg/VvlbFocQ2stuTcjFEQrlKk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F3aX4i%2FbtsKONgTWxg%2FVvlbFocQ2stuTcjFEQrlKk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2233&quot; height=&quot;803&quot; data-origin-width=&quot;2233&quot; data-origin-height=&quot;803&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;Step 3/3: Answer Finalization&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prompt the LLM to generate one answer from each group and then select a single final answer based on reliability
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Reliability is judged by factors such as the knowledge source, agreement across sources, and the level of detail of the information&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2215&quot; data-origin-height=&quot;988&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/18fAx/btsKN3kqgZF/G2uYhBXD0XiKp7rNTrlDCK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/18fAx/btsKN3kqgZF/G2uYhBXD0XiKp7rNTrlDCK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/18fAx/btsKN3kqgZF/G2uYhBXD0XiKp7rNTrlDCK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F18fAx%2FbtsKN3kqgZF%2FG2uYhBXD0XiKp7rNTrlDCK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2215&quot; height=&quot;988&quot; data-origin-width=&quot;2215&quot; data-origin-height=&quot;988&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
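&lt;p data-ke-size=&quot;size16&quot;&gt;The three steps above can be sketched as a small Python function. This is only an illustrative outline under stated assumptions, not the authors' implementation: &lt;code&gt;llm&lt;/code&gt; is a hypothetical callable that maps a prompt string to a completion string, the prompt wording is invented for the example, and the source tags &lt;code&gt;external&lt;/code&gt; and &lt;code&gt;memory&lt;/code&gt; are placeholder labels.&lt;/p&gt;

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def render(tagged):
    """Format (source, passage) pairs so that every passage carries its source tag."""
    return "\n".join(f"[{source}] {passage}" for source, passage in tagged)


def astute_rag(llm: LLM, question: str, retrieved: List[str], t: int = 1, max_passages: int = 1):
    # Step 1: elicit the LLM's internal knowledge as explicit passages.
    raw = llm(
        f"Question: {question}\n"
        f"Write up to {max_passages} short passages from your own knowledge, "
        "separated by blank lines."
    )
    internal = [p.strip() for p in raw.split("\n\n") if p.strip()][:max_passages]

    # Tag every passage with its source so that consolidation is source-aware.
    context = render(
        [("external", p) for p in retrieved] + [("memory", p) for p in internal]
    )

    # Step 2: consolidate t times (cluster consistent info, separate conflicts, drop noise).
    for _ in range(t):
        context = llm(
            f"Question: {question}\nContext:\n{context}\n"
            "Group consistent information and summarize it, keep conflicting "
            "information in separate groups, and exclude irrelevant information. "
            "Keep the source tag of every statement."
        )

    # Step 3: propose one answer per group, then pick the most reliable one.
    return llm(
        f"Question: {question}\nConsolidated context:\n{context}\n"
        "Propose one answer per group, then return the single most reliable answer."
    )
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this sketch one query costs 1 + $t$ + 1 LLM calls, which is worth keeping in mind when reading the API-call comparison in the results.&lt;/p&gt;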
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;5. Experimental Settings&lt;/h2&gt;
&lt;h3 style=&quot;color: #000000;&quot; data-ke-size=&quot;size23&quot;&gt;Dataset&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;NQ, TriviaQA, BioASQ, PopQA &amp;rarr; short-form QA datasets&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Passage&amp;nbsp;Collection &lt;/h3&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;For each question, retrieve the top 30 results through the Google Search API&lt;/li&gt;
&lt;li&gt;Select the top 10 accessible websites&lt;/li&gt;
&lt;li&gt;From each website, extract the paragraph corresponding to the search-result snippet and use it as a passage&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Metric&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Accuracy; a model response is counted as correct if it contains the ground-truth answer&lt;/p&gt;
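&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal sketch of this containment-based accuracy is shown below. It assumes case-insensitive substring matching and that each question may have several acceptable answer strings; the function names and the sample data are hypothetical.&lt;/p&gt;

```python
def contains_answer(response, gold_answers):
    """True if the response contains any acceptable answer (case-insensitive)."""
    text = response.lower()
    return any(gold.lower() in text for gold in gold_answers)


def accuracy(responses, gold):
    """Share of responses that contain at least one of their gold answers."""
    if not responses:
        return 0.0
    correct = sum(contains_answer(r, g) for r, g in zip(responses, gold))
    return correct / len(responses)


# Hypothetical example: the first response contains its gold answer, the second does not.
responses = ["The answer is Paris.", "I think it is Rome."]
gold = [["Paris"], ["Madrid"]]
print(accuracy(responses, gold))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A metric of this kind rewards any response that mentions the answer, so it is more lenient than exact match.&lt;/p&gt;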
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;LLM Parameters&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;LLM: gemini-1.5-pro-002, claude-3-5-sonnet@20240620&lt;/li&gt;
&lt;li&gt;temperature: 0&lt;/li&gt;
&lt;li&gt;max_token: 1024&lt;/li&gt;
&lt;li&gt;#&amp;nbsp;shot:&amp;nbsp;zero-shot&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;Baselines&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;USC (universal self-consistency) [5]
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Samples multiple LLM responses and has the LLM select the most consistent one (a simple improvement that only requires additional API calls)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;GenRead [6]
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Answers using documents generated from the LLM's internal knowledge (no external knowledge is used)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;RobustRAG [7]
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Generates an answer from each document independently and aggregates the final answer via keywords&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;InstructRAG [8]
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A RAG method that generates a rationale when producing the answer&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Self-Route [9]
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Adaptively chooses between RAG and the LLM alone when answering (switching between internal and external knowledge)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000;&quot; data-ke-size=&quot;size26&quot;&gt;6. Main Results&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;그림2.png&quot; data-origin-width=&quot;2048&quot; data-origin-height=&quot;972&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/tmh9B/btsKNdujVRn/2MGLkrOxgek7dfPPtnJyX0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/tmh9B/btsKNdujVRn/2MGLkrOxgek7dfPPtnJyX0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/tmh9B/btsKNdujVRn/2MGLkrOxgek7dfPPtnJyX0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Ftmh9B%2FbtsKNdujVRn%2F2MGLkrOxgek7dfPPtnJyX0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2048&quot; height=&quot;972&quot; data-filename=&quot;그림2.png&quot; data-origin-width=&quot;2048&quot; data-origin-height=&quot;972&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;No RAG vs. RAG: on datasets such as NQ and TriviaQA, not using RAG sometimes yields better performance
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;This appears to be caused by knowledge conflicts between the retrieved results and the LLM&lt;/li&gt;
&lt;li&gt;In contrast, on BioASQ and PopQA, which are domain-specific and long-tail QA datasets, RAG improves the LLM's performance&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;None of the baselines performs consistently well
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;This suggests that the baselines are fitted to specific settings and are hard to apply universally&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;In contrast, Astute RAG consistently outperforms the baselines on all datasets&lt;/li&gt;
&lt;li&gt;As $t$, the number of knowledge consolidation iterations, increases, the performance gain shrinks, because less information remains to be consolidated with each iteration&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;그림3.png&quot; data-origin-width=&quot;2026&quot; data-origin-height=&quot;956&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bg4ozA/btsKN2Z6BNo/q7E1mdAghOimObzVk1Ghkk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bg4ozA/btsKN2Z6BNo/q7E1mdAghOimObzVk1Ghkk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bg4ozA/btsKN2Z6BNo/q7E1mdAghOimObzVk1Ghkk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbg4ozA%2FbtsKN2Z6BNo%2Fq7E1mdAghOimObzVk1Ghkk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2026&quot; height=&quot;956&quot; data-filename=&quot;그림3.png&quot; data-origin-width=&quot;2026&quot; data-origin-height=&quot;956&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;With Gemini, increasing $t$ improves performance on BioASQ and PopQA
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;These two datasets rely more heavily on external knowledge, and repeating the knowledge consolidation process helps mitigate noise in the external information&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Once $t$ reaches 3, performance on NQ and TriviaQA no longer improves
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;External knowledge plays a less important role on these two datasets&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000;&quot; data-ke-size=&quot;size26&quot;&gt;7. Analyses&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Performance&amp;nbsp;by&amp;nbsp;Retrieval&amp;nbsp;Precision &lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;그림4.png&quot; data-origin-width=&quot;525&quot; data-origin-height=&quot;525&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bZEQlJ/btsKNXLnqlB/EZveXP9v9jKGp3mKRuUJW1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bZEQlJ/btsKNXLnqlB/EZveXP9v9jKGp3mKRuUJW1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bZEQlJ/btsKNXLnqlB/EZveXP9v9jKGp3mKRuUJW1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbZEQlJ%2FbtsKNXLnqlB%2FEZveXP9v9jKGp3mKRuUJW1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;525&quot; height=&quot;525&quot; data-filename=&quot;그림4.png&quot; data-origin-width=&quot;525&quot; data-origin-height=&quot;525&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;When retrieval quality is very low (Retrieval Precision close to 0), the other baselines perform worse than No RAG, whereas Astute RAG is the only method that exceeds the No RAG level&lt;/p&gt;
&lt;h3 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size23&quot;&gt;Performance&amp;nbsp;by&amp;nbsp;Knowledge&amp;nbsp;Conflicts &lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;그림4.png&quot; data-origin-width=&quot;464&quot; data-origin-height=&quot;466&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/NlqKb/btsKPx5GCkQ/mWdqIGsYyhbQJiMWL3LyEK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/NlqKb/btsKPx5GCkQ/mWdqIGsYyhbQJiMWL3LyEK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/NlqKb/btsKPx5GCkQ/mWdqIGsYyhbQJiMWL3LyEK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FNlqKb%2FbtsKPx5GCkQ%2FmWdqIGsYyhbQJiMWL3LyEK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;464&quot; height=&quot;466&quot; data-filename=&quot;그림4.png&quot; data-origin-width=&quot;464&quot; data-origin-height=&quot;466&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000;&quot; data-ke-size=&quot;size26&quot;&gt;8. Conclusions&amp;nbsp;&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Proposes Astute RAG, which mitigates knowledge conflicts without any additional training
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Iteratively refines the generated responses using the LLM's internal knowledge&lt;/li&gt;
&lt;li&gt;Consolidates internal and external knowledge in a source-aware manner to finalize the answer&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Limitations
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Depends on the LLM's instruction-following and reasoning abilities&lt;/li&gt;
&lt;li&gt;The LLM's inherent biases and hallucinations can surface during knowledge consolidation&lt;/li&gt;
&lt;li&gt;Comparing the number of API calls in the Main Results does not seem meaningful&lt;/li&gt;
&lt;li&gt;The number of tokens used in the API calls should be compared instead&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;h2 style=&quot;color: #000000;&quot; data-ke-size=&quot;size26&quot;&gt;References&lt;/h2&gt;
&lt;p style=&quot;text-align: left;&quot; data-ke-size=&quot;size16&quot;&gt;[1] (LREC-COLING 2024) Tug-of-War between Knowledge: Exploring and Resolving Knowledge Conflicts in Retrieval-Augmented Language Models&lt;/p&gt;
&lt;p style=&quot;text-align: left;&quot; data-ke-size=&quot;size16&quot;&gt;[2] (ACL 2024) Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts for Open-Domain QA?&lt;/p&gt;
&lt;p style=&quot;text-align: left;&quot; data-ke-size=&quot;size16&quot;&gt;[3] (ICLR 2024) Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts&lt;/p&gt;
&lt;p style=&quot;text-align: left;&quot; data-ke-size=&quot;size16&quot;&gt;[4] (ACL 2023) When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories&lt;/p&gt;
&lt;p style=&quot;text-align: left;&quot; data-ke-size=&quot;size16&quot;&gt;[5] (ICML Workshop 2024) Universal Self-Consistency for Large Language Models &lt;br /&gt;[6] (ICLR 2023) Generate rather than Retrieve: Large Language Models are Strong Context Generators &lt;br /&gt;[7] (arXiv 2024) Certifiably Robust RAG against Retrieval Corruption &lt;br /&gt;[8] (arXiv 2024) InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales &lt;br /&gt;[9] (EMNLP Industry 2024) Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach&lt;/p&gt;</description>
      <category>Paper Review</category>
      <category>knowledge</category>
      <category>knowledge conflicts</category>
      <category>LLM</category>
      <category>Rag</category>
      <category>오블완</category>
      <category>티스토리챌린지</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/176</guid>
      <comments>https://oneonlee.tistory.com/176#entry176comment</comments>
      <pubDate>Tue, 19 Nov 2024 15:16:46 +0900</pubDate>
    </item>
    <item>
      <title>Goals of Continual Learning, Forward Transfer, and Backward Transfer</title>
      <link>https://oneonlee.tistory.com/175</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;napkin-selection.png&quot; data-origin-width=&quot;1329&quot; data-origin-height=&quot;1329&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pxIrp/btsKlbJMxv7/3HpgjlQ9TxrssfQaWPXFy0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pxIrp/btsKlbJMxv7/3HpgjlQ9TxrssfQaWPXFy0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pxIrp/btsKlbJMxv7/3HpgjlQ9TxrssfQaWPXFy0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpxIrp%2FbtsKlbJMxv7%2F3HpgjlQ9TxrssfQaWPXFy0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;234&quot; height=&quot;234&quot; data-filename=&quot;napkin-selection.png&quot; data-origin-width=&quot;1329&quot; data-origin-height=&quot;1329&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Goals of Continual Learning&lt;/h2&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Avoid Catastrophic Forgetting
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;What was learned on previous tasks must be preserved&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Positive Forward Transfer
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Knowledge learned on previous tasks should help with subsequent tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Positive Backward Transfer
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Knowledge learned on later tasks should also help improve performance on previous tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Task-Order Free Learning
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;All tasks should be performed well regardless of the order in which they are learned&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;Forward Transfer&lt;/b&gt;&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Forward transfer is a model's ability to use knowledge from previously learned tasks to improve learning efficiency and performance on a new task.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In Continual Learning, forward transfer is measured by how easily the model can learn a new task using the representations it learned on past tasks.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Ideally, a model with positive forward transfer can exploit its prior knowledge effectively and therefore learn a new task faster or with fewer resources.&lt;/p&gt;
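&lt;p data-ke-size=&quot;size16&quot;&gt;The post itself gives no formula, so the following is an added illustration: one commonly used way to quantify forward transfer is the FWT metric of Lopez-Paz and Ranzato (2017). With $T$ tasks learned in sequence, let $R_{i,j}$ be the test accuracy on task $j$ after training up to task $i$, and let $\bar{b}_i$ be the test accuracy on task $i$ of a randomly initialized model.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\text{FWT}=\frac{1}{T-1}\sum_{i=2}^{T}\left(R_{i-1,i}-\bar{b}_i\right)$$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A positive value means that, on average, training on earlier tasks already helps on a task before that task has been trained on.&lt;/p&gt;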
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;b&gt;Backward Transfer&lt;/b&gt;&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Backward transfer, in contrast, concerns the effect that learning a new task has on the performance of previously learned tasks. In its positive form, performance on earlier tasks improves as the new task is learned.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Positive backward transfer occurs when learning a new task improves performance on previous tasks. Its distinguishing feature is that performance can improve without revisiting the data of the previous tasks.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In most neural networks, however, catastrophic forgetting makes backward transfer hard to achieve.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Because Continual Learning focuses on preventing catastrophic forgetting, it can unintentionally limit backward transfer, and positive backward transfer is rare.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Some methods nevertheless aim for positive backward transfer by selectively updating the model in ways that benefit both the new task and the previous tasks.&lt;/p&gt;
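&lt;p data-ke-size=&quot;size16&quot;&gt;To make the sign of backward transfer concrete, here is a small Python sketch. It assumes the BWT metric of Lopez-Paz and Ranzato (2017), which is not stated in this post: with &lt;code&gt;acc[i][j]&lt;/code&gt; as the test accuracy on task j after training up to task i, BWT averages the difference between the final accuracy on each earlier task and the accuracy right after that task was learned. The accuracy matrix below is made up for illustration.&lt;/p&gt;

```python
def backward_transfer(acc):
    """Average change in accuracy on earlier tasks after all tasks are learned.

    acc[i][j] is the test accuracy on task j after training sequentially
    through task i (0-indexed). Negative values indicate forgetting,
    positive values indicate positive backward transfer.
    """
    num_tasks = len(acc)
    if num_tasks in (0, 1):
        raise ValueError("backward transfer needs at least two tasks")
    last = num_tasks - 1
    total = sum(acc[last][j] - acc[j][j] for j in range(last))
    return total / last


# Hypothetical accuracy matrix for three tasks learned in order.
acc = [
    [0.90, 0.10, 0.10],
    [0.80, 0.85, 0.20],
    [0.70, 0.80, 0.90],
]
print(backward_transfer(acc))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this made-up example the result is negative, because accuracy on the first two tasks drops after the third task is learned, which is the catastrophic forgetting case described above.&lt;/p&gt;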
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;References&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;ContinualAI Wiki - Introduction to Continual Learning, &lt;a href=&quot;https://wiki.continualai.org/the-continualai-wiki/introduction-to-continual-learning&quot;&gt;https://wiki.continualai.org/the-continualai-wiki/introduction-to-continual-learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Prof. Chanyoung Park, Department of Industrial and Systems Engineering, KAIST - Universal User Representation Learning based on Continual Learning, &lt;a href=&quot;https://dsail.kaist.ac.kr/files/NAVER_Techtalk2023.pdf&quot;&gt;https://dsail.kaist.ac.kr/files/NAVER_Techtalk2023.pdf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Lin et al., Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer, NeurIPS 2022, &lt;a href=&quot;https://arxiv.org/abs/2211.00789&quot;&gt;https://arxiv.org/abs/2211.00789&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Artificial Intelligence/ML &amp;amp; DL</category>
      <category>backward transfer</category>
      <category>catastrophic forgetting</category>
      <category>Continual Learning</category>
      <category>forward transfer</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/175</guid>
      <comments>https://oneonlee.tistory.com/175#entry175comment</comments>
      <pubDate>Sun, 27 Oct 2024 21:59:08 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation</title>
      <link>https://oneonlee.tistory.com/174</link>
      <description>&lt;p&gt;Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Publication Info: Information Sciences 2025&lt;/li&gt;
&lt;li&gt;URL: &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0020025524012829&quot;&gt;https://www.sciencedirect.com/science/article/pii/S0020025524012829&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;Contribution&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Clearly defines the Continual Learning paradigm in the context of IR tasks&lt;/li&gt;
&lt;li&gt;Proposes the Topic-MS-MARCO dataset for evaluating continual IR&lt;ul&gt;
&lt;li&gt;Includes topic-specific IR tasks and predefined task similarity&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Proposes the CLNIR (Continual Learning Framework for Neural Information Retrieval) framework&lt;ul&gt;
&lt;li&gt;Seven strategies based on regularization &amp;amp; replay mechanisms&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Limitations&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;optimization-based &amp;amp; architecture-based strategies는 탐구되지 않음&lt;/li&gt;
&lt;li&gt;단일 데이터셋만을 사용했음&lt;ul&gt;
&lt;li&gt;cross-domain setting이라고 보기 힘듦&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>Continual Learning</category>
      <category>information retrieval</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/174</guid>
      <comments>https://oneonlee.tistory.com/174#entry174comment</comments>
      <pubDate>Fri, 25 Oct 2024 20:19:12 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] Dense Retrieval Adaptation using Target Domain Description</title>
      <link>https://oneonlee.tistory.com/173</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;Dense Retrieval Adaptation using Target Domain Description&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Cited by 3 (as of 2024-10-22)&lt;/li&gt;
&lt;li&gt;Publication Info: ACM ICTIR 2023&lt;/li&gt;
&lt;li&gt;URL: &lt;a href=&quot;https://arxiv.org/abs/2307.02740&quot;&gt;https://arxiv.org/abs/2307.02740&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1729854874773&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Dense Retrieval Adaptation using Target Domain Description&quot; data-og-description=&quot;In information retrieval (IR), domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation where they have&quot; data-og-host=&quot;arxiv.org&quot; data-og-source-url=&quot;https://arxiv.org/abs/2307.02740&quot; data-og-url=&quot;https://arxiv.org/abs/2307.02740v1&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/EFjV2/hyXlU9HFz2/LKrf8s0U9ovwY1ljVgmOs0/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/BnwRs/hyXpuO1DM0/EvAm9PMtmluZ9vao8Ja0H1/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000&quot;&gt;&lt;a href=&quot;https://arxiv.org/abs/2307.02740&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://arxiv.org/abs/2307.02740&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/EFjV2/hyXlU9HFz2/LKrf8s0U9ovwY1ljVgmOs0/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/BnwRs/hyXpuO1DM0/EvAm9PMtmluZ9vao8Ja0H1/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Dense Retrieval Adaptation using Target Domain Description&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;In information retrieval (IR), domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation where they have&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;arxiv.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Summary&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Follow-up work to (NAACL 2022) GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Related post: &lt;a href=&quot;https://oneonlee.tistory.com/172&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;[Brief Paper Summary] GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Proposes a method that performs Unsupervised Domain Adaptation using only a description of the target domain&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Related Work
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Most existing methods for domain adaptation of IR models assume access to target domain data&lt;/li&gt;
&lt;li&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;2318&quot; data-origin-height=&quot;437&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b6lgcB/btsKlxLVTLY/ZsfAZTgwVyFkbe1GjttIo1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b6lgcB/btsKlxLVTLY/ZsfAZTgwVyFkbe1GjttIo1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b6lgcB/btsKlxLVTLY/ZsfAZTgwVyFkbe1GjttIo1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb6lgcB%2FbtsKlxLVTLY%2FZsfAZTgwVyFkbe1GjttIo1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2318&quot; height=&quot;437&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;2318&quot; data-origin-height=&quot;437&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&amp;nbsp;&lt;/li&gt;
&lt;li&gt;In practice, however, the actual target domain data may be inaccessible
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;e.g., medical records or legally restricted data cannot be shared&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Similar to a zero-shot setting, this paper improves a Dense Retrieval model without using target data, relying only on a description of the target domain
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;description: a high-level textual description that outlines the task and characteristics of the data&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Methods&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;1874&quot; data-origin-height=&quot;642&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bAdd50/btsKjvCnJzY/jdqrwhyBFpTzig9iCFaFFK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bAdd50/btsKjvCnJzY/jdqrwhyBFpTzig9iCFaFFK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bAdd50/btsKjvCnJzY/jdqrwhyBFpTzig9iCFaFFK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbAdd50%2FbtsKjvCnJzY%2FjdqrwhyBFpTzig9iCFaFFK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1874&quot; height=&quot;642&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;1874&quot; data-origin-height=&quot;642&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Exp - Dataset&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Target Retrieval Task 1: Bio-Medical IR
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;TREC Covid Track in 2020 (TREC-COVID)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Target Retrieval Task 2: Financial Question Answering
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;FiQA-2018 Task 2 (FiQA)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Target Retrieval Task 3: Argument Retrieval
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;ArguAna&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Target Retrieval Task 4: Duplicate Question Retrieval
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Quora
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The aim of duplicate question retrieval is to detect repeated questions asked on community question-answering (CQA) forums&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Target Retrieval Task 5: Fact Checking
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;SciFact&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>dense retrieval adaptation</category>
      <category>domain adaptation</category>
      <category>information retrieval</category>
      <category>target domain description</category>
      <category>unsupervised domain adaptation</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/173</guid>
      <comments>https://oneonlee.tistory.com/173#entry173comment</comments>
      <pubDate>Fri, 25 Oct 2024 20:17:08 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval</title>
      <link>https://oneonlee.tistory.com/172</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Cited by 142 (as of 2024-10-22)&lt;/li&gt;
&lt;li&gt;Publication Info: NAACL 2022&lt;/li&gt;
&lt;li&gt;URL: &lt;a href=&quot;https://aclanthology.org/2022.naacl-main.168&quot;&gt;https://aclanthology.org/2022.naacl-main.168&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1729854611467&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval&quot; data-og-description=&quot;Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.&quot; data-og-host=&quot;aclanthology.org&quot; data-og-source-url=&quot;https://aclanthology.org/2022.naacl-main.168&quot; data-og-url=&quot;https://aclanthology.org/2022.naacl-main.168&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/gRLie/hyXpCfcEOe/ekg9wpDOGd0XnVktKPG7J1/img.jpg?width=600&amp;amp;height=600&amp;amp;face=0_0_600_600&quot;&gt;&lt;a href=&quot;https://aclanthology.org/2022.naacl-main.168&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://aclanthology.org/2022.naacl-main.168&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/gRLie/hyXpCfcEOe/ekg9wpDOGd0XnVktKPG7J1/img.jpg?width=600&amp;amp;height=600&amp;amp;face=0_0_600_600');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;aclanthology.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Summary&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Proposes GPL, an Unsupervised Domain Adaptation method, and compares it extensively against existing domain adaptation methods (nDCG@10)&lt;/li&gt;
&lt;/ul&gt;
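&lt;p data-ke-size=&quot;size16&quot;&gt;For reference, a minimal sketch of how nDCG@10, the metric used in the comparison above, can be computed. It assumes linear gain (the relevance label itself) and builds the ideal ranking only from the labels of the returned passages, which is a simplification. The function names and the toy labels are hypothetical and are not taken from the paper.&lt;/p&gt;

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance labels (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ordering.

    Assumption: the ideal ordering is built only from the labels passed in.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0


# Hypothetical relevance labels of the passages a retriever returned, in rank order
ranked_relevances = [0, 1, 0, 1]
score = ndcg_at_k(ranked_relevances)
print(round(score, 4))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A ranking that places every relevant passage first scores 1.0, and pushing relevant passages further down lowers the score.&lt;/p&gt;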
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Dense retrieval requires large amounts of training data, which are unavailable for most domains&lt;/li&gt;
&lt;li&gt;Dense retrieval is highly sensitive to domain shifts
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A model trained on MS MARCO performs rather poorly on questions about COVID-19 scientific literature&lt;/li&gt;
&lt;li&gt;This is because MS MARCO was created before COVID-19 and contains no COVID-19-related topics, so the model never learned to represent them well in the vector space&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Related Work&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prior work has shown that dense retrieval performance degrades under domain shift&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Contribution&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;2343&quot; data-origin-height=&quot;855&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bdih8V/btsKkIU0HqV/aKMEKvG0PPngKbdxU3szJk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bdih8V/btsKkIU0HqV/aKMEKvG0PPngKbdxU3szJk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bdih8V/btsKkIU0HqV/aKMEKvG0PPngKbdxU3szJk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbdih8V%2FbtsKkIU0HqV%2FaKMEKvG0PPngKbdxU3szJk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2343&quot; height=&quot;855&quot; data-filename=&quot;image.png&quot; data-origin-width=&quot;2343&quot; data-origin-height=&quot;855&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Proposes Generative Pseudo Labeling (GPL), an unsupervised domain adaptation method that combines a query generator with pseudo labeling by a cross-encoder
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Generate synthetic queries for passages from the target domain with a pre-trained T5&lt;/li&gt;
&lt;li&gt;Treat the passage used to generate a synthetic query as the positive passage for that query&lt;/li&gt;
&lt;li&gt;Negative mining: use an existing dense retrieval model to retrieve the passages most similar to each generated query and treat them as negative passages (hard negatives)&lt;/li&gt;
&lt;li&gt;Score each (query, passage) pair with a cross-encoder, then train the dense retrieval model on the resulting pseudo-labeled queries with MarginMSE-Loss&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Compares GPL with previous domain adaptation models
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Previous Domain Adaptation Methods
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;UDALM, MoDIR&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Pre-Training based Domain Adaptation
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;CD, SimCSE, CT, MLM, ICT, TSDAE&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Generation-based Domain Adaptation
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;QGen, QGen (w/ Hard Negatives)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
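&lt;p data-ke-size=&quot;size16&quot;&gt;The training objective in step 4 can be sketched as follows. This is a minimal pure-Python illustration, assuming the dense retriever scores a (query, passage) pair by dot product. The teacher margins stand in for the cross-encoder score difference between the positive and the negative passage. The function names and the toy vectors are hypothetical and are not taken from the GPL code.&lt;/p&gt;

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin_mse_loss(queries, positives, negatives, teacher_margins):
    """Mean squared error between student and teacher score margins.

    Student margin: dot(q, pos) - dot(q, neg), from the dense retriever.
    Teacher margin: cross-encoder score(q, pos) - score(q, neg), given as input.
    """
    squared_errors = []
    for q, pos, neg, teacher in zip(queries, positives, negatives, teacher_margins):
        student = dot(q, pos) - dot(q, neg)
        squared_errors.append((student - teacher) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Toy 2-d embeddings (hypothetical values, for illustration only)
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[2.0, 0.0], [0.0, 3.0]]
negatives = [[1.0, 1.0], [1.0, 1.0]]
teacher_margins = [1.0, 1.0]

loss = margin_mse_loss(queries, positives, negatives, teacher_margins)
print(loss)  # 0.5
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Because the target is a margin rather than a binary label, a mined negative that the cross-encoder also scores highly yields a small target margin instead of being pushed away as a strict negative.&lt;/p&gt;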
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Exp - Dataset&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;six representative domain-specific datasets from the BeIR benchmark
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;FiQA (financial domain)&lt;/li&gt;
&lt;li&gt;SciFact (scientific papers)&lt;/li&gt;
&lt;li&gt;BioASQ (biomedical Q&amp;amp;A)&lt;/li&gt;
&lt;li&gt;TREC-COVID (scientific papers on COVID-19)&lt;/li&gt;
&lt;li&gt;CQADupStack (12 StackExchange subforums)&lt;/li&gt;
&lt;li&gt;Robust04 (news articles)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Limitation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;No analysis of catastrophic forgetting&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>dense retrieval</category>
      <category>dense retrieval adaptation</category>
      <category>domain adaptation</category>
      <category>generative pseudo labeling</category>
      <category>GPL</category>
      <category>information retrieval</category>
      <category>unsupervised domain adaptation</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/172</guid>
      <comments>https://oneonlee.tistory.com/172#entry172comment</comments>
      <pubDate>Fri, 25 Oct 2024 20:10:58 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] Continual Learning of Long Topic Sequences in Neural Information Retrieval</title>
      <link>https://oneonlee.tistory.com/171</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;Continual Learning of Long Topic Sequences in Neural Information Retrieval&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Cited by 6 (as of 2024-10-22)&lt;/li&gt;
&lt;li&gt;Publication Info: ECIR 2022&lt;/li&gt;
&lt;li&gt;URL: &lt;a href=&quot;https://arxiv.org/abs/2201.03356&quot;&gt;https://arxiv.org/abs/2201.03356&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1729854253430&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Continual Learning of Long Topic Sequences in Neural Information Retrieval&quot; data-og-description=&quot;In information retrieval (IR) systems, trends and users' interests may change over time, altering either the distribution of requests or contents to be recommended. Since neural ranking approaches heavily depend on the training data, it is crucial to under&quot; data-og-host=&quot;arxiv.org&quot; data-og-source-url=&quot;https://arxiv.org/abs/2201.03356&quot; data-og-url=&quot;https://arxiv.org/abs/2201.03356v1&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/cqjbTy/hyXlRE62cq/1kkfKTFEMkem9IJqLWNsuK/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/dyH0Cd/hyXlKsrl69/ZprFuqcjzjPC8g6A4aRbG0/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000&quot;&gt;&lt;a href=&quot;https://arxiv.org/abs/2201.03356&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://arxiv.org/abs/2201.03356&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/cqjbTy/hyXlRE62cq/1kkfKTFEMkem9IJqLWNsuK/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/dyH0Cd/hyXlKsrl69/ZprFuqcjzjPC8g6A4aRbG0/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Continual Learning of Long Topic Sequences in Neural Information Retrieval&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;In information retrieval (IR) systems, trends and users' interests may change over time, altering either the distribution of requests or contents to be recommended. Since neural ranking approaches heavily depend on the training data, it is crucial to under&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;arxiv.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Summary&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Follow-up to (ECIR 2021) Studying Catastrophic Forgetting in Neural Ranking Models, which measured the extent of Catastrophic Forgetting in Deep Neural Ranking Models and proposed a remedy
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Related post: &lt;a href=&quot;https://oneonlee.tistory.com/170&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;[Brief Paper Summary] Studying Catastrophic Forgetting in Neural Ranking Models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1729854304618&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;article&quot; data-og-title=&quot;[논문 간단 정리] Studying Catastrophic Forgetting in Neural Ranking Models&quot; data-og-description=&quot;Cited by 23 ('2024-10-22)Publication Info: ECIR 2021URL: https://arxiv.org/abs/2101.06984&amp;nbsp;Studying Catastrophic Forgetting in Neural Ranking ModelsSeveral deep neural ranking models have been proposed in the recent IR literature. While their transferabili&quot; data-og-host=&quot;oneonlee.tistory.com&quot; data-og-source-url=&quot;https://oneonlee.tistory.com/170&quot; data-og-url=&quot;https://oneonlee.tistory.com/170&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/G8kxG/hyXlKeTJ1J/pWKmS4y1t78UwjOJmwufok/img.png?width=800&amp;amp;height=800&amp;amp;face=0_0_800_800,https://scrap.kakaocdn.net/dn/EYUaU/hyXlI2seVB/c6c1MtDp6xXtq5kAXIzwE1/img.png?width=800&amp;amp;height=800&amp;amp;face=0_0_800_800&quot;&gt;&lt;a href=&quot;https://oneonlee.tistory.com/170&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://oneonlee.tistory.com/170&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/G8kxG/hyXlKeTJ1J/pWKmS4y1t78UwjOJmwufok/img.png?width=800&amp;amp;height=800&amp;amp;face=0_0_800_800,https://scrap.kakaocdn.net/dn/EYUaU/hyXlI2seVB/c6c1MtDp6xXtq5kAXIzwE1/img.png?width=800&amp;amp;height=800&amp;amp;face=0_0_800_800');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;[Brief Paper Summary] Studying Catastrophic Forgetting in Neural Ranking Models&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Cited by 23 ('2024-10-22)Publication Info: ECIR 2021URL: https://arxiv.org/abs/2101.06984&amp;nbsp;Studying Catastrophic Forgetting in Neural Ranking ModelsSeveral deep neural ranking models have been proposed in the recent IR literature. While their transferabili&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;oneonlee.tistory.com&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Content and user needs can change over time&lt;/li&gt;
&lt;li&gt;The goal is to determine whether IR models can adapt their ranking ability to new topics/trends, and whether models kept up to date still perform well on earlier topics/trends&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Contribution&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Designs a corpus derived from MSMarco to handle long topic sequences and IR-driven controlled topic sequences for Continual Learning&lt;/li&gt;
&lt;li&gt;Compares the performance of different neural ranking models in a long-term Continual Learning IR setting and in controlled settings
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RQ1: Modeling the long topic sequence; how should a sequence of tasks be designed for continual learning in IR?
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Finds that task similarity, the position of a task in the learning process, and the type of distribution to be transferred (short vs. long texts) are important to consider when designing lifelong learning strategies&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;RQ2: Performances on the MSMarco long topic sequence; how do neural ranking models perform while learning a long topic sequence? Can signs of catastrophic forgetting be detected?
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Finds that catastrophic forgetting exists in IR but is lower than in other domains&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Investigates how the task similarity level affects the learning behavior of neural ranking models in Continual Learning
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;RQ3: Behavior on IR-driven controlled settings; does the similarity level of tasks within a sequence affect model effectiveness and robustness to catastrophic forgetting?&lt;/li&gt;
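&lt;li&gt;RQ4: How do neural ranking models adapt to shifts in the query or document distribution?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Exp - Dataset&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A custom variant of the MS MARCO dataset&lt;/li&gt;
&lt;/ul&gt;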
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Limitation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;In this study, &lt;b&gt;different domains&lt;/b&gt; simply &lt;b&gt;mean different datasets&lt;/b&gt; characterized by different data distributions&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>Continual Learning</category>
      <category>information retrieval</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/171</guid>
      <comments>https://oneonlee.tistory.com/171#entry171comment</comments>
      <pubDate>Fri, 25 Oct 2024 20:05:56 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] Studying Catastrophic Forgetting in Neural Ranking Models</title>
      <link>https://oneonlee.tistory.com/170</link>
      <description>&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Cited by 23 (as of 2024-10-22)&lt;/li&gt;
&lt;li&gt;Publication Info: ECIR 2021&lt;/li&gt;
&lt;li&gt;URL: &lt;a href=&quot;https://arxiv.org/abs/2101.06984&quot;&gt;https://arxiv.org/abs/2101.06984&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;figure id=&quot;og_1729854086135&quot; contenteditable=&quot;false&quot; data-ke-type=&quot;opengraph&quot; data-ke-align=&quot;alignCenter&quot; data-og-type=&quot;website&quot; data-og-title=&quot;Studying Catastrophic Forgetting in Neural Ranking Models&quot; data-og-description=&quot;Several deep neural ranking models have been proposed in the recent IR literature. While their transferability to one target domain held by a dataset has been widely addressed using traditional domain adaptation strategies, the question of their cross-doma&quot; data-og-host=&quot;arxiv.org&quot; data-og-source-url=&quot;https://arxiv.org/abs/2101.06984&quot; data-og-url=&quot;https://arxiv.org/abs/2101.06984v1&quot; data-og-image=&quot;https://scrap.kakaocdn.net/dn/bqGOEq/hyXlHibzlR/m4NvxNwaCacFbGss2RVnNK/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/ygoyk/hyXlLZeEX5/PB5CBXcy60Hkabj9UILQUk/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000&quot;&gt;&lt;a href=&quot;https://arxiv.org/abs/2101.06984&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot; data-source-url=&quot;https://arxiv.org/abs/2101.06984&quot;&gt;
&lt;div class=&quot;og-image&quot; style=&quot;background-image: url('https://scrap.kakaocdn.net/dn/bqGOEq/hyXlHibzlR/m4NvxNwaCacFbGss2RVnNK/img.png?width=1200&amp;amp;height=700&amp;amp;face=0_0_1200_700,https://scrap.kakaocdn.net/dn/ygoyk/hyXlLZeEX5/PB5CBXcy60Hkabj9UILQUk/img.png?width=1000&amp;amp;height=1000&amp;amp;face=0_0_1000_1000');&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class=&quot;og-text&quot;&gt;
&lt;p class=&quot;og-title&quot; data-ke-size=&quot;size16&quot;&gt;Studying Catastrophic Forgetting in Neural Ranking Models&lt;/p&gt;
&lt;p class=&quot;og-desc&quot; data-ke-size=&quot;size16&quot;&gt;Several deep neural ranking models have been proposed in the recent IR literature. While their transferability to one target domain held by a dataset has been widely addressed using traditional domain adaptation strategies, the question of their cross-doma&lt;/p&gt;
&lt;p class=&quot;og-host&quot; data-ke-size=&quot;size16&quot;&gt;arxiv.org&lt;/p&gt;
&lt;/div&gt;
&lt;/a&gt;&lt;/figure&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Summary&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Measures the extent of Catastrophic Forgetting in Deep Neural Ranking Models and proposes a remedy
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Highlights a small weakness of IR models: they forget some knowledge over time&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Appears to be the first paper to address Continual Learning in IR&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prior Ranking Model research addressed transfer to a single target domain only, using Domain Adaptation strategies&lt;/li&gt;
&lt;li&gt;However, whether catastrophic forgetting occurs in cross-domain transfer had not been examined&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Related Work&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Prior work showed that the level of Catastrophic Forgetting depends heavily on the dataset and the architecture&lt;/li&gt;
&lt;li&gt;However, no existing work examines Catastrophic Forgetting from the perspective of cross-domain transferability or shows how to overcome it if present&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Contribution&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
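&lt;li&gt;Measures the extent of Catastrophic Forgetting in a cross-domain setting, i.e., how much a Ranking Model's performance on old knowledge degrades after it acquires new knowledge&lt;/li&gt;
&lt;li&gt;Identifies dataset characteristics that predict catastrophic forgetting&lt;/li&gt;
&lt;li&gt;Verifies experimentally that a cross-domain regularizer can mitigate Catastrophic Forgetting&lt;/li&gt;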
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Exp - Dataset&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;MS MARCO&lt;/li&gt;
&lt;li&gt;TREC CORD19&lt;/li&gt;
&lt;li&gt;TREC Microblog&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Limitation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;In this study, &lt;b&gt;different domains&lt;/b&gt; simply &lt;b&gt;mean different datasets&lt;/b&gt; characterized by different data distributions&lt;/li&gt;
&lt;li&gt;The stream consists of only two or three consecutive datasets
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Realistic scenarios with long-term topic sequences are not considered&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Language shift and information updates are not considered&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>catastrophic forgetting</category>
      <category>Continual Learning</category>
      <category>information retrieval</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/170</guid>
      <comments>https://oneonlee.tistory.com/170#entry170comment</comments>
      <pubDate>Fri, 25 Oct 2024 20:01:33 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models</title>
      <link>https://oneonlee.tistory.com/169</link>
      <description>&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;(EMNLP 2023) SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;arXiv: &lt;a href=&quot;https://arxiv.org/abs/2303.08896&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://arxiv.org/abs/2303.08896&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;code: &lt;a href=&quot;https://github.com/potsawee/selfcheckgpt&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://github.com/potsawee/selfcheckgpt&lt;/a&gt;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-style=&quot;style6&quot; data-ke-type=&quot;horizontalRule&quot; /&gt;
&lt;p style=&quot;color: #333333; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;1. Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Hallucination&amp;nbsp;Detection&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Existing fact-verification methods may not work for black-box models such as ChatGPT, so a new approach is needed that can detect hallucinations without external resources&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;2. Related Works&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;intrinsic uncertainty metrics (e.g., token probability or entropy)&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;information may not be available to users when systems are accessed through limited external APIs&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;fact-verification approaches
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;facts can only be assessed relative to the knowledge present in the database&lt;/li&gt;
&lt;li&gt;hallucinations are observed over a wide range of tasks beyond pure fact verification&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;3. Proposed Key Ideas: SelfCheckGPT (sampling-based approach)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;571&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b2T9uI/btsJQlmHPHW/EjbtEUgIwUSiKd2olROyTK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b2T9uI/btsJQlmHPHW/EjbtEUgIwUSiKd2olROyTK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b2T9uI/btsJQlmHPHW/EjbtEUgIwUSiKd2olROyTK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb2T9uI%2FbtsJQlmHPHW%2FEjbtEUgIwUSiKd2olROyTK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;571&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;571&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Introduces SelfCheckGPT, a sampling-based approach for detecting hallucinations in black-box LLMs without relying on external resources
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Measures consistency across multiple sampled responses using several variants: BERTScore, question answering, n-gram analysis, NLI, and LLM prompting&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The motivating idea
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When an LLM has been trained on a given concept, the sampled responses are likely to be similar and contain consistent facts.&lt;/li&gt;
&lt;li&gt;However, for hallucinated facts, stochastically sampled responses are likely to diverge and may contradict one another.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;zero-resource hallucination detection solution that can be applied to black-box systems&lt;/li&gt;
&lt;/ul&gt;
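&lt;p data-ke-size=&quot;size16&quot;&gt;The sampling-based idea above can be sketched roughly as follows. This is an illustration only: the token-overlap scorer is a crude stand-in for the BERTScore, QA, n-gram, NLI, and prompting variants, and every function name here is hypothetical rather than part of the selfcheckgpt package.&lt;/p&gt;

```python
import re


def tokens(text: str) -> set:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence: str, sample: str) -> float:
    """Fraction of the sentence's tokens that also appear in the sample.

    A crude stand-in for a real consistency scorer such as NLI.
    """
    sent = tokens(sentence)
    if not sent:
        return 1.0
    return len(sent.intersection(tokens(sample))) / len(sent)


def inconsistency_scores(sentences: list, samples: list) -> list:
    """Per-sentence score in [0, 1]; higher means less supported by the samples.

    Assumes `samples` is a non-empty list of extra responses drawn
    stochastically from the same model for the same prompt.
    """
    scores = []
    for sentence in sentences:
        mean_support = sum(support(sentence, s) for s in samples) / len(samples)
        scores.append(1.0 - mean_support)
    return scores


# Made-up response sentences and sampled responses, for illustration only.
sentences = ["Paris is the capital of France.", "It was founded in 1850."]
samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "France has Paris as its capital city.",
]
scores = inconsistency_scores(sentences, samples)
print(scores)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this toy run the second sentence, which none of the samples support, receives the higher score and would be the one flagged as a likely hallucination.&lt;/p&gt;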
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Paper Review</category>
      <category>Hallucination</category>
      <category>hallucination detection</category>
      <category>LLM</category>
      <category>selfcheckgpt</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/169</guid>
      <comments>https://oneonlee.tistory.com/169#entry169comment</comments>
      <pubDate>Mon, 30 Sep 2024 17:34:15 +0900</pubDate>
    </item>
    <item>
      <title>[Brief Paper Summary] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation</title>
      <link>https://oneonlee.tistory.com/168</link>
      <description>&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;(ICLR 2023 notable-top-25%) Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;arXiv: &lt;a href=&quot;https://arxiv.org/abs/2302.09664&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://arxiv.org/abs/2302.09664&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;code: &lt;a href=&quot;https://github.com/lorenzkuhn/semantic_uncertainty&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://github.com/lorenzkuhn/semantic_uncertainty&lt;/a&gt;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-style=&quot;style6&quot; data-ke-type=&quot;horizontalRule&quot; /&gt;
&lt;p style=&quot;color: #333333; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;1. Motivation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Estimating the uncertainty of LLM-generated answers is an important problem for trustworthy LLMs&lt;/li&gt;
&lt;li&gt;However, existing token-likelihood-based methods for estimating answer uncertainty do not account for semantic equivalence
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;semantic equivalence: lexically different sentences that carry the same meaning&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;This paper addresses uncertainty estimation in LLMs, in particular for QA tasks, where existing methods struggle because of semantic equivalence&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;2. Related Work on Uncertainty Estimation&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;predictive entropy of the output distribution&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$PE(x) = H(Y \vert x) = -\int p(y \vert x) \log{p(y \vert x)} dy$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;3. Proposed Key Ideas&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;semantic entropy (computed from semantic likelihoods)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;An unsupervised uncertainty metric that marginalizes over semantically equivalent samples, merging their probability mass so that differences in wording alone do not inflate the uncertainty&lt;/li&gt;
&lt;li&gt;The method groups semantically equivalent outputs with bidirectional entailment clustering and computes uncertainty from the distribution over the resulting clusters&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
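&lt;p data-ke-size=&quot;size16&quot;&gt;The two steps above (clustering by bidirectional entailment, then taking the entropy over clusters) can be sketched roughly as follows. This sketch makes two simplifying assumptions: cluster probabilities are estimated from sample counts rather than from sequence likelihoods, and &lt;code&gt;toy_entails&lt;/code&gt; is a toy stand-in for an NLI model. All names are hypothetical and are not taken from the authors' repository.&lt;/p&gt;

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], entails: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedy bidirectional entailment clustering.

    An answer joins the first cluster whose representative it entails
    and is entailed by; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            rep = cluster[0]
            if entails(answer, rep) and entails(rep, answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], entails: Callable[[str, str], bool]
) -> float:
    """Entropy over meaning clusters, using sample frequency as probability."""
    clusters = cluster_by_meaning(answers, entails)
    total = len(answers)
    return -sum((len(c) / total) * math.log(len(c) / total) for c in clusters)


def toy_entails(a: str, b: str) -> bool:
    """Toy stand-in for an NLI model: a entails b if every word of b is in a."""
    return set(b.lower().split()).issubset(a.lower().split())


# Made-up sampled answers, for illustration only.
answers = ["Paris", "paris", "PARIS", "Rome"]
print(cluster_by_meaning(answers, toy_entails))
print(semantic_entropy(answers, toy_entails))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Here the three spellings of Paris fall into one cluster, so the entropy is that of a two-outcome distribution with probabilities 0.75 and 0.25, whereas an entropy over the raw strings would treat them as four distinct outcomes.&lt;/p&gt;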
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style=&quot;color: #000000; text-align: start;&quot; data-ke-size=&quot;size26&quot;&gt;4. Summary of Experimental Results&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
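&lt;li&gt;For larger models and more challenging datasets, semantic entropy achieves higher AUROC than existing uncertainty measures&lt;/li&gt;
&lt;li&gt;It predicts model accuracy on QA tasks better than comparable baselines, and its performance improves as model size grows&lt;/li&gt;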
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>Hallucination</category>
      <category>hallucination detection</category>
      <category>LLM</category>
      <category>semantic uncertainty</category>
      <category>uncertainty estimation</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/168</guid>
      <comments>https://oneonlee.tistory.com/168#entry168comment</comments>
      <pubDate>Mon, 30 Sep 2024 15:47:40 +0900</pubDate>
    </item>
    <item>
      <title>Brief Paper Summary Post Template</title>
      <link>https://oneonlee.tistory.com/166</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;(venue year) Title&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;arXiv:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;code:&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Importance of the Problem&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Related Works&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Proposed Key Ideas&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Summary of Experimental Results&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>Paper Review</category>
      <category>Brief Paper Summary</category>
      <author>oneonlee</author>
      <guid isPermaLink="true">https://oneonlee.tistory.com/166</guid>
      <comments>https://oneonlee.tistory.com/166#entry166comment</comments>
      <pubDate>Mon, 30 Sep 2024 14:06:03 +0900</pubDate>
    </item>
  </channel>
</rss>