Posts in the 'All categories' category (Page 26)

Internal Consistency and Self-Feedback in Large Language Models: A Survey (paper review)

self-correct: https://arxiv.org/abs/2211.00053 https://arxiv.org/abs/2406.01297 When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs. Self-correction is an approach to improving responses from large language models (LLMs) by refining the responses using LLMs during inference. Prior work has proposed various self-correction frameworks using different sources ..

Uncategorized 2024.07.27

LEGO: A Multi-agent Collaborative Framework with Role-playing and Iterative Feedback for Causality Explanation Generation (paper review)

https://openreview.net/pdf?id=RAtrnAtAsM 2. Methodology: (1) Fine-grained World Knowledge Integration Module, (2) Iterative Feedback and Refinement Module. One LLM serves as the Explainer and produces the initial output. Critic LLM: observation and iterative feedback, after which the Explainer refines its explanation. 2.1 Fine-grained World Knowledge Integration: an inception prompt assigns the Cause Analyst role and the Effect Analyst role to two LLMs. The..

agent/multi - agent 2024.07.26

Debating with More Persuasive LLMs Leads to More Truthful Answers (paper review)

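Paper dated May 30, 2024. Human-labelled data has mostly been used to align LLMs. However, as LLMs grow more sophisticated they will surpass human expertise, and the human evaluator's role will become that of a non-expert overseeing these expert LLMs. Gathering experts from every field to correct a model's wrong answers is hard. In anticipation of this, the question is whether a weaker model (judge) can evaluate (supervise) a stronger model -> evaluation by debate. A non-expert (weak) model selects the answer, and the strong models raise this accuracy through debate. Answer accuracy: non-expert judges (actual humans): 60% ..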

agent/multi - agent 2024.07.24

Knowledge Mechanisms in Large Language Models: A Survey and Perspective (paper review)

https://arxiv.org/pdf/2407.15017

Uncategorized 2024.07.23

GovSim (Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents) (paper review)

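On cooperation among LLMs. Problems solved through cooperation are diverse, ranging from small groups of fishers harvesting fish (a shared resource) to international treaties that limit pollution to reduce the negative effects of climate change. However, sustaining cooperation can be difficult when selfish individuals or organizations must pay a personal cost to sustain the greater good. Mechanism designers develop incentive-compatible systems to elicit cooperation from such individuals; these systems are mostly top-down processes, but real people also develop solutions from the bottom up. The first common-pool resource sharing simulation platform for LLM agents ( first common ..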
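The fishing example in the excerpt above can be made concrete with a toy common-pool resource loop. This is only a sketch for intuition, not GovSim's implementation: the function name `simulate`, the regrowth factor, the capacity, and the collapse threshold are all assumed values chosen for illustration.

```python
def simulate(harvest_per_agent: float,
             n_agents: int = 5,
             stock: float = 100.0,
             capacity: float = 100.0,
             growth: float = 2.0,
             months: int = 12,
             collapse_below: float = 5.0) -> int:
    """Return how many full months the shared fish stock survives.

    All parameters are hypothetical illustration values.
    Each month every agent harvests the same amount, then the
    remaining stock regrows by `growth`, capped at `capacity`.
    """
    for month in range(months):
        # Agents cannot take more fish than the lake holds.
        stock -= min(stock, harvest_per_agent * n_agents)
        if stock < collapse_below:
            return month  # the resource collapsed during this month
        stock = min(capacity, stock * growth)
    return months  # the resource survived the whole horizon


if __name__ == "__main__":
    for harvest in (10, 15, 20):
        print(harvest, simulate(harvest))
```

With these assumed numbers, a restrained harvest keeps the stock at capacity indefinitely, while a slightly greedier one exhausts it within two months, which is the tension between individual gain and the greater good that the excerpt describes.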

agent/multi - agent 2024.07.23

mem0 (personalized LLM): code review of the memory implementation

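How is Mem0 different from RAG? Mem0's memory implementation for large language models (LLMs) offers several advantages over retrieval-augmented generation (RAG): Entity Relationships: Unlike RAG, which retrieves information from static documents, Mem0 can understand and relate entities across different interactions. This leads to a deeper understanding of context and relationships. Recency, Relevancy, and Decay: Mem0 prioritizes recent interactions and gradually forgets outdated information, keeping memory relevant and up to date for more accurate responses. Contextual Continuity: Mem0 retains information across sessions, maintaining continuity in conversations and interactions. This..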
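One way to picture the "Recency, Relevancy, and Decay" idea above is to discount each memory's relevance by its age. The sketch below is an assumption made for illustration and does not use or reproduce Mem0's API: the `Memory` class, the `decayed_score` and `rank` helpers, and the 30-day half-life are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    relevance: float  # similarity to the query, assumed precomputed in [0, 1]
    age_days: float   # time since the interaction was stored


def decayed_score(memory: Memory, half_life_days: float = 30.0) -> float:
    """Relevance halved for every `half_life_days` of age (assumed scheme)."""
    return memory.relevance * 0.5 ** (memory.age_days / half_life_days)


def rank(memories: List[Memory], half_life_days: float = 30.0) -> List[Memory]:
    """Order memories so that recent, relevant ones come first."""
    return sorted(memories,
                  key=lambda m: decayed_score(m, half_life_days),
                  reverse=True)


if __name__ == "__main__":
    store = [
        Memory("prefers window seats", relevance=0.9, age_days=90),
        Memory("now prefers aisle seats", relevance=0.6, age_days=0),
    ]
    for m in rank(store):
        print(round(decayed_score(m), 4), m.text)
```

In this toy store, the older memory is a closer match to the query but ranks below the fresh one, because three half-lives have reduced its score to one eighth.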

Uncategorized 2024.07.22

my paper

hallucination, toxic content, faulty code. A problem-solving process is a sequence of iterative stages within a human group (Bransford & Stein, 1993). This research aims to solve multi-step reasoning tasks using a multi-agent framework. Despite the rapid advancement of LLMs, their shortcomings in areas such as mathematical computation, reasoning, and factuality interrupt application; lacks in esp..

Uncategorized 2024.07.22

ProAgent: Building Proactive Cooperative Agents with Large Language Models (paper review)

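It comprises four modules: Planner, Verificator, Controller, and Memory. It has a Belief Correction mechanism. It predicts the intentions of the agents taking part in the multi-agent setting and carries out adaptive cooperative reasoning and planning (without fine-tuning or pre-training).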

agent/multi - agent 2024.07.22

On scalable oversight with weak LLMs judging strong LLMs (paper review)

https://arxiv.org/pdf/2407.04622 Starting point: the work starts from the idea (AI safety via debate, arXiv) of having a judge model choose the correct answer through a debate between two AIs. The hope is that, as with Nash equilibria in debate, both AIs will tell the truth to the judge AI in the most convincing way. 1. Introduction. Type 1. Extractive: a question with two answer choices, plus the original source article; but the judge model can't see the article -> information asymmetry. 2. Closed: only a question and two answer choices. 3. Multimodal: images ..
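The extractive setting above, where debaters can read the article but the judge cannot, can be sketched as a small protocol loop. This is a hedged illustration, not the paper's code: `run_debate`, `quoting_debater`, `quote_counting_judge`, and the `<quote>` tag convention are hypothetical stand-ins, and real debaters and judges would be LLM calls.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: a debater sees the article, the judge does not.
Debater = Callable[[str, str, str, List[str]], str]
Judge = Callable[[str, Tuple[str, str], List[str]], int]


def run_debate(question: str, answers: Tuple[str, str], article: str,
               debater: Debater, judge: Judge,
               rounds: int = 2) -> Tuple[int, List[str]]:
    """Alternate arguments for each answer, then let the judge pick one."""
    transcript: List[str] = []
    for _ in range(rounds):
        for side, answer in enumerate(answers):
            argument = debater(question, answer, article, transcript)
            transcript.append(f"Debater {side}: {argument}")
    # Information asymmetry: the article is never passed to the judge.
    return judge(question, answers, transcript), transcript


def quoting_debater(question: str, answer: str, article: str,
                    transcript: List[str]) -> str:
    """Toy debater: quotes the article only if it supports its answer."""
    if answer in article:
        return f"'{answer}' is supported: <quote>{article}</quote>"
    return f"'{answer}' is the better reading."


def quote_counting_judge(question: str, answers: Tuple[str, str],
                         transcript: List[str]) -> int:
    """Toy weak judge: prefers the side that produced more quotes."""
    counts = [sum(1 for line in transcript
                  if line.startswith(f"Debater {i}:") and "<quote>" in line)
              for i in range(2)]
    return 0 if counts[0] >= counts[1] else 1


if __name__ == "__main__":
    choice, log = run_debate("When did the lake freeze?",
                             ("in March", "in June"),
                             "The lake froze in March.",
                             quoting_debater, quote_counting_judge)
    print(choice, len(log))
```

The closed setting from the excerpt would correspond to passing an empty article, so that neither side has privileged evidence.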

inference-time, RLHF/scalable oversight 2024.07.22

Weak-to-Strong Reasoning paper review (supervision when training Llama3-70b with Llama3-8b-instruct)

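“Full Weak FT” refers to the results of the baseline where the strong model is naively fine-tuned on the full dataset generated by the weak model. The strong model is fine-tuned by the weak model. Within the current weak-to-strong reasoning framework, effective methods are lacking that go beyond naive fine-tuning to prevent overfitting to the weak model's errors and to further elicit the strong model's intrinsic reasoning ability. In the first stage, it is assumed that using a smaller amount of data that is more likely to be correct is more beneficial. Data generated by the weak model; the strong model with ICL ..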

Uncategorized 2024.07.22

이진욱's blog
