Tool-Augmented Reward Modeling: Paper Review


jinuklee 2024. 9. 14. 19:47

Our approach enhances RMs with the capability to make informed and dynamic decisions concerning which APIs to employ, when to invoke them, what arguments to pass, and how to effectively integrate the obtained results into the broader reasoning process.

• Thought: At this initial stage, the model evaluates whether it should engage external APIs (referred to as tool reasoning).

• Action: Subsequently, the model generates the necessary API calls along with the corresponding arguments required for the interactions.

• Observation: The results produced by the external APIs are collected and stored.

• Rationale: This stage involves the aggregation and synthesis of previously acquired information, fostering both induction and reasoning processes, specifically tailored for reward modeling.
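
The four stages above can be sketched as a small pipeline. This is only a toy illustration, not the paper's implementation: in the paper the reward model itself generates each stage, whereas here simple rules stand in for it. The `calculator` tool, the `Trace` record, the `score_answer` function, and the reward values (1.0, 0.0, 0.5) are all assumptions made for this example.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical toy tool: evaluates "<int> <op> <int>" only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def calculator(expression: str) -> str:
    left, op, right = expression.split()
    return str(OPS[op](int(left), int(right)))


# Registry of available tools (an assumption; the paper uses real APIs).
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


@dataclass
class Trace:
    thought: str
    action: Optional[Tuple[str, str]]  # (tool name, argument)
    observation: Optional[str]
    rationale: str
    reward: float


def looks_like_arithmetic(text: str) -> bool:
    parts = text.split()
    return (
        len(parts) == 3
        and parts[1] in OPS
        and parts[0].isdigit()
        and parts[2].isdigit()
    )


def score_answer(question: str, answer: str) -> Trace:
    # Thought: decide whether an external tool is needed.
    # (Rule-based stand-in for the model's tool reasoning.)
    if not looks_like_arithmetic(question):
        return Trace(
            thought="no tool needed",
            action=None,
            observation=None,
            rationale="no external evidence; neutral score",
            reward=0.5,
        )

    # Action: pick the tool and the argument to pass to it.
    action = ("calculator", question)

    # Observation: call the tool and store its result.
    observation = TOOLS[action[0]](action[1])

    # Rationale: combine the observation with the answer being judged.
    correct = observation == answer.strip()
    rationale = (
        f"tool returned {observation}; "
        f"answer {'matches' if correct else 'differs'}"
    )
    return Trace(
        thought="arithmetic question; call calculator",
        action=action,
        observation=observation,
        rationale=rationale,
        reward=1.0 if correct else 0.0,
    )
```

Calling `score_answer("12 * 7", "84")` walks through all four stages and returns a `Trace` whose fields mirror Thought, Action, Observation, and Rationale, so the score can be traced back to the tool result that produced it.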
