Self-Refine: Iterative Refinement with Self-Feedback (paper review)
jinuklee
2024. 7. 14. 01:39
25 May 2023
Key point: self-provided feedback
Generate an initial output using an LLM; the same LLM then provides feedback on its own output and uses that feedback to refine the output, iterating until a stopping condition is met. The stopping condition stop(fb_t, t) either stops at a specified timestep t, or extracts a stopping indicator (e.g. a scalar stop score) from the feedback fb_t.
To inform the model about the previous iterations, we retain the history of previous feedback and outputs by appending them to the prompt.
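
The loop described above can be sketched in Python as follows. This is only a minimal illustration, not the paper's implementation: the `llm` callable (prompt string in, completion string out), the prompt wording, the "Score: <0-10>" convention, and the `max_steps` and `threshold` defaults are all assumptions made for this example.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def stop(feedback: str, t: int, max_steps: int = 4, threshold: float = 9.0) -> bool:
    """stop(fb_t, t): stop at a fixed timestep, or when a scalar score
    extracted from the feedback reaches the threshold (assumed format)."""
    if t >= max_steps:
        return True
    match = re.search(r"score:\s*(\d+(?:\.\d+)?)", feedback, re.IGNORECASE)
    return match is not None and float(match.group(1)) >= threshold


def self_refine(task: str, llm: LLM, max_steps: int = 4) -> Tuple[str, List[Tuple[str, str]]]:
    """Generate, then alternate feedback and refinement with the same LLM."""
    output = llm(f"Task: {task}\nAnswer:")  # initial output
    history: List[Tuple[str, str]] = []     # (output, feedback) pairs so far
    for t in range(1, max_steps + 1):
        feedback = llm(
            f"Task: {task}\nAnswer: {output}\n"
            "Give feedback and end with 'Score: <0-10>'."
        )
        if stop(feedback, t, max_steps):
            break
        history.append((output, feedback))
        # Append all previous outputs and feedback to the refinement prompt.
        past = "\n".join(f"Previous answer: {o}\nFeedback: {f}" for o, f in history)
        output = llm(f"Task: {task}\n{past}\nRefined answer:")
    return output, history
```

Passing the same `llm` for generation, feedback, and refinement mirrors the point that no separate critic model is involved; only the prompt changes between the three roles.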