Paper reference list
LLM-deliberation: Evaluating LLMs with interactive multi-agent negotiation game (2024)
GPT-4 technical report (2023)
Qwen-VL: A frontier large vision-language model with versatile abilities (2023)
Language models are few-shot learners (2020)
ChatEval: Towards better LLM-based evaluators through multi-agent debate (2024)
MiniGPT-v2: Large language model as a unified interface for vision-language multi-task learning (2023)
Octopus v2: On-device language model for super agent (2024)
Octopus v3: Technical report for on-device sub-billion multimodal AI agent (2024)
Octopus v4: Graph of language models (2024)
AgentVerse: Facilitating multi-agent collaboration and exploring emergent behaviors (2024)
SeeClick: Harnessing GUI grounding for advanced visual GUI agents (2024)
Holistic analysis of hallucination in GPT-4V(ision): Bias and interference challenges (2023)
InstructBLIP: Towards general-purpose vision-language models with instruction tuning (2023)
Mind2Web: Towards a generalist agent for the web (2023)
Detecting and preventing hallucinations in large vision language models (2024)
A real-world WebAgent with planning, long context understanding, and program synthesis (2024)
MetaGPT: Meta programming for a multi-agent collaborative framework (2024)
CogAgent: A visual language model for GUI agents (2023)
More agents is all you need (2024)
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration (2024)
Automated Unit Test Improvement using Large Language Models (2024)
Human-level play in the game of Diplomacy by combining language models with strategic reasoning (2022)
AgentScope: A Flexible yet Robust Multi-Agent Platform (2024)
Experiential Co-Learning of Software-Developing Agents (2024)
Communicative Agents for Software Development (2023)
Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy (2024)
Generative Agents: Interactive Simulacra of Human Behavior (2023)
Learning to Decode Collaboratively with Multiple Language Models (2024)
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents (2024)
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework (2024)
LLM Agent Operating System (2024)
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution (2024)
AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models (2024)
Scaling Instructable Agents Across Many Simulated Worlds (2024)
Evolutionary Optimization of Model Merging Recipes (2024)
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models (2024)
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents (2024)
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation (2023)
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks (2024)
CulturePark: Boosting Cross-cultural Understanding in Large Language Models (2024)
Constitutional AI: Harmlessness from AI Feedback (2022)
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration (2024)
Mixture-of-Agents Enhances Large Language Model Capabilities (2024)
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments (2024)
CodeR: Issue Resolving with Multi-Agent and Task Graphs (2024)
EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms (2024)
Scaling Synthetic Data Creation with 1,000,000,000 Personas (2024)
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts (2024)
RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing (2024)
On scalable oversight with weak LLMs judging strong LLMs (2024)
AgentInstruct: Toward Generative Teaching with Agentic Flows (2024)
Internet of Agents (2024)
Mobile-Agent: Autonomous multi-modal mobile device agent with visual perception (2024)
AutoDroid: LLM-powered task automation in Android (2023)
Language agents with reinforcement learning for strategic play in the Werewolf game (2024)
GPT-4V in Wonderland: Large multimodal models for zero-shot smartphone GUI navigation (2023)
WebShop: Towards scalable real-world web interaction with grounded language agents (2022)
mPLUG-Owl: Modularization empowers large language models with multimodality (2023)
mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration (2023)
UFO: A UI-focused agent for Windows OS interaction (2024)
AppAgent: Multimodal agents as smartphone users (2023)
Exploring collaboration mechanisms for LLM agents: A social psychology view (2023)
TinyChart: Efficient chart understanding with visual token merging and program-of-thoughts learning (2024)
ExpeL: LLM agents are experiential learners (2024)
GPT-4V(ision) is a generalist web agent, if grounded (2024)
mPLUG-PaperOwl: Scientific diagram analysis with the multimodal large language model (2023)
mPLUG-DocOwl 1.5: Unified structure learning for OCR-free document understanding (2024)
ModelScope-Agent: Building your customizable agent system with open-source large language models (2023)
CAMEL: Communicative agents for "mind" exploration of large language model society (2023)
Evaluating object hallucination in large vision-language models (2023)
Aligning large multi-modal model with robust instruction tuning (2023)
Improved baselines with visual instruction tuning (2023)
Visual instruction tuning (2023)
LLaVA-Plus: Learning to use tools for creating multimodal agents (2023)
Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection (2023)
Dynamic LLM-agent network: An LLM-agent collaboration framework with agent team optimization (2023)
Welfare diplomacy: Benchmarking language model cooperation (2024)
Small LLMs are weak tool learners: A multi-LLM agent (2024)
DebateGPT: Fine-tuning large language models with multi-agent debate supervision (2024)
Multi-agent collaboration: Harnessing the power of intelligent LLM agents (2023)
Chain-of-discussion: A multi-model framework for complex evidence-based question answering (2024)