Agents

Papers in Agents

44 papers

MAXS: Meta-Adaptive Exploration with LLM Agents
Develop a reasoning framework using LLM agents for stable, efficient multi-tool integration with proven performance gains.
Agents | Viability: 8.0
TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration
Develop an adaptive reasoning router for seamless and robust multi-agent collaboration in enterprise applications.
Agents | Viability: 8.0
Fuzzy Categorical Planning: Autonomous Goal Satisfaction with Graded Semantic Constraints
Develop a fuzzy categorical planning tool to optimize vague predicates in autonomous planning using semantic constraints.
Agents | Viability: 5.0
JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG
Develop a joint optimization framework for dynamic agentic RAG workflows to improve performance and efficiency.
Agents | Viability: 6.0
The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments
Develop AI agents for workplace tasks by mastering tool use, planning, and adaptability in e-commerce RL environments.
Agents | Viability: 6.0
Internal Representations as Indicators of Hallucinations in Agent Tool Selection
Develop a real-time hallucination detection tool for LLM-based agents to ensure reliable tool usage and security.
Agents | Viability: 6.0
When Agents Fail to Act: A Diagnostic Framework for Tool Invocation Reliability in Multi-Agent LLM Systems
Evaluate and enhance tool-use reliability in multi-agent LLM systems with our diagnostic framework.
Agents | Viability: 6.0
Beyond Static Summarization: Proactive Memory Extraction for LLM Agents
Develop an iterative cognitive process for proactive memory extraction to improve QA accuracy in LLM agents.
Agents | Viability: 3.0
Autonomous Agents on Blockchains: Standards, Execution Models, and Trust Boundaries
Develop secure interfaces for autonomous agents executing on blockchains to ensure safe and robust transactions.
Agents | Viability: 3.0
AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement
AutoRefine enhances LLM agents by transforming execution histories into procedural and static knowledge, significantly improving efficiency and coordination.
Agents | Viability: 7.0
CUA-Skill: Develop Skills for Computer Using Agent
CUA-Skill provides a structured skills library for autonomous computer-using agents to enhance their efficiency and reliability.
Agents | Viability: 8.0
Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments
Proactive task-oriented agents that adapt to shifts in user intent in dynamic environments.
Agents | Viability: 7.0
Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models
Create smarter AI agents by using adaptive lookahead world models for complex task planning.
Agents | Viability: 5.0
TopoDIM: One-shot Topology Generation of Diverse Interaction Modes for Multi-Agent Systems
TopoDIM optimizes multi-agent communication topology to enhance efficiency and performance with decentralized, adaptive interaction modes.
Agents | Viability: 7.0
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration
AOrchestra automates sub-agent creation for efficient multi-agent orchestration, improving adaptability and performance in task automation.
Agents | Viability: 7.0
Role-Playing Agents Driven by Large Language Models: Current Status, Challenges, and Future Trends
Develop advanced role-playing language agents leveraging large language models for immersive human-computer interaction experiences.
Agents | Viability: 5.0
MATA: Multi-Agent Framework for Reliable and Flexible Table Question Answering
MATA provides a multi-agent framework for efficient and scalable Table Question Answering using small open-source models.
Agents | Viability: 6.0
ShopSimulator: Evaluating and Exploring RL-Driven LLM Agent for Shopping Assistants
An RL-driven LLM shopping assistant that improves product search and personalization in e-commerce.
Agents | Viability: 7.0
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering
Build autonomous AI agents with ultra-long-horizon capabilities for complex machine learning engineering tasks.
Agents | Viability: 7.0
GAIA: A Data Flywheel System for Training GUI Test-Time Scaling Critic Models
GAIA offers a self-improving training framework for GUI agents, enhancing action accuracy with iterative data refinement.
Agents | Viability: 7.0
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
Boundary-Aware Policy Optimization enhances reliability for LLM-driven agentic search by teaching AI to recognize its knowledge limits.
Agents | Viability: 8.0
Deep Researcher with Sequential Plan Reflection and Candidates Crossover (Deep Researcher Reflect Evolve)
Create dynamic research tools for generating detailed PhD-level reports using innovative sequential plan refinement and candidate crossover techniques.
Agents | Viability: 7.0
ReCreate: Reasoning and Creating Domain Agents Driven by Experience
ReCreate automates the creation of domain-specific agents by leveraging agent interaction histories for improved performance.
Agents | Viability: 3.0
A Lightweight Modular Framework for Constructing Autonomous Agents Driven by Large Language Models: Design, Implementation, and Applications in AgentForge
AgentForge is an open-source Python framework simplifying the creation and deployment of LLM-driven autonomous agents.
Agents | Viability: 8.0
Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning
Temp-R1 is a state-of-the-art autonomous agent for Temporal Knowledge Graph Question Answering trained through reverse curriculum reinforcement learning.
Agents | Viability: 5.0
FormalJudge: A Neuro-Symbolic Paradigm for Agentic Oversight
Develop a neuro-symbolic framework for ensuring behavioral safety of LLM-based agents through formal verification.
Agents | Viability: 7.0
Emerging from Ground: Addressing Intent Deviation in Tool-Using Agents via Deriving Real Calls into Virtual Trajectories
RISE improves intent alignment in LLM tool-using agents by synthesizing virtual trajectories for efficient fine-tuning.
Agents | Viability: 7.0
MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks
MAS-Orchestra aims to improve multi-agent system design and evaluation through holistic orchestration and controlled benchmarks.
Agents | Viability: 6.0
Insight Agents: An LLM-Based Multi-Agent System for Data Insights
Insight Agents delivers quick, personalized business insights to e-commerce sellers, leveraging a structured multi-agent LLM system.
Agents | Viability: 7.0
MemCtrl: Using MLLMs as Active Memory Controllers on Embodied Agents
MemCtrl is a framework for enhancing embodied agents with efficient memory control using Multimodal Large Language Models.
Agents | Viability: 6.0
LLM-in-Sandbox Elicits General Agentic Intelligence
LLM-in-Sandbox enables large language models to perform tasks beyond code through a virtual exploration environment.
Agents | Viability: 7.0
SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents
Develop SWE-AGI as a benchmark for evaluating LLMs in specification-driven software construction using MoonBit.
Agents | Viability: 5.0
Agentic Uncertainty Quantification
Develop a dual-process AI framework to enhance agent reliability by transforming uncertainty into active control signals.
Agents | Viability: 6.0
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience
EvoCUA: A scalable evolutionary learning agent for automating complex computer-use tasks with a high success rate.
Agents | Viability: 6.0
OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
OmegaUse is a GUI agent model designed for streamlined autonomous task execution across mobile and desktop platforms.
Agents | Viability: 7.0
Belief in Authority: Impact of Authority in Multi-Agent Evaluation Framework
Develop AI systems that optimize multi-agent interactions by understanding authority bias using role-based frameworks.
Agents | Viability: 5.0
Learning Latent Action World Models In The Wild
Develop latent action world models from diverse in-the-wild videos for improved real-world planning tasks.
Agents | Viability: 2.0
Sentipolis: Emotion-Aware Agents for Social Simulations
Sentipolis enhances social simulation agents by integrating emotion dynamics, improving emotional continuity and communication.
Agents | Viability: 3.0
NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents
NEMO transforms natural language decision problem descriptions into executable mathematical optimization implementations using autonomous coding agents.
Agents | Viability: 7.0
DynaWeb: Model-Based Reinforcement Learning of Web Agents
DynaWeb leverages model-based reinforcement learning to train web agents in simulated environments, enhancing the efficiency and scalability of autonomous internet navigation.
Agents | Viability: 5.0
GLOVE: Global Verifier for LLM Memory-Environment Realignment
Develop a system that enhances LLM memory reliability in dynamic environments by detecting and realigning outdated or conflicting data.
Agents | Viability: 7.0
Multi-Agent Procedural Graph Extraction with Structural and Logical Refinement
A multi-agent framework for extracting procedural graphs from natural language with improved structural and logical accuracy.
Agents | Viability: 4.0
Agentic Design Patterns: A System-Theoretic Framework
Develop a system-theoretic framework for robust AI agent design to improve modularity and reliability.
Agents | Viability: 3.0
Toward Learning POMDPs Beyond Full-Rank Actions and State Observability
Develop a method for learning observation and transition matrices in POMDPs beyond full-rank assumptions.
Agents | Viability: 2.0