State of the Field
Recent work on agent-based systems focuses on reliability, efficiency, and adaptability in complex tasks across domains. Avenir-Web targets long-horizon tasks on dynamic web interfaces by combining grounding techniques with adaptive memory. Learning to Share applies selective memory mechanisms to parallel agentic systems, reducing computational overhead while maintaining performance. Boundary-Aware Policy Optimization addresses reliability in reinforcement learning by promoting accurate self-assessment, encouraging agents to acknowledge their limitations. AgentForge, a modular framework, lowers the barrier to building autonomous agents and supports rapid prototyping and deployment. Together, these efforts target commercial problems in automation, data mining, and user interaction, with the goal of agent systems that operate robustly and efficiently in real-world applications.
Papers
Learning to Share: Selective Memory for Efficient Parallel Agentic Systems
Agentic systems solve complex tasks by coordinating multiple agents that iteratively reason, invoke tools, and exchange intermediate results. To improve robustness and solution quality, recent approac...
Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution
Multimodal Large Language Model (MLLM) agents facilitate Graphical User Interface (GUI) automation but struggle with long-horizon, cross-application tasks due to limited context windows. While memory ...
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
Agent Skills are structured packages of procedural knowledge that augment LLM agents at inference time. Despite rapid adoption, there is no standard way to measure whether they actually help. We prese...
TCAndon-Router: Adaptive Reasoning Router for Multi-Agent Collaboration
Multi-Agent Systems (MAS) have become a powerful paradigm for building high-performance intelligent applications. Within these systems, the router responsible for determining which expert agents should...
CUA-Skill: Develop Skills for Computer Using Agent
Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to scale and lag behind human performance. A ...
M²-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
Graphical User Interface (GUI) agents are pivotal to advancing intelligent human-computer interaction paradigms. Constructing powerful GUI agents necessitates the large-scale annotation of high-quality ...
A Lightweight Modular Framework for Constructing Autonomous Agents Driven by Large Language Models: Design, Implementation, and Applications in AgentForge
The emergence of LLMs has catalyzed a paradigm shift in autonomous agent development, enabling systems capable of reasoning, planning, and executing complex multi-step tasks. However, existing agent f...
ProAct: Agentic Lookahead in Interactive Environments
Existing Large Language Model (LLM) agents struggle in interactive environments requiring long-horizon planning, primarily due to compounding errors when simulating future states. To address this, we ...
Avenir-Web: Human-Experience-Imitating Multimodal Web Agents with Mixture of Grounding Experts
Despite advances in multimodal large language models, autonomous web agents still struggle to reliably execute long-horizon tasks on complex and dynamic web interfaces. Existing agents often suffer fr...
Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity
LLM-based multi-agent systems (MAS) have emerged as a promising approach to tackle complex tasks that are difficult for individual LLMs. A natural strategy is to scale performance by increasing the nu...