LLM Agents

Trending

16 papers

6.3 viability

+100% 30d

Papers

Research Paper·Mar 22, 2026

LLM-Powered Workflow Optimization for Multidisciplinary Software Development: An Automotive Industry Case Study

Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate across incompatible formalisms and separate artifact sets. Today, even with AI coding assistants like...

8.0 viability

Research Paper·Mar 19, 2026

MoRI: Learning Motivation-Grounded Reasoning for Scientific Ideation in Large Language Models

Scientific ideation aims to propose novel solutions within a given scientific context. Existing LLM-based agentic approaches emulate human research workflows, yet inadequately model scientific reasoni...

7.0 viability·Has code

Research Paper·Mar 19, 2026

From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents

Anonymization is widely treated as a practical safeguard because re-identifying anonymous records was historically costly, requiring domain expertise, tailored algorithms, and manual corroboration. We...

7.0 viability

Research Paper·Mar 18, 2026

MemArchitect: A Policy Driven Memory Governance Layer

Persistent Large Language Model (LLM) agents expose a critical governance gap in memory management. Standard Retrieval-Augmented Generation (RAG) frameworks treat memory as passive storage, lacking me...

7.0 viability

Research Paper·Mar 18, 2026

Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety

Large Language Model (LLM) agents increasingly act through external tools, making their safety contingent on tool-call workflows rather than text generation alone. While recent benchmarks evaluate age...

7.0 viability

Research Paper·Mar 13, 2026

Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents

Test-time scaling has become a dominant paradigm for improving LLM agent reliability, yet current approaches treat compute as an abundant resource, allowing agents to exhaust token and tool budgets on...

7.0 viability

Research Paper·Mar 24, 2026

Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments

Large language models (LLMs) have enabled agentic systems that can reason, plan, and act across complex tasks, but it remains unclear whether they can allocate resources effectively under uncertainty....

7.0 viability

Research Paper·Mar 9, 2026

One Model Is Enough: Native Retrieval Embeddings from LLM Agent Hidden States

LLM agents that retrieve external knowledge typically generate a search query as text, then run a separate embedding model to encode it into a vector. This two-model pipeline adds infrastructure compl...

7.0 viability

Research Paper·Mar 19, 2026

ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs

Tool-augmented large language models (LLMs) must tightly couple multi-step reasoning with external actions, yet existing benchmarks often confound this interplay with complex environment dynamics, mem...

7.0 viability

Research Paper·Mar 23, 2026

EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning

Scientific idea generation is a cornerstone of autonomous knowledge discovery, yet the iterative evolution required to transform initial concepts into high-quality research proposals remains a formida...

7.0 viability
