State of the Field
Current research in AI theory increasingly focuses on the mechanisms behind reasoning and learning in neural networks, particularly large language models (LLMs) and deep networks. Recent work has shed light on emergent analogical reasoning in Transformers, showing how geometric alignment and categorical functors support the transfer of relational structure across domains. Meanwhile, investigations into chain-of-thought reasoning have identified fundamental limits on inference-time compute, underscoring the need to manage reasoning tokens efficiently as input sizes grow. In parallel, studies of deep Jacobian dynamics are clarifying the implicit biases that arise during training, while advances in Bayesian network learning are refining our understanding of the complexity of structure learning. Collectively, these efforts address practical challenges, such as model performance and interpretability, that are essential for deploying AI systems in real-world applications across industries.
Papers
1–8 of 8
A Mathematical Theory of Agency and Intelligence
To operate reliably under changing conditions, complex systems require feedback on how effectively they use resources, not just whether objectives are met. Current AI systems process vast information ...
Can machines be uncertain?
The paper investigates whether and how AI systems can realize states of uncertainty. By adopting a functionalist and behavioral perspective, it examines how symbolic, connectionist and hybrid architec...
On the Expressive Power of Transformers for Maxout Networks and Continuous Piecewise Linear Functions
Transformer networks have achieved remarkable empirical success across a wide range of applications, yet their theoretical expressive power remains insufficiently understood. In this paper, we study t...
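As background for the CPWL setting this abstract refers to: a maxout unit takes the pointwise maximum over several affine maps, so any network built from such units computes a continuous piecewise-linear (CPWL) function. A minimal numerical sketch (the weights and the 1-D example are hypothetical, not from the paper):

```python
import numpy as np

def maxout(x, W, b):
    """Maxout unit: the maximum over k affine maps W[i] @ x + b[i].

    The pointwise maximum of affine functions is convex and
    continuous piecewise linear (CPWL).
    """
    return np.max(W @ x + b)

# Hypothetical 1-D unit with k = 3 affine pieces.
W = np.array([[-1.0], [0.0], [1.0]])   # slopes -1, 0, 1
b = np.array([0.0, 0.5, 0.0])          # intercepts
# This unit computes max(-x, 0.5, x): a flat-bottomed "V" shape.
ys = [float(maxout(np.array([x]), W, b)) for x in (-2.0, 0.0, 2.0)]
# ys == [2.0, 0.5, 2.0]
```

Each affine piece is linear, and taking the max only introduces kinks where two pieces cross, which is why the overall map stays continuous and piecewise linear.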
The Complexity of Bayesian Network Learning: Revisiting the Superstructure
We investigate the parameterized complexity of Bayesian Network Structure Learning (BNSL), a classical problem that has received significant attention in both empirical and purely theoretical studies....
Reasoning about Reasoning: BAPO Bounds on Chain-of-Thought Token Complexity in LLMs
Inference-time scaling via chain-of-thought (CoT) reasoning is a major driver of state-of-the-art LLM performance, but it comes with substantial latency and compute costs. We address a fundamental the...
Power and Limitations of Aggregation in Compound AI Systems
When designing compound AI systems, a common approach is to query multiple copies of the same model and aggregate the responses to produce a synthesized output. Given the homogeneity of these models, ...
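The query-and-aggregate pattern the abstract describes can be sketched concretely. A common instantiation is majority voting over sampled responses (the `query_model` stand-in and the sample strings below are hypothetical, not from the paper):

```python
from collections import Counter

def aggregate_majority(responses):
    """Aggregate responses from multiple copies of the same model
    by majority vote; ties break toward the earliest response.
    """
    counts = Counter(responses)
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical final answers from 5 queries to the same model.
samples = ["42", "41", "42", "42", "17"]
final = aggregate_majority(samples)  # "42"
```

Because the copies are identically distributed, aggregation can only amplify whatever signal the base model already carries, which is the kind of power-versus-limitation question the paper formalizes.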
Emergent Analogical Reasoning in Transformers
Analogy is a central faculty of human intelligence, enabling abstract patterns discovered in one domain to be applied to another. Despite its central role in cognition, the mechanisms by which Transfo...
Why Deep Jacobian Spectra Separate: Depth-Induced Scaling and Singular-Vector Alignment
Understanding why gradient-based training in deep networks exhibits strong implicit bias remains challenging, in part because tractable singular-value dynamics are typically available only for balance...
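The spectral separation the abstract alludes to can be illustrated in the simplest tractable case, a deep linear network, whose input-output Jacobian is just the product of its weight matrices. A minimal sketch under that assumption (random Gaussian weights; widths and depths are arbitrary choices, not the paper's setting):

```python
import numpy as np

rng = np.random.default_rng(0)

def jacobian_spectrum(depth, width=50):
    """Singular values of the input-output Jacobian of a deep
    *linear* network: the product of its weight matrices.
    Weights are i.i.d. Gaussian, scaled by 1/sqrt(width).
    """
    J = np.eye(width)
    for _ in range(depth):
        J = (rng.standard_normal((width, width)) / np.sqrt(width)) @ J
    return np.linalg.svd(J, compute_uv=False)  # sorted descending

# As depth grows, the top singular value pulls away from the bulk:
# the spectrum "separates" as layers compose.
shallow = jacobian_spectrum(depth=2)
deep = jacobian_spectrum(depth=20)
ratio_shallow = shallow[0] / np.median(shallow)
ratio_deep = deep[0] / np.median(deep)
```

For products of random matrices, the log-singular values spread at rates set by distinct Lyapunov exponents, so `ratio_deep` exceeds `ratio_shallow` by a wide margin; the paper studies how such depth-induced scaling and singular-vector alignment behave under training.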