AI Models Comparison Hub
3 papers - avg viability 4.0
Top Papers
- Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts (5.0)
Develop HypeNet, a hybrid RNN-attention architecture, to enhance long-context performance and efficiency using minimal data.
- The Information Geometry of Softmax: Probing and Steering (5.0)
Improve concept manipulation in AI models using Dual Steering with information geometry.
- Poly-attention: a general scheme for higher-order self-attention (2.0)
Develop a new quadratic-time attention mechanism for higher-order token interactions in AI models.