Papers
Research Paper·Mar 9, 2026
High-Fidelity Pruning for Large Language Models
Large Language Models (LLMs) have demonstrated exceptional performance across a wide range of tasks, yet their significant computational and memory requirements present major challenges for deployment...
8.0 viability
Research Paper·Mar 9, 2026
Deterministic Differentiable Structured Pruning for Large Language Models
Structured pruning reduces LLM inference cost by removing low-importance architectural components. This can be viewed as learning a multiplicative gate for each component under an l0 sparsity constrai...
7.0 viability
Research Paper·Mar 6, 2026
ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning
Pruning is widely recognized as an effective method for reducing the parameters of large language models (LLMs), potentially leading to more efficient deployment and inference. One classic and promine...
7.0 viability