LLM Compression Comparison Hub
3 papers - average viability 4.3
Top Papers
- Shorter Thoughts, Same Answers: Difficulty-Scaled Segment-Wise RL for CoT Compression (7.0)
Compress chain-of-thought reasoning traces with difficulty-aware reinforcement learning to reduce token cost without sacrificing answer quality.
- Leech Lattice Vector Quantization for Efficient LLM Compression (3.0)
Compress large language models with vector quantization built on the 24-dimensional Leech lattice, exploiting its dense packing structure.
- Only relative ranks matter in weight-clustered large language models (3.0)
Compress large language models via weight clustering, showing that only the relative ranking of weights needs to be preserved.