LLM Inference Comparison Hub
4 papers, average viability 5.8
Top Papers
- ArcLight: A Lightweight LLM Inference Architecture for Many-Core CPUs (7.0)
ArcLight is a CPU-optimized LLM inference architecture that maximizes throughput on many-core CPUs by minimizing cross-NUMA memory access overhead.
- Decoupled Reasoning with Implicit Fact Tokens (DRIFT): A Dual-Model Framework for Efficient Long-Context Inference (6.0)
DRIFT offers an efficient dual-model framework to enhance LLMs' long-context reasoning by decoupling knowledge and inference processes.
- Differentially Private and Communication Efficient Large Language Model Split Inference via Stochastic Quantization and Soft Prompt (5.0)
A differentially private and communication-efficient LLM split-inference framework for resource-constrained devices, combining stochastic quantization with soft prompts.
- Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference (5.0)
Improves the accuracy-cost tradeoff of LLM inference by applying particle filtering to parallel reasoning chains.
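The split-inference entry above names stochastic quantization, a standard trick for cutting communication cost without introducing bias. The papers' internals aren't described here; as a generic illustration of the technique (function name and parameters are my own, not from the paper), a value is snapped to a coarse grid by rounding up with probability equal to the fractional remainder, so the quantized value equals the original in expectation:

```python
import math
import random

def stochastic_round(x, step=1.0, rng=random):
    """Unbiased stochastic rounding: snap x to a multiple of `step`,
    rounding up with probability equal to the fractional remainder,
    so that E[stochastic_round(x)] == x."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    if rng.random() < frac:
        lower += 1
    return lower * step

# Example: 0.3 quantized to the integer grid is 0.0 about 70% of the
# time and 1.0 about 30% of the time, averaging back to ~0.3.
rng = random.Random(0)
avg = sum(stochastic_round(0.3, 1.0, rng) for _ in range(50_000)) / 50_000
```

Because each quantized value needs fewer bits than the original float, this reduces the payload exchanged between device and server while keeping the estimator unbiased.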
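The "Reject, Resample, Repeat" entry frames parallel reasoning as particle filtering. The paper's exact algorithm isn't described here; as a generic sketch of the core resampling step (names and the scoring setup are my assumptions), candidate reasoning chains are redrawn with probability proportional to a score, so promising chains are duplicated and weak ones are dropped:

```python
import random

def resample(candidates, weights, k, rng=random):
    """Multinomial resampling: draw k candidates with replacement,
    with probability proportional to their weights (e.g. scores a
    verifier assigns to partial reasoning chains)."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(candidates, weights=probs, k=k)

# Example: three partial chains with verifier scores; the highest-scored
# chain dominates the resampled population.
chains = ["chain-a", "chain-b", "chain-c"]
scores = [0.1, 0.7, 0.2]
survivors = resample(chains, scores, k=8, rng=random.Random(1))
```

In an inference loop, this step would alternate with extending each surviving chain by a few tokens, concentrating compute on the chains most likely to reach a correct answer.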