Multi-head Latent Attention

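Multi-head Latent Attention (MLA) is an attention mechanism tracked as a research field in AI research papers. Introduced in the DeepSeek-V2 paper, it compresses keys and values into a low-rank latent vector to reduce the size of the key-value cache during inference.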