Language Models
Papers in Language Models
10 papers
- Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching (Viability: 8.0)
  A reward-guided diffusion language model for improved reasoning on math and coding tasks.
- Value-Aware Numerical Representations for Transformer Language Models (Viability: 6.0)
  Extends language models with value-aware token embeddings so they handle numerical data robustly.
- A.X K1 Technical Report (Viability: 6.0)
  A.X K1 is a scalable MoE language model with user-controlled reasoning that excels on Korean-language benchmarks.
- Kakugo: Distillation of Low-Resource Languages into Small Language Models (Viability: 8.0)
  A cost-effective distillation pipeline for building models in low-resource languages for under $50 per language.
- Towards robust long-context understanding of large language model via active recap learning (Viability: 3.0)
  An active recap learning framework that improves long-context understanding in language models through enhanced summarization mechanisms.
- Foundations of Global Consistency Checking with Noisy LLM Oracles (Viability: 3.0)
  A scalable framework for verifying linguistic consistency using noisy LLM-based evaluators.
- Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow (Viability: 5.0)
  Examines how far parallel token generation in masked diffusion language models can be pushed, and in what order tokens should be generated.
- Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling (Viability: 6.0)
  A Chinese language model built on low-resolution character images for improved efficiency and prediction.
- Emergence of Phonemic, Syntactic, and Semantic Representations in Artificial Neural Networks (Viability: 2.0)
  Investigates the stages in which linguistic representations emerge in neural networks, analogous to child language acquisition.
- Understanding the Reversal Curse Mitigation in Masked Diffusion Models through Attention and Training Dynamics (Viability: 2.0)
  Analyzes the attention and training dynamics by which masked diffusion models mitigate the reversal curse.
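The one-line summaries above do not specify any implementation details, but the idea behind "value-aware token embeddings" can be made concrete. Below is a purely illustrative sketch, assuming one scheme among many: all numbers share a single [NUM] token, and the number's magnitude is injected via a learned projection of its signed, log-scaled value. The names, dimensions, and scheme here are assumptions for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = {"[NUM]": 0, "apples": 1, "cost": 2}  # toy vocabulary
DIM = 8

token_emb = rng.normal(size=(len(VOCAB), DIM))  # ordinary lookup table
value_proj = rng.normal(size=DIM)               # maps a scalar value into embedding space

def embed(token, value=None):
    """Value-aware embedding: numbers share one [NUM] token, but the
    number's magnitude is added via a projection of its log-scaled value."""
    e = token_emb[VOCAB[token]].copy()
    if value is not None:
        scaled = np.sign(value) * np.log1p(abs(value))  # compress dynamic range
        e += scaled * value_proj
    return e

# Unlike plain subword tokenization, nearby values get nearby embeddings,
# while numerically distant values stay far apart.
d_close = np.linalg.norm(embed("[NUM]", 100) - embed("[NUM]", 101))
d_far = np.linalg.norm(embed("[NUM]", 100) - embed("[NUM]", 1_000_000))
assert d_close < d_far
```

The point of such schemes is that numeric similarity becomes geometric similarity in embedding space, rather than an accident of how digits happen to be tokenized.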
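Two of the entries above concern parallel token generation in masked diffusion language models. As a toy sketch of the general idea, the snippet below unmasks the k positions a (stand-in) denoiser is most confident about at each step, so a 12-token sequence is filled in 4 model calls instead of 12. The `toy_denoiser` and the fixed tokens-per-step schedule are invented stand-ins for illustration, not the papers' methods.

```python
import numpy as np

rng = np.random.default_rng(1)
MASK = -1
SEQ_LEN = 12

def toy_denoiser(seq):
    """Stand-in for a masked diffusion LM: for every position, return a
    'predicted token' and a confidence score (random here)."""
    preds = rng.integers(0, 100, size=SEQ_LEN)
    conf = rng.random(SEQ_LEN)
    return preds, conf

def parallel_decode(tokens_per_step=3):
    """Iteratively unmask the k masked positions with highest confidence."""
    seq = np.full(SEQ_LEN, MASK)
    steps = 0
    while (seq == MASK).any():
        preds, conf = toy_denoiser(seq)
        masked = np.where(seq == MASK)[0]
        chosen = masked[np.argsort(-conf[masked])][:tokens_per_step]
        seq[chosen] = preds[chosen]
        steps += 1
    return seq, steps

seq, steps = parallel_decode()
assert steps == 4               # ceil(12 / 3) denoiser calls, not 12
assert (seq != MASK).all()      # every position was eventually unmasked
```

The trade-off the "Limits Today, Potential Tomorrow" title alludes to is exactly this dial: larger `tokens_per_step` means fewer model calls but more tokens committed without seeing each other.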