Papers
Research Paper · Feb 18, 2026
References Improve LLM Alignment in Non-Verifiable Domains
While Reinforcement Learning with Verifiable Rewards (RLVR) has shown strong effectiveness in reasoning tasks, it cannot be directly applied to non-verifiable domains lacking ground-truth verifiers, s...
Viability: 6.0
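To make the verifier gap in this abstract concrete, here is a minimal, hypothetical Python sketch (not from the paper): RLVR presupposes a programmatic reward check, which exists for tasks like exact-match math answers but has no counterpart in open-ended domains. All function names here are illustrative assumptions.

```python
# Illustrative sketch of the RLVR reward gap (hypothetical, not from the paper).
# In a verifiable domain a programmatic check exists; in a non-verifiable one
# (e.g., open-ended writing), there is no ground truth to check against.

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward from an exact-match verifier (e.g., a math answer)."""
    return 1.0 if completion.strip() == ground_truth.strip() else 0.0

def non_verifiable_reward(completion: str) -> float:
    """No ground-truth verifier exists, so RLVR has nothing to optimize against."""
    raise NotImplementedError("no programmatic verifier for this domain")

assert verifiable_reward("42", "42") == 1.0
assert verifiable_reward("41", "42") == 0.0
```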
Research Paper · Feb 19, 2026
ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment
Activation steering, or representation engineering, offers a lightweight approach to align large language models (LLMs) by manipulating their internal activations at inference time. However, current m...
Viability: 5.0
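For context on the baseline this abstract describes, below is a minimal sketch of plain activation steering with a PyTorch forward hook: a fixed vector added to one layer's hidden states at inference time. This is the generic steering-vector technique, not ODESteer's ODE-based method; the model, layer choice, steering vector, and strength are all stand-in assumptions.

```python
# Minimal sketch of vanilla activation steering (generic technique, not the
# paper's method): add a steering vector to a transformer block's hidden
# states at inference time via a forward hook. Model/layer are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

layer = model.transformer.h[6]             # GPT-2 block to steer (arbitrary choice)
steer = torch.randn(model.config.n_embd)   # stand-in for a learned steering vector
alpha = 4.0                                # steering strength

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple; hidden states are the first element.
    hidden = output[0] + alpha * steer
    return (hidden,) + output[1:]

handle = layer.register_forward_hook(add_steering)
ids = tok("The movie was", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()  # detach the hook to restore the unsteered model
```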