LLMs, SHAP analysis, mechanistic interpretability, Llama, Gemma, Mistral
Top papers
- Model Medicine: A Clinical Framework for Understanding, Diagnosing, and Treating AI Models (7.0)
- Diagnosing Generalization Failures in Fine-Tuned LLMs: A Cross-Architectural Study on Phishing Detection (6.0)
- "I May Not Have Articulated Myself Clearly": Diagnosing Dynamic Instability in LLM Reasoning at Inference Time (3.0)