Papers
Research Paper·Mar 11, 2026
PEEM: Prompt Engineering Evaluation Metrics for Interpretable Joint Evaluation of Prompts and Responses
Prompt design is a primary control interface for large language models (LLMs), yet standard evaluations largely reduce performance to answer correctness, obscuring why a prompt succeeds or fails and p...
7.0 viability
Research Paper·Feb 25, 2026
Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem
Large language models consistently fail the "car wash problem," a viral reasoning benchmark requiring implicit physical constraint inference. We present a variable isolation study (n=20 per condition,...
4.0 viability
Research Paper·Mar 16, 2026
Prompt Readiness Levels (PRL): a maturity scale and scoring framework for production grade prompt assets
Prompt engineering has become a production critical component of generative AI systems. However, organizations still lack a shared, auditable method to qualify prompt assets against operational object...
2.0 viability