GPT-5.2
GPT-5.2 is a model in our research taxonomy.
Related papers
- AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises
- Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs: A Systematic Evaluation
- Even GPT-5.2 Can't Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs
- Evaluative Fingerprints: Stable and Systematic Differences in LLM Evaluator Behavior
- DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems
- Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis
- A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5
- Pencil Puzzle Bench: A Benchmark for Multi-Step Verifiable Reasoning
- Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning
- An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions