Security AI

4papers

5.5viability

-67%30d

State of the Field

Recent advancements in security AI are focusing on enhancing the detection and mitigation of vulnerabilities through innovative applications of large language models (LLMs) and adversarial techniques. Research is exploring the predictive capabilities of LLMs for identifying security bug reports, revealing a trade-off between sensitivity and precision that suggests a need for further refinement. Concurrently, new methods like WebSentinel are being developed to effectively detect and localize prompt injection attacks, addressing a significant gap in existing defenses. The field is also scrutinizing adversarial transferability, with efforts to establish standardized frameworks that evaluate the effectiveness of various attack strategies. Moreover, frameworks that integrate LLMs into security planning are being designed to minimize hallucination risks, thereby improving incident response times. Collectively, these developments aim to bolster security measures across commercial applications, emphasizing the importance of robust, reliable AI systems in an increasingly complex threat landscape.

Last updated Feb 27, 2026

Papers

1–4 of 4

Research Paper·Jan 30, 2026

Evaluating Large Language Models for Security Bug Report Prediction

Early detection of security bug reports (SBRs) is critical for timely vulnerability mitigation. We present an evaluation of prompt-based engineering and fine-tuning approaches for predicting SBRs usin...

7.0 viability

Research Paper·Feb 3, 2026·B2BConsumer

WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

Prompt injection attacks manipulate webpage content to cause web agents to execute attacker-specified tasks instead of the user's intended ones. Existing methods for detecting and localizing such atta...

7.0 viability

Research Paper·Feb 5, 2026·B2BGovernment

Hallucination-Resistant Security Planning with a Large Language Model

Large language models (LLMs) are promising tools for supporting security management tasks, such as incident response planning. However, their unreliability and tendency to hallucinate remain significa...

5.0 viability

Research Paper·Jan 30, 2026

Make Anything Match Your Target: Universal Adversarial Perturbations against Closed-Source MLLMs via Multi-Crop Routed Meta Optimization

Targeted adversarial attacks on closed-source multimodal large language models (MLLMs) have been increasingly explored under black-box transfer, yet prior methods are predominantly sample-specific and...

3.0 viability