An Empirical Study of the Imbalance Issue in Software Vulnerability Detection

Build This Paper

Estimated $10K - $14K over 6-10 weeks.

Founder's Pitch

"Optimize deep learning-based software vulnerability detection by addressing data imbalance issues using empirical findings."

Software Security

Score: 5

Commercial Viability Breakdown (0-10 scale)

High Potential: 2.5 (1/4 signals)

Quick Build: 5 (2/4 signals)

Series A Potential: 2.5 (1/4 signals)

Sources used for this analysis

arXiv Paper: Full-text PDF analysis of the research paper

GitHub Repository: Code availability, stars, and contributor activity

Citation Network: Semantic Scholar citations and co-citation patterns

Community Predictions: Crowd-sourced unicorn probability assessments

Analysis model: GPT-4o · Last scored: 2/12/2026
