OpenAI launches safety bug bounty

2026-04-09 · openai

OpenAI announced a new public Safety Bug Bounty program aimed at identifying abuse and safety risks across its products. The scope includes agentic vulnerabilities, prompt injection, account integrity issues, and data exfiltration paths that may not fit standard security bounties. This is a meaningful operational update for developers and security researchers alike: OpenAI is formalizing AI safety as a surface where issues can be found, reported, and fixed like any other software flaw.

Key Features or Updates

The program accepts reports on AI abuse and safety problems, including agentic behavior that can cause material harm. OpenAI says the safety program sits alongside its existing security bug bounty but is specifically scoped to AI misuse and behavioral risk. That separation signals the company treats safety as a distinct engineering domain.

Impact on Developers

If you build with OpenAI tools, prompt injection, permission boundaries, and account integrity now deserve the same attention as code security. This update also gives researchers a clearer path for reporting risky behavior in agentic systems. It is a strong signal that AI product teams need to harden their workflows before incidents happen in production.

How to use it

Teams should review where their agents read external text, invoke tools, or act on behalf of users. The best next step is to add tighter permission scopes, better validation, and clearer human approval steps where the risk is highest. Use the bounty scope as a practical checklist for your own threat model.
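The review-and-harden loop above can be sketched as a gate in front of an agent's tool calls. This is a minimal illustrative example, not OpenAI's API: the tool names, scope sets, and risk tiers are assumptions chosen to show the pattern of permission scoping, input validation, and human approval for high-risk actions.

```python
# Hypothetical sketch of gating an agent's tool calls.
# Tool names and risk tiers are illustrative assumptions.

ALLOWED_TOOLS = {"search_docs", "send_email"}    # agent's permission scope
HIGH_RISK_TOOLS = {"send_email", "delete_file"}  # require human approval


def validate_args(args: dict) -> bool:
    """Reject malformed or oversized inputs before execution."""
    return all(isinstance(v, str) and len(v) < 2000 for v in args.values())


def gate_tool_call(tool: str, args: dict, approved_by_human: bool = False) -> str:
    """Decide whether a proposed tool call may run."""
    if tool not in ALLOWED_TOOLS:
        return "denied: outside permission scope"
    if not validate_args(args):
        return "denied: failed input validation"
    if tool in HIGH_RISK_TOOLS and not approved_by_human:
        return "pending: human approval required"
    return "allowed"
```

For example, `gate_tool_call("delete_file", {})` is denied as out of scope, while `gate_tool_call("send_email", {"to": "a@example.com"})` is held for human approval. The key design choice is that the gate sits outside the model: even if injected text convinces the agent to request a risky action, the request still has to pass scope, validation, and approval checks.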
