Questions or feedback? Reach me at [email protected]
AI Security Agent

Your code has secrets.
Helios finds them.

The first security agent that thinks like a hacker. It autonomously finds, exploits, and fixes vulnerabilities, with two AI agents battling over your code in real time.

Adversarial Agents Arena

Two Claude AI instances with opposing objectives battle over your codebase in real time. One attacks. One defends. You watch.

Red Agent // Attacker
[RED] Scanning target for attack vectors...
[RED] Found: SQL injection in /api/login
[RED] Severity: CRITICAL
[RED] Crafting auth bypass payload...
[RED] Executing: ' OR 1=1 --
[RED] Exploit successful. Admin access gained.
[RED] Scanning for XSS in /search...
[RED] Found: Reflected XSS via q= param
[RED] SQLi blocked. Adapting strategy...
[RED] Chaining XSS + SSRF for lateral move
[RED] Generating custom exploit script...
[RED] Swarm agent 2 targeting /api/upload
[RED] Path traversal: ../../etc/passwd
[RED] File read successful. 3 chains complete.
VS
Blue Agent // Defender
[BLUE] Monitoring container logs...
[BLUE] Anomalous query on /api/login
[BLUE] Threat: SQL injection attempt
[BLUE] Deploying parameterized query patch
[BLUE] Patch applied. SQLi vector closed.
[BLUE] Detected script injection in /search
[BLUE] Applying output encoding...
[BLUE] WAF rule deployed: block XSS payloads
[BLUE] XSS vector neutralized.
[BLUE] Suspicious file read on /api/upload
[BLUE] Patching path traversal...
[BLUE] Input sanitization deployed
[BLUE] Hardening CORS and CSP headers
[BLUE] Container hardened. Monitoring...
Live Battle Score -- Inkscape Web Test
Red Agent: 14 Blue Agent: 8
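
The path traversal patch in the Blue Agent transcript above could take a form like the following. This is a minimal sketch, not Helios's actual patch: the function name `stays_inside` and the `/srv/uploads` directory are hypothetical, and it assumes the upload handler joins a user-supplied name onto a fixed base directory.

```python
from pathlib import Path


def stays_inside(base_dir: str, user_supplied: str) -> bool:
    """Return True only if the joined path resolves inside base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments, so the check runs on the real target.
    candidate = (base / user_supplied).resolve()
    return candidate == base or base in candidate.parents


if __name__ == "__main__":
    # Hypothetical upload directory and request values, for illustration only.
    for name in ("avatars/me.png", "../../etc/passwd", "/etc/passwd"):
        verdict = "allow" if stays_inside("/srv/uploads", name) else "block"
        print(f"{verdict}: {name}")
```

Checking the resolved path rather than searching the raw string for `..` also covers absolute paths, which would otherwise replace the base directory when joined.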

What Helios does

Not a wrapper around an API call. Deep AI-powered security analysis with real exploitation and automated remediation.

Autonomous Scanner

Detects SQL injection, XSS, command injection, SSRF, path traversal, hardcoded secrets, vulnerable dependencies, and misconfigurations.
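
To make the hardcoded-secret part of that list concrete, here is a rough sketch of pattern-based secret detection. It is an illustration under stated assumptions, not Helios's scanner: the rule names, the `Finding` type, and `scan_text` are hypothetical, and the three patterns are a tiny sample of what a real rule set would contain.

```python
import re
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    line_no: int
    rule: str
    excerpt: str


# Illustrative rules only; a production scanner would ship many more.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret)\s*=\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(source: str) -> Iterator[Finding]:
    """Yield one Finding per (line, rule) match in the given source text."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                yield Finding(line_no, rule, line.strip())


if __name__ == "__main__":
    sample = 'DEBUG = True\npassword = "hunter22"\nkey = "AKIAIOSFODNN7EXAMPLE"\n'
    for finding in scan_text(sample):
        print(finding)
```

Regex rules like these are the pattern-matching baseline; contextual analysis would sit on top to filter out test fixtures and placeholder values.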

Sandbox Exploitation

Docker-based sandbox runs your app in isolation. Real exploits execute against live containers with captured proof-of-concept output.
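
One way to picture the isolation is as a locked-down `docker run` invocation. The sketch below only builds the argument list and does not start anything; the function name `sandbox_command`, the image and container names, and the specific limits are assumptions for illustration, not Helios's actual configuration.

```python
import shlex
from typing import List


def sandbox_command(
    image: str, name: str, network: str = "none", memory: str = "512m"
) -> List[str]:
    """Build a `docker run` argv for a restricted, throwaway container."""
    return [
        "docker", "run", "--rm", "--detach",
        "--name", name,
        # "none" cuts all networking; pass an internal Docker network name
        # instead if the exploit runner lives in a separate container.
        "--network", network,
        "--memory", memory,
        "--pids-limit", "256",
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--read-only",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image and container names.
    print(shlex.join(sandbox_command("target-app:latest", "helios-sandbox")))
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` without shell quoting issues.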

Supply Chain Analysis

Maps dependency relationships across multiple repos. Identifies transitive attack vectors -- vulnerabilities in package A that become exploitable through package B, which depends on it.
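
A transitive vector can be thought of as a path in the dependency graph from your project to a vulnerable package. The following is a minimal sketch of that idea using breadth-first search; the graph, the package names, and the function `find_transitive_path` are all hypothetical and stand in for whatever representation Helios uses internally.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def find_transitive_path(
    graph: Dict[str, List[str]], root: str, vulnerable: Set[str]
) -> Optional[List[str]]:
    """Return the shortest dependency chain from root to a vulnerable package."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node != root and node in vulnerable:
            return path
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


if __name__ == "__main__":
    # Made-up dependency graph: each key depends on the packages in its list.
    graph = {
        "web-app": ["cms-core", "image-lib"],
        "cms-core": ["template-engine"],
        "template-engine": ["yaml-parser"],
        "image-lib": [],
    }
    print(find_transitive_path(graph, "web-app", {"yaml-parser"}))
```

Reporting the full chain, rather than just the vulnerable package, shows which direct dependency to upgrade or replace.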

AI-Generated Exploits

Not predefined payloads. The Red Agent analyzes your specific codebase and crafts bespoke attack scripts that adapt when defenses change.

Auto-Fix and Patch

Generates code patches for every finding. Creates GitLab issues with severity labels and opens merge requests with verified fixes.
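
As an example of what such a patch looks like, the sketch below contrasts a login query built by string interpolation with a parameterized version, using the same `' OR 1=1 --` payload shown in the arena demo. The table layout and function names are hypothetical, and an in-memory SQLite database stands in for the real application.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    return conn


def login_vulnerable(conn: sqlite3.Connection, name: str, password: str) -> bool:
    # User input is spliced into the SQL text, so quotes can change the query.
    query = f"SELECT 1 FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(query).fetchone() is not None


def login_patched(conn: sqlite3.Connection, name: str, password: str) -> bool:
    # Placeholders keep user input as data; it is never parsed as SQL.
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR 1=1 --"
    print("vulnerable:", login_vulnerable(conn, payload, "x"))  # True
    print("patched:   ", login_patched(conn, payload, "x"))     # False
```

Replaying the original payload against the patched function is the same check a patch verifier would run before a merge request is opened.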

GitLab Native

Clone repos via GitLab API. Create issues, open fix branches, submit merge requests. Drop-in CI/CD pipeline template for any project.
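
The issue and merge request flow might look roughly like this with python-gitlab. Treat it as a sketch under assumptions: the helper names, the label scheme, the branch names, and the default target branch are hypothetical, and the step that commits the actual patch to the fix branch is omitted.

```python
from typing import Dict, Tuple


def issue_payload(title: str, severity: str, description: str) -> Dict[str, object]:
    """Build the attributes for a security issue (label names are made up)."""
    return {
        "title": f"[{severity.upper()}] {title}",
        "description": description,
        "labels": ["security", f"severity-{severity.lower()}"],
    }


def merge_request_payload(source: str, target: str, title: str) -> Dict[str, object]:
    """Build the attributes for the merge request that carries the fix."""
    return {
        "source_branch": source,
        "target_branch": target,
        "title": title,
        "remove_source_branch": True,
    }


def open_fix(url: str, token: str, project_path: str, title: str,
             severity: str, description: str, fix_branch: str,
             target_branch: str = "main") -> Tuple[str, str]:
    """Create an issue, a fix branch, and a merge request; return their URLs."""
    import gitlab  # python-gitlab, imported lazily so the helpers work without it

    gl = gitlab.Gitlab(url, private_token=token)
    project = gl.projects.get(project_path)
    issue = project.issues.create(issue_payload(title, severity, description))
    project.branches.create({"branch": fix_branch, "ref": target_branch})
    # Committing the generated patch to fix_branch would happen here.
    mr = project.mergerequests.create(
        merge_request_payload(fix_branch, target_branch, f"Fix: {title}")
    )
    return issue.web_url, mr.web_url
```

Keeping the payload builders separate from the API calls makes the issue and merge request contents easy to review and test without network access or a token.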

How it works

Claude Agent SDK with custom MCP tools. Docker sandbox for safe exploitation. Concurrent adversarial agents via anyio task groups.
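
The concurrency pattern can be sketched as two coroutines sharing one arena state. To stay dependency-free, this example uses the standard library's `asyncio.gather` in place of an anyio task group (where the agents would be started with `start_soon` inside `anyio.create_task_group()`). The `Arena` class, the agent functions, and the scoring rule of one point per exploit or patch are assumptions for illustration; real agents would call a model and a sandbox rather than loop over fixed lists.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Arena:
    open_vectors: Set[str]
    red_score: int = 0
    blue_score: int = 0
    log: List[str] = field(default_factory=list)


async def red_agent(arena: Arena, targets: List[str]) -> None:
    for vector in targets:
        await asyncio.sleep(0)  # yield so the defender can run in between
        if vector in arena.open_vectors:
            arena.red_score += 1
            arena.log.append(f"[RED] exploited {vector}")
        else:
            arena.log.append(f"[RED] {vector} blocked")


async def blue_agent(arena: Arena, patches: List[str]) -> None:
    for vector in patches:
        await asyncio.sleep(0)  # yield so the attacker can run in between
        if vector in arena.open_vectors:
            arena.open_vectors.discard(vector)
            arena.blue_score += 1
            arena.log.append(f"[BLUE] patched {vector}")


async def run_battle(arena: Arena, targets: List[str], patches: List[str]) -> Arena:
    # Both agents run concurrently and race over the shared arena state.
    await asyncio.gather(red_agent(arena, targets), blue_agent(arena, patches))
    return arena


if __name__ == "__main__":
    vectors = ["sqli", "xss", "path-traversal"]
    result = asyncio.run(run_battle(Arena(set(vectors)), vectors, vectors))
    print("\n".join(result.log))
    print(f"Red Agent: {result.red_score} Blue Agent: {result.blue_score}")
```

Because both coroutines mutate the same `Arena` only between `await` points, no locking is needed in this single-threaded sketch; agents that run in separate processes or containers would need a different coordination mechanism.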

01

Scan

AI analyzes your codebase with static analysis, secret detection, dependency checking, and configuration auditing.

02

Exploit

Docker sandbox spins up your app. AI generates custom exploit scripts and executes them against the live container.

03

Battle

Red Agent attacks while Blue Agent defends in real time. Concurrent agents compete with live scoring.

04

Fix

Auto-generates patches, creates GitLab issues with severity labels, and opens merge requests with verified fixes.

Under the hood

Built on Claude Agent SDK with Model Context Protocol tools. Each agent gets a tailored security toolset.

Claude Agent SDK
  |
  +-- Red Agent (attacker)
  |     +-- Scanner (static analysis, secrets, deps, config audit)
  |     +-- Exploiter (SQLi, XSS, SSRF, auth bypass, path traversal)
  |     +-- Custom Exploit Generator (AI writes bespoke attack scripts)
  |     +-- Supply Chain Analyzer (cross-repo transitive vectors)
  |
  +-- Blue Agent (defender)
  |     +-- Log Monitor (real-time container log analysis)
  |     +-- Patch Deployer (live patching of running containers)
  |     +-- WAF Rule Engine (runtime input validation)
  |     +-- Patch Verifier (confirms fixes block attack vectors)
  |
  +-- Arena Orchestrator
        +-- Docker Sandbox (isolated exploitation environment)
        +-- Score Tracker (Red exploits vs Blue patches)
        +-- Rich Terminal Display (split-panel live output)

Why Helios

Most security tools scan for patterns. Helios thinks like a hacker.

Capability       | Traditional Scanners       | Helios
-----------------|----------------------------|---------------------------------------
Detection        | Pattern matching           | AI-powered contextual analysis
Verification     | "You might be vulnerable"  | "I just exploited this. Here's proof."
Exploitation     | None                       | Sandbox with proof-of-concept
Defense Testing  | None                       | AI vs AI adversarial battle
Supply Chain     | Single repo                | Cross-repository analysis
Custom Exploits  | Predefined payloads        | AI generates bespoke exploits
Cost             | $15-50K per pentest        | Your existing Claude subscription

Real-world testing

Tested against production GitLab-hosted projects. Every finding verified with proof-of-concept in Docker sandbox.

Metrics reported: vulnerabilities found in Inkscape Web, critical findings exploited in sandbox, transitive supply chain attack vectors, and percentage of findings verified with PoC output.
Test target: Inkscape Web (GitLab-hosted). Supply chain analysis across the Inkscape Web and django-cms dependency trees. Arena swarm mode: Red Agent 14 vs Blue Agent 8.

Built with

Built for the GitLab AI Hackathon with Claude Agent SDK and the Model Context Protocol.

Claude Agent SDK -- AI reasoning and MCP tool orchestration
Docker -- Sandboxed exploitation environment
python-gitlab -- GitLab API integration
Rich -- Terminal UI and arena display
Click -- CLI framework
anyio -- Async concurrency for parallel agents

Ready to hack your own code?

Install Helios in seconds. No per-scan fees. No enterprise pricing.

$ pip install helios-security