THE RED COUNCIL

AI Red Team & Security

Attack. Assess. Patch.

Systematically test your LLMs and AI agents against adversarial attacks before attackers find them. Closed-loop security testing: Attack → Detect → Defend → Verify.

Get Started View on GitHub

Four Heads. One Loop.

Each head of the Hydra represents an agent in the closed-loop security testing pipeline.

Attack → Judge → Defend → Verify ↺

Attacker

Generates adversarial prompts from 165+ curated attack patterns including jailbreaks, prompt injections, and extraction attacks.

Judge

Scores responses 0-10 for safety, detects secret leakage, and determines if a breach occurred with detailed analysis.

Defender

Auto-generates hardened system prompts when breaches occur, implementing specific countermeasures against detected attack vectors.

Verifier

Re-tests defenses to prove they work, closing the loop. Only marks as secure when the same attack no longer succeeds.

Quick Start

Get up and running in under 5 minutes.

terminal

# Clone and install
git clone https://github.com/sherifkozman/the-red-council
cd the-red-council && pip install -e ".[dev]"

# Seed the attack knowledge base
python -m scripts.seed_kb

# Start the backend
uvicorn src.api.main:app --port 8000

# In another terminal, start the frontend
cd frontend && pnpm install && pnpm dev

# Visit http://localhost:3000