OWASP AI Testing Guide: Security Testing for AI Systems

Dec 9, 2025

owaspai-securitytestingadversarial-mlllm

The OWASP AI Testing Guide is an open-source initiative providing structured methodologies for testing AI systems. Because AI models learn, adapt, and fail in non-deterministic ways, they introduce risks that conventional security testing can’t address.

Why AI-Specific Testing?

Traditional software testing assumes deterministic behavior. AI systems don’t work that way:

Models can be fooled by adversarial inputs
Training data can be poisoned
Model weights can leak sensitive information
Outputs can be manipulated through prompt injection

Without specialized testing, these vulnerabilities remain invisible.

The Four Pillars

The guide uses a threat-driven methodology aligned with Google’s SAIF framework, decomposing AI systems into four layers:

1. Model Testing

Testing the “brain” of the system:

Evasion attacks: Crafted inputs that fool the model
Model poisoning: Corrupted training leading to malicious behavior
Membership inference: Detecting if specific data was used in training
Goal alignment: Ensuring the model behaves as intended

2. Infrastructure Testing

Securing the compute and storage pipeline:

Supply chain tampering: Poisoned models from HuggingFace, etc.
Resource exhaustion: DoS attacks on inference endpoints
Plugin boundary violations: Escaping sandbox restrictions

3. Data Testing

Assuring integrity and privacy of training data:

Training data exposure: Extracting memorized data from models
Runtime exfiltration: Leaking data during inference
Dataset bias: Discriminatory outcomes from skewed training data

4. Application Testing

Application-layer vulnerabilities:

Prompt injection: Hijacking LLM behavior through malicious inputs
Context manipulation: Exploiting conversation history
Jailbreaks: Bypassing safety guardrails

Tools

For adversarial testing, the guide recommends:

Adversarial Robustness Toolbox (ART): IBM’s Python library for evasion, poisoning, extraction, and inference attacks
Foolbox: Fast adversarial attacks
TextAttack: NLP-specific adversarial testing
Armory: End-to-end adversarial ML evaluation

The AI Testing Guide complements:

OWASP Top 10 for LLM: Risk taxonomy for generative AI
OWASP AI Exchange: 200+ pages on AI protection
OWASP AI VSS: AI-specific vulnerability scoring

Current Status

The project is in active development (Phase 1 as of June 2025) with a public draft on GitHub. Led by Matteo Meucci and Marco Morana, with 23 contributors and growing.