OWASP AI Testing Guide: Security Testing for AI Systems
Dec 9, 2025
owaspai-securitytestingadversarial-mlllm
The OWASP AI Testing Guide is an open-source initiative providing structured methodologies for testing AI systems. Because AI models learn, adapt, and fail in non-deterministic ways, they introduce risks that conventional security testing can’t address.
Why AI-Specific Testing?
Traditional software testing assumes deterministic behavior. AI systems don’t work that way:
- Models can be fooled by adversarial inputs
- Training data can be poisoned
- Model weights can leak sensitive information
- Outputs can be manipulated through prompt injection
Without specialized testing, these vulnerabilities remain invisible.
The Four Pillars
The guide uses a threat-driven methodology aligned with Google’s SAIF framework, decomposing AI systems into four layers:
1. Model Testing
Testing the “brain” of the system:
- Evasion attacks: Crafted inputs that fool the model
- Model poisoning: Corrupted training leading to malicious behavior
- Membership inference: Detecting if specific data was used in training
- Goal alignment: Ensuring the model behaves as intended
2. Infrastructure Testing
Securing the compute and storage pipeline:
- Supply chain tampering: Poisoned models from HuggingFace, etc.
- Resource exhaustion: DoS attacks on inference endpoints
- Plugin boundary violations: Escaping sandbox restrictions
3. Data Testing
Assuring integrity and privacy of training data:
- Training data exposure: Extracting memorized data from models
- Runtime exfiltration: Leaking data during inference
- Dataset bias: Discriminatory outcomes from skewed training data
4. Application Testing
Application-layer vulnerabilities:
- Prompt injection: Hijacking LLM behavior through malicious inputs
- Context manipulation: Exploiting conversation history
- Jailbreaks: Bypassing safety guardrails
Tools
For adversarial testing, the guide recommends:
- Adversarial Robustness Toolbox (ART): IBM’s Python library for evasion, poisoning, extraction, and inference attacks
- Foolbox: Fast adversarial attacks
- TextAttack: NLP-specific adversarial testing
- Armory: End-to-end adversarial ML evaluation
Related OWASP Projects
The AI Testing Guide complements:
- OWASP Top 10 for LLM: Risk taxonomy for generative AI
- OWASP AI Exchange: 200+ pages on AI protection
- OWASP AI VSS: AI-specific vulnerability scoring
Current Status
The project is in active development (Phase 1 as of June 2025) with a public draft on GitHub. Led by Matteo Meucci and Marco Morana, with 23 contributors and growing.