Methodology
How we document, verify, and score AI agent failures
Severity Scoring System
StupidLLM rates AI agent incidents on a CVSS-inspired severity scale from 0 to 10. The score reflects the real-world impact of the failure, not just its technical complexity.
| Score | Label | Criteria | Examples |
|---|---|---|---|
| 9.0–10.0 | Critical | Irreversible damage, security breach, production outage, data loss | Deleted database, exposed API keys, wiped git history |
| 7.0–8.9 | High | Significant damage requiring hours to fix, broken builds, corrupted files | Deleted migration files, introduced XSS, broke CI/CD pipeline |
| 4.0–6.9 | Medium | Wrong output requiring rework, wasted compute, incorrect refactoring | Wrong API used, infinite retry loop, scope explosion |
| 0.0–3.9 | Low | Minor issues easily caught in review, cosmetic errors | Unused imports, wrong variable name, style inconsistency |
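The bands in the table can be expressed as a simple lookup. The following is a minimal sketch, not StupidLLM's actual implementation: the function name `severity_label` is hypothetical, and it assumes scores carry one decimal place, so a value that falls between two bands (for example 8.95) is placed in the lower band.

```python
# Lower bound of each band, highest first, taken from the severity table.
SEVERITY_BANDS = [
    (9.0, "Critical"),
    (7.0, "High"),
    (4.0, "Medium"),
    (0.0, "Low"),
]


def severity_label(score: float) -> str:
    """Return the severity label for a score on the 0-10 scale."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    # Walk the bands from highest to lowest and return the first one reached.
    for lower_bound, label in SEVERITY_BANDS:
        if score >= lower_bound:
            return label
    # Unreachable for valid input because the last band starts at 0.0.
    raise AssertionError("no band matched")
```

For example, `severity_label(7.0)` falls in the High band and `severity_label(3.9)` in the Low band.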
Failure Mode Taxonomy
Every incident is classified into one of these failure modes:
Hallucination
Agent references APIs, functions, packages, or files that don't exist
Destructive Action
Agent deletes files, drops tables, overwrites data, or corrupts state
Infinite Loop
Agent gets stuck in a cycle, retrying the same failed approach
Wrong File
Agent edits incorrect files or creates files in wrong locations
Scope Explosion
Agent rewrites far more code than requested, touching unrelated files
Ignored Instructions
Agent disregards explicit user instructions or constraints
Logic Error
Agent produces code that compiles and runs but behaves incorrectly
Security Vulnerability
Agent introduces XSS, SQL injection, hardcoded secrets, or auth bypasses
Data Loss
Agent causes irreversible loss of user data, database records, or files
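Because every incident gets exactly one failure mode, the taxonomy maps naturally onto an enumeration. The sketch below is illustrative only: the names `FailureMode` and `parse_failure_mode` are hypothetical and are not taken from StupidLLM's codebase; only the nine display names come from the list above.

```python
from enum import Enum


class FailureMode(Enum):
    """The nine failure modes, keyed to their display names."""

    HALLUCINATION = "Hallucination"
    DESTRUCTIVE_ACTION = "Destructive Action"
    INFINITE_LOOP = "Infinite Loop"
    WRONG_FILE = "Wrong File"
    SCOPE_EXPLOSION = "Scope Explosion"
    IGNORED_INSTRUCTIONS = "Ignored Instructions"
    LOGIC_ERROR = "Logic Error"
    SECURITY_VULNERABILITY = "Security Vulnerability"
    DATA_LOSS = "Data Loss"


def parse_failure_mode(label: str) -> FailureMode:
    """Look up a failure mode by display name, ignoring case and outer whitespace."""
    normalized = label.strip().casefold()
    for mode in FailureMode:
        if mode.value.casefold() == normalized:
            return mode
    raise ValueError(f"unknown failure mode: {label!r}")
```

Rejecting unknown labels, rather than falling back to a default, keeps a mistyped classification from slipping into the data set unnoticed.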
Verification Process
Source verification
We check the source URL (GitHub PR, tweet, blog post) to confirm the incident happened as described
Classification
Each incident is classified by failure mode, root cause, agent, severity, and affected domain
Severity scoring
Severity is assigned based on four impact criteria: reversibility, scope of damage, time to fix, and security implications
STUPID-ID assignment
Verified incidents receive a unique STUPID-YYYY-NNNN identifier for permanent tracking
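The STUPID-YYYY-NNNN format from the last step can be generated and validated with a few lines of code. This is a minimal sketch under stated assumptions: YYYY is read as a four-digit year and NNNN as a zero-padded four-digit sequence number. The source does not say which year is meant (incident or verification) or whether numbering restarts each year, and the helper names `format_stupid_id` and `parse_stupid_id` are hypothetical.

```python
import re

# Assumed shape: literal prefix, four-digit year, four-digit sequence number.
STUPID_ID_PATTERN = re.compile(r"STUPID-(\d{4})-(\d{4})")


def format_stupid_id(year: int, sequence: int) -> str:
    """Build an identifier such as STUPID-2025-0042."""
    if not 1000 <= year <= 9999:
        raise ValueError(f"year must have four digits, got {year}")
    if not 0 <= sequence <= 9999:
        raise ValueError(f"sequence must fit in four digits, got {sequence}")
    return f"STUPID-{year:04d}-{sequence:04d}"


def parse_stupid_id(identifier: str) -> tuple[int, int]:
    """Split an identifier into (year, sequence), rejecting malformed input."""
    match = STUPID_ID_PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a valid STUPID-ID: {identifier!r}")
    return int(match.group(1)), int(match.group(2))
```

Using `fullmatch` rather than `search` means an identifier with trailing characters, such as `STUPID-2025-00421`, is rejected instead of being silently truncated.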
Root Cause Categories
Beyond failure modes, we track why the agent failed: