Claude-Code

AI Agent Reliability Report

Avg Severity: 6.8/10
Total Incidents: 4
Critical: 1
High: 2

Failure Modes

Security Vulnerability: 2
Destructive Action: 1
Hallucination: 1

Root Causes

Instruction Misunderstanding: 1
Training Data Gap: 1
Other: 1
Confidence Miscalibration: 1

Frequently Asked Questions

Is Claude-Code reliable?

Based on 4 documented incidents, Claude-Code has an average failure severity of 6.8/10. One incident was rated critical and 2 were rated high severity. The most common failure mode is security vulnerability (2 incidents).

What are the most common Claude-Code failures?

The most frequently documented Claude-Code failure modes are security vulnerability (2 incidents), destructive action (1 incident), and hallucination (1 incident). Of the 4 incidents, 1 was rated critical and 2 were rated high severity.

How many Claude-Code AI failures have been documented?

StupidLLM has documented 4 Claude-Code AI agent failures as of 2026. Each incident is severity-scored on a 0-10 scale, verified against source evidence, and categorized by failure mode and root cause.

All Claude-Code Incidents