Claude Code
4
Incidents
6.8
Avg Severity
1
Critical
2
High
Top Failure Modes
| Failure Mode | Count |
|---|---|
| Security Vulnerability | 2 |
| Destructive Action | 1 |
| Hallucination | 1 |
Devin
| Metric | Value |
|---|---|
| Incidents | 11 |
| Avg Severity | 5.0/10 |
| Critical | 3 |
| High | 0 |
Top Failure Modes
| Failure Mode | Count |
|---|---|
| Destructive Action | 2 |
| Infinite Loop | 2 |
| Scope Explosion | 2 |
Comparison Summary
| Metric | Claude Code | Devin |
|---|---|---|
| Total Incidents | 4 | 11 |
| Avg Severity | 6.8/10 | 5.0/10 |
| Critical Incidents | 1 | 3 |
| Top Failure Mode | Security Vulnerability | Destructive Action |
Frequently Asked Questions
Is Claude Code or Devin more reliable?
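Based on StupidLLM data, Claude Code has 4 documented failures (average severity 6.8/10), while Devin has 11 (average severity 5.0/10). On average severity alone, Devin comes out ahead, since its documented failures are less severe on average. Claude Code, however, has fewer documented incidents overall and fewer critical ones (1 vs. 3), so the answer depends on which metric you weight.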
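The ranking changes depending on which metric is used, and a short script makes that explicit. The following is a minimal Python sketch, not part of StupidLLM itself: the `AgentStats` class and the `best_by` helper are hypothetical names introduced for illustration, and it assumes that a lower value is better for every metric. The figures are copied from the comparison summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentStats:
    """Hypothetical container for one agent's summary figures."""
    name: str
    incidents: int
    avg_severity: float  # on the page's 0-10 scale
    critical: int


# Figures copied from the comparison summary table.
AGENTS = [
    AgentStats("Claude Code", incidents=4, avg_severity=6.8, critical=1),
    AgentStats("Devin", incidents=11, avg_severity=5.0, critical=3),
]


def best_by(metric: str) -> str:
    """Return the agent with the lowest value for `metric` (assumes lower is better)."""
    return min(AGENTS, key=lambda agent: getattr(agent, metric)).name


for metric in ("incidents", "avg_severity", "critical"):
    print(f"{metric}: {best_by(metric)}")
```

Under these assumptions, Devin ranks first only on `avg_severity`, while Claude Code ranks first on `incidents` and `critical`.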
What are the main differences between Claude Code and Devin failures?
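Claude Code's most common failure mode is security vulnerability (2 of its 4 incidents). Devin's leading failure mode is listed as destructive action, although infinite loop and scope explosion are tied with it at 2 incidents each. Claude Code has 1 critical incident versus Devin's 3.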