Claude vs Devin

AI Agent Reliability Comparison

Claude

Incidents: 1
Avg Severity: 0.8/10
Critical: 0
High: 0

Top Failure Modes

Other: 1

Devin

Incidents: 11
Avg Severity: 5.0/10
Critical: 3
High: 0

Top Failure Modes

Destructive Action: 2
Infinite Loop: 2
Scope Explosion: 2

Comparison Summary

Metric              Claude   Devin
Total Incidents     1        11
Avg Severity        0.8/10   5.0/10
Critical Incidents  0        3
Top Failure Mode    Other    Destructive Action

Frequently Asked Questions

Is Claude or Devin more reliable?

Based on StupidLLM data, Claude has 1 documented failure (avg severity 0.8/10), while Devin has 11 (avg severity 5.0/10). Claude shows better reliability in this dataset on both incident count and average severity.

What are the main differences between Claude and Devin failures?

Claude's only documented failure falls under "Other". Devin's most common failure modes are Destructive Action, Infinite Loop, and Scope Explosion, with 2 incidents each. Claude has 0 critical incidents versus Devin's 3.