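Devin
| Metric | Value |
|---|---|
| Incidents | 11 |
| Avg Severity | 5.0 |
| Critical | 3 |
| High | 0 |

Top Failure Modes
| Failure Mode | Count |
|---|---|
| Destructive Action | 2 |
| Infinite Loop | 2 |
| Scope Explosion | 2 |
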
GitHub Copilot
| Metric | Value |
|---|---|
| Incidents | 1 |
| Avg Severity | 10.0 |
| Critical | 1 |
| High | 0 |

Top Failure Modes
| Failure Mode | Count |
|---|---|
| Security Vulnerability | 1 |

Comparison Summary
| Metric | Devin | GitHub Copilot |
|---|---|---|
| Total Incidents | 11 | 1 |
| Avg Severity | 5.0/10 | 10.0/10 |
| Critical Incidents | 3 | 1 |
| Top Failure Mode | Destructive Action | Security Vulnerability |
Frequently Asked Questions
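Is Devin or GitHub Copilot more reliable?
Based on StupidLLM data, Devin has 11 documented failures (avg severity 5.0/10) while GitHub Copilot has 1 (avg severity 10.0/10). By average severity alone, Devin scores better (5.0 vs 10.0), but it also has far more documented incidents (11 vs 1), and GitHub Copilot's average rests on a single incident, so the two metrics point in opposite directions.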
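As a rough illustration of why the choice of metric matters here, the sketch below ranks the two tools by average severity and then by incident count, using only the figures from the comparison summary. The `ToolStats` class and `rank_by` helper are hypothetical names introduced for this example, not part of any StupidLLM tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolStats:
    """Hypothetical container for one column of the comparison summary."""
    name: str
    incidents: int
    avg_severity: float  # severity score out of 10
    critical: int

    @property
    def critical_share(self) -> float:
        # Fraction of documented incidents that were rated critical.
        return self.critical / self.incidents


# Figures copied from the comparison summary table.
STATS = [
    ToolStats("Devin", incidents=11, avg_severity=5.0, critical=3),
    ToolStats("GitHub Copilot", incidents=1, avg_severity=10.0, critical=1),
]


def rank_by(stats, key):
    """Return tool names ordered from lowest to highest on the given metric."""
    return [s.name for s in sorted(stats, key=key)]


# Lower is better for both metrics, so the first name "wins" each ranking.
by_severity = rank_by(STATS, lambda s: s.avg_severity)
by_incidents = rank_by(STATS, lambda s: s.incidents)

print("Lowest average severity first:", by_severity)
print("Fewest incidents first:", by_incidents)
```

The two rankings come out in opposite orders, which is why a single "more reliable" verdict depends on which metric is given priority. With only one GitHub Copilot incident on record, its average severity is also not a stable estimate.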
What are the main differences between Devin and GitHub Copilot failures?
Devin's top listed failure mode is destructive action (2 incidents, tied with infinite loop and scope explosion), while GitHub Copilot's single documented incident is a security vulnerability. Devin has 3 critical incidents vs GitHub Copilot's 1.