LLVM APR Benchmark Leaderboard
Leaderboard for the LLVM APR Benchmark.
Total issues: 262
Method | Base Model | Score | Repaired | Repaired (Fast) | Hint | Number of Attempts | Repaired (Crash) | Repaired (Miscompilation) | Repaired (Hang) | Build Success Rate (%) | MTTR (min) | Average Sample Count |
---|---|---|---|---|---|---|---|---|---|---|---|---|
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [QwQ-Plus-2025-03-05](https://qwenlm.github.io/blog/qwq-32b) | 18.3 | 48 | 90 | w/ hint | 197 | 36 | 10 | 2 | 69.6 | 4.9 | 2.7 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [DeepSeek-R1](https://api-docs.deepseek.com/news/news250120) | 16.4 | 43 | 67 | w/ hint | 194 | 33 | 9 | 1 | 54.3 | 3.8 | 2.8 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [DeepSeek-V3](https://api-docs.deepseek.com/news/news1226) | 15.6 | 41 | 65 | w/ hint | 195 | 29 | 11 | 1 | 73.2 | 1.5 | 2.4 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [Qwen2.5-Max-2025-01-25](https://qwenlm.github.io/blog/qwen2.5-max) | 15.3 | 40 | 64 | w/ hint | 187 | 33 | 7 | 0 | 84.8 | 0.7 | 2.4 |
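The leaderboard does not state how Score is derived, but the figures are consistent with Score being the percentage of the 262 issues that were repaired, and with Repaired being the sum of the crash, miscompilation, and hang counts. The sketch below checks that reading against the four rows above. The names `score` and `rows` are illustrative and not part of the benchmark.

```python
TOTAL_ISSUES = 262  # "Total issues" as stated on the leaderboard

# (base model, repaired, crash, miscompilation, hang, reported score)
rows = [
    ("QwQ-Plus-2025-03-05", 48, 36, 10, 2, 18.3),
    ("DeepSeek-R1", 43, 33, 9, 1, 16.4),
    ("DeepSeek-V3", 41, 29, 11, 1, 15.6),
    ("Qwen2.5-Max-2025-01-25", 40, 33, 7, 0, 15.3),
]


def score(repaired, total=TOTAL_ISSUES):
    """Assumed definition: percentage of issues repaired, one decimal."""
    return round(100 * repaired / total, 1)


for model, repaired, crash, miscomp, hang, reported in rows:
    # Per-category counts should add up to the overall Repaired column.
    assert crash + miscomp + hang == repaired
    # The recomputed score should match the published one.
    assert score(repaired) == reported
    print(f"{model}: {repaired}/{TOTAL_ISSUES} = {score(repaired)}")
```

If a future row fails either assertion, the assumed definition does not hold for that submission and the benchmark's own scoring code should be consulted.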
Category | Total | Repaired | Repair Rate (%) | Repaired (Fast) | Repair Rate (Fast) (%) |
---|---|---|---|---|---|
Miscompilation | 262 | 79 | 30.2 | 132 | 50.4 |
Component | Total | Repaired | Repair Rate (%) | Repaired (Fast) | Repair Rate (Fast) (%) |
---|---|---|---|---|---|
LoopVectorize | 66 | 12 | 18.2 | 20 | 30.3 |
SLPVectorizer | 63 | 18 | 28.6 | 33 | 52.4 |
InstCombine | 50 | 11 | 22 | 31 | 62 |
ScalarEvolution | 15 | 6 | 40 | 7 | 46.7 |
VectorCombine | 10 | 8 | 80 | 10 | 100 |
ValueTracking | 7 | 1 | 14.3 | 2 | 28.6 |
InstructionSimplify | 5 | 1 | 20 | 1 | 20 |
IR | 5 | 1 | 20 | 1 | 20 |
ConstraintElimination | 4 | 1 | 25 | 3 | 75 |
MemCpyOptimizer | 3 | 1 | 33.3 | 2 | 66.7 |
DeadStoreElimination | 3 | 2 | 66.7 | 3 | 100 |
Local | 3 | 1 | 33.3 | 1 | 33.3 |
MemorySSAUpdater | 2 | 0 | 0 | 0 | 0 |
LICM | 2 | 0 | 0 | 1 | 50 |
LoopStrengthReduce | 2 | 1 | 50 | 2 | 100 |
ConstantFold | 2 | 1 | 50 | 1 | 50 |
SimplifyIndVar | 2 | 0 | 0 | 0 | 0 |
Scalarizer | 1 | 0 | 0 | 0 | 0 |
SimpleLoopUnswitch | 1 | 1 | 100 | 1 | 100 |
Coroutines | 1 | 0 | 0 | 0 | 0 |
ValueMapper | 1 | 0 | 0 | 0 | 0 |
DFAJumpThreading | 1 | 0 | 0 | 0 | 0 |
LowerSwitch | 1 | 1 | 100 | 1 | 100 |
JumpThreading | 1 | 0 | 0 | 0 | 0 |
AliasAnalysis | 1 | 0 | 0 | 0 | 0 |
MoveAutoInit | 1 | 0 | 0 | 0 | 0 |
SCCPSolver | 1 | 0 | 0 | 0 | 0 |
LoopSimplifyCFG | 1 | 1 | 100 | 1 | 100 |
AggressiveInstCombine | 1 | 1 | 100 | 1 | 100 |
GVNSink | 1 | 1 | 100 | 1 | 100 |
IndVarSimplify | 1 | 1 | 100 | 1 | 100 |
LoopAccessAnalysis | 1 | 1 | 100 | 1 | 100 |
Attributor | 1 | 1 | 100 | 1 | 100 |
GVN | 1 | 0 | 0 | 0 | 0 |
NewGVN | 1 | 0 | 0 | 0 | 0 |
BDCE | 1 | 0 | 0 | 0 | 0 |
LoopPeel | 1 | 0 | 0 | 0 | 0 |
LoopCacheAnalysis | 1 | 0 | 0 | 0 | 0 |
SimplifyLibCalls | 1 | 1 | 100 | 1 | 100 |
SimplifyCFG | 1 | 1 | 100 | 1 | 100 |
LazyValueInfo | 1 | 1 | 100 | 1 | 100 |
EarlyCSE | 1 | 0 | 0 | 0 | 0 |
DemoteRegToStack | 1 | 1 | 100 | 1 | 100 |
Reassociate | 1 | 0 | 0 | 0 | 0 |
InductiveRangeCheckElimination | 1 | 1 | 100 | 1 | 100 |
InlineCost | 1 | 1 | 100 | 1 | 100 |
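The rate columns in the tables above appear to be the repaired count divided by the total, expressed as a percentage rounded to one decimal place with trailing zeros dropped. A minimal sketch under that assumption, using a few rows copied from the component table (the helper name `fmt_rate` is hypothetical):

```python
def fmt_rate(repaired, total):
    """Format repaired/total as a percentage the way the tables display it."""
    rate = round(100 * repaired / total, 1)
    # ":g" drops a trailing ".0", so 80.0 is shown as "80".
    return f"{rate:g}"


# (component, total, repaired, repaired_fast) taken from the table above
components = [
    ("LoopVectorize", 66, 12, 20),
    ("SLPVectorizer", 63, 18, 33),
    ("InstCombine", 50, 11, 31),
    ("VectorCombine", 10, 8, 10),
    ("MemCpyOptimizer", 3, 1, 2),
]

for name, total, repaired, repaired_fast in components:
    print(
        f"{name} | {total} | {repaired} | {fmt_rate(repaired, total)} | "
        f"{repaired_fast} | {fmt_rate(repaired_fast, total)} |"
    )
```

Components with only one or two issues swing between 0 and 100, so their rates are better read alongside the Total column than on their own.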
With the provided evaluation environment, you can get a certificate by calling env.dump().
Please submit your evaluation results (generated by scripts/submit.py) to dtcxzyw/llvm-apr-benchmark-submissions.