LLVM APR Benchmark Leaderboard
Leaderboard for the LLVM APR Benchmark.
Total issues: 305
Method | Base Model | Score | Repaired | Repaired (Fast) | Hint | Number of Attempts | Repaired (Crash) | Repaired (Miscompilation) | Repaired (Hang) | Build Success Rate (%) | MTTR (min) | Average Sample Count |
---|---|---|---|---|---|---|---|---|---|---|---|---|
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [QwQ-Plus-2025-03-05](https://qwenlm.github.io/blog/qwq-32b) | 15.7 | 48 | 90 | w/ hint | 197 | 36 | 10 | 2 | 69.6 | 4.9 | 2.7 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [DeepSeek-V3-0324](https://api-docs.deepseek.com/news/news250325) | 15.4 | 47 | 74 | w/ hint | 197 | 33 | 14 | 0 | 85.6 | 0.9 | 2.5 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [DeepSeek-R1](https://api-docs.deepseek.com/news/news250120) | 14.1 | 43 | 67 | w/ hint | 194 | 33 | 9 | 1 | 54.3 | 3.8 | 2.8 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [DeepSeek-V3](https://api-docs.deepseek.com/news/news1226) | 13.4 | 41 | 65 | w/ hint | 195 | 29 | 11 | 1 | 73.2 | 1.5 | 2.4 |
[Baseline](https://github.com/dtcxzyw/llvm-apr-benchmark/blob/main/examples/baseline.py) | [Qwen2.5-Max-2025-01-25](https://qwenlm.github.io/blog/qwen2.5-max) | 13.1 | 40 | 64 | w/ hint | 187 | 33 | 7 | 0 | 84.8 | 0.7 | 2.4 |
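The page does not define how Score is computed. The numbers suggest that Score is the percentage of the 305 issues a method repaired, and that Repaired is the sum of the crash, miscompilation, and hang counts. Both relationships are assumptions inferred from the leaderboard rows, not documented behavior. The sketch below checks them against those rows. The names `TOTAL_ISSUES`, `ROWS`, and `check_row` are illustrative and do not come from the benchmark.

```python
# Assumption (inferred from the leaderboard, not documented):
#   Score    = 100 * Repaired / total issues, shown with one decimal
#   Repaired = Repaired (Crash) + Repaired (Miscompilation) + Repaired (Hang)
TOTAL_ISSUES = 305

# (base model, score, repaired, crash, miscompilation, hang), copied from the leaderboard
ROWS = [
    ("QwQ-Plus-2025-03-05", 15.7, 48, 36, 10, 2),
    ("DeepSeek-V3-0324", 15.4, 47, 33, 14, 0),
    ("DeepSeek-R1", 14.1, 43, 33, 9, 1),
    ("DeepSeek-V3", 13.4, 41, 29, 11, 1),
    ("Qwen2.5-Max-2025-01-25", 13.1, 40, 33, 7, 0),
]


def check_row(row, total=TOTAL_ISSUES):
    """Return True if a row is consistent with both assumed relationships."""
    _model, score, repaired, crash, miscompilation, hang = row
    sums_match = repaired == crash + miscompilation + hang
    # Allow for the one-decimal rounding used in the table.
    score_matches = abs(100.0 * repaired / total - score) <= 0.05
    return sums_match and score_matches


if __name__ == "__main__":
    for row in ROWS:
        status = "consistent" if check_row(row) else "inconsistent"
        print(f"{row[0]}: {status}")
```

If a future submission fails this check, the likelier explanation is that the assumed formula is wrong or incomplete rather than that the leaderboard is.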
Category | Total | Repaired | Repair Rate (%) | Repaired (Fast) | Repair Rate (Fast) (%) |
---|---|---|---|---|---|
Miscompilation | 305 | 83 | 27.2 | 137 | 44.9 |
Component | Total | Repaired | Repair Rate (%) | Repaired (Fast) | Repair Rate (Fast) (%) |
---|---|---|---|---|---|
SLPVectorizer | 74 | 18 | 24.3 | 34 | 45.9 |
LoopVectorize | 72 | 14 | 19.4 | 22 | 30.6 |
InstCombine | 54 | 12 | 22.2 | 31 | 57.4 |
ScalarEvolution | 15 | 6 | 40 | 7 | 46.7 |
VectorCombine | 12 | 8 | 66.7 | 10 | 83.3 |
ValueTracking | 8 | 1 | 12.5 | 2 | 25 |
IR | 6 | 1 | 16.7 | 1 | 16.7 |
ConstraintElimination | 6 | 1 | 16.7 | 3 | 50 |
InstructionSimplify | 5 | 2 | 40 | 2 | 40 |
Local | 4 | 1 | 25 | 1 | 25 |
SimplifyIndVar | 4 | 0 | 0 | 0 | 0 |
MemorySSAUpdater | 3 | 0 | 0 | 0 | 0 |
LoopAccessAnalysis | 3 | 1 | 33.3 | 1 | 33.3 |
LoopPeel | 3 | 0 | 0 | 0 | 0 |
DeadStoreElimination | 3 | 2 | 66.7 | 3 | 100 |
MemCpyOptimizer | 3 | 1 | 33.3 | 2 | 66.7 |
FunctionAttrs | 2 | 0 | 0 | 0 | 0 |
LazyValueInfo | 2 | 1 | 50 | 1 | 50 |
ConstantFold | 2 | 1 | 50 | 1 | 50 |
LoopStrengthReduce | 2 | 1 | 50 | 2 | 100 |
LICM | 2 | 0 | 0 | 1 | 50 |
GVNSink | 1 | 1 | 100 | 1 | 100 |
ValueMapper | 1 | 0 | 0 | 0 | 0 |
Scalarizer | 1 | 0 | 0 | 0 | 0 |
Instrumentation | 1 | 0 | 0 | 0 | 0 |
LoopUnrollAndJamPass | 1 | 0 | 0 | 0 | 0 |
SimpleLoopUnswitch | 1 | 1 | 100 | 1 | 100 |
DemoteRegToStack | 1 | 1 | 100 | 1 | 100 |
InlineCost | 1 | 1 | 100 | 1 | 100 |
Attributor | 1 | 1 | 100 | 1 | 100 |
DFAJumpThreading | 1 | 0 | 0 | 0 | 0 |
SimplifyLibCalls | 1 | 1 | 100 | 1 | 100 |
IndVarSimplify | 1 | 1 | 100 | 1 | 100 |
Reassociate | 1 | 0 | 0 | 0 | 0 |
NewGVN | 1 | 0 | 0 | 0 | 0 |
GVN | 1 | 0 | 0 | 0 | 0 |
VectorUtils | 1 | 0 | 0 | 0 | 0 |
LoopDeletion | 1 | 0 | 0 | 0 | 0 |
AggressiveInstCombine | 1 | 1 | 100 | 1 | 100 |
GlobalOpt | 1 | 0 | 0 | 0 | 0 |
InductiveRangeCheckElimination | 1 | 1 | 100 | 1 | 100 |
SCCPSolver | 1 | 0 | 0 | 0 | 0 |
Coroutines | 1 | 0 | 0 | 0 | 0 |
SimplifyCFG | 1 | 1 | 100 | 1 | 100 |
DeadArgumentElimination | 1 | 0 | 0 | 0 | 0 |
JumpThreading | 1 | 0 | 0 | 0 | 0 |
AliasAnalysis | 1 | 0 | 0 | 0 | 0 |
MoveAutoInit | 1 | 0 | 0 | 1 | 100 |
LoopUnrollRuntime | 1 | 0 | 0 | 0 | 0 |
LoopCacheAnalysis | 1 | 0 | 0 | 0 | 0 |
EarlyCSE | 1 | 0 | 0 | 0 | 0 |
LowerSwitch | 1 | 1 | 100 | 1 | 100 |
Evaluator | 1 | 0 | 0 | 0 | 0 |
LoopSimplifyCFG | 1 | 1 | 100 | 1 | 100 |
BDCE | 1 | 0 | 0 | 0 | 0 |
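The two rate columns in the tables above can be recomputed from the counts. The sketch below assumes each rate is the corresponding repaired count divided by Total, times 100, shown to one decimal place. That formula matches the listed values but is not stated on the page. `parse_row`, `repair_rate`, and `row_is_consistent` are hypothetical helper names.

```python
def repair_rate(repaired, total):
    """Percentage of issues repaired, rounded to one decimal place."""
    if total == 0:
        return 0.0
    return round(100.0 * repaired / total, 1)


def parse_row(line):
    """Split one pipe-delimited table row into typed fields."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    name, total, repaired, rate, fast, fast_rate = cells
    return name, int(total), int(repaired), float(rate), int(fast), float(fast_rate)


def row_is_consistent(line, tolerance=0.051):
    """Check that both listed rates agree with the counts in the same row."""
    _name, total, repaired, rate, fast, fast_rate = parse_row(line)
    return (
        abs(repair_rate(repaired, total) - rate) < tolerance
        and abs(repair_rate(fast, total) - fast_rate) < tolerance
    )


if __name__ == "__main__":
    # Rows copied verbatim from the component table.
    sample = [
        "SLPVectorizer | 74 | 18 | 24.3 | 34 | 45.9 |",
        "InstCombine | 54 | 12 | 22.2 | 31 | 57.4 |",
        "DeadStoreElimination | 3 | 2 | 66.7 | 3 | 100 |",
    ]
    for line in sample:
        print(parse_row(line)[0], row_is_consistent(line))
```

The tolerance absorbs rounding differences between this sketch and whatever the leaderboard uses to format its percentages.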
With the provided evaluation environment, you can get a certificate by calling env.dump().
Please submit your evaluation results (generated by scripts/submit.py) to dtcxzyw/llvm-apr-benchmark-submissions.