
Commit ad937db

Merge branch 'main' of ssh://github.com/code-refactor/code-refactor.github.io
2 parents 8c4c755 + 5375791 commit ad937db

File tree

1 file changed: +7 -28 lines changed


index.html

Lines changed: 7 additions & 28 deletions
@@ -301,35 +301,14 @@ <h2 id="results">Results</h2>
 <td>82.0</td>
 <td>86.8</td>
 </tr>
-</tbody>
-</table>
-<figcaption style="text-align: center;">Table 2: Results on the MiniCode CodeContests split</figcaption>
-</figure>
-
-<figure class="table-figure">
-<table class="table-styled">
-<thead>
 <tr>
-<th><strong>Metric</strong></th>
-<th><strong>Value</strong></th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>Pass Rate</td>
-<td>90.67% ±1.88</td>
-</tr>
-<tr>
-<td>Pass Rate Improvement (over non-refactored)</td>
-<td>6.33% ±1.41</td>
-</tr>
-<tr>
-<td>MDL Ratio</td>
-<td>0.53 ±0.03</td>
+<td>LIBRARIAN</td>
+<td>90.67</td>
+<td>53.0</td>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 3: Refactoring results for LIBRARIAN (w/ K = 8) averaged over 10 Code Contests collections</figcaption>
+<figcaption style="text-align: center;">Table 2: Results on the MiniCode CodeContests split</figcaption>
 </figure>
 
 We also present the results on the small repo split, which consists of repositories generated by o4-mini.
@@ -361,10 +340,10 @@ <h2 id="results">Results</h2>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
+<figcaption style="text-align: center;">Table 3: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
 </figure>
 
-Finally, we present resulst on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
+Finally, we present results on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
 <figure class="table-figure">
 <table class="table-styled">
 <thead>
@@ -392,7 +371,7 @@ <h2 id="results">Results</h2>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 5: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
+<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
 </figure>
 
 Check out the paper for the full CodeContests results, as well as repository results!
