-<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
+<figcaption style="text-align: center;">Table 3: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
 </figure>

-Finally, we present resulst on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
+Finally, we present results on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
 <figure class="table-figure">
 <table class="table-styled">
 <thead>
@@ -392,7 +371,7 @@ <h2 id="results">Results</h2>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 5: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
+<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
 </figure>

 Check out the paper for the full CodeContests results, as well as repository results!