
Commit ad937db

Merge branch 'main' of ssh://github.com/code-refactor/code-refactor.github.io
2 parents 8c4c755 + 5375791 commit ad937db

File tree

1 file changed: +7 -28 lines changed


index.html

Lines changed: 7 additions & 28 deletions
@@ -301,35 +301,14 @@ <h2 id="results">Results</h2>
 <td>82.0</td>
 <td>86.8</td>
 </tr>
-</tbody>
-</table>
-<figcaption style="text-align: center;">Table 2: Results on the MiniCode CodeContests split</figcaption>
-</figure>
-
-<figure class="table-figure">
-<table class="table-styled">
-<thead>
 <tr>
-<th><strong>Metric</strong></th>
-<th><strong>Value</strong></th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>Pass Rate</td>
-<td>90.67% ±1.88</td>
-</tr>
-<tr>
-<td>Pass Rate Improvement (over non-refactored)</td>
-<td>6.33% ±1.41</td>
-</tr>
-<tr>
-<td>MDL Ratio</td>
-<td>0.53 ±0.03</td>
+<td>LIBRARIAN</td>
+<td>90.67</td>
+<td>53.0</td>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 3: Refactoring results for LIBRARIAN (w/ K = 8) averaged over 10 Code Contests collections</figcaption>
+<figcaption style="text-align: center;">Table 2: Results on the MiniCode CodeContests split</figcaption>
 </figure>
 
 We also present the results on the small repo split, which consists of repositories generated by o4-mini.
@@ -361,10 +340,10 @@ <h2 id="results">Results</h2>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
+<figcaption style="text-align: center;">Table 3: Average results on MiniCode-repositories small, using Codex with o4-mini and Claude Code with Claude Sonnet 3.7</figcaption>
 </figure>
 
-Finally, we present resulst on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
+Finally, we present results on the large repo split. Due to the stronger performance of Sonnet models, we evaluate only Sonnet models to minimize cost.
 <figure class="table-figure">
 <table class="table-styled">
 <thead>
@@ -392,7 +371,7 @@ <h2 id="results">Results</h2>
 </tr>
 </tbody>
 </table>
-<figcaption style="text-align: center;">Table 5: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
+<figcaption style="text-align: center;">Table 4: Average results on MiniCode-repositories large, comparing the original code sources with Claude Sonnet 3.7 and Sonnet 4</figcaption>
 </figure>
 
 Check out the paper for the full CodeContests results, as well as repository results!
