
Conversation

@rustyrussell
Contributor

This is the algorithm recommended by @Lagrang3. Much of it, though, is neatening and tweaking of existing code.

  1. Removal of unnecessary caching.
  2. Simplify sorting code (sort array of flows directly, don't use index arrays)
  3. make increase_flows() more general, so it can be used for > 2% increase.
  4. Use increase_flows() to top up the largest-capacity flows if we hit maxparts.

I added amount_msat_accumulate for the "a+=b" case, but I was struggling
with a name for the subtractive equivalent.  After some prompting, ChatGPT
suggested deduct.

Signed-off-by: Rusty Russell <[email protected]>
This is not worth optimizing, as far as I can see.  Using a non-debug build I get
the following times for tests/test_askrene.py::test_real_data

Before:
	143 seconds

After:
	141 seconds

Signed-off-by: Rusty Russell <[email protected]>
We've already freed the working_ctx, and the fail path does that again.

Signed-off-by: Rusty Russell <[email protected]>
It can only fail on overflow, but if it did, the fail path frees working_ctx
and returns "error_message".

Signed-off-by: Rusty Russell <[email protected]>
Make it calculate on demand.  This will be useful when we call it from elsewhere.

Signed-off-by: Rusty Russell <[email protected]>
We don't need the indexes array, we can use this directly.

We still set up the indexes array (for now) after we call this.

Signed-off-by: Rusty Russell <[email protected]>
This removes the index array from the code after increase_flows(), so we use the flows
array directly.

The next step will be to make increase_flows() use the flows array, and remove the
index array indirection entirely.

Signed-off-by: Rusty Russell <[email protected]>
We don't need to convert to strings, we can compare directly.  This removes the final
use of the index arrays.

This of course changes the order of returned routes, which alters test_real_biases, since
that biases against the final channel in the *first* route.

Took me far too long to diagnose that!

Signed-off-by: Rusty Russell <[email protected]>
We added _noidx versions of the sort functions, but now that they're the only ones, we can
rename them back to the old names.

Signed-off-by: Rusty Russell <[email protected]>
Rewrite it, so it properly takes into account interactions between flows
by using reservations.

Signed-off-by: Rusty Russell <[email protected]>
For 1, we use single-path; for 0, we reject.

Signed-off-by: Rusty Russell <[email protected]>
@rustyrussell rustyrussell added this to the v25.12 milestone Nov 13, 2025
@Lagrang3 (Collaborator) left a comment


1. Removal of unnecessary caching.

Yes. Caching at this phase was overzealous, when in fact we spend most of the time doing MCF.

2. Simplify sorting code (sort array of flows directly, don't use index arrays)

Indexing was a necessity after using cached channel information.

3. make increase_flows() more general, so it can be used for > 2% increase.

I like it. It is preferred to the previous approach in my opinion.

Looping and removing 1/n of the shortage at every iteration worries me a little, because
the shortage amount decays exponentially but the decay factor weakens the more flows
there are, and we won't stop until 10 sats.
For example, it takes 7 iterations to reduce some shortage to 1% of its initial value if
there are 2 paths, 44 iterations for 10 paths, and 459 iterations for 100 paths.
Also, the complexity is the number of iterations times the number of paths, hence
2 * 7 = 14, 10 * 44 = 440, and 100 * 459 = 45900, respectively.
In practice it will be OK, since we don't expect too many paths
or the shortage to be too large.

4. Use increase_flows() to top up the largest-capacity flows if we hit maxparts.

I strongly believe that reduce_num_flows should sort flows based on delivery amount
and keep the largest delivery amount first.

@rustyrussell
Copy link
Contributor Author

rustyrussell commented Nov 16, 2025

Here's the diff for reduce_num_flows:

diff --git a/plugins/askrene/refine.c b/plugins/askrene/refine.c
index 2ee3855e2f..118460eb9d 100644
--- a/plugins/askrene/refine.c
+++ b/plugins/askrene/refine.c
@@ -588,36 +598,37 @@ double flows_probability(const tal_t *ctx, struct route_query *rq,
 	return probability;
 }
 
-/* Compare flows by remaining capacity (note: they are not reserved, so this
- * capacity includes current capacity. */
-static int reverse_cmp_capacity(struct flow *const *fa, struct flow *const *fb,
-				struct route_query *rq)
-{
-	struct amount_msat amount_a, amount_b;
-
-	amount_a = flow_max_deliverable(rq, *fa);
-	amount_b = flow_max_deliverable(rq, *fb);
-	if (!amount_msat_deduct(&amount_a, (*fa)->delivers))
-		abort();
-	if (!amount_msat_deduct(&amount_b, (*fb)->delivers))
-		abort();
-
-	if (amount_msat_eq(amount_a, amount_b))
-		return 0;
-	if (amount_msat_greater(amount_a, amount_b))
-		return -1;
-	return 1;
-}
-
 const char *reduce_num_flows(const tal_t *ctx,
 			     const struct route_query *rq,
 			     struct flow ***flows,
 			     struct amount_msat deliver,
 			     size_t num_parts)
 {
-	/* Keep the largest-capacity flows */
+	/* Keep the largest flows (not as I originally implemented, the largest
+	 * capacity flows).  Here's Lagrang3's analysis:
+	 *
+	 * I think it is better to keep the largest-deliver flows.  If we only
+	 * go for the highest capacity we may throw away the low cost benefits
+	 * of the MCF.
+
+	 * Hypothetical scenario: MCF finds 3 flows but maxparts=2,
+	 * flow 1: deliver=10, cost=0, capacity=0
+	 * flow 2: deliver=7, cost=1, capacity=5
+	 * flow 3: deliver=1, cost=10, capacity=100
+	 *
+	 * It is better to keep flows 1 and 2 by accomodating 1 more unit of
+	 * flow in flow2 at 1 value expense (per flow), than to keep flows 2 and
+	 * 3 by accomodating 5 more units of flow in flow2 at cost 1 and 5 in
+	 * flow3 at cost 100.
+	 *
+	 * The trade-off is: if we prioritize the delivery value already
+	 * computed by MCF then we find better solutions, but we might fail to
+	 * find feasible solutions sometimes. If we prioritize capacity then we
+	 * generally find bad solutions though we find feasibility more often
+	 * than the alternative.
+	 */
 	size_t orig_num_flows = tal_count(*flows);
-	asort(*flows, orig_num_flows, reverse_cmp_capacity, cast_const(struct route_query *, rq));
+	asort(*flows, orig_num_flows, revcmp_flows, NULL);
 	tal_resize(flows, num_parts);
 
 	if (!increase_flows(rq, *flows, deliver, -1.0))

Now we simply call it at the end.  We need to check it hasn't violated fee maxima, but
otherwise it's simple.

Signed-off-by: Rusty Russell <[email protected]>
Changelog-Fixed: Plugins: `askrene` now handles limits on number of htlcs much more gracefully.
@rustyrussell rustyrussell force-pushed the guilt/askrene-infinite-loop branch from b9c65ba to 60dc918 Compare November 16, 2025 05:10
@Lagrang3 Lagrang3 self-requested a review November 16, 2025 11:02
Pointed out by @Lagrang3; he's right: while it's only a temporary leak the
way we use flows, it's still a trap.

Signed-off-by: Rusty Russell <[email protected]>
@rustyrussell rustyrussell force-pushed the guilt/askrene-infinite-loop branch from 60dc918 to bcdb398 Compare November 16, 2025 22:48
@rustyrussell rustyrussell merged commit 6e4bc1d into ElementsProject:master Nov 17, 2025
37 of 39 checks passed