
multi: update ChanUpdatesInHorizon and NodeUpdatesInHorizon to return iterators (iter.Seq[T]) #10128


Open

wants to merge 9 commits into master
Conversation

@Roasbeef (Member) commented Aug 5, 2025

Haven't had an excuse to make an iterator yet, so I nerd-sniped myself into the creation of this PR.

This PR does a few things:

  • Updates nodes+chan updates in horizon to return an iterator of the contents.
    • Even though most of the actual values we read out will already be in the channel graph cache, as is we can end up with potentially very long-running database transactions.
    • The new iterator versions accept a batch size and read in chunks, yielding out the responses. Callers are mostly unchanged.
  • Optimizes public node filtering for nodes in horizon:
    • Previously, after reading all the nodes, we'd loop over them again to check which ones are public, which meant a new DB transaction for each node. Now we fold that check into the underlying query.
  • Update the cache incrementally during reading:
    • Before, we'd hold the cache mutex the entire time while reading out the entire response.
    • Now we only hold the mutex to check if we can serve from the cache, then update the cache items at the very end of serving a batch.

The cache changes now mean that an invocation doesn't have a consistent view of the cache, but for cases like this (serving gossip data to peers, which can be lossy), we don't really need a consistent snapshot. This change should reduce overall mutex contention as well.

In this commit, we introduce a new utility function `Collect` to the fn
package. This function drains all elements from an iterator and returns
them as a slice. This is particularly useful when transitioning from
iterator-based APIs to code that expects slices, allowing for gradual
migration to the new iterator patterns.

The fn module's go.mod is also updated to require Go 1.23, which is
necessary for the built-in iter.Seq type support.

In this commit, we add a replace directive to use the local fn package
that now includes the new Collect function for iterators. This ensures
that the main module can access the iterator utilities we've added.

The replace directive will be removed once the fn package changes are
merged and a new version is tagged.

In this commit, we introduce a new options pattern for configuring
iterator behavior in the graph database. This includes configuration
for batch sizes when iterating over channel and node updates, as well
as an option to filter for public nodes only.

The new functional options pattern allows callers to customize iterator
behavior without breaking existing APIs. Default batch sizes are set to
1000 entries for both channel and node updates, which provides a good
balance between memory usage and performance.

In this commit, we refactor the NodeUpdatesInHorizon method to return
an iterator instead of a slice. This change significantly reduces
memory usage when dealing with large result sets by allowing callers to
process items incrementally rather than loading everything into memory
at once.

The new implementation uses Go 1.23's iter.Seq type to provide a
standard iterator interface. The method now supports configurable batch
sizes through functional options, allowing fine-tuned control over
memory usage and performance characteristics.

Rather than reading all the entries from disk into memory (before this
commit, we did consult the cache for most entries, skipping the disk
hits), we now expose a chunked iterator instead.

We also make the process of filtering out public nodes first class. This
saves many newly created db transactions later.

In this commit, we refactor the ChanUpdatesInHorizon method to return
an iterator instead of a slice. This change significantly reduces
memory usage when dealing with large result sets by allowing callers to
process items incrementally rather than loading everything into memory
at once.

In this commit, we update the SQL store implementation to support the
new iterator-based API for NodeUpdatesInHorizon. This includes adding a
new SQL query that supports efficient pagination through result sets.

The SQL implementation uses cursor-based pagination with configurable
batch sizes, allowing efficient iteration over large result sets without
loading everything into memory. The query is optimized to use indexes
effectively and minimize database round trips.

New SQL query GetNodesByLastUpdateRange is updated to support:
  * Cursor-based pagination using (last_update, pub_key) compound cursor
  * Optional filtering for public nodes only
  * Configurable batch sizes via MaxResults parameter
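The compound-cursor scheme can be sketched in plain Go (the row type, field names, and helpers below are illustrative). The comparison mirrors a SQL row-value predicate of the form `WHERE (last_update, pub_key) > (:cursor_time, :cursor_key)`: the pub_key tiebreaker gives a total order, so pagination never skips or repeats rows that share a timestamp.

```go
package main

import "fmt"

// node stands in for a row in the node-update index; the compound
// cursor (last_update, pub_key) gives a total order over rows.
type node struct {
	lastUpdate int64
	pubKey     string
}

// afterCursor reports whether n sorts strictly after the cursor under
// the (last_update, pub_key) ordering.
func afterCursor(n node, curTime int64, curKey string) bool {
	if n.lastUpdate != curTime {
		return n.lastUpdate > curTime
	}
	return n.pubKey > curKey
}

// nextPage returns up to maxResults rows after the cursor, assuming
// rows are already sorted by (last_update, pub_key) ascending, as a
// SQL ORDER BY would guarantee.
func nextPage(rows []node, curTime int64, curKey string,
	maxResults int) []node {

	var page []node
	for _, n := range rows {
		if !afterCursor(n, curTime, curKey) {
			continue
		}
		page = append(page, n)
		if len(page) == maxResults {
			break
		}
	}
	return page
}

func main() {
	rows := []node{
		{100, "a"}, {100, "b"}, {200, "a"}, {300, "c"},
	}

	// First page of two, then advance the cursor to the last row
	// of that page to fetch the next batch.
	p1 := nextPage(rows, 0, "", 2)
	last := p1[len(p1)-1]
	p2 := nextPage(rows, last.lastUpdate, last.pubKey, 2)
	fmt.Println(p1, p2)
}
```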

In this commit, we update the SQL store implementation to support the
new iterator-based API for ChanUpdatesInHorizon. This includes adding
SQL query pagination support and helper functions for efficient batch
processing.

The SQL implementation uses cursor-based pagination with configurable
batch sizes, allowing efficient iteration over large result sets without
loading everything into memory. The query is optimized to use indexes
effectively and minimize database round trips.

New SQL query GetChannelsByPolicyLastUpdateRange is updated to support:
- Cursor-based pagination using (max_update_time, id) compound cursor
- Configurable batch sizes via MaxResults parameter
- Efficient batch caching with updateChanCacheBatch helper

In this commit, we update all callers of NodeUpdatesInHorizon and
ChanUpdatesInHorizon to use the new iterator-based APIs. The changes
use fn.Collect to maintain existing behavior while benefiting from the
memory efficiency of iterators when possible.

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @Roasbeef, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural shift in how large sets of channel and node updates are retrieved from the graph database. By migrating to Go's new iter.Seq[T] interface, the system now supports streaming and batched data retrieval, leading to reduced memory consumption and improved performance. Furthermore, optimizations have been applied to reduce mutex contention during cache updates and to integrate public node filtering directly into the data fetching process, enhancing overall efficiency and concurrency.

Highlights

  • API Refactor: Iterator-based Data Retrieval: The ChanUpdatesInHorizon and NodeUpdatesInHorizon functions have been updated to return iter.Seq[T] instead of slices. This enables streaming and batched retrieval of graph data, allowing for more efficient memory usage and early termination of data processing.
  • Performance Optimization: Batched Database Reads: Data is now fetched from both KV and SQL databases in configurable batches. This reduces the memory footprint for large queries and improves overall performance by avoiding loading the entire dataset into memory at once.
  • Concurrency Improvement: Reduced Mutex Contention: The cache update mechanism has been refined. Instead of holding a global cache mutex for the entire duration of a data retrieval operation, updates are now applied incrementally per batch. This significantly reduces the time the mutex is held, thereby decreasing contention and improving concurrency.
  • Query Optimization: Integrated Public Node Filtering: Public node filtering, previously a post-retrieval step, has been integrated directly into the underlying database queries (for SQL) or the batch processing logic (for KV store). This pushes filtering closer to the data source, improving efficiency.
  • Go Version Update: The project's Go module version has been updated from 1.19 to 1.23 to support the new iter package, which is central to the API changes in this PR.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is a great refactoring to use iterators for fetching channel and node updates, which should significantly improve performance and reduce database transaction times for large graphs. The optimizations to the SQL queries are also a welcome improvement.

I've found a few minor issues, mostly related to style guide adherence (line length and typos in comments), and one more significant issue regarding an inconsistency in how time ranges are handled between the updated functions. Please take a look at the comments.

Comment on lines +239 to +240
ChanUpdatesInHorizon(startTime, endTime time.Time,
opts ...Option) (iter.Seq[ChannelEdge], error)


high

There seems to be an inconsistency in how the endTime parameter is handled between ChanUpdatesInHorizon and NodeUpdatesInHorizon.

  • NodeUpdatesInHorizon is now inclusive of endTime in both kv_store and sql_store implementations. This is a change from the previous behavior in the SQL store.
  • ChanUpdatesInHorizon is now exclusive of endTime in both implementations. This is a change from the previous behavior in the kv_store.

This inconsistency in the API can be surprising and lead to bugs. For API consistency, it would be better if both functions handled the time range inclusively or exclusively in the same way. I'd recommend making both inclusive to match the new behavior of NodeUpdatesInHorizon.

Comment on lines +2398 to +2399
// We'll start by creating 100 nodes, each with an update time // spaced
// one hour apart.


medium

There's a typo in this comment.

Suggested change
// We'll start by creating 100 nodes, each with an update time // spaced
// one hour apart.
// We'll start by creating 100 nodes, each with an update time spaced
// one hour apart.

)
require.NoError(t, err)

// Collect only up to stopAt nodes, breakign afterwards.


medium

Typo in comment: breakign should be breaking.

Suggested change
// Collect only up to stopAt nodes, breakign afterwards.
// Collect only up to stopAt nodes, breaking afterwards.

Comment on lines +2448 to +2449
// Create a fresh graph for each test, then add two new
// nodes to teh graph.


medium

Typo in comment: teh should be the.

Suggested change
// Create a fresh graph for each test, then add two new
// nodes to teh graph.
// Create a fresh graph for each test, then add two new
// nodes to the graph.

Comment on lines +2456 to +2457
// Next, we'll create 25 channels between the two nodes,
// each with increasign timestamps.


medium

Typo in comment: increasign should be increasing.

Suggested change
// Next, we'll create 25 channels between the two nodes,
// each with increasign timestamps.
// Next, we'll create 25 channels between the two nodes,
// each with increasing timestamps.

Comment on lines +2486 to +2487
// Now we'll run teh main query, and verify that we get
// back the expected number of channels.


medium

Typo in comment: teh should be the.

Suggested change
// Now we'll run teh main query, and verify that we get
// back the expected number of channels.
// Now we'll run the main query, and verify that we get
// back the expected number of channels.

Comment on lines +2131 to +2132
// With all the relevent buckets read, we'll now create a fresh
// read cusor.


medium

There are a couple of typos in these comments.

Suggested change
// With all the relevent buckets read, we'll now create a fresh
// read cusor.
// With all the relevant buckets read, we'll now create a fresh
// read cursor.

Comment on lines +2412 to +2416
// We'll now obtain a cursor to perform a range query within the
// index to find all node announcements within the horizon.
// The nodeUpdateIndex key format is: [8 bytes timestamp][33 bytes node pubkey]
// This allows efficient range queries by time while maintaining a stable
// sort order for nodes with the same timestamp.


medium

These comments exceed the 80-character line limit specified in the style guide.1

Please consider re-wrapping them for better readability and to adhere to the project's coding standards.

Style Guide References

Suggested change
// We'll now obtain a cursor to perform a range query within the
// index to find all node announcements within the horizon.
// The nodeUpdateIndex key format is: [8 bytes timestamp][33 bytes node pubkey]
// This allows efficient range queries by time while maintaining a stable
// sort order for nodes with the same timestamp.
// We'll now obtain a cursor to perform a range query within the
// index to find all node announcements within the horizon. The
// nodeUpdateIndex key format is: [8b timestamp][33b pubkey].
// This allows efficient range queries by time while maintaining a
// stable sort order for nodes with the same timestamp.

Footnotes

  1. The style guide requires wrapping columns at 80 characters, with tabs counted as 8 spaces. (link)

@morehouse (Collaborator) commented

Cool idea. Will we be able to thread this through to ApplyGossipFilter to reduce the memory utilization?

There would still be an outgoing bandwidth issue, but this would help with excessive memory use.

@lightninglabs-deploy

@Roasbeef, remember to re-request review from reviewers when ready

@Roasbeef
Member Author

Roasbeef commented Aug 14, 2025

Cool idea. Will we be able to thread this through to ApplyGossipFilter to reduce the memory utilization?

Yep, I realize now that I stopped just short of threading it up to that level.

The missing change here would be updating UpdatesInHorizon to return either a multi-iterator, or a single one that unifies the two iterator streams. So rather than return []lnwire.Message, it would be iter.Seq[lnwire.Message]. Then that main goroutine launched just continues to iterate over the response as normal (need to change that length check w/ something like a one time call to next() to see if anything is returned). The callers stay the same for the most part, but now everything is lazy loaded in the background, as needed.

There would still be an outgoing bandwidth issue, but this would help with excessive memory use.

Isn't this effectively addressed via the rate.Limiter usage based on outbound bandwidth sent? It is the case that we don't apply that to broadcasts like we do for the gossip filter application, but gossip filter application is where most of the bandwidth usage comes from, as some implementations effectively request the entire graph on connection.

3 participants