Contributor

@Beilinson Beilinson commented Sep 24, 2025

Description

The goal of this PR is to improve performance with scene.drillPick, without changing the algorithm significantly.

The first commit, which changes the internal data structure of _pickObjects to a Map, was inspired by the performance boosts seen in #12896 and led to a surprising 75% decrease in time.

I then reduced unneeded code further by removing some funny byte-float-byte transformations and reusing a scratch array for pixels in Context.readPixels, which gave a further ~10% improvement over the previous iteration.
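As a rough illustration of the scratch-array idea, the sketch below reuses one module-level buffer across reads instead of allocating a new one per call. The helper name `getScratchPixels` and the RGBA byte layout are assumptions made for this example; it is not the actual Context.readPixels code.

```javascript
// Hypothetical helper, not Cesium's actual Context.readPixels implementation.
let scratchPixels = new Uint8Array(0);

function getScratchPixels(width, height) {
  const length = width * height * 4; // assumes RGBA, one byte per channel
  if (scratchPixels.length < length) {
    // Grow only when the requested rectangle is larger than anything seen so far.
    scratchPixels = new Uint8Array(length);
  }
  // Return a view of the right size; the backing buffer is shared between calls.
  return scratchPixels.subarray(0, length);
}

const first = getScratchPixels(3, 3);
const second = getScratchPixels(3, 3);
console.log(first.length, first.buffer === second.buffer);
```

The trade-off in a design like this is that a caller must finish with the returned view before the next read, since both views alias the same memory.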

Further, I realized that in sandcastles such as the one described in #9660, where many non-fully-overlapping entities may be drawn and the drill pick rectangle may be particularly large, PickFramebuffer.end could continue counting entities and iterate over the entire pickPolygon instead of exiting early after one entity was found. This led to the most significant performance boost: the attached sandcastle now takes ~30ms, down from ~3s, on my machine. The slowest part of picking is currently the need to re-render for every single entity in the pick rectangle (i.e., 220 entities = 220 renders + 220 buffer reads + 220 iterations over bigger and bigger parts of the image).

Issue number and link

#9660

Testing plan

Sandcastle example modified from #9660:
local: [screenshot]

main: [screenshot]

Author checklist

  • I have submitted a Contributor License Agreement
  • I have added my name to CONTRIBUTORS.md
  • I have updated CHANGES.md with a short summary of my change
  • I have added or updated unit tests to ensure consistent code coverage
  • I have updated the inline documentation, and included code examples where relevant
  • I have performed a self-review of my code


Thank you for the pull request, @Beilinson!

✅ We can confirm we have a CLA on file for you.

@Beilinson Beilinson changed the title from "Depth picking performance" to "drill picking performance" Sep 25, 2025
Contributor

@mzschwartz5 mzschwartz5 left a comment

Hey @Beilinson

Thanks for the contribution! I left a number of comments, but they're mostly minor. The performance improvement seems huge. I was particularly surprised the Map change gave a big difference, since the plain object {} is also backed by a hash table. (In contrast to #12896, where we were iterating over an array. Map had a huge impact there).

I just have a few questions, but they're pretty important:

  1. Paraphrasing a comment I left on PickFramebuffer.js, can you elaborate on why not returning early from PickFramebuffer.end speeds up picking? Seems counterintuitive to me.
  2. Following from that, does this performance gain come at the cost of some other use case of picking? (like, does it speed up drill picking at the cost of slowing down regular picking?)
  3. In the example sandcastle you linked under the testing section, it only picks 40 pins (compared to the 220 in your measurements). What do I change to increase it to 220?

@Beilinson
Contributor Author

Beilinson commented Sep 26, 2025

Hey @mzschwartz5 , thanks for the review!

> I was particularly surprised the Map change gave a big difference, since the plain object {} is also backed by a hash table. (In contrast to #12896, where we were iterating over an array. Map had a huge impact there).

delete obj[key] is very slow on plain objects compared to map.delete(key) (about a 75% difference, according to the linked source). Because picking is a depth peeling algorithm, the entire _pickObjects map is reconstructed every iteration, which means deleting and reinserting all keys each time. That is where much of the slowdown came from.
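The rebuild pattern described above can be sketched as a small micro-benchmark. Everything here is an illustrative stand-in (the key set, the pass count, the function names); it is not the Cesium picking code, and the measured timings depend on the JavaScript engine, so no specific ratio is asserted.

```javascript
// Hypothetical keys standing in for pick colors; 1000 entries per rebuild.
const keys = Array.from({ length: 1000 }, (_, i) => i * 7919);

// Insert and then delete every key on each pass, using a plain object.
function rebuildWithObject(store, passes) {
  for (let pass = 0; pass < passes; pass++) {
    for (const key of keys) store[key] = { id: key };
    for (const key of keys) delete store[key];
  }
  return Object.keys(store).length;
}

// The same workload using a Map.
function rebuildWithMap(store, passes) {
  for (let pass = 0; pass < passes; pass++) {
    for (const key of keys) store.set(key, { id: key });
    for (const key of keys) store.delete(key);
  }
  return store.size;
}

console.time("object");
const objectLeft = rebuildWithObject({}, 200);
console.timeEnd("object");

console.time("map");
const mapLeft = rebuildWithMap(new Map(), 200);
console.timeEnd("map");
```

Both stores end up empty; only the time spent getting there differs.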

> 1. Paraphrasing a comment I left on `PickFramebuffer.js`, can you elaborate on why _not_ returning early from `PickFramebuffer.end` speeds _up_ picking? Seems counterintuitive to me.

The drill picking implemented by Cesium is similar to Depth Peeling. On each iteration, entities are rendered to an offscreen buffer, that buffer is read into memory, and we then iterate over the pixels (spiraling outward from the center; this is primarily useful for picking the entities "closest" to the pointer, and not really useful when picking an actual rectangle). The old algorithm would return the moment one primitive was found.

The sandcastle demonstrates a scenario where the pick covers a significant portion of the screen rather than a 3x3 pixel rectangle at the pointer. This is also a use case we have in our enterprise system, where we support "drag selecting" multiple entities.

The important thing is that depth peeling is critical when entities are rendered 100% directly on top of one another and you are picking a very small (3x3) area. In this use case, however, the entities don't have a significant amount of overlap, meaning that on each iteration we could find more of the rendered entities, but instead we exit early to re-render everything and re-read everything.

By letting the PickFramebuffer continue searching for entities, we solve this: over a large rectangle at the given zoom in the sandcastle, all entities are found in one pass, and only one re-render is needed (to make sure that there are no entities under those ones). That's the significant performance boost.
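The difference in pass counts can be seen in a toy model. The sketch below assumes each pixel holds a front-to-back stack of entity ids and that one loop iteration stands for one re-render plus one buffer read; `drillPickSketch` is a hypothetical name and none of this is the actual PickFramebuffer code.

```javascript
// Toy model: pixels[i] is a front-to-back stack of entity ids at that pixel.
function drillPickSketch(pixels, scanWholeRectangle) {
  const picked = new Set();
  let passes = 0;
  while (true) {
    passes++; // one pass = one re-render plus one buffer read
    const found = new Set();
    for (const stack of pixels) {
      // Frontmost entity at this pixel that has not been peeled away yet.
      const top = stack.find((id) => !picked.has(id));
      if (top === undefined) continue;
      found.add(top);
      if (!scanWholeRectangle) break; // old behavior: stop at the first hit
    }
    if (found.size === 0) break; // empty layer: nothing left under the rectangle
    for (const id of found) picked.add(id);
  }
  return { picked: [...picked], passes };
}

// 220 entities, none overlapping: one entity per pixel.
const spreadOut = Array.from({ length: 220 }, (_, i) => [i]);
const earlyExit = drillPickSketch(spreadOut, false);
const wholeRectangle = drillPickSketch(spreadOut, true);
console.log(earlyExit.passes, wholeRectangle.passes); // 221 2
```

In this model the early-exit variant needs one pass per entity plus a final empty pass, while scanning the whole rectangle needs one pass to find everything and one more to confirm nothing is underneath.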

> 2. Following from that, does this performance gain come at the cost of some other use case of picking? (like, does it speed up drill picking at the cost of slowing down regular picking?)

No, regular picking still picks a (3x3 by default) rectangle, and stops after finding the first entity (limit=1). Also I didn't remove the spiral code (https://github.com/CesiumGS/cesium/blob/ce6c3e28fc7e550b32a5c67335aa4490d0e84c34/packages/engine/Source/Scene/PickFramebuffer.js#L82C3-L86C83), so that also still works the same.

> 3. In the example sandcastle you linked under the testing section, it only picks 40 pins (compared to the 220 in your measurements). What do I change to increase it to 220?

It probably has to do with screen size. You could try resizing the viewer or altering the sandcastle so that the pickRectangle is the entire canvas, but that would be an even more unfair comparison.

@Beilinson
Contributor Author

Beilinson commented Sep 26, 2025

Essentially this PR has two separate (not necessarily exclusive) improvements:

  1. The ~75% speedup in drillPicking a small area (3x3) with many overlapping entities (due to the Map switch)
  2. A more significant speedup when picking a large area with many spread-out, non-overlapping entities (at best from O(N) to O(1), where N is the number of entities)

To be precise, improvement 2 goes from O(N) to O(N2), where N2 is the maximum number of fully overlapping entities at the given zoom and render resolution.

@mzschwartz5
Contributor

mzschwartz5 commented Sep 28, 2025

@Beilinson

> No, regular picking still picks a (3x3 by default) rectangle, and stops after finding the first entity (limit=1).

Got it. So the behavior of regular picking is exactly the same as before this PR, because it uses limit=1. Perfect.

I guess the natural follow up is then, does this slow down the performance of regular 3x3 drill picking, where we most likely do want 1 result at a time? (I need to go over the PR again, and maybe I'll figure out the answer myself, but good to document it anyway).


Took another pass, still interested in your response to the above question, but otherwise I think this is looking pretty good barring those few style-related discussions.

@Beilinson
Contributor Author

Beilinson commented Sep 29, 2025

@mzschwartz5

> I guess the natural follow up is then, does this slow down the performance of regular 3x3 drill picking, where we most likely do want 1 result at a time? (I need to go over the PR again, and maybe I'll figure out the answer myself, but good to document it anyway).

It's the exact same code flow as before, because it returns early after finding one result (the limit <= 0 break in PickFramebuffer.prototype.end), so the performance should be the same as before.

@mzschwartz5
Contributor

@Beilinson

> It's the exact same code flow as before, because it returns early after finding one result (the limit <= 0 break in PickFramebuffer.prototype.end), so the performance should be the same as before.

No, I don't think this is true. Take this example: I'm doing a 10x10 drill pick, on a region of the screen where 5 entities are overlapped, and I've set the limit to 5. On each pick pass, now, even once we've found 1 entity, PickFramebuffer.prototype.end will continue to search through the whole 10x10 region because it hasn't hit the limit on this pass - even though there's nothing left to find on this layer. Does that sound right to you? (I'm no expert on picking but that's my understanding of how this works)
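The example in this comment can be put into numbers with a small counting sketch. It assumes that every pixel of the 10x10 region is covered by the same entity on each layer and that the loop stops once the limit of 5 is reached, so there is no trailing empty pass; `countPixelChecks` is a hypothetical helper, not Cesium code.

```javascript
// Counts how many pixels are inspected across all passes in the toy scenario.
function countPixelChecks(pixelCount, stackedEntities, scanWholeRectangle) {
  let checks = 0;
  for (let pass = 0; pass < stackedEntities; pass++) {
    for (let pixel = 0; pixel < pixelCount; pixel++) {
      checks++;
      // Early-return behavior: the first pixel already hits this layer's entity.
      if (!scanWholeRectangle) break;
    }
  }
  return checks;
}

const earlyReturnChecks = countPixelChecks(100, 5, false);
const fullScanChecks = countPixelChecks(100, 5, true);
console.log(earlyReturnChecks, fullScanChecks); // 5 500
```

The full scan does 100 times more pixel checks in this worst case, although each check is cheap compared with the re-render and buffer read that every pass needs in both variants.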

Essentially, this PR is a tradeoff - better performance for large pick rectangles when entities are non-overlapping, at the same depth, but worse performance when they are overlapping at various depths. Is the tradeoff worth it? We might need to discuss. The primary use-case of drill picking is to pick objects that are overlapping each other at various depth layers.

So iff this has a significant performance impact on large-rectangle drill picks containing overlapping objects, we may want to consider other options. (like maybe a new picking API specifically for this type of picking?).

@mzschwartz5
Contributor

mzschwartz5 commented Sep 29, 2025

Also, won't this change which entities get picked? The old method prioritizes entities closest to the mouse position, depth-first if you will. The new method prioritizes breadth-first.

This is probably more important of a question than the performance one above, as I assume that, overall, performance is probably better all-around given the other changes (map, rgb conversion)

@Beilinson
Contributor Author

Ah, sorry, I misread your question. Yes, you're right: this PR causes a breadth-first instead of a depth-first search.

Regarding the fact that we now iterate over empty pixels after already finding an entity rather than returning early, I took the liberty of benchmarking the iteration speed before and after my Map + byte conversion improvements.

New sandbox: in the new example, I click directly on the entities, which are all 100% overlapping.

Main:
Picking between 469.0,210.0 with a 11x11 rectangle
Picked 400 pins in 471ms.

Color + Map improvements:
Picking between 451.2,186.6 with a 11x11 rectangle
Picked 400 pins in 358ms.

Color+Map+Iterate over all pixels:
Picking between 452.0,203.4 with a 11x11 rectangle
Picked 400 pins in 367ms.

I benchmarked all of these several times. The numbers weren't 100% consistent, obviously, but these were the average first-click times.

As a user of both scenarios, I feel that a 10ms slowdown for a pick of 400 entities (which isn't felt, because this PR also gives a 100ms speedup) is worth it in exchange for an O(N) improvement over non-overlapping entities. Iterating over 100 pixels really isn't all that slow when all the iteration does is check a map.

@Beilinson
Contributor Author

> Also, won't this change which entities get picked? The old method prioritizes entities closest to the mouse position, depth-first if you will. The new method prioritizes breadth-first.
>
> This is probably more important of a question than the performance one above, as I assume that, overall, performance is probably better all-around given the other changes (map, rgb conversion)

Agreed, this is 100% a more important question. For a rectangle of any size, given no limit, the result will be the same (in a different order, but for large picks that's an improvement). For small picks (3x3) the result is almost certainly the same as before, because I still iterate from the center outwards, meaning that any pixels belonging to outer entities are probably from entities hidden by the central entity.

Given a large rectangle, and a small limit, and a mix of overlapping and non-overlapping entities, then the results will be different - drillPick will now prioritize visible entities over completely hidden ones.
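A toy model makes the ordering difference concrete. It assumes pixels are listed in scan order (center first) and that each pixel holds a front-to-back stack of entity ids; `drillPickWithLimit` and the entity names are hypothetical and do not come from the Cesium source.

```javascript
// pixels are in scan order (center first); each is a front-to-back stack of ids.
function drillPickWithLimit(pixels, limit, scanWholeRectangle) {
  const picked = [];
  while (picked.length < limit) {
    // "Render" one layer: the frontmost not-yet-picked entity at every pixel.
    const frame = pixels.map((stack) => stack.find((id) => !picked.includes(id)));
    const countBefore = picked.length;
    for (const id of frame) {
      if (id === undefined || picked.includes(id)) continue;
      picked.push(id);
      if (!scanWholeRectangle || picked.length >= limit) break;
    }
    if (picked.length === countBefore) break; // empty layer: nothing left to pick
  }
  return picked;
}

// A is visible at the center, B is completely hidden under A, C is visible elsewhere.
const scene = [["A", "B"], ["A", "B"], ["C"]];
const depthFirst = drillPickWithLimit(scene, 2, false);
const breadthFirst = drillPickWithLimit(scene, 2, true);
const unlimited = drillPickWithLimit(scene, Infinity, true);
console.log(depthFirst, breadthFirst, unlimited);
// [ 'A', 'B' ] [ 'A', 'C' ] [ 'A', 'C', 'B' ]
```

With a limit of 2, the early-return variant picks the hidden entity B while the whole-rectangle variant picks the visible entity C. Without a limit, both variants return the same set in a different order.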

From my point of view this is a matter of intention: as users, we use drillPick with the default 3x3 to pick things directly under the mouse. The use of drillPick with a large rectangle is to drag select, in which case, if we have a limit (which we do), we prefer to highlight the visible entities over the hidden ones.

If there is a use case for drill picking a large rectangle while still preferring a depth-first search, I am open to adding a flag that will return early after 1 entity is found. I do think breadth-first should be the default because of its improved performance. If needed, I could add a warning in the changelog.

@mzschwartz5
Contributor

That's a pretty fair assessment on both fronts. I agree about the performance changes being a non-issue.

I'm going to seek input from my teammates on the selection ordering; I think it could be OK, and may just warrant a note in the change log as you said.

@Beilinson
Contributor Author

I found #3018 by chance while looking through issues related to performance and memory leaks, and realized that this PR implements the transition to Map mentioned there, so I think this could close it.

@mzschwartz5
Contributor

Hey @Beilinson,

Team didn't have strong opinions - I think let's just add a small Breaking Change note to the CHANGES.md about the picking order change.

Otherwise this is good to go. I'm out tomorrow and Wednesday, but will merge on Thursday (with the CHANGES.md note). Thanks for the work!

@Beilinson
Contributor Author

Great @mzschwartz5!

I've updated CHANGES.md and will be available Thursday if needed.

@Beilinson Beilinson requested a review from mzschwartz5 October 7, 2025 09:40
@Beilinson Beilinson requested a review from mzschwartz5 October 9, 2025 16:32
@mzschwartz5
Contributor

@Beilinson Looks like CI is failing with some JS doc-related errors... :/

 ERROR: Unable to parse a tag's type expression for source file /home/runner/work/cesium/cesium/packages/engine/Source/Scene/Picking.js in line 791 with tag title "param" and text "{(limit: number) => object[]} pickCallback Pick callback to execute each iteration": Invalid type expression "(limit: number) => object[]": Expected "$", "'", ".", "0", "@", "\"", "\\", "_", "break", "case", "catch", "class", "const", "continue", "debugger", "default", "delete", "do", "else", "enum", "export", "extends", "finally", "for", "function", "if", "implements", "import", "in", "instanceof", "interface", "let", "new", "package", "private", "protected", "public", "return", "static", "super", "switch", "this", "throw", "try", "typeof", "var", "void", "while", "with", "yield", Unicode letter number, Unicode lowercase letter, Unicode modifier letter, Unicode other letter, Unicode titlecase letter, Unicode uppercase letter, or [1-9] but " " found.
ERROR: Unable to parse a tag's type expression for source file /home/runner/work/cesium/cesium/packages/engine/Source/Scene/Picking.js in line 797 with tag title "param" and text "{(limit: number) => object[]} pickCallback Pick callback to execute each iteration": Invalid type expression "(limit: number) => object[]": Expected "$", "'", ".", "0", "@", "\"", "\\", "_", "break", "case", "catch", "class", "const", "continue", "debugger", "default", "delete", "do", "else", "enum", "export", "extends", "finally", "for", "function", "if", "implements", "import", "in", "instanceof", "interface", "let", "new", "package", "private", "protected", "public", "return", "static", "super", "switch", "this", "throw", "try", "typeof", "var", "void", "while", "with", "yield", Unicode letter number, Unicode lowercase letter, Unicode modifier letter, Unicode other letter, Unicode titlecase letter, Unicode uppercase letter, or [1-9] but " " found.
[16:34:25] 'buildTs' errored after 45 s
[16:34:25] Error: Command failed: npx jsdoc --configure packages/engine/tsd-conf.json

Can you look into this? You should be able to run the npm run build-docs command locally to verify if you've solved the issue, I believe.
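The error text above shows the JSDoc type parser rejecting the TypeScript-style arrow type `(limit: number) => object[]`. One form that JSDoc's parser is generally understood to accept is the Closure-style function type, sketched below. The function `drillPickSketch` and its body are hypothetical stand-ins, and the thread does not show which annotation the PR finally used.

```javascript
/**
 * Hypothetical stand-in for the documented function in Picking.js.
 *
 * @param {number} limit Maximum number of objects to return.
 * @param {function(number): object[]} pickCallback Pick callback to execute each iteration.
 * @returns {object[]} At most `limit` objects produced by the callback.
 */
function drillPickSketch(limit, pickCallback) {
  return pickCallback(limit).slice(0, limit);
}

const results = drillPickSketch(2, () => [{ id: 1 }, { id: 2 }, { id: 3 }]);
console.log(results.length);
```

An alternative with the same effect would be a named `@callback` typedef referenced from the `@param` tag; either way, running the docs build locally is the way to confirm the annotation parses.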

@Beilinson
Contributor Author

Beilinson commented Oct 9, 2025

Fixing the first tsdoc error now (optional argument before required)

The second issue seems to be due to the TSDOC target library not being es2015, which I assume is a mistake since the gulpfile build step targets es2020.

@mzschwartz5 Correct me if I'm wrong. I could just fix this by removing the jsdoc error complaining about the use of Map, but I don't think this is a good idea as it will be confusing to future developers using this (even though the error is on a completely internal part of cesium)

@Beilinson
Contributor Author

Beilinson commented Oct 9, 2025

This seems to be failing due to a flaky spec. I tested just now and it passes for me:

[two screenshots]

@mzschwartz5
Contributor

Yeah, that one fails pretty frequently... I'm just going to rerun that test.

@mzschwartz5 mzschwartz5 added this pull request to the merge queue Oct 10, 2025
Merged via the queue into CesiumGS:main with commit e3b388d Oct 10, 2025
13 of 16 checks passed
@mzschwartz5
Contributor

All good to go, thanks for the contribution! This is a huge performance upgrade for drill picking.

@Beilinson
Contributor Author

Awesome!


3 participants