@maliozer

Ensure correct order of FPS downsampling and embedding for multi-resolution inputs; fix import location of fps

Summary

This PR fixes the following issues in the PerceiverCrossAttentionEncoder._forward() method:

Correct import location for fps:
The function fps from torch_cluster is now imported directly before it is used in the multi-resolution downsampling block, preventing NameError.

Correct order of downsampling and embedding:
Multi-resolution FPS downsampling of point clouds and associated features (pc, feats, sharp_pc, sharp_feat) is now performed before any embedding or projection. This ensures that only the downsampled tensors are passed through the embedder and input projections, so shapes are always aligned. No redundant recomputation is performed.

Changes

  • Moved the import statement from torch_cluster import fps so that it sits immediately before its use inside the if self.use_multi_reso: block.
  • Downsample input tensors first (if use_multi_reso is enabled), and then perform embedding/projection.
  • Prevents mismatches in shape/dimension and avoids unnecessary recomputation.
  • All logic paths now consistently process only the current (possibly downsampled) batch.
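
The ordering described above can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR: farthest_point_indices is a pure-Python stand-in for torch_cluster.fps, and embed and forward are hypothetical helpers that work on plain lists rather than tensors. The point it shows is that the points and their features are downsampled with the same indices first, and only the reduced set is then embedded, so the two stay aligned.

```python
from math import dist


def farthest_point_indices(points, num_samples):
    """Greedy farthest point sampling (stand-in for torch_cluster.fps).

    Assumes num_samples <= len(points); starts deterministically at index 0.
    """
    selected = [0]
    # Distance from every point to its nearest already-selected point.
    min_dist = [dist(p, points[0]) for p in points]
    while len(selected) < num_samples:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, dist(p, points[nxt])) for d, p in zip(min_dist, points)]
    return selected


def embed(points):
    """Hypothetical embedder: appends the squared norm to each point."""
    return [list(p) + [sum(c * c for c in p)] for p in points]


def forward(pc, feats, use_multi_reso=False, num_samples=None):
    """Downsample first (if enabled), then embed and join with features."""
    if use_multi_reso:
        idx = farthest_point_indices(pc, num_samples)
        # Apply the same indices to points and features to keep them aligned.
        pc = [pc[i] for i in idx]
        feats = [feats[i] for i in idx]
    embedded = embed(pc)  # only the (possibly downsampled) points are embedded
    return [e + list(f) for e, f in zip(embedded, feats)]


pc = [(0, 0, 0), (1, 0, 0), (0, 5, 0), (0.1, 0, 0)]
feats = [[0], [1], [2], [3]]
out = forward(pc, feats, use_multi_reso=True, num_samples=2)
```

If embedding ran before the downsampling step, embedded would hold four rows while feats held two, which is the kind of shape mismatch this PR is meant to avoid.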

Related Issue: #31
