@Andy-Jost
Contributor

@Andy-Jost Andy-Jost commented Jan 12, 2026

Summary

  • Replace exhaustive O(n²) pairwise seed testing with prime-stride sampling
  • Reduces iterations from ~32,640 to ~100-150 while maintaining meaningful coverage
  • Removes the Windows skip marker added in #1456 (Add skipif IS_WINDOWS for test_patterngen_seeds), re-enabling the test on Windows

Closes #1455
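
As a rough illustration of the reduction described above, the sketch below counts the pairs visited by an exhaustive loop over 256 seeds and by a stride-sampled loop. The strides 17 and 19 are taken from the diff and the review session later in this thread; the function names and the exact filter form (`j % stride_j == 0`) are assumptions made for this sketch and may not match the loops in the PR.

```python
from itertools import combinations

N_SEEDS = 256  # seeds 0..255, matching the range(256) loop in the test


def exhaustive_pairs(n=N_SEEDS):
    """Every unordered seed pair (i, j) with i < j."""
    return list(combinations(range(n), 2))


def stride_pairs(n=N_SEEDS, stride_i=17, stride_j=19):
    """Pairs (i, j), i < j, where i is a multiple of stride_i and
    j is a multiple of stride_j (hypothetical helper for illustration)."""
    return [
        (i, j)
        for i in range(0, n, stride_i)
        for j in range(i + 1, n)
        if j % stride_j == 0
    ]


print(len(exhaustive_pairs()))  # 32640
print(len(stride_pairs()))      # 109
```

Because 17 and 19 are coprime, the two loops do not step in lockstep, so the sampled pairs cover a spread of seed differences rather than a fixed offset.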

@Andy-Jost Andy-Jost added the bug (Something isn't working) and cuda.core (Everything related to the cuda.core module) labels Jan 12, 2026
@Andy-Jost Andy-Jost self-assigned this Jan 12, 2026
@Andy-Jost Andy-Jost requested a review from rwgk January 12, 2026 18:36
@copy-pr-bot
Contributor

copy-pr-bot bot commented Jan 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Andy-Jost
Contributor Author

/ok to test c48274e

log("done")


@pytest.mark.skipif(
Contributor Author

I could leave this in if we want to continue skipping this test on Windows. It is not important to test this on every platform IMO.

Collaborator

I'm 100% for removing this.

@Andy-Jost
Contributor Author

The test runs in 0.19s on Linux.

# especially on Windows. See https://github.com/NVIDIA/cuda-python/issues/1455
pgen = PatternGen(device, NBYTES)
- for i in range(256):
+ for i in range(0, 256, 17):
Collaborator

With a stride of 17 starting at 0, this no longer covers seed 1, the usual trouble-maker corner case.

This would sample around the usual suspects (0, the corner case 1, then a few values around powers of 2):

(0, 1, 2, 3, 4, 5, 31, 32, 33, 127, 128, 129)
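
For reference, a tuple like the one above could be generated rather than typed by hand. This is only a sketch: `boundary_seeds` and its parameters are hypothetical names, not part of the PR or of cuda.core.

```python
def boundary_seeds(exponents=(5, 7), small=6):
    """Seeds 0..small-1 plus the neighbours of 2**e for each exponent e.

    Hypothetical helper for illustration only.
    """
    seeds = set(range(small))
    for e in exponents:
        p = 2 ** e
        seeds.update((p - 1, p, p + 1))
    return tuple(sorted(seeds))


print(boundary_seeds())
# (0, 1, 2, 3, 4, 5, 31, 32, 33, 127, 128, 129)
```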

Contributor Author

OK, I included values <5 since those are the most common. I don't think there's anything special about powers of two in this case.
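
A minimal sketch of how such a sampled loop might be wired into a pairwise check. It assumes the test verifies that different seeds produce different patterns, which this thread does not confirm. `fake_pattern` is a stand-in written for this sketch, not the real `PatternGen`, and the buffer size and helper names are likewise hypothetical.

```python
NBYTES = 64  # hypothetical buffer size for this sketch


def fake_pattern(seed, nbytes=NBYTES):
    """Stand-in for a seeded pattern generator; NOT the real PatternGen."""
    return bytes((seed + k) % 256 for k in range(nbytes))


def sampled_seed_pairs(n=256, small=5, stride_i=17, stride_j=19):
    """Yield pairs (i, j) with i < j. Seeds below `small` are always
    included; larger seeds are kept only if they are stride multiples."""
    for i in range(n):
        if i < small or i % stride_i == 0:
            for j in range(i + 1, n):
                if j < small or j % stride_j == 0:
                    yield i, j


def check_seeds_give_distinct_patterns():
    # Assumed property: two different seeds never yield the same pattern.
    for i, j in sampled_seed_pairs():
        assert fake_pattern(i) != fake_pattern(j), (i, j)


check_seeds_give_distinct_patterns()
```

Keeping the small seeds unconditional means the (0, 1) pair is always exercised, whatever strides are chosen for the rest of the range.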

Replace exhaustive O(n²) pairwise seed testing with prime-stride
sampling, reducing iterations from ~32k to ~100 while maintaining
meaningful coverage.

Closes NVIDIA#1455
@Andy-Jost Andy-Jost force-pushed the fix-slow-patterngen-test branch from c48274e to d39f754 Compare January 12, 2026 19:09
@Andy-Jost
Contributor Author

/ok to test d39f754

Collaborator

@rwgk rwgk left a comment

That looks nice!

>>> n = 0
>>> for i in (ii for ii in range(0, 256) if ii < 5 or ii % 17 == 0):
...     js = tuple(jj for jj in range(i + 1, 256) if jj < 5 or jj % 19 == 0)
...     print(i, js)
...     n += len(js)
...
0 (1, 2, 3, 4, 19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
1 (2, 3, 4, 19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
2 (3, 4, 19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
3 (4, 19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
4 (19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
17 (19, 38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
34 (38, 57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
51 (57, 76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
68 (76, 95, 114, 133, 152, 171, 190, 209, 228, 247)
85 (95, 114, 133, 152, 171, 190, 209, 228, 247)
102 (114, 133, 152, 171, 190, 209, 228, 247)
119 (133, 152, 171, 190, 209, 228, 247)
136 (152, 171, 190, 209, 228, 247)
153 (171, 190, 209, 228, 247)
170 (171, 190, 209, 228, 247)
187 (190, 209, 228, 247)
204 (209, 228, 247)
221 (228, 247)
238 (247,)
255 ()
>>> print(n)
171

@Andy-Jost
Contributor Author

/ok to test a4dcee1

@Andy-Jost
Contributor Author

/ok to test d0454f2

Development

Successfully merging this pull request may close these issues.

[BUG]: Extremely slow cuda_core/tests/test_helpers.py::test_patterngen_seeds on Windows

2 participants