TrevorBurnham
Contributor

This blog post about the performance impact of semver on npm installations has been doing the rounds lately: https://marvinh.dev/blog/speeding-up-javascript-ecosystem-part-12/ Per discussion at npm/node-semver#800, there don't seem to be any low-hanging fruit optimizations in semver itself.

One part of the post that stood out to me is that npm frequently calls semver in a way that produces duplicative work. For instance, calling if (semver.valid(version)) followed by semver.parse(version) actually parses the version twice, because semver.valid uses semver.parse under the hood. (There's a built-in cache that prevents some of semver's work from being duplicated, but not all of it.) This could in principle be addressed in semver itself by memoizing semver.parse, but that would be a compatibility-breaking change because the object returned by semver.parse is mutable: if a consumer modified the object, that would affect anyone else who calls semver.parse with the same version.
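
To make the mutability concern concrete, here is a minimal sketch. It does not use the real semver package: `parse` is a hypothetical stand-in that, like semver.parse, returns a fresh mutable object on each call, and `memoizedParse` is an illustrative name for a naive memoized version of it.

```javascript
// Hypothetical stand-in for semver.parse: returns a new mutable object per call.
function parse (version) {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version)
  if (!m) {
    return null
  }
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) }
}

// Naive memoization: every caller asking for the same string shares one object.
const cache = new Map()
function memoizedParse (version) {
  if (!cache.has(version)) {
    cache.set(version, parse(version))
  }
  return cache.get(version)
}

const a = memoizedParse('1.2.3')
a.major = 9 // one consumer mutates its result...

const b = memoizedParse('1.2.3')
// ...and a later, unrelated caller sees the mutation: b.major is 9, not 1.
```

With the unmemoized `parse`, the second caller would get an untouched object, which is why memoizing inside the library would change behavior that existing consumers may rely on.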

With that in mind, this PR adds a cached-semver utility to npm. It's a drop-in replacement for semver that uses caching to ensure that versions and ranges are only parsed once. This should significantly speed up installs, updates, and audits.
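
The general idea can be sketched as follows. This is not the code in this PR: `rawParse`, `cachedParse`, and `cachedValid` are hypothetical names, and the parser is a simplified stand-in that returns plain objects (real semver results are class instances, so copying them would need more care than a spread). The sketch parses each distinct string once and hands out copies so that callers cannot corrupt the cache.

```javascript
// Simplified stand-in parser with a counter to show how often parsing happens.
let parseCount = 0
function rawParse (version) {
  parseCount++
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(String(version).trim())
  if (!m) {
    return null
  }
  const [major, minor, patch] = [Number(m[1]), Number(m[2]), Number(m[3])]
  return { major, minor, patch, version: `${major}.${minor}.${patch}` }
}

const parseCache = new Map()

// Parse each distinct input once; return a shallow copy on every call
// so that a caller mutating its result does not affect other callers.
function cachedParse (version) {
  if (!parseCache.has(version)) {
    parseCache.set(version, rawParse(version))
  }
  const hit = parseCache.get(version)
  return hit === null ? null : { ...hit }
}

// Like semver.valid, returns the normalized version string or null,
// but reuses the cached parse instead of parsing again.
function cachedValid (version) {
  const hit = cachedParse(version)
  return hit === null ? null : hit.version
}

// The valid-then-parse pattern now triggers only one real parse.
if (cachedValid('1.2.3')) {
  cachedParse('1.2.3')
}
```

Returning copies trades a small allocation per call for safety against shared mutation; a real implementation might make a different tradeoff.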

@TrevorBurnham requested a review from a team as a code owner on September 1, 2025 17:19
@TrevorBurnham force-pushed the cache-semver branch 3 times, most recently from 8ef564f to 50d2403, on September 1, 2025 20:03
@TrevorBurnham changed the title from "perf: cache semver calls during dependency resolution" to "feat: cache semver calls during dependency resolution" on Sep 1, 2025
@TrevorBurnham force-pushed the cache-semver branch 2 times, most recently from 833cf7d to ff00c7d, on September 1, 2025 20:47
@wraithgar
Member

Can we do some benchmarks on this? It's adding quite a bit of complexity so would need to represent a measurable increase in performance to be worth it.

@TrevorBurnham
Contributor Author

Sure. I noticed that the built-in arborist benchmark script isn't operative (#8546). I tried restoring it and found a ~1% speedup for loadActual.

I agree that there's a complexity tradeoff here. One thing you could do is extract this logic to a separate package (call it @npmcli/cached-semver, say) and use that as a drop-in replacement for the semver dependency. That way the complexity would be the same from this package's perspective, and the performance benefits of the cache would be applied consistently.
