⚡️ Speed up method MoeWNA16Config.from_config by 28%
#329
+6
−9
📄 28% (0.28x) speedup for MoeWNA16Config.from_config in python/sglang/srt/layers/quantization/moe_wna16.py
⏱️ Runtime: 385 microseconds → 302 microseconds (best of 39 runs)
📝 Explanation and details
Explanation of Optimizations
base_config.py
- Changed the packed_modules_mapping initialization from dict() to {}. This is a micro-optimization: the {} literal is faster than calling dict() to create an empty dict.
- Rewrote get_from_keys using a generator expression: return next((config[k] for k in keys if k in config), ...), with a default sentinel that triggers a ValueError when no key matches. This reduces average-case lookup time when the key is found early.

moe_wna16.py
- Moved the modules_to_not_convert assignment outside the conditional, since the logic for setting it to [] or a default value was repeated.
- Removed the duplicate modules_to_not_convert assignment, reducing a small overhead and slightly improving code clarity.
- Simplified the modules_to_not_convert assignment: instead of an if/else None-check, it uses or to fall back to an empty list, saving a branch at runtime.

✅ Correctness verification report:
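As a rough illustration of what equivalence means for the two rewrites described in the explanation, the sketch below compares a loop-based lookup with the next()-plus-sentinel form, and an if/else None-check with the or fallback. It is not the PR's actual code or its generated tests: the function names, the _MISSING sentinel, the error message, and the sample config keys are all hypothetical stand-ins.

```python
from typing import Any, List, Mapping, Optional, Sequence

# Hypothetical sentinel; the name used in the real code base may differ.
_MISSING = object()


def get_from_keys_loop(config: Mapping[str, Any], keys: Sequence[str]) -> Any:
    """Loop form: return the value of the first key present in config."""
    for key in keys:
        if key in config:
            return config[key]
    raise ValueError(f"Cannot find any of {list(keys)} in the config.")


def get_from_keys_next(config: Mapping[str, Any], keys: Sequence[str]) -> Any:
    """Generator form: next() stops at the first match or yields the sentinel."""
    value = next((config[k] for k in keys if k in config), _MISSING)
    if value is _MISSING:
        raise ValueError(f"Cannot find any of {list(keys)} in the config.")
    return value


def modules_if_else(value: Optional[List[str]]) -> List[str]:
    """Explicit None-check, as in the pre-change description."""
    return [] if value is None else value


def modules_or(value: Optional[List[str]]) -> List[str]:
    """Fallback via `or`; also replaces any other falsy value with []."""
    return value or []


# Illustrative config; these keys are assumptions, not taken from the PR.
sample_config = {"w_bit": 4, "group_size": 128}
first_match_loop = get_from_keys_loop(sample_config, ["bits", "w_bit"])
first_match_next = get_from_keys_next(sample_config, ["bits", "w_bit"])
```

One design point worth keeping in mind: value or [] matches the None-check for None and for non-empty lists, but it also swaps any other falsy input (an empty list or tuple, for example) for a fresh empty list. That yields an equal value for list inputs, though not necessarily the same object.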
🌀 Generated Regression Tests and Runtime
To edit these changes, run git checkout codeflash/optimize-MoeWNA16Config.from_config-mhoy795d and push.