[OpenVINO] Update InferRequestWrapper to collect samples taking into account state of stateful models #1505
…ful OV models (#3714)

### Changes
Added `nncf.definitions.NNCF_DATASET_RESET_STATE_KEY` constant to specify when to reset model state. This constant is used by OpenVINO backend to control resetting of internal model state between model inferences. This key can be added to a dataset sample input dictionary with either `True` or `False` value. With `True` value, the model state will be reset before inference on the corresponding sample, and with `False` the state will not be reset. For an example of usage please see huggingface/optimum-intel#1505.

### Reason for changes
Without this logic static quantization quality of stateful Whisper models is poor because a state of a stateful model must be cleared with the same schedule as it is done during calibration input data collection.

### Related tickets
172705

### Tests
Added `tests/openvino/native/test_engine.py::test_stateful_model_inference_with_controlled_resetting`.
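The behaviour described in this commit message can be illustrated with a minimal sketch of an inference loop that honours a per-sample reset flag. This is not NNCF's engine code: `ToyStatefulModel` and `run_calibration` are hypothetical, the key is given a placeholder value rather than imported from `nncf.definitions`, and treating a missing key as "do not reset" is an assumption made only for this example.

```python
from typing import Any, Dict, Iterable, List

# Placeholder value for illustration only; real code would import
# nncf.definitions.NNCF_DATASET_RESET_STATE_KEY instead of defining it.
RESET_STATE_KEY = "__reset_state__"


class ToyStatefulModel:
    """Hypothetical stand-in for a stateful model: keeps a running sum."""

    def __init__(self) -> None:
        self.state = 0.0

    def reset_state(self) -> None:
        self.state = 0.0

    def infer(self, inputs: Dict[str, float]) -> float:
        self.state += inputs["x"]
        return self.state


def run_calibration(model: ToyStatefulModel, samples: Iterable[Dict[str, Any]]) -> List[float]:
    """Run every sample, resetting state first when the sample asks for it."""
    outputs = []
    for sample in samples:
        sample = dict(sample)  # copy so the caller's dict is not mutated
        # Remove the control key so it is never passed to the model as an input.
        # Assumption of this sketch: a missing key means "do not reset".
        if sample.pop(RESET_STATE_KEY, False):
            model.reset_state()
        outputs.append(model.infer(sample))
    return outputs


samples = [
    {"x": 1.0, RESET_STATE_KEY: True},
    {"x": 2.0, RESET_STATE_KEY: False},
    {"x": 5.0, RESET_STATE_KEY: True},
]
outputs = run_calibration(ToyStatefulModel(), samples)
```

Because the third sample carries `True`, the running sum restarts there instead of continuing from the second sample.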
Pull request overview
This PR enhances the InferRequestWrapper class to properly handle state management for stateful OpenVINO models during quantization calibration. The changes ensure that state reset operations are tracked and communicated to NNCF's calibration process, which is critical for accurate quantization of stateful models like Whisper.
Key changes:
- Added state tracking mechanism to detect and record when model state is reset
- Modified input collection to include state reset information for NNCF calibration (version 2.19+)
- Implemented `reset_state()` method wrapper to track state reset calls
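The tracking mechanism listed above could look roughly like the following sketch. It is not the actual `InferRequestWrapper` from this PR: `FakeInferRequest` and `TrackingRequestWrapper` are hypothetical names, the key uses a placeholder value, and the real wrapper handles more entry points than the single `infer` method shown here.

```python
from typing import Any, Dict, List

RESET_STATE_KEY = "__reset_state__"  # placeholder for the NNCF constant


class FakeInferRequest:
    """Hypothetical stand-in for an infer request with internal state."""

    def __init__(self) -> None:
        self.state = 0.0

    def reset_state(self) -> None:
        self.state = 0.0

    def infer(self, inputs: Dict[str, float]) -> float:
        self.state += inputs["x"]
        return self.state


class TrackingRequestWrapper:
    """Records each input together with whether a reset preceded it."""

    def __init__(self, request: FakeInferRequest, collected_inputs: List[Dict[str, Any]]) -> None:
        self.request = request
        self.collected_inputs = collected_inputs
        self._reset_pending = False

    def reset_state(self) -> None:
        # Remember the reset so the next collected sample can be flagged.
        self._reset_pending = True
        self.request.reset_state()

    def infer(self, inputs: Dict[str, float]) -> float:
        sample: Dict[str, Any] = dict(inputs)
        sample[RESET_STATE_KEY] = self._reset_pending
        self.collected_inputs.append(sample)
        self._reset_pending = False
        # The wrapped request receives only the original inputs.
        return self.request.infer(inputs)


collected: List[Dict[str, Any]] = []
wrapper = TrackingRequestWrapper(FakeInferRequest(), collected)
wrapper.reset_state()
wrapper.infer({"x": 1.0})
wrapper.infer({"x": 2.0})
wrapper.reset_state()
wrapper.infer({"x": 5.0})
reset_flags = [sample[RESET_STATE_KEY] for sample in collected]
```

The flag is stored per sample rather than as a separate list so that the reset schedule stays attached to the data if the collected samples are later filtered or reordered.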
76a11e5 to 03ce84c
echarlaix
left a comment
LGTM!
echarlaix
left a comment
Thanks for iterating, feel free to merge!
What does this PR do?
Changes
Add an extra value to the input dict when wrapping the infer request of a stateful model. This feature is not part of the latest NNCF v2.19 release and depends on the NNCF develop version.
Reason for changes
Without this change, quantization quality of stateful Whisper models is poor, because during calibration the state of a stateful model must be cleared on the same schedule as it was during calibration input data collection.
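A toy illustration of why the schedule matters, assuming a hypothetical model whose state is a running sum of its inputs: replaying the same inputs with a different reset schedule produces different intermediate values, so the value ranges observed during calibration no longer match those seen when the inputs were collected. The numbers below are made up for the example.

```python
from typing import List


def replay(xs: List[float], reset_flags: List[bool]) -> List[float]:
    """Replay inputs through a running-sum state, resetting where flagged."""
    state = 0.0
    seen = []
    for x, reset in zip(xs, reset_flags):
        if reset:
            state = 0.0
        state += x
        seen.append(state)
    return seen


xs = [1.0, 2.0, 5.0, 1.0]
recorded_schedule = [True, False, True, False]  # schedule used during collection

matching = replay(xs, recorded_schedule)
never_reset = replay(xs, [False] * len(xs))
always_reset = replay(xs, [True] * len(xs))

# The largest value observed differs between schedules.
observed_max = {
    "matching": max(matching),
    "never_reset": max(never_reset),
    "always_reset": max(always_reset),
}
```

Only the replay that follows the recorded schedule reproduces the values seen at collection time; the other two overestimate or underestimate the range.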
Ticket CVS-172705
Before submitting