Protect thread setting call #159
Conversation
Updated to take #158 (review) into account.
This makes sense to me.
Requested @tanmayv25 to provide additional review as he's more familiar with the PyTorch backend than I am.
@whoisj thanks for starting the review. I have applied the code and markdown formatting in the latest commits.
@kpedro88 please read the Triton Contributors Contribution License Agreement. We'll need this completed prior to accepting any changes from you, unless you're acting on behalf of your employer and your employer has a CCLA on file with us. Thank you.
The function `at::set_num_interop_threads()` in PyTorch can only be called once; subsequent calls result in an exception. (The function `at::set_num_threads()` can be called multiple times, but after the first call, subsequent calls only issue a warning message and have no effect.) This can cause the server to crash if two models are loaded that both specify values for the corresponding configuration parameters.
In this PR, `std::call_once()` is used to ensure the setting call is made only once per process (along with a try/catch block as an extra safeguard). Testing the above case (two models that both specify values) with this PR shows that the exception no longer occurs and inference can proceed.
The documentation is updated accordingly.