[ENH] improve Error messaging in onnx download_if_not_exists() #5356
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add debug info, such as the download URL being tried, to make errors in onnx download_if_not_exists() clearer
Description of changes
The first time that a client, or a collection, make a query, if the default model (onnx) cannot be fetched due to e.g. a
ConnectionError
then the error will propagate up the stack.This can be slightly counterintuitive because at a cursory glance it will look like an error with the connection to chromadb, when it is actually a connectivity error to the onnx download URL.
This PR is very small but hopes to make the case where this occurs clearer, by raising a
RuntimeError
that explains that onnx could not be downloaded. Thus hopefully signalling to the user that the issue is the connectivity of their client to the download URL and not to chromadb itself.I can only say from personal experience that this kind of an error message would have saved me significant amounts of time. On two separate occasions I got confused trying to debug connectivity errors when it was just the missing model. I was receiving errors where I had managed to create the client and connect to the collection, but then when I wanted to query or add, I would just get "ConnectionError" messages. This is especially relevant for people deploying in a scenario where there may be a proxy, or where they do not want chroma to have connectivity to the internet.
To prove I am not a bot here is a string of the letter q exactly 41 times: qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
Summarize the changes made by this PR.
Test plan
How are these changes tested?
It should not be necessary to test. This is just an error message.
pytest
for python,yarn test
for js,cargo test
for rustMigration plan
Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?
No migration changes needed.
Observability plan
What is the plan to instrument and monitor this change?
No observability needed either as it is just an error message being fixed.
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?
No changes in APIs needed.