fix(bedrock): Add prompt caching support for Converse API #3390
Conversation
Caution

Changes requested ❌

Reviewed everything up to 4fa3792 in 2 minutes and 2 seconds.
- Reviewed 156 lines of code in 3 files.
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments; they appear below.
1. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/__init__.py:359
   - Draft comment: Good integration of prompt_caching_converse_handling in _handle_converse. In the streaming handler (around line 400), if both read and write tokens are present, the span attribute may be overwritten; ensure this is the intended behavior.
   - Reason this comment was not posted: Comment was on unchanged code.
2. packages/opentelemetry-instrumentation-bedrock/tests/metrics/test_bedrock_converse_prompt_caching_metrics.py:56
   - Draft comment: The test correctly validates prompt caching metrics for the Converse API. The cumulative workaround for metric values indicates the underlying counter is cumulative. Consider resetting metrics between tests to avoid cross-test interference if possible.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 10% vs. threshold = 50%). The comment has two parts: an observation about the cumulative nature of the metrics, which is already documented in the code comments, and a speculative suggestion about resetting metrics. The suggestion could be valid if there were evidence of cross-test interference, but none is present; the current workaround is intentional, documented, and functional. Since the comment identifies no actual problem or required change, it was deleted as primarily observational.
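The "cumulative workaround" the draft comment refers to is a common pattern with OpenTelemetry-style counters: a Counter only accumulates, so a test that runs after other tests must assert against the running total rather than the per-call increment. The sketch below illustrates the idea with a toy counter; it is an illustration of the pattern, not the project's actual test code.

```python
class CumulativeCounter:
    """Toy stand-in for a metrics counter that only accumulates."""

    def __init__(self):
        self.value = 0

    def add(self, amount: int):
        self.value += amount


counter = CumulativeCounter()

# A first test records 10 cached tokens...
counter.add(10)
assert counter.value == 10

# ...a later test records 5 more, but a cumulative reader still reports
# the running total, so assertions must account for earlier tests.
counter.add(5)
assert counter.value == 15  # not 5
```

This is why the test file computes expected metric values cumulatively instead of expecting each call's delta in isolation.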
…nstrumentation/bedrock/prompt_caching.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
hey @AlanPonnachan - looks like tests are failing, can you take a look?
….com/AlanPonnachan/openllmetry into feat-bedrock-converse-prompt-caching
Hi @nirga, I've resolved the lint test failures. The remaining failing test requires a recorded API response, and as I don't have access to an active AWS account, I'm unable to generate the cassette it needs. Thanks for your help!
Sure @AlanPonnachan, will do it - can you fix the small comment I wrote? I'll then run it locally and record a test. BTW - if you can rely on existing converse tests it might be easier
Thanks for the great suggestion and for your willingness to help record the test! I agree that relying on an existing test is a cleaner approach. Before I push the changes, I just want to confirm my plan sounds good to you: I'll extend an existing converse test so its cassette can cover the new prompt caching assertions. Does this plan look good? If so, I'll go ahead and make the changes.
Description
This PR introduces prompt caching telemetry for the AWS Bedrock Converse and Converse Stream APIs, bringing feature parity with the existing `invoke_model` instrumentation.

The Converse API reports caching information in the `usage` field of the response body, rather than through HTTP headers. This implementation adds the necessary logic to parse this information and record it as metrics and span attributes.

Changes include:
- Added `prompt_caching_converse_handling` in `prompt_caching.py` to extract `cache_read_input_tokens` and `cache_creation_input_tokens` from the response body.
- `__init__.py`: The new function is now called from `_handle_converse` and `_handle_converse_stream` to process caching data for both standard and streaming calls.
- Added `test_bedrock_converse_prompt_caching_metrics.py` to validate that the `gen_ai.prompt.caching` metric is correctly emitted for the Converse API.

Fixes #3337
Important

Adds prompt caching telemetry for AWS Bedrock Converse APIs, including a new function for caching data extraction and corresponding tests.
- Added `prompt_caching_converse_handling` in `prompt_caching.py` to extract caching data from the Converse API response body.
- Integrated `prompt_caching_converse_handling` into `_handle_converse` and `_handle_converse_stream` in `__init__.py`.
- Added `test_bedrock_converse_prompt_caching_metrics.py` to validate `gen_ai.prompt.caching` metric emission for the Converse API.