
Conversation

@AlanPonnachan commented Sep 19, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Description

This PR introduces prompt caching telemetry for the AWS Bedrock Converse and Converse Stream APIs, bringing feature parity with the existing invoke_model instrumentation.

The Converse API reports caching information in the usage field of the response body, rather than through HTTP headers. This implementation adds the necessary logic to parse this information and record it as metrics and span attributes.
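
For illustration, here is a minimal sketch of how such a helper might pull caching counts out of the response body. The helper name comes from this PR; the response field names, span-attribute keys, counter, and parameters below are assumptions, not the actual implementation:

```python
# Rough sketch only, not the PR's actual code. Field names such as
# "cacheReadInputTokens"/"cacheWriteInputTokens", the span-attribute keys,
# and the counter/attribute wiring are assumptions.
def prompt_caching_converse_handling(response, span, prompt_caching_counter, base_attrs):
    usage = response.get("usage", {})
    cache_read = usage.get("cacheReadInputTokens", 0) or 0
    cache_write = usage.get("cacheWriteInputTokens", 0) or 0

    if cache_read:
        span.set_attribute("gen_ai.usage.cache_read_input_tokens", cache_read)
        prompt_caching_counter.add(cache_read, attributes={**base_attrs, "type": "read"})
    if cache_write:
        span.set_attribute("gen_ai.usage.cache_creation_input_tokens", cache_write)
        prompt_caching_counter.add(cache_write, attributes={**base_attrs, "type": "write"})
```

Both _handle_converse and _handle_converse_stream would then call a helper like this with the parsed response body plus whatever span and counter context the instrumentation already carries.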

Changes include:

  1. New function prompt_caching_converse_handling in prompt_caching.py to extract cache_read_input_tokens and cache_creation_input_tokens from the response body.
  2. Integration into __init__.py: The new function is now called from _handle_converse and _handle_converse_stream to process caching data for both standard and streaming calls.
  3. New Test File: Added test_bedrock_converse_prompt_caching_metrics.py to validate that the gen_ai.prompt.caching metric is correctly emitted for the Converse API.

Fixes #3337
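
For a sense of what the metric assertion in such a test might check, here is a rough sketch using OpenTelemetry's in-memory metric reader. The gen_ai.prompt.caching name is from this PR; the reader wiring, the find_metric helper, and the surrounding setup are illustrative assumptions:

```python
# Hypothetical test sketch; assumes the Bedrock instrumentation is wired to
# this MeterProvider before the converse() call is made.
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import InMemoryMetricReader

reader = InMemoryMetricReader()
meter_provider = MeterProvider(metric_readers=[reader])
# ... instrument Bedrock with meter_provider and run a brt.converse(...) call ...

def find_metric(reader, name):
    # Walk resource -> scope -> metric looking for the named instrument.
    data = reader.get_metrics_data()
    for rm in (data.resource_metrics if data else []):
        for scope in rm.scope_metrics:
            for metric in scope.metrics:
                if metric.name == name:
                    return metric
    return None

caching = find_metric(reader, "gen_ai.prompt.caching")
assert caching is not None
assert sum(dp.value for dp in caching.data.data_points) > 0
```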


Important

Adds prompt caching telemetry for AWS Bedrock Converse APIs, including new function for caching data extraction and corresponding tests.

  • Behavior:
    • Adds prompt_caching_converse_handling in prompt_caching.py to extract caching data from Converse API response body.
    • Integrates prompt_caching_converse_handling into _handle_converse and _handle_converse_stream in __init__.py.
  • Testing:
    • Adds test_bedrock_converse_prompt_caching_metrics.py to validate gen_ai.prompt.caching metric emission for Converse API.

This description was created by Ellipsis for 4fa3792.

@CLAassistant commented Sep 19, 2025

CLA assistant check
All committers have signed the CLA.

@ellipsis-dev bot (Contributor) left a comment

Caution

Changes requested ❌

Reviewed everything up to 4fa3792 in 2 minutes and 2 seconds.
  • Reviewed 156 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/__init__.py:359
  • Draft comment:
    Good integration of prompt_caching_converse_handling in _handle_converse. In the streaming handler (lines ~400), note that if both read and write tokens are present, the span attribute may be overwritten. Ensure this is the intended behavior.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. packages/opentelemetry-instrumentation-bedrock/tests/metrics/test_bedrock_converse_prompt_caching_metrics.py:56
  • Draft comment:
    The test correctly validates prompt caching metrics for Converse API. The cumulative workaround for metric values indicates the underlying counter is cumulative. Consider resetting metrics between tests to avoid cross-test interference if possible.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The comment has two parts: 1) an observation about the cumulative nature of the metrics, which is already documented in the code comments, and 2) a speculative suggestion about resetting metrics that isn't clearly necessary since the current approach works. The comment doesn't identify any actual problems or required changes. The suggestion about resetting metrics could be valid if there were evidence of cross-test interference, but we don't see any such evidence. The current workaround seems intentional and functional. Since the current approach is working and documented, and there's no evidence of actual problems, the suggestion is more speculative than necessary. Delete the comment as it's primarily informative/observational and makes a speculative suggestion without clear evidence of need for change.

Workflow ID: wflow_bNUeXv3pUdPPxbhz


…nstrumentation/bedrock/prompt_caching.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
@nirga changed the title from feat(bedrock): Add prompt caching support for Converse API to fix(bedrock): Add prompt caching support for Converse API on Sep 19, 2025
@nirga (Member) left a comment

hey @AlanPonnachan - looks like tests are failing, can you take a look?

@AlanPonnachan (Author) commented

Hi @nirga

I’ve resolved the lint test failures. The remaining failing test, test_prompt_cache_converse, is expected since it requires a VCR cassette to be recorded.

As I don’t have access to an active AWS account, I’m unable to generate the test_prompt_cache_converse.yaml cassette file myself. Would you be able to check out this branch, run the test and push the generated cassette file to this PR?

Thanks for your help!

@nirga (Member) left a comment

Sure @AlanPonnachan, will do it - can you fix the small comment I wrote? I'll then run it locally and record a test. BTW - if you can rely on existing converse tests it might be easier

@AlanPonnachan (Author) commented

Thanks for the great suggestion and for your willingness to help record the test!

I agree that relying on an existing test is a cleaner approach. Before I push the changes, I just want to confirm my plan sounds good to you.

Here is what I am planning to do:

  1. Modify the Existing Test: I will update the test_titan_converse function in tests/traces/test_titan.py.
  2. Enable Caching: I'll add additionalModelRequestFields with cacheControl to the existing brt.converse API call (see the rough sketch after this list).
  3. Test Both Scenarios: I will add a second brt.converse call within that same test to ensure we cover both the initial "cache-write" and the subsequent "cache-read".
  4. Add Assertions: I will add the metric assertions I wrote to validate that the prompt caching counters are working correctly.
  5. Clean Up: Finally, I will delete the new test file I originally created (test_bedrock_converse_prompt_caching_metrics.py).
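
For concreteness, a rough sketch of what steps 2–4 might look like. The shape of the cacheControl payload under additionalModelRequestFields, the model ID, and the prompt size needed to trigger caching are assumptions that depend on the model provider:

```python
import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

request = dict(
    modelId="amazon.titan-text-express-v1",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Tell me a story. " * 400}]}],
    # Assumed field layout; the real cache-control syntax varies by provider.
    additionalModelRequestFields={"cacheControl": {"type": "ephemeral"}},
)

first = brt.converse(**request)   # expected cache-write on the first call
second = brt.converse(**request)  # expected cache-read on the repeated call

# Step 4 would then assert that the gen_ai.prompt.caching counter recorded
# both a write and a read data point for these two calls.
```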

This will result in the cassette for test_titan_converse.yaml needing to be re-recorded, as you mentioned.

Does this plan look good? If so, I'll go ahead and make the changes.

Successfully merging this pull request may close these issues.

🚀 Feature: Add prompt caching for Bedrock Converse