
Conversation

@codefromthecrypt commented Sep 30, 2025

Standardizes /completions spans to use llm.* prefix with indexed nested attributes, aligning with /chat/completions format:

  • llm.prompts.{i}.prompt.text for completion prompts (nested indexed format)
  • llm.choices.{i}.completion.text for completion outputs (nested indexed format)
  • Adds LLM_CHOICES constant to semantic conventions
  • Reuses existing LLM_PROMPTS constant (un-deprecated for completions use)
  • Adds OPENINFERENCE_HIDE_CHOICES environment variable
  • Reuses existing OPENINFERENCE_HIDE_PROMPTS environment variable
  • Updates spec documentation with proper indexed attribute patterns

Key benefits:

  • Consistent llm.* prefix across all LLM span types (no completion.* top-level prefix)
  • Nested discriminated union structure enables future extensibility
  • Aligns with existing llm.input_messages and llm.output_messages patterns
  • Simplifies attribute parsing and querying with uniform structure

Breaking changes:

  • Attribute names change from completion.prompt.{i} → llm.prompts.{i}.prompt.text
  • Attribute names change from completion.text.{i} → llm.choices.{i}.completion.text
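
Concretely (the attribute values below are illustrative only; the key names are what matters), the rename moves from the old flat keys to the new indexed, nested keys:

# Hypothetical span attributes for a /completions call with two prompts and one choice.
old_attributes = {
    "completion.prompt.0": "Say hello",
    "completion.prompt.1": "Say goodbye",
    "completion.text.0": "Hello!",
}

new_attributes = {
    "llm.prompts.0.prompt.text": "Say hello",
    "llm.prompts.1.prompt.text": "Say goodbye",
    "llm.choices.0.completion.text": "Hello!",
}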

Note

Migrates OpenAI completions to indexed llm.prompts.N.prompt.text and llm.choices.N.completion.text, adds LLM_CHOICES and OPENINFERENCE_HIDE_CHOICES, and updates masking, tests, and docs accordingly.

  • OpenAI Instrumentation:
    • Standardizes completions request attributes to llm.prompts.N.prompt.text (replaces list under llm.prompts).
    • Emits completion outputs as llm.choices.N.completion.text from response choices.
  • Semantic Conventions:
    • Adds SpanAttributes.LLM_CHOICES; documents indexed/nested format for llm.prompts and llm.choices.
  • Config:
    • Introduces OPENINFERENCE_HIDE_CHOICES and TraceConfig.hide_choices; extends masking to redact llm.prompts.* and llm.choices.* based on hide settings (see the sketch after this list).
  • Tests:
    • Updates OpenAI instrumentation and config tests to assert new indexed attributes and hiding behavior.
  • Docs/Specs:
    • Updates configuration and LLM span specs with indexed patterns and a completions example.
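
A rough sketch of how the hide settings could drive redaction (the helper name, sentinel value, and prefix matching here are assumptions for illustration, not the library's actual masking code):

import os

REDACTED_VALUE = "__REDACTED__"  # assumed placeholder; the real sentinel may differ

def _env_flag(name: str) -> bool:
    # Treat "true" (any casing) as enabled; anything else as disabled.
    return os.environ.get(name, "false").strip().lower() == "true"

def mask_completion_attributes(attributes: dict) -> dict:
    # Illustrative only: redact the new indexed keys by prefix, based on the
    # OPENINFERENCE_HIDE_PROMPTS / OPENINFERENCE_HIDE_CHOICES environment variables.
    hide_prompts = _env_flag("OPENINFERENCE_HIDE_PROMPTS")
    hide_choices = _env_flag("OPENINFERENCE_HIDE_CHOICES")
    masked = {}
    for key, value in attributes.items():
        if hide_prompts and key.startswith("llm.prompts."):
            masked[key] = REDACTED_VALUE
        elif hide_choices and key.startswith("llm.choices."):
            masked[key] = REDACTED_VALUE
        else:
            masked[key] = value
    return masked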

Written by Cursor Bugbot for commit 5a324c1.


@codefromthecrypt

The google-adk, haystack, and bedrock latest failures are not related to this PR; they come from drift in the corresponding latest versions of those libraries.

@codefromthecrypt force-pushed the openinference-completions branch from 4bb94eb to 463fd1e on September 30, 2025 02:55
@codefromthecrypt

bedrock ci fix: #2243

@codefromthecrypt

google-adk fix: #2244

@codefromthecrypt

haystack fix: #2245

@axiomofjoy

Thanks @codefromthecrypt. I think the llm.input_prompts change looks fine if it helps to avoid the attribute size limit. I'm concerned about the llm.output_choices attribute, since it seems like we would need to nest our existing LLM output message attributes under each choice in addition to text. I'm concerned about this resulting in a large amount of duplicate data in the span and quite a few more semantic conventions. Is choices a common pattern for different models and APIs? I have so far only seen it as an idiosyncrasy of the OpenAI API.

Looks like spec/semantic_conventions.md also needs to be updated.

@codefromthecrypt commented Sep 30, 2025

@axiomofjoy What I was told in Envoy AI Gateway is that /completions is a special case needed for LoRA use cases. So while it is specified semantically, we can clarify that this only applies to that endpoint, which is implemented in several places outside OpenAI, most notably vLLM (Bloomberg was talking about this specifically).

Where I'm going with this: I don't think we should imply that /chat/completions and the others map to choices. Rather, the existing semantics, which index on messages (a /chat/completions noun), aren't reused for the choices of the raw/legacy completions response (which has no "messages" field, only "choices"). In other words, choices is contained to the completions use case.

If we pragmatically want to map /completions attributes into a synthetic schema that merges with /chat/completions, I'm keen on that; I'd just like guidance on it. Either way, response arrays have the same size issue as the request ones.

Thoughts?

@codefromthecrypt

Concretely, in LLM spans (normal chat completions, which have a concept of role) we traverse the choices path, get to the message.role and message.content fields, and add them as indexed attributes.

    {
      "key": "output.value",
      "value": {
        "stringValue": "{\"id\":\"chatcmpl-C4Gm9xikLXbgE8He0BHWeoM03aa72\",\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"content\":\"Hi there! How can I help you today? I can explain concepts, answer questions, help with writing or editing, brainstorm ideas, assist with coding or math, plan tasks, and more. Tell me what you’d like to do.\",\"refusal\":null,\"role\":\"assistant\",\"annotations\":[]}}],\"created\":1755133833,\"model\":\"gpt-5-nano-2025-08-07\",\"object\":\"chat.completion\",\"service_tier\":\"default\",\"system_fingerprint\":null,\"usage\":{\"completion_tokens\":377,\"prompt_tokens\":8,\"total_tokens\":385,\"completion_tokens_details\":{\"accepted_prediction_tokens\":0,\"audio_tokens\":0,\"reasoning_tokens\":320,\"rejected_prediction_tokens\":0},\"prompt_tokens_details\":{\"audio_tokens\":0,\"cached_tokens\":0}}}"
      }
    },
    {
      "key": "llm.output_messages.0.message.role",
      "value": {
        "stringValue": "assistant"
      }
    },
    {
      "key": "llm.output_messages.0.message.content",
      "value": {
        "stringValue": "Hi there! How can I help you today? I can explain concepts, answer questions, help with writing or editing, brainstorm ideas, assist with coding or math, plan tasks, and more. Tell me what you’d like to do."
      }
    }

In completions there is no structured field inside each choice, so it doesn't make as much sense to map it the same way as chat completions, which has structured data and therefore needs to split out the content vs. the role (completions has no role):

        "output.value": "{\"id\": \"cmpl-CKz4klHa1MMqAa4hQn3yzIMlLMZHd\", \"object\": \"text_completion\", \"created\": 1759117370, \"model\": \"babbage:2023-07-21-v2\", \"choices\": [{\"text\": \" + fib(n-3) + fib(n-4)\\n\\ndef fib(n):\\n    if n <= 1:\\n        return\", \"index\": 0, \"finish_reason\": \"length\"}], \"usage\": {\"prompt_tokens\": 31, \"completion_tokens\": 25, \"total_tokens\": 56}}",
        "output.mime_type": "application/json",
        "llm.output_choices.0.choice.text": " + fib(n-3) + fib(n-4)\n\ndef fib(n):\n    if n <= 1:\n        return",

So, I think what you are saying is that because /chat/completions has a choices field from which the existing output attributes are sourced, using "choices" in the attribute name for the completions response, even if technically valid, would be confusing.

What if, instead of "llm.output_choices.0.choice.text", we made a pragmatic change to "llm.output_completion.0.text"? While less technically accurate, it might reduce the confusion.

I'll go ahead and spike this and also update the docs as requested. If you think of something better in the meantime, let me know.

@codefromthecrypt commented Oct 1, 2025

Eek... I just realized how bad embeddings look in practice: "embedding.embeddings.0.embedding.text", "embedding.embeddings.0.embedding.vector".

What if we change both of the special APIs, embedding and completion, to be less repetitive?
"embedding.input.0.text"
"embedding.output.0.vector"

"completion.input.0.prompt"
"completion.output.0.text"

I'll spike this for completions to get feedback.

@codefromthecrypt force-pushed the openinference-completions branch from 463fd1e to 62473a2 on October 1, 2025 02:54

@codefromthecrypt force-pushed the openinference-completions branch from 62473a2 to b3fe44e on October 1, 2025 03:11
@codefromthecrypt commented Oct 1, 2025

Current Design: "completion.prompt.N" and "completion.text.N"

Rationale

JSON Path Alignment

The convention directly mirrors the actual OpenAI Completions API structure:

Completions API (simple):

  • Request: { "prompt": "text" } or { "prompt": ["text1", "text2"] }
  • Response: { "choices": [{"text": "output", "index": 0}] }
    → Attributes: "completion.prompt.0", "completion.text.0"

Chat Completions API (structured objects):

  • Request: { "messages": [{"role": "user", "content": "text"}] }
  • Response: { "choices": [{"message": {"role": "assistant", "content": "text"}}] }
    → Attributes: "llm.input_messages.0.message.role", "llm.input_messages.0.message.content"

Key Difference: Completions deals with simple indexed strings, not structured objects with multiple fields. The attribute format reflects this fundamental difference.

Index-at-the-End Convention

For simple values (strings), the index comes last: "completion.prompt.0"

  • JSON path: request.prompt[0]
  • No nested object fields to navigate after the index

For structured objects, index comes in the middle: "llm.input_messages.0.message.content"

  • JSON path: request.messages[0].content
  • Must navigate through object fields after the index
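
To make the two indexing styles concrete, here is a small, hypothetical flattening of each payload into attribute keys. It follows the naming described above; it is not the instrumentor's actual code:

# Completions: simple indexed strings, so the index comes last.
completions_request = {"prompt": ["Say hello", "Say goodbye"]}
completions_response = {"choices": [{"text": "Hello!", "index": 0}]}

attrs = {f"completion.prompt.{i}": p for i, p in enumerate(completions_request["prompt"])}
attrs.update({f"completion.text.{c['index']}": c["text"] for c in completions_response["choices"]})
# -> {"completion.prompt.0": "Say hello", "completion.prompt.1": "Say goodbye",
#     "completion.text.0": "Hello!"}

# Chat completions: structured objects, so the index sits in the middle.
chat_request = {"messages": [{"role": "user", "content": "Say hello"}]}
for i, m in enumerate(chat_request["messages"]):
    attrs[f"llm.input_messages.{i}.message.role"] = m["role"]
    attrs[f"llm.input_messages.{i}.message.content"] = m["content"]
# -> adds {"llm.input_messages.0.message.role": "user",
#          "llm.input_messages.0.message.content": "Say hello"}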

Alternative Conventions (and why they're worse)

"completion.input.N.prompt" + "completion.output.N.text"

  • ❌ Adds unnecessary input/output levels not in the JSON structure
  • JSON path is prompt[N], not input.prompt[N]
  • Creates false hierarchy where none exists

"llm.prompts.N" (old convention)

  • ❌ Uses deprecated llm.prompts attribute
  • ❌ Plural form misleading (each index is ONE prompt, not multiple)
  • ❌ Doesn't distinguish input vs output

"completion.input_prompts.N.prompt.text"

  • ❌ Mimics chat format when completions are fundamentally simpler
  • ❌ Way too verbose for a single string value
  • ❌ Creates fake nesting (prompt.text) that doesn't exist in API

"llm.input_choices.N.choice.prompt"

  • ❌ "choices" is an output concept, not input
  • ❌ Confusing mental model (choices don't have prompts)
  • ❌ Doesn't match API terminology

@codefromthecrypt

openai drift #2253

@codefromthecrypt

beeai drift: #2255

@axiomofjoy commented Oct 2, 2025

@codefromthecrypt After discussing with @mikeldking, here's what we'd like to propose.

Prompts

We'd like to keep llm.prompts rather than llm.input_prompts. While llm.input_prompts does mirror the existing llm.input_messages prefix, prompts are implicitly inputs while messages can be inputs or outputs. We'd prefer to keep the conventions shorter without sacrificing descriptiveness, which we believe is possible in this case. In terms of backward compatibility, we think it's okay to deprecate the usage of llm.prompts with a list of strings attribute value while keeping the llm.prompts prefix. This should involve minimal changes to the implementations in the various OpenInference libraries since this convention is not currently widely used.

We'd also like to propose namespacing the "text" field under a "prompt" field. Concretely, keys would look like "llm.prompts.<prompt_index>.prompt.text". There are a few reasons we advocate for this approach:

  • The second part of the key after the index (i.e., prompt.text) is more descriptive than text alone.
  • It's convenient for downstream consumers of the telemetry data, which receive a payload such as the one below and can use "prompt" as the discriminator in a discriminated union to know what type to expect after accessing "llm.prompts.<prompt_index>.prompt".
{
  "llm": {
    "prompts": [
      {
        "prompt": {
          "text": "Write a haiku"
        }
      }
    ]
  }
}
  • Along those lines, having a mechanism for a discriminated union leaves us open to including other types in addition to "prompt" in the prompts array in the future if needed.
  • It mirrors the structure of our existing conventions for messages, which were chosen for similar reasons to those described above.
{
  "llm": {
    "input_messages": [
      {
        "message": {
          "role": "user",
          "content": "Write a haiku"
        }
      }
    ]
  }
}

Choices

llm.choices as a prefix makes sense to us. The nuance here is that both legacy completions and modern chat completions APIs support multiple choices. Similarly to above, we'd like to leave room for both via discriminated unions. Concretely, we'd like to propose keys of the form "llm.choices.<choice_index>.completion.text". This shares the benefits outlined for the proposed prompt key format. It also results in a format that is consistent between prompts, choices, and messages:

{
  "llm": {
    "prompts": [
      {
        "prompt": {
          "text": "Write a haiku"
        }
      }
    ],
    "input_messages": [
      {
        "message": {
          "role": "user",
          "content": "Write a haiku"
        }
      }
    ],
    "choices": [
      {
        "completion": {
          "text": "Cherry blossoms bloom\nabove New York’s restless streets\nskyline crowned in pink"
        }
      },
      {
        "chat_completion": {
          ... //  leave for another day
        }
      }
    ]
  }
}

Notes

We'd like to avoid adding a top-level "completion" prefix. Currently, the prefixes correspond to the span kind attribute values ("chain", "llm", "embedding", etc.). We still think of legacy completions as LLM spans and don't see a need to introduce a new completion span kind.
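
As a minimal sketch of how the nested structure above could flatten into span attribute keys: the helper below is hypothetical, assumes each array item has exactly one discriminator key (e.g. "prompt", "completion", or "chat_completion"), and handles scalar fields only (a nested chat_completion object would need recursion).

nested = {
    "llm": {
        "prompts": [{"prompt": {"text": "Write a haiku"}}],
        "choices": [{"completion": {"text": "Cherry blossoms bloom..."}}],
    }
}

def flatten_llm(nested: dict) -> dict:
    # Flatten the proposed llm.prompts / llm.choices structure into flat attribute keys.
    attrs = {}
    for group in ("prompts", "choices"):
        for i, item in enumerate(nested["llm"].get(group, [])):
            # The single key of each item acts as the discriminator of the union.
            (discriminator, fields), = item.items()
            for field, value in fields.items():
                attrs[f"llm.{group}.{i}.{discriminator}.{field}"] = value
    return attrs

# flatten_llm(nested) ->
# {"llm.prompts.0.prompt.text": "Write a haiku",
#  "llm.choices.0.completion.text": "Cherry blossoms bloom..."}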

@codefromthecrypt

looks beautiful @axiomofjoy @mikeldking thanks for collaborating on this!

@codefromthecrypt force-pushed the openinference-completions branch from b3fe44e to 0cf1098 on October 3, 2025 01:07

@codefromthecrypt

beeai drift refactored here: #2255

@codefromthecrypt

Also, it seems to be a trend that if you hide_inputs you also hide things derived from the inputs, and similarly for outputs. Is that true? If so, maybe I'll do a follow-up to make that coherent in tests and docs.
