[Platform] Standardise token usage #311
Conversation
examples/openai/token-metadata.php
$result = $agent->call($messages, [
    'max_tokens' => 500, // specific options just for this call
]);

$metadata = $result->getMetadata();
if (null === $tokenUsage = $result->getTokenUsage()) {
Can this happen?
Yes, if you mean `null === $tokenUsage`: that happens when `token_usage` is set to `false` in the config, as far as I can see.
Edit: here it is unnecessary.
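For illustration, a defensive read of that value could look like the sketch below. Only `getTokenUsage()` and `promptTokens` appear in this PR's diffs; the variable setup and the printed output are assumptions.

```php
// Sketch only: $result is assumed to come from an $agent->call() as in the example above.
$tokenUsage = $result->getTokenUsage();

if (null === $tokenUsage) {
    // token_usage disabled in the config, or the provider did not report usage data
    echo "No token usage available\n";
} else {
    // promptTokens is taken from the diff above; each property may still be null
    printf("Prompt tokens: %s\n", $tokenUsage->promptTokens ?? 'n/a');
}
```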
final class TokenUsage implements \JsonSerializable
{
    public function __construct(
        public ?int $promptTokens = null,
-        public ?int $promptTokens = null,
+        public ?int $prompt = null,
Maybe remove the `Token` suffix from all of these? It feels superfluous.
No problem.
I would not have kept the suffix if the object were simply named something like `Tokens`, but I thought `TokenUsage` was a better name. Or perhaps the DTO should be named `Tokens`.
To be discussed with @chr-hertel
I did remove the suffixes BTW.
Thanks, not sure now that this was the right decision 😅
`TokenUsage` for the class name and `Tokens` as the suffix for the property names sounds more logical to me.
Or conversely, `Tokens` for the class name and no suffixes for the property names would also be a good option.
TokenUsage and tokens as suffix sound best to me
🤦♂️ 😄
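For reference, a minimal sketch of the value object with the naming settled on above (class `TokenUsage`, `Tokens`-suffixed properties). Only `promptTokens` appears in the diff; the other properties and the serialized keys are assumptions.

```php
final class TokenUsage implements \JsonSerializable
{
    public function __construct(
        public ?int $promptTokens = null,
        // the remaining properties are assumed here for illustration
        public ?int $completionTokens = null,
        public ?int $totalTokens = null,
    ) {
    }

    public function jsonSerialize(): array
    {
        return [
            'prompt_tokens' => $this->promptTokens,
            'completion_tokens' => $this->completionTokens,
            'total_tokens' => $this->totalTokens,
        ];
    }
}
```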
Sorry, but we need to slim this down a bit. In general, I prefer smaller PRs that build on top of each other, so let's break it down the way you did:
Makes token usage explicit
I like having the value object, we should definitely have that 👍
But I'm not in favor of having `getTokenUsage()` directly on the result, since the result is a high-level abstraction and token usage is quite specific to remote inference of GenAI models; not all models have that. Let's keep it in the metadata, please.
Adds support to automatically register token usage extractors so that the token usage information is available in the result when a model is invoked
That's a bit misleading, since token usage was already sitting on an extension point (`OutputProcessor`), and your design decision to use that extension point to add another extension point doesn't really resonate with me. Why is the new extension point needed? It feels like we're adding them just because we can.
Makes the token usage auto-registration configurable
For me this is too early; I don't think there's a use case for it yet, do you? How often do we see someone needing to bring their own token usage extractor?
All in all, let's start here with the `TokenUsage` value object in the metadata class; that's an easy step where we see the same picture 👍
Edit: another PR could fix issue #208 by registering the corresponding output processor based on the config setting.
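In code terms, the two access styles under discussion look like this; both calls are taken from snippets elsewhere in this conversation:

```php
// Proposed in the PR: token usage directly on the result
$tokenUsage = $result->getTokenUsage();

// Preferred in the review: keep it behind the generic metadata
$tokenUsage = $result->getMetadata()->get('token_usage');
```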
Hmm, I initially thought about keeping it in the metadata, but having it on the Result object also made sense, since an API should send us this information. So far I have seen such information in the responses of every API I tried, but perhaps you are right, and other models out there do not provide it.
No, the idea was rather to introduce an extension point specific to token usage extraction and exposure. This would also make it simpler for bridges to provide their own implementations of the extractor interface if needed.
So far, in this repository, I have found three such extractors (currently `TokenOutputProcessor` classes), including the one in the new Google Vertex AI bridge, and all three of them perform the extraction somewhat differently. As you also rightly pointed out above, token usage is specific to GenAI models, and not all of them have it. Nor will all of them expose the data in the same fashion.
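To make that concrete, here is a hypothetical sketch of such an extraction extension point; the interface name and method signature are assumptions, not code from this PR:

```php
interface TokenUsageExtractorInterface
{
    /**
     * Extracts token usage from a provider-specific raw response,
     * returning null when the provider does not report usage data.
     */
    public function extract(array $rawResponse): ?TokenUsage;
}
```

Each bridge (OpenAI, Google Vertex AI, ...) could then ship its own implementation, since providers expose the usage data in different shapes.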
@OskarStark @chr-hertel
@@ -70,8 +72,11 @@ public function testItAddsRemainingTokensToMetadata()
        $processor->processOutput($output);

        $metadata = $output->result->getMetadata();
        $tokenUsage = $metadata->get('token_usage');
What about having dedicated `setTokenUsage()`/`getTokenUsage()` methods on the Metadata object?
Not sure about this.
The Metadata object already has a lot of methods, and each of them is generic in nature as far as I can see; none of them concerns a specific model characteristic.
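For comparison, a hypothetical excerpt of what the dedicated accessors would add to the Metadata object; this was suggested above but not implemented, and the generic `get('token_usage')` call shown in the test stayed in place:

```php
class Metadata
{
    // ...existing generic key/value methods...

    private ?TokenUsage $tokenUsage = null;

    public function setTokenUsage(TokenUsage $tokenUsage): void
    {
        $this->tokenUsage = $tokenUsage;
    }

    public function getTokenUsage(): ?TokenUsage
    {
        return $this->tokenUsage;
    }
}
```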
Almost done, I'd say. Thanks already!
- Makes token usage explicit
- Adds support to automatically register token usage extractors so that the token usage information is available in the result when a model is invoked
- Makes the token usage auto-registration configurable
This reverts commit 1f1e6be.
- Makes token usage explicit by adding a dedicated DTO for token information
- Populates the token usage DTO inside the different token output processors and then adds it to the metadata object
- Code refactor
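As a rough illustration of that commit's flow, the sketch below shows a processor reading a provider-specific usage block and attaching the DTO to the result metadata. `processOutput()`, `getMetadata()`, the `TokenOutputProcessor` name, and the `token_usage` key appear elsewhere in this conversation; the interface and `Output` type names, the raw-response access, and the `add()` call on the metadata are assumptions.

```php
final class TokenOutputProcessor implements OutputProcessorInterface
{
    public function processOutput(Output $output): void
    {
        // Provider-specific: pull the usage block out of the raw response
        // (shape assumed, mirroring the OpenAI-style "usage" payload).
        $raw = $output->result->getRawResult()?->getData();
        $usage = $raw['usage'] ?? null;

        if (null === $usage) {
            return;
        }

        // Populate the DTO and attach it under the key used by the tests above.
        $output->result->getMetadata()->add('token_usage', new TokenUsage(
            promptTokens: $usage['prompt_tokens'] ?? null,
        ));
    }
}
```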
I am working on autoconfiguring token usage processors inside the AI bundle and based that branch on this one.
Good to merge from my end - @OskarStark?
Changes proposed:
The PR aims to standardize the token usage extraction and its availability in the AI bundle. Essentially, the following is done: