Revise proposed architecture PR with fluent API class diagrams #10

Merged
197 changes: 171 additions & 26 deletions docs/ARCHITECTURE.md
@@ -198,16 +198,83 @@ This section shows comprehensive class diagrams for the proposed architecture. F

**Note:** The class diagrams are also not meant to be comprehensive in terms of any specific configuration keys or parameters which are or will be supported. For now, the relevant definitions don't include any specific parameter names or constants.

### Zoomed out view
### Overview: Fluent API for AI consumption

Below you find the zoomed out overview class diagram, looking at the two entrypoints for the largely decoupled APIs for:
This is a subset of the overall class diagram, purely focused on the fluent API for AI consumption.

- Consuming AI capabilities.
- This is what the vast majority of developers will use.
- Registering and implementing AI providers.
- This is what only developers that implement additional models or custom providers will use.
```mermaid
---
config:
class:
hideEmptyMembersBox: true
---
classDiagram
direction LR
namespace Ai {
class AiEntrypoint {
+prompt(?string $text) PromptBuilder$
+message(?string $text) MessageBuilder$
}

class PromptBuilder {
Member:

This will also have a bunch of getter methods so the object consumer can do its thing, unless you're planning on having a separate DTO that goes to the model?

Member Author:

Yeah, the model will receive this in a different way; see the detailed class diagram. For the most part, it's one or more Message objects and an AiModelConfig object.

+withText(string $text) self
+withImageFile(File $file) self
+withAudioFile(File $file) self
+withVideoFile(File $file) self
+withFunctionResponse(FunctionResponse $functionResponse) self
Member:

What does this do?

Member Author:

It adds a function response as a message part. Typically this needs to be used in a user message, in response to the LLM returning a function call message part.

Technically similar to the other methods here that add message parts.

+withMessageParts(...MessagePart $part) self
+withHistory(...Message $messages) self
+usingModel(AiModel $model) self
+usingSystemInstruction(string|MessagePart[]|Message $systemInstruction) self
+usingTemperature(float $temperature) self
+usingTopP(float $topP) self
+usingTopK(int $topK) self
+usingStopSequences(...string $stopSequences) self
+usingCandidateCount(int $candidateCount) self
+usingOutputMime(string $mimeType) self
+usingOutputSchema(array< string, mixed > $schema) self
+usingOutputModalities(...AiModality $modalities) self
+generateResult() GenerativeAiResult
+generateOperation() GenerativeAiOperation
+generateTextResult() GenerativeAiResult
+streamGenerateTextResult() Generator< GenerativeAiResult >
+generateImageResult() GenerativeAiResult
+convertTextToSpeechResult() GenerativeAiResult
+generateSpeechResult() GenerativeAiResult
+generateEmbeddingsResult() EmbeddingResult
+generateTextOperation() GenerativeAiOperation
+generateImageOperation() GenerativeAiOperation
+convertTextToSpeechOperation() GenerativeAiOperation
+generateSpeechOperation() GenerativeAiOperation
+generateEmbeddingsOperation() EmbeddingOperation
+generateText() string
+streamGenerateText() Generator< string >
+generateImage() File
+convertTextToSpeech() File
+generateSpeech() File
+generateEmbeddings() Embedding[]
}

class MessageBuilder {
Member:

What is a message as opposed to a prompt?

Member Author (@felixarntz, Jul 11, 2025):

A Message is the main data transfer object to communicate between the user and the LLM. There are three types of Messages: user, model, and system messages.

A "prompt" in this SDK is not really an actual thing. It's a higher-level overarching term for what could be a single user message, or many user and model messages (in case of history), or a user message with a bunch of configuration arguments.

I used the name PromptBuilder to relate to that overarching term, but we could go with a different name. Maybe something like AiRequestBuilder could be used instead.

The reason there's a MessageBuilder in addition to what is currently called PromptBuilder is to make it easy to create Message objects, which is mostly relevant when using PromptBuilder::withHistory().

+usingRole(MessageRole $role) self
+withText(string $text) self
+withImageFile(File $file) self
+withAudioFile(File $file) self
+withVideoFile(File $file) self
+withFunctionCall(FunctionCall $functionCall) self
+withFunctionResponse(FunctionResponse $functionResponse) self
+withMessageParts(...MessagePart $part) self
+get() Message
}
}

AiEntrypoint .. PromptBuilder : creates
AiEntrypoint .. MessageBuilder : creates
```
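
To make the fluent API above more concrete, here is a minimal PHP sketch of how the two builders could be used together, including the `MessageBuilder` to `PromptBuilder::withHistory()` flow discussed in the review comments. The classes below are hypothetical stand-ins, not the SDK's implementation: roles are plain strings instead of `MessageRole`, message parts are plain strings instead of `MessagePart`, the `null` defaults on `prompt()` and `message()` are assumed, and `generateText()` is a stub that only reports how many messages would be sent.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the diagram's classes; not the SDK's real code.
final class Message
{
    public string $role;
    /** @var string[] Text parts only, to keep the sketch small. */
    public array $parts;

    public function __construct(string $role, array $parts)
    {
        $this->role = $role;
        $this->parts = $parts;
    }
}

final class MessageBuilder
{
    private string $role = 'user'; // Assumption: plain strings instead of MessageRole.
    /** @var string[] */
    private array $parts = [];

    public function usingRole(string $role): self
    {
        $this->role = $role;
        return $this;
    }

    public function withText(string $text): self
    {
        $this->parts[] = $text;
        return $this;
    }

    public function get(): Message
    {
        return new Message($this->role, $this->parts);
    }
}

final class PromptBuilder
{
    /** @var Message[] */
    private array $history = [];
    /** @var string[] */
    private array $parts = [];
    private ?float $temperature = null; // Stored only; the stub below ignores it.

    public function withText(string $text): self
    {
        $this->parts[] = $text;
        return $this;
    }

    public function withHistory(Message ...$messages): self
    {
        $this->history = array_merge($this->history, $messages);
        return $this;
    }

    public function usingTemperature(float $temperature): self
    {
        $this->temperature = $temperature;
        return $this;
    }

    /** Stub: reports what would be sent instead of calling a model. */
    public function generateText(): string
    {
        // History first, then the new user message built from the added parts.
        $messages = $this->history;
        if ($this->parts !== []) {
            $messages[] = new Message('user', $this->parts);
        }
        return sprintf('[stub reply to %d message(s)]', count($messages));
    }
}

final class AiEntrypoint
{
    public static function prompt(?string $text = null): PromptBuilder
    {
        $builder = new PromptBuilder();
        return $text === null ? $builder : $builder->withText($text);
    }

    public static function message(?string $text = null): MessageBuilder
    {
        $builder = new MessageBuilder();
        return $text === null ? $builder : $builder->withText($text);
    }
}

// Usage: two history messages plus the new prompt text.
$reply = AiEntrypoint::prompt('Give me an example.')
    ->withHistory(
        AiEntrypoint::message('What is a class diagram?')->get(),
        AiEntrypoint::message('A static view of types.')->usingRole('model')->get()
    )
    ->usingTemperature(0.2)
    ->generateText();
```

In this sketch every `with*` and `using*` method returns `self`, which is what allows the chain; the terminating `generate*` call is the only method that would contact a model.
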

Zoomed in views with detailed specifications for both of the APIs are found in the subsequent sections.
### Overview: Traditional method call API for AI consumption

This is a subset of the overall class diagram, purely focused on the traditional method call API for AI consumption.

```mermaid
---
@@ -219,23 +286,41 @@ classDiagram
direction LR
namespace Ai {
class AiEntrypoint {
+defaultRegistry() AiProviderRegistry
+isConfigured(AiProviderAvailability $availability) bool$
+generateResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateTextResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+streamGenerateTextResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) Generator< GenerativeAiResult >$
+generateImageResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+textToSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+convertTextToSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateEmbeddingsResult(string[]|Message[] $input, AiModel $model) EmbeddingResult$
+generateTextOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateImageOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+textToSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+convertTextToSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateEmbeddingsOperation(string[]|Message[] $input, AiModel $model) EmbeddingOperation$
}
}
```
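
The traditional methods above accept a `string|MessagePart|MessagePart[]|Message|Message[]` prompt, while the model interfaces in the detailed diagram take `Message[]`. One way the entrypoint might bridge the two is a normalization step like the sketch below. The function name `normalizePrompt()`, the simplified `Message` and `MessagePart` classes, and the choice to wrap loose parts in a single user message are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal types; the real ones carry more than text.
final class MessagePart
{
    public string $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }
}

final class Message
{
    public string $role;
    /** @var MessagePart[] */
    public array $parts;

    public function __construct(string $role, array $parts)
    {
        $this->role = $role;
        $this->parts = $parts;
    }
}

/**
 * Normalizes every accepted prompt shape into a list of messages.
 *
 * @param string|MessagePart|MessagePart[]|Message|Message[] $prompt
 * @return Message[]
 */
function normalizePrompt($prompt): array
{
    if (is_string($prompt)) {
        return [new Message('user', [new MessagePart($prompt)])];
    }
    if ($prompt instanceof MessagePart) {
        return [new Message('user', [$prompt])];
    }
    if ($prompt instanceof Message) {
        return [$prompt];
    }

    // Helper: true when every item is an instance of the given class.
    $allOf = static function (array $items, string $class): bool {
        foreach ($items as $item) {
            if (!($item instanceof $class)) {
                return false;
            }
        }
        return true;
    };

    if (is_array($prompt) && $prompt !== []) {
        if ($allOf($prompt, Message::class)) {
            return array_values($prompt);
        }
        if ($allOf($prompt, MessagePart::class)) {
            // Assumption: loose parts belong to one user message.
            return [new Message('user', array_values($prompt))];
        }
    }

    throw new InvalidArgumentException('Unsupported prompt type.');
}

// Usage: a plain string and a list of parts both become one user message.
$fromString = normalizePrompt('Hello');
$fromParts = normalizePrompt([new MessagePart('a'), new MessagePart('b')]);
```

Empty arrays and mixed arrays are rejected here rather than guessed at; whether the SDK should be that strict is a design decision this sketch does not settle.
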

### Overview: API for provider registration and implementation

This is a subset of the overall class diagram, purely focused on the API for provider registration and implementation.

```mermaid
---
config:
class:
hideEmptyMembersBox: true
---
classDiagram
direction LR
namespace Ai {
class AiEntrypoint {
+defaultRegistry() AiProviderRegistry$
+isConfigured(AiProviderAvailability $availability) bool$
}
}
namespace Ai.Providers {
class AiProviderRegistry {
+registerProvider(string $className) void
@@ -251,7 +336,7 @@ direction LR
AiEntrypoint "1" o-- "1..*" AiProviderRegistry
```
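
As a rough illustration of the registration side, the sketch below shows how `AiEntrypoint::defaultRegistry()` and `AiProviderRegistry::registerProvider(string $className)` could fit together. Only those two signatures come from the diagram. The `AiProvider` marker interface, the `hasProvider()` helper, the validation rules, and the lazily created shared registry are assumptions, since the remaining registry members are collapsed in this diff.

```php
<?php
declare(strict_types=1);

// Hypothetical marker interface; the real provider contract is not shown here.
interface AiProvider
{
}

final class AiProviderRegistry
{
    /** @var array<string, true> Registered provider class names. */
    private array $providerClassNames = [];

    public function registerProvider(string $className): void
    {
        if (!class_exists($className)) {
            throw new InvalidArgumentException("Unknown class: {$className}");
        }
        if (!is_subclass_of($className, AiProvider::class)) {
            throw new InvalidArgumentException("Not a provider: {$className}");
        }
        $this->providerClassNames[$className] = true;
    }

    /** Hypothetical helper, not part of the diagram. */
    public function hasProvider(string $className): bool
    {
        return isset($this->providerClassNames[$className]);
    }
}

final class AiEntrypoint
{
    private static ?AiProviderRegistry $defaultRegistry = null;

    /** Assumption: one lazily created registry shared by all callers. */
    public static function defaultRegistry(): AiProviderRegistry
    {
        if (self::$defaultRegistry === null) {
            self::$defaultRegistry = new AiProviderRegistry();
        }
        return self::$defaultRegistry;
    }
}

// Hypothetical provider used only for this example.
final class ExampleProvider implements AiProvider
{
}

AiEntrypoint::defaultRegistry()->registerProvider(ExampleProvider::class);
```

Registering by class name rather than by instance means the registry can defer instantiation until a provider is actually needed, which may be one reason the diagram takes a `string $className`.
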

### Class diagram zoomed in on AI consumption
### Details: Class diagram for AI consumption

```mermaid
---
@@ -263,22 +348,75 @@ classDiagram
direction LR
namespace Ai {
class AiEntrypoint {
+defaultRegistry() AiProviderRegistry
+prompt(?string $text) PromptBuilder$
+message(?string $text) MessageBuilder$
+defaultRegistry() AiProviderRegistry$
+isConfigured(AiProviderAvailability $availability) bool$
+generateResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateTextResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+streamGenerateTextResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) Generator< GenerativeAiResult >$
+generateImageResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+textToSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+convertTextToSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateSpeechResult(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiResult$
+generateEmbeddingsResult(string[]|Message[] $input, AiModel $model) EmbeddingResult$
+generateTextOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateImageOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+textToSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+convertTextToSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateSpeechOperation(string|MessagePart|MessagePart[]|Message|Message[] $prompt, AiModel $model) GenerativeAiOperation$
+generateEmbeddingsOperation(string[]|Message[] $input, AiModel $model) EmbeddingOperation$
}

class PromptBuilder {
+withText(string $text) self
+withImageFile(File $file) self
+withAudioFile(File $file) self
+withVideoFile(File $file) self
+withFunctionResponse(FunctionResponse $functionResponse) self
+withMessageParts(...MessagePart $part) self
+withHistory(...Message $messages) self
+usingModel(AiModel $model) self
+usingSystemInstruction(string|MessagePart[]|Message $systemInstruction) self
+usingTemperature(float $temperature) self
+usingTopP(float $topP) self
+usingTopK(int $topK) self
+usingStopSequences(...string $stopSequences) self
+usingCandidateCount(int $candidateCount) self
+usingOutputMime(string $mimeType) self
+usingOutputSchema(array< string, mixed > $schema) self
+usingOutputModalities(...AiModality $modalities) self
+generateResult() GenerativeAiResult
+generateOperation() GenerativeAiOperation
+generateTextResult() GenerativeAiResult
+streamGenerateTextResult() Generator< GenerativeAiResult >
+generateImageResult() GenerativeAiResult
+convertTextToSpeechResult() GenerativeAiResult
+generateSpeechResult() GenerativeAiResult
+generateEmbeddingsResult() EmbeddingResult
+generateTextOperation() GenerativeAiOperation
+generateImageOperation() GenerativeAiOperation
+convertTextToSpeechOperation() GenerativeAiOperation
+generateSpeechOperation() GenerativeAiOperation
+generateEmbeddingsOperation() EmbeddingOperation
+generateText() string
+streamGenerateText() Generator< string >
+generateImage() File
+convertTextToSpeech() File
+generateSpeech() File
+generateEmbeddings() Embedding[]
}

class MessageBuilder {
+usingRole(MessageRole $role) self
+withText(string $text) self
+withImageFile(File $file) self
+withAudioFile(File $file) self
+withVideoFile(File $file) self
+withFunctionCall(FunctionCall $functionCall) self
+withFunctionResponse(FunctionResponse $functionResponse) self
+withMessageParts(...MessagePart $part) self
+get() Message
}
}
namespace Ai.Types {
class Message {
@@ -454,10 +592,17 @@ direction LR

AiEntrypoint .. Message : receives
AiEntrypoint .. MessagePart : receives
AiEntrypoint .. PromptBuilder : creates
AiEntrypoint .. MessageBuilder : creates
AiEntrypoint .. GenerativeAiResult : creates
AiEntrypoint .. EmbeddingResult : creates
AiEntrypoint .. GenerativeAiOperation : creates
AiEntrypoint .. EmbeddingOperation : creates
PromptBuilder .. GenerativeAiResult : creates
PromptBuilder .. EmbeddingResult : creates
PromptBuilder .. GenerativeAiOperation : creates
PromptBuilder .. EmbeddingOperation : creates
MessageBuilder .. Message : creates
Message "1" *-- "1..*" MessagePart
MessagePart "1" o-- "0..1" InlineFile
MessagePart "1" o-- "0..1" RemoteFile
@@ -484,7 +629,7 @@ direction LR
Result <|-- EmbeddingResult
```

### Class diagram zoomed in on AI provider registration and implementation
### Details: Class diagram for AI provider registration and implementation

```mermaid
---
@@ -539,8 +684,8 @@ direction LR
class AiImageGenerationModel {
+generateImageResult(Message[] $prompt) GenerativeAiResult
}
class AiTextToSpeechModel {
+textToSpeechResult(Message[] $prompt) GenerativeAiResult
class AiTextToSpeechConversionModel {
+convertTextToSpeechResult(Message[] $prompt) GenerativeAiResult
}
class AiSpeechGenerationModel {
+generateSpeechResult(Message[] $prompt) GenerativeAiResult
@@ -554,8 +699,8 @@ direction LR
class AiImageGenerationOperationModel {
+generateImageOperation(Message[] $prompt) GenerativeAiOperation
}
class AiTextToSpeechOperationModel {
+textToSpeechOperation(Message[] $prompt) GenerativeAiOperation
class AiTextToSpeechConversionOperationModel {
+convertTextToSpeechOperation(Message[] $prompt) GenerativeAiOperation
}
class AiSpeechGenerationOperationModel {
+generateSpeechOperation(Message[] $prompt) GenerativeAiOperation
@@ -619,7 +764,7 @@ direction LR
}
class ImageGenerationConfig {
}
class TextToSpeechConfig {
class TextToSpeechConversionConfig {
}
class SpeechGenerationConfig {
}
@@ -684,12 +829,12 @@ direction LR
<<interface>> WithEmbeddingOperations
<<interface>> AiTextGenerationModel
<<interface>> AiImageGenerationModel
<<interface>> AiTextToSpeechModel
<<interface>> AiTextToSpeechConversionModel
<<interface>> AiSpeechGenerationModel
<<interface>> AiEmbeddingGenerationModel
<<interface>> AiTextGenerationOperationModel
<<interface>> AiImageGenerationOperationModel
<<interface>> AiTextToSpeechOperationModel
<<interface>> AiTextToSpeechConversionOperationModel
<<interface>> AiSpeechGenerationOperationModel
<<interface>> AiEmbeddingGenerationOperationModel
<<interface>> WithHttpClient
@@ -721,17 +866,17 @@ direction LR
AiModelMetadata ..> AiFeature
AiModel <|-- AiTextGenerationModel
AiModel <|-- AiImageGenerationModel
AiModel <|-- AiTextToSpeechModel
AiModel <|-- AiTextToSpeechConversionModel
AiModel <|-- AiSpeechGenerationModel
AiModel <|-- AiEmbeddingGenerationModel
AiModel <|-- AiTextGenerationOperationModel
AiModel <|-- AiImageGenerationOperationModel
AiModel <|-- AiTextToSpeechOperationModel
AiModel <|-- AiTextToSpeechConversionOperationModel
AiModel <|-- AiSpeechGenerationOperationModel
AiModel <|-- AiEmbeddingGenerationOperationModel
GenerationConfig <|-- TextGenerationConfig
GenerationConfig <|-- ImageGenerationConfig
GenerationConfig <|-- TextToSpeechConfig
GenerationConfig <|-- TextToSpeechConversionConfig
GenerationConfig <|-- SpeechGenerationConfig
GenerationConfig <|-- EmbeddingGenerationConfig
```
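
The per-capability interfaces in this diagram (for example `AiTextToSpeechConversionModel` extending `AiModel`) suggest that callers can check what a model supports with `instanceof` before dispatching. The sketch below illustrates that idea under several assumptions: `AiModel` is left empty because its members are not visible in this diff, `Message` and `GenerativeAiResult` are reduced to a single string field, and `ExampleSpeechModel` and the `convertTextToSpeech()` dispatcher function are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical, heavily simplified types.
final class Message
{
    public string $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }
}

final class GenerativeAiResult
{
    public string $summary;

    public function __construct(string $summary)
    {
        $this->summary = $summary;
    }
}

// Left empty on purpose: the base interface's members are not shown in the diff.
interface AiModel
{
}

interface AiImageGenerationModel extends AiModel
{
    /** @param Message[] $prompt */
    public function generateImageResult(array $prompt): GenerativeAiResult;
}

interface AiTextToSpeechConversionModel extends AiModel
{
    /** @param Message[] $prompt */
    public function convertTextToSpeechResult(array $prompt): GenerativeAiResult;
}

// Hypothetical model that supports text-to-speech conversion only.
final class ExampleSpeechModel implements AiTextToSpeechConversionModel
{
    public function convertTextToSpeechResult(array $prompt): GenerativeAiResult
    {
        // Stub: a real model would return audio data here.
        return new GenerativeAiResult(sprintf('audio for %d message(s)', count($prompt)));
    }
}

/**
 * Dispatches to the model only if it implements the needed capability.
 *
 * @param Message[] $prompt
 */
function convertTextToSpeech(AiModel $model, array $prompt): GenerativeAiResult
{
    if (!($model instanceof AiTextToSpeechConversionModel)) {
        throw new InvalidArgumentException('Model does not support text-to-speech conversion.');
    }
    return $model->convertTextToSpeechResult($prompt);
}

$model = new ExampleSpeechModel();
$result = convertTextToSpeech($model, [new Message('Hello')]);
```

Because each capability is its own interface, a model class can implement any combination of them, and a missing capability surfaces as a clear error at the dispatch point rather than as a missing method.
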