Releases: cequence-io/openai-scala-client
Version 1.2.0
1. ResponsesAPI
- Unified function/endpoint combining chat simplicity with tool use and state management.
- Out-of-the-box tool support - natively supports first-party tools like `web_search`, `file_search`, and `computer_use`, enabling you to invoke these capabilities without additional orchestration.
- Built-in multi-turn conversation chaining - use the `previous_response_id` parameter to link requests into a chain of turns, and the `instructions` parameter to inject or override system/developer messages on a per-call basis.
- Multimodal input and output - beyond text, the API accepts images and audio in the same request, letting you build fully multimodal, tool-augmented experiences in a single call.
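The turn-chaining described above can be sketched roughly as follows. This is a minimal, hypothetical sketch: `createModelResponse`, `Inputs.Text`, `CreateModelResponseSettings`, and the `previousResponseId` field are assumed names based on this release's description - consult the project's examples for the exact API.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import io.cequence.openaiscala.service.OpenAIServiceFactory

object ResponsesChainSketch extends App {
  // Reads the API key from the environment (factory defaults assumed)
  val service = OpenAIServiceFactory()

  for {
    // First turn
    first <- service.createModelResponse(
      Inputs.Text("Give me a one-line fun fact about Scala.")
    )
    // Second turn, chained to the first via previous_response_id
    second <- service.createModelResponse(
      Inputs.Text("Now rephrase it as a haiku."),
      settings = CreateModelResponseSettings(
        model = "gpt-4.1", // model name taken from this release's notes
        previousResponseId = Some(first.id) // assumed field name
      )
    )
  } yield println(second)
}
```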
2. Core and OpenAI Enhancements
- JSON mode handling improvements and a fallback json-repair implementation (a port of `json-repair` by @mangiucugna)
- New models: `o3`, `o4-mini`, `gpt-4.1`, and `gpt-4.5` series
- Web search support (`gpt-4o-search-preview`)
- Chat completion parameters expanded (`store`, `reasoning_effort`, `service_tier`, `parallel_tool_calls`, `metadata`)
- Streaming and non-streaming IO conversion adapters developed and enhanced
- Token counting updated (`jtokkit` v1.1.0)
- Usage analytics improved
3. Anthropic Platform Enhancements
- Thinking and streaming settings integration
- Claude 3.7 Sonnet (Vanilla and via Bedrock)
- Citations handling, text blocks encoding improvements
- Caching support
- Enhanced token-limit error handling and mapped Anthropic to OpenAI exceptions
- A ton of new examples (also for Vision and PDF processing)
4. Google Gemini Integration
- New Google Gemini module and models introduced (Gemini 2.5 / 2.0 Pro and Flash)
- Gemini JSON schema handling improved, including OpenAI wrapper integration
- System message caching, domain content management, and usage tracking adjustments
- Also, Google Vertex now supports JSON schema mode
5. Perplexity Sonar Integration
- New Perplexity Sonar module and models introduced (sonar-deep-research, reasoning-pro, sonar-pro, etc.)
- Sonar JSON and regex response support, and citations formatting/handling
- OpenAI chat completion wrappers
6. Other Providers: Deepseek, Groq, Grok, FireworksAI, and Novita
- Groq JSON handling unified and adjusted, with `deepseek-r1-distill-llama-70b` integration
- JSON schema handling for Grok models
- FireworksAI improvements (document inlining), Deepseek model integrations
- Message conversions, filtering thinking tokens, reasoning effort examples
- Llama 4 family
- New Deepseek models (deepseek-r1, DeepSeek-R1 distill) across providers (FireworksAI, Groq, Together AI), plus other models such as Phi-3-vision-128k-instruct, Deepseek-v2-lite-chat, and Llama-3.3-70b
- New chat completion provider: `Novita` - welcome to the family!
7. General Project Setup and CI/CD
- Build setup adjustments (build.sbt registrations, env helpers)
- GitHub CI - upload-artifact version bump (to v4)
- Example datasets added (e.g., norway_wiki dump), imports optimized
- README extended with more examples
Version 1.1.2
- Amazon Bedrock support: Anthropic models, payload encoding, and AWS stream decoders
- O1 models support: system/developer messages, JSON mode, etc.
- New non-OpenAI models: Deepseek v3, Gemini 2.0 Flash (w. thinking), and Llama3
- Adapters: chat completion intercept (for logging and benchmarking)
- Other: new examples, Google Vertex upgrade, custom `parseJson` (for `createChatCompletionWithJSON`)
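Typed JSON output via `createChatCompletionWithJSON` (mentioned above) could be used roughly as follows. This is a hedged sketch: the function name comes from this changelog, but the case class, settings, return type, and any extra imports the extension may require are illustrative assumptions, not verified signatures.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import io.cequence.openaiscala.domain._
import io.cequence.openaiscala.domain.settings.CreateChatCompletionSettings
import io.cequence.openaiscala.service.OpenAIServiceFactory
import play.api.libs.json.{Format, Json}

object JsonModeSketch extends App {
  // Target structure we want the model to fill in (illustrative)
  case class CapitalInfo(country: String, capital: String)
  implicit val format: Format[CapitalInfo] = Json.format[CapitalInfo]

  val service = OpenAIServiceFactory()

  // May require importing the library's chat-completion JSON extensions
  service
    .createChatCompletionWithJSON[CapitalInfo](
      messages = Seq(UserMessage("What is the capital of France? Answer as JSON.")),
      settings = CreateChatCompletionSettings(model = ModelId.gpt_4o)
    )
    .map(info => println(info.capital))
}
```

A custom `parseJson` (per the release note above) can reportedly be plugged in to override how the raw model output is parsed, e.g. to tolerate malformed JSON.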
Version 1.1.1
- New Models:
  - Claude 3.5 Sonnet / Haiku, Llama 3.2, grok-beta, gpt-4o-2024-11-20, etc.
- New Providers:
  - Grok and Deepseek (with examples)
- Enhanced Anthropic Integration:
  - Message/block caching
  - System prompt generalized as a "message"
  - PDF block processing
- Better Support for JSON: `createChatCompletionWithJSON`
- Adapters:
  - Failover models support for chat completion (`createChatCompletionWithFailover`)
  - Core adapters/wrappers abstracted and relocated to `ws-client`
- Fixes:
  - Scala 2.13 JSON schema serialization
- Refactoring:
  - `ChatCompletionBodyMaker` (removed `WSClient` dependency)
  - Removed `org.joda.time.DateTime`
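The failover helper listed above could be used roughly like this. A hedged sketch: `createChatCompletionWithFailover` comes from this changelog, but the parameter names (notably `failoverModels`), the response shape, and the settings shown are assumptions - check the repository's examples for the exact signature.

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import io.cequence.openaiscala.domain._
import io.cequence.openaiscala.domain.settings.CreateChatCompletionSettings
import io.cequence.openaiscala.service.OpenAIServiceFactory

object FailoverSketch extends App {
  val service = OpenAIServiceFactory()

  // Try the primary model first; fall back to the others if a call fails
  service
    .createChatCompletionWithFailover(
      messages = Seq(UserMessage("Summarize this release in one sentence.")),
      settings = CreateChatCompletionSettings(model = ModelId.gpt_4o),
      failoverModels = Seq(ModelId.gpt_4o_mini) // assumed parameter name
    )
    .map(response => println(response.choices.head.message.content))
}
```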
Version 1.1.0
- Support for O1 Models: Introduced special handling for settings and conversion of system messages.
- New Endpoints: Added endpoints for runs, run steps, vector stores, vector store files, and vector store file batches.
- Structured Output: Enhanced output with JSON schema and introduced experimental reflective inference from a case class.
- New Providers: Added support for Google Vertex AI, TogetherAI, Cerebras, and Mistral.
- Fixes: Resolved issues related to fine-tune jobs format, handling of log-probs with special characters, etc.
- Examples/Scenarios: A plethora of new examples!
Lastly, a huge thanks 🙏 to our contributors: @bburdiliak, @branislav-burdiliak, @vega113, and @SunPj!
Version 1.0.0
API Updates
- New functions/endpoints for assistants, threads, messages, and files (thanks @branislav-burdiliak) 🎉
- Fine-tuning API updated (checkpoints, explicit dimensions, etc.)
- Chat completion with images (GPT vision), support for logprobs
- Audio endpoint updates - transcription, translation, speech
- New message hierarchy
- Support for tool calling
- New token count subproject (thanks @branislav-burdiliak , @piotrkuczko)
New Models and Providers
- Improved support for Azure OpenAI services
- Support for Azure AI models
- Support for Groq, Fireworks, OctoAI, Ollama, and FastChat (with examples) 🎉
- New Anthropic client project with OpenAI-compatible chat completion adapter (thanks @branislav-burdiliak) 🎉
- `NonOpenAIModelId` const object introduced, holding the most popular non-OpenAI models (e.g., Llama3-70b, Mixtral-8x22B)
Factories and Adapters
- Full, core, and chat completion factories refactored to support easier and more flexible streaming extension
- Service wrappers refactored and generalized to adapters
- Route adapter allowing the use of several providers (e.g., Groq and Anthropic) alongside OpenAI models for chat-completions 🎉
- Other adapters that have been added include round robin/random order load balancing, logging, retry, chat-to-completion, and settings/messages adapter
Bug Fixes and Improvements
- Fixed null bytes handling in TopLogprobInfo JSON format
- WS request - handling slash at the end of URLs
- Made "data:" prefix handling in `WSStreamRequestHelper` more robust
- `MultipartWritable` - added extension-implied content type for Azure file upload
- Fixed chat completion's response_format_type
Examples and Tests
- New example project demonstrating usage of most of the features, providers, and adapters (more than 50)
- Tests for token counts and JSON (de)serialization
Breaking Changes/Migrations:
- Instead of the deprecated `MessageSpec`, use the typed `{System, User, Assistant, Tool}Message`
- Instead of `createChatFunCompletion` with `FunctionSpec`(s), migrate to `createChatToolCompletion` using `ToolSpec`(s)
- Use the new factory `OpenAIChatCompletionServiceFactory` with a custom URL and auth headers when only chat completion is supported by a provider
- Migrate streaming service creation from `OpenAIServiceStreamedFactory` to `OpenAIStreamedServiceFactory` or `OpenAIServiceFactory.withStreaming()` (required import: `OpenAIStreamedServiceImplicits._`)
- Note that `OpenAIServiceStreamedExtra.listFineTuneEventsStreamed` has been removed
- Migrate `OpenAIMultiServiceAdapter.ofRoundRobin` and `ofRandomOrder` to `adapters.roundRobin` and `adapters.randomOrder` (where `adapters` is, e.g., `OpenAIServiceAdapters.forFullService`)
- Migrate `OpenAIRetryServiceAdapter` to `adapters.retry`
- `CreateImageSettings` -> `CreateImageEditSettings`
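The adapter migration above can be illustrated with a short sketch. The adapter names (`OpenAIServiceAdapters.forFullService`, `roundRobin`, `retry`) come from this changelog; the factory arguments and retry configuration are illustrative assumptions.

```scala
import io.cequence.openaiscala.service.OpenAIServiceFactory
import io.cequence.openaiscala.service.adapter.OpenAIServiceAdapters

object AdapterMigrationSketch extends App {
  val adapters = OpenAIServiceAdapters.forFullService

  // Two underlying services, e.g. with different API keys or orgs (assumed args)
  val serviceA = OpenAIServiceFactory(apiKey = "sk-...-A")
  val serviceB = OpenAIServiceFactory(apiKey = "sk-...-B")

  // Replaces the old OpenAIMultiServiceAdapter.ofRoundRobin
  val balanced = adapters.roundRobin(serviceA, serviceB)

  // Replaces the old OpenAIRetryServiceAdapter; the retry settings/implicits
  // it requires are not shown here - see the project's adapter examples
  // val resilient = adapters.retry(balanced, ...)
}
```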
Version 0.5.0
- Fine-tuning functions/endpoints updated to match the latest API:
  - Root URL changed to `/fine_tuning/jobs`
  - Fine-tune job holder (`FineTuneJob`) adjusted - e.g. `finished_at` added, `training_file` vs `training_files` (⚠️ Breaking Changes)
  - `OpenAIService.createFineTune` - settings (`CreateFineTuneSettings`) adapted - all attributes except `n_epochs`, `model`, and `suffix` dropped: e.g. `batch_size`, `learning_rate_multiplier`, `prompt_loss_weight` (⚠️ Breaking Changes)
  - `OpenAIService.listFineTunes` - new optional params: `after` and `limit`
  - `OpenAIService.listFineTuneEvents` - new optional params: `after` and `limit`
- Azure service factory functions fixed: `forAzureWithApiKey` and `forAzureWithAccessToken`
- New models added: `gpt-3.5-turbo-instruct` (with 0914 snapshot), `davinci-002`, and `babbage-002`
- Deprecations: `OpenAIService.createEdit` and `OpenAIServiceStreamedExtra.listFineTuneEventsStreamed` (not supported anymore)
- Links to the official OpenAI documentation updated
Version 0.4.1
- Retries:
  - `RetryHelpers` trait and a retry adapter implementing non-blocking retries introduced (thanks @phelps-sg)
  - `RetryHelpers` - fixed the exponential delay, logging, and the `fun(underlying)` call; removed unused implicits
  - New exceptions introduced (`OpenAIScalaUnauthorizedException`, `OpenAIScalaRateLimitException`, etc.) with error handling/catching, plus "registering" of those that should be automatically retried. ⚠️ The old/deprecated `OpenAIRetryServiceAdapter` was removed in favor of a new adapter in the client module (migrate if necessary).
- Support for "OpenAI-API-compatible" providers such as FastChat (Vicuna, Alpaca, LLaMA, Koala, fastchat-t5-3b-v1.0, mpt-7b-chat, etc.), Azure, or any other similar service with a custom URL. Explicit factory methods for Azure (with API key or access token) provided.
- Scalafmt and Scalafix configured and applied to the entire source base.
- Fixed test dependencies (e.g., `akka-actor-testkit-typed`, `scala-java8-compat`) and replaced `mockito-scala-scalatest` with `scalatestplus-mockito`.
- Configured GitHub CI with crossScalaVersions, a newer Java version, etc. (thanks @phelps-sg)
Version 0.4.0
- Function call support for chat completion with a JSON output:
  - Function `createChatFunCompletion` introduced, with functions defined by a new data holder `FunctionSpec`
  - To stay compatible with the standard `createChatCompletion` function, the existing `MessageSpec` has not been changed
  - Important ⚠️: to harmonize (redundant) response classes, `ChatMessage` was replaced by the identical `MessageSpec` class
- New `gpt-3.5-turbo` and `gpt-4` models for the `0613` snapshot added to `ModelId`. Old models deprecated.
- `OpenAIService.close` method declared with parentheses (thanks @phelps-sg).
- Scala `2.12` bumped to `2.12.18` and Scala `2.13` to `2.13.11`.
- sbt version bumped to `1.9.0` (thanks @phelps-sg)
- `Command` renamed to `EndPoint` and `Tag` renamed to `Param`
- `OpenAIMultiServiceAdapter` - `ofRotationType` deprecated in favor of the new name `ofRoundRobinType`
- `OpenAIMultiServiceAdapter` - `ofRandomAccessType` deprecated in favor of `ofRandomOrderType`
Version 0.3.3
- Scala 3 compatibility refactorings - `JsonFormats`, Scala-Guice lib upgrade, Guice modules, Enums migrations
- `OpenAIService.close` function for the underlying ws client exposed
- OpenAI service wrapper with two adapters introduced: 1. Multi-service (load distribution) - rotation or random order, 2. Retry service - custom number of attempts, wait on failure, and logging
Version 0.3.2
- Removed the deprecated/unused command/endpoint (`engines`) and settings (`ListFineTuneEventsSettings`)
- New exception `OpenAIScalaTokenCountExceededException` introduced - thrown for completion or chat completion when the number of tokens needed is higher than allowed
- Migrated `Command` and `Tag` enums to sealed case objects to simplify Scala 3 compilation
- Resolved the `scala-java8-compat` eviction problem for Scala 2.13 and Scala 3