generated from mintlify/starter

# Revised AI engineering docs (2) #473

Status: Open. dominicchapman wants to merge 32 commits into `main` from `dominic/evals-plus-wider-edits-v2`.
## Commits (32)

1. `2ae1a63` initial eval docs (c-ehrlich)
2. `a082b90` add note about instrumentation fn (c-ehrlich)
3. `7df0bdb` Stylistic fixes (manototh)
4. `0254557` Quick fixes (manototh)
5. `686a53e` Merge branch 'main' into evals-1 (manototh)
6. `7b8bd25` Fixes (manototh)
7. `2251591` Add keywords (manototh)
8. `2c662b2` Restructure Measure page (manototh)
9. `95d4c5c` Implement review (manototh)
10. `55e6bf4` Refactor (manototh)
11. `3e3050c` Update measure.mdx (manototh)
12. `89ce5ca` Update measure.mdx (manototh)
13. `ad26f30` docs: concepts and definitions (dominicchapman)
14. `d6a1130` docs: update overview (dominicchapman)
15. `55703d9` docs: new evaluate section (dominicchapman)
16. `c6d33c1` docs: create, evaluate/overview, remove measure from docs.json (dominicchapman)
17. `9a26814` docs: revise iterate (dominicchapman)
18. `528cf1f` docs: refinement (dominicchapman)
19. `aad93a6` Update ai-engineering/concepts.mdx (dominicchapman)
20. `f62b30d` docs: explain benefits of pickFlags (dominicchapman)
21. `58becf6` docs: less focus on temperature (dominicchapman)
22. `687548f` docs: remove duplicated content (dominicchapman)
23. `cd7856e` docs: remove 'reference' from concepts (dominicchapman)
24. `42890d6` docs: add model example to enum (dominicchapman)
25. `1b6c8e7` docs: remove watch mode (dominicchapman)
26. `3b8e48f` docs: remove marketing fluff (dominicchapman)
27. `e6e5c6c` docs: evaluator > evaluation (dominicchapman)
28. `93bb44b` docs: default flags to production config (dominicchapman)
29. `4ff0249` docs: update concepts for completeness (dominicchapman)
30. `aad74a6` docs: run-id feedback (dominicchapman)
31. `6875722` Merge branch 'main' into dominic/evals-plus-wider-edits-v2 (dominicchapman)
32. `fd2ac49` update `createAppScope` import (c-ehrlich)
## Changes

---
title: "Create"
description: "Build AI capabilities using any framework, with best support for TypeScript-based tools."
keywords: ["ai engineering", "create", "prompt", "capability", "vercel ai sdk"]
---

import { Badge } from "/snippets/badge.jsx"
import { definitions } from '/snippets/definitions.mdx'

Building an AI <Tooltip tip={definitions.Capability}>capability</Tooltip> starts with prototyping. You can use whichever framework you prefer. Axiom is focused on helping you evaluate and observe your capabilities rather than prescribing how to build them.

TypeScript-based frameworks like Vercel’s [AI SDK](https://sdk.vercel.ai) do integrate most seamlessly with Axiom’s tooling today, but that’s likely to evolve over time.

## Build your capability

Define your capability using your framework of choice. Here’s an example using Vercel’s [AI SDK](https://ai-sdk.dev/), which includes [many examples](https://sdk.vercel.ai/examples) covering different capability design patterns. Popular alternatives like [Mastra](https://mastra.ai) also exist.
```ts src/lib/capabilities/classify-ticket.ts expandable
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { wrapAISDKModel } from 'axiom/ai';
import { z } from 'zod';

export async function classifyTicket(input: {
  subject?: string;
  content: string;
}) {
  const result = await generateObject({
    model: wrapAISDKModel(openai('gpt-4o-mini')),
    messages: [
      {
        role: 'system',
        content: 'Classify support tickets as: question, bug_report, or feature_request.',
      },
      {
        role: 'user',
        content: input.subject
          ? `Subject: ${input.subject}\n\n${input.content}`
          : input.content,
      },
    ],
    schema: z.object({
      category: z.enum(['question', 'bug_report', 'feature_request']),
      confidence: z.number().min(0).max(1),
    }),
  });

  return result.object;
}
```
The `wrapAISDKModel` function instruments your model calls for Axiom’s observability features. Learn more in the [Observe](/ai-engineering/observe) section.

## Gather reference examples

As you prototype, collect examples of inputs and their correct outputs.
```ts
const referenceExamples = [
  {
    input: {
      subject: 'How do I reset my password?',
      content: 'I forgot my password and need help.',
    },
    expected: { category: 'question' },
  },
  {
    input: {
      subject: 'App crashes on startup',
      content: 'The app immediately crashes when I open it.',
    },
    expected: { category: 'bug_report' },
  },
];
```

These become your ground truth for evaluation. Learn more in the [Evaluate](/ai-engineering/evaluate/overview) section.
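To make the idea concrete, here is a minimal sketch of how reference examples get checked against a capability. `fakeClassify` is a hypothetical stand-in used only so the sketch runs without a model call; in practice you would call your real capability (such as `classifyTicket` above) instead.

```typescript
// Reference examples pair an input with the output we consider correct.
type Example = {
  input: { subject?: string; content: string };
  expected: { category: string };
};

const referenceExamples: Example[] = [
  {
    input: { subject: 'How do I reset my password?', content: 'I forgot my password and need help.' },
    expected: { category: 'question' },
  },
  {
    input: { subject: 'App crashes on startup', content: 'The app immediately crashes when I open it.' },
    expected: { category: 'bug_report' },
  },
];

// Hypothetical stand-in for a real model-backed classifier.
function fakeClassify(input: { subject?: string; content: string }): { category: string } {
  const text = `${input.subject ?? ''} ${input.content}`.toLowerCase();
  if (text.includes('crash')) return { category: 'bug_report' };
  return { category: 'question' };
}

// Count how many reference examples the classifier gets right.
const correct = referenceExamples.filter(
  (ex) => fakeClassify(ex.input).category === ex.expected.category,
).length;

console.log(`Accuracy: ${correct}/${referenceExamples.length}`);
```

Axiom's evaluation tooling automates this loop; the point here is only the shape of the data it consumes.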
## Structured prompt management

<Note>
The features below are experimental. Axiom’s current focus is on the evaluation and observability stages of the AI engineering workflow.
</Note>

For teams wanting more structure around prompt definitions, Axiom’s SDK includes experimental utilities for managing prompts as versioned objects.

### Define prompts as objects

Represent capabilities as structured `Prompt` objects:
```ts src/prompts/ticket-classifier.prompt.ts
import {
  experimental_Type,
  type experimental_Prompt
} from 'axiom/ai';

export const ticketClassifierPrompt = {
  name: "Ticket Classifier",
  slug: "ticket-classifier",
  version: "1.0.0",
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content: "Classify support tickets as: {{ categories }}",
    },
    {
      role: "user",
      content: "{{ ticket_content }}",
    },
  ],
  arguments: {
    categories: experimental_Type.String(),
    ticket_content: experimental_Type.String(),
  },
} satisfies experimental_Prompt;
```
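For intuition about what the `{{ argument }}` placeholders do, here is a toy substitution function. This is an illustration only, not Axiom's actual renderer; rendering real prompts is the job of `experimental_parse`, shown later on this page.

```typescript
// Toy illustration of `{{ name }}` substitution. Unknown placeholders
// are left untouched rather than throwing.
function renderTemplate(template: string, args: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    name in args ? args[name] : match,
  );
}

const rendered = renderTemplate(
  'Classify support tickets as: {{ categories }}',
  { categories: 'question, bug_report, feature_request' },
);

console.log(rendered);
// → Classify support tickets as: question, bug_report, feature_request
```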
### Type-safe arguments

The `experimental_Type` system provides type safety for prompt arguments:

```ts
arguments: {
  user: experimental_Type.Object({
    name: experimental_Type.String(),
    preferences: experimental_Type.Array(experimental_Type.String()),
  }),
  priority: experimental_Type.Union([
    experimental_Type.Literal("high"),
    experimental_Type.Literal("medium"),
    experimental_Type.Literal("low"),
  ]),
}
```
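For illustration, an `arguments` block like the one above describes a context shaped like the following hand-written TypeScript type (written out by hand here as an assumption about the mapping, not generated by the SDK):

```typescript
// Hand-written equivalent of the context described by the `arguments`
// block above: Object → object type, Array → array, Union of Literals
// → a string literal union.
type PromptContext = {
  user: {
    name: string;
    preferences: string[];
  };
  priority: 'high' | 'medium' | 'low';
};

// A context value conforming to that shape.
const example: PromptContext = {
  user: { name: 'Alex', preferences: ['email updates'] },
  priority: 'high',
};

console.log(example.priority);
```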
You can even infer the exact TypeScript type for a prompt’s context using the `InferContext` utility.

### Local testing

Test prompts locally before using them:
```ts
import { experimental_parse } from 'axiom/ai';
import { ticketClassifierPrompt } from './prompts/ticket-classifier.prompt';

// Render the templated messages with a concrete context.
const parsed = await experimental_parse(ticketClassifierPrompt, {
  context: {
    categories: 'question, bug_report, feature_request',
    ticket_content: 'How do I reset my password?',
  },
});

console.log(parsed.messages);
```
These utilities help organize prompts in your codebase. Centralized prompt management and versioning features may be added in future releases.

## What’s next?

Once you have a working capability and reference examples, systematically evaluate its performance.

To learn how to set up and run evaluations, see [Evaluate](/ai-engineering/evaluate/overview).