Add evals to AI engineering #469
Conversation
c-ehrlich left a comment:
Some big picture thoughts:
- I appreciate why we have the "Create/Measure/Observe/Iterate" workflow, but it feels strange that there is no page called "Evals", IMO.
- Obviously not part of this PR, but I would LOVE a video at the top of the page.
> The `Eval` function provides a simple, declarative way to define a test suite for your capability directly in your codebase.
> The key parameters of the `Eval` function:
note to self: need to better document configFlags
```ts
// Define the evaluation
Eval('spam-classification', {
  // Specify which flags this eval uses
  configFlags: pickFlags('ticketClassification'),
```
This is only defined / explained further down the page. I understand why, and don't really have a better solution, but still feels weird.
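For reference while reviewing, here is a minimal sketch of what a complete `Eval` definition might look like. Only the `Eval('spam-classification', ...)` call and the `configFlags: pickFlags('ticketClassification')` line come from the diff quoted above; the import path, the `classifyTicket` stub, and the `data`/`task`/`scorers` parameters are assumptions made for illustration, not the documented API.

```ts
// Rough sketch only. Everything except the Eval(...) call shape and
// configFlags: pickFlags('ticketClassification') (both taken from the diff)
// is an assumption made for illustration.
import { Eval, pickFlags } from '@your-org/ai-sdk/evals'; // placeholder import path

// Hypothetical capability under test (stubbed so the sketch is self-contained).
async function classifyTicket(input: string): Promise<string> {
  return input.toLowerCase().includes('prize') ? 'spam' : 'not_spam';
}

Eval('spam-classification', {
  // Specify which flags this eval uses (shown in the diff).
  configFlags: pickFlags('ticketClassification'),

  // Hypothetical test cases: each pairs an input with the expected label.
  data: () => [
    { input: 'Congratulations, you won a prize! Click here.', expected: 'spam' },
    { input: 'Can you help me reset my password?', expected: 'not_spam' },
  ],

  // Hypothetical task: run the capability once per input.
  task: (input: string) => classifyTicket(input),

  // Hypothetical scorer: exact match against the expected label.
  scorers: [
    ({ output, expected }: { output: string; expected: string }) => ({
      name: 'exact-match',
      score: output === expected ? 1 : 0,
    }),
  ],
});
```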
Closing in favor of #473, feel free to re-open if that's wrong.
Based on #465