Structured extraction in Python, powered by OpenAI's function calling API, designed for simplicity, transparency, and control.
This library is built to interact with OpenAI's function calling API from Python code, using Python structs and objects. It is designed to be intuitive and easy to use while giving great visibility into how we call OpenAI.
This library depends only on Pydantic and OpenAI.
To get started with OpenAI Function Call, you need to install it using pip. Run the following command in your terminal:

    $ pip install instructor

To simplify your work with OpenAI models and streamline the extraction of Pydantic objects from prompts, we offer a patching mechanism for the `ChatCompletion` class. Here's a step-by-step guide:
First, import the required libraries and apply the patch function to the OpenAI module. This exposes new functionality with the response_model parameter.

    import openai
    from pydantic import BaseModel
    from instructor import patch

    # Patch openai.ChatCompletion so that it accepts response_model.
    patch()

Create a Pydantic model to define the structure of the data you want to extract. This model will map directly to the information in the prompt.

    class UserDetail(BaseModel):
        name: str
        age: int

Use the openai.ChatCompletion.create method to send a prompt and extract the data into the Pydantic object. The response_model parameter specifies the Pydantic model to use for extraction.

    user: UserDetail = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        response_model=UserDetail,
        messages=[
            {"role": "user", "content": "Extract Jason is 25 years old"},
        ]
    )

You can then validate the extracted data by asserting the expected values. Adding the type annotation also gives you a number of IDE benefits, such as spell check and autocomplete!

    assert user.name == "Jason"
    assert user.age == 25

If you want more control than just passing a single class, you can use OpenAISchema, which extends BaseModel.
This quick start guide contains the following sections:
- Defining a schema
- Adding Additional Prompting
- Calling the ChatCompletion
- Deserializing back to the instance
OpenAI Function Call allows you to leverage OpenAI's powerful language models for function calls and schema extraction. This guide provides a quick start for using OpenAI Function Call.
To begin, let's define a schema using OpenAI Function Call. A schema describes the structure of the input and output data for a function. In this example, we'll define a simple schema for a User object:

    from instructor import OpenAISchema

    class UserDetails(OpenAISchema):
        name: str
        age: int

In this schema, we define a UserDetails class that extends OpenAISchema. We declare two fields, name and age, of type str and int respectively.
To enhance the performance of the OpenAI language model, you can add additional prompting in the form of docstrings and field descriptions. They can provide context and guide the model on how to process the data.
!!! note "Using patch"
These docstrings and field descriptions are powered by pydantic.BaseModel, so they'll work via the patching approach as well.

    from instructor import OpenAISchema
    from pydantic import Field

    class UserDetails(OpenAISchema):
        "Correctly extracted user information"
        name: str = Field(..., description="User's full name")
        age: int

In this updated schema, we use the Field class from pydantic to add a description to the name field. The description provides information about the field, giving even more context to the language model.
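To illustrate the idea that a docstring and field descriptions can travel with the schema, here is a small sketch that uses only the standard library. It is not how this library or Pydantic builds its schema: the `function_schema` helper, the `JSON_TYPES` mapping, and the use of dataclass `metadata` to hold descriptions are all assumptions made for illustration.

```python
import inspect
import json
from dataclasses import dataclass, field, fields

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class UserDetails:
    """Correctly extracted user information"""

    # A dataclass stands in for the Pydantic model; the description lives in metadata.
    name: str = field(metadata={"description": "User's full name"})
    age: int


def function_schema(cls) -> dict:
    """Illustrative only: derive a function-style schema from a dataclass."""
    properties = {}
    for f in fields(cls):
        prop = {"type": JSON_TYPES[f.type]}
        if "description" in f.metadata:
            prop["description"] = f.metadata["description"]
        properties[f.name] = prop
    return {
        "name": cls.__name__,
        # The class docstring becomes the function description.
        "description": inspect.getdoc(cls),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": sorted(properties),
        },
    }


schema = function_schema(UserDetails)
print(json.dumps(schema, indent=2))
```

The point of the sketch is that the class name, docstring, types, and descriptions all end up in one dictionary, so what you write in code is also what the model is prompted with.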
!!! note "Code, schema, and prompt"
We can run openai_schema to see exactly what the API will see. Notice how the docstrings, attributes, types, and field descriptions are now part of the schema. This describes one of this library's core philosophies.

    class UserDetails(OpenAISchema):
        "Correctly extracted user information"
        name: str = Field(..., description="User's full name")
        age: int

    UserDetails.openai_schema

    {
      "name": "UserDetails",
      "description": "Correctly extracted user information",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "description": "User's full name",
            "type": "string"
          },
          "age": {
            "type": "integer"
          }
        },
        "required": [
          "age",
          "name"
        ]
      }
    }

With the schema defined, let's proceed with calling the ChatCompletion API using the defined schema and messages.

    import openai
    from instructor import OpenAISchema
    from pydantic import Field

    class UserDetails(OpenAISchema):
        "Correctly extracted user information"
        name: str = Field(..., description="User's full name")
        age: int

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        functions=[UserDetails.openai_schema],
        # Ask the model to call the UserDetails function specifically.
        function_call={"name": UserDetails.openai_schema["name"]},
        messages=[
            {"role": "system", "content": "Extract user details from my requests"},
            {"role": "user", "content": "My name is John Doe and I'm 30 years old."},
        ],
    )

In this example, we make a call to the ChatCompletion API by providing the model name (gpt-3.5-turbo-0613) and a list of messages. The messages consist of a system message and a user message. The system message sets the context by requesting user details, while the user message provides the input with the user's name and age.
Note that we have omitted the additional parameters that can be included in the API request, such as temperature, max_tokens, and n. These parameters can be customized according to your requirements.
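As a rough sketch of where those optional parameters would go, the request can be assembled as a plain dictionary and then unpacked into `openai.ChatCompletion.create(**request)`. The `build_request` helper and the specific values chosen for `temperature`, `max_tokens`, and `n` are illustrative assumptions. The schema is copied by hand from the openai_schema output shown earlier, so nothing here calls the API.

```python
# Schema copied by hand from the UserDetails.openai_schema output shown earlier.
USER_DETAILS_SCHEMA = {
    "name": "UserDetails",
    "description": "Correctly extracted user information",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"description": "User's full name", "type": "string"},
            "age": {"type": "integer"},
        },
        "required": ["age", "name"],
    },
}


def build_request(schema: dict, messages: list, **options) -> dict:
    """Hypothetical helper: collect keyword arguments for ChatCompletion.create."""
    request = {
        "model": "gpt-3.5-turbo-0613",
        "functions": [schema],
        # Ask the model to call this specific function.
        "function_call": {"name": schema["name"]},
        "messages": messages,
    }
    # Optional parameters such as temperature, max_tokens, and n are merged in last.
    request.update(options)
    return request


messages = [
    {"role": "system", "content": "Extract user details from my requests"},
    {"role": "user", "content": "My name is John Doe and I'm 30 years old."},
]

# The option values below are arbitrary examples, not recommendations.
request = build_request(
    USER_DETAILS_SCHEMA, messages, temperature=0, max_tokens=200, n=1
)
print(sorted(request))
```

Keeping the request as a dictionary makes it easy to log or inspect exactly what will be sent before any call is made.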
To deserialize the response from the ChatCompletion API back into an instance of the UserDetails class, we can use the from_response method.

    user = UserDetails.from_response(completion)
    print(user.name)  # Output: John Doe
    print(user.age)  # Output: 30

By calling UserDetails.from_response, we create an instance of the UserDetails class using the response from the API call. Subsequently, we can access the extracted user details through the name and age attributes of the user object.
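To make the deserialization step more concrete, here is a minimal sketch of the general idea, not the library's actual implementation. It assumes the legacy ChatCompletion response shape, in which the function arguments arrive as a JSON string under `choices[0].message.function_call.arguments`. The `fake_completion` dictionary is written by hand, and a plain dataclass stands in for the Pydantic model.

```python
import json
from dataclasses import dataclass


@dataclass
class UserDetails:
    # Stand-in for the OpenAISchema model; a dataclass does no type validation.
    name: str
    age: int


def from_response(completion: dict) -> UserDetails:
    """Illustrative only: rebuild the object from a function-call response."""
    message = completion["choices"][0]["message"]
    # The arguments are returned as a JSON-encoded string, so decode them first.
    arguments = json.loads(message["function_call"]["arguments"])
    return UserDetails(**arguments)


# Hand-written response in the assumed legacy shape; no API call is made.
fake_completion = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "function_call": {
                    "name": "UserDetails",
                    "arguments": '{"name": "John Doe", "age": 30}',
                },
            }
        }
    ]
}

user = from_response(fake_completion)
print(user.name)
print(user.age)
```

Unlike this sketch, a Pydantic-based model would also validate the decoded values against the declared field types.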
Everything is designed for you to get the best developer experience possible, with the best editor support, including autocompletion and even inline errors.
This quick start guide provided you with a basic understanding of how to use OpenAI Function Call for schema extraction and function calls. You can now explore more advanced use cases and creative applications of this library.
Since UserDetails is an OpenAISchema and a pydantic.BaseModel, you can use inheritance and nesting to create more complex models while avoiding code duplication.

    from typing import List

    from instructor import OpenAISchema
    from pydantic import Field

    class UserDetails(OpenAISchema):
        name: str = Field(..., description="User's full name")
        age: int

    # Inheritance: reuse the name and age fields and add one more.
    class UserWithAddress(UserDetails):
        address: str

    # Nesting: fields can themselves be UserDetails objects.
    class UserWithFriends(UserDetails):
        best_friend: UserDetails
        friends: List[UserDetails]

If you have any questions, feel free to leave an issue or reach out to the library's author on Twitter. For a more comprehensive solution with additional features, consider checking out MarvinAI.
To see more examples of how we can create interesting models check out some examples.
This project is licensed under the terms of the MIT License.
$ openai_function_call git:(ft-cli) ✗ instructor jobs create-from-file data.jsonl
OpenAI Fine Tuning Job Monitoring
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ ┃ ┃ ┃ Completion ┃ ┃ ┃ ┃ ┃
┃ Job ID ┃ Status ┃ Creation Time ┃ Time ┃ Model Name ┃ File ID ┃ Epochs ┃ Base Model ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ ftjob-PWo6uwk… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 23:10:54 │ │ │ │ │ │
│ ftjob-1whjva8… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:47:05 │ │ │ │ │ │
│ ftjob-wGoBDld… │ 🚫 cancelled │ 2023-08-23 │ N/A │ │ file-F7lJg6Z4… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 22:44:12 │ │ │ │ │ │
│ ftjob-yd5aRTc… │ ✅ succeeded │ 2023-08-23 │ 2023-08-23 │ ft:gpt-3.5-tur… │ file-IQxAUDqX… │ 3 │ gpt-3.5-turbo-… │
│ │ │ 14:26:03 │ 15:02:29 │ │ │ │ │
└────────────────┴──────────────┴────────────────┴────────────────┴─────────────────┴────────────────┴────────┴─────────────────┘
Automatically refreshes every 5 seconds, press Ctrl+C to exit

