Create an Agent with OpenAI Function Calling Capabilities (2024)

Business Case: Development of a Farm Trip Assistant Agent

We aim to develop a farm trip assistant agent designed to enhance the user experience in planning farm visits. This digital assistant will offer comprehensive support by:

Identifying top farm destinations tailored to user location.
Providing detailed information on available activities at each farm.
Facilitating the booking of selected activities.
Offering a straightforward process for filing complaints, if necessary.

Application Architecture:

This flowchart shows the architecture of the application:

Pre-requisites:

OpenAI API key: You can obtain this from the OpenAI platform.

Step 1: Prepare to call the model:

To initiate a conversation, begin with a system message and a user prompt for the task:

Create a messages array to keep track of the conversation history.
Include a system message in the messages array to establish the assistant's role and context.
Welcome the users with a greeting message and prompt them to specify their task.
Add the user prompt to the messages array.

const messages: ChatCompletionMessageParam[] = [];
console.log(StaticPrompts.welcome);
messages.push(SystemPrompts.context);
const userPrompt = await createUserMessage();
messages.push(userPrompt);

I prefer to store all the prompts in objects for easy access and modification. The following code snippets show all the prompts used in the application. Feel free to adopt or modify this approach as it suits you.

StaticPrompts: Static messages that are used throughout the conversation.

export const StaticPrompts = {
 welcome:
 "Welcome to the farm assistant! What can I help you with today? You can ask me what I can do.",
 fallback: "I'm sorry, I don't understand.",
 end: "I hope I was able to help you. Goodbye!",
} as const;

UserPrompts: User messages that are generated based on user input.

import OpenAI from "openai";
type ChatCompletionUserMessageParam = OpenAI.ChatCompletionUserMessageParam;
type UserPromptKey = "task";
type UserPromptValue = (userInput?: string) => ChatCompletionUserMessageParam;
export const UserPrompts: Record<UserPromptKey, UserPromptValue> = {
 task: (userInput) => ({
 role: "user",
 content: userInput || "What can you do?",
 }),
};

SystemPrompts: System messages that are generated based on system context.

import OpenAI from "openai";
type ChatCompletionSystemMessageParam = OpenAI.ChatCompletionSystemMessageParam;
type SystemPromptKey = "context";
export const SystemPrompts: Record<
 SystemPromptKey,
 ChatCompletionSystemMessageParam
> = {
 context: {
 role: "system",
 content:
 "You are a farm visit assistant. You are upbeat and friendly. You introduce yourself when first saying `Howdy!`. If you decide to call a function, you should retrieve the required fields for the function from the user. Your answer should be as precise as possible. If you have not yet retrieved the required fields of the function completely, you do not answer the question and inform the user you do not have enough information.",
 },
};

FunctionPrompts: Function messages that are basically the return values of the functions.

import OpenAI from "openai";
type ChatCompletionToolMessageParam = OpenAI.ChatCompletionToolMessageParam;
type FunctionPromptKey = "function_response";
type FunctionPromptValue = (
 args: Omit<ChatCompletionToolMessageParam, "role">
) => ChatCompletionToolMessageParam;
export const FunctionPrompts: Record<FunctionPromptKey, FunctionPromptValue> = {
 function_response: (options) => ({
 role: "tool",
 ...options,
 }),
};

Step 2: Define the tools

As mentioned earlier, tools are essentially the descriptions of functions that the model can call. In this case, we define four tools to meet the requirements of the farm trip assistant agent:

get_farms: Retrieves a list of farm destinations based on user's location.
get_activities_per_farm: Provides detailed information on activities available at a specific farm.
book_activity: Facilitates the booking of a selected activity.
file_complaint: Offers a straightforward process for filing complaints.

The following code snippet demonstrates how these tools are defined:

import OpenAI from "openai";
import {
 ConvertTypeNameStringLiteralToType,
 JsonAcceptable,
} from "../utils/type-utils.js";
type ChatCompletionTool = OpenAI.ChatCompletionTool;
type FunctionDefinition = OpenAI.FunctionDefinition;
// An enum to define the names of the functions. This will be shared between the function descriptions and the actual functions
export enum DescribedFunctionName {
 FileComplaint = "file_complaint",
 getFarms = "get_farms",
 getActivitiesPerFarm = "get_activities_per_farm",
 bookActivity = "book_activity",
}
// This is a utility type to narrow down the `parameters` type in the `FunctionDefinition`.
// It pairs with the keyword `satisfies` to ensure that the properties of parameters are correctly defined.
// This is a workaround as the default type of `parameters` in `FunctionDefinition` is `type FunctionParameters = Record<string, unknown>` which is overly broad.
type FunctionParametersNarrowed<
 T extends Record<string, PropBase<JsonAcceptable>>
> = {
 type: JsonAcceptable; // basically all the types that JSON can accept
 properties: T;
 required: (keyof T)[];
};
// This is a base type for each property of the parameters
type PropBase<T extends JsonAcceptable = "string"> = {
 type: T;
 description: string;
};
// This utility type transforms parameter property string literals into usable types for function parameters.
// Example: { email: { type: "string" } } -> { email: string }
export type ConvertedFunctionParamProps<
 Props extends Record<string, PropBase<JsonAcceptable>>
> = {
 [K in keyof Props]: ConvertTypeNameStringLiteralToType<Props[K]["type"]>;
};
// Define the parameters for each function
export type FileComplaintProps = {
 name: PropBase;
 email: PropBase;
 text: PropBase;
};
export type GetFarmsProps = {
 location: PropBase;
};
export type GetActivitiesPerFarmProps = {
 farm_name: PropBase;
};
export type BookActivityProps = {
 farm_name: PropBase;
 activity_name: PropBase;
 datetime: PropBase;
 name: PropBase;
 email: PropBase;
 number_of_people: PropBase<"number">;
};
// Define the function descriptions
const FunctionDescriptions: Record<
 DescribedFunctionName,
 FunctionDefinition
> = {
 [DescribedFunctionName.FileComplaint]: {
 name: DescribedFunctionName.FileComplaint,
 description: "File a complaint as a customer",
 parameters: {
 type: "object",
 properties: {
 name: {
 type: "string",
 description: "The name of the user, e.g. John Doe",
 },
 email: {
 type: "string",
 description: "The email address of the user, e.g. john@doe.com",
 },
 text: {
 type: "string",
 description: "Description of issue",
 },
 },
 required: ["name", "email", "text"],
 } satisfies FunctionParametersNarrowed<FileComplaintProps>,
 },
 [DescribedFunctionName.getFarms]: {
 name: DescribedFunctionName.getFarms,
 description: "Get the information of farms based on the location",
 parameters: {
 type: "object",
 properties: {
 location: {
 type: "string",
 description: "The location of the farm, e.g. Melbourne VIC",
 },
 },
 required: ["location"],
 } satisfies FunctionParametersNarrowed<GetFarmsProps>,
 },
 [DescribedFunctionName.getActivitiesPerFarm]: {
 name: DescribedFunctionName.getActivitiesPerFarm,
 description: "Get the activities available on a farm",
 parameters: {
 type: "object",
 properties: {
 farm_name: {
 type: "string",
 description: "The name of the farm, e.g. Collingwood Children's Farm",
 },
 },
 required: ["farm_name"],
 } satisfies FunctionParametersNarrowed<GetActivitiesPerFarmProps>,
 },
 [DescribedFunctionName.bookActivity]: {
 name: DescribedFunctionName.bookActivity,
 description: "Book an activity on a farm",
 parameters: {
 type: "object",
 properties: {
 farm_name: {
 type: "string",
 description: "The name of the farm, e.g. Collingwood Children's Farm",
 },
 activity_name: {
 type: "string",
 description: "The name of the activity, e.g. Goat Feeding",
 },
 datetime: {
 type: "string",
 description: "The date and time of the activity",
 },
 name: {
 type: "string",
 description: "The name of the user",
 },
 email: {
 type: "string",
 description: "The email address of the user",
 },
 number_of_people: {
 type: "number",
 description: "The number of people attending the activity",
 },
 },
 required: [
 "farm_name",
 "activity_name",
 "datetime",
 "name",
 "email",
 "number_of_people",
 ],
 } satisfies FunctionParametersNarrowed<BookActivityProps>,
 },
};
// Format the function descriptions into tools and export them
export const tools = Object.values(
 FunctionDescriptions
).map<ChatCompletionTool>((description) => ({
 type: "function",
 function: description,
}));
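
The snippet above imports JsonAcceptable and ConvertTypeNameStringLiteralToType from ../utils/type-utils.js, which is not shown here. A minimal sketch of what those two types could look like, assuming they cover the JSON Schema type names; the actual definitions in the application may differ:

```typescript
// Hypothetical reconstruction of the type utilities; the original file is not shown.

// The JSON Schema type names a parameter may declare.
type JsonAcceptable =
  | "string"
  | "number"
  | "boolean"
  | "object"
  | "array"
  | "null";

// Maps a JSON Schema type name to the matching TypeScript type.
type ConvertTypeNameStringLiteralToType<T extends JsonAcceptable> =
  T extends "string" ? string
  : T extends "number" ? number
  : T extends "boolean" ? boolean
  : T extends "array" ? unknown[]
  : T extends "null" ? null
  : Record<string, unknown>;

// Example: the argument shape derived for a tool with one string and one number parameter.
type ExampleBookingArgs = {
  farm_name: ConvertTypeNameStringLiteralToType<"string">;
  number_of_people: ConvertTypeNameStringLiteralToType<"number">;
};

const exampleBookingArgs: ExampleBookingArgs = {
  farm_name: "Collingwood Children's Farm",
  number_of_people: 2,
};
```

With types like these, the arguments object passed to each function is derived from the tool description rather than declared twice.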

Understanding Function Descriptions

Function descriptions require the following keys:

name: Identifies the function.
description: Provides a summary of what the function does.
parameters: Defines the function's parameters, including their type, description, and whether they are required.
type: Specifies the parameter type, typically an object.
properties: Details each parameter, including its type and description.
required: Lists the parameters that are essential for the function to operate.

Adding a New Function

To introduce a new function, proceed as follows:

Extend DescribedFunctionName with a new enum member, such as DoNewThings.
Define a Props type for the parameters, e.g., DoNewThingsProps.
Insert a new entry in the FunctionDescriptions object.
Implement the new function in the function directory, naming it after the enum value.
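
The four steps above can be sketched as follows. This is an illustration only: the do_new_things tool, its topic parameter, and the descriptions are placeholders, and the types are simplified local stand-ins so that the snippet runs on its own.

```typescript
// Simplified local stand-ins for the types used in the tool definitions.
type PropBase<T extends string = "string"> = { type: T; description: string };
type FunctionDefinitionLike = {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, PropBase<string>>;
    required: string[];
  };
};

// 1. Extend the enum with the new member (hypothetical name).
enum DescribedFunctionName {
  DoNewThings = "do_new_things",
}

// 2. Define the Props type for the parameters.
type DoNewThingsProps = { topic: PropBase };

// 3. Insert a new entry in the descriptions object.
const FunctionDescriptions: Record<DescribedFunctionName, FunctionDefinitionLike> = {
  [DescribedFunctionName.DoNewThings]: {
    name: DescribedFunctionName.DoNewThings,
    description: "Placeholder description of the new capability",
    parameters: {
      type: "object",
      properties: {
        topic: { type: "string", description: "Placeholder parameter" },
      } satisfies DoNewThingsProps,
      required: ["topic"],
    },
  },
};

// 4. Implement the function, named after the enum value.
async function do_new_things(args: { topic: string }): Promise<string> {
  return JSON.stringify({ topic: args.topic, done: true });
}
```

Because the function is named after the enum value, the name the model returns in a tool call can be used directly to look up the implementation.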

Step 3: Call the model with the messages and the tools

With the messages and tools set up, we’re ready to call the model using them.

Note that function calling is only available on models that support the tools parameter. As of March 2024, these include gpt-3.5-turbo-0125 and gpt-4-turbo-preview; check OpenAI’s model documentation for the current list.

Code implementation:
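
The implementation of this step is not shown here, so the following is only a sketch of what a startChat helper might look like. It assumes the helper sends the messages and tools to the Chat Completions endpoint and returns the first choice's message, or null on failure. To keep it runnable without network access, the client is passed in as a parameter described by a small structural type; in the real application the OpenAI client from the openai package, whose chat.completions.create method accepts model, messages, and tools, would take that place. The three-parameter signature is an assumption of this sketch.

```typescript
// Minimal structural types; in the real app these come from the `openai` package.
type ChatMessage = { role: string; content: string | null; tool_calls?: unknown[] };
type ChatTool = { type: "function"; function: { name: string } };
type ChatClient = {
  chat: {
    completions: {
      create(body: {
        model: string;
        messages: ChatMessage[];
        tools: ChatTool[];
      }): Promise<{ choices: { message: ChatMessage }[] }>;
    };
  };
};

// Hypothetical startChat: returns the assistant message, or null if the request fails.
async function startChat(
  client: ChatClient,
  messages: ChatMessage[],
  tools: ChatTool[]
): Promise<ChatMessage | null> {
  try {
    const response = await client.chat.completions.create({
      model: "gpt-3.5-turbo-0125", // one of the models mentioned above
      messages,
      tools,
    });
    // The first choice carries either text content or tool_calls.
    return response.choices[0]?.message ?? null;
  } catch (error) {
    // Returning null lets the caller show the fallback message.
    console.error("Chat completion request failed:", error);
    return null;
  }
}
```

Returning null on failure matches the fallback branch used when handling the response in the next step.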

Step 4: Handling Model Responses

The model’s responses fall into two primary categories, with a potential for errors that necessitate a fallback message:

1. Function Call Request: The model indicates that it wants to call one or more functions. This is where function calling shows its real value: the model selects which function(s) to execute based on the context and the user's query. For instance, if the user asks for farm recommendations, the model may request a call to the get_farms function.

It doesn’t stop there: the model also analyzes the user input to determine whether it contains the necessary arguments for the function call. If not, the model prompts the user for the missing details.

Once it has gathered all required arguments, the model returns a JSON object detailing the function name and arguments. The arguments arrive as a JSON-encoded string, so the application parses them into a JavaScript object before invoking the specified function.

Additionally, the model can choose to call multiple functions, either simultaneously or in sequence, each requiring specific details. Managing this within the application is crucial for smooth operation.

Example of model’s response:

{
 "role": "assistant",
 "content": null,
 "tool_calls": [
 {
 "id": "call_JWoPQYmdxNXdNu1wQ1iDqz2z",
 "type": "function",
 "function": {
 "name": "get_farms", // The function name to be called
 "arguments": "{\"location\":\"Melbourne\"}" // The arguments required for the function
 }
 }
 ... // multiple function calls can be present
 ]
}

2. Plain Text Response: The model provides a direct text response. This is the standard output we’re accustomed to from AI models, offering straightforward answers to user queries. Simply returning the text content suffices for these responses.

Example of model’s response:

{
 "role": "assistant",
 "content": "I can help you with that. What is your location?"
}

The key distinction is the presence of a tool_calls key for function calls. If tool_calls is present, the model is requesting to execute a function; otherwise, it delivers a straightforward text response.

To process these responses, consider the following approach based on the response type:

type ChatCompletionMessageWithToolCalls = RequiredAll<
 Omit<ChatCompletionMessage, "function_call">
>;
// If the message contains tool_calls, it extracts the function arguments. Otherwise, it returns the content of the message.
export function processMessage(message: ChatCompletionMessage) {
 if (isMessageHasToolCalls(message)) {
 return extractFunctionArguments(message);
 } else {
 return message.content;
 }
}
// Check if the message has `tool calls`
function isMessageHasToolCalls(
 message: ChatCompletionMessage
): message is ChatCompletionMessageWithToolCalls {
 return isDefined(message.tool_calls) && message.tool_calls.length !== 0;
}
// Extract function name and arguments from the message
function extractFunctionArguments(message: ChatCompletionMessageWithToolCalls) {
 return message.tool_calls.map((toolCall) => {
 if (!isDefined(toolCall.function)) {
 throw new Error("No function found in the tool call");
 }
 try {
 return {
 tool_call_id: toolCall.id,
 function_name: toolCall.function.name,
 arguments: JSON.parse(toolCall.function.arguments),
 };
 } catch (error) {
 throw new Error("Invalid JSON in function arguments");
 }
 });
}
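
The snippet above relies on a few helpers that are not shown: RequiredAll, isDefined, and, in the branching code, isNonEmptyString. A possible sketch of them follows; the names come from the article's code, but the bodies are assumptions.

```typescript
// Hypothetical helpers; the original implementations are not shown.

// Makes every optional property required (same effect as the built-in Required<T>).
type RequiredAll<T> = { [K in keyof T]-?: T[K] };

// Narrows away null and undefined.
function isDefined<T>(value: T | null | undefined): value is T {
  return value !== null && value !== undefined;
}

// True only for strings that contain at least one non-whitespace character.
function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Example: after RequiredAll, tool_calls can be accessed without a further check.
type ExampleMessage = { content: string | null; tool_calls?: { id: string }[] };
const exampleMessage: RequiredAll<ExampleMessage> = {
  content: null,
  tool_calls: [{ id: "call_1" }],
};
const exampleToolCallCount = exampleMessage.tool_calls.length;
```

Both functions are type guards, so TypeScript narrows the checked value inside the corresponding if branch.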

The arguments extracted from the function calls are then used to execute the actual functions in the application, while the text content helps to carry on the conversation.

Below is an if-else block illustrating how this process unfolds:

const result = await startChat(messages);
if (!result) {
 // Fallback message if response is empty (e.g., network error)
 console.log(StaticPrompts.fallback);
} else if (isNonEmptyString(result)) {
 // If the response is a string, log it and prompt the user for the next message
 console.log(`Assistant: ${result}`);
 const userPrompt = await createUserMessage();
 messages.push(userPrompt);
} else {
 // If the response contains function calls, execute the functions and call the model again with the updated messages
 for (const item of result) {
 const { tool_call_id, function_name, arguments: function_arguments } = item;
 // Execute the function and get the function return
 const functionReturn = await AvailableFunctions[
 function_name as keyof typeof AvailableFunctions
 ](function_arguments);
 // Add the function output back to the messages with a role of "tool", the id of the tool call, and the function return as the content
 messages.push(
 FunctionPrompts.function_response({
 tool_call_id,
 content: functionReturn,
 })
 );
 }
}

Step 5: Execute the function and call the model again

When the model requests a function call, we execute that function in our application and then update the model with the new messages. This keeps the model informed about the function’s result, allowing it to give a pertinent reply to the user.

Maintaining the correct sequence of function executions is crucial, especially when the model chooses to execute multiple functions in a sequence to complete a task. Using a for loop instead of Promise.all preserves the execution order, essential for a successful workflow. However, if the functions are independent and can be executed in parallel, consider custom optimizations to enhance performance.
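
The difference between the two strategies can be sketched as follows. The type and function names here are illustrative and not part of the application. Note that Promise.all still returns its results in input order; what changes is that all calls start at once instead of one after another.

```typescript
type ToolCallItem = { tool_call_id: string; function_name: string; arguments: unknown };
type ToolResult = { tool_call_id: string; content: string };
type Executor = (item: ToolCallItem) => Promise<string>;

// Sequential: each call starts only after the previous one has finished.
async function runSequentially(
  items: ToolCallItem[],
  execute: Executor
): Promise<ToolResult[]> {
  const results: ToolResult[] = [];
  for (const item of items) {
    results.push({ tool_call_id: item.tool_call_id, content: await execute(item) });
  }
  return results;
}

// Parallel: all calls start immediately; suitable only for independent functions.
async function runInParallel(
  items: ToolCallItem[],
  execute: Executor
): Promise<ToolResult[]> {
  return Promise.all(
    items.map(async (item) => ({
      tool_call_id: item.tool_call_id,
      content: await execute(item),
    }))
  );
}
```

If one call depends on the side effects of another (for example, booking an activity after checking availability), the sequential version is the safer default.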

Here’s how to execute the function:

for (const item of result) {
 const { tool_call_id, function_name, arguments: function_arguments } = item;
 console.log(
 `Calling function "${function_name}" with ${JSON.stringify(
 function_arguments
 )}`
 );
 // Available functions are stored in an object for easy access
 const functionReturn = await AvailableFunctions[
 function_name as keyof typeof AvailableFunctions
 ](function_arguments);
}

And here’s how to update the messages array with the function response:

for (const item of result) {
 const { tool_call_id, function_name, arguments: function_arguments } = item;
 console.log(
 `Calling function "${function_name}" with ${JSON.stringify(
 function_arguments
 )}`
 );
 const functionReturn = await AvailableFunctions[
 function_name as keyof typeof AvailableFunctions
 ](function_arguments);
 // Add the function output back to the messages with a role of "tool", the id of the tool call, and the function return as the content
 messages.push(
 FunctionPrompts.function_response({
 tool_call_id,
 content: functionReturn,
 })
 );
}

Example of the functions that can be called:

// Mocking getting farms based on location from a database
export async function get_farms(
 args: ConvertedFunctionParamProps<GetFarmsProps>
): Promise<string> {
 const { location } = args;
 return JSON.stringify({
 location,
 farms: [
 {
 name: "Farm 1",
 location: "Location 1",
 rating: 4.5,
 products: ["product 1", "product 2"],
 activities: ["activity 1", "activity 2"],
 },
 ...
 ],
 });
}
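
The AvailableFunctions object used in the loops above is not shown. One way it could be built is a plain object keyed by the function names from the tool descriptions. The sketch below also guards the lookup, since the model may return a name that has no implementation; the callFunction helper and the mock bodies are assumptions for illustration.

```typescript
type ToolFunction = (args: any) => Promise<string>;

// Mock implementations standing in for the real functions.
async function get_farms(args: { location: string }): Promise<string> {
  return JSON.stringify({ location: args.location, farms: [] });
}

async function file_complaint(args: {
  name: string;
  email: string;
  text: string;
}): Promise<string> {
  return JSON.stringify({ status: "received", name: args.name });
}

// Registry keyed by the names the model sees in the tool descriptions.
const AvailableFunctions: Record<string, ToolFunction> = {
  get_farms,
  file_complaint,
};

// Hypothetical helper: runs a function by name, or returns an error payload
// that can be sent back to the model when the name is unknown.
async function callFunction(functionName: string, args: unknown): Promise<string> {
  const fn = Object.prototype.hasOwnProperty.call(AvailableFunctions, functionName)
    ? AvailableFunctions[functionName]
    : undefined;
  if (!fn) {
    return JSON.stringify({ error: `Unknown function: ${functionName}` });
  }
  return fn(args);
}
```

Returning the error as a JSON string means it can be added to the messages as a tool response like any other function result.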

Example of the tool message with function response (the content is sent to the model as a JSON string; it is shown parsed here for readability):

{
 "role": "tool",
 "tool_call_id": "call_JWoPQYmdxNXdNu1wQ1iDqz2z",
 "content": {
 // Function return value
 "location": "Melbourne",
 "farms": [
 {
 "name": "Farm 1",
 "location": "Location 1",
 "rating": 4.5,
 "products": [
 "product 1",
 "product 2"
 ],
 "activities": [
 "activity 1",
 "activity 2"
 ]
 },
 ...
 ]
 }
}

Step 6: Summarize the results back to the user

After running the functions and updating the message array, we re-engage the model with these updated messages to brief the user on the outcomes. This involves repeatedly invoking the startChat function via a loop.

To avoid endless looping, it’s crucial to monitor for user inputs signaling the end of the conversation, like “Goodbye” or “End,” ensuring the loop terminates appropriately.
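
A rough sketch of such a loop is shown below. It is not the application's actual code: the ask and reply callbacks are placeholders for reading user input and for calling startChat (including any function execution), and the exit words and the maxTurns cap are assumptions.

```typescript
type LoopMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Inputs that end the conversation (assumed list).
const EXIT_WORDS = ["goodbye", "end"];

function isExitMessage(input: string): boolean {
  return EXIT_WORDS.indexOf(input.trim().toLowerCase()) !== -1;
}

// Hypothetical loop: `ask` reads the next user input, `reply` stands in for
// the model call plus any tool execution and returns the assistant's text.
async function runConversation(
  ask: () => Promise<string>,
  reply: (messages: LoopMessage[]) => Promise<string>,
  maxTurns = 20
): Promise<LoopMessage[]> {
  const messages: LoopMessage[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const input = await ask();
    if (isExitMessage(input)) {
      break; // the user ended the conversation
    }
    messages.push({ role: "user", content: input });
    const answer = await reply(messages);
    messages.push({ role: "assistant", content: answer });
  }
  return messages;
}
```

The maxTurns cap is a second safeguard: even if the user never types an exit word, the loop stops after a fixed number of turns.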