← 返回列表

Function Calling Technical Summary

Function Calling Technical Summary

1. Definition

Function Calling is a mechanism that allows developers to describe available external tools (such as APIs) to a large language model (LLM) via JSON schema. When the model determines that a tool is needed to answer the user's question, it outputs structured tool_calls JSON data, specifying the function name and parameters to invoke. The host program parses and executes this call, returns the result to the model, and the model then generates the final answer.

2. Core Principle and Problems Solved

  • Essence: A closed loop of "two rounds of dialogue + intermediate execution." In the first round, the model decides and outputs a tool call request; the intermediate code executes the tool; in the second round, the model generates the final answer based on the execution result.
  • Problems Solved: It solves the previous issue of relying on unstable and error-prone natural language parsing (if/else judgments) when making the model call tools, achieving standardization and accuracy improvement through structured output.

3. Division of Responsibilities (Analogy to Task Delegation)

  • Developer (HR): Defines tools, writes JSON Schema describing tool functionality, parameters, etc.
  • LLM Model (Manager): Understands tool descriptions, decides whether to call a tool, which tool to call, and the parameters, and outputs a structured call request (tool_calls). The model only makes decisions and generates text, does not execute code itself.
  • Executor/Host Code (Employee): Parses the model's tool_calls request, actually executes the corresponding function or API call, and returns the result.

4. Tool Definition (JSON Schema)

Schema is the "instruction manual" for the tool, with key information including:
- name: Unique identifier for the tool.
- description: Crucial; the model relies entirely on this description to decide whether to call the tool. The clearer and more accurate the description, the better the model's decision.
- parameters: Defines the parameters required by the tool, their types, descriptions, and constraints (e.g., enum values, required or not).

5. Complete Call Flow

The article uses a code example of querying weather to demonstrate the entire process: from user question, carrying tool definitions for the first model call, model returning tool_calls, code executing the function, inserting the result as a message with role: "tool" back into the conversation history, to the final model generating a natural language answer.

6. Advanced Feature - Parallel Tool Calls

When a user's question requires multiple tools to work together (e.g., querying weather for multiple cities simultaneously), the model can output a list containing multiple tool_calls in a single response. The host code can execute these calls in parallel, then return all results to the model for synthesis, thereby improving efficiency.

评论

暂无已展示的评论。

发表评论(匿名)