Mini Agent Source Code Walkthrough, Part 5: The LLM Engine


Introduction

The previous posts covered the main ideas behind Mini Agent fairly thoroughly. This one takes a different angle and looks at how Mini Agent talks to different LLM providers: specifically, how it switches flexibly between the OpenAI and Anthropic calling conventions, and how the LLMEngine abstraction is built on top of that.

Code Example

examples/05_provider_selection.py demonstrates how to switch between LLM providers via the provider parameter of LLMClient. The example centers on how the LLMProvider enum is used, and how client attributes such as api_base change accordingly after a switch.

1. Selecting the call target via the provider parameter

The example initializes two clients, one with LLMProvider.ANTHROPIC and one with LLMProvider.OPENAI:

from mini_agent import LLMClient, LLMProvider, Message

anthropic_client = LLMClient(
    api_key=config["api_key"],
    provider=LLMProvider.ANTHROPIC,
    model=config.get("model", "MiniMax-M2.5"),
)

openai_client = LLMClient(
    api_key=config["api_key"],
    provider=LLMProvider.OPENAI,
    model=config.get("model", "MiniMax-M2.5"),
)

The two initializations are identical except for the value passed to provider. This shows that LLMClient shields callers from provider differences: the caller does not need to know which API is used underneath, because the provider value determines which endpoint the request goes to, how the request body is assembled, and how the response is parsed.

2. Default behavior when no provider is specified

The example also demonstrates the default behavior when the provider argument is omitted:

client = LLMClient(
    api_key=config["api_key"],
    model=config.get("model", "MiniMax-M2.5"),
)

print(f"Provider (default): {client.provider}")

The default provider is LLMProvider.ANTHROPIC, so when none is given the client automatically uses the Anthropic calling convention. This default is friendly for most scenarios: configuring an API key and a model name is enough to get started, without declaring a provider every time.

3. One interface, multiple providers

The example also compares the output of the two providers directly:

messages = [Message(role="user", content="What is 2+2?")]

anthropic_response = await anthropic_client.generate(messages)
print(f"Anthropic: {anthropic_response.content}")

openai_response = await openai_client.generate(messages)
print(f"OpenAI: {openai_response.content}")

The same messages object is passed to both clients, and both return correct results. Although the underlying calling conventions differ, the LLMClient.generate() interface is fully unified: the framework handles request-format conversion and response parsing internally, so upper-layer code never needs provider-specific branches.

Running this code reveals a difference between the two providers' outputs:

============================================================
DEMO: LLMClient with Anthropic Provider
============================================================
Provider: LLMProvider.ANTHROPIC
API Base: https://api.minimaxi.com/anthropic

👤 User: Say 'Hello from Anthropic!'
💭 Thinking: The user wants me to say "Hello from Anthropic!". This is a simple request to greet them on behalf of Anthropic.
💬 Model: Hello from Anthropic! 👋
✅ Anthropic provider demo completed

============================================================
DEMO: LLMClient with OpenAI Provider
============================================================
Provider: LLMProvider.OPENAI
API Base: https://api.minimaxi.com/v1

👤 User: Say 'Hello from OpenAI!'
💬 Model: Hello from OpenAI!
✅ OpenAI provider demo completed

That is, the OpenAI output contains no "Thinking" step, while the Anthropic output includes the model's reasoning process. This stems from differences in the two providers' interface designs, but from the caller's perspective those differences are already hidden by LLMClient.

Source Code Walkthrough

The files under mini_agent/llm/ make up Mini Agent's LLM calling layer, and the structure is very clear:

llm/
├── base.py               # LLMClientBase abstract base class
├── llm_wrapper.py        # LLMClient unified entry point (router)
├── anthropic_client.py   # Anthropic protocol implementation
└── openai_client.py      # OpenAI protocol implementation

The overall design: define a unified base-class interface, then use LLMClient as a dispatch layer that routes to the concrete protocol implementation based on the provider parameter.

1. The abstract base class: LLMClientBase

base.py defines the interface every LLM client must implement:

class LLMClientBase(ABC):
    @abstractmethod
    async def generate(
        self,
        messages: list[Message],
        tools: list[Any] | None = None,
    ) -> LLMResponse:
        pass

    @abstractmethod
    def _prepare_request(
        self,
        messages: list[Message],
        tools: list[Any] | None = None,
    ) -> dict[str, Any]:
        pass

    @abstractmethod
    def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
        pass

The three abstract methods map to the three stages every LLM call goes through: message-format conversion → request preparation → executing the call and parsing the response. A subclass only needs to implement these three methods to plug into the framework.

2. LLMClient: unified entry point and routing layer

LLMClient in llm_wrapper.py is the unified interface exposed to callers. Its job is to instantiate the appropriate underlying client based on the provider parameter:

class LLMClient:
    def __init__(
        self,
        api_key: str,
        provider: LLMProvider = LLMProvider.ANTHROPIC,
        api_base: str = "https://api.minimaxi.com",
        model: str = "MiniMax-M2.5",
        retry_config: RetryConfig | None = None,
    ):
        if provider == LLMProvider.ANTHROPIC:
            self._client = AnthropicClient(api_key=api_key, api_base=full_api_base, model=model, ...)
        elif provider == LLMProvider.OPENAI:
            self._client = OpenAIClient(api_key=api_key, api_base=full_api_base, model=model, ...)

After initialization, generate() simply delegates to the inner client:

async def generate(self, messages: list[Message], tools: list | None = None) -> LLMResponse:
    return await self._client.generate(messages, tools)

One detail worth noting: when api_base contains the MiniMax domain, LLMClient automatically appends a path suffix, /anthropic for Anthropic and /v1 for OpenAI. For third-party APIs, the api_base passed in is used unchanged. This automatic adaptation cuts down on configuration hassle.
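The suffix-completion behavior can be reconstructed roughly as follows. This is a hypothetical sketch based on the behavior described above and the API Base lines in the demo output; the actual helper in llm_wrapper.py may be named and structured differently:

```python
# Hypothetical reconstruction of the api_base auto-completion logic.
def resolve_api_base(api_base: str, provider: str) -> str:
    if "minimaxi.com" in api_base:
        # MiniMax endpoints get a protocol-specific path suffix
        suffix = "/anthropic" if provider == "anthropic" else "/v1"
        return api_base.rstrip("/") + suffix
    # Third-party endpoints are used exactly as passed in
    return api_base


print(resolve_api_base("https://api.minimaxi.com", "anthropic"))  # https://api.minimaxi.com/anthropic
print(resolve_api_base("https://api.minimaxi.com", "openai"))     # https://api.minimaxi.com/v1
print(resolve_api_base("https://example.com/v1", "openai"))       # https://example.com/v1
```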

3. AnthropicClient: the Anthropic protocol implementation

AnthropicClient follows Anthropic's API format; the main differences are concentrated in two places, message conversion and response parsing.

For message conversion, Anthropic's convention is to pull the system message out as a separate system parameter rather than keeping it in the messages array; it also supports thinking and tool_use as content blocks:

def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
    """Convert internal messages to Anthropic format.

    Args:
        messages: List of internal Message objects

    Returns:
        Tuple of (system_message, api_messages)
    """
    system_message = None
    api_messages = []

    for msg in messages:
        if msg.role == "system":  # handle the system message separately
            system_message = msg.content
            continue

        if msg.role in ["user", "assistant"]:
            if msg.role == "assistant" and (msg.thinking or msg.tool_calls):
                content_blocks = []

                if msg.thinking:
                    content_blocks.append({"type": "thinking", "thinking": msg.thinking})

                if msg.content:
                    content_blocks.append({"type": "text", "text": msg.content})

                if msg.tool_calls:
                    for tool_call in msg.tool_calls:
                        content_blocks.append(
                            {
                                "type": "tool_use",
                                "id": tool_call.id,
                                "name": tool_call.function.name,
                                "input": tool_call.function.arguments,
                            }
                        )

                api_messages.append({"role": "assistant", "content": content_blocks})
            else:
                api_messages.append({"role": msg.role, "content": msg.content})

        elif msg.role == "tool":  # tool execution results
            api_messages.append(
                {
                    "role": "user",  # Anthropic expects tool results under the user role
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": msg.tool_call_id,
                            "content": msg.content,
                        }
                    ],
                }
            )

    return system_message, api_messages

After conversion, a simple input message looks like the following (for reference only):

{
  "system_message": "You are a useful assistant.",
  "api_messages": [
    {
      "role": "user",
      "content": "Say 'Hello from Anthropic!'"
    }
  ],
  "tools": [
    {
      "name": "read_file",
      "description": "Read file contents from the filesystem...",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {
            "type": "string",
            "description": "Absolute or relative path to the file"
          }, ...
        },
        "required": [
          "path"
        ]
      }
    }
  ]
}

For response parsing, Anthropic returns content as an array of blocks, each of which must be checked by type:

def _parse_response(self, response: anthropic.types.Message) -> LLMResponse:
    """Parse Anthropic response into LLMResponse.

    Args:
        response: Anthropic Message response

    Returns:
        LLMResponse object
    """
    # Extract text content, thinking, and tool calls
    text_content = ""
    thinking_content = ""
    tool_calls = []

    for block in response.content:
        if block.type == "text":
            text_content += block.text
        elif block.type == "thinking":
            thinking_content += block.thinking
        elif block.type == "tool_use":
            # Parse Anthropic tool_use block
            tool_calls.append(
                ToolCall(
                    id=block.id,
                    type="function",
                    function=FunctionCall(
                        name=block.name,
                        arguments=block.input,
                    ),
                )
            )

    # Handle token usage accounting
    usage = None
    if hasattr(response, "usage") and response.usage:
        input_tokens = response.usage.input_tokens or 0
        output_tokens = response.usage.output_tokens or 0
        cache_read_tokens = getattr(response.usage, "cache_read_input_tokens", 0) or 0
        cache_creation_tokens = getattr(response.usage, "cache_creation_input_tokens", 0) or 0
        total_input_tokens = input_tokens + cache_read_tokens + cache_creation_tokens
        usage = TokenUsage(
            prompt_tokens=total_input_tokens,
            completion_tokens=output_tokens,
            total_tokens=total_input_tokens + output_tokens,
        )

    return LLMResponse(
        content=text_content,
        thinking=thinking_content if thinking_content else None,
        tool_calls=tool_calls if tool_calls else None,
        finish_reason=response.stop_reason or "stop",
        usage=usage,
    )

Here is an example of an Anthropic-format response:

{
  "id": "...",
  "content": [
    {
      "signature": "...",
      "thinking": "The user wants me to say hello. This is a simple request that doesn't require any tools.",
      "type": "thinking"
    },
    {
      "citations": null,
      "text": "Hello from Anthropic!",
      "type": "text"
    }
  ],
  "model": "MiniMax-M2.7",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "cache_creation": null,
    "cache_creation_input_tokens": null,
    "cache_read_input_tokens": null,
    "input_tokens": 296,
    "output_tokens": 28,
    "server_tool_use": null,
    "service_tier": null
  },
  "base_resp": {
    "status_code": 0,
    "status_msg": ""
  }
}

4. OpenAIClient: the OpenAI protocol implementation

OpenAIClient follows the same overall approach, but the concrete formats differ.

For message conversion, OpenAI keeps the system message inside the messages array, so no separate extraction is needed; it also carries the thinking process through a reasoning_details field:

def _convert_messages(self, messages: list[Message]) -> tuple[str | None, list[dict[str, Any]]]:
    """Convert internal messages to OpenAI format.

    Args:
        messages: List of internal Message objects

    Returns:
        Tuple of (system_message, api_messages)
        Note: OpenAI includes system message in the messages array
    """
    api_messages = []

    for msg in messages:
        if msg.role == "system":
            # OpenAI includes system message in messages array
            api_messages.append({"role": "system", "content": msg.content})
            continue

        # For user messages
        if msg.role == "user":
            api_messages.append({"role": "user", "content": msg.content})

        # For assistant messages
        elif msg.role == "assistant":
            assistant_msg = {"role": "assistant"}

            # Add content if present
            if msg.content:
                assistant_msg["content"] = msg.content

            # Add tool calls if present
            if msg.tool_calls:
                tool_calls_list = []
                for tool_call in msg.tool_calls:
                    tool_calls_list.append(
                        {
                            "id": tool_call.id,
                            "type": "function",
                            "function": {
                                "name": tool_call.function.name,
                                "arguments": json.dumps(tool_call.function.arguments),
                            },
                        }
                    )
                assistant_msg["tool_calls"] = tool_calls_list

            # IMPORTANT: replay historical thinking here so the chain of thought is not broken
            if msg.thinking:
                assistant_msg["reasoning_details"] = [{"text": msg.thinking}]

            api_messages.append(assistant_msg)

        # For tool result messages
        elif msg.role == "tool":
            api_messages.append(
                {
                    "role": "tool",
                    "tool_call_id": msg.tool_call_id,
                    "content": msg.content,
                }
            )

    return None, api_messages

An example of an OpenAI-format request:

{
  "api_messages": [
    {
      "role": "system",
      "content": "You are a useful assistant."
    },
    {
      "role": "user",
      "content": "Say 'Hello from OpenAI!'"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "read_file",
        "description": "Read file contents from the filesystem...",
        "parameters": {
          "type": "object",
          "properties": {
            "path": {
              "type": "string",
              "description": "Absolute or relative path to the file"
            }, ...
          },
          "required": [
            "path"
          ]
        }
      }
    }
  ]
}

For response parsing, reasoning_details is a field on the message object and must be extracted separately:

def _parse_response(self, response: Any) -> LLMResponse:
    """Parse OpenAI response into LLMResponse.

    Args:
        response: OpenAI ChatCompletion response (full response object)

    Returns:
        LLMResponse object
    """
    message = response.choices[0].message

    text_content = message.content or ""

    thinking_content = ""
    if hasattr(message, "reasoning_details") and message.reasoning_details:
        for detail in message.reasoning_details:
            if hasattr(detail, "text"):
                thinking_content += detail.text

    tool_calls = []
    if message.tool_calls:
        for tool_call in message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)

            tool_calls.append(
                ToolCall(
                    id=tool_call.id,
                    type="function",
                    function=FunctionCall(
                        name=tool_call.function.name,
                        arguments=arguments,
                    ),
                )
            )

    usage = None
    if hasattr(response, "usage") and response.usage:
        usage = TokenUsage(
            prompt_tokens=response.usage.prompt_tokens or 0,
            completion_tokens=response.usage.completion_tokens or 0,
            total_tokens=response.usage.total_tokens or 0,
        )

    return LLMResponse(
        content=text_content,
        thinking=thinking_content if thinking_content else None,
        tool_calls=tool_calls if tool_calls else None,
        finish_reason="stop",
        usage=usage,
    )

An example of an OpenAI-format response:

{
  "id": "...",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello from OpenAI!",
        "refusal": null,
        "role": "assistant",
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": null,
        "reasoning_content": "The user is asking me to say \"Hello from OpenAI!\". This is a simple request that doesn't require any tool usage. I'll just respond with the greeting.\n",
        "reasoning_details": [
          {
            "type": "reasoning.text",
            "id": "reasoning-text-1",
            "format": "MiniMax-response-v1",
            "index": 0,
            "text": "The user is asking me to say \"Hello from OpenAI!\". This is a simple request that doesn't require any tool usage. I'll just respond with the greeting.\n"
          }
        ]
      }
    }
  ],
  "created": 1776599306,
  "model": "MiniMax-M2.7",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 40,
    "prompt_tokens": 302,
    "total_tokens": 342,
    "completion_tokens_details": null,
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    },
    "total_characters": 0
  },
  "input_sensitive": false,
  "output_sensitive": false,
  "input_sensitive_type": 0,
  "output_sensitive_type": 0,
  "output_sensitive_int": 0,
  "base_resp": {
    "status_code": 0,
    "status_msg": "success"
  }
}

5. Core differences between the two protocols

Comparing the two implementations side by side, there are three core differences:

|                    | Anthropic                                   | OpenAI                        |
|--------------------|---------------------------------------------|-------------------------------|
| system handling    | extracted into a separate system parameter  | kept in the messages array    |
| thinking handling  | a thinking block inside content             | a reasoning_details field     |
| tool result format | tool_result block inside a user message     | a message with the tool role  |

On top of the two clients the framework wraps one more layer, LLMClient, so callers never have to care about these differences. Switching providers means changing a single parameter; the underlying message conversion and response parsing adapt automatically.
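The tool-result difference can be made concrete with a small side-by-side sketch. The two helper functions are illustrative (they do not exist in Mini Agent); the payload shapes are taken from the two _convert_messages implementations shown earlier:

```python
# How one internal tool-result message is rendered under each protocol.
def tool_result_anthropic(tool_call_id: str, content: str) -> dict:
    # Anthropic: a tool_result content block carried inside a user message
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_call_id, "content": content}
        ],
    }


def tool_result_openai(tool_call_id: str, content: str) -> dict:
    # OpenAI: a dedicated message with the "tool" role
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


print(tool_result_anthropic("call_1", "4")["role"])  # user
print(tool_result_openai("call_1", "4")["role"])     # tool
```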

Summary

This post walked through Mini Agent's LLM engine layer in full, from the code example to the source implementation.

At the example level, examples/05_provider_selection.py shows how LLMClient supports different LLM providers via the provider parameter. The same generate() interface switches seamlessly between the Anthropic and OpenAI calling conventions, with the differences hidden inside LLMClient.

At the source level, the four files under mini_agent/llm/ each have a clear role: LLMClientBase defines the unified abstract interface, LLMClient routes to a concrete implementation based on provider, and AnthropicClient and OpenAIClient each handle their own protocol details (message-format conversion, request-body assembly, and response parsing). The three main differences (system handling, thinking handling, and tool result format) are each absorbed inside the concrete clients.

The core benefit of this design is that it is friendly to extension: adding a new LLM provider only requires subclassing LLMClientBase, implementing the three core methods, and adding one conditional branch to LLMClient's initialization, with no changes to any upper-layer code. With this, the core modules of Mini Agent covered in this series (the tool system, the memory system, the workflow, and the LLM engine) have all been covered.


Original post: https://onlyar.site/2026/04/19/MiniMax-Agent-Guide-5/
Author: Only(AR), published April 19, 2026