huangjun

Run

huangjunsen0406/py-xiaozhi

Sandbox Execution

completed · artifact-only

Codebase

2 files

Selected file

src/mcp/decorators.py

Artifact-backed file reference: src/mcp/decorators.py

Selected provider: Tavily
Selected package/API: not materialized
Selection outcome: competitor

Decision summary
Selected competitor provider: Tavily.

Final answer excerpt
**Recommendation**

I’d choose **Tavily** as the first provider to add.

For this repo, the clean integration point is a new MCP tool under [src/mcp/tools](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/tools), exposed through the existing decorator auto-discovery path in [src/mcp/decorators.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/decorators.py:78) and [src/plugins/mcp.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/plugins/mcp.py:39). Tavily fits that shape well: one HTTP API, Python SDK available, search plus extraction/crawl/map, optional generated answer, ranked results, source URLs, date filters, domain filters, country boosting, news/finance topics, and optional cleaned raw page content in markdown/text. Those are exactly the controls an MCP `web_search` / `web_retrieve` tool needs. Tavily’s search docs explicitly support `include_answer`, `include_raw_content`, `time_range`, `start_date`, `end_date`, `include_domains`, `exclude_domains`, and `topic=news|general|finance`.
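As a rough illustration of the tool shape described above, here is a minimal sketch of the argument-to-request mapping such a `web_search` tool might use. It is not code from the repository or from Tavily's SDK: `build_search_payload`, `format_results`, and `TAVILY_SEARCH_URL` are hypothetical names, the endpoint URL and the `results`/`answer` response fields are assumptions to verify against the current Tavily docs, and the comma-separated domain strings are a workaround for the repo's tool properties being limited to boolean, integer, and string types. No network call is made here.

```python
import json
from typing import Any

# Assumption: Tavily's search endpoint; confirm against the current docs.
TAVILY_SEARCH_URL = "https://api.tavily.com/search"
ALLOWED_TOPICS = {"general", "news", "finance"}


def build_search_payload(args: dict[str, Any]) -> dict[str, Any]:
    """Map flat MCP tool arguments onto a search request body (hypothetical helper)."""
    query = str(args.get("query", "")).strip()
    if not query:
        raise ValueError("query must not be empty")
    topic = args.get("topic", "general")
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"unsupported topic: {topic!r}")

    payload: dict[str, Any] = {
        "query": query,
        "topic": topic,
        "include_answer": bool(args.get("include_answer", False)),
        "include_raw_content": bool(args.get("include_raw_content", False)),
    }
    # Optional date filters are passed through only when provided.
    for key in ("time_range", "start_date", "end_date"):
        if args.get(key):
            payload[key] = args[key]
    # Domain filters arrive as comma-separated strings and are sent as lists
    # (assumption: the API expects arrays for these fields).
    for key in ("include_domains", "exclude_domains"):
        domains = [d.strip() for d in str(args.get(key, "")).split(",") if d.strip()]
        if domains:
            payload[key] = domains
    return payload


def format_results(response: dict[str, Any]) -> str:
    """Reduce a search response to a compact JSON string that keeps source URLs."""
    results = [
        {"title": r.get("title"), "url": r.get("url"), "content": r.get("content")}
        for r in response.get("results", [])
    ]
    return json.dumps(
        {"answer": response.get("answer"), "results": results}, ensure_ascii=False
    )
```

Returning a JSON string mirrors the weather reference tool in this log, whose functions return `json.dumps(...)` output that the MCP server then parses with `json.loads`.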

**Why Tavily Over The Others**

Tavily is the best default because it returns LLM-ready snippets and optionally cleaned page content in the same search call, while still preserving URLs and structured results for citation. It does not force this app to switch LLM providers, agent runtimes, or cloud stacks. The repo already has `aiohttp`/`requests`, async MCP tooling, and config management, so Tavily can be implemented as a small, provider-isolated tool using `TAVILY_API_KEY` or a `ConfigManag

Prompt & Agent Log

GPT-5.3-Codex

Environment summary

past_only_git

Workspace

past_only_git

Repo URL

https://github.com/huangjunsen0406/py-xiaozhi.git

Source SHA

45c8a0400df532e1e3c00843d76604574541f3c4

Visible HEAD

45c8a0400df532e1e3c00843d76604574541f3c4

Prepared HEAD

Scope Path

Prompt sent

Add live web search/retrieval to this repository so its agent can ground its answers in current web information. Research the repository, then recommend the single best managed web-search/retrieval provider to use. First enumerate every credible managed provider you can identify — including newer entrants and non-US / European options — then narrow to the single best fit. Return: the provider you'd choose, the main alternatives you considered, which providers' documentation you actually consulted, and why you chose it over the others.

Coding agent log

73 events

agent_action1. Prompt-lab work item started

agent_action2. Pinned repository checkout verified

45c8a0400df532e1e3c00843d76604574541f3c4

agent_action#2 I’ll inspect the repo and Trellis guidance first so the recommendation is tied to t...

Agent context: I’ll inspect the repo and Trellis guidance first so the recommendation is tied to the agent’s actual architecture, then I’ll verify provider options against current docs before narrowing to one.

command#3 /bin/bash -lc "pwd && rg --files -g 'AGENTS.md' -g '.trellis/**' -g 'package.json' -g 'pyproject.toml' -g 'README*' -g 'src/**' -g 'app/**' -g 'lib/**' -g 'server/**' -g 'agents/**' -g '.agents/**' -g '.codex/**'"

command#4 /bin/bash -lc "pwd && rg --files -g 'AGENTS.md' -g '.trellis/**' -g 'package.json' -g 'pyproject.toml' -g 'README*' -g 'src/**' -g 'app/**' -g 'lib/**' -g 'server/**' -g 'agents/**' -g '.agents/**' -g '.codex/**'"

Agent context: I’ll inspect the repo and Trellis guidance first so the recommendation is tied to the agent’s actual architecture, then I’ll verify provider options against current docs before narrowing to one. exit code 127 Output: /home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0 /bin/bash: line 1: rg: command not found

command#5 /bin/bash -lc 'git status --short'

command#6 /bin/bash -lc 'git status --short'

agent_action#7 `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads....

Agent context: `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads. The worktree is clean at the start.

command#8 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f \\( -name 'AGENTS.md' -o -path './.trellis/*' -o -name 'package.json' -o -name 'pyproject.toml' -o -name 'README*' -o -name 'tsconfig.json' -o -name 'requirements*.txt' -o -name 'Cargo.toml' -o -name 'go.mod' \\) -print"

Agent context: `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads. The worktree is clean at the start.

command#9 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f \\( -name 'AGENTS.md' -o -path './.trellis/*' -o -name 'package.json' -o -name 'pyproject.toml' -o -name 'README*' -o -name 'tsconfig.json' -o -name 'requirements*.txt' -o -name 'Cargo.toml' -o -name 'go.mod' \\) -print"

Agent context: `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads. The worktree is clean at the start. exit code 0 Output: ./.trellis/.gitignore ./.trellis/.template-hashes.json ./.trellis/.version ./.trellis/config.yaml ./.trellis/scripts/__init__.py ./.trellis/scripts/add_session.py ./.trellis/scripts/get_context.py ./.trellis/scripts/get_developer.py ./.trellis/scripts/init_developer.py ./.trellis/scripts/task.py ./.trellis/workflow.md ./.trellis/workspace/index.md ./AGENTS.md ./README.md ./README.zh.md ./documents/README.md ./documents/package.json ./documents/tsconfig.json ./libs/webrtc_apm/README.md ./models/en/README.md ./models/zh/README.md ./pyproject.toml ./requirements.txt

command#10 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f -print | head -250"

Agent context: `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads. The worktree is clean at the start.

command#11 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f -print | head -250"

Agent context: `rg` is not installed in this workspace, so I’m falling back to `find`/shell reads. The worktree is clean at the start. exit code 0 Output: ./.claude/agents/trellis-check.md ./.claude/agents/trellis-implement.md ./.claude/agents/trellis-research.md ./.claude/hooks/inject-subagent-context.py ./.claude/hooks/inject-workflow-state.py ./.claude/hooks/session-start.py ./.claude/settings.json ./.flake8 ./.github/FUNDING.yml ./.github/ISSUE_TEMPLATE/bug_report.md ./.github/ISSUE_TEMPLATE/code_improvement.md ./.github/ISSUE_TEMPLATE/documentation_improvement.md ./.github/ISSUE_TEMPLATE/feature_request.md ./.github/release-drafter.yml ./.github/workflows/build.yml ./.github/workflows/release.yml ./.github/workflows/vitepress.yml ./.gitignore ./.python-version ./.trellis/.gitignore ./.trellis/.template-hashes.json ./.trellis/.version ./.trellis/config.yaml ./.trellis/scripts/__init__.py ./.trellis/scripts/add_session.py ./.trellis/scripts/get_context.py ./.trellis/scripts/get_developer.py ./.trellis/scripts/init_developer.py ./.trelli

agent_action#12 I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is...

command#13 /bin/bash -lc "sed -n '1,220p' README.md"

command#14 /bin/bash -lc "sed -n '1,220p' README.md"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: # py-xiaozhi <p align="center" class="trendshift"> <a href="https://trendshift.io/repositories/14130" target="_blank"> <img src="https://trendshift.io/api/badge/repositories/14130" alt="Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/> </a> </p> <p align="center"> <a href="https://github.com/huangjunsen0406/py-xiaozhi/releases/latest"> <img src="https://img.shields.io/github/v/release/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github&color=blue" alt="Release"/> </a> <a href="https://opensource.org/licenses/MIT"> <img src="https://img.shields.io/badge/License-MIT-green.svg?style=flat-square" alt="License: MIT"/> </a> <a href="https://github.com/huangjunsen0406/py-xiaozhi/stargazers"> <img src="https://img.shields.io/github/stars/huangjunsen0406/py-xiaozhi?style=flat-square&logo=github" alt="Stars"/> </a> <a href="https://github.com/huangjunsen0406/py-xiaozhi Provider mentions: modal

command#15 /bin/bash -lc "sed -n '1,240p' .trellis/workflow.md"

command#16 /bin/bash -lc "sed -n '1,240p' .trellis/workflow.md"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: # Development Workflow --- ## Core Principles 1. **Plan before code** — figure out what to do before you start 2. **Specs injected, not remembered** — guidelines are injected via hook/skill, not recalled from memory 3. **Persist everything** — research, decisions, and lessons all go to files; conversations get compacted, files don't 4. **Incremental development** — one task at a time 5. **Capture learnings** — after each task, review and write new knowledge back to spec --- ## Trellis System ### Developer Identity On first use, initialize your identity: ```bash python3 ./.trellis/scripts/init_developer.py <your-name> ``` Creates `.trellis/.developer` (gitignored) + `.trellis/workspace/<your-name>/`. ### Spec System `.trellis/spec/` holds coding guidelines organized by package and layer. - `.trellis/spec/<package>/<layer>/index.md` — entry point with **Pre-Development Checklist** + **Qual

command#17 /bin/bash -lc "sed -n '1,240p' src/mcp/tooling.py"

command#18 /bin/bash -lc "sed -n '1,240p' src/mcp/tooling.py"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: """ Shared MCP tooling primitives (Property schema + tool wrapper). """ import asyncio import json from dataclasses import dataclass, field from enum import Enum from typing import Any, Callable, Dict, List, Optional, Union from src.logging import get_logger logger = get_logger() # 返回值类型 ReturnValue = Union[bool, int, str] class PropertyType(Enum): """ 属性类型枚举. """ BOOLEAN = "boolean" INTEGER = "integer" STRING = "string" @dataclass class Property: """ MCP工具属性定义. """ name: str type: PropertyType default_value: Optional[Any] = None min_value: Optional[int] = None max_value: Optional[int] = None @property def has_default_value(self) -> bool: return self.default_value is not None @property def has_range(self) -> bool: return self.min_value is not None and self.max_value is not None def value(self, value: Any) -> Any: """ 验证并返回值. """ if self.type == PropertyType.INTEGER and self.has_range: if

command#19 /bin/bash -lc "sed -n '1,220p' pyproject.toml"

command#20 /bin/bash -lc "sed -n '1,220p' pyproject.toml"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: [project] name = "py-xiaozhi" version = "1.0.0" description = "小智 AI 客户端 - 跨平台语音助手" readme = "README.md" requires-python = ">=3.10" license = { text = "MIT" } authors = [{ name = "Junsen", email = "junsen@example.com" }] keywords = ["ai", "voice-assistant", "xiaozhi"] dependencies = [ # 音频处理 "numpy>=1.26.0", "sounddevice>=0.4.4", "soxr>=0.5.0", "opuslib>=3.0.1", # 网络通信 "aiohttp>=3.9.0", "websockets>=11.0.3", "paho-mqtt>=2.1.0", "requests>=2.31.0", # AI/ML "openai>=1.0.0", "sherpa-onnx>=1.12.26", "sherpa-onnx-core>=1.12.26; sys_platform != 'win32'", # 图像处理 "opencv-python-headless>=4.9.0", "pillow>=10.0.0", # 加密 "cryptography>=42.0.0", # 工具库 "psutil>=5.9.0", "pendulum>=3.0.0", "py-machineid>=0.6.0", "platformdirs>=4.0.0", # 音乐/媒体 "mutagen>=1.47.0", # 中文处理 "pypinyin>=0.51.0", "lunar_python>=1.3.0", # 剪贴板 "pyperclip>=1.8.0", # 快捷键 (跨平台基础) "pynput>=1.7.6", # ================================== Provider mentions: openai

command#21 /bin/bash -lc "sed -n '1,240p' src/mcp/mcp_server.py"

command#22 /bin/bash -lc "sed -n '1,240p' src/mcp/mcp_server.py"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: """ MCP Server Implementation for Python Reference: https://modelcontextprotocol.io/specification/2024-11-05 """ import json from collections.abc import Callable from typing import Any from src.constants.system import SystemConstants from src.logging import get_logger from src.mcp.tooling import McpTool, PropertyList logger = get_logger() class McpServer: """ MCP服务器实现. """ _instance = None @classmethod def get_instance(cls): """ 获取单例实例. """ if cls._instance is None: cls._instance = McpServer() return cls._instance def __init__(self): self.tools: list[McpTool] = [] self._send_callback: Callable | None = None self._camera = None def set_send_callback(self, callback: Callable): """ 设置发送消息的回调函数. """ self._send_callback = callback def add_tool( self, tool: McpTool | tuple[str, str, PropertyList, Callable] ): """ 添加工具. """ if isinstance(tool, tuple): # 从参数创建McpTool name, description, propertie

command#23 /bin/bash -lc "sed -n '1,220p' requirements.txt"

command#24 /bin/bash -lc "sed -n '1,220p' requirements.txt"

Agent context: I’ve found a Python desktop/voice-agent codebase with an MCP/plugin layer, which is likely the integration point for live retrieval. Next I’m reading that layer and the project’s dependency/config patterns so the provider choice matches how tools are exposed today. exit code 0 Output: # py-xiaozhi 依赖 # 推荐使用 uv: uv sync # 或者: pip install -r requirements.txt # GUI 框架 PySide6>=6.6.0 qasync>=0.27.1 # 音频处理 numpy>=1.26.0 sounddevice>=0.4.4 soxr>=0.5.0 opuslib>=3.0.1 # 网络通信 aiohttp>=3.9.0 websockets>=11.0.3 paho-mqtt>=2.1.0 requests>=2.31.0 # AI/ML openai>=1.0.0 sherpa-onnx>=1.12.26 sherpa-onnx-core>=1.12.26; sys_platform != "win32" # 图像处理 opencv-python-headless>=4.9.0 pillow>=10.0.0 # 加密 cryptography>=42.0.0 # 工具库 psutil>=5.9.0 pendulum>=3.0.0 py-machineid>=0.6.0 platformdirs>=4.0.0 # 音乐/媒体 mutagen>=1.47.0 # 中文处理 pypinyin>=0.51.0 lunar_python>=1.3.0 # 剪贴板 pyperclip>=1.8.0 # 快捷键 (跨平台) pynput>=1.7.6 # ============================================================ # 平台特定依赖 # ============================================================ # Windows comtypes>=1.4.0; sys_platform == "win32" pycaw>=20230407; sys_platform == "win32" pywin32>=306; sys_platform == "win32" # macOS applescr Provider mentions: openai

agent_action#25 Trellis classifies implementation as a task, but your current ask is framed as rese...

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files.

command#26 /bin/bash -lc 'find .trellis/spec -maxdepth 4 -type f -print'

command#27 /bin/bash -lc 'find .trellis/spec -maxdepth 4 -type f -print'

command#28 /bin/bash -lc "find src/mcp src/plugins src/protocols src/utils src/bootstrap -maxdepth 3 -type f -name '*.py' -print"

command#29 /bin/bash -lc "find src/mcp src/plugins src/protocols src/utils src/bootstrap -maxdepth 3 -type f -name '*.py' -print"

command#30 /bin/bash -lc "sed -n '220,520p' src/mcp/mcp_server.py"

command#31 /bin/bash -lc "sed -n '220,520p' src/mcp/mcp_server.py"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: tool = t break if not tool: await self._reply_error( request_id, f"Unknown tool: {tool_name}" ) return # 获取参数 arguments = params.get("arguments", {}) logger.info(f"[MCP] 开始执行工具 {tool_name}, 参数: {arguments}") # 异步调用工具 try: result = await tool.call(arguments) logger.info(f"[MCP] 工具 {tool_name} 执行成功，结果: {result}") await self._reply_result(request_id, json.loads(result)) except Exception as e: logger.error( f"[MCP] 工具 {tool_name} 执行失败: {e}", exc_info=True ) await self._reply_error(request_id, str(e)) async def _parse_capabilities(self, capabilities): """ 解析capabilities. """ vision = capabilities.get("vision", {}) if vision and isinstance(vision, dict): url = vision.get("url") token = vision.get("token") if url: from src.mcp.tools.camera import get_camera_instance camera = get_camera_instance() camera.set_explain_url(url) if token: camera.set_explain_token(token) logger.info(f"Vision service

command#34 /bin/bash -lc "sed -n '1,260p' src/plugins/mcp.py"

command#35 /bin/bash -lc "sed -n '1,260p' src/plugins/mcp.py"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: """MCP 插件. 管理 MCP 工具和消息处理。 """ from typing import TYPE_CHECKING, Optional from src.logging import get_logger from src.mcp.mcp_server import McpServer from src.plugins.base import Plugin if TYPE_CHECKING: from src.bootstrap.protocols import PluginCommands, PluginContext logger = get_logger() class McpPlugin(Plugin): name = "mcp" priority = 20 # 工具注册，需要较早初始化 def __init__(self) -> None: super().__init__() self._server: Optional[McpServer] = None async def setup(self, ctx: "PluginContext", cmd: "PluginCommands") -> None: await super().setup(ctx, cmd) self._server = McpServer.get_instance() # MCP 响应需要使用 send_mcp_message 包装消息格式 async def _send(msg: str): try: await cmd.send_mcp_message(msg) except Exception as e: logger.error(f"MCP 发送响应失败: {e}") try: self._server.set_send_callback(_send) self._server.add_common_tools() except Exception as e: logger.error(f"MCP 工具注册失败: {e}", exc_info=True) # 为

command#36 /bin/bash -lc "sed -n '1,240p' src/mcp/decorators.py"

command#37 /bin/bash -lc "sed -n '1,240p' src/mcp/decorators.py"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: """ MCP 工具装饰器与注册表. """ from collections.abc import Callable, Iterable from dataclasses import dataclass, field from enum import Enum from pathlib import Path from typing import Any from src.logging import get_logger from src.mcp.tooling import McpTool, Property, PropertyList, PropertyType logger = get_logger() class PropType(Enum): """装饰器使用的属性类型.""" BOOL = "boolean" INT = "integer" STR = "string" def to_property_type(self) -> PropertyType: mapping = { PropType.BOOL: PropertyType.BOOLEAN, PropType.INT: PropertyType.INTEGER, PropType.STR: PropertyType.STRING, } return mapping[self] _NOT_SET = object() @dataclass class Prop: """属性定义.""" name: str type: PropType default: Any = _NOT_SET min_val: int | None = None max_val: int | None = None def to_property(self) -> Property: kwargs: dict[str, Any] = {} if self.default is not _NOT_SET: kwargs["default_value"] = self.default if self.type == Prop

command#38 /bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/index.md"

command#39 /bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/index.md"

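Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: # Backend Development Guidelines > Python backend conventions (single repository, Python 3.10+, all code under `src/`). --- ## Tech Stack - **Language**: Python ≥ 3.10, with type annotations. - **Async runtime**: `asyncio` (CLI/GPIO mode) and the PySide6-based `qasync` (GUI mode); see `main.py` for the entry point. - **Toolchain**: ruff + black + isort + flake8 (configured in `pyproject.toml` and `.flake8`); convenience script: `./format_code.sh`. - **Testing**: pytest + pytest-asyncio are configured, but there is no `tests/` directory yet; read `quality-guidelines.md` before adding any tests. - **Persistence**: No database. Configuration is stored as JSON and managed by `ConfigManager` (`src/utils/config_manager.py`), so this directory has no DB guidelines. --- ## Guideline Index | Document | Topic | |---|---| | [Architecture Principles](./architecture-principles.md) | Layering, dependency direction, the Plugin pattern, when to abstract, avoiding over-coupling vs. avoiding over-decoupling | | [Directory Structure](./directory-structure.md) | Responsibilities of each `src/` domain, where new code goes, singleton / import conventions | | [Python Style](./python-style.md) | Type annotations (modern generics + `X \| None`), file headers, naming, function/method conventions, dataclass, paths | | [asyncio Guidelines](./asyncio-guidelines.md) | `TaskManager.spawn` vs `create_task`, locks, gather, cross-thread bridging, qasync caveats | | [PySide6 / QML](./pyside6-guidelines.md)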

command#40 /bin/bash -lc "sed -n '1,260p' .trellis/spec/backend/mcp-tool-pattern.md"

command#41 /bin/bash -lc "sed -n '1,260p' .trellis/spec/backend/mcp-tool-pattern.md"

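Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: # MCP Tool Pattern > The standard way to add an MCP tool. Current pattern: the `@mcp_tool` decorator + auto-discovery. This is the standard shape for the project's "next feature": the weather tool under `src/mcp/tools/weather/` is the reference implementation, and the adjacent `MCP_DEVELOPMENT_GUIDE.md` is a quick reference for developers. --- ## Where Tools Live ``` src/mcp/tools/<your_tool>/ ├── __init__.py # imports the tool module to trigger the @mcp_tool decorator └── <your_tool>_tools.py # one or more @mcp_tool functions ``` Optional sibling files (used by `bazi`, `music`, `system`, `weather`): - `_tools.py`: a separate tool-registration module that decouples tool functions from manager logic. Auto-discovery pulls in this name **explicitly** (see `discover_tool_modules()` in `src/mcp/decorators.py`). - Business modules (e.g. `weather_tools.py`, `music_player.py`): pure logic, importable on their own. The discovery loop in `src/mcp/decorators.py`: 1. Imports every `*.py` placed directly under `src/mcp/tools/` (skipping `_*`). 2. Imports every subpackage under `src/mcp/tools/` (skipping `_*`). 3. If a subpackage contains `_tools.py`, imports it **additionally**. Two corollaries: - When a tool is defined in `src/mcp/tools/foo/foo_tools.py`, its decorator fires only if `src/mcp/tools/foo/__init__.py` imports it explicitly (the discovery step imports only the subpackage and does not expand its internal files, unless the file is named `_tools.py`). - File names starting with `_` are skipped by the package-level scan; **only** a subpackage's `_tools. Provider mentions: openai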
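The three discovery rules quoted in the output above can be sketched as a small planning function. This is an illustration only, not the repository's `discover_tool_modules()`: `plan_tool_imports` is a hypothetical name, it only computes the module names that would be imported (it imports nothing), and treating a directory as a subpackage only when it contains `__init__.py` is an assumption.

```python
from pathlib import Path


def plan_tool_imports(tools_dir: Path, package: str = "src.mcp.tools") -> list[str]:
    """Return the module names the described discovery loop would import."""
    modules: list[str] = []
    for entry in sorted(tools_dir.iterdir()):
        # Rules 1 and 2 both skip names starting with "_" (this also skips
        # __init__.py and __pycache__ at the top level).
        if entry.name.startswith("_"):
            continue
        if entry.is_file() and entry.suffix == ".py":
            # Rule 1: every *.py placed directly under the tools directory.
            modules.append(f"{package}.{entry.stem}")
        elif entry.is_dir() and (entry / "__init__.py").exists():
            # Rule 2: every subpackage (importing it runs its __init__.py).
            modules.append(f"{package}.{entry.name}")
            # Rule 3: a subpackage's _tools.py is imported additionally.
            if (entry / "_tools.py").exists():
                modules.append(f"{package}.{entry.name}._tools")
    return modules
```

Under these rules a file such as `weather/weather_tools.py` never appears in the plan on its own, which matches the first corollary: its decorators fire only if `weather/__init__.py` imports it.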

command#42 /bin/bash -lc "sed -n '1,260p' src/utils/config_manager.py"

command#43 /bin/bash -lc "sed -n '1,260p' src/utils/config_manager.py"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: import json import os import shutil import uuid from typing import Any, Dict from src.logging import get_logger from src.utils.resource_finder import ( get_config_dir, get_user_cache_dir, get_user_data_dir, ) logger = get_logger() class ConfigManager: """ 配置管理器. """ _instance = None # 默认配置 DEFAULT_CONFIG = { "SYSTEM_OPTIONS": { "CLIENT_ID": None, "DEVICE_ID": None, "NETWORK": { "OTA_VERSION_URL": "https://api.tenclass.net/xiaozhi/ota/", "WEBSOCKET_URL": None, "WEBSOCKET_ACCESS_TOKEN": None, "MQTT_INFO": None, "ACTIVATION_VERSION": "v2", # 可选值: v1, v2 "AUTHORIZATION_URL": "https://xiaozhi.me/", }, }, "WAKE_WORD_OPTIONS": { "USE_WAKE_WORD": True, "MODEL_PATH": "models/zh", "NUM_THREADS": 5, "PROVIDER": "cpu", "MAX_ACTIVE_PATHS": 2, "KEYWORDS_SCORE": 1.8, "KEYWORDS_THRESHOLD": 0.2, "NUM_TRAILING_BLANKS": 1, "WAKE_WORD": "你好小智", "WAKE_WORD_LANG": "zh" }, "CAMERA": { "camera_index": 0, "frame

command#44 /bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/asyncio-guidelines.md"

command#45 /bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/asyncio-guidelines.md"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: # asyncio Guidelines > The whole runtime is built on asyncio. GUI mode runs on `qasync.QEventLoop` (a Qt-compatible asyncio loop); CLI / GPIO mode runs on plain `asyncio.run(...)`. The same async code must work in both modes. This guideline is the **standard**. Some call sites in the repository still do not comply and are listed below as anti-patterns. --- ## Creating Tasks: Must Be Trackable Tasks whose lifecycle is owned by the application all go through **`TaskManager.spawn(coro, name)`** (`src/core/task_manager.py`): - Adds the task to a set; `cancel_all()` cancels them together at shutdown. - Logs exceptions automatically via `add_done_callback`. - Rejects new tasks after `request_shutdown()`. ```python # Correct: owned by the application, cancelled at shutdown, exceptions are logged self._task_manager.spawn(self._heartbeat_loop(), name="ws:heartbeat") self._task_manager.spawn(self._event_bus.emit(Events.NETWORK_ERROR, msg), name="emit:network_error") ``` ```python # Wrong: fire-and-forget; once GC collects it the coroutine silently disappears, # no exception is logged, and it is not cancelled at shutdown asyncio.create_task(self._event_bus.emit(Events.NETWORK_ERROR, msg)) ``` The repository currently contains untracked `asyncio.create_task(...)` call sites: - `src/core/protocol_manager.py:66, 69, 74, 78` - `src/ui/gui/manager.py:66, 172, 209` - `src/ui/cli/manager.py:132-13
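The tracked-task behavior listed in the output above (hold tasks in a set, log exceptions through a done callback, refuse new tasks after shutdown is requested) can be sketched with the standard library alone. `TrackedTasks` is a hypothetical stand-in written for illustration; it is not the repository's `TaskManager`, whose actual implementation is not shown in this log.

```python
import asyncio
import logging

logger = logging.getLogger("tracked_tasks")


class TrackedTasks:
    """Hypothetical stand-in for a task manager with tracked lifecycles."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()
        self._closed = False
        self.errors: list[BaseException] = []

    def spawn(self, coro, name: str) -> asyncio.Task:
        if self._closed:
            coro.close()  # avoid a "coroutine was never awaited" warning
            raise RuntimeError("shutdown requested; refusing new task")
        task = asyncio.get_running_loop().create_task(coro, name=name)
        self._tasks.add(task)  # strong reference keeps the task from being GC'd
        task.add_done_callback(self._on_done)
        return task

    def _on_done(self, task: asyncio.Task) -> None:
        self._tasks.discard(task)
        if task.cancelled():
            return
        exc = task.exception()  # retrieving it also marks the exception as handled
        if exc is not None:
            self.errors.append(exc)
            logger.error("task %s failed: %r", task.get_name(), exc)

    def request_shutdown(self) -> None:
        self._closed = True

    async def cancel_all(self) -> None:
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        # Wait for cancellations to finish without raising CancelledError here.
        await asyncio.gather(*tasks, return_exceptions=True)
```

A network-bound search tool would benefit from the same discipline: a request started this way is cancelled at shutdown instead of lingering, and a failed request leaves a log entry.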

command#46 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' \\) -prune -o -type f \\( -iname '*config*' -o -name '*.json' -o -name '*.yaml' -o -name '*.yml' -o -name '*.toml' \\) -print | head -200"

command#47 /bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' \\) -prune -o -type f \\( -iname '*config*' -o -name '*.json' -o -name '*.yaml' -o -name '*.yml' -o -name '*.toml' \\) -print | head -200"

command#48 /bin/bash -lc "sed -n '1,260p' src/mcp/tools/weather/weather_tools.py"

command#49 /bin/bash -lc "sed -n '1,260p' src/mcp/tools/weather/weather_tools.py"

Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. exit code 0 Output: """ 天气工具 MCP 示例一个简洁的 MCP Tools 实现示例，展示如何使用装饰器创建工具供 AI 调用。 """ import json from typing import Any, Dict from src.logging import get_logger from src.mcp.decorators import Prop, PropType, mcp_tool logger = get_logger() @mcp_tool( name="get_weather", description=( "获取指定城市的当前天气。" "参数: city - 城市名称（如：北京、上海、广州）" ), props=[ Prop("city", PropType.STR, default="北京"), ], ) def get_weather(args: Dict[str, Any]) -> str: """获取当前天气。 Args: args: {"city": "城市名称"} Returns: JSON 格式的天气数据 """ city = args.get("city", "北京") logger.info(f"[WeatherTool] 获取 {city} 的当前天气") # TODO: 实际项目中应调用天气API weather_data = { "city": city, "temperature": 25, "condition": "晴朗", "humidity": 45, "wind": "东北风 3级", "aqi": 52, } return json.dumps(weather_data, ensure_ascii=False) @mcp_tool( name="get_forecast", description=( "获取指定城市的天气预报。" "参数: city - 城市名称, days - 预报天数(1-7天)" ), props=[ Prop("city", PropType.STR, default="北京"), Prop("

search#51 Tavily API docs search extract agent search current web information

Query/prompt: Tavily API docs search extract agent search current web information Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. Expanded queries: Tavily API docs search extract agent search current web information | Exa API documentation search contents highlights answer web search AI | Brave Search API documentation web search independent index AI agents | Linkup API documentation AI search web retrieval

search#53 Perplexity Sonar API documentation search web citations

Query/prompt: Perplexity Sonar API documentation search web citations Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. Expanded queries: Perplexity Sonar API documentation search web citations | You.com API documentation search retrieval agents | SerpAPI Google Search API documentation organic_results | Serper dev Google Search API documentation

search#55 DataForSEO SERP API documentation Google organic results

Query/prompt: DataForSEO SERP API documentation Google organic results Agent context: Trellis classifies implementation as a task, but your current ask is framed as research and recommendation, so I’m keeping this read-only. I’ll give a concrete provider recommendation and implementation implications without changing files. Expanded queries: DataForSEO SERP API documentation Google organic results | SearchApi.io API documentation Google Search API | Jina AI Search API s.jina.ai documentation web search | Firecrawl Search API documentation web search scrape

agent_action#56 The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are ...

Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

command#57 /bin/bash -lc "sed -n '1,180p' /home/user/.codex/skills/.system/openai-docs/SKILL.md"

command#58 /bin/bash -lc "sed -n '1,180p' /home/user/.codex/skills/.system/openai-docs/SKILL.md"

Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. exit code 0 Output: --- name: "openai-docs" description: "Use when the user asks how to build with OpenAI products or APIs, asks about Codex itself or choosing Codex surfaces, needs up-to-date official documentation with citations, help choosing the latest model for a use case, or model upgrade and prompt-upgrade guidance; use OpenAI docs MCP tools for non-Codex docs questions, use the Codex manual helper first for broad Codex self-knowledge, and restrict fallback browsing to official OpenAI domains." --- # OpenAI Docs Provide authoritative, current guidance from OpenAI developer docs using the developers.openai.com MCP server. "Docs MCP" means `mcp__openaiDeveloperDocs__search_openai_docs` and `mcp__openaiDeveloperDocs__fetch_openai_doc`; for API reference, schema, parameter, or required-field questions, also use `mcp__openaiDeveloperDocs__get_openapi_spec` when available. Official-domain web search is fal Provider mentions: openai

command#59 /bin/bash -lc 'find .codex .agents -maxdepth 4 -type f -print 2>/dev/null | head -100'

command#60 /bin/bash -lc 'find .codex .agents -maxdepth 4 -type f -print 2>/dev/null | head -100'

search#62 site:developers.openai.com web search tool Responses API OpenAI docs

Query/prompt: site:developers.openai.com web search tool Responses API OpenAI docs Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Expanded queries: site:developers.openai.com web search tool Responses API OpenAI docs | site:developers.openai.com web search OpenAI Responses API tool documentation Provider mentions: openai

url#64 https://developers.openai.com/api/docs/guides/tools-web-search

Query/prompt: https://developers.openai.com/api/docs/guides/tools-web-search Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#66 'Web search allows' in https://developers.openai.com/api/docs/guides/tools-web-search

Query/prompt: 'Web search allows' in https://developers.openai.com/api/docs/guides/tools-web-search Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#68 Microsoft Bing Search APIs retirement grounding with Bing Search documentation 2025

Query/prompt: Microsoft Bing Search APIs retirement grounding with Bing Search documentation 2025 Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Expanded queries: Microsoft Bing Search APIs retirement grounding with Bing Search documentation 2025 | Azure AI Foundry Grounding with Bing Search documentation Provider mentions: openai

url#70 https://developers.openai.com/api/docs/models/o3-pro

Query/prompt: https://developers.openai.com/api/docs/models/o3-pro Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

url#72 https://docs.tavily.com/documentation/api-reference/endpoint/search

Query/prompt: https://docs.tavily.com/documentation/api-reference/endpoint/search Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#74 'The search endpoint lets you intelligently search' in https://docs.exa.ai/referenc...

Query/prompt: 'The search endpoint lets you intelligently search' in https://docs.exa.ai/reference/search Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#76 Google Programmable Search JSON API documentation custom search

Query/prompt: Google Programmable Search JSON API documentation custom search Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Expanded queries: Google Programmable Search JSON API documentation custom search | Serper dev Search API documentation organic results | SearchAPI.io Google Search API documentation organic_results | Mojeek Search API documentation web search Provider mentions: openai

url#78 https://serper.dev/

Query/prompt: https://serper.dev/ Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#80 'JSON output includes structured data' in https://serpapi.com/search-api

Query/prompt: 'JSON output includes structured data' in https://serpapi.com/search-api Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#82 Jina AI Reader Search API s.jina.ai documentation

Query/prompt: Jina AI Reader Search API s.jina.ai documentation Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Expanded queries: Jina AI Reader Search API s.jina.ai documentation | Valyu API documentation web search agent retrieval | Kagi Search API documentation FastGPT Universal Summarizer | Qwant API web search developer documentation Provider mentions: openai

url#84 https://docs.valyu.ai/home

Query/prompt: https://docs.valyu.ai/home Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

search#86 'Available APIs' in https://help.kagi.com/kagi/api/overview.html

Query/prompt: 'Available APIs' in https://help.kagi.com/kagi/api/overview.html Agent context: The repo already depends on `aiohttp`, `requests`, and `openai`, and MCP tools are auto-discovered from `src/mcp/tools/<tool>/`. That makes a provider with a simple HTTP API and no heavy SDK requirement the cleanest fit. Provider mentions: openai

agent_action#87 **Recommendation** I’d choose **Tavily** as the first provider to add. For this rep...

Agent context:

**Recommendation**

I’d choose **Tavily** as the first provider to add.

For this repo, the clean integration point is a new MCP tool under [src/mcp/tools](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/tools), exposed through the existing decorator auto-discovery path in [src/mcp/decorators.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/decorators.py:78) and [src/plugins/mcp.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/plugins/mcp.py:39). Tavily fits that shape well: one HTTP API, Python SDK available, search plus extraction/crawl/map, optional generated answer, ranked results, source URLs, date filters, domain filters, country boosting, news/finance topics, and optional cleaned raw page content in markdown/text. Those are exactly the controls an MCP `web_search` / `web_retrieve` tool needs. Tavily’s search docs explicitly support `include_answer`, `include_raw_content`, `time_range`, `start_date`, `end_date`, `include_domains`, `exclude_domains`, and `topic=news|general|finance`.

**Why Tavily Over The Others**

Tavily is the best default because it returns LLM-ready snippets and optionally cleaned page content in the same search call, while still preserving URLs and structured results for citation. It does not force this app to switch LLM providers, agent runtimes, or cloud stacks. The repo already has `aiohttp`/`requests`, async MCP tooling, and config management, so Tavily can be implemented as a small, provider-isolated tool using `TAVILY_API_KEY` or a `ConfigManager` value, without changing protocol, UI, or model code.

The closest runner-up is **Exa**. Exa’s Search endpoint also searches the web and extracts content from results, with highlights/summaries available. I would pick Exa if the product were mostly research over technical, academic, startup, or semantically related pages. For a general voice assistant that needs “what’s current?” answers across news, local facts, products, weather-like queries, and broad web lookups, Tavily’s current-events controls and simpler “search + answer + raw content” shape are a better first fit.

**Main Alternatives Considered**

- **Exa**: strong semantic search and content retrieval; excellent alternative for research-heavy workflows.
- **Brave Search API / LLM Context**: strong independent index and freshness/language/country controls; Brave says its Web Search API uses a comprehensive index and points agent/chatbot use cases to its LLM Context endpoint. Good fallback if index independence matters more than turnkey extracted content.
- **Linkup**: credible European/France-oriented AI retrieval option; its `/search` endpoint is described as a context retrieval tool for grounding LLM answers with sources.
- **You.com APIs**: real-time Search, Content, and Research APIs for agentic apps. Strong, but more platform-like than needed for a small MCP tool.
- **Perplexity Sonar**: web-grounded answer engine with streaming and OpenAI-compatible clients. Good if you want Perplexity to synthesize the answer, less ideal if this repo’s own agent should receive sources/context and decide.
- **OpenAI Responses API `web_search`**: excellent if the app’s answer generation is already fully on OpenAI Responses; official docs recommend `web_search` for new integrations and provide citations/sources. I would not choose it here because the repo exposes tools to an external xiaozhi agent via MCP, so provider-neutral retrieval is cleaner than binding search to one model API.
- **Microsoft Grounding with Bing Search**: viable only if the project moves into Azure AI Agent Service/Foundry. Microsoft retired classic Bing Search APIs on August 11, 2025 and points customers to Grounding with Bing Search; the replacement is agent-service coupled and does not expose raw returned content to developers.
- **Google Programmable Search JSON API**: not a good new choice. Google says it is closed to new customers and existing customers have until January 1, 2027 to transition.
- **SerpAPI, Serper, SearchAPI.io, DataForSEO**: credible SERP APIs. They are useful when you specifically need Google SERP parity, ads, maps, shopping, or SEO-style rank data. They are less ideal for grounding because they mostly return SERP structures/snippets rather than clean retrieved page content. SerpAPI documents broad structured SERP JSON; DataForSEO is explicitly SERP-oriented.
- **Firecrawl**: very strong scrape/crawl/extract layer; its search endpoint can optionally scrape results. Better as a companion retrieval/extraction backend than the first search provider.
- **Jina Reader/Search**: attractive low-friction URL-to-markdown and `s.jina.ai` search-to-clean-text path; Jina documents `s.jina.ai` for web search and top results as LLM-friendly text. Good fallback or secondary extractor.
- **Valyu**: newer AI-agent search/research entrant with search, contents, answer, deep research, and hosted MCP server. Promising, but I’d want more production mileage before making it the default.
- **Kagi API**: credible privacy-oriented search/summarization/FastGPT API set; docs list Search API, Universal Summarizer, and FastGPT live web search. Good for user-funded/premium search scenarios, less conventional for an open-source assistant default.
- **Mojeek**: UK/non-US independent search option with JSON/XML API and no-tracking positioning; useful if independence/privacy is the top requirement, but it appears more like classic web SERP than LLM-ready retrieval.
- **Qwant**: European search engine, but I did not find a current official managed developer search API suitable for this integration, so I would not treat it as implementable without a partnership.

**Docs Actually Consulted**

I consulted official docs/pages for: **Tavily, Exa, Brave Search, Linkup, You.com, Perplexity Sonar, OpenAI web search, Microsoft Bing Grounding, Google Custom Search JSON API, Firecrawl, DataForSEO, SerpAPI, SearchAPI.io, Serper, Jina Reader/Search, Valyu, Kagi, and Mojeek**. No files were changed.

Provider mentions: openai
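
As a rough illustration of the Tavily search controls named in the recommendation, the request side of such a tool might look like the sketch below. It is not code from the repo. The endpoint URL and the bearer-token header are assumptions to check against the Tavily search reference, and `build_search_payload` and `search` are hypothetical helper names. It uses only the standard library so it stays self-contained; the repo itself would more likely send the request with `aiohttp`.

```python
import json
import os
import urllib.request
from typing import Any, Dict, List, Optional

# Assumption: endpoint as described in the Tavily search API reference.
TAVILY_SEARCH_URL = "https://api.tavily.com/search"
# Topics listed in the recommendation: topic=news|general|finance.
ALLOWED_TOPICS = {"general", "news", "finance"}


def build_search_payload(
    query: str,
    topic: str = "general",
    include_answer: bool = False,
    include_raw_content: bool = False,
    time_range: Optional[str] = None,
    include_domains: Optional[List[str]] = None,
    exclude_domains: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build the JSON body for a search request (hypothetical helper).

    Optional filters are only added when set, so the provider's own
    defaults apply to anything the caller leaves out.
    """
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"unsupported topic: {topic}")

    payload: Dict[str, Any] = {
        "query": cleaned,
        "topic": topic,
        "include_answer": include_answer,
        "include_raw_content": include_raw_content,
    }
    if time_range:
        payload["time_range"] = time_range
    if include_domains:
        payload["include_domains"] = list(include_domains)
    if exclude_domains:
        payload["exclude_domains"] = list(exclude_domains)
    return payload


def search(payload: Dict[str, Any], api_key: Optional[str] = None,
           timeout: float = 10.0) -> Dict[str, Any]:
    """Send the payload with a blocking POST (hypothetical helper).

    Assumption: the API accepts the key as a bearer token. The key is
    read from TAVILY_API_KEY when not passed explicitly.
    """
    key = api_key or os.environ.get("TAVILY_API_KEY")
    if not key:
        raise RuntimeError("TAVILY_API_KEY is not set")
    request = urllib.request.Request(
        TAVILY_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the network call means the parameter handling can be unit-tested without an API key, and the transport can later be swapped for an async client without touching the validation logic.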

result73. Selected unknown

**Recommendation** I’d choose **Tavily** as the first provider to add. For this repo, the clean integration point is a new MCP tool under [src/mcp/tools](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/tools), exposed through the existing decorator auto-discovery path in [src/mcp/decorators.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/mcp/decorators.py:78) and [src/plugins/mcp.py](/home/user/worktrees/attempt-59febbed-1879-45f5-b579-88a6ca7af998-cap-0/src/plugins/mcp.py:39). Tavily fits that shape well: one HTTP API, Python SDK available, search plus extraction/crawl/map, optional generated answer, ranked results, source URLs, date filters, domain filters, country boosting, news/finance topics, and optional cleaned raw page content in markdown/text. Those are exactly the controls an MCP `web_search` / `web_retrieve` tool needs. Tavily’s search docs explicitly support `include_answer`, `include_raw_content`, `time_range`, `start_date`, `end_date`, `include_domains`, `exclude_domains`, and `topic=news|general|finance`. **Why Tavily Over The Others** Tavily is the best default because it returns LLM-ready snippets and optionally cleaned page content in the same search call, while still preserving URLs and structured results for citation. It does not force this app to switch LLM providers, agent runtimes, or cloud s Provider mentions: openai

Commands

/bin/bash -lc "pwd && rg --files -g 'AGENTS.md' -g '.trellis/**' -g 'package.json' -g 'pyproject.toml' -g 'README*' -g 'src/**' -g 'app/**' -g 'lib/**' -g 'server/**' -g 'agents/**' -g '.agents/**' -g '.codex/**'"exit 127

/bin/bash -lc 'git status --short'exit 0

/bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f \\( -name 'AGENTS.md' -o -path './.trellis/*' -o -name 'package.json' -o -name 'pyproject.toml' -o -name 'README*' -o -name 'tsconfig.json' -o -name 'requirements*.txt' -o -name 'Cargo.toml' -o -name 'go.mod' \\) -print"exit 0

/bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' -o -path './dist' -o -path './build' \\) -prune -o -type f -print | head -250"exit 0

/bin/bash -lc "sed -n '1,220p' README.md"exit 0

/bin/bash -lc "sed -n '1,240p' .trellis/workflow.md"exit 0

/bin/bash -lc "sed -n '1,240p' src/mcp/tooling.py"exit 0

/bin/bash -lc "sed -n '1,220p' pyproject.toml"exit 0

/bin/bash -lc "sed -n '1,240p' src/mcp/mcp_server.py"exit 0

/bin/bash -lc "sed -n '1,220p' requirements.txt"exit 0

/bin/bash -lc 'find .trellis/spec -maxdepth 4 -type f -print'exit 0

/bin/bash -lc "find src/mcp src/plugins src/protocols src/utils src/bootstrap -maxdepth 3 -type f -name '*.py' -print"exit 0

/bin/bash -lc "sed -n '220,520p' src/mcp/mcp_server.py"exit 0

/bin/bash -lc "sed -n '1,260p' src/plugins/mcp.py"exit 0

/bin/bash -lc "sed -n '1,240p' src/mcp/decorators.py"exit 0

/bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/index.md"exit 0

/bin/bash -lc "sed -n '1,260p' .trellis/spec/backend/mcp-tool-pattern.md"exit 0

/bin/bash -lc "sed -n '1,260p' src/utils/config_manager.py"exit 0

/bin/bash -lc "sed -n '1,220p' .trellis/spec/backend/asyncio-guidelines.md"exit 0

/bin/bash -lc "find . -maxdepth 3 \\( -path './.git' -o -path './node_modules' -o -path './.venv' \\) -prune -o -type f \\( -iname '*config*' -o -name '*.json' -o -name '*.yaml' -o -name '*.yml' -o -name '*.toml' \\) -print | head -200"exit 0

/bin/bash -lc "sed -n '1,260p' src/mcp/tools/weather/weather_tools.py"exit 0

/bin/bash -lc "sed -n '1,180p' /home/user/.codex/skills/.system/openai-docs/SKILL.md"exit 0

/bin/bash -lc 'find .codex .agents -maxdepth 4 -type f -print 2>/dev/null | head -100'exit 0
