MatrixAges/polywise
Add the ability for this repository's agent to fetch and extract clean, readable content from the web pages/URLs it encounters, so it can use that content in its workflow. Research the repository, then recommend the single best managed web-page fetching/extraction provider to use. Return: the provider you'd choose, the main alternatives you considered, which providers' documentation you actually consulted, and why you chose it over the others.
P0
1eab79217258b1314fe9302fb16113154535049b
Agent context: I’ll first load the repository’s agent rules, then inspect where “agent” capabilities are defined so the provider recommendation is grounded in this codebase rather than generic tool preference.
Agent context: I’ll first load the repository’s agent rules, then inspect where “agent” capabilities are defined so the provider recommendation is grounded in this codebase rather than generic tool preference. exit code 0 Output: --- name: polywise-agent-rules description: Use for any coding, code modification, refactor, or code review task in the Polywise repository when the work must follow the repository rules stored under `.agent/rules` and nearby `unify.md` or `agentmap.md` files. --- # Polywise Agent Rules 1. Read `.agent/rules/global.md` before doing work in this repository. 2. For coding, refactor, code-generation, and review tasks, also read `.agent/rules/coding.md`. 3. Before creating files, refactoring modules, or writing core logic, read the target package's outline-level `unify.md` and follow the matched route's sample paths. 4. If a package-level `agentmap.md` exists and the task changes that package's structure or responsibility boundaries, update that outline-level `agentmap.md` before finishing. 5. Treat `.agent/rules/` as the source of truth. Use this skill as the Codex-native bridge rather than
Agent context: I’ll first load the repository’s agent rules, then inspect where “agent” capabilities are defined so the provider recommendation is grounded in this codebase rather than generic tool preference. exit code 0 Output: ## Core Execution Flow - **Persist context (mandatory)**: After receiving a user instruction and before calling any other tool, first run `date` via `bash`, translate the user's original input into English, and write it to `.prompts/YYYY-MM-DD/HH-mm-ss.md`. - **Language consistency**: Always reply in exactly the same language as the user's input. - **Path safety**: Before using file tools, if the path is not an explicit absolute path, first confirm the real location with `glob` or `ls`; never guess paths from memory. - **Complete replacement**: When using editing tools, replacement snippets must include enough context and contain the full business logic; never use `...` to omit code. - **Scope restriction**: Limit changes strictly to the files or modules explicitly mentioned by the user; if unmentioned modules are involved, confirm with the user first. - **Proposal first**: Before any wr
Agent context: I’ll first load the repository’s agent rules, then inspect where “agent” capabilities are defined so the provider recommendation is grounded in this codebase rather than generic tool preference. exit code 127 Output: /bin/bash: line 1: rg: command not found
Agent context: Using the Polywise agent rules now. One repository-specific rule requires persisting the prompt before deeper inspection; I’m correcting that bookkeeping step before continuing the read-only research.
Agent context: Using the Polywise agent rules now. One repository-specific rule requires persisting the prompt before deeper inspection; I’m correcting that bookkeeping step before continuing the read-only research. exit code 0 Output: 2026-06-06 05:05:04 UTC
Agent context: I’m creating the required prompt record under `.prompts/2026-06-06/` with the user’s request translated to English, then I’ll keep the rest of this task read-only unless a write is explicitly needed.
Agent context: I’m creating the required prompt record under `.prompts/2026-06-06/` with the user’s request translated to English, then I’ll keep the rest of this task read-only unless a write is explicitly needed. exit code 0 Output: /home/user/worktrees/attempt-923e5403-eae5-4ff2-b172-90df8b337cbe-cap-0
Agent context: I’m creating the required prompt record under `.prompts/2026-06-06/` with the user’s request translated to English, then I’ll keep the rest of this task read-only unless a write is explicitly needed. exit code 0
Agent context: The prompt record is in place. I’m now locating the package maps and likely agent entry points, using `find` because `rg` is not installed in this environment.
Agent context: The prompt record is in place. I’m now locating the package maps and likely agent entry points, using `find` because `rg` is not installed in this environment. exit code 0 Output: ./packages/app/agentmap.md ./packages/desktop/agentmap.md ./packages/erpc/agentmap.md ./packages/polywise/agentmap.md ./packages/stk/agentmap.md
Agent context: The prompt record is in place. I’m now locating the package maps and likely agent entry points, using `find` because `rg` is not installed in this environment. exit code 0 Output: ./.agent/rules/unify.md ./.github/unify.md ./AGENTS.md ./package.json ./packages/app/package.json ./packages/app/unify.md ./packages/desktop/package.json ./packages/desktop/unify.md ./packages/erpc/package.json ./packages/erpc/unify.md ./packages/polywise/package.json ./packages/polywise/unify.md ./packages/stk/package.json ./packages/stk/unify.md ./packages/website/package.json ./packages/website/unify.md ./pnpm-workspace.yaml ./unify.md
Agent context: The prompt record is in place. I’m now locating the package maps and likely agent entry points, using `find` because `rg` is not installed in this environment. exit code 0 Output: ./.draft/2026-05-31-global-panel-mention-tools-plan.md ./.draft/2026-05-31-polywise-tool-plan.md ./.draft/2026-06-02-bash-tool-high-availability.md ./.draft/agent-private-knowledge-import-export-plan.md ./AGENTS.md ./packages/app/agentmap.md ./packages/app/components/Session/components/SubAgent.tsx ./packages/app/components/Tooltip.tsx ./packages/app/pages/agent/components/AgentAvatar.tsx ./packages/app/pages/agent/components/AgentModel.tsx ./packages/app/pages/agent/components/AgentsMenu.tsx ./packages/app/pages/agent/components/ToolsPanel.tsx ./packages/app/pages/home/components/Agent.tsx ./packages/app/pages/linkcase/components/AgentDialog.tsx ./packages/app/pages/linkcase/components/MenuToolbar.tsx ./packages/app/utils/getToolDesc.ts ./packages/app/utils/getToolIcon.tsx ./packages/app/utils/getToolName.ts ./packages/desktop/agentmap.md ./packages/erpc/agentmap.md ./packages/polywise/
Agent context: The agent runtime appears to live primarily under `packages/polywise`. I’m reading that package map before touching source files, then I’ll inspect existing fetch, MCP, and tool patterns.
Agent context: The agent runtime appears to live primarily under `packages/polywise`. I’m reading that package map before touching source files, then I’ll inspect existing fetch, MCP, and tool patterns. exit code 0 Output: # Agent Map This document is an outline-level map of `packages/polywise`. It captures the package's long-lived runtime domains instead of enumerating every leaf file. ## 1. Module Overview - **Description**: Polywise backend and AI runtime package. - **Architecture**: TypeScript service runtime around search/memory pipelines, session orchestration, and SQLite-backed state. ## 2. Outline Tree ```json { "entry_and_bootstrap": { "package_files": [ "package.json", "rslib.config.ts", "drizzle.config.ts", "rstest.config.ts", "tsconfig.json", "tsconfig.build.json", "inspect.sh" ], "src/index.ts": "Primary package runtime entry.", "src/server.ts": "Standalone server entry.", "src/cli": "CLI surface for local runtime and tooling flows.", "src/env.ts": "Environment normalization.", "src/auth.ts": "Authentication bootstrap." }, "service_interfaces": { "src/api": "HTTP-facing endpoints and session/p
Agent context: The agent runtime appears to live primarily under `packages/polywise`. I’m reading that package map before touching source files, then I’ll inspect existing fetch, MCP, and tool patterns. exit code 0 Output: { "name": "polywise", "version": "0.0.5", "license": "MIT", "homepage": "https://polywise.io", "repository": { "type": "git", "url": "https://github.com/MatrixAges/polywise.git" }, "keywords": [ "agent", "decision", "workflow", "database", "ai", "memory", "self-hosted", "assistant", "twin", "rag", "decision-system", "graph-rag", "polywise", "llm-wiki" ], "type": "module", "main": "./dist/index.js", "sideEffects": false, "exports": { ".": { "types": "./dist/src/index.d.ts", "import": "./dist/index.js", "default": "./dist/index.js" } }, "bin": { "polywise": "./dist/cli.js" }, "files": [ "dist", "./package.json", "./README.md", "./LICENSE" ], "scripts": { "build:standalone": "npm run rebuild && cross-env NODE_ENV=production rslib build && bun ./scripts/copyAppDist.ts", "build:electron": "cross-env NODE_ENV=production rslib build", "build:watch": "cross-env NODE_ENV=development rslib build - Provider mentions: openai
Agent context: The agent runtime appears to live primarily under `packages/polywise`. I’m reading that package map before touching source files, then I’ll inspect existing fetch, MCP, and tool patterns. exit code 0 Output: { "name": "monorepo", "author": "1yasa", "packageManager": "pnpm@11.5.1", "scripts": { "deps:w": "ncu -u -i --workspaces", "build:deps": "turbo run deps", "build:workflows": "bun ./scripts/build_workflows.mjs", "build:mac": "turbo run desktop#pack:mac", "build:win": "turbo run desktop#pack:win", "pcu": "pcu update -i --no-ai --no-backup", "release": "VERSION=0.0.3 bun run ./scripts/release.ts" }, "simple-git-hooks": { "pre-commit": "pnpm pretty-quick --pattern \"**/*.*(md|css|js|jsx|ts|tsx)\" --staged" }, "devDependencies": { "@ianvs/prettier-plugin-sort-imports": "^4.7.1", "@jlarky/gha-ts": "^0.2.2", "@types/bun": "^1.3.14", "@types/node": "catalog:", "bun": "catalog:", "bun-types": "catalog:", "npm-check-updates": "^22.2.1", "pcu": "^1.3.0", "prettier": "^3.8.3", "prettier-plugin-tailwindcss-pro": "^0.2.0", "pretty-quick": "^4.2.2", "simple-git-hooks": "^2.13.1", "turbo": "^2.9.14" } }
Agent context: The package already depends on `ofetch`, `turndown`, and `jina-ai-provider`, which is a useful signal: the project is TypeScript-first, has existing HTTP conventions, and already uses Jina-adjacent tooling. I’m checking how URLs and tools are currently handled before comparing providers.
Agent context: The package already depends on `ofetch`, `turndown`, and `jina-ai-provider`, which is a useful signal: the project is TypeScript-first, has existing HTTP conventions, and already uses Jina-adjacent tooling. I’m checking how URLs and tools are currently handled before comparing providers. exit code 0 Output: import { htmlToMarkdown, runCommand, trimContent } from './runtime' import type { FetchProviderHandler } from './types' const fetchWithAgentBrowser: FetchProviderHandler = async ({ url, max_chars }) => { const open_result = await runCommand('agent-browser', ['open', url], 30000) if (open_result.exitCode !== 0) { throw new Error(open_result.stderr || open_result.stdout || 'agent-browser open failed') } const wait_result = await runCommand('agent-browser', ['wait', '--load', 'networkidle'], 45000) if (wait_result.exitCode !== 0) { throw new Error(wait_result.stderr || wait_result.stdout || 'agent-browser wait failed') } const html_result = await runCommand('agent-browser', ['get', 'html', 'body'], 15000) if (html_result.exitCode !== 0) { throw new Error(html_result.stderr || html_result.stdout || 'agent-browser get html failed') } if (!html_result.stdout.trim()) { throw new Error('agent-br
Agent context: The package already depends on `ofetch`, `turndown`, and `jina-ai-provider`, which is a useful signal: the project is TypeScript-first, has existing HTTP conventions, and already uses Jina-adjacent tooling. I’m checking how URLs and tools are currently handled before comparing providers. exit code 0 Output: packages/polywise/src/fst/agents/skill_creator/agent.ts:12: description: string().describe('Progressive-disclosure description or empty string when skipped'), packages/polywise/src/fst/agents/skill_creator/createDraft.ts:58: description: output?.description || '', packages/polywise/src/fst/agents/skill_creator/defaultSkill.ts:3:description: Use this meta-skill when a conversation reveals a reusable workflow, repeated failure pattern, or missing recovery path that should be converted into a reusable local skill. packages/polywise/src/fst/agents/skill_creator/defaultSkill.ts:18:5. Use a persuasive short description so the main agent can discover the skill under progressive disclosure. packages/polywise/src/fst/agents/skill_creator/defaultSkill.ts:27:2. Confirm the description clearly signals when the skill should be read. packages/polywise/src/fst/agents/superego/agent.ts:5:import { create
Agent context: The package already depends on `ofetch`, `turndown`, and `jina-ai-provider`, which is a useful signal: the project is TypeScript-first, has existing HTTP conventions, and already uses Jina-adjacent tooling. I’m checking how URLs and tools are currently handled before comparing providers. exit code 0 Output: packages/polywise/src/auth.ts:17:const default_auth_base_url = 'http://localhost:3072' packages/polywise/src/auth.ts:29: process.env.BETTER_AUTH_URL, packages/polywise/src/auth.ts:30: process.env.POLYWISE_AUTH_URL, packages/polywise/src/auth.ts:31: process.env.POLYWISE_SERVER_URL, packages/polywise/src/auth.ts:32: default_auth_base_url packages/polywise/src/auth.ts:41: return new URL(value).toString().replace(/\/$/, '') packages/polywise/src/auth.ts:45: return default_auth_base_url packages/polywise/src/auth.ts:71: baseURL: resolveAuthBaseUrl(), packages/polywise/src/cli/index.ts:4:import { fileURLToPath, pathToFileURL } from 'node:url' packages/polywise/src/cli/index.ts:17:const server_base_url = (process.env.POLYWISE_SERVER_URL || 'http://localhost:3072').replace(/\/$/, '') packages/polywise/src/cli/index.ts:18:const api_base_url = `${server_base_url}/api` packages/polywise/src/cli/ind
Agent context: This repository already has a `web_fetch_tool` and multiple fetch adapters, including `r.jina.ai`. I’m reading those adapters and configuration next so the recommendation can say whether to lean into the current path or replace it.
Agent context: This repository already has a `web_fetch_tool` and multiple fetch adapters, including `r.jina.ai`. I’m reading those adapters and configuration next so the recommendation can say whether to lean into the current path or replace it. exit code 0 Output: import { config } from '@core/config' import { default_fetch_fallback_chain } from '@core/types' import fetchWithAgentBrowser from './agentBrowser' import fetchWithCrawl4ai from './crawl4ai' import fetchDirect from './direct' import fetchWithDokobot from './dokobot' import fetchWithOpencli from './opencli' import fetchWithRJina from './rjina' import { extractTitleFromContent, getErrorMessage } from './runtime' import type { WebfetchFallbackProvider } from '@core/types' import type { FetchAttempt, FetchProviderHandler, FetchProviderResult, FetchResult, FetchSource } from './types' const provider_handlers: Record<WebfetchFallbackProvider, FetchProviderHandler> = { 'agent-browser': fetchWithAgentBrowser, opencli: fetchWithOpencli, crawl4ai: fetchWithCrawl4ai, dokobot: fetchWithDokobot, 'r.jina.ai': fetchWithRJina } const supported_provider_set = new Set<WebfetchFallbackProvider>(default_fet
Agent context: This repository already has a `web_fetch_tool` and multiple fetch adapters, including `r.jina.ai`. I’m reading those adapters and configuration next so the recommendation can say whether to lean into the current path or replace it. exit code 0 Output:
import { config } from '@core/config'
import { trimContent } from './runtime'
import type { FetchProviderHandler } from './types'

const fetchWithRJina: FetchProviderHandler = async ({ url, max_chars }) => {
  const jina_api_key = config.jina_api_key?.trim()
  const resp = await fetch(`https://r.jina.ai/${url}`, {
    signal: AbortSignal.timeout(30000),
    headers: {
      ...(jina_api_key ? { Authorization: `Bearer ${jina_api_key}` } : {})
    }
  })
  if (!resp.ok) {
    throw new Error(`Jina returned HTTP ${resp.status}`)
  }
  const markdown = await resp.text()
  if (!markdown.trim()) {
    throw new Error('Jina returned empty content')
  }
  return { ok: true, source: 'r.jina.ai', ...trimContent(markdown, max_chars) }
}

export default fetchWithRJina
Agent context: This repository already has a `web_fetch_tool` and multiple fetch adapters, including `r.jina.ai`. I’m reading those adapters and configuration next so the recommendation can say whether to lean into the current path or replace it. exit code 0 Output:
import { htmlToMarkdown, trimContent } from './runtime'
import type { FetchProviderHandler } from './types'

const fetchDirect: FetchProviderHandler = async ({ url, max_chars }) => {
  const resp = await fetch(url, {
    signal: AbortSignal.timeout(15000),
    headers: {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36',
      Accept: 'text/html,application/xhtml+xml,*/*'
    }
  })
  if (!resp.ok) {
    throw new Error(`HTTP ${resp.status}`)
  }
  const html = await resp.text()
  const markdown = htmlToMarkdown(html)
  if (!markdown.trim()) {
    throw new Error('Direct fetch returned empty content')
  }
  return { ok: true, source: 'direct', ...trimContent(markdown, max_chars) }
}

export default fetchDirect
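Both adapters above spread the return value of `trimContent(markdown, max_chars)` into their result, but the transcript never shows that helper's body. The following is a minimal sketch of what such a helper could look like, assuming it returns a `content` string plus a `truncated` flag. The name `trimContentSketch` and the `TrimmedContent` interface are hypothetical, not the repository's actual implementation.

```typescript
// Hypothetical stand-in for the `trimContent` helper imported from './runtime'.
// Assumption: it yields `content` and `truncated`, as the spread in the adapters suggests.
interface TrimmedContent {
  content: string
  truncated: boolean
}

function trimContentSketch(text: string, max_chars: number): TrimmedContent {
  // Strip surrounding whitespace so the length check reflects real content.
  const normalized = text.trim()

  if (normalized.length <= max_chars) {
    return { content: normalized, truncated: false }
  }

  // Cut by UTF-16 code units; a real helper might prefer a paragraph boundary.
  return { content: normalized.slice(0, max_chars), truncated: true }
}

// Usage with the 50000-character default seen in the web fetch tool.
const short_result = trimContentSketch('# Title\n\nBody', 50000)
const long_result = trimContentSketch('x'.repeat(60000), 50000)
```

Returning the flag alongside the text lets the calling tool tell the model that a page was cut off instead of silently dropping the tail.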
Agent context: This repository already has a `web_fetch_tool` and multiple fetch adapters, including `r.jina.ai`. I’m reading those adapters and configuration next so the recommendation can say whether to lean into the current path or replace it. exit code 0 Output:
import { config } from '@core/config'
import { fetchWithFallbackChain, fetchWithLegacyFallback } from '@core/fetch'
import { tool } from 'ai'
import { number, object, url } from 'zod'

const MAX_CHARS = 50000

const inputSchema = object({
  url: url().describe('The URL to fetch content from'),
  max_chars: number().optional().describe('Max characters to return (default 50000)')
})

export const createWebFetchTool = () => {
  return tool({
    description: 'Fetch content from a URL and return it as clean Markdown. Use for reading web pages, documentation, or any online content.',
    inputSchema,
    execute: async input => {
      const max_chars = input.max_chars ?? MAX_CHARS
      return config.enbale_webfetch_chain
        ? await fetchWithFallbackChain(input.url, max_chars)
        : await fetchWithLegacyFallback(input.url, max_chars)
    }
  })
}
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction.
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: export interface Workspace { name: string endpoint?: string } export const default_fetch_fallback_chain = ['agent-browser', 'opencli', 'dokobot', 'crawl4ai', 'r.jina.ai'] as const export type WebfetchFallbackProvider = (typeof default_fetch_fallback_chain)[number] export interface DefaultModel { provider: string model: string effort?: string } export interface McpOAuthConfig { clientId?: string clientSecret?: string scope?: string redirectUri?: string } export interface McpLocalConfig { type: 'local' command: Array<string> environment?: Record<string, string> enabled?: boolean timeout?: number } export interface McpRemoteConfig { type: 'remote' url: string enabled?: boolean headers?: Record<string, string> oauth?: McpOAuthConfig | false timeout?: number } export interface McpEnabledConfig { enabled: boolean } export interface McpConfig { enabled?: boolean [name: string]: McpLocalConfig |
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: packages/polywise/src/config/loadConfig.ts:4:import { default_fetch_fallback_chain } from '@core/types' packages/polywise/src/config/loadConfig.ts:15:const fetch_fallback_provider_set = new Set<string>(default_fetch_fallback_chain) packages/polywise/src/config/loadConfig.ts:107: if (config.jina_api_key === undefined) { packages/polywise/src/config/loadConfig.ts:108: config.jina_api_key = '' packages/polywise/src/config/loadConfig.ts:167: if (config.enbale_webfetch_chain === undefined) { packages/polywise/src/config/loadConfig.ts:168: config.enbale_webfetch_chain = false packages/polywise/src/config/loadConfig.ts:208: if (!Array.isArray(config.fetch_fallback_chain) || !config.fetch_fallback_chain.length) { packages/polywise/src/config/loadConfig.ts:209: config.fetch_fallback_chain = [...default_fetch_fallback_chain] packages/polywise/src/config/loadConfig.ts:212: const current_chain = con
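The grep output above shows `loadConfig.ts` building a provider set from `default_fetch_fallback_chain` and resetting `fetch_fallback_chain` to the default when it is missing or empty; the rest of that logic is cut off. As a rough illustration only, a normalization step of this kind might look like the sketch below. The helper name `normalizeChain` and its filtering and deduplication behavior are assumptions, not code taken from the repository.

```typescript
// Provider names copied from `default_fetch_fallback_chain` in the transcript.
const default_fetch_fallback_chain = ['agent-browser', 'opencli', 'dokobot', 'crawl4ai', 'r.jina.ai'] as const

type WebfetchFallbackProvider = (typeof default_fetch_fallback_chain)[number]

const fetch_fallback_provider_set = new Set<string>(default_fetch_fallback_chain)

// Hypothetical helper: keep known providers in the user's order, drop unknown
// names and duplicates, and fall back to the default chain if nothing is left.
function normalizeChain(input: unknown): WebfetchFallbackProvider[] {
  if (!Array.isArray(input)) return [...default_fetch_fallback_chain]

  const seen = new Set<string>()
  const result: WebfetchFallbackProvider[] = []

  for (const item of input) {
    if (typeof item !== 'string') continue
    if (!fetch_fallback_provider_set.has(item) || seen.has(item)) continue

    seen.add(item)
    result.push(item as WebfetchFallbackProvider)
  }

  return result.length ? result : [...default_fetch_fallback_chain]
}

// A user config that promotes the managed provider and contains a typo.
const cleaned_chain = normalizeChain(['r.jina.ai', 'bogus', 'r.jina.ai', 'crawl4ai'])
const fallback_chain = normalizeChain(undefined)
```

Preserving the user's order matters here because the chain is tried front to back, so moving `r.jina.ai` first would make the managed provider the primary path.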
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output:
import type { WebfetchFallbackProvider } from '@core/types'

export type FetchSource = WebfetchFallbackProvider | 'direct'

export interface FetchAttempt {
  source: FetchSource
  error: string
}

export interface FetchProviderSuccess {
  ok: true
  source: FetchSource
  content: string
  truncated: boolean
  title?: string
}

export interface FetchSuccess extends FetchProviderSuccess {
  attempts: Array<FetchAttempt>
}

export interface FetchFailure {
  ok: false
  source: FetchSource
  error: string
  attempts: Array<FetchAttempt>
}

export type FetchProviderResult = FetchProviderSuccess

export type FetchResult = FetchSuccess | FetchFailure

export interface FetchProviderArgs {
  url: string
  max_chars: number
}

export type FetchProviderHandler = (args: FetchProviderArgs) => Promise<FetchProviderResult>
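These types form a discriminated union on `ok`, with `attempts` recording every provider that failed along the way. The sketch below shows one way a caller might narrow a `FetchResult` and report the attempt trail. The types are restated locally so the example is self-contained, and `summarizeFetch` is a hypothetical helper that does not appear in the repository.

```typescript
// Local restatement of the transcript's types so the sketch runs on its own.
type FetchSource = 'agent-browser' | 'opencli' | 'dokobot' | 'crawl4ai' | 'r.jina.ai' | 'direct'

interface FetchAttempt {
  source: FetchSource
  error: string
}

interface FetchSuccess {
  ok: true
  source: FetchSource
  content: string
  truncated: boolean
  title?: string
  attempts: Array<FetchAttempt>
}

interface FetchFailure {
  ok: false
  source: FetchSource
  error: string
  attempts: Array<FetchAttempt>
}

type FetchResult = FetchSuccess | FetchFailure

// Hypothetical helper: build a one-line summary of a fetch outcome.
function summarizeFetch(result: FetchResult): string {
  // `attempts` exists on both variants, so it is readable before narrowing.
  const trail = result.attempts.map(attempt => `${attempt.source}: ${attempt.error}`).join('; ')
  const suffix = trail ? ` after [${trail}]` : ''

  if (result.ok) {
    // Narrowed to FetchSuccess: `content` and `truncated` are available.
    const cut = result.truncated ? ' (truncated)' : ''
    return `ok via ${result.source}, ${result.content.length} chars${cut}${suffix}`
  }

  // Narrowed to FetchFailure: `error` is available.
  return `failed at ${result.source}: ${result.error}${suffix}`
}

const success_example: FetchResult = {
  ok: true,
  source: 'r.jina.ai',
  content: 'hello',
  truncated: false,
  attempts: [{ source: 'agent-browser', error: 'not installed' }]
}

const failure_example: FetchResult = {
  ok: false,
  source: 'direct',
  error: 'HTTP 403',
  attempts: []
}
```

Keeping the attempt trail on the success path as well makes it visible when the managed provider only ran because the local tools ahead of it in the chain were unavailable.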
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: # Your Role You are a smart assistant working in an "infinite conversation" scenario. Your goal is to build a long-term, coherent dialogue with the user and provide accurate, continuous help. ## Your Memory Limits The "short-term memory" in front of you can only see the most recent 12 messages. If you realize these messages lack necessary information, **do not guess or make things up**. Use the tools below to find missing information. ## Available Tools ### Composer Mentions - `[TOOL: tool name]` means the user explicitly mentioned a runtime tool. - `[MCP: server name]` means the user explicitly selected a configured MCP server. When `mcp_tool` is available, prefer scoping MCP discovery and execution to that server. - `[SKILL: skill name]` means the user explicitly mentioned a local skill. - `[FILE: path]` means the user explicitly mentioned a file or directory path from the session work
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: import path from 'path' import { app } from '@core/consts' import { preset_providers } from '@core/consts/providers' import defaultSkillCreator from '@core/fst/agents/skill_creator/defaultSkill' import { default_fetch_fallback_chain } from '@core/types' import { ensureWithValue } from '@core/utils' import fs from 'fs-extra' import { cron_path, pipeline_path, pthink_path, rewire_dir, temp_dir } from '../consts/app' import type { AppConfig, ProviderConfig } from '@core/types' const configs = ['config', 'providers'] export default async () => { for await (const name of configs) { const config_path = path.resolve(`${app.app_path}/${name}.json`) if (name === 'config') { const preset = preset_providers[0] const default_model = { provider: preset.name, model: preset.models[0].id } await ensureWithValue(config_path, { workspaces: [{ name: 'Default' }], current_workspace: 'Default', default_model
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: import { config } from '@core/config' import { tool } from 'ai' import TurndownService from 'turndown' import { number, object, string } from 'zod' const turndown = new TurndownService({ headingStyle: 'atx', hr: '---', bulletListMarker: '-', codeBlockStyle: 'fenced', emDelimiter: '*' }) turndown.remove(['script', 'style', 'meta', 'link', 'noscript']) const MAX_CHARS = 30000 const inputSchema = object({ query: string().describe('Keywords used to search the web for a list of relevant links'), max_chars: number().optional().describe('Max characters to return (default 30000)') }) export const createWebSearchTool = () => { return tool({ description: 'Search the web and return link list. This tool is for finding candidate URLs only, not for final factual answers. After reviewing the returned links and snippets, call web_fetch_tool on the most relevant target URL to read the webpage body.', inp
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: import { config } from '@core/config' import { global_linkcase_session_title } from '@core/consts' import getLinkcaseSystemPrompt from '@core/consts/prompts/getLinkcaseSystemPrompt' import { default_fetch_fallback_chain } from '@core/types' import dayjs from 'dayjs' import type Session from '../../../session' import type { PromptState } from '../../../session/core/types' export default (s: Session, state: PromptState) => { state.system = getLinkcaseSystemPrompt({ session_title: s.session.title || global_linkcase_session_title, provider_chain: Array.isArray(config.fetch_fallback_chain) && config.fetch_fallback_chain.length ? config.fetch_fallback_chain : [...default_fetch_fallback_chain], real_world_date: dayjs().format('YYYY-MM-DD') }) state.titleFocus = '' return state }
Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. exit code 0 Output: idle_grace_ms: 30 * 60 * 1000, replay_window_ms: 24 * 60 * 60 * 1000, max_groups_per_cycle: 20, max_edge_creations_per_cycle: 40, max_edge_prunes_per_cycle: 40, hot_node_degree_limit: 14, cold_node_degree_limit: 2, monitor_ms: 60000 } const next_rewire = { ...default_rewire, ...config.rewire } if (JSON.stringify(config.rewire) !== JSON.stringify(next_rewire)) { config.rewire = next_rewire has_changed_config = true } } if (config.enbale_webfetch_chain === undefined) { config.enbale_webfetch_chain = false has_changed_config = true } if (!config.pthink || typeof config.pthink !== 'object') { config.pthink = { ...default_pthink } has_changed_config = true } else { const next_pthink = { ...default_pthink, ...config.pthink } if (JSON.stringify(config.pthink) !== JSON.stringify(next_pthink)) { config.pthink = next_pthink has_changed_config = true } } if (!config.report || typeof config.report !
Query/prompt: Jina AI Reader API r.jina.ai documentation markdown URL Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. Expanded queries: Jina AI Reader API r.jina.ai documentation markdown URL | Firecrawl documentation scrape markdown API | Tavily Extract API documentation webpage content | Exa contents API documentation web page content extraction
Query/prompt: site:jina.ai/reader API r.jina.ai docs Reader API URL to markdown Agent context: The implementation already treats `r.jina.ai` as the managed first path, with local/browser-style fallbacks. I’m checking the config defaults and link/bookmark workflow because those affect whether a provider needs crawler breadth, JS rendering, or just page-to-Markdown extraction. Expanded queries: site:jina.ai/reader API r.jina.ai docs Reader API URL to markdown | site:jina.ai r.jina.ai Reader API documentation X-Return-Format | Jina AI Reader API docs r.jina.ai
Agent context: For external providers, I’m consulting official docs for the serious managed candidates: Jina Reader, Firecrawl, Tavily Extract, and Exa Contents. I’m excluding plain scraper proxy products unless they directly return LLM-ready readable content.
Agent context: For external providers, I’m consulting official docs for the serious managed candidates: Jina Reader, Firecrawl, Tavily Extract, and Exa Contents. I’m excluding plain scraper proxy products unless they directly return LLM-ready readable content. exit code 0 Output: # Polywise Polywise CLI for starting the local server and calling the backend API. Docs: https://polywise.io/docs/intro ## Install ```bash npm install -g polywise ``` ## Upgrade ```bash polywise upgrade ``` ## Usage Polywise CLI is designed for progressive discovery: start from the root help, narrow to a command group, inspect a concrete command, then run it. ```bash polywise start ``` Visit Web UI http://localhost:3072/app/ . ### Root help Use the root help to see the top-level command surface. ```bash polywise -h ``` This level tells you: - how to start the local server with `start` - how to inspect a command schema with `input_schema` - which API groups are available, such as `session`, `project`, and `search` - which next-level `-h` command to run ### Group help Use group help when you already know the domain you want to work in. ```bash polywise session -h ``` This level shows: - th
Agent context: For external providers, I’m consulting official docs for the serious managed candidates: Jina Reader, Firecrawl, Tavily Extract, and Exa Contents. I’m excluding plain scraper proxy products unless they directly return LLM-ready readable content. exit code 0 Output: # <p align="center"> <img src="images/logo.png" width="24" height="24" alt="Polywise Logo"> Polywise </p> <p align="center"><strong>The open source agentic content system</strong></p> <p align="center"> <a href="https://x.com/xiewendao"><img src="https://img.shields.io/badge/Follow-222?logo=X" alt="X"></a> <a href="https://discord.com/invite/6MDTdVzR3Y"><img alt="Discord" src="https://img.shields.io/badge/Discord-eee?logo=discord" /> </a> <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License"></a> <a href="https://www.npmjs.com/package/polywise"><img alt="npm" src="https://img.shields.io/npm/v/polywise" /></a> </p> <p align="center"> <a href="README.md">English</a> | <a href="readme/README.zh.md">简体中文</a> | <a href="readme/README.zht.md">繁體中文</a> | <a href="readme/README.ko.md">한국어</a> | <a href="readme/README.de.md">Deutsch</a> | <a href="readme/REA
Agent context: For external providers, I’m consulting official docs for the serious managed candidates: Jina Reader, Firecrawl, Tavily Extract, and Exa Contents. I’m excluding plain scraper proxy products unless they directly return LLM-ready readable content. exit code 0 Output: import type { PresetProvider } from '@core/types' export default { name: 'jina', enabled: true, apiKey: '', models: [ { enabled: true, id: 'jina-embeddings-v3', name: 'Jina Embeddings V3', type: 'embedding' }, { enabled: true, id: 'jina-reranker-v3', name: 'Jina Reranker V3', type: 'rerank' } ] } as PresetProvider
Agent context: For external providers, I’m consulting official docs for the serious managed candidates: Jina Reader, Firecrawl, Tavily Extract, and Exa Contents. I’m excluding plain scraper proxy products unless they directly return LLM-ready readable content. exit code 0 Output: packages/polywise/src/config/loadConfig.ts:122: if (config.bookmark_auto_clean === undefined) { packages/polywise/src/config/loadConfig.ts:123: config.bookmark_auto_clean = false packages/polywise/src/fetch/index.ts:58:export const fetchWithProvider = async ( packages/polywise/src/rpc/linkcase/getContentProviders.ts:24: path: '/linkcase/getContentProviders', packages/polywise/src/rpc/linkcase/index.ts:8:import getContentProviders from './getContentProviders' packages/polywise/src/rpc/linkcase/index.ts:26: getContentProviders, packages/polywise/src/rpc/linkcase/utils.ts:20:import { fetchWithFallbackChain, fetchWithProvider } from '@core/fetch' packages/polywise/src/rpc/linkcase/utils.ts:732: const result = await fetchWithProvider(args.provider, current_link.url, max_chars) packages/polywise/src/types/config.ts:102: bookmark_auto_clean?: boolean packages/polywise/src/utils/initDefaults.ts:
Agent context: Repository fit is getting clear: Polywise is an open-source local agentic content system, with Linkcase explicitly for fetching/extracting web content and sessions instructed to fetch pages before answering. I’m checking the Linkcase fetch path now to understand provider selection in the UI/API.
Agent context: Repository fit is getting clear: Polywise is an open-source local agentic content system, with Linkcase explicitly for fetching/extracting web content and sessions instructed to fetch pages before answering. I’m checking the Linkcase fetch path now to understand provider selection in the UI/API. exit code 0 Output: import { config } from '@core/config' import { default_fetch_fallback_chain } from '@core/types' import { p } from '@core/utils' import { z } from 'zod' import { getPolywiseCrawl4aiManagedProfile } from '../../utils/crawl4aiProfile' import { linkcase_content_providers } from './providers' import { getAgentBrowserChromeProfileCheck, getCrawl4aiProfileCheck, getDokobotBridgeCheck, getOpencliBrowserBridgeCheck, isToolInstalled } from './runtime' const input_type = z.object({ probe_runtime: z.boolean().default(false) }) export default p .meta({ openapi: { method: 'POST', path: '/linkcase/getContentProviders', description: 'List available linkcase content providers and optionally probe their runtime readiness.' } }) .input(input_type) .query(async ({ input }) => { const probe_runtime = input.probe_runtime const fallback_chain = Array.isArray(config.fetch_fallback_chain) && config.fetch_fallba
Agent context: Repository fit is getting clear: Polywise is an open-source local agentic content system, with Linkcase explicitly for fetching/extracting web content and sessions instructed to fetch pages before answering. I’m checking the Linkcase fetch path now to understand provider selection in the UI/API. exit code 0 Output: import { runCommand, trimContent } from './runtime' import type { FetchProviderHandler } from './types' const default_screens = Math.max(1, Number.parseInt(process.env.DOKOBOT_SCREENS || '5', 10) || 5) const max_steps = Math.max(1, Number.parseInt(process.env.DOKOBOT_MAX_STEPS || '10', 10) || 10) const session_id_pattern = /session(?:\s+id)?\s*:\s*([A-Za-z0-9._:-]+)/i const getSessionId = (output: string) => { return output.match(session_id_pattern)?.[1] ?? null } const buildReadArgs = (url: string, session_id?: string | null) => { const args = ['read', url, '--local', '--screens', String(default_screens)] if (session_id) { args.push('--session-id', session_id) } return args } const closeDokobotSession = async (session_id: string) => { const commands = [ ['doko', 'close', session_id], ['close', session_id] ] for (const args of commands) { const result = await runCommand('dokobot', args,
Agent context: Repository fit is getting clear: Polywise is an open-source local agentic content system, with Linkcase explicitly for fetching/extracting web content and sessions instructed to fetch pages before answering. I’m checking the Linkcase fetch path now to understand provider selection in the UI/API. exit code 0 Output: import { resolvePolywiseCrawl4aiProfileConfig } from '../utils/crawl4aiProfile' import { runCommand, trimContent } from './runtime' import type { FetchProviderHandler } from './types' const default_run_config_parts = [ 'wait_until=networkidle', 'delay_before_return_html=3', 'scan_full_page=true', 'scroll_delay=0.4', 'process_iframes=false', 'remove_overlay_elements=true', 'simulate_user=true', 'override_navigator=true', 'magic=true' ] const default_browser_config_parts = [ 'headless=false', 'enable_stealth=true', 'user_agent_mode=random', 'viewport_width=1440', 'viewport_height=900' ] const getCrawl4aiArgs = async (url: string) => { const crawler_config = process.env.CRAWL4AI_RUN_CONFIG?.trim() const args = [url, '-o', 'markdown', '--bypass-cache'] const browser_config = process.env.CRAWL4AI_BROWSER_CONFIG?.trim() const browser_type = process.env.CRAWL4AI_BROWSER_TYPE?.trim() || 'chromiu
Agent context: Repository fit is getting clear: Polywise is an open-source local agentic content system, with Linkcase explicitly for fetching/extracting web content and sessions instructed to fetch pages before answering. I’m checking the Linkcase fetch path now to understand provider selection in the UI/API. exit code 0 Output: id: item.id, article_id, content }) return { id: item.id, title: item.title, url: item.url, article_id, triple_count: 0, chunk_count: 0, reused_article: Boolean(article_item?.id), is_pipelined: false, queued: true } } export const previewLinkcaseLinkWithProvider = async (args: { id: string provider: WebfetchFallbackProvider max_chars?: number }) => { const current_link = await getLink(eq(link.id, args.id)) if (!current_link) { throw new Error(`Link not found: ${args.id}`) } cleanupLinkcaseFetchPreviewCache() const max_chars = args.max_chars ?? DEFAULT_LINKCASE_FETCH_MAX_CHARS const result = await fetchWithProvider(args.provider, current_link.url, max_chars) const preview_key = crypto.randomUUID() const preview_page = getPreviewPage(result.content, 1) linkcase_fetch_preview_cache.set(preview_key, { id: current_link.id, url: current_link.url, title: result.title?.trim() || current_link.tit
Agent context: One important repository-specific constraint: some existing fallbacks are local/browser tools, not managed services. For a single managed provider, the choice should minimize new config and avoid forcing every user into a paid crawler unless the extra crawl features are clearly needed.
Query/prompt: https://jina.ai/en-US/reader/ Agent context: One important repository-specific constraint: some existing fallbacks are local/browser tools, not managed services. For a single managed provider, the choice should minimize new config and avoid forcing every user into a paid crawler unless the extra crawl features are clearly needed.
Query/prompt: https://firecrawl.mintlify.app/api-reference/endpoint/scrape Agent context: One important repository-specific constraint: some existing fallbacks are local/browser tools, not managed services. For a single managed provider, the choice should minimize new config and avoid forcing every user into a paid crawler unless the extra crawl features are clearly needed.
Query/prompt: https://docs.tavily.com/api-reference/endpoint/extract Agent context: One important repository-specific constraint: some existing fallbacks are local/browser tools, not managed services. For a single managed provider, the choice should minimize new config and avoid forcing every user into a paid crawler unless the extra crawl features are clearly needed.
Agent context: I have enough context to make the recommendation. I’m grabbing exact line numbers for the repository evidence so the conclusion points to concrete integration points.
Agent context: I have enough context to make the recommendation. I’m grabbing exact line numbers for the repository evidence so the conclusion points to concrete integration points.
exit code 0
Output:
 1  import { config } from '@core/config'
 2
 3  import { trimContent } from './runtime'
 4
 5  import type { FetchProviderHandler } from './types'
 6
 7  const fetchWithRJina: FetchProviderHandler = async ({ url, max_chars }) => {
 8      const jina_api_key = config.jina_api_key?.trim()
 9      const resp = await fetch(`https://r.jina.ai/${url}`, {
10          signal: AbortSignal.timeout(30000),
11          headers: {
12              ...(jina_api_key ? { Authorization: `Bearer ${jina_api_key}` } : {})
13          }
14      })
15
16      if (!resp.ok) {
17          throw new Error(`Jina returned HTTP ${resp.status}`)
18      }
19
20      const markdown = await resp.text()
21
22      if (!markdown.trim()) {
23          throw new Error('Jina returned empty content')
24      }
25
26      return {
27          ok: true,
28          source: 'r.jina.ai',
29          ...trimContent(markdown, max_chars)
30      }
31  }
32
33  export default fetchWithRJina
Agent context: I have enough context to make the recommendation. I’m grabbing exact line numbers for the repository evidence so the conclusion points to concrete integration points. exit code 0 Output: 20 max_chars: number().optional().describe('Max characters to return (default 30000)') 21 }) 22 23 export const createWebSearchTool = () => { 24 return tool({ 25 description: 26 'Search the web and return link list. This tool is for finding candidate URLs only, not for final factual answers. After reviewing the returned links and snippets, call web_fetch_tool on the most relevant target URL to read the webpage body.', 27 inputSchema, 28 execute: async input => { 29 const max_chars = input.max_chars ?? MAX_CHARS 30 const jina_api_key = config.jina_api_key?.trim() 31 let jina_error = undefined as string | undefined 32 33 if (jina_api_key) { 34 try { 35 const url = `https://s.jina.ai/?q=${encodeURIComponent(input.query)}` 36 const resp = await fetch(url, { 37 signal: AbortSignal.timeout(15000), 38 headers: { 39 Authorization: `Bearer ${jina_api_key}`, 40 'X-Respond-With': 'no-content' 41 }
Agent context: I have enough context to make the recommendation. I’m grabbing exact line numbers for the repository evidence so the conclusion points to concrete integration points.
exit code 0
Output:
 1  import { config } from '@core/config'
 2  import { fetchWithFallbackChain, fetchWithLegacyFallback } from '@core/fetch'
 3  import { tool } from 'ai'
 4  import { number, object, url } from 'zod'
 5
 6  const MAX_CHARS = 50000
 7
 8  const inputSchema = object({
 9      url: url().describe('The URL to fetch content from'),
10      max_chars: number().optional().describe('Max characters to return (default 50000)')
11  })
12
13  export const createWebFetchTool = () => {
14      return tool({
15          description:
16              'Fetch content from a URL and return it as clean Markdown. Use for reading web pages, documentation, or any online content.',
17          inputSchema,
18          execute: async input => {
19              const max_chars = input.max_chars ?? MAX_CHARS
20
21              return config.enbale_webfetch_chain
22                  ? await fetchWithFallbackChain(input.url, max_chars)
23                  : await fetchWithLegacyFallback(input.url, max_chars)
24          }
25      })
26  }
Agent context: I have enough context to make the recommendation. I’m grabbing exact line numbers for the repository evidence so the conclusion points to concrete integration points. exit code 0 Output: 1 export interface Workspace { 2 name: string 3 endpoint?: string 4 } 5 6 export const default_fetch_fallback_chain = ['agent-browser', 'opencli', 'dokobot', 'crawl4ai', 'r.jina.ai'] as const 7 8 export type WebfetchFallbackProvider = (typeof default_fetch_fallback_chain)[number] 9 10 export interface DefaultModel { 11 provider: string 12 model: string 13 effort?: string 14 } 15 16 export interface McpOAuthConfig { 17 clientId?: string 18 clientSecret?: string 19 scope?: string 20 redirectUri?: string 21 } 22 23 export interface McpLocalConfig { 24 type: 'local' 25 command: Array<string> 26 environment?: Record<string, string> 27 enabled?: boolean 28 timeout?: number 29 } 30 31 export interface McpRemoteConfig { 32 type: 'remote' 33 url: string 34 enabled?: boolean 35 headers?: Record<string, string> 36 oauth?: McpOAuthConfig | false 37 timeout?: number 38 } 39 40 export interface McpEna
Agent context:

**Provider**

Choose **Jina AI Reader API (`r.jina.ai`)** as the primary managed web-page fetching/extraction provider.

Why: it is already the best fit for this repo’s shape. Polywise already has a `web_fetch_tool` whose purpose is to return clean Markdown, and the existing managed adapter is `r.jina.ai` with optional `jina_api_key` auth: [rjina.ts](/home/user/worktrees/attempt-923e5403-eae5-4ff2-b172-90df8b337cbe-cap-0/packages/polywise/src/fetch/rjina.ts:7), [webfetch.ts](/home/user/worktrees/attempt-923e5403-eae5-4ff2-b172-90df8b337cbe-cap-0/packages/polywise/src/fst/tools/webfetch.ts:13). The config already has `jina_api_key`, and the fallback chain already includes `r.jina.ai`: [config.ts](/home/user/worktrees/attempt-923e5403-eae5-4ff2-b172-90df8b337cbe-cap-0/packages/polywise/src/types/config.ts:6). Search also already uses Jina Search when a key is present: [websearch.ts](/home/user/worktrees/attempt-923e5403-eae5-4ff2-b172-90df8b337cbe-cap-0/packages/polywise/src/fst/tools/websearch.ts:23).

Jina’s docs match the repo’s need directly: Reader converts URLs into LLM-friendly text by prefixing `r.jina.ai`, and it supports clean main-content extraction, browser rendering, Markdown/JSON response options, PDFs, optional image captions, unauthenticated basic use, and higher limits with an API key. It also explicitly does not bypass anti-bot or login barriers, which is acceptable here because Polywise already has local/browser fallbacks for those cases. Source: [Jina Reader API](https://jina.ai/en-US/reader/).

**Main Alternatives**

- **Firecrawl**: strong managed scraper/crawler, with Markdown output, main-content mode, proxy, actions, screenshots, PDF parser, cache, and structured extraction. I would choose it if Polywise needed site crawling or more aggressive scraping as the default. For this repo, it is heavier and adds a new paid crawler-style dependency where Jina is already integrated. Source: [Firecrawl Scrape API](https://firecrawl.mintlify.app/api-reference/endpoint/scrape).
- **Exa Contents**: good for clean content, highlights, summaries, JS-rendered pages, PDFs, and subpage crawling. Better if Polywise wanted to replace search plus content retrieval together. Less drop-in than Jina because the repo already has separate search/fetch flows and Jina wiring. Source: [Exa Contents API](https://exa.ai/docs/reference/contents-api-guide).
- **Tavily Extract**: solid extraction endpoint with Markdown/text formats, batch URL extraction, basic/advanced depth, and query-reranked chunks. Better for a Tavily-centered research stack, but it requires a new API key and does not fit the existing Jina config path as cleanly. Source: [Tavily Extract API](https://docs.tavily.com/documentation/api-reference/endpoint/extract).

**Docs Consulted**

I consulted the official docs for **Jina Reader**, **Firecrawl Scrape**, **Exa Contents**, and **Tavily Extract**.

Recommendation: standardize on **Jina Reader (`r.jina.ai`) as the single managed provider**, and keep the existing local/browser providers as fallbacks for pages Jina cannot access.
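To make the recommendation concrete, here is a minimal sketch of the "managed provider first, local providers as fallback" shape. It is not the repository's `fetchWithFallbackChain`, whose implementation was not inspected in this run. The names `buildReaderRequest`, `fetchWithChain`, `Handler`, and `FetchResult` are hypothetical and exist only for illustration. The URL prefixing and optional Bearer header mirror what `rjina.ts` was observed to do.

```typescript
// Hypothetical types for illustration; the repository's FetchProviderHandler may differ.
type FetchResult = { ok: true; source: string; content: string }
type Handler = (args: { url: string; max_chars: number }) => Promise<FetchResult>

// Build a Reader request by prefixing the target URL with https://r.jina.ai/.
// The Authorization header is attached only when a non-empty key is supplied.
const buildReaderRequest = (url: string, api_key?: string) => {
	const key = api_key?.trim()
	const headers: Record<string, string> = key ? { Authorization: `Bearer ${key}` } : {}

	return { endpoint: `https://r.jina.ai/${url}`, headers }
}

// Try each provider in the given order and return the first success.
// Failures are collected so the final error explains why every provider failed.
const fetchWithChain = async (
	chain: Array<{ name: string; handler: Handler }>,
	url: string,
	max_chars: number
): Promise<FetchResult> => {
	const errors: Array<string> = []

	for (const { name, handler } of chain) {
		try {
			return await handler({ url, max_chars })
		} catch (err) {
			errors.push(`${name}: ${err instanceof Error ? err.message : String(err)}`)
		}
	}

	throw new Error(`All providers failed: ${errors.join('; ')}`)
}
```

The order of the chain is left to the caller on purpose: the repository's `default_fetch_fallback_chain` lists `r.jina.ai` last, while the recommendation treats it as the primary managed path, so the ordering is a configuration decision and not something this sketch fixes.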