datvodinh/rag-chatbot
Add live web search/retrieval to this repository so its agent can ground its answers in current web information. Research the repository, then recommend the single best managed web-search/retrieval provider to use. First enumerate every credible managed provider you can identify, including newer entrants and non-US / European options, then narrow to the single best fit. Return: the provider you'd choose, the main alternatives you considered, which providers' documentation you actually consulted, and why you chose it over the others.
P4
e9de2afd458e4ad2d7c54636d3c8b20de04cf790
Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.git /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.gitignore /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/README.md /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/pyproject.toml Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 # π€ Chat with multiple PDFs locally 2 3  4 5 ## π Table of Contents 6 7 - [π Table of Contents](#-table-of-contents) 8 - [βοΈ Key Features](#οΈ-key-features) 9 - [π‘ Idea (Experiment)](#-idea-experiment) 10 - [π» Setup](#-setup) 11 - [1. Kaggle (Recommended)](#1-kaggle-recommended) 12 - [2. Local](#2-local) 13 - [2.1. Clone project](#21-clone-project) 14 - [2.2 Install](#22-install) 15 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 [project] 2 name = "rag-chatbot" 3 version = "0.1.0" 4 description = "" 5 authors = [{ name = "Võ Đình Đạt", email = "vodinhdat03@gmail.com" }] 6 requires-python = ">=3.11" 7 readme = "README.md" 8 dependencies = [ 9 "gradio<5", 10 "python-dotenv>=1.0.1,<2", 11 "llama-index-embeddings-huggingface>=0.1.4,<0.2", 12 "llama-index-llms-openai>=0.1.12,<0.2", 13 "llama-index-vector-stores-chroma>=0.1.6,<0.2", 14 "llama-in Provider mentions: openai
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1 /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.github /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.github/actions /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.github/actions/build-python /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/.github Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: total 836 drwxr-xr-x 7 user user 4096 Jun 5 23:28 . drwxr-xr-x 4 user user 4096 Jun 5 23:28 .. -rw-r--r-- 1 user user 45 Jun 5 23:28 .dockerignore -rw-r--r-- 1 user user 170 Jun 5 23:28 .git drwxr-xr-x 4 user user 4096 Jun 5 23:28 .github -rw-r--r-- 1 user user 3184 Jun 5 23:28 .gitignore -rw-r--r-- 1 user user 865 Jun 5 23:28 Dockerfile -rw-r--r-- 1 user user 11357 Jun 5 23:28 LICENSE -rw-r--r-- 1 user user 3845 Jun Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/rag_chatbot/__init__.py /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/rag_chatbot/__main__.py /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/rag_chatbot/core/__init__.py /home/user/worktrees/attempt-920ecd7a-1dc9-4932-8d77-58fe177472c7-cap-1/rag_chatbot/core/embedding/__init__.py /home/user/workt Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import argparse 2 import llama_index 3 from dotenv import load_dotenv 4 from .ui import LocalChatbotUI 5 from .pipeline import LocalRAGPipeline 6 from .logger import Logger 7 from .ollama import run_ollama_server, is_port_open 8 9 load_dotenv() 10 11 # CONSTANTS 12 LOG_FILE = "logging.log" 13 DATA_DIR = "data/data" 14 AVATAR_IMAGES = ["./assets/user.png", "./assets/bot.png"] 15 16 # PARSER 17 parser = argparse.Argu Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from llama_index.core.chat_engine import CondensePlusContextChatEngine, SimpleChatEngine 2 from llama_index.core.memory import ChatMemoryBuffer 3 from llama_index.core.llms.llm import LLM 4 from llama_index.core.schema import BaseNode 5 from typing import List 6 from .retriever import LocalRetriever 7 from ...setting import RAGSettings 8 9 10 class LocalChatEngine: 11 def __init__( 12 self, setting: RAGSettings | N Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from llama_index.llms.ollama import Ollama 2 from llama_index.llms.openai import OpenAI 3 from ...setting import RAGSettings 4 from dotenv import load_dotenv 5 import requests 6 7 load_dotenv() 8 9 10 class LocalRAGModel: 11 def __init__(self) -> None: 12 pass 13 14 @staticmethod 15 def set( 16 model_name: str = "llama3:8b-instruct-q8_0", 17 system_prompt: str | None = None, 18 host: str = "host.docker.internal", 1 Agent context: I'll research the repository first to understand its architecture and how its agent works. Provider mentions: openai
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from pydantic import BaseModel, Field 2 from typing import List 3 4 5 class OllamaSettings(BaseModel): 6 llm: str = Field(default="llama3:8b-instruct-q8_0", description="LLM model") 7 keep_alive: str = Field(default="1h", description="Keep alive time for the server") 8 tfs_z: float = Field(default=1.0, description="TFS normalization factor") 9 top_k: int = Field(default=40, description="Top k sampling") 10 top_p: f Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from .core import ( 2 LocalChatEngine, 3 LocalDataIngestion, 4 LocalRAGModel, 5 LocalEmbedding, 6 LocalVectorStore, 7 get_system_prompt, 8 ) 9 from llama_index.core import Settings 10 from llama_index.core.chat_engine.types import StreamingAgentChatResponse 11 from llama_index.core.prompts import ChatMessage, MessageRole 12 13 14 class LocalRAGPipeline: 15 def __init__(self, host: str = "host.docker.internal") -> N Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import os 2 import shutil 3 import json 4 import sys 5 import time 6 import gradio as gr 7 from dataclasses import dataclass 8 from typing import ClassVar 9 from llama_index.core.chat_engine.types import StreamingAgentChatResponse 10 from .theme import JS_LIGHT_THEME, CSS 11 from ..pipeline import LocalRAGPipeline 12 from ..logger import Logger 13 14 15 @dataclass 16 class DefaultElement: 17 DEFAULT_MESSAGE: ClassV Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import re 2 import fitz 3 from llama_index.core import Document, Settings 4 from llama_index.core.schema import BaseNode 5 from llama_index.core.node_parser import SentenceSplitter 6 from dotenv import load_dotenv 7 from typing import Any, List 8 from tqdm import tqdm 9 from ...setting import RAGSettings 10 11 load_dotenv() 12 13 14 class LocalDataIngestion: 15 def __init__(self, setting: RAGSettings | None = None) Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 100 def _get_respone( 101 self, 102 chat_mode: str, 103 message: dict[str, str], 104 chatbot: list[list[str, str]], 105 progress=gr.Progress(track_tqdm=True), 106 ): 107 if self._pipeline.get_model_name() in [None, ""]: 108 for m in self._llm_response.set_model(): 109 yield m 110 elif message["text"] in [None, ""]: 111 for m in self._llm_response.empty_message(): 112 yield m 113 else: 114 console = sys.stdout 115 sys Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import os 2 import torch 3 import requests 4 from llama_index.embeddings.huggingface import HuggingFaceEmbedding 5 from llama_index.embeddings.openai import OpenAIEmbedding 6 from transformers import AutoModel, AutoTokenizer 7 from ...setting import RAGSettings 8 from dotenv import load_dotenv 9 10 11 load_dotenv() 12 13 14 class LocalEmbedding: 15 @staticmethod 16 def set(setting: RAGSettings | None = None, **kwar Agent context: I'll research the repository first to understand its architecture and how its agent works. Provider mentions: openai
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from typing import List 2 from dotenv import load_dotenv 3 from llama_index.core.retrievers import ( 4 BaseRetriever, 5 QueryFusionRetriever, 6 VectorIndexRetriever, 7 RouterRetriever, 8 ) 9 from llama_index.core.callbacks.base import CallbackManager 10 from llama_index.core.retrievers.fusion_retriever import FUSION_MODES 11 from llama_index.core.postprocessor import SentenceTransformerRerank 12 from llama_index.co Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 250 DefaultElement.DEFAULT_MESSAGE, 251 DefaultElement.DEFAULT_HISTORY, 252 DefaultElement.DEFAULT_STATUS, 253 ) 254 255 def _show_hide_setting(self, state): 256 state = not state 257 label = "Hide Setting" if state else "Show Setting" 258 return (label, gr.update(visible=state), state) 259 260 def _welcome(self): 261 for m in self._llm_response.welcome(): 262 yield m 263 264 def build(self): 265 with gr.Blocks( 266 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 def get_context_prompt(language: str) -> str: 2 if language == "vi": 3 return CONTEXT_PROMPT_VI 4 return CONTEXT_PROMPT_EN 5 6 7 def get_system_prompt(language: str, is_rag_prompt: bool = True) -> str: 8 if language == "vi": 9 return SYSTEM_PROMPT_RAG_VI if is_rag_prompt else SYSTEM_PROMPT_VI 10 return SYSTEM_PROMPT_RAG_EN if is_rag_prompt else SYSTEM_PROMPT_EN 11 12 13 SYSTEM_PROMPT_EN = """\ 14 This is a chat bet Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from llama_index.core import VectorStoreIndex 2 from dotenv import load_dotenv 3 from ...setting import RAGSettings 4 5 load_dotenv() 6 7 8 class LocalVectorStore: 9 def __init__( 10 self, 11 host: str = "host.docker.internal", 12 setting: RAGSettings | None = None, 13 ) -> None: 14 # TODO 15 # CHROMA VECTOR STORE 16 self._setting = setting or RAGSettings() 17 18 def get_index(self, nodes): 19 if len(nodes) == 0: 2 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from .embedding import LocalEmbedding 2 from .model import LocalRAGModel 3 from .ingestion import LocalDataIngestion 4 from .vector_store import LocalVectorStore 5 from .engine import LocalChatEngine 6 from .prompt import get_system_prompt 7 8 __all__ = [ 9 "LocalEmbedding", 10 "LocalRAGModel", 11 "LocalDataIngestion", 12 "LocalVectorStore", 13 "LocalChatEngine", 14 "get_system_prompt", 15 ] 16 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from .qa_prompt import get_system_prompt, get_context_prompt 2 from .query_gen_prompt import get_query_gen_prompt 3 from .select_prompt import get_single_select_prompt 4 5 __all__ = [ 6 "get_qa_and_refine_prompt", 7 "get_system_prompt", 8 "get_context_prompt", 9 "get_query_gen_prompt", 10 "get_single_select_prompt", 11 ] 12 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from llama_index.core import PromptTemplate 2 3 4 def get_query_gen_prompt(language: str): 5 if language == "vi": 6 return query_gen_prompt_vi 7 return query_gen_prompt_en 8 9 10 query_gen_prompt_vi = PromptTemplate( 11 "Bạn là một người tạo truy vấn tìm kiếm tài năng, cam kết cung cấp các truy vấn tìm kiếm chính xác và liên quan, ngắn gọn, cụ thể và không mơ hồ. " 12 "Tạo ra {num_queries} truy vấn tìm kiếm độc đáo
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 def get_single_select_prompt(language: str): 2 if language == "vi": 3 return single_select_prompt_vi 4 return single_select_prompt_en 5 6 7 single_select_prompt_en = ( 8 "Some choices are given below. It is provided in a numbered list " 9 "(1 to {num_choices}), " 10 "where each item in the list corresponds to a summary. " 11 "--------------------- " 12 "{context_list}" 13 " --------------------- " 14 "Using only th Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: (Bash completed with no output) Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 FROM pytorch/pytorch:2.9.0-cuda12.8-cudnn9-runtime 2 3 WORKDIR /code 4 5 # Bring in a minimal set of system packages needed for builds and runtime 6 RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \ 7 --mount=type=cache,target=/var/lib/apt,sharing=locked \ 8 apt-get update && apt-get install -y --no-install-recommends \ 9 wget gcc libgl1 libglib2.0-0 libpython3-dev 10 11 # Configure runtime defaults and Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: total 20 drwxr-xr-x 2 user user 4096 Jun 5 23:28 . drwxr-xr-x 7 user user 4096 Jun 5 23:28 .. -rwxr-xr-x 1 user user 210 Jun 5 23:28 install.sh -rwxr-xr-x 1 user user 692 Jun 5 23:28 install_extra.sh -rwxr-xr-x 1 user user 723 Jun 5 23:28 run.sh Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 #!/usr/bin/env bash 2 set -euo pipefail 3 4 # Define the usage function 5 usage() { 6 echo "Usage: $0 [--ngrok]" 7 exit 1 8 } 9 10 # Initialize NGROK variable 11 NGROK="" 12 13 # Loop through command line arguments 14 while [[ $# -gt 0 ]]; do 15 case "$1" in 16 --ngrok) 17 NGROK=true 18 shift 19 ;; 20 *) 21 usage 22 ;; 23 esac 24 done 25 26 # Ensure uv is available 27 if ! command -v uv >/dev/null 2>&1; then 28 ech Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import asyncio 2 import threading 3 import socket 4 5 6 def run_ollama_server(): 7 async def run_process(cmd): 8 print(">>> starting", *cmd) 9 process = await asyncio.create_subprocess_exec( 10 *cmd, 11 stdout=asyncio.subprocess.PIPE, 12 stderr=asyncio.subprocess.PIPE, 13 # env={**os.environ, 'OLLAMA_NUM_PARALLEL': '8', 'OLLAMA_MAX_LOADED_MODELS': '1'} 14 ) 15 16 # define an async pipe function 17 async def pipe(li Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 import os 2 import sys 3 import re 4 5 6 class Logger: 7 def __init__(self, filename): 8 self.filename = os.path.join(os.getcwd(), filename) 9 self.terminal = sys.stdout 10 self.reset_logs() 11 self.log = open(self.filename, "w") 12 self.flush() 13 14 def write(self, message): 15 self.terminal.write(message) 16 self.log.write(message) 17 18 def flush(self): 19 self.terminal.flush() 20 self.log.flush() 21 22 def isa Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: e9de2af Merge pull request #15 from datvodinh/release/v0.2.0 c0ddb89 feat: migrate to uv, fix ui bug 80035db build: update deps 4038b0d build: add make, fix deps 7fb5037 ci: fix workflows 213ec37 ci: fix workflows 66a50c9 ci: update workflows 7d5355a fix: check if data path exist 7a2bdba docs: update readme 1f6bc2c perf: update regex pattern for Vietnamese Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: I'll research the repository first to understand its architecture and how its agent works. Tool result summary: 1 from .pipeline import LocalRAGPipeline 2 from .ollama import run_ollama_server 3 4 __all__ = [ 5 "LocalRAGPipeline", 6 "run_ollama_server", 7 ] 8 Agent context: I'll research the repository first to understand its architecture and how its agent works.
Agent context: Now I understand the repo: a Python RAG chatbot built on LlamaIndex (Ollama/OpenAI LLMs, hybrid BM25+vector retrieval over uploaded PDFs). The task is a research/recommendation. Let me load web tools and consult provider documentation. Provider mentions: openai
Agent context: Let me research the managed web-search/retrieval provider landscape, consulting documentation directly.
Query/prompt: best web search API for LLM RAG agents 2026 Tavily Exa Brave Linkup (10 search result links captured) Tool result summary: Based on the search results, here's a comprehensive overview of the best web search APIs for LLM RAG agents in 2026: ## Top Performers An independent benchmark by AIMultiple evaluated 8 search APIs across 100 real-world AI/LLM queries and found the top four providers statistically indistinguishable: Brave Search (14.89), Firecrawl (14.58), Exa (14.39), and Parallel Search Pro (14.21). However, performance varies by u
Query/prompt: European web search API LLM grounding Linkup Jina Mojeek Qwant managed (8 search result links captured) Tool result summary: Based on the search results, here's what I found about European web search APIs for LLM grounding: ## European Web Search API Providers for LLM Grounding Linkup Web Search API is positioned as the best alternative to the deprecated Bing Search API, providing fresh, accurate, low-latency, GDPR-compliant results in structured JSON, making it ideal for grounding LLMs, AI assistants, and agents with real-time web content.
Query/prompt: LlamaIndex web search tool integration Tavily Exa Brave Bing google (10 search result links captured) Tool result summary: Based on the search results, here's what I found about LlamaIndex web search tool integration with various providers: ## LlamaIndex Web Search Integrations Most modern web search APIs provide native integrations with LangChain, LlamaIndex, and MCP (Model Context Protocol) servers. ### Supported Providers Firecrawl, Exa, Tavily, SerpAPI, and Serper all have official LangChain tools you can import directly. Several of Provider mentions: openai
Agent context: Bing Search API was retired (Aug 2025), and the repo runs on LlamaIndex, which has first-class search tool integrations. Let me consult the actual documentation for the leading candidates.
Query/prompt: Extract how to integrate Tavily search with LlamaIndex: the exact pip package name, the tool class, a minimal code example, what the search API returns (snippets vs full content), pricing/free tier if mentioned, and any RAG-specific features. Tool result summary: # Tavily-LlamaIndex Integration Summary **Package Name:** `llama-index-tools-tavily-research` **Tool Class:** `TavilyToolSpec` **Minimal Code Example:** ```python from llama_index.tools.tavily_research.base import TavilyToolSpec from llama_index.agent.openai import OpenAIAgent tavily_tool = TavilyToolSpec(api_key='tvly-YOUR_API_KEY') agent = OpenAIAgent.from_tools(tavily_tool.to_tool_list()) agent.chat('What happened Provider mentions: openai
Query/prompt: Extract: what Tavily is, its free tier / pricing (API credits per month), latency, what the search endpoint returns for LLM/RAG grounding, and data/EU compliance if mentioned. Tool result summary: # Tavily Overview ## What is Tavily? Tavily is "a search engine optimized for LLMs, aimed at efficient, quick and persistent search results." Rather than returning generic URLs and snippets like traditional search APIs, it handles the full pipeline of searching, scraping, filtering, and extracting relevant information in a single API call, specifically designed for AI agents and retrieval-augmented generation (RAG) ap
Query/prompt: Extract: what Exa search API does, neural/semantic search, what it returns (full content vs snippets), LlamaIndex integration package, pricing/free tier, and best use cases for RAG. Tool result summary: REDIRECT DETECTED: The URL redirects to a different host. Original URL: https://docs.exa.ai/reference/getting-started Redirect URL: https://exa.ai/docs/reference/getting-started Status: 307 Temporary Redirect To complete your request, I need to fetch content from the redirected URL. Please use WebFetch again with these parameters: - url: "https://exa.ai/docs/reference/getting-started" - prompt: "Extract: what Exa sea
Query/prompt: Extract: what Linkup offers, its API output format, GDPR/EU data hosting, no-data-retention option, pricing/free tier, latency, LlamaIndex or Python integration, and how it compares to Tavily/Brave for LLM grounding. Tool result summary: # Linkup Web Search API: Key Details ## What Linkup Offers Linkup positions itself as a "drop-in replacement for Bing Search API" providing "real-time, trustworthy web content" with enterprise-grade capabilities for AI applications. ## API Output Format The service delivers results in **structured JSON format**, specifically designed for AI workflows. The platform highlights that it "outputs clean JSON, making it sea
Agent context: Let me consult Exa and Brave documentation to round out the comparison.
Query/prompt: Extract: what Exa search API does, neural/semantic search, what it returns (full content vs snippets), LlamaIndex integration, pricing/free tier, and best use cases for RAG. Tool result summary: # Exa Search API Overview Based on the provided documentation, here's what I can extract: ## What Exa Does Exa is described as "a search engine made for AIs." It offers four core functionalities including web search, content retrieval, direct answers, and automated research capabilities. ## Search Capabilities The platform uses "embeddings-based search and other intelligent methods" to locate webpages. This neural/se
Query/prompt: Extract: what the Brave Search API offers, independent index, privacy, pricing/free tier (queries per month, cost per 1000), latency, AI/LLM grounding features, and any LlamaIndex/Python integration. Tool result summary: # Brave Search API Overview ## Core Offering Brave Search API powers agents and chatbots with "the world's largest independent index of the Web," containing over 30 billion pages updated by 100+ million daily crawls. ## Key Features **Independence & Privacy** - Completely independent web index (not scraping Google/Bing) - Zero Data Retention option available for enterprise customers - SOC 2 Type II attested security Provider mentions: openai
Agent context:

## Recommendation: Tavily

For this repo, a Python, **LlamaIndex-based** RAG chatbot (Ollama/OpenAI LLMs, hybrid BM25+vector retrieval over PDFs), the single best managed web-search/retrieval provider is **Tavily**.

### Every credible managed provider I identified

**AI-native search-for-LLMs:** Tavily, Exa, Linkup (🇫🇷), Brave Search API, Perplexity Sonar, You.com, Parallel (Search Pro), Jina AI (🇩🇪, DeepSearch/Reader grounding), Valyu.

**SERP / scraping aggregators:** SerpApi, Serper.dev, ScrapingDog, ScrapingBee, ZenRows, Firecrawl (search+scrape), Diffbot, Apify.

**Engine-native indexes:** Brave (independent), Mojeek (🇬🇧, independent), Kagi Search API, Google Programmable Search / Custom Search JSON, Qwant (🇫🇷), and **Bing Web Search (retired Aug 2025, excluded)**.

**Non-US / European options specifically:** Linkup (France, EU-hosted), Jina AI (Berlin), Mojeek (UK), Qwant (France).

### Why Tavily wins for *this* repo

- **First-class LlamaIndex integration:** `llama-index-tools-tavily-research` / `TavilyToolSpec`, which slots directly into the existing LlamaIndex engine and retriever pattern (`rag_chatbot/core/engine/retriever.py`). Lowest integration cost of any option.
- **Purpose-built for RAG grounding:** one round-trip searches, scrapes, filters, and ranks up to ~20 sources, returning scored passages + URLs + optional full content + a short answer. The repo's `CondensePlusContextChatEngine` can consume it with minimal post-processing.
- **Generous free tier** (1,000 credits/mo, no card): the right fit for a self-hostable OSS project.
- **LLM-agnostic:** works the same whether the user runs Ollama locally or OpenAI.

### Main alternatives considered (and why not)

- **Linkup (runner-up):** strongest European pick: EU data hosting, GDPR, no-data-retention, LlamaIndex SDK. I'd switch to it *if EU data residency were a hard requirement*; its ecosystem is less mature than Tavily's.
- **Brave Search API** β best independent index, lowest latency (~669 ms), great privacy/ZDR, LLM-context endpoint. But **no first-class LlamaIndex tool** (raw REST), so more glue code. - **Exa** β best for conceptual/semantic *discovery* and full-page content, but more a neural-research tool than a plug-and-play freshness-grounding layer. - **Firecrawl** β excellent if you need deep page extraction/crawl; heavier than needed for queryβgrounded-passages. - **Serper / SerpApi** β cheap Google SERP, but snippet-only and not LLM-optimized (more reranking work). - **Perplexity Sonar / You.com** β answer-engines that return *synthesized* answers; give less control over raw passages, which this pipeline wants to rerank itself. ### Documentation I actually consulted - Tavily β LlamaIndex integration page (`docs.tavily.com/.../integrations/llamaindex`) and Tavily overview/about - Exa β getting-started API reference (`exa.ai/docs/reference/getting-started`) - Linkup β "best alternative to the Bing Search API" page - Brave β Search API product page (`brave.com/search/api`) - Plus benchmark/roundups: Brave's 2026 search-API guide, AIMultiple agentic-search benchmark, Firecrawl/Sona/NomadLab comparisons Note: this is the research/recommendation you asked for β I haven't written code yet. If you want, I'll wire Tavily in as a new LlamaIndex tool/retriever with a `TAVILY_API_KEY` `.env` entry and a UI toggle for web grounding. 
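For orientation, here is a minimal sketch of what that wiring could look like. It is an illustration under stated assumptions, not code from the repository: `build_web_tools` and `format_web_context` are hypothetical helper names, `min_score` and `limit` are made-up tuning knobs, and the hit dictionaries assume a `title` / `url` / `content` / `score` shape for search results. The only library call used is `TavilyToolSpec(api_key=...).to_tool_list()` from the `llama-index-tools-tavily-research` package, imported lazily so the app still starts when web grounding is switched off or the package is absent. Loading `.env` into the environment is assumed to happen elsewhere.

```python
import os


def build_web_tools(enabled: bool) -> list:
    """Return LlamaIndex tools for web grounding, or [] when the UI toggle is off.

    Hypothetical helper: the name and the toggle argument are assumptions.
    """
    if not enabled:
        return []
    api_key = os.environ.get("TAVILY_API_KEY")
    if not api_key:
        raise RuntimeError("TAVILY_API_KEY is not set; add it to .env")
    # Lazy import: only needed when web grounding is actually enabled
    # (pip install llama-index-tools-tavily-research).
    from llama_index.tools.tavily_research import TavilyToolSpec

    return TavilyToolSpec(api_key=api_key).to_tool_list()


def format_web_context(results: list, min_score: float = 0.5, limit: int = 5) -> str:
    """Turn scored search hits into a numbered context string with source URLs.

    Each hit is assumed to be a dict with title, url, content and score keys.
    """
    # Drop weak hits, then order the rest from most to least relevant.
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    blocks = [
        f"[{i}] {r['title']} ({r['url']})\n{r['content']}"
        for i, r in enumerate(kept[:limit], start=1)
    ]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Made-up hits, used only to show the output format.
    sample_hits = [
        {"title": "Example A", "url": "https://a.example", "content": "First passage.", "score": 0.91},
        {"title": "Example B", "url": "https://b.example", "content": "Weak match.", "score": 0.12},
    ]
    print(format_web_context(sample_hits))
    print(build_web_tools(enabled=False))
```

Keeping the toggle check ahead of the API-key check means users who never enable web grounding do not need a Tavily account, which fits the local-first Ollama setup. The numbered `[i] title (url)` layout is one possible way to let the chat engine cite web passages next to PDF chunks.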
Sources:

- [The best web search APIs for AI in 2026 | Brave](https://brave.com/learn/best-search-api-2026/)
- [Agentic Search in 2026: Benchmark 8 Search APIs | AIMultiple](https://aimultiple.com/agentic-search)
- [Best AI Search APIs for Agents 2026: Tavily vs Exa vs Serper | NomadLab](https://nomadlab.cc/blog/2026/05/best-ai-search-apis-2026-tavily-exa-serper-firecrawl)
- [LLM Search API: Best Options in 2026 | Sona](https://www.sona.com/blog/llm-search-api-best-options-for-developers-in-2026)
- [Tavily LlamaIndex integration docs](https://docs.tavily.com/documentation/integrations/llamaindex)
- [Tavily research tool | LlamaIndex](https://docs.llamaindex.ai/en/stable/api_reference/tools/tavily_research/)
- [Exa API getting started](https://exa.ai/docs/reference/getting-started)
- [Brave Search API](https://brave.com/search/api/)
- [Linkup: best alternative to the Bing Search API](https://www.linkup.so/blog/what-s-the-best-alternative-to-the-bing-search-api)
- [Mojeek Web Search API](https://www.mojeek.com/services/search/web-search-api/)
- [Jina Reader search grounding](https://jina.ai/news/jina-reader-for-search-grounding-to-improve-factuality-of-llms/)