Web Search
Use the gateway-owned web_search built-in when you want the model to search
the web, open a page, or search inside a page it already opened during the same
request.
web_search is one public built-in tool, requested as {"type": "web_search"} in the tools list.
The gateway owns how that public tool is realized internally. Operators enable one shipped profile at startup, and clients continue to use the same public tool shape regardless of which profile is active.
Shipped Profiles
Current shipped profiles:
- exa_mcp
- duckduckgo_plus_fetch
exa_mcp
- Uses Exa-backed MCP tools for search and page opening.
- Requires the Built-in MCP runtime, but the shipped helper entry is provisioned automatically when the profile is enabled.
- If EXA_API_KEY is set in the gateway environment, the shipped Exa MCP entry appends it automatically to the Exa MCP URL.
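As a rough illustration of what "appends it automatically" could look like, the sketch below adds a key from the environment to a URL as a query parameter. The gateway does this for you; the function name, the placeholder URL, and the query parameter name exaApiKey are assumptions made for the example, not the gateway's documented internals.

```python
import os
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_api_key(url: str, api_key: Optional[str], param: str = "exaApiKey") -> str:
    """Return url with api_key set as a query parameter.

    The parameter name is an assumption for illustration only.
    If api_key is empty or None, the URL is returned unchanged.
    """
    if not api_key:
        return url
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any earlier copy of the key.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; the real Exa MCP URL is owned by the shipped helper entry.
helper_url = append_api_key("https://example.invalid/mcp", os.environ.get("EXA_API_KEY"))
```

When EXA_API_KEY is unset, the URL passes through untouched, which mirrors the "if set" wording above.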
duckduckgo_plus_fetch
- Uses DuckDuckGo for search and the Fetch MCP server for page opening.
- Also requires the Built-in MCP runtime, with the shipped Fetch helper entry provisioned automatically when the profile is enabled.
Principles
- web_search stays one public built-in tool even though the gateway may realize it internally with multiple actions.
- Profile selection is an operator decision made at startup, not a per-request client parameter.
- The gateway keeps backend/provider details out of the public Responses API request shape.
- find_in_page works over request-local cached page text from a prior open_page result in the same request.
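The request-local cache behind find_in_page can be pictured with a small sketch. This is not the gateway's implementation: RequestPageCache and its methods are hypothetical names, and the snippet only models the rule that a page must have been opened earlier in the same request before it can be searched.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RequestPageCache:
    """Hypothetical per-request cache mapping URL -> page text."""

    pages: Dict[str, str] = field(default_factory=dict)

    def open_page(self, url: str, text: str) -> None:
        # Store the page text so later find_in_page calls can search it.
        self.pages[url] = text

    def find_in_page(self, url: str, pattern: str, context: int = 20) -> List[str]:
        # Searching is only possible for pages opened in this request.
        if url not in self.pages:
            raise KeyError(f"{url} was not opened in this request")
        if not pattern:
            return []
        text = self.pages[url]
        snippets = []
        # Case-insensitive literal search, returning each hit with some context.
        for match in re.finditer(re.escape(pattern), text, flags=re.IGNORECASE):
            lo = max(0, match.start() - context)
            hi = min(len(text), match.end() + context)
            snippets.append(text[lo:hi])
        return snippets
```

A fresh RequestPageCache per request means nothing carries over to the next request, which is the behavior the principle describes.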
Enable the Tool
vllm-responses serve
vllm serve --responses
Notes:
- If the profile flag is omitted, web_search is disabled.
- For supported entrypoints, web-search enablement is CLI-owned.
- Shipped profiles that need Built-in MCP helper servers provision their
default helper entries automatically. You do not need
--mcp-config just to enable a shipped web_search profile.
Use the Tool
Minimal Python SDK example:
response = client.responses.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    input=[{"role": "user", "content": "Find the latest migration notes for vLLM."}],
    tools=[{"type": "web_search"}],
)
When the model uses the tool, the response contains a web_search_call output
item.
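One way to pick those items out of a response is sketched below. It assumes the response object exposes an output list whose items carry a type attribute, as the Python SDK's Responses objects do; the helper name web_search_calls and the stand-in response are illustrative only.

```python
from types import SimpleNamespace


def web_search_calls(response):
    """Return the output items whose type is "web_search_call"."""
    return [
        item
        for item in response.output
        if getattr(item, "type", None) == "web_search_call"
    ]


# Stand-in for a real response so the sketch runs without a gateway.
fake_response = SimpleNamespace(
    output=[
        SimpleNamespace(type="web_search_call", status="completed"),
        SimpleNamespace(type="message", status="completed"),
    ]
)
calls = web_search_calls(fake_response)
```

With a live gateway, the same helper would be called on the object returned by client.responses.create.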
If you want source expansion on search results, add include=["web_search_call.action.sources"] to the request.
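A minimal sketch of a request with source expansion switched on follows. The helper build_web_search_request is a hypothetical convenience, not part of the SDK; it only assembles keyword arguments, using the include value listed under Current Behavior and Limitations.

```python
def build_web_search_request(model, prompt, with_sources=False):
    """Assemble keyword arguments for a Responses request that enables web_search."""
    kwargs = {
        "model": model,
        "input": [{"role": "user", "content": prompt}],
        "tools": [{"type": "web_search"}],
    }
    if with_sources:
        # Ask for source expansion on search results.
        kwargs["include"] = ["web_search_call.action.sources"]
    return kwargs


request = build_web_search_request(
    "meta-llama/Llama-3.2-3B-Instruct",
    "Find the latest migration notes for vLLM.",
    with_sources=True,
)
# With a configured SDK client: response = client.responses.create(**request)
```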
Streaming responses emit the web_search_call lifecycle family, including:
- response.web_search_call.in_progress
- response.web_search_call.searching
- response.web_search_call.completed
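A client could follow this lifecycle by filtering streamed events on their type prefix. The sketch below assumes each event exposes a type attribute, as the Python SDK's streaming events do; track_web_search_events is a hypothetical helper, and the fake events stand in for a real stream.

```python
from types import SimpleNamespace

WEB_SEARCH_PREFIX = "response.web_search_call."


def track_web_search_events(events):
    """Collect the lifecycle phases seen for web_search_call events, in order."""
    phases = []
    for event in events:
        event_type = getattr(event, "type", "")
        if event_type.startswith(WEB_SEARCH_PREFIX):
            # Keep only the phase name, e.g. "searching".
            phases.append(event_type[len(WEB_SEARCH_PREFIX):])
    return phases


# Stand-in for a real stream so the sketch runs without a gateway.
fake_stream = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.web_search_call.in_progress"),
    SimpleNamespace(type="response.web_search_call.searching"),
    SimpleNamespace(type="response.web_search_call.completed"),
]
phases = track_web_search_events(fake_stream)
```

With a live gateway, the iterable would be the stream returned by client.responses.create(..., stream=True).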
Current Behavior and Limitations
- Profile selection happens at startup. Requests do not choose between shipped profiles.
- Current shipped profiles are limited to exa_mcp and duckduckgo_plus_fetch.
- Completed web_search_call output items follow normal Responses storage behavior when store=true.
- find_in_page depends on page text cached from an earlier open_page action in the same request, and that page cache is not persisted across requests or previous_response_id continuations.
- include=["web_search_call.action.sources"] controls source expansion for search results.
- Built-in MCP auto-provision covers the shipped helper defaults. Use --mcp-config or --responses-mcp-config when you also need extra MCP inventory or explicit overrides.