Delta Telematics


ihyee

Web intelligence API. Frontier-grade search, clean extraction, JS rendering, and RAG-ready chunking — bounded in size, optimized for agentic AI workloads.

[Schematic: the ihyee single-API-call retrieval pipeline — search, fetch, JS-render, extract, chunk — producing RAG-ready context with source URL and timestamp metadata]

The problem

Generic search APIs ship URLs. Agents need usable content.

Most search APIs return a list of URLs. The agent then has to fetch each one, parse the HTML, decide whether the content is real or rendered client-side, strip boilerplate, and figure out which chunks are worth feeding to the model. Every step is somebody else's failure mode in disguise.

ihyee compresses that pipeline into one call. Search returns extracted page content, with optional summarization or RAG-style ranked chunks. JS-heavy pages get rendered through a headless browser when needed. Returned payload size is bounded so the agent's context budget is never blown by an unexpectedly large article.

How it works

Three endpoints, every advanced filter exposed.

POST · /v1/search

Frontier-grade search with extracted content per result.

Operators (site:, filetype:, intitle:, "exact"), date filters (before / after / date_restrict), phrase filters (must_have, intext, allintext, and_condition). Content modes: summary, markdown, full_text, both, rag.
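A minimal request sketch, assuming JSON field names. The operators, filter names, and content modes above come from this page; the exact body field names (`query`, `after`, `must_have`), the endpoint host, and the helper below are illustrative assumptions.

```python
import json
import urllib.request

# Hypothetical body for POST /v1/search. Only the operators, filters, and
# content modes are documented above; the JSON field names are assumed.
search_body = {
    # Search operators embed directly in the query string.
    "query": 'site:docs.example.com filetype:pdf intitle:changelog "release notes"',
    "after": "2024-06-01",        # date filter: pages published after this date
    "must_have": "deprecation",   # phrase filter: matched content must contain this
    "content_mode": "markdown",   # summary | markdown | full_text | both | rag
}

def post_search(body, api_key):
    """Send the search request with the single Bearer token (placeholder host)."""
    req = urllib.request.Request(
        "https://api.ihyee.example/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same Bearer-token pattern applies to all three endpoints.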

POST · /v1/fetch

Extract content from one or more known URLs.

Up to 10 URLs per call. Same content modes as search. Use when the agent already has a URL it needs the body of.
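A sketch of the fetch body under the same assumptions — the 10-URL cap and shared content modes are from this page, but the `urls` field name is a guess.

```python
# Hypothetical body for POST /v1/fetch: up to 10 known URLs per call,
# same content modes as /v1/search.
urls = [
    "https://example.com/pricing",
    "https://example.com/docs/quickstart",
]
assert len(urls) <= 10, "fetch accepts at most 10 URLs per call"

fetch_body = {
    "urls": urls,
    "content_mode": "full_text",   # same modes as /v1/search
}
```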

POST · /v1/render

Force a single URL through a headless browser.

For SPAs and JS-heavy pages where fetch returns thin content. Configurable wait strategy and selector.
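A render-body sketch. The page only says the wait strategy and selector are configurable; the field names and values below are illustrative assumptions.

```python
# Hypothetical body for POST /v1/render: one URL forced through a
# headless browser. Field names are assumed, not documented here.
render_body = {
    "url": "https://spa.example.com/dashboard",
    "wait_strategy": "networkidle",   # e.g. wait until network traffic settles
    "selector": "#content",           # e.g. wait for this element to appear
}
```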

RAG mode

Hard token budgets for agent context windows.

Set content_mode: "rag" with a max_total_tokens budget and ihyee returns BM25-ranked chunks of the matched content under a hard cap. Agents stop guessing about response size; it becomes a parameter.

Tunable: per-chunk token cap, per-result token budget, minimum relevance score, top-K chunks per result, tokenizer (cl100k, o200k, approximate). The default is sensible for most agent workloads; the knobs are there when you need them.
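The contract of a hard token budget can be sketched client-side: rank chunks, pack the best ones, never exceed the cap. ihyee does this server-side; the function and sample data below are only an illustration of the behavior, not its implementation.

```python
def pack_chunks(chunks, max_total_tokens):
    """Greedy illustration of a hard token budget: take chunks in
    descending relevance order, skipping any that would exceed the cap."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if used + chunk["tokens"] > max_total_tokens:
            continue
        selected.append(chunk)
        used += chunk["tokens"]
    return selected, used

# Invented scores and token counts, purely for demonstration.
chunks = [
    {"text": "A", "score": 0.9, "tokens": 400},
    {"text": "B", "score": 0.7, "tokens": 700},
    {"text": "C", "score": 0.5, "tokens": 300},
]
picked, used = pack_chunks(chunks, max_total_tokens=800)
# The top chunk (A) fits; B would blow the cap and is skipped; C still fits.
```

Whatever the knob settings, the invariant is the same: total returned tokens never exceed max_total_tokens.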

Where it fits

Built for agents, useful anywhere a clean web answer matters.

Agentic web research

Agents that need to ground their responses in current web content — sales-intelligence flows, regulatory tracking, market scans. One call returns ranked, extracted content with source URLs and timestamps; the agent never has to write its own crawler.

RAG over public corpora

For workloads where the knowledge base is the open web rather than an internal corpus. Hard token budgets and BM25-ranked chunks let the model see the most relevant material under a fixed context cap, with no surprise blow-outs from unexpectedly long pages.

Date-bounded competitive monitoring

before, after, and date_restrict filters combined with phrase operators let an agent run targeted recurring queries — "what changed in this competitor's pricing in the last 30 days" — and get back the actual extracted text, not just URLs to follow up on.
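A rolling 30-day query can be sketched like this, assuming an `after` field that accepts an ISO date (the filter names come from this page; the field shape and the competitor domain are placeholders).

```python
from datetime import date, timedelta

# Recurring "last 30 days" competitive query: the rolling cutoff is
# recomputed client-side on each run and passed as the date filter.
after = (date.today() - timedelta(days=30)).isoformat()

monitor_body = {
    "query": 'site:competitor.example intitle:pricing "per seat"',
    "after": after,                 # only pages newer than 30 days ago
    "content_mode": "full_text",    # get extracted text, not just URLs
}
```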

Content extraction from JS-heavy pages

For documentation portals, single-page applications, and any site where a plain HTTP fetch returns thin content, the headless-browser render endpoint produces the actual rendered DOM. The agent gets the same content a human reader sees.

Get started

Search, fetch, render. One Bearer token.

Tell us about your agent's research workload. We'll issue an API key, share the SDK, and you'll have it integrated within an afternoon.

Request an API key