Delta Telematics


ihyee

Web intelligence API. Frontier-grade search, clean extraction, JS rendering, and RAG-ready chunking — bounded in size, optimized for agentic AI workloads.

[Schematic: the ihyee single-API-call retrieval pipeline — search, fetch, JS-render, extract, chunk — producing RAG-ready context with source URL and timestamp metadata]

The problem

Generic search APIs ship URLs. Agents need usable content.

Most search APIs return a list of URLs. The agent then has to fetch each one, parse the HTML, decide whether the content is real or rendered client-side, strip boilerplate, and figure out which chunks are worth feeding to the model. Every step is somebody else's failure mode in disguise.

ihyee compresses that pipeline into one call. Search returns extracted page content, with optional summarization or RAG-style ranked chunks. JS-heavy pages get rendered through a headless browser when needed. Returned payload size is bounded so the agent's context budget is never blown by an unexpectedly large article.

How it works

Three endpoints, every advanced filter exposed.

POST · /v1/search

Frontier-grade search with extracted content per result.

Operators (site:, filetype:, intitle:, "exact"), date filters (before / after / date_restrict), phrase filters (must_have, intext, allintext, and_condition). Content modes: summary, markdown, full_text, both, rag.
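A minimal request sketch, assuming JSON field names. The operators, filter names, and content modes above come from this page; the exact body field names (`query`, `after`, `must_have`), the endpoint host, and the helper below are illustrative assumptions.

```python
import json
import urllib.request

# Hypothetical body for POST /v1/search. Only the operators, filters, and
# content modes are documented above; the JSON field names are assumed.
search_body = {
    # Search operators embed directly in the query string.
    "query": 'site:docs.example.com filetype:pdf intitle:changelog "release notes"',
    "after": "2024-06-01",        # date filter: pages published after this date
    "must_have": "deprecation",   # phrase filter: matched content must contain this
    "content_mode": "markdown",   # summary | markdown | full_text | both | rag
}

def post_search(body, api_key):
    """Send the search request with the single Bearer token (placeholder host)."""
    req = urllib.request.Request(
        "https://api.ihyee.example/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same Bearer-token pattern applies to all three endpoints.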

POST · /v1/fetch

Extract content from one or more known URLs.

Up to 10 URLs per call. Same content modes as search. Use when the agent already has a URL it needs the body of.
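A sketch of the fetch body under the same assumptions — the 10-URL cap and shared content modes are from this page, but the `urls` field name is a guess.

```python
# Hypothetical body for POST /v1/fetch: up to 10 known URLs per call,
# same content modes as /v1/search.
urls = [
    "https://example.com/pricing",
    "https://example.com/docs/quickstart",
]
assert len(urls) <= 10, "fetch accepts at most 10 URLs per call"

fetch_body = {
    "urls": urls,
    "content_mode": "full_text",   # same modes as /v1/search
}
```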

POST · /v1/render

Force a single URL through a headless browser.

For SPAs and JS-heavy pages where fetch returns thin content. Configurable wait strategy and selector.
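A render-body sketch. The page only says the wait strategy and selector are configurable; the field names and values below are illustrative assumptions.

```python
# Hypothetical body for POST /v1/render: one URL forced through a
# headless browser. Field names are assumed, not documented here.
render_body = {
    "url": "https://spa.example.com/dashboard",
    "wait_strategy": "networkidle",   # e.g. wait until network traffic settles
    "selector": "#content",           # e.g. wait for this element to appear
}
```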

RAG mode

Hard token budgets for agent context windows.

Set content_mode: "rag" with a max_total_tokens budget and ihyee returns BM25-ranked chunks of the matched content under a hard cap. Agents stop guessing about response size; it becomes a parameter.

Tunable: per-chunk token cap, per-result token budget, minimum relevance score, top-K chunks per result, tokenizer (cl100k, o200k, approximate). The default is sensible for most agent workloads; the knobs are there when you need them.
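The contract of a hard token budget can be sketched client-side: rank chunks, pack the best ones, never exceed the cap. ihyee does this server-side; the function and sample data below are only an illustration of the behavior, not its implementation.

```python
def pack_chunks(chunks, max_total_tokens):
    """Greedy illustration of a hard token budget: take chunks in
    descending relevance order, skipping any that would exceed the cap."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if used + chunk["tokens"] > max_total_tokens:
            continue
        selected.append(chunk)
        used += chunk["tokens"]
    return selected, used

# Invented scores and token counts, purely for demonstration.
chunks = [
    {"text": "A", "score": 0.9, "tokens": 400},
    {"text": "B", "score": 0.7, "tokens": 700},
    {"text": "C", "score": 0.5, "tokens": 300},
]
picked, used = pack_chunks(chunks, max_total_tokens=800)
# The top chunk (A) fits; B would blow the cap and is skipped; C still fits.
```

Whatever the knob settings, the invariant is the same: total returned tokens never exceed max_total_tokens.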

Where it fits

Built for agents, useful anywhere a clean web answer matters.

Agentic web research

Agents that need to ground their responses in current web content — sales-intelligence flows, regulatory tracking, market scans. One call returns ranked, extracted content with source URLs and timestamps; the agent never has to write its own crawler.

RAG over public corpora

For workloads where the knowledge base is the open web rather than an internal corpus. Hard token budgets and BM25-ranked chunks let the model see the most relevant material under a fixed context cap, with no surprise blow-outs from unexpectedly long pages.

Date-bounded competitive monitoring

before, after, and date_restrict filters combined with phrase operators let an agent run targeted recurring queries — "what changed in this competitor's pricing in the last 30 days" — and get back the actual extracted text, not just URLs to follow up on.
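A rolling 30-day query can be sketched like this, assuming an `after` field that accepts an ISO date (the filter names come from this page; the field shape and the competitor domain are placeholders).

```python
from datetime import date, timedelta

# Recurring "last 30 days" competitive query: the rolling cutoff is
# recomputed client-side on each run and passed as the date filter.
after = (date.today() - timedelta(days=30)).isoformat()

monitor_body = {
    "query": 'site:competitor.example intitle:pricing "per seat"',
    "after": after,                 # only pages newer than 30 days ago
    "content_mode": "full_text",    # get extracted text, not just URLs
}
```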

Content extraction from JS-heavy pages

For documentation portals, single-page applications, and any site where a plain HTTP fetch returns thin content, the headless-browser render endpoint produces the actual rendered DOM. The agent gets the same content a human reader sees.

Get started

Search, fetch, render. One Bearer token.

Tell us about your agent's research workload. We'll issue an API key, share the SDK, and you'll have it integrated within an afternoon.

Request an API key