Extraction lets you define exactly what data you want from a page — as a JSON Schema — and ZapFetch handles fetching the content, running LLM inference, and returning a typed response that matches your schema. You describe the shape of the output; ZapFetch does the scraping and parsing. This is useful for pulling product details, article metadata, leaderboard data, or any other structured information that would otherwise require custom HTML parsing. Each call costs 1 credit.

Define your schema and extract

Send POST /v1/extract with the urls you want to read, a natural-language prompt describing what to extract, and a JSON Schema in the schema field. The following example extracts the top Hacker News stories with their points and authors.
curl -X POST https://api.zapfetch.com/v1/extract \
  -H "Authorization: Bearer $ZAPFETCH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://news.ycombinator.com"],
    "prompt": "Extract the top 5 story titles with their points and author.",
    "schema": {
      "type": "object",
      "properties": {
        "stories": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title":  { "type": "string" },
              "points": { "type": "integer" },
              "author": { "type": "string" }
            }
          }
        }
      }
    }
  }'
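The same call can be made from any language that speaks HTTP. Below is a minimal Python sketch using only the standard library; the `build_extract_payload` and `post_extract` helpers are our own names for illustration, not part of any official SDK, and the API key is assumed to live in the ZAPFETCH_KEY environment variable.

```python
import json
import os
import urllib.request

API_URL = "https://api.zapfetch.com/v1/extract"

def build_extract_payload(urls, prompt, schema):
    """Assemble the JSON body for POST /v1/extract."""
    return {"urls": urls, "prompt": prompt, "schema": schema}

def post_extract(payload, api_key):
    """Send the request and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_extract_payload(
    urls=["https://news.ycombinator.com"],
    prompt="Extract the top 5 story titles with their points and author.",
    schema={
        "type": "object",
        "properties": {
            "stories": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "points": {"type": "integer"},
                        "author": {"type": "string"},
                    },
                },
            }
        },
    },
)
# result = post_extract(payload, os.environ["ZAPFETCH_KEY"])  # performs a live call
```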

Key parameters

Parameter  Type      Description
urls       string[]  One or more URLs to fetch and extract from.
prompt     string    A natural-language description of what to extract.
schema     object    A JSON Schema defining the shape of the response.

Write your prompt as if you were instructing a human researcher. Be specific about counts (“top 5”), field names, and any edge cases — the more precise the prompt, the more consistent the output.
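Because the response is shaped by your schema, it is easy to sanity-check on the client side. The sketch below is a toy validator covering only the type, properties, and items keywords used in the example above; a real JSON Schema validator (such as the third-party jsonschema package) handles far more of the spec.

```python
def matches(data, schema):
    """Recursively check `data` against a tiny JSON Schema subset:
    only the type, properties, and items keywords."""
    t = schema.get("type")
    if t == "object":
        if not isinstance(data, dict):
            return False
        # Check each declared property that is present (all optional here).
        return all(
            key not in data or matches(data[key], sub)
            for key, sub in schema.get("properties", {}).items()
        )
    if t == "array":
        if not isinstance(data, list):
            return False
        item_schema = schema.get("items")
        return item_schema is None or all(matches(x, item_schema) for x in data)
    if t == "string":
        return isinstance(data, str)
    if t == "integer":
        # bool is a subclass of int in Python; exclude it explicitly.
        return isinstance(data, int) and not isinstance(data, bool)
    return True  # unknown or absent type: accept
```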

Credit cost

Outcome                       Cost
Successful extraction         1 credit
Failed call (error response)  0 credits

The 1-credit cost is flat regardless of how many urls you include or how complex your schema is. If you need to extract from many URLs, batching them into a single extract call is more credit-efficient than separate scrape requests followed by local parsing.
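The arithmetic behind batching is simple: cost tracks the number of calls, not the number of URLs. A sketch, with the caveat that this page does not state a maximum number of urls per call, so batch_size below is a placeholder for whatever limit applies:

```python
import math

def extract_credits(n_urls, batch_size):
    """Credits consumed: 1 per extract call, regardless of how many
    URLs each call carries. batch_size is an assumed client-side cap."""
    return math.ceil(n_urls / batch_size)

# 20 pages in one batched call vs. one call per page:
# extract_credits(20, batch_size=20) == 1
# extract_credits(20, batch_size=1)  == 20
```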
LLM extraction works best on pages with clear, structured content. Dynamic pages that require JavaScript interaction or pages behind login walls may return incomplete results. Test with a representative sample before relying on extraction at scale.

Next steps

  • Fetch raw page content with Scrape.
  • Extract from every page on a site by combining Crawl with extract calls.
  • Run a search and extract structured data from the results with Search.