Extraction lets you define exactly what data you want from a page — as a JSON Schema — and ZapFetch handles fetching the content, running LLM inference, and returning a typed response that matches your schema. You describe the shape of the output; ZapFetch does the scraping and parsing. This is useful for pulling product details, article metadata, leaderboard data, or any other structured information that would otherwise require custom HTML parsing. Each call costs 1 credit.

Define your schema and extract

Send POST /v1/extract with the urls you want to read, a natural-language prompt describing what to extract, and a JSON Schema in the schema field. The following example extracts the top Hacker News stories with their points and authors.
curl -X POST https://api.zapfetch.com/v1/extract \
  -H "Authorization: Bearer $ZAPFETCH_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://news.ycombinator.com"],
    "prompt": "Extract the top 5 story titles with their points and author.",
    "schema": {
      "type": "object",
      "properties": {
        "stories": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title":  { "type": "string" },
              "points": { "type": "integer" },
              "author": { "type": "string" }
            }
          }
        }
      }
    }
  }'
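The same call can be made from any language that speaks HTTP. Below is a minimal Python sketch using only the standard library; the `build_extract_payload` and `post_extract` helpers are our own names for illustration, not part of any official SDK, and the API key is assumed to live in the ZAPFETCH_KEY environment variable.

```python
import json
import os
import urllib.request

API_URL = "https://api.zapfetch.com/v1/extract"

def build_extract_payload(urls, prompt, schema):
    """Assemble the JSON body for POST /v1/extract."""
    return {"urls": urls, "prompt": prompt, "schema": schema}

def post_extract(payload, api_key):
    """Send the request and return the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_extract_payload(
    urls=["https://news.ycombinator.com"],
    prompt="Extract the top 5 story titles with their points and author.",
    schema={
        "type": "object",
        "properties": {
            "stories": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "points": {"type": "integer"},
                        "author": {"type": "string"},
                    },
                },
            }
        },
    },
)
# result = post_extract(payload, os.environ["ZAPFETCH_KEY"])  # performs a live call
```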

Key parameters

Parameter  Type      Description
urls       string[]  One or more URLs to fetch and extract from.
prompt     string    A natural-language description of what to extract.
schema     object    A JSON Schema defining the shape of the response.

Write your prompt as if you were instructing a human researcher. Be specific about counts (“top 5”), field names, and any edge cases — the more precise the prompt, the more consistent the output.
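Because the response is shaped by your schema, it is easy to sanity-check on the client side. The sketch below is a toy validator covering only the type, properties, and items keywords used in the example above; a real JSON Schema validator (such as the third-party jsonschema package) handles far more of the spec.

```python
def matches(data, schema):
    """Recursively check `data` against a tiny JSON Schema subset:
    only the type, properties, and items keywords."""
    t = schema.get("type")
    if t == "object":
        if not isinstance(data, dict):
            return False
        # Check each declared property that is present (all optional here).
        return all(
            key not in data or matches(data[key], sub)
            for key, sub in schema.get("properties", {}).items()
        )
    if t == "array":
        if not isinstance(data, list):
            return False
        item_schema = schema.get("items")
        return item_schema is None or all(matches(x, item_schema) for x in data)
    if t == "string":
        return isinstance(data, str)
    if t == "integer":
        # bool is a subclass of int in Python; exclude it explicitly.
        return isinstance(data, int) and not isinstance(data, bool)
    return True  # unknown or absent type: accept
```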

Credit cost

Outcome                       Cost
Successful extraction         1 credit
Failed call (error response)  0 credits

The 1-credit cost is flat regardless of how many urls you include or how complex your schema is. If you need to extract from many URLs, batching them into a single extract call is more credit-efficient than separate scrape requests followed by local parsing.
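The arithmetic behind batching is simple: cost tracks the number of calls, not the number of URLs. A sketch, with the caveat that this page does not state a maximum number of urls per call, so batch_size below is a placeholder for whatever limit applies:

```python
import math

def extract_credits(n_urls, batch_size):
    """Credits consumed: 1 per extract call, regardless of how many
    URLs each call carries. batch_size is an assumed client-side cap."""
    return math.ceil(n_urls / batch_size)

# 20 pages in one batched call vs. one call per page:
# extract_credits(20, batch_size=20) == 1
# extract_credits(20, batch_size=1)  == 20
```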
LLM extraction works best on pages with clear, structured content. Dynamic pages that require JavaScript interaction or pages behind login walls may return incomplete results. Test with a representative sample before relying on extraction at scale.

Next steps

  • Fetch raw page content with Scrape.
  • Extract from every page on a site by combining Crawl with extract calls.
  • Run a search and extract structured data from the results with Search.