FAQs

What is the Scraper API?

The Scraper API is a hosted extraction API. You send it a URL, and it returns clean Markdown or HTML from the target page. It also takes care of proxy routing, geo-targeting, JavaScript rendering, and anti-bot handling.

What is one credit?

One credit means one Scraper API request. In these docs, you'll usually see the word request instead of credit because the dashboard and pricing model are request-based.

One successful page extraction uses one request. Batch jobs count each successfully extracted URL as one request, and crawl jobs count each successfully extracted page as one request.

There are no extra request multipliers for JavaScript rendering, proxy type, geo-targeting, or requesting both Markdown and HTML.
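The counting rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic. The ExtractionResult record and its field names are hypothetical and are not part of the API response schema.

```python
from dataclasses import dataclass


@dataclass
class ExtractionResult:
    """Hypothetical record of one URL processed in a batch or crawl job."""
    url: str
    success: bool
    render_js: bool = False


def requests_used(results):
    """Count consumed requests: one per successfully extracted URL.

    render_js is deliberately ignored because it adds no multiplier.
    """
    return sum(1 for result in results if result.success)


batch = [
    ExtractionResult("https://example.com/a", success=True),
    ExtractionResult("https://example.com/b", success=True, render_js=True),
    ExtractionResult("https://example.com/c", success=False),
]
print(requests_used(batch))  # 2
```

The failed URL is not counted, and the JavaScript-rendered page counts the same as the plain one.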

What happens when I run out of free requests?

When you use all free requests for the billing month, new extraction requests stop working until you add paid requests, upgrade to a subscription, or wait for the next monthly renewal.

The API can return HTTP status 402 (Payment Required) when your account does not have enough available requests.
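One way to surface this condition in client code is to turn the 402 status into a dedicated exception. This is a minimal sketch. OutOfRequestsError and raise_for_balance are hypothetical helper names, and the status code would come from whichever HTTP client you use.

```python
class OutOfRequestsError(Exception):
    """Raised when the API reports that the account has no requests left."""


def raise_for_balance(status_code):
    """Raise OutOfRequestsError on HTTP 402; otherwise return the status unchanged."""
    if status_code == 402:
        raise OutOfRequestsError(
            "No available requests: add paid requests, upgrade to a "
            "subscription, or wait for the next monthly renewal."
        )
    return status_code
```

Treating 402 separately from other errors lets a job pause instead of retrying requests that cannot succeed until the balance changes.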

Do free requests roll over?

No. Free tier requests renew every billing month and expire at the end of that month. Unused free requests do not roll over.

Do subscription requests roll over?

Yes. Unused subscription requests roll over to the next month with no rollover cap while your subscription remains active.

Do Pay as you Go requests expire?

No. Pay as you Go requests do not expire. They are a good fit when your scraping workload is occasional or bursty.

Does JavaScript rendering cost more?

No. JavaScript rendering does not cost extra requests. A successful extraction with render_js: true still uses one request.

JavaScript rendering can take longer, so it is best to start with render_js: false and enable it only when the returned content is incomplete.
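The "start without rendering, enable it only if needed" advice can be sketched as a small fallback helper. The extract callable is a hypothetical stand-in for your own function that sends the request body to the Scraper API and returns the extracted text. The looks_complete check is also yours to define, for example a minimum length or the presence of an expected heading.

```python
def extract_with_fallback(extract, url, looks_complete):
    """Try render_js false first and retry with render_js true if needed.

    extract: hypothetical callable that takes a request body dict and
        returns the extracted text.
    looks_complete: caller-supplied function that returns True when the
        text contains what you expected.
    Returns a (content, used_render_js) tuple.
    """
    body = {"url": url, "formats": ["markdown"], "render_js": False}
    content = extract(body)
    if looks_complete(content):
        return content, False
    # Build a new body instead of mutating the first one.
    return extract({**body, "render_js": True}), True
```

Since each successful extraction uses one request, the fallback path can use two requests for one page.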

How is this different from using proxies directly?

When you use proxies directly, you still have to write the scraper, manage retries, choose when to run a browser, parse HTML, handle noisy pages, and normalize output.

The Scraper API sits above that work. You send a URL and get extracted Markdown or HTML back. It is a better fit when you want page content quickly and do not want to maintain browser or proxy orchestration yourself.

Direct proxies are still useful when you need full control over the browser, request flow, cookies, sessions, or a custom scraper pipeline.

What sites can I scrape?

You can use the Scraper API with public webpages that your account is allowed to access. It works well with static pages, documentation pages, articles, product pages, and many JavaScript-rendered pages.

Complex sites can still return noisy output. Some sites include large navigation menus, tracking links, ads, images, recommendations, cookie banners, or challenge pages in the extracted content. In those cases, inspect the returned Markdown or HTML and add post-processing if your workflow needs cleaner fields.

Always follow the target site's terms, applicable laws, and your own compliance requirements.

Why is my result missing content?

The most common reason is that the page loads content with JavaScript after the initial HTML response. Try the same request with render_js: true.

{
  "url": "https://quotes.toscrape.com/js/",
  "formats": ["markdown"],
  "render_js": true
}

If the result is still incomplete, the site may require user interaction, a longer wait condition, a login, or a site-specific scraping strategy.

Why is the output noisy?

The API extracts raw page content. Many modern sites include navigation, filters, image data, tracking links, cookie banners, recommendations, and footer content in the page itself. The API may return some of that content because it exists in the source page.

For LLM, search, or analytics workflows, it is a good idea to test a few pages from the same target site and add your own cleanup step if needed.
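A cleanup step of that kind might look like the following sketch. The patterns are illustrative assumptions, not rules used by the API, and they will need tuning for each target site.

```python
import re

# Illustrative patterns only; adjust them for the sites you scrape.
IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
NOISE_LINE_RE = re.compile(
    r"accept (all )?cookies|cookie (policy|settings)", re.IGNORECASE
)


def clean_markdown(markdown):
    """Drop Markdown images and cookie-banner lines, then collapse blank runs."""
    kept = []
    for line in markdown.splitlines():
        line = IMAGE_RE.sub("", line)
        if NOISE_LINE_RE.search(line):
            continue  # skip lines that look like cookie-banner text
        kept.append(line.rstrip())
    text = "\n".join(kept)
    # Collapse three or more consecutive newlines into one blank line.
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

Running it on a few sample pages from the same site is a quick way to check that the patterns remove noise without dropping real content.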

Can I choose the country used for scraping?

Yes. Pass proxy.country with a two-letter ISO country code.

{
  "proxy": {
    "country": "US",
    "type": "residential"
  }
}

If you omit proxy, the API applies default residential proxy routing and tries to infer a useful country from the target URL when possible.

Which proxy types are supported?

The extraction endpoint supports three proxy types: residential, datacenter, and mix.

{
  "proxy": {
    "country": "US",
    "type": "residential"
  }
}
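A small helper can validate the proxy object before it is sent. This is a sketch under assumptions: build_proxy is a hypothetical name, the uppercase normalization simply mirrors the "US" example above, and the API may apply its own validation rules.

```python
SUPPORTED_PROXY_TYPES = {"residential", "datacenter", "mix"}


def build_proxy(country, proxy_type="residential"):
    """Return a proxy object for the request body, or raise ValueError."""
    code = country.strip().upper()
    # Only checks the shape of the code, not that the country exists.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"Expected a two-letter ISO country code, got {country!r}")
    if proxy_type not in SUPPORTED_PROXY_TYPES:
        raise ValueError(f"Unsupported proxy type: {proxy_type!r}")
    return {"country": code, "type": proxy_type}
```

For example, build_proxy("us", "datacenter") returns {"country": "US", "type": "datacenter"}, and an unknown type such as "mobile" raises ValueError before any request is made.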

Can I pass custom headers?

Yes. Use the headers object for headers that should be included in the target extraction request.

{
  "url": "https://example.com",
  "formats": ["markdown"],
  "headers": {
    "Accept-Language": "en-US,en;q=0.9"
  }
}

Do not send your Geonode API key inside this object. Your API key belongs in the X-Api-Key header sent to the Scraper API.
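The split between the two kinds of headers can be made explicit in code. In this sketch, build_extraction_request is a hypothetical helper, and the Content-Type value is an assumption based on the JSON request bodies shown in this FAQ.

```python
def build_extraction_request(api_key, url, target_headers=None):
    """Return (http_headers, body) with the API key kept out of the body.

    http_headers go to the Scraper API itself.
    body["headers"] is forwarded with the target extraction request.
    """
    target_headers = dict(target_headers or {})
    if any(name.lower() == "x-api-key" for name in target_headers):
        raise ValueError(
            "Send the API key in the X-Api-Key request header, "
            "not inside the headers object."
        )
    # Content-Type is an assumption; the source only names X-Api-Key.
    http_headers = {"X-Api-Key": api_key, "Content-Type": "application/json"}
    body = {"url": url, "formats": ["markdown"]}
    if target_headers:
        body["headers"] = target_headers
    return http_headers, body
```

The guard catches the mistake early, before the key is sent to the target site.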

The public extraction schema supports custom headers, but it does not provide a full browser session management interface for logging in, clicking through flows, or maintaining user state across many pages. You can still use the headers field to send authentication cookies or tokens with each extraction request.

If your use case requires authenticated scraping beyond what custom headers can provide, contact support so the team can recommend the right setup.

Should I use sync or async mode?

Use sync mode for quick pages when you want the result in the same HTTP response.

Use async mode for slower pages, JavaScript-heavy pages, and workflows where you want to start the extraction and poll for the result later.
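The "start the extraction and poll later" pattern can be sketched as a generic polling loop. Everything specific here is an assumption: fetch_job is a hypothetical callable that returns the job as a dict, and the "status" key with the values "completed" and "failed" is a placeholder. Check the API reference for the real field names.

```python
import time


def poll_until_done(fetch_job, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_job until the job reaches a final state or attempts run out.

    fetch_job: hypothetical callable returning a dict with a "status" key.
    The status values checked below are placeholders, not documented values.
    """
    for attempt in range(max_attempts):
        job = fetch_job()
        if job.get("status") in ("completed", "failed"):
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError("Job did not finish within the polling budget.")
```

Passing sleep as a parameter keeps the loop easy to test without real delays.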

What is the map endpoint for?

The map endpoint discovers URLs under a base URL by reading sitemaps and links from the seed page.

{
  "url": "https://quotes.toscrape.com/",
  "search": "author",
  "include_subdomains": false,
  "ignore_query_parameters": true
}

The search field filters URLs the API already discovered. It does not run a Google search or query an external search engine.
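To make the idea concrete, here is a local analogy of filtering a list of already-discovered URLs. It is not the API's implementation. The case-insensitive substring match and the deduplication after stripping query strings are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string (the fragment is dropped too)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def filter_discovered(urls, search, ignore_query_parameters=True):
    """Keep discovered URLs containing the search term, without duplicates.

    Substring matching is an assumption; the API's matching rule may differ.
    """
    seen = set()
    matches = []
    for url in urls:
        candidate = strip_query(url) if ignore_query_parameters else url
        if search.lower() in candidate.lower() and candidate not in seen:
            seen.add(candidate)
            matches.append(candidate)
    return matches
```

The filter only narrows a list that already exists, so it never finds pages that discovery missed.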

Where can I see my usage?

Open the Geonode dashboard and check your Scraper API request balance. Use the dashboard UI as the source of truth for your current balance and plan details.

Where can I find the full API reference?

Read the full Scraper API reference here:

Open the Scraper API Reference
