Start a crawl job
Crawl a website starting from a seed URL up to a given depth and page limit
X-Api-Key <token>
In: header
url (string, required)
Seed URL to start crawling from.
Format: uri. Length: 1 <= length.

depth (integer, optional)
Maximum BFS depth from the seed URL (1 = seed only).
Default: 2. Range: 1 <= value <= 10.

limit (integer, optional)
Maximum number of pages to crawl.
Default: 50. Range: 1 <= value <= 10000.

formats (Formats, optional)
Output formats to extract per page.
Default: ["markdown"].

render_js (boolean, optional)
Use a headless browser to render each page (slower, but handles JS-heavy sites).
Default: false.

same_domain_only (boolean, optional)
Only follow links that stay on the same domain as the seed URL.
Default: true.

include_subdomains (boolean, optional)
When same_domain_only is true, also include subdomains of the seed domain.
Default: false.

proxy (optional, nullable)

wait_config (optional, nullable)
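
The documented defaults and ranges can be sketched as a small client-side helper that builds the JSON body before sending it. This is only an illustration: `build_crawl_payload` is a hypothetical name, not part of the API, and `proxy` and `wait_config` are left out because their shapes are not shown here.

```python
from typing import Any, Dict, List, Optional


def build_crawl_payload(
    url: str,
    depth: int = 2,
    limit: int = 50,
    formats: Optional[List[str]] = None,
    render_js: bool = False,
    same_domain_only: bool = True,
    include_subdomains: bool = False,
) -> Dict[str, Any]:
    """Build a crawl request body using the documented defaults and ranges.

    Hypothetical helper: the checks mirror the constraints listed above,
    so obvious mistakes are caught before the request is sent.
    """
    if len(url) < 1:
        raise ValueError("url must be a non-empty URI")
    if not 1 <= depth <= 10:
        raise ValueError("depth must be between 1 and 10")
    if not 1 <= limit <= 10000:
        raise ValueError("limit must be between 1 and 10000")
    return {
        "url": url,
        "depth": depth,
        "limit": limit,
        # Documented default output format is ["markdown"].
        "formats": list(formats) if formats is not None else ["markdown"],
        "render_js": render_js,
        "same_domain_only": same_domain_only,
        "include_subdomains": include_subdomains,
    }


# Example: crawl up to 200 pages, at most 3 levels deep.
payload = build_crawl_payload("http://example.com", depth=3, limit=200)
```

Validating locally is optional, since the server presumably enforces the same limits, but it avoids a round trip for values that are clearly out of range.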
curl -X POST "https://scraper.geonode.io/v1/crawl" \
  -H "Content-Type: application/json" \
  -d '{ "url": "http://example.com" }'

Response Body
application/json
{
  "job_id": "453bd7d7-5355-4d6d-a38e-d9e7eb218c3f",
  "url": "string",
  "status": "queued",
  "status_url": "string",
  "estimated_pages": 0
}
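
A minimal Python sketch of the same call, using only the standard library, might look like the following. The endpoint comes from the curl example and the response fields from the sample body. Sending the token as a raw `X-Api-Key` header value is an assumption based on the header description, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

CRAWL_ENDPOINT = "https://scraper.geonode.io/v1/crawl"  # taken from the curl example


def build_crawl_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that starts a crawl job."""
    return urllib.request.Request(
        CRAWL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API key is sent as the raw header value.
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


def parse_crawl_response(body: str) -> Dict[str, Any]:
    """Decode the JSON body and check that the fields shown in the sample are present."""
    job = json.loads(body)
    expected = ("job_id", "url", "status", "status_url", "estimated_pages")
    missing = [field for field in expected if field not in job]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return job


def start_crawl(api_key: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request and return the parsed job. Performs a real network call."""
    request = build_crawl_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_crawl_response(response.read().decode("utf-8"))
```

A caller would typically keep `job_id` and `status_url` from the returned dictionary, since the job starts in the `queued` state and has to be checked again later.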