Get Extraction Job
Poll one async extraction job and retrieve the result when it is complete.
GET /v1/extract/{job_id} returns the current status of an async extraction job. Use it after starting an extraction with processing_mode: "async".
Request
Replace the example job ID with the job_id returned by POST /v1/extract.
export SCRAPER_API_BASE_URL="https://scraper.geonode.io"
export GEONODE_SCRAPER_API_KEY="YOUR_API_KEY"
curl -X GET "$SCRAPER_API_BASE_URL/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48" \
-H "X-Api-Key: $GEONODE_SCRAPER_API_KEY"
Response While Running
A running job can return queued or processing. At this point, data, metadata, and tokens_charged are usually null because extraction has not finished yet.
{
"job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
"status": "processing",
"created_at": "2026-05-26T10:30:00Z",
"completed_at": null,
"data": null,
"metadata": null,
"error": null,
"tokens_charged": null
}
Response After Completion
A completed job includes data, metadata, and tokens_charged.
{
"job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
"status": "completed",
"created_at": "2026-05-26T10:30:00Z",
"completed_at": "2026-05-26T10:30:04Z",
"data": {
"markdown": "...",
"html": null,
"links": null
},
"metadata": {
"url": "https://docs.python.org/3/library/json.html",
"render_js": false,
"http_status": 200,
"duration_ms": 631,
"retry_count": 0,
"formats": ["markdown"],
"processing_mode": "async"
},
"error": null,
"tokens_charged": 1
}
Job statuses are queued, processing, completed, failed, and cancelled.
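A polling loop built on the statuses above might look like the following bash sketch. It rests on several assumptions not stated in this page: that completed, failed, and cancelled are terminal (inferred from the status list), that jq is installed for JSON parsing, and that the function names plus the POLL_INTERVAL_SECONDS and POLL_MAX_ATTEMPTS variables are hypothetical helpers with arbitrary defaults. It reuses SCRAPER_API_BASE_URL and GEONODE_SCRAPER_API_KEY from the Request section.

```shell
# Sketch only: helper names, interval, and attempt limit are illustrative.

# Return 0 when the status is assumed terminal (the job will not change again).
is_terminal_status() {
  case "$1" in
    completed|failed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch the job once and print its status field.
# Assumes jq is available; returns non-zero if the HTTP request fails.
fetch_job_status() {
  local body
  body="$(curl -sS --fail -X GET "$SCRAPER_API_BASE_URL/v1/extract/$1" \
    -H "X-Api-Key: $GEONODE_SCRAPER_API_KEY")" || return 1
  printf '%s' "$body" | jq -r '.status'
}

# Poll until the job reaches a terminal status or the attempt budget runs out.
# Prints the final status on success; returns non-zero on timeout or request failure.
poll_job() {
  local job_id="$1"
  local interval="${POLL_INTERVAL_SECONDS:-2}"   # hypothetical default
  local max_attempts="${POLL_MAX_ATTEMPTS:-30}"  # hypothetical default
  local attempt=0 status
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$(fetch_job_status "$job_id")" || return 1
    if is_terminal_status "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "timed out waiting for job $job_id" >&2
  return 1
}
```

A call such as poll_job "4844831a-a222-4cac-b5e6-7e3f2dd07b48" would print the final status once the job stops running. Only a completed job carries data, so check the printed status before requesting the job again to read the result.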
Starting an Async Job
To create an async extraction job, send processing_mode: "async" to POST /v1/extract.
curl -X POST "$SCRAPER_API_BASE_URL/v1/extract" \
-H "X-Api-Key: $GEONODE_SCRAPER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://docs.python.org/3/library/json.html",
"formats": ["markdown"],
"render_js": false,
"processing_mode": "async"
}'
The API returns 202 with a job ID.
{
"job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
"status": "queued",
"status_url": "/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48",
"estimated_tokens": 1
}
The status_url can be relative. Prefix it with https://scraper.geonode.io when you call it directly.
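The prefixing rule can be sketched as a small bash helper. The function name resolve_status_url is hypothetical, and passing an already absolute URL through unchanged is an assumption for the case where status_url is not relative.

```shell
# Sketch only: resolve_status_url is an illustrative helper, not part of the API.
# Usage: resolve_status_url BASE_URL STATUS_URL
resolve_status_url() {
  local base_url="${1%/}"   # drop one trailing slash so the join has no "//"
  local status_url="$2"
  case "$status_url" in
    http://*|https://*) printf '%s\n' "$status_url" ;;            # already absolute
    /*)                 printf '%s\n' "$base_url$status_url" ;;   # relative path from the response
    *)                  printf '%s\n' "$base_url/$status_url" ;;  # path without a leading slash
  esac
}
```

With the response above, resolve_status_url "$SCRAPER_API_BASE_URL" "/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48" yields the same URL used in the Request section.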