Checking Job Status
When you run an extraction in asynchronous mode, the API immediately returns a job ID instead of waiting for the extraction to finish.
You can use that job ID to check the current status of the extraction and retrieve the result once processing is complete.
How It Works
The asynchronous extraction workflow follows these steps:
- Start an extraction using processing_mode: "async".
- Receive a job_id.
- Poll the job status endpoint.
- Retrieve the extracted content when the job is completed.
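The steps above can be sketched as a single function. This is a minimal illustration, not an official client: `run_async_extraction`, `start_job`, and `get_job` are hypothetical names, the two callables stand in for whatever HTTP code you use to call `POST /v1/extract` and `GET /v1/extract/{job_id}`, and the polling interval and poll limit are assumed values rather than documented recommendations.

```python
import time
from typing import Any, Callable, Dict

JsonDict = Dict[str, Any]

# Statuses after which the job is assumed not to change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def run_async_extraction(
    start_job: Callable[[JsonDict], JsonDict],  # hypothetical: wraps POST /v1/extract
    get_job: Callable[[str], JsonDict],         # hypothetical: wraps GET /v1/extract/{job_id}
    url: str,
    poll_interval: float = 2.0,                 # assumed value
    max_polls: int = 30,                        # assumed safety limit
    sleep: Callable[[float], None] = time.sleep,
) -> JsonDict:
    """Start an async extraction, poll until a terminal status, and return the job."""
    # Steps 1 and 2: start the extraction and receive a job_id.
    created = start_job(
        {"url": url, "formats": ["markdown"], "processing_mode": "async"}
    )
    job_id = created["job_id"]

    # Step 3: poll the job status endpoint.
    for _ in range(max_polls):
        job = get_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            # Step 4: the caller reads job["data"] when status is "completed".
            return job
        sleep(poll_interval)

    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the HTTP calls in as functions keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.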
Get an Extraction Job
Use the following endpoint to retrieve the current status of an extraction job.
GET /v1/extract/{job_id}
Replace {job_id} with the value returned by your extraction request.
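If you call the endpoint from Python instead of the command line, the request can be built with the standard library. The sketch below only constructs the request and does not send it. `build_job_status_request` and `BASE_URL` are hypothetical names; the host and the `X-Api-Key` header are taken from the example request in this guide.

```python
import urllib.request
from urllib.parse import quote

# Host as it appears in this guide's example request.
BASE_URL = "https://scraper.geonode.io"


def build_job_status_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /v1/extract/{job_id}."""
    # Percent-encode the ID so an unexpected character cannot alter the path.
    url = f"{BASE_URL}/v1/extract/{quote(job_id, safe='')}"
    return urllib.request.Request(url, headers={"X-Api-Key": api_key}, method="GET")


req = build_job_status_request("4844831a-a222-4cac-b5e6-7e3f2dd07b48", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

To send it, pass the request to `urllib.request.urlopen` and decode the response body as JSON.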
Example Request
curl -X GET "https://scraper.geonode.io/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48" \
  -H "X-Api-Key: YOUR_API_KEY"
Response While Processing
A job may still be running when you check its status.
During this stage, the extracted content is not yet available.
{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "processing",
  "created_at": "2026-05-26T10:30:00Z",
  "completed_at": null,
  "data": null,
  "metadata": null,
  "error": null,
  "tokens_charged": null
}
What This Means
| Field | Description |
|---|---|
| status | Current state of the extraction job |
| created_at | Time the job was created |
| completed_at | null until processing finishes |
| data | Extracted content, available after completion |
| metadata | Extraction details, available after completion |
| error | Error information if the job fails |
| tokens_charged | Token usage after processing completes |
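The fields in the table can be loaded into a typed structure so the rest of your code does not have to handle raw dictionaries. This is a sketch under assumptions: `ExtractionJob` and `_parse_time` are hypothetical names, and because this guide does not show the shape of a populated `error` value, it is left untyped.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_time(value: Optional[str]) -> Optional[datetime]:
    # The examples end timestamps with "Z". Older Python versions reject that
    # suffix in fromisoformat, so swap it for an explicit UTC offset.
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class ExtractionJob:
    job_id: str
    status: str
    created_at: Optional[datetime]
    completed_at: Optional[datetime]    # None until processing finishes
    data: Optional[Dict[str, Any]]      # None until the job completes
    metadata: Optional[Dict[str, Any]]  # None until the job completes
    error: Optional[Any]                # shape not shown in this guide
    tokens_charged: Optional[int]

    @classmethod
    def from_json(cls, body: str) -> "ExtractionJob":
        """Parse the JSON body returned by GET /v1/extract/{job_id}."""
        raw = json.loads(body)
        return cls(
            job_id=raw["job_id"],
            status=raw["status"],
            created_at=_parse_time(raw.get("created_at")),
            completed_at=_parse_time(raw.get("completed_at")),
            data=raw.get("data"),
            metadata=raw.get("metadata"),
            error=raw.get("error"),
            tokens_charged=raw.get("tokens_charged"),
        )
```

With this in place, checking `job.completed_at is None` is one way to tell that a job is still in progress.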
Response After Completion
Once the extraction finishes successfully, the response includes the extracted content and metadata.
{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "completed",
  "created_at": "2026-05-26T10:30:00Z",
  "completed_at": "2026-05-26T10:30:04Z",
  "data": {
    "markdown": "# Example Page Content"
  },
  "metadata": {
    "url": "https://docs.python.org/3/library/json.html",
    "render_js": false,
    "http_status": 200,
    "duration_ms": 631,
    "formats": ["markdown"],
    "processing_mode": "async"
  },
  "error": null,
  "tokens_charged": 1
}
Job Status Values
The API can return the following job statuses.
| Status | Description |
|---|---|
| queued | The job has been accepted and is waiting to start |
| processing | The extraction is currently running |
| completed | The extraction finished successfully |
| failed | The extraction could not be completed |
| cancelled | The job was cancelled before completion |
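In client code, it can help to model these values as an enumeration so that a misspelled or unexpected status fails loudly. The sketch below uses a hypothetical `JobStatus` class. Treating completed, failed, and cancelled as final states is an assumption drawn from the descriptions in the table, not a documented guarantee.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

    @property
    def is_terminal(self) -> bool:
        """True once the job is assumed to be final, so polling can stop."""
        return self in (JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED)


# Looking up a value that is not in the table raises ValueError.
status = JobStatus("processing")
print(status.is_terminal)
```

A polling loop can then stop as soon as `is_terminal` is true and read `data` only when the status is `JobStatus.COMPLETED`.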
Starting an Async Extraction
To use this endpoint, first create an asynchronous extraction job.
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.python.org/3/library/json.html",
    "formats": ["markdown"],
    "processing_mode": "async"
  }'
Async Job Response
The extraction endpoint returns a job ID, together with a status_url path for the same job, that you can use for polling.
{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "queued",
  "status_url": "/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "estimated_tokens": 1
}
Polling for Results
A common pattern is to periodically check the job status until the extraction completes.
POST /v1/extract
↓
Receive job_id
↓
GET /v1/extract/{job_id}
↓
status = processing
↓
GET /v1/extract/{job_id}
↓
status = completed
↓
Read extracted content
Next Step
Now that you can monitor individual extraction jobs, the next guide explains how to view and filter multiple extraction jobs using the Jobs endpoint.