Checking Job Status

When you run an extraction in asynchronous mode, the API immediately returns a job ID instead of waiting for the extraction to finish.

You can use that job ID to check the current status of the extraction and retrieve the result once processing is complete.

How It Works

The asynchronous extraction workflow follows these steps:

1. Start an extraction using processing_mode: "async".
2. Receive a job_id.
3. Poll the job status endpoint.
4. Retrieve the extracted content when the job is completed.

Get an Extraction Job

Use the following endpoint to retrieve the current status of an extraction job.

GET /v1/extract/{job_id}

Replace {job_id} with the value returned by your extraction request.

Example Request

curl -X GET "https://scraper.geonode.io/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48" \
  -H "X-Api-Key: YOUR_API_KEY"
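The same request can be prepared from Python. This is a minimal sketch using only the standard library; the helper name `build_status_request` and the `BASE_URL` constant are illustrative and not part of the API. It builds the request without sending it.

```python
import urllib.request
from urllib.parse import quote

# Host taken from the curl example; the constant name is illustrative.
BASE_URL = "https://scraper.geonode.io"


def build_status_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a GET request for one extraction job."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    url = f"{BASE_URL}/v1/extract/{quote(job_id, safe='')}"
    return urllib.request.Request(url, headers={"X-Api-Key": api_key}, method="GET")


request = build_status_request("4844831a-a222-4cac-b5e6-7e3f2dd07b48", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request, timeout=30)` and decode the response body with `json.load`.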

Response While Processing

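If the job has not finished when you check it, the extracted content is not yet available. In that case data, metadata, completed_at, and tokens_charged are all null, as in this response for a job that is still processing.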

{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "processing",
  "created_at": "2026-05-26T10:30:00Z",
  "completed_at": null,
  "data": null,
  "metadata": null,
  "error": null,
  "tokens_charged": null
}

What This Means

Field	Description
`status`	Current state of the extraction job
`created_at`	Time the job was created
`completed_at`	`null` until processing finishes
`data`	Extracted content, available after completion
`metadata`	Extraction details, available after completion
`error`	Error information if the job fails; `null` otherwise
`tokens_charged`	Tokens charged for the job, set after processing completes

Response After Completion

Once the extraction finishes successfully, the response includes the extracted content and metadata.

{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "completed",
  "created_at": "2026-05-26T10:30:00Z",
  "completed_at": "2026-05-26T10:30:04Z",
  "data": {
    "markdown": "# Example Page Content"
  },
  "metadata": {
    "url": "https://docs.python.org/3/library/json.html",
    "render_js": false,
    "http_status": 200,
    "duration_ms": 631,
    "formats": ["markdown"],
    "processing_mode": "async"
  },
  "error": null,
  "tokens_charged": 1
}
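Reading the content out of a completed response might look like the sketch below. The helper name `extract_markdown` is hypothetical, and the code assumes the job requested the markdown format, as in the example above; other formats would presumably appear under different keys in data.

```python
import json
from typing import Optional


def extract_markdown(job: dict) -> Optional[str]:
    """Return the markdown content of a completed job, or None otherwise."""
    if job.get("status") != "completed":
        return None
    # "data" is null until completion, so guard against None as well.
    data = job.get("data") or {}
    return data.get("markdown")


# Trimmed copy of the completed response shown above.
completed = json.loads("""
{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "completed",
  "data": {"markdown": "# Example Page Content"},
  "error": null,
  "tokens_charged": 1
}
""")
print(extract_markdown(completed))
```

Returning None for unfinished jobs keeps the caller from mistaking a missing result for an empty page.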

Job Status Values

The API can return the following job statuses.

Status	Description
`queued`	The job has been accepted and is waiting to start
`processing`	The extraction is currently running
`completed`	The extraction finished successfully
`failed`	The extraction could not be completed
`cancelled`	The job was cancelled before completion
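When writing client code, it can help to separate statuses that are still in progress from those that will not change. The status strings below come from the table above; treating failed and cancelled as final is an assumption based on their descriptions, and the helper names are illustrative.

```python
# Status strings come from the table above; the grouping into
# "in progress" and "terminal" is an assumption of this sketch.
IN_PROGRESS_STATUSES = frozenset({"queued", "processing"})
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def should_keep_polling(status: str) -> bool:
    """Return True for in-progress statuses, False for terminal ones."""
    if status in IN_PROGRESS_STATUSES:
        return True
    if status in TERMINAL_STATUSES:
        return False
    # Fail loudly on values this sketch does not know about.
    raise ValueError(f"unexpected job status: {status!r}")


for status in ("queued", "processing", "completed", "failed", "cancelled"):
    print(status, should_keep_polling(status))
```

Raising on unknown values avoids polling forever if a status outside the table is ever returned.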

Starting an Async Extraction

To use this endpoint, first create an asynchronous extraction job.

curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.python.org/3/library/json.html",
    "formats": ["markdown"],
    "processing_mode": "async"
  }'
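A Python equivalent of this curl command could be prepared as follows. This is a sketch using the standard library only; `build_async_extract_request` and `EXTRACT_URL` are illustrative names. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example; the constant name is illustrative.
EXTRACT_URL = "https://scraper.geonode.io/v1/extract"


def build_async_extract_request(url: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a POST request that starts an async extraction."""
    payload = {"url": url, "formats": ["markdown"], "processing_mode": "async"}
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),  # JSON body as bytes
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


post_request = build_async_extract_request(
    "https://docs.python.org/3/library/json.html", "YOUR_API_KEY"
)
print(post_request.get_method(), post_request.full_url)
```

Sending it with `urllib.request.urlopen(post_request, timeout=30)` should yield a JSON body containing the job ID.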

Async Job Response

The extraction endpoint returns a job_id to use for polling, together with a status_url path for the same job.

{
  "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "status": "queued",
  "status_url": "/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48",
  "estimated_tokens": 1
}
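The status_url in this response is a path rather than a full URL. One way to turn it into a pollable URL is sketched below, assuming the path is relative to the same host that accepted the extraction request; `resolve_status_url` and `API_ROOT` are illustrative names.

```python
from urllib.parse import urljoin

# Assumed to be the host that accepted the POST request.
API_ROOT = "https://scraper.geonode.io"


def resolve_status_url(job_response: dict, api_root: str = API_ROOT) -> str:
    """Combine the API host with the relative status_url from the response."""
    return urljoin(api_root, job_response["status_url"])


job_response = {
    "job_id": "4844831a-a222-4cac-b5e6-7e3f2dd07b48",
    "status": "queued",
    "status_url": "/v1/extract/4844831a-a222-4cac-b5e6-7e3f2dd07b48",
    "estimated_tokens": 1,
}
print(resolve_status_url(job_response))
```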

Polling for Results

A common pattern is to check the job status at a regular interval until the job reaches a final status: completed, failed, or cancelled.

POST /v1/extract
        ↓
Receive job_id
        ↓
GET /v1/extract/{job_id}
        ↓
status = processing
        ↓
GET /v1/extract/{job_id}
        ↓
status = completed
        ↓
Read extracted content
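The flow above can be sketched as a small polling loop. This is not an official client: `poll_job` is a hypothetical helper, the interval and timeout defaults are arbitrary, and `fetch_status` stands for any function you supply that performs GET /v1/extract/{job_id} and returns the decoded JSON. The demo at the bottom uses canned responses instead of real HTTP calls.

```python
import time
from typing import Any, Callable, Dict

# Assumed to be final based on the status descriptions in this guide.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']!r} after {timeout_s}s")
        sleep(interval_s)  # wait before the next status check


# Demo with canned responses standing in for real HTTP calls.
canned = iter([
    {"status": "queued", "data": None},
    {"status": "processing", "data": None},
    {"status": "completed", "data": {"markdown": "# Example Page Content"}},
])
naps = []  # records each wait instead of actually sleeping
result = poll_job(lambda _job_id: next(canned), "demo-job", interval_s=0.5, sleep=naps.append)
print(result["status"], len(naps))
```

Passing `fetch_status`, `sleep`, and `clock` as parameters keeps the loop independent of any HTTP library and makes it testable without network access. The caller still needs to check whether the returned status is completed before reading data.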

Next Step

Now that you can monitor individual extraction jobs, the next guide explains how to view and filter multiple extraction jobs using the Jobs endpoint.
