Extraction

Processing Modes

The Extraction API supports two processing modes:

  • sync for immediate results
  • async for background processing

Use the processing_mode field to control how extraction requests are handled.

{
  "processing_mode": "sync"
}

If processing_mode is omitted, the API uses sync mode by default.
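As a minimal sketch of how this default plays out when building a request body, the helper below adds processing_mode only when a mode is passed. The name build_payload is hypothetical and not part of the API. The helper performs no JSON escaping, so it only suits plain URLs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the API): prints a JSON request body.
# $1 = target URL, $2 = optional processing mode ("sync" or "async").
# No JSON escaping is performed, so pass only plain URLs.
build_payload() {
  if [ -n "${2:-}" ]; then
    printf '{"url":"%s","processing_mode":"%s"}\n' "$1" "$2"
  else
    # Field omitted: per the text above, the API then uses sync mode.
    printf '{"url":"%s"}\n' "$1"
  fi
}

build_payload "https://example.com"          # no mode, API defaults to sync
build_payload "https://example.com" "async"  # explicit async
```

The printed body can be passed to curl with -d in place of the inline JSON shown in the request examples.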

Synchronous mode waits for extraction to complete before returning a response.

Request

request.sh
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "processing_mode": "sync"
  }'

Response

response.json
{
  "data": {
    "html": "<html>...</html>",
    "markdown": null
  },
  "metadata": {
    "url": "http://example.com/",
    "http_status": 200,
    "processing_mode": "sync"
  },
  "tokens_charged": 1
}

The request remains open until extraction is complete and the content is returned in the response.
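Because a synchronous request stays open until extraction finishes, a client may want an upper bound on how long it waits. The sketch below wraps the documented request in a hypothetical extract_sync function and uses curl's --max-time option. The 60-second default and the GEONODE_API_KEY environment variable name are assumptions for illustration, not documented values.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented sync request.
# $1 = target URL, $2 = optional client-side timeout in seconds.
# The 60-second default is an arbitrary assumption, not an API limit.
extract_sync() {
  curl -sS --max-time "${2:-60}" \
    -X POST "https://scraper.geonode.io/v1/extract" \
    -H "X-Api-Key: ${GEONODE_API_KEY:-YOUR_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$1\", \"processing_mode\": \"sync\"}"
}

# Usage (performs a real request, so it is left commented out):
# extract_sync "https://example.com" 30 > response.json
```

If the timeout is exceeded, curl exits with code 28, which a calling script can check to decide whether to retry or switch to async mode.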

Asynchronous mode immediately creates an extraction job and returns a job ID.

Request

request.sh
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://geonode.com/",
    "processing_mode": "async"
  }'

Response

response.json
{
  "job_id": "32561cfc-4d87-4a46-af4a-a10e5f3168b9",
  "status": "queued",
  "status_url": "/v1/extract/32561cfc-4d87-4a46-af4a-a10e5f3168b9",
  "estimated_tokens": 1
}

The extraction runs in the background, so your application does not have to wait for it to finish.

Use the returned job_id or status_url to retrieve the extraction result later.
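The status_url in the sample response is a relative path, so a client needs to prefix it with the API host before requesting it. The sketch below assumes the host is the same one used for POST requests and that the status endpoint accepts the same X-Api-Key header; neither is stated in this section. The function names are hypothetical.

```shell
#!/bin/sh
# Assumption: status_url is relative to the same host used for POST requests.
API_BASE="https://scraper.geonode.io"

# Hypothetical helper: turn a job ID into the full status URL.
job_status_url() {
  printf '%s/v1/extract/%s\n' "$API_BASE" "$1"
}

# Hypothetical helper: fetch the job and print the raw JSON body.
# The response schema is not shown in this section, so nothing is parsed.
get_job() {
  curl -sS "$(job_status_url "$1")" \
    -H "X-Api-Key: ${GEONODE_API_KEY:-YOUR_API_KEY}"
}

job_status_url "32561cfc-4d87-4a46-af4a-a10e5f3168b9"
# Usage (performs a real request, so it is left commented out):
# get_job "32561cfc-4d87-4a46-af4a-a10e5f3168b9"
```

Building the URL from job_id rather than concatenating status_url keeps the helper usable when only the job ID was stored.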

Choosing a Processing Mode

Mode     Best For
Sync     Interactive applications, quick extractions, and immediate results
Async    Background processing, large workloads, and long-running extractions

Success

You now know how to choose between synchronous and asynchronous extraction requests.

Synchronous mode returns the extracted content in the response once extraction completes, while asynchronous mode immediately returns a job ID that you can use to retrieve the result later.

Next Steps

Continue to Checking Job Status to learn how to retrieve the status and results of an asynchronous extraction job using GET /v1/extract/{job_id}.
