Webhooks

Create a Webhook

Register a webhook subscription for Scraper API completion events.

POST /v1/webhooks registers a new webhook endpoint. Use webhooks when your application needs a callback after asynchronous extract, batch, or crawl work completes.

The callback URL must be an absolute URI at which your application can receive HTTP callbacks. You can register up to 10 webhooks per account.

Event Types

Choose one event type for each webhook subscription.

| Event type | When it is used |
|---|---|
| extract_completed | An asynchronous extraction job completes. |
| batch_completed | A batch extraction job completes. |
| crawl_completed | A crawl job completes. |

Request

The example below creates a webhook for batch completion events.

export SCRAPER_API_BASE_URL="https://scraper.geonode.io"
export GEONODE_SCRAPER_API_KEY="YOUR_API_KEY"

curl -X POST "$SCRAPER_API_BASE_URL/v1/webhooks" \
  -H "X-Api-Key: $GEONODE_SCRAPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/webhooks/geonode-scraper",
    "event_type": "batch_completed",
    "description": "Production batch completion webhook"
  }'
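
The same request can be built in Python with only the standard library. This is a minimal sketch, not an official client: the helper name build_create_webhook_request is hypothetical, and the request is constructed but not sent so you can inspect it first.

```python
import json
import urllib.request


def build_create_webhook_request(base_url, api_key, callback_url, event_type, description=None):
    """Build (but do not send) a POST /v1/webhooks request.

    Hypothetical helper for illustration; it mirrors the curl example above.
    """
    payload = {"url": callback_url, "event_type": event_type}
    if description is not None:
        payload["description"] = description
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/webhooks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_webhook_request(
    "https://scraper.geonode.io",
    "YOUR_API_KEY",
    "https://example.com/webhooks/geonode-scraper",
    "batch_completed",
    "Production batch completion webhook",
)
# To send it, pass the object to urllib.request.urlopen(request).
```

Building the request separately from sending it makes the headers and body easy to check in a unit test without network access.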

Webhook Delivery

When a registered event occurs, the Scraper API sends an HTTP POST to your callback URL with a JSON envelope. Every delivery includes these HTTP headers:

| Header | Description |
|---|---|
| Content-Type | application/json |
| User-Agent | scraper-api-webhook/1.0 |
| X-Webhook-Signature | HMAC-SHA256 signature for verification (see below). |
| X-Webhook-Delivery-Id | Unique ID for this delivery attempt. |
| X-Webhook-Event | The event type that triggered the delivery. |

Delivery Envelope

Every delivery body is a JSON object with this envelope:

{
  "event": "batch_completed",
  "webhook_id": "2a936d3b-5a5d-47a0-b68d-5df0d8b8327c",
  "delivery_id": "395cf7e2-41e0-4e44-85f1-5f00f537a44f",
  "timestamp": "2026-05-27T08:16:09Z",
  "data": { ... }
}

The data field is event-specific:

| Event type | data shape |
|---|---|
| extract_completed | Full job result matching the GET /v1/extract/{job_id} response (job_id, status, created_at, completed_at, data with markdown/html/links, metadata, error, tokens_charged). |
| batch_completed | { "job_id": "...", "status": "completed", "counts": { "total_urls": N, "completed_urls": N, "failed_urls": N, "cancelled_urls": N, "pending_urls": N }, "status_url": "/v1/batch/{job_id}" } |
| crawl_completed | { "job_id": "...", "status": "completed", "counts": { "total_pages": N, "completed_pages": N, "failed_pages": N, "cancelled_pages": N }, "status_url": "/v1/crawl/{job_id}" } |
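
Because the data field differs by event type, a receiver typically branches on the event field of the envelope. The sketch below shows one way to do that; the function name summarize_delivery and the job ID job-123 are made up for illustration, and only fields listed above are read.

```python
import json


def summarize_delivery(raw_body: bytes) -> str:
    """Parse a delivery envelope and return a one-line summary per event type."""
    envelope = json.loads(raw_body)
    event = envelope["event"]
    data = envelope["data"]
    if event == "extract_completed":
        return f"extract {data['job_id']}: {data['status']}"
    if event == "batch_completed":
        counts = data["counts"]
        return f"batch {data['job_id']}: {counts['completed_urls']}/{counts['total_urls']} URLs completed"
    if event == "crawl_completed":
        counts = data["counts"]
        return f"crawl {data['job_id']}: {counts['completed_pages']}/{counts['total_pages']} pages completed"
    raise ValueError(f"Unexpected event type: {event}")


sample = json.dumps({
    "event": "batch_completed",
    "webhook_id": "2a936d3b-5a5d-47a0-b68d-5df0d8b8327c",
    "delivery_id": "395cf7e2-41e0-4e44-85f1-5f00f537a44f",
    "timestamp": "2026-05-27T08:16:09Z",
    "data": {
        "job_id": "job-123",  # Made-up ID for illustration
        "status": "completed",
        "counts": {"total_urls": 3, "completed_urls": 2, "failed_urls": 1,
                   "cancelled_urls": 0, "pending_urls": 0},
        "status_url": "/v1/batch/job-123",
    },
}).encode()

print(summarize_delivery(sample))  # prints "batch job-123: 2/3 URLs completed"
```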

Signature Verification

Every delivery includes an X-Webhook-Signature header with an HMAC-SHA256 signature computed over the raw request body using your webhook secret. The signature value is prefixed with sha256=.

Verify this signature in your receiver to confirm the delivery originated from the Scraper API and was not tampered with in transit:

import hashlib
import hmac

# `request` is your web framework's incoming request object.
secret = "whsec_example"    # The secret returned when you created the webhook
body = request.body          # Raw bytes, not the parsed dict
header = request.headers.get("X-Webhook-Signature", "")

expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
# Compare as bytes in constant time; compare_digest raises TypeError on non-ASCII str input.
if not hmac.compare_digest(expected.encode(), header.encode()):
    raise ValueError("Invalid signature")
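
To exercise a receiver locally before real deliveries arrive, you can sign a test payload the same way. The sketch below assumes only what is described above (HMAC-SHA256 over the raw body, hex-encoded, prefixed with sha256=); the helper names sign_body and is_valid_signature are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Return 'sha256=' followed by the hex HMAC-SHA256 of the raw body."""
    return "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, body: bytes, header: str) -> bool:
    """Constant-time check of a received signature header against the raw body."""
    # Encoding both sides means a malformed header yields False rather than a TypeError.
    return hmac.compare_digest(sign_body(secret, body).encode(), header.encode())


test_body = b'{"event": "batch_completed"}'
test_header = sign_body("whsec_example", test_body)

print(is_valid_signature("whsec_example", test_body, test_header))         # True
print(is_valid_signature("whsec_example", test_body + b" ", test_header))  # False: body changed
```

The second call fails because the signature covers the exact bytes received, which is why the receiver must hash the raw body rather than a re-serialized JSON object.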

Retry Behavior

The Scraper API retries failed deliveries with exponential backoff. A delivery is retried when:

  • Your receiver returns HTTP 429 or a 5xx status code.
  • The connection times out or a network error occurs.

The API does not retry on client errors (4xx other than 429). Those deliveries are immediately abandoned.

Up to 5 retry attempts are made, with the following default delays between attempts (configurable per deployment):

| Attempt | Delay before next retry |
|---|---|
| 1st retry | immediate |
| 2nd retry | 1 minute |
| 3rd retry | 5 minutes |
| 4th retry | 30 minutes |
| 5th retry | 2 hours |

After the final retry fails, the delivery is abandoned. Delivery records in GET /v1/webhooks/{webhook_id}/deliveries show the attempt_number and current status for each delivery.
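
The retry rules can be modeled in a few lines, which helps when deciding which status code a receiver should return. This is a sketch of the documented defaults only, not the API's implementation: it reads each table row as the delay before that retry, ignores the time each attempt itself takes, and uses hypothetical names throughout.

```python
# Default delay before each retry, in seconds (1st through 5th retry).
DEFAULT_RETRY_DELAYS = [0, 60, 5 * 60, 30 * 60, 2 * 60 * 60]


def triggers_retry(status_code: int) -> bool:
    """True for the documented retry triggers: HTTP 429 or any 5xx status."""
    return status_code == 429 or 500 <= status_code <= 599


def seconds_until_retry(retry_number: int) -> int:
    """Cumulative default delay from the first failure to the given retry (1-based)."""
    return sum(DEFAULT_RETRY_DELAYS[:retry_number])


print(triggers_retry(503))     # True: server errors are retried
print(triggers_retry(404))     # False: other 4xx responses are abandoned
print(seconds_until_retry(5))  # 9360 seconds, about 2 hours 36 minutes
```

Under these assumptions, a receiver that is down for longer than roughly two and a half hours would miss the delivery, so returning a 4xx status for a temporary fault is worth avoiding.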

Timeout

Each outbound webhook POST times out after 10 seconds (configurable per deployment). If your receiver needs more time, acknowledge the webhook quickly with a 2xx status and process the payload asynchronously.
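
One common way to acknowledge quickly is to push the raw delivery onto an in-process queue and let a background worker do the slow part. The sketch below uses only the standard library; the names jobs, processed, worker, and handle_webhook are hypothetical, and a production receiver would more likely use a durable queue.

```python
import json
import queue
import threading

jobs = queue.Queue()  # Deliveries waiting to be processed
processed = []        # Stand-in for real work, such as fetching job results


def worker() -> None:
    """Drain the queue in the background so the HTTP handler never blocks."""
    while True:
        raw_body = jobs.get()
        try:
            envelope = json.loads(raw_body)
            processed.append(envelope["delivery_id"])  # Replace with your own processing
        except (ValueError, KeyError, TypeError):
            pass  # Malformed payload; a real receiver would log it
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_webhook(raw_body: bytes) -> int:
    """Enqueue the delivery and return the status code to send back immediately."""
    jobs.put(raw_body)
    return 202  # Any 2xx status acknowledges the delivery


status = handle_webhook(b'{"delivery_id": "395cf7e2-41e0-4e44-85f1-5f00f537a44f"}')
jobs.join()  # Only for this demo: wait until the worker has finished
```

In a real handler you would verify the signature before enqueueing, then return the status without waiting on the queue.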

Request Body

The create body describes where the API should send the callback and which event should trigger it.

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Absolute webhook callback URL. |
| event_type | string | Yes | One of extract_completed, batch_completed, or crawl_completed. |
| description | string or null | No | Optional label to help you identify the webhook later. |
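
A client can check a body against this table before sending it. The following is a rough sketch with a hypothetical helper name; it approximates "absolute" as having both a scheme and a host, and the server remains the authority on what is accepted.

```python
from urllib.parse import urlparse

EVENT_TYPES = {"extract_completed", "batch_completed", "crawl_completed"}


def validate_create_body(body: dict) -> list:
    """Return a list of problems found in a create-webhook body (empty if none)."""
    problems = []
    url = body.get("url")
    parsed = urlparse(url) if isinstance(url, str) else None
    if parsed is None or not (parsed.scheme and parsed.netloc):
        problems.append("url must be an absolute URL")
    event_type = body.get("event_type")
    if not isinstance(event_type, str) or event_type not in EVENT_TYPES:
        problems.append("event_type must be one of: " + ", ".join(sorted(EVENT_TYPES)))
    description = body.get("description")
    if description is not None and not isinstance(description, str):
        problems.append("description must be a string or null")
    return problems


print(validate_create_body({
    "url": "https://example.com/webhooks/geonode-scraper",
    "event_type": "batch_completed",
}))  # prints []
```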

Response

A successful create request returns 201 and includes the generated secret. Save the secret when you create the webhook, because it is not shown in normal webhook lookup responses.

{
  "id": "2a936d3b-5a5d-47a0-b68d-5df0d8b8327c",
  "url": "https://example.com/webhooks/geonode-scraper",
  "description": "Production batch completion webhook",
  "is_active": true,
  "event_type": "batch_completed",
  "created_at": "2026-05-27T08:15:30Z",
  "updated_at": "2026-05-27T08:15:30Z",
  "secret": "whsec_example"
}
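
Since the secret appears only in this response, it is worth capturing it at the moment of creation. The sketch below parses the response into a small record; the CreatedWebhook class and parse_create_response function are hypothetical names, and where you store the secret afterwards is up to your application.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CreatedWebhook:
    id: str
    url: str
    event_type: str
    is_active: bool
    secret: str  # Needed later to verify X-Webhook-Signature


def parse_create_response(raw: bytes) -> CreatedWebhook:
    """Extract the fields a receiver needs from a 201 create response."""
    doc = json.loads(raw)
    return CreatedWebhook(
        id=doc["id"],
        url=doc["url"],
        event_type=doc["event_type"],
        is_active=doc["is_active"],
        secret=doc["secret"],
    )


created = parse_create_response(b"""{
  "id": "2a936d3b-5a5d-47a0-b68d-5df0d8b8327c",
  "url": "https://example.com/webhooks/geonode-scraper",
  "description": "Production batch completion webhook",
  "is_active": true,
  "event_type": "batch_completed",
  "created_at": "2026-05-27T08:15:30Z",
  "updated_at": "2026-05-27T08:15:30Z",
  "secret": "whsec_example"
}""")
```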
