Best Practices
The following recommendations can help improve extraction results and reduce unnecessary processing.
Choose the Right Output Format
Use the output format that matches your use case.
| Format | Best For |
|---|---|
| markdown | AI workflows, RAG pipelines, indexing, and text processing |
| html | Preserving page structure and rendering content |
| Both | Applications that require both formats |
Requesting only the formats you need can reduce response size.
Enable JavaScript Rendering Only When Needed
JavaScript rendering increases extraction time because the page must be rendered before content can be extracted.
Set the following option only for websites that depend on client-side rendering:
{
  "render_js": true
}
Common examples include:
- React
- Next.js
- Vue
- Single-page applications (SPAs)
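One way to apply this guidance is to try a plain extraction first and enable rendering only when the result looks empty. The sketch below assumes a hypothetical `extract` callable that sends a request payload and returns the extracted text; the `url` field and the `min_chars` threshold are illustrative assumptions, and only `render_js` comes from this page.

```python
from typing import Any, Callable, Dict


def extract_with_fallback(
    url: str,
    extract: Callable[[Dict[str, Any]], str],  # hypothetical: sends the payload, returns text
    min_chars: int = 200,  # assumed threshold for "the page had real content"
) -> Dict[str, Any]:
    """Try a plain extraction first; retry with render_js only if needed."""
    # "url" is an assumed field name; "render_js" is the option documented above.
    content = extract({"url": url, "render_js": False})
    if len(content.strip()) >= min_chars:
        return {"content": content, "rendered": False}

    # Very little text came back, so the page probably renders client-side.
    content = extract({"url": url, "render_js": True})
    return {"content": content, "rendered": True}
```

The trade-off is one extra request for pages that do need rendering, in exchange for faster extraction on every page that does not.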
Use Asynchronous Processing for Large Workloads
For long-running extractions, use asynchronous processing.
{
  "processing_mode": "async"
}
This prevents request timeouts and allows your application to continue processing while extraction runs in the background.
Use Geo-Targeting Only When Required
Proxy routing may increase processing time.
Only specify a country when content differs by location.
{
  "proxy": {
    "country": "DE",
    "type": "residential"
  }
}
Reuse Job IDs
When using asynchronous extraction:
- Create the extraction job once.
- Store the returned job_id.
- Poll the job status endpoint.
Avoid creating duplicate extraction jobs for the same request.
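The steps above can be sketched as a small in-memory registry that maps each distinct request to the job created for it. This is a minimal illustration, not part of the API: `JobRegistry` and the `create_job` callable are hypothetical names, and `create_job` is assumed to submit the request and return the job_id.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class JobRegistry:
    """Remember which job_id belongs to which request payload."""

    def __init__(self, create_job: Callable[[Dict[str, Any]], str]) -> None:
        # create_job is a hypothetical callable: it submits the payload
        # and returns the job_id from the response.
        self._create_job = create_job
        self._jobs: Dict[str, str] = {}

    @staticmethod
    def _key(payload: Dict[str, Any]) -> str:
        # Canonical JSON so that key order does not produce different hashes.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_create(self, payload: Dict[str, Any]) -> str:
        key = self._key(payload)
        if key not in self._jobs:
            # First time this request is seen: create the job once.
            self._jobs[key] = self._create_job(payload)
        return self._jobs[key]
```

In a multi-process deployment the dictionary would need to be replaced by shared storage, such as a database table keyed by the same hash.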
Use Link Extraction Only When Needed
{
  "extract_links": true
}
Enable link extraction only when you need URLs from the page.
This keeps responses smaller and easier to process.
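The "only when needed" guidance in this and the preceding sections can be sketched as a payload builder that leaves every optional field out unless the caller asks for it. The option names (`render_js`, `processing_mode`, `proxy`, `extract_links`) are taken from the examples on this page; the `url` field and the `build_payload` helper itself are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def build_payload(
    url: str,
    *,
    render_js: bool = False,
    extract_links: bool = False,
    country: Optional[str] = None,
    proxy_type: str = "residential",
    run_async: bool = False,
) -> Dict[str, Any]:
    """Build a request body that contains only the options actually needed."""
    payload: Dict[str, Any] = {"url": url}  # "url" is an assumed field name
    if render_js:
        payload["render_js"] = True
    if extract_links:
        payload["extract_links"] = True
    if country:
        # Proxy routing is added only when a country is explicitly requested.
        payload["proxy"] = {"country": country, "type": proxy_type}
    if run_async:
        payload["processing_mode"] = "async"
    return payload
```

Keeping every option off by default makes the cheaper, faster request the path of least resistance.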
Monitor Job Status
Before retrieving extraction results, check the job status.
GET /v1/extract/{job_id}
Wait until the status becomes "completed" before processing the result.
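A polling loop for this check might look like the sketch below. It assumes a hypothetical `get_status` callable that performs `GET /v1/extract/{job_id}` and returns the parsed response as a dictionary with a `status` key. Only the `completed` status is documented on this page; the `failed` status, the interval, and the timeout are assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], Dict[str, Any]],  # hypothetical wrapper around the GET call
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll the job until its status is "completed", or give up after `timeout` seconds."""
    deadline = clock() + timeout
    while True:
        job = get_status(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # assumed failure status, not confirmed by this page
            raise RuntimeError(f"extraction job {job_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real waiting; production callers can leave them at their defaults.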
Store Extracted Content
If content does not change frequently, consider storing extraction results instead of repeatedly extracting the same page.
This can reduce costs and improve performance.
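As a rough illustration of this idea, the sketch below keeps results in memory and re-extracts a page only after a time-to-live has expired. `ExtractionCache` and the `extract` callable are hypothetical names, and the one-hour default is an arbitrary assumption; choose a lifetime that matches how often your target pages change.

```python
import time
from typing import Callable, Dict, Tuple


class ExtractionCache:
    """Cache extraction results per URL for a fixed time-to-live."""

    def __init__(
        self,
        extract: Callable[[str], str],  # hypothetical: runs an extraction, returns content
        ttl_seconds: float = 3600.0,    # assumed default lifetime of one hour
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._extract = extract
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no new extraction request

        # Missing or expired: extract again and remember when it happened.
        content = self._extract(url)
        self._entries[url] = (now, content)
        return content
```

For anything beyond a single process, the same pattern applies with a persistent store in place of the dictionary.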
Success
You now know the recommended practices for building reliable extraction workflows.
Next Steps
Continue to Common Errors to learn how to troubleshoot common extraction issues.