Working With Output Formats
The Extraction API can return content in Markdown, HTML, or both formats in a single request.
All examples in this guide use the following endpoint:
POST /v1/extract
The formats field controls which output formats the extraction request returns.
{
  "formats": ["markdown"]
}
If the formats field is omitted, the API returns HTML by default.
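The behaviour of the formats field can be sketched as a small request-body builder. This is a minimal illustration, not part of the API: the helper name build_extract_payload and the choice to reject values other than "markdown" and "html" are assumptions made for this example.

```python
# Formats named in this guide. Rejecting any other value is an assumption
# of this sketch, not documented API behaviour.
SUPPORTED_FORMATS = ("markdown", "html")


def build_extract_payload(url, formats=None):
    """Return the JSON body for POST /v1/extract as a dict (hypothetical helper)."""
    payload = {"url": url}
    if formats is None:
        # Leaving "formats" out lets the API apply its default (HTML, per this guide).
        return payload
    unknown = [name for name in formats if name not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    payload["formats"] = list(formats)
    return payload
```

Calling build_extract_payload("https://example.com", ["markdown"]) yields a body with both url and formats, while calling it without formats yields a body containing only url.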
Output Formats
Markdown returns the extracted content as clean, readable text.
Request
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown"]
  }'
Response
{
  "data": {
    "markdown": "# Example Domain..."
  }
}
The extracted content is available in the data.markdown field.
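For readers working in Python rather than curl, the Markdown request and the data.markdown lookup might look like the sketch below. It uses only the standard library. The function names are hypothetical, and the response shape is assumed to match the example shown in this guide.

```python
import json
import urllib.request

API_URL = "https://scraper.geonode.io/v1/extract"


def build_markdown_request(target_url, api_key):
    """Build (but do not send) the same POST request as the curl example."""
    body = json.dumps({"url": target_url, "formats": ["markdown"]}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def read_markdown(response_body):
    """Return data.markdown from a raw JSON response string."""
    return json.loads(response_body)["data"]["markdown"]


def extract_markdown(target_url, api_key):
    """Send the request and return the Markdown. Performs a real network call."""
    request = build_markdown_request(target_url, api_key)
    with urllib.request.urlopen(request) as response:
        return read_markdown(response.read().decode("utf-8"))
```

Keeping request construction and response parsing in separate functions makes both easy to check without contacting the API.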
Common use cases:
- AI and LLM workflows
- Search indexing
- Knowledge bases
- Text processing pipelines
HTML returns the extracted content with a structure closer to the original webpage.
Request
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["html"]
  }'
Response
{
  "data": {
    "html": "<html>...</html>"
  }
}
The extracted content is available in the data.html field.
Common use cases:
- Rendering content in applications
- Preserving page structure
- Working with HTML elements
- Content transformation workflows
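As one illustration of working with HTML elements, the sketch below collects link targets from the string found in data.html using Python's standard html.parser module. The LinkCollector class and collect_links function are hypothetical names for this example, and the sample markup in the usage note is made up rather than real API output.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def collect_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Given a response dict named response, collect_links(response["data"]["html"]) would return the list of hrefs on the extracted page. A dedicated HTML library may be a better fit for more complex element queries.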
Request both Markdown and HTML when your application needs both formats from the same extraction.
Request
curl -X POST "https://scraper.geonode.io/v1/extract" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown", "html"]
  }'
Response
{
  "data": {
    "markdown": "# Example Domain...",
    "html": "<html>...</html>"
  }
}
The extracted content is available in both the data.markdown and data.html fields.
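When both formats are requested, it can help to confirm that each one is present before using it. The following is a minimal sketch assuming the response has already been parsed into a dict with the shape shown above. The select_formats name and the choice to raise KeyError on a missing format are assumptions of this example.

```python
def select_formats(response, requested=("markdown", "html")):
    """Return {format: content} for each requested format in response["data"]."""
    data = response.get("data", {})
    missing = [name for name in requested if name not in data]
    if missing:
        # Failing early is a design choice of this sketch, not API behaviour.
        raise KeyError(f"response is missing formats: {missing}")
    return {name: data[name] for name in requested}


# Sample response copied from the example in this guide.
sample = {"data": {"markdown": "# Example Domain...", "html": "<html>...</html>"}}
contents = select_formats(sample)
```

Here contents["markdown"] could feed a text pipeline while contents["html"] is kept for rendering, both from the same extraction.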
Choosing the Right Format
| Format | Best For |
|---|---|
| Markdown | AI workflows, search indexing, knowledge bases, and text processing |
| HTML | Preserving page structure and rendering content |
| Both | Applications that need both representations from a single extraction |
Success
You now know how to control the format returned by the Extraction API.
Whether you need Markdown, HTML, or both, you can choose the format that best fits your workflow.
Next Steps
Continue to Extracting JavaScript Websites to learn how to extract content from pages that rely on client-side rendering.