Data from Request
collect_data_from_requests: boolean | false;
What's This?
Web pages can load content in two main ways: all at once when you first visit, or bit by bit as you interact with the page. This collect_data_from_requests option lets the Scraper also grab those bits of content that load later, which are often fetched using 'fetch' or 'xhr' requests.
How Does it Work?
- Default (collect_data_from_requests=false): The Scraper gets only the content that loads when you first visit the page.
- With collect_data_from_requests=true: The Scraper will also pick up content that the webpage loads as you interact with it, like when new posts appear as you scroll down on a social media feed.
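The difference between the two modes can be sketched with a small simulation. This is only an illustration of the behaviour described above, not how the Scraper is implemented: the `simulatedPage` object and the `collectContent` function are hypothetical names made up for this example.

```javascript
// Hypothetical model of a page: content present on the first load,
// plus payloads the page requests later via 'fetch' or 'xhr'.
const simulatedPage = {
  initialContent: ['post 1', 'post 2'],
  laterRequests: [
    { type: 'fetch', data: ['post 3', 'post 4'] },
    { type: 'xhr', data: ['post 5'] },
  ],
};

// Illustrative only: mimics the documented effect of the option.
// With the default (false), only the initial content is returned.
function collectContent(page, { collect_data_from_requests = false } = {}) {
  const collected = [...page.initialContent];
  if (collect_data_from_requests) {
    for (const request of page.laterRequests) {
      collected.push(...request.data);
    }
  }
  return collected;
}

const initialOnly = collectContent(simulatedPage);
const withRequests = collectContent(simulatedPage, { collect_data_from_requests: true });

console.log(initialOnly.length);  // 2
console.log(withRequests.length); // 5
```

In this toy model, leaving the option off drops the three posts that only arrive through later requests, which is the situation the option is meant to cover.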
When Should I Use This?
Turn on collect_data_from_requests if you're scraping sites that keep loading new content as you interact, like sites with infinite scrolling or ones that show new data when you click on something. It helps you avoid missing content that isn't part of the initial page load.
Methods
setCollectDataFromRequests(boolean)
Configuration Name: collect_data_from_requests
Initializing Scraper
const GeonodeScraperApi = require('geonode-scraper-api');
const scraper = new GeonodeScraperApi('<Your_username>', '<Your_password>');
Using Method
scraper.setCollectDataFromRequests(true);
scraper.scrape('https://example.com/');
Using Configuration Object
const config = { collect_data_from_requests: true };
scraper.scrape('https://example.com/', config);
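If you build configuration objects in more than one place, a small helper can keep the documented default (false) explicit and catch non-boolean values early. This is a minimal sketch: `DEFAULT_CONFIG` and `buildConfig` are hypothetical names for this example and are not part of the Scraper API. Only the collect_data_from_requests key comes from this page.

```javascript
// Hypothetical defaults object; mirrors the documented default of false.
const DEFAULT_CONFIG = { collect_data_from_requests: false };

// Hypothetical helper: merges overrides onto the defaults and
// rejects values that are not booleans.
function buildConfig(overrides = {}) {
  const config = { ...DEFAULT_CONFIG, ...overrides };
  if (typeof config.collect_data_from_requests !== 'boolean') {
    throw new TypeError('collect_data_from_requests must be a boolean');
  }
  return config;
}

// Config for a page with infinite scrolling.
const infiniteScrollConfig = buildConfig({ collect_data_from_requests: true });
// Config for a page that loads everything up front.
const staticPageConfig = buildConfig();
```

The resulting object can then be passed as the second argument to scraper.scrape, in the same way as the config object shown above.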