dynamic website

17 Dec, 2019

Instant Data Scraper Update

By |2019-12-17T15:59:01+02:00December 17th, 2019|Uncategorized|0 Comments

In October and November of this year we decided to survey Instant Data Scraper extension users to see where Web Robots team should focus for the next update. We already had some ideas from user emails that we received over last couple years, but we needed a more scientific proof to see which features would be most desired. Among features we consider things like infinite scroll support, running jobs on cloud, processing batches of URLs, proxy support, etc.

Before the end of the survey it became clear that infinite scroll support is by far most desired feature and decided to release it as soon as possible. One December 11th we published a 0.2.0 version to Chrome Webstore. Enjoy it!

Other features […]

4 Mar, 2019

Scraping Dynamic Websites Using The wait() Function

By |2019-03-04T15:12:11+02:00March 4th, 2019|Web Scraping|0 Comments

Dynamic websites are one of the biggest headaches of every developer who works with web scraping robots. Data extraction becomes complicated when it cannot be found in the initial HTML of the website. For example, walmart.com loads product data via AJAX call after the initial DOM is rendered. Therefore we must wait and then extract data from the DOM.For example Walmart’s product page https://grocery.walmart.com/ip/Viva-Paper-Towels-Choose-A-Sheet-1-Big-Roll/52291575 product data appears in selector $(‘div[class^=”ProductPage__details”]’).


steps.start = function() {

    console.log($('div[class^="ProductPage__details"]').length);

    done();

};

Logged result is 0, as our code executes as soon as the DOM is ready, but before the element appears. There are several ways we can fix this.Simple waiting strategy – use setTimeout()We can use setTimeout() where we specify the […]