Web Robots IDE

21 Jun, 2017

New IDE Extension Release


Today we are releasing an update to our main extension, the Web Robots Scraper IDE. Version 2017.6.20 brings several improvements to the UI, proxy settings control, and the handling of hash symbols in URLs.

Version 2017.6.20 RELEASE NOTES

  • UI: robot run statistics are displayed in a fixed place and no longer “jump” around.
  • UI: when a robot finishes, its status becomes a direct link to the robot run list on the portal, and the run link goes directly to data preview and download on the portal.
  • setProxy() functionality has been expanded; see the documentation for details and the sketch after this list.
  • Bugfix: fixed an issue where subsequent steps whose URLs were identical before the # symbol did not load correctly (for example, going to http://foobar.com#a and then to http://foobar.com#b).
  • Other internal engine improvements […]
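The expanded setProxy() behaviour is described in the documentation; purely as an illustrative sketch, assuming setProxy() accepts a proxy URL string and that robot steps are plain JavaScript functions (the step structure below is also an assumption), switching proxies inside a robot might look roughly like this:

    // Illustrative sketch only – see the documentation for the real
    // setProxy() signature and supported options.
    steps.push(function () {
        // Assumed form: route the requests made by later steps
        // through the given proxy.
        setProxy("http://user:password@proxy.example.com:8080");
        done();
    });
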
24 Feb, 2017

Scraping Extension Update – version 2017.2.23


Recently we rolled out an updated version of our main web scraping extension, which brings several important improvements and new features. This update lets our users develop and debug robots even faster than before. So what exactly is new?

  1. jQuery has been upgraded from version 1.10.2 to 2.2.4
  2. done() can now take a delay parameter in milliseconds. For example, done(1000); will delay the step finish by 1 second.
  3. New Selectors tab, which allows testing selectors inline and generates robot code. Selectors are tested immediately on the browser’s active tab, so the developer can see whether they work correctly. The Copy code button copies the JavaScript code to the clipboard, ready to be pasted directly into a robot’s step; a brief sketch follows this list.
  4. […]
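To illustrate items 2 and 3 together, here is a rough sketch of a robot step: the selector lines stand in for the kind of jQuery code the Selectors tab generates (the actual generated code may differ), the emit() call and the surrounding step structure are assumptions for the example, and done(1000) uses the new milliseconds parameter:

    // Sketch only – the step structure and emit() call are assumed here
    // for illustration; done(1000) is the delayed finish from item 2.
    steps.push(function () {
        // jQuery-style selectors, as the Selectors tab might generate them.
        var title = $("h1.product-title").text().trim();
        var price = $("span.price").first().text().trim();

        // Save the extracted record (assumed output call).
        emit("products", { title: title, price: price });

        // Finish the step after a 1 second delay.
        done(1000);
    });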

3 Dec, 2015

New Features


We are happy to announce some new features in our robot writing framework. These features are:

  • Fork() – splits a robot into many parallel robots and runs them simultaneously. This shortens long scraping jobs by parallelising them, and cloud autoscaling handles the necessary instance capacity, so our customers can run hundreds of instances on demand.
  • skipVisited – lets a robot intelligently skip steps for links that have already been visited, avoiding data duplication and saving running time.
  • respectRobotsTxt – crawls target sources in compliance with their robots.txt files.

These features are explained in detail, with examples, on our framework documentation page.
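Purely as a rough sketch of how these options might fit together (the real option names, Fork() signature, and step structure are defined on the documentation page; everything below is an assumption for illustration):

    // Rough sketch only – consult the framework documentation for the
    // actual configuration keys and Fork() usage.
    var settings = {
        skipVisited: true,      // skip steps for links already visited
        respectRobotsTxt: true  // crawl in compliance with robots.txt
    };

    steps.push(function () {
        // Gather links to process (selector is illustrative).
        var links = $("a.category").map(function () { return this.href; }).get();

        // Assumed usage: split the remaining work into parallel robots;
        // cloud autoscaling provides the instance capacity.
        Fork(links);
        done();
    });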