How We Validate Data

December 18th, 2014 | Web Scraping

Data is only valuable if it can be trusted. At weRobots we spend as much effort on validating data as on collecting it. Validation is a multi-stage process.

weRobots data validation workflow

  • Scraping

Initial checks happen in the scraper robots. A robot crawls the target website and looks for data. Captured data is sent to our staging database. Many abnormal situations can arise at this stage:

      • The site may be down. The robot logs warnings and retries pages that do not respond. Usually the outage is temporary and the robot resumes without intervention.
      • The site layout changes. If the robot cannot find navigation links or data, it will stop and report an error so […]
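
The two situations above can be sketched in a few lines of Python. This is only an illustration, not the weRobots implementation: the names `fetch_with_retry`, `check_layout` and `LayoutChangedError`, the retry count and the back-off delay are all assumptions made for the example.

```python
import logging
import time
from typing import Callable, Dict, Iterable

log = logging.getLogger("scraper")


class LayoutChangedError(Exception):
    """Hypothetical error: expected links or data fields were not found."""


def fetch_with_retry(fetch: Callable[[str], str], url: str,
                     attempts: int = 3, delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(url); log a warning and retry when the site does not respond."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except (ConnectionError, TimeoutError) as exc:
            log.warning("attempt %d/%d failed for %s: %s",
                        attempt, attempts, url, exc)
            if attempt == attempts:
                raise  # outage outlasted our retries; let the caller decide
            sleep(delay * attempt)  # wait a little longer before each retry
    raise ValueError("attempts must be at least 1")


def check_layout(record: Dict[str, str], required: Iterable[str]) -> Dict[str, str]:
    """Stop with an error if any required field is missing or empty."""
    missing = [name for name in required if not record.get(name)]
    if missing:
        raise LayoutChangedError(f"missing fields: {', '.join(missing)}")
    return record
```

Here `fetch` is whatever function downloads a page, passed in so the retry logic can be tried without a network. A robot built this way would call `check_layout` on every parsed record and treat `LayoutChangedError` as a signal to stop and report, rather than send incomplete rows to the staging database.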