Datasets

Blog posts about extraction, manipulation and interpretation of various datasets.

27 Jan, 2016

New Dataset – UK LPA Search


We are excited to announce UK LPA Search, a search engine covering all of the UK’s local planning authorities (LPAs). Until now there was no way to search LPA databases from one place: one had to find each LPA’s website and search inside it. Considering there are a few hundred of them, this would not be an easy task for a human. Our robots have no problem indexing all the databases and providing them as a single dataset.

As a bonus, we geocoded all requests and display them on a map, so anyone can see what building permits are being issued around them. Example: Map of building permits in London […]

31 Dec, 2015

New Kickstarter Dataset


Recently we updated our Kickstarter robot to crawl project subcategories. This allows us to collect a richer dataset: for example, the 2015-12-17 run collected data about 144,263 projects with a running time of only 2 hours! We also started presenting it in the JSON streaming format, which is just line-delimited JSON. Previously we stuffed all projects into a single JSON array, and the downside was that the user had to read the entire large JSON file into memory before any kind of processing could start. With JSON streaming it is possible to read one line at a time.
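For readers who want to try the line-at-a-time approach, here is a minimal Python sketch. It assumes each line holds one complete JSON object. The field names ("name", "pledged") and the sample records are made up for illustration and are not taken from the actual dataset.

```python
import io
import json


def iter_projects(lines):
    """Yield one parsed object per non-empty line of a line-delimited JSON stream."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


# Hypothetical sample data standing in for the real file; the field names
# are assumptions made for this example only.
sample = io.StringIO(
    '{"name": "Project A", "pledged": 1200}\n'
    '{"name": "Project B", "pledged": 300}\n'
)

project_count = 0
total_pledged = 0
for project in iter_projects(sample):
    # Only the current project is held in memory at this point.
    project_count += 1
    total_pledged += project["pledged"]

print(project_count, total_pledged)
```

With a real file, the same function can be given an open file handle instead of the in-memory sample, for example `with open("kickstarter.json", encoding="utf-8") as f:` followed by `for project in iter_projects(f):`, where the file name is again hypothetical. Because a Python file object yields one line at a time, memory use stays roughly constant regardless of how many projects the file contains.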

Data is posted in the usual place.