Drake after Two Years: “Barely Famous”

We released Drake (“Data workflow tool, like a ‘Make for data’”) two years ago. The impetus behind Drake was the classic “scratch your own itch”. We craved a text-based data workflow tool that would make it easier to build and manage our data processing tasks. We open sourced Drake in the hope that others would [...]

Changes in our Global Places Data – Q4 2014

Place data, like most things, has an expiration date. Go too long without picking a fresh crop, and you end up with something stale and unpalatable, or in many of our partners’ cases, even unusable. This is why we work tirelessly to refresh and improve Global Places all the time. We clean out listings for [...]

Factual Debuts Two New Tools for Geofencing and Audience Targeting, Putting Location Data in the Hands of the Client

Factual is excited to launch two new product features to make it easier for marketers to take advantage of the power and flexibility of our Geopulse Proximity and Geopulse Audience products. The Geopulse Proximity Designer and Geopulse Audience Builder are transparent and powerful self-serve tools that enable our partners to craft their campaigns – using [...]

The Humongous nfu Survival Guide

GitHub: github.com/spencertipping/nfu

A lot of projects I’ve worked on lately have involved an initial big Hadoop job that produces a few gigabytes of data, followed by some exploratory analysis to look for patterns. In the past I would have sampled the data before loading it into a Ruby or Clojure REPL, but increasingly I’ve started [...]

How Factual Uses Persistent Storage For Its Real-Time Services

As part of Factual’s Geopulse product suite, we need to be able to absorb and process large amounts of data, and deliver back a somewhat smaller amount of data. There is a significant amount of technology available for the processing stage, but far less for intake and delivery. Today, we’re open sourcing two libraries [...]