Friday, July 29, 2011

Chalkboard Maps: United States of Craigslist

A geometric pass at delineating areas within the United States potentially covered by each craigslist site: the United States of Craigslist.  Check out poster print options here.

Source Data Download
Here is a link to a skydrive folder with this data as a shapefile (.shp, .dbf, yada yada...) and a (spartanly formatted) kml file:!/?cid=2eb6aaf6c3ac1ebe&sc=documents&uc=2&id=2EB6AAF6C3AC1EBE%212242
Poster Print Available
If you want, you can go here and order a print of this sucker.  Or you can call your friend in the geology department with a huge plotter and sneak off a print after hours.

Locality is inherent to the value of craigslist; I go to the main craigslist address but I get kicked over to the local instance of craigslist (my IP address sources me to somewhere in the illustrious Lansing, MI).  But how does craigslist know where to send me?  Some mysterious system of assigning a geocoded IP address to just the right site must be in place...I wonder what that map looks like.
When Ian Clemens proposed the idea, I looked around to find an existing map of craigslist sites-to-areas -maybe even find the lookup that they themselves use. I couldn't find anything like it.
Whether it matches their system well or not, here is a map that approximates geographic coverage to individual sites using a Voronoi process as a base (more info on process below).  It is at least a start at visualizing the geographic coverage and distribution of the community-driven instances of craigslist.  Shapes like this might provide some useful context for other data, demographic or market information, for instance.  Also, when pulled into VFX, it can serve as an input to some spatial querying on those other metrics.

With access to web traffic data, one could compile a pragmatic view of coverage driven by the locations of actual website visitors (but this would just be the incestuous results of the current method craigslist uses to allocate visitors to sites).  That's ok, but it's more detective work than interesting data creation.
The use of openstreetmap data to weight the polygon drawing by travel time would improve the realism of the hypothetical zones considerably.  In that case, maybe it could be used to drive a more efficient assignment of visitors to their actual-nearest craigslist community.  That would rule.

Spin-off Functionality
Creating these areas was, in part, a helpful testbed for a generic region-building functionality that is in the skunk-works here at IDV.  The algorithmically inclined Abhinav Dayal has been crafting our drive-time service that is already doing a lot of the heavy lifting when it comes to Voronoi diagramming.  So, down the road we might see a new specialized tool that generates best-fit areas around an existing set of points -useful for some what-if scenarios around territory creation, available to the business user, not just the research scientist.

Nerdy Bits
• Scrape the list of Craig’s cities at
• Split joint-locations into individual locations (like "Odessa / Midland")
• Geocode place-specific locations.
• Manually position the more regional locations (like "Southeast Iowa").
• Divide locations into three geographically distinct regions (split by the Continental Divide along the spine of the Rockies and the Mississippi); duplicate any locations that meaningfully straddle a border, like St. Louis. I do this to introduce some true-cost of crossing either of those features, in the face of an algorithm that would otherwise treat the whole country as a smooth unfettered plain.
• Run Voronoi (Thiessen) algorithm to generate best-fit zones for the points, for all three regions.
• Clip Voronoi zones by a “land” shape to cut out the oceans and provide a common border between the three regions (my "land" was constructed from the Census Bureau's tracts file).
• Merge the 3 regional Voronoi sets into a unified nation-wide set.
• Dissolve boundaries between same-website Voronoi zones (to re-combine the joint-locations up there in step 2) into merged chunky polygons.
• Manually re-assign oddly-orphaned or split areas (common along complicated shorelines).
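
The core of the steps above can be sketched in a few lines of Python. This is only a rough illustration, not the process actually used for the map: instead of building true Voronoi polygons, it assigns sample locations to their nearest seed point (which is what a Voronoi cell means) and then groups them by website. The site names, the `SCRAPED` and `COORDS` tables, and the approximate coordinates are made up for the example; the three-region split and the land clipping are left out.

```python
import math
from collections import defaultdict

# Hypothetical slice of the scraped list: location label -> website name.
SCRAPED = {
    "Odessa / Midland": "odessa",
    "Lubbock": "lubbock",
}

# Approximate (lat, lon) geocodes, for illustration only.
COORDS = {
    "Odessa": (31.85, -102.37),
    "Midland": (32.00, -102.08),
    "Lubbock": (33.58, -101.86),
}


def split_joint(label):
    """Split a joint location like 'Odessa / Midland' into its places."""
    return [part.strip() for part in label.split("/")]


def build_seeds(scraped, coords):
    """One seed point per place; joint locations share a website."""
    seeds = []
    for label, site in scraped.items():
        for place in split_joint(label):
            lat, lon = coords[place]
            seeds.append((place, site, lat, lon))
    return seeds


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_site(lat, lon, seeds):
    """Website of the closest seed, i.e. the Voronoi cell the point falls in."""
    return min(seeds, key=lambda s: haversine_km(lat, lon, s[2], s[3]))[1]


def dissolve(cells, seeds):
    """Group sample cells by website, merging cells of joint locations."""
    zones = defaultdict(list)
    for lat, lon in cells:
        zones[nearest_site(lat, lon, seeds)].append((lat, lon))
    return dict(zones)


seeds = build_seeds(SCRAPED, COORDS)
cells = [(31.9, -102.2), (32.0, -102.0), (33.5, -101.9)]
zones = dissolve(cells, seeds)
```

Swapping the sample cells for zip code centroids would give a site-to-zip lookup in the same spirit, though without the river and mountain barriers described above.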

That's just about it.  Thoughts?  Ideas?  Outrage?  Incredulity?  Been done?  Guffaws?  Know a good dataset to improve this method?

Download the United States of Craigslist as a table of member Zip-Codes
Here's a map and lookup of the zip codes that correspond to these areas:


Thursday, July 21, 2011

Chalkboard Maps: Urban Enclaves

An interactive Visual Fusion map application is now available:

Buy the poster here. See in huge proportions here.

As a follow-up to the American Enclaves chalkboard map (thumb below), which showed the national distribution of racial proportion outliers, here is a closer look at some of the larger cities in the United States and their areas of way-more-than-average-populations of various ethnicities.

Some time ago Matthew Bloch, Shan Carter, and Alan McLean of the New York Times put together an interesting dot density map of these races.  This map borrows from that notion but isolates only those areas where the rates are statistical outliers relative to the national means.

Of note, in the whole of the country only a handful of census tracts were outliers for more than one race (more commonly Asian, since their comparatively lower national rate makes for an easier outlier qualification).

Thumbnail of the national enclave map.  More on this here.

Wednesday, July 20, 2011

Fun With VFX: Browser Zoom Shenanigans

Have you ever played with a browser's "zoom" setting?  You can scale the content of your browser to make things appear larger or smaller, for whatever reason.  Look for it under the wrench icon in Chrome and the gear icon in IE.

The browser at regular old 100%.

 Here, I've set the zoom to 50%, shrinking everything down.

A pretty low-profile feature of VFX is the ability to save an image of your application to your computer: the "Snapshot" tool.  Maybe you want it for a PowerPoint or whatever.  An evil trick you can play on the snapshot tool is to change the "zoom" of your browser, then take the snapshot.  Since the snapshot image is created by VFX within the browser, when you set your browser's zoom to 25% (making everything tiny and allowing for more content to show) VFX thinks you have ginormous monitors and will generate an insanely detailed poster-sized image.

At 100% browser zoom.  Resulting snapshot image dimensions are 1916 x 995.

At 50% browser zoom.  Resulting snapshot image dimensions are 3832 x 2073.

 At 25% browser zoom.  Resulting snapshot image dimensions are 7664 x 4229.
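
For a back-of-the-envelope feel for those numbers, here is a tiny Python sketch. It assumes the apparent page size simply scales with one over the zoom factor; `viewport_at_zoom` is a made-up helper, not part of VFX.

```python
def viewport_at_zoom(width_px, height_px, zoom):
    """Estimate the apparent page size at a browser zoom (1.0 = 100%)."""
    return round(width_px / zoom), round(height_px / zoom)


base = (1916, 995)  # snapshot size at 100% zoom, from the caption above
for zoom in (1.0, 0.5, 0.25):
    print(f"{zoom:.0%}: {viewport_at_zoom(*base, zoom)}")
```

The estimated widths (3832 and 7664) match the snapshots above. The actual heights (2073 and 4229) come out somewhat taller than this simple scaling predicts.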

This is the method we used to create the massive shipping traffic maps a few weeks back, and I hope VFX users out there give it a try.

P.S. Your VFX instance has to be within a web page that grows or shrinks to fit your browser.  If your app is embedded in defined dimensions, you'll just get a regular screenshot with a huge background (like our homepage up there).

Have fun!

Friday, July 8, 2011

Backdated UI Metaphors

In light of the full history of technology and invention, the digital environment has erupted in a comparative blink of an eye.  New things and ideas need to be named in order to convey their use, and usually the best method is to borrow a design or idea that lends meaning to the new doodad. 
There are plenty of layout/print terms that carry over (cut/paste, carbon copy, margins…), and even good old “bug” (booooring), but here are four metaphors that are pretty interesting to me.
Oh, on a semi-related note, did you know why stone columns have vertical fluting and flaring ornamentation at the top?  Because they are modeled after the bundles of reeds or sticks that used to hold up our roofs!  I love that kind of thing (I learned about that in The Fountainhead).  Anyways…


Threshold

When I hear threshold it is usually in the context of determining a numerical value that serves as a dividing line between one status and another –usually in terms of visual settings based on business rules, a value of a slider control that triggers a change, etc.
The word’s original use is the name for the board or stone sill under a door that separates the interior from the outside.  It’s called that because it used to hold thresh.  Threshing a grain harvest is the process of separating the grain from the chaff.  Chaff was (and in some cases still is) pretty useful stuff for scattering around inside a house over dirt floors; like bedding in a barn, it provided a clean and comfortable barrier between you and the floor -easily collected and chucked out when it gets too dirty.  The threshold prevented that bedding from spilling out of the house.
From there, crossing over a threshold was a common term for passing from one environment to another.  Its application in the statistical sense is a natural progression and, likewise, a convenient UI design / application development metaphor.  Cool.


Default

Until relatively recently, it was a terrifying word –rife with doom for a mortgage holder.  It means to do nothing, and, in the financial realm, to do nothing essentially meant that you didn’t pay your due.  You defaulted; time to drum up some cardboard boxes and call your friend with a truck!
Now, though, it has a much more general application.  In a digital environment many assumptions have to be made about a user’s preferences (which is a dark art in itself) –and the initial settings for variables ought to reflect the general desire (or at least the least-bad setting) of the audience.  So now, the word default carries with it a matronly sense of comfort and safety.  The defaults know best; just stick with the default settings and you’ll be fine!  Oh no I’ve hosed it all; return to defaults!

Radio Button

Good old David Hammond pointed this one out to me.  Remember the radios that came in cars from the 80’s and older that had those analog push buttons that, when poked heavily, would slide the dial with a thwunk over to a pre-set station?  You could (can) only listen to one station at a time, so the action of pushing in one button mechanically forced the unselection of any other button (via a beautiful cat’s cradle of cords and levers).  It is a mechanical interface for mutually exclusive selection.  Now our cars’ digital radios still have those mutually exclusive pre-sets, but the push buttons don’t have the same visceral feeling of one-at-a-timedness as the old timey radios.
So now, when an interface requires a user to select only one option from a set of candidates, they may be presented with a set of “radio buttons” which very likely have nothing to do with a radio.


Scroll

I hesitate to include this one because it’s not as cool as the others.  But shoot.  Remember before bound pages were invented and documents were written on a long strip of parchment with a winding tube on each end that you would progressively roll through?  Me neither, but that method of perusing down (or across) a very long bank of text is a really nice metaphor for choosing a smaller viewable window’s worth of content from the larger whole in a digital display.  I think I even remember seeing scrollbar interfaces in the 90’s that incorporated cute parchment scroll elements into the design (for that matter, there were likely some green and purple marbling motifs in there as well).

Tuesday, July 5, 2011

Chalkboard Maps: KaBOOM!

KaBOOM! is a national non-profit whose goal is to get a playground within walking distance of every child.  Here is a chalkboard map showing the number of KaBOOM! playground projects per Congressional District.  I've highlighted the highest frequency districts in the left insets.

