A geometric pass at delineating the areas within the United States potentially covered by each craigslist site: the United States of Craigslist. Download the full-sized version here.
Source Data Download
Here is a link to a skydrive folder with this data as a shapefile (.shp, .dbf, yada yada...) and a (spartanly formatted) kml file: https://skydrive.live.com/#!/?cid=2eb6aaf6c3ac1ebe&sc=documents&uc=2&id=2EB6AAF6C3AC1EBE%212242
With great power comes great responsibility.
**********
Update: Now you can grab a look-up table and sassy new map of the zip codes that correspond to these areas here: http://uxblog.idvsolutions.com/2011/12/craigszips.html
**********
Why?
Locality is inherent to the value of craigslist; I go to craigslist.org but I get kicked over to the local instance of craigslist (my IP address sources me to somewhere in the illustrious Lansing, MI). But how does craigslist know where to send me? Some mysterious system of assigning a geocoded IP address to just the right site must be in place...I wonder what that map looks like.
When Ian Clemens proposed the idea, I looked around for an existing map of craigslist sites-to-areas, or maybe even the lookup that they themselves use. I couldn't find anything like it.
Whether it matches their system well or not, here is a map that approximates the geographic coverage of individual sites using a Voronoi process as a base (more info on process below). It is at least a start at visualizing the geographic coverage and distribution of the community-driven instances of craigslist. Shapes like this might provide some useful context for other data, such as demographic or market information. Also, when pulled into VFX, it can serve as an input to some spatial querying on those other metrics.
We'll soon be releasing an interactive version of this data in VFX where you can play with it in the context of other data and within alternate, though coupled, charting and timeline dimensions.
Improvements
With access to web traffic data, one could compile a pragmatic view of coverage driven by the locations of actual website visitors (but this would be circular: it would just reproduce the current method craigslist uses to allocate visitors to sites). That's ok, but it's more detective work than interesting data creation.
Using OpenStreetMap data to weight the polygon drawing by travel time would improve the realism of the hypothetical zones considerably. In that case, maybe it could be used to drive a more efficient assignment of craigslist.org visitors to their actual-nearest craigslist community. That would rule.
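To make the travel-time idea a little more concrete, here is a minimal sketch, not something that was built for this map. It assumes a tiny made-up road graph (the `ROADS` dictionary, the node names, and the site names are all hypothetical) and assigns every node to the site that can be reached in the fewest minutes, using a multi-source Dijkstra search. That is one common way to get a "network Voronoi" instead of a straight-line one.

```python
import heapq

# Hypothetical road graph: node -> [(neighbor, drive time in minutes)].
ROADS = {
    "A": [("B", 10), ("C", 45)],
    "B": [("A", 10), ("C", 30), ("D", 60)],
    "C": [("A", 45), ("B", 30), ("D", 15)],
    "D": [("B", 60), ("C", 15)],
}


def travel_time_zones(roads, site_nodes):
    """Assign each reachable node to the site with the shortest drive time.

    site_nodes maps a (hypothetical) site name to the graph node it sits on.
    Returns {node: (site, minutes)}.
    """
    best = {}
    # Seed the queue with every site at zero minutes (multi-source Dijkstra).
    heap = [(0, node, site) for site, node in site_nodes.items()]
    heapq.heapify(heap)
    while heap:
        minutes, node, site = heapq.heappop(heap)
        if node in best:
            continue  # already claimed by a site that got here sooner
        best[node] = (site, minutes)
        for neighbor, cost in roads.get(node, []):
            if neighbor not in best:
                heapq.heappush(heap, (minutes + cost, neighbor, site))
    return best


zones = travel_time_zones(ROADS, {"site_x": "A", "site_y": "D"})
```

With real data the graph would come from a road network with estimated drive times on each segment, and the node assignments would then be turned back into polygons.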
Spin-off Functionality
Creating these areas was, in part, a helpful testbed for a generic region-building functionality that is in the skunkworks here at IDV. The algorithmically inclined Abhinav Dayal has been crafting our drive-time service, which is already doing a lot of the heavy lifting when it comes to Voronoi diagramming. So, down the road we might see a new specialized tool that generates best-fit areas around an existing set of points, useful for some what-if scenarios around territory creation and available to the business user, not just the research scientist.
Nerdy Bits
• Scrape the list of Craig’s cities at http://www.craigslist.org/about/sites.
• Split joint-locations into individual locations (like "Odessa / Midland")
• Geocode place-specific locations.
• Manually position the more regional locations (like "Southeast Iowa").
• Divide locations into three geographically distinct regions (split by the Continental Divide along the spine of the Rockies and the Mississippi); duplicate any locations that meaningfully straddle a border, like St. Louis. I do this to introduce some true-cost of crossing either of those features, in the face of an algorithm that would otherwise treat the whole country as a smooth unfettered plain.
• Run Voronoi (Thiessen) algorithm to generate best-fit zones for the points, for all three regions.
• Clip Voronoi zones by a “land” shape to cut out the oceans and provide a common border between the three regions (my "land" was constructed from the Census Bureau's tracts file).
• Merge the 3 regional Voronoi sets into a unified nation-wide set.
• Dissolve boundaries between same-website Voronoi zones (to re-combine the joint-locations up there in step 2) into merged chunky polygons.
• Manually re-assign oddly-orphaned or split areas (common along complicated shorelines).
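For the curious, the core of the Voronoi and dissolve steps can be sketched in a few lines. This is a simplified stand-in, not the process used for the map: instead of building true Voronoi polygons, it labels each cell of a coarse grid with the website of the nearest location, which approximates the same zones. Cells that share a website label are effectively "dissolved" together, so joint locations like Odessa / Midland end up in one zone. The coordinates are rough, the site names are placeholders, and the regional split and land clipping are left out entirely.

```python
import math

# Hypothetical sample: (website, place, longitude, latitude). Rough values.
SITES = [
    ("odessa", "Odessa", -102.37, 31.85),
    ("odessa", "Midland", -102.08, 32.00),
    ("lubbock", "Lubbock", -101.86, 33.58),
    ("abilene", "Abilene", -99.73, 32.45),
]


def nearest_site(lon, lat, sites):
    """Return the website whose location is closest to (lon, lat)."""
    def dist(site):
        # Crude planar distance; longitude is scaled by cos(latitude).
        dx = (site[2] - lon) * math.cos(math.radians(lat))
        dy = site[3] - lat
        return math.hypot(dx, dy)
    return min(sites, key=dist)[0]


def rasterized_zones(sites, west, south, east, north, step=0.25):
    """Label each grid cell (row, col) with the website of its nearest location."""
    cols = int(round((east - west) / step))
    rows = int(round((north - south) / step))
    zones = {}
    for r in range(rows):
        for c in range(cols):
            lon = west + (c + 0.5) * step   # cell-center longitude
            lat = south + (r + 0.5) * step  # cell-center latitude
            zones[(r, c)] = nearest_site(lon, lat, sites)
    return zones


zones = rasterized_zones(SITES, west=-104, south=31, east=-99, north=34)
```

A real version would run each of the three regions separately, swap the grid for true polygons from a computational geometry library, and clip the result against the land shape.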
That's just about it. Thoughts? Ideas? Outrage? Incredulity? Been done? Guffaws? Know a good dataset to improve this method?



Nice Voronois. However, the reason to employ Voronois is generally to accommodate a non-geographic pass at trade area delineation. Retailers have employed this to provide an objective (i.e. non ZIP code based) evaluation of trade areas. In developing these service areas for craigslist, you've manually altered objectivity based on some fundamental regional understanding like the splits you mentioned. So, my question is, have you been true to the Voronoi method and do you think you've accurately portrayed these trade areas?
Pedantic and condescending.
@Anonymous: Actually, there are several of us who found Joe's response -- and then John's follow-up -- quite interesting. What's the point of flaming a contributor (especially under an "Anonymous" ID)?? If you don't like it, ignore it. And BTW, I read an interesting article in The California Geographer about why they should NOT be called Thiessen polygons. Thiessen (1911) "fudged" his own example to yield remarkable "accurate" estimations or precipitations.
correction: "... his own example to yield remarkably 'accurate' estimations of precipitations." (--http://www.csun.edu/~calgeosoc/CG1999.html)
I agree, Mark; I was disappointed to see that anonymous response to a reasonable comment (I really hesitated doing the initial divide anyway). And Joe's comment is just the sort that improves and extends a discussion.
To add a little more detail to my followup, since craigslist transactions are largely drive-to-pickup transactions, I would really love to incorporate some road friction in order to weight the zone boundaries by driving effort. I think this would result in a much more true-to-life divination of these economic zones. As an example of an input wishlist, check out this great work by Brandon Martin-Anderson at MIT...
http://shortestpathtree.org/
Right!
Hi Joe, thanks for the post!
Actually, I'd love to inject plenty more bias into these areas. The trouble with many models, and the reason I can seldom embrace economic models with much conviction, is that they necessarily remove variables from reality. When I can reintroduce variability based on prior understanding I have to consider it, particularly with social science data.
The Voronois are a useful general filling mechanism, but with more time I'd like to first pre-split the country into the "Where's George" economic zones by the folks from Northwestern. Then, in addition to that, weight by drive-time.
So in this case I'm relatively happy with the geometric heavy lifting Voronoi did, but I'm always looking for ways to introduce social and physical friction.
John
Why not just have people add their address along with their post (not viewable to the public, obviously, for privacy) and then have sections based on people's distance from one another? Just a simple 5, 10, 25, 50 miles away system based on the second party's location.
Or the tap or click method on a map to show where u r located
One way to dramatically improve this would be to use the BTS Journey To Work data.
http://transtats.bts.gov/tables.asp?Table_ID=1321&SYS_Table_Name=T_CTPP2000_PART3_SELECTED
It's Census-Tract based, so you'll need to get a shapefile for each Census tract (available online) and process it like you did the Zip Files. Then, you can get the Time-To-Work and Time-From-Work for any two tracts. Aggregation would probably be difficult, but should be possible.
I did a similar process for a (now dead) mapping company. We would let you choose a house and we'd calculate the corresponding tract. Then we could map your commute by how long it would take to drive from home (and the reverse, where you should live to get a certain length commute from a known work).
Hope this helps.
Excellent, thanks Bill!
I wish there was a way to search within a certain radius of a zip code or certain region without having to click each town or area and retype what I am searching all over again.
....*** CAN YOU PRETTY PLEASE ADD ON THE UPSTATE NY CL CITY/COUNTY OF ....SCHOHARIE COUNTY*** IT WOULD BE A GREAT ADDITION TO THE NEW YORK CL. ...PLEASE!!! :-)
...*** from a LONG TIME USEER of CL...JENNA...I LOVE THIS SITE!!!.... PLEASE PLEASE ADD ... SCHOHARIE COUNTY IN NEW YORK'S CL. Without CL I would have never found my dog!! ...and she is the best in the world....I have found many great things on this site from apartments to my dog...and funiture... LOVE IT!! THANKS FOR ALL THE HARD WORK CL TEAM!!! KEEP IT UP, AND just remember to read customer comments to make the best service available!! :-)
What is the problem with entering the address and it bouncing back now?
pers-xxxx-2955369389@craigslist.org
doesnt work anymore???
What is the new and improved address method?
Dear Various Anonymous Commenters,
This is not a tool of Craigslist; it is not used by them; it was not made by them. It is a visualization of presumed/de-facto economic zones using craigslist sites as an input, on a sweet sweet chalkboard.
I love craigslist, a lot more than ebay. Except for the fornications hook ups, but besides that, you guys rule!
"You guys?"
Very helpful! I was wondering why you didn't opt to include craigslist sub-regions? The Los Angeles craigslist, for example, has several sub-regions such as Long Beach. Probably adds all sorts of complexities but might be an interesting addition to v2.
I need to reach the person in Tulsa, Ok with Craigslist number 2963512050... There isn't any contact information and the email address is invalid.
I can be reached at 918-378-8800 and/or 918-652-8422.
I don't know how to contact Craigslist - this was the only way I thought I might get some feedback on other options.
He can be reached at thatdudeintulsa@craigslist.org
I need to upload some pics in my ad, but they never show HELP!! please reply to my e-mail gaby37_nice@hotmail.com
how in the world do i add my phone number to my cl account so i can post?
Why can't I see the classes I post. I know I made some mistakes when I started using CL this past week, but none of my friends told me about multiple similar events. Now, when I try to post correctly, I cannot see any of the dates I posted on; I can only scroll backwards from dates I try to preview. I am told someone may be trying to sabotage my classes. Ugh!
This is a wonderful post! is really informative for me. I liked it very much.
I went to my account today and saw two things posted there that weren't mine, and one of the things I had posted was completely gone. Does this mean someone is trying to get my identity? What should I do??
On craigslist under car/truck by owner, the pictures on the front page are not working well; it takes too long to load the page. Put it back please thank you.