Thursday, December 27, 2012

A National Portrait of Drunk Driving

This map illustrates ten years of traffic fatality incidents: where they happened, how many there were, and the extent to which intoxication was involved (a big merge of FARS data).  In this manner regions and cities, and the neighborhoods therein, can be visually characterized and compared.  More on the whys and hows of this map below, as well as a load of insets.

Size corresponds to the number of deadly traffic accidents over ten years, and color corresponds to the area's rate of incidents involving intoxication. Navigate this map or click on the image below for the huge image.

Why and How
Traffic fatalities represent an enormous proportion of deaths in the United States each year.  This map is an effort to better understand the geographic characteristics of these events, particularly the regional dimension that intoxication plays (rather than the temporal characteristics illustrated here).  It turns out some areas are much better behaved in this regard than others.

The hexagonal mesh is one way of more fairly visualizing highly overlapping, highly clustered data (pretty much wringing point data into a polygon choropleth map where ratios can be calculated).  The nature of our population distribution and our transportation infrastructure leads to traffic fatality data points that are highly stacked upon each other.  Each of these incidents represents a horrible event, and to paint in the raw data in a way that would obscure them would be unfortunate, and ineffective.  The mesh carves up the country into a baseline set of place-buckets (more uniform than political boundaries) of roughly equal size, into which the overall number of fatal crashes is aggregated and within which a rate of intoxication can be shown.

For a way better description of binning, check out Nate Smith's post over at MapBox.
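To make the binning idea concrete, here is a minimal Python sketch. It is not the workflow used for this map (that was Excel and QGIS); it just assumes crash locations have already been projected to planar x/y coordinates and that each one carries a yes/no intoxication flag. The function names and the (x, y, intoxicated) record layout are made up for illustration.

```python
import math
from collections import defaultdict

def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon that
    contains the planar point (x, y). `size` is the center-to-corner
    distance of each hexagon."""
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    # Round to the nearest hexagon, repairing whichever coordinate
    # drifted the most so that q + r + s stays 0.
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr

def bin_crashes(crashes, size):
    """Aggregate (x, y, intoxicated) records into hexagons.
    Returns {cell: (total_crashes, intoxication_rate)}."""
    totals = defaultdict(int)
    drunk = defaultdict(int)
    for x, y, intoxicated in crashes:
        cell = hex_cell(x, y, size)
        totals[cell] += 1
        if intoxicated:
            drunk[cell] += 1
    return {cell: (n, drunk[cell] / n) for cell, n in totals.items()}
```

Each hexagon ends up with the two numbers the map needs: a count (how much) and a ratio (what share involved intoxication).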

How to be true to the underlying population?  Once the overall number of events within each zone is added up, the visual for that zone can be scaled accordingly (a way of un-biasing the uneven populations) so that areas with lots of events appear larger, areas with fewer events appear smaller, and areas with no events disappear entirely.
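A quick sketch of that scaling step, in Python. One assumption to flag loudly: this version makes the drawn area of each hexagon proportional to its crash count (so the side length goes with the square root of the count). The map itself may have used a different scaling rule, and the function names here are hypothetical.

```python
import math

def hex_scale(count, max_count):
    """Linear shrink factor in [0, 1] chosen so that the drawn hexagon's
    AREA is proportional to its crash count (an assumption, see above).
    Zones with no crashes get 0 and vanish."""
    if count <= 0 or max_count <= 0:
        return 0.0
    return math.sqrt(min(count, max_count) / max_count)

def hex_vertices(cx, cy, size, scale):
    """Corners of a pointy-top hexagon centered at (cx, cy), shrunk
    toward its center by `scale`."""
    radius = size * scale
    return [
        (cx + radius * math.cos(math.radians(60 * i + 30)),
         cy + radius * math.sin(math.radians(60 * i + 30)))
        for i in range(6)
    ]
```

With this rule, a zone holding a quarter of the busiest zone's crashes is drawn at half the width, which is a quarter of the area.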

The result is a sort of population density map of the highway infrastructure thick enough to see at a reasonable scale and capable of holding a ratio value (intoxication, in this case).

A dark version of the same map, if you'd rather.  In this image, dark blue represents low proportions of intoxication, while brighter yellow areas show higher proportions.

The coloration of the zones is tied to the rate of events that involved intoxication.  I was quite surprised by both the general proportion intoxication plays in these events, and also how regionally varied that rate was.  Some areas have impressively low rates (like Memphis and Manhattan, and less surprisingly Salt Lake City) while others demonstrate pretty high rates (like St. Louis and pretty much all of South Carolina).
This map does not answer why some places are more likely to have a problem with drunk driving, but it can help us get a survey of how that terrain looks and get us going on asking more specific questions of places.

Tools: Excel, QGIS, and the Gimp.


Caveats (updated here 1/9/13)
This intoxication flag considers alcohol alone, based on blood alcohol tests provided in the reports.  For the 2001-2007 years in this analysis, a BAC of 0.01 or greater by anybody involved in the fatality, not necessarily just the driver, counts.  For 2008, 2009, and 2010, that element is limited to drivers only.
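Spelled out as a rule, the flag works roughly like the Python sketch below. The record layout, a list of (is_driver, bac) pairs per incident, is hypothetical and is not the actual FARS field structure; only the 0.01 threshold and the year split come from the caveat above.

```python
def involves_alcohol(year, persons, threshold=0.01):
    """Return True if an incident counts as alcohol-involved.

    persons: list of (is_driver, bac) pairs, where bac is a float or
    None when no test result was reported (hypothetical layout).
    2001-2007: anyone involved with BAC >= threshold counts.
    2008-2010: only drivers are considered.
    """
    drivers_only = year >= 2008
    for is_driver, bac in persons:
        if bac is None:
            continue  # no test result, so nothing to flag
        if drivers_only and not is_driver:
            continue  # later years ignore non-drivers
        if bac >= threshold:
            return True
    return False
```

So an incident where only a pedestrian tested positive would be flagged in the earlier years but not in 2008 through 2010, which is worth keeping in mind when comparing across that boundary.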

Here's the FARS Analytical Reference Guide:
and specifics about the interpretation of the drunk driving data element:


  1. Very interesting. Thanks for the work.

  2. I'd say it's about time to talk about Car-control. Too many needless deaths at the hands of the deadly automobile. Just imagine how many lives could be saved if we just outlaw the automobile. I mean, come on, tell me why anyone would ever need to go that fast in the first place. You have two feet, or a bike, plan ahead and you can get anywhere you want to get to and without losing your life getting there. I'm starting a new lobbying group to get local and national lawmakers to ban the ownership, operation and promotion of automobiles.

    Join me?!?!

  3. The problem is this is just a map that reflects population density. Generally speaking, it shows that if there are more people there is more drunk driving. It would help if the data could be normalized in some fashion to pull actual trends out. Even the binning described seems to have only created a pop density map and not something that will show that there are more drunk drivers per 100,000 in this rural area than the city next door.

    1. Thanks for taking an interest, and for pointing out an interesting pattern. You aren't alone in your criticism, but let me elaborate on why I did what I did in a tone that reads more defensively than it should...

      While I agree that the underlying structure of the map is inherently related to population, it is not a problem. This is a choropleth map and the coloration is the key visual dimension showing the rate of drunk driving. The zonal pattern that supports that coloration being a proxy for population is precisely the point.
      Shape: where and how much things happen. Color: the rate of a subset of those things.

      The hexagonal mesh is the polygonal framework that serves to bin, then display, that rate. These hexagons are intentionally apolitical; a similar aggregation using political boundaries would visually bias against densely populated areas (which are inherently smaller). A uniform mesh of equal-area zones is just the construct I wanted: a framework that serves as the physical distribution of the underlying population (a spatial normalization). To use census zones or counties would result in just another map where vast unpopulated regions dominate the visual landscape and muddy the truth.

      If you'd rather see discrete rankings of political entities devoid of a spatial distribution framework, you might prefer this:

  4. Times and dates might be useful for optimal trip planning ...

  5. Maybe the map just shows the most likely areas for speed traps and sobriety check points ?