Tuesday, November 22, 2011

Illustrating Flow: Two Methods for Wrangling Big Data

The goal of the set of shipping mix maps, beyond distinguishing between the types of shipment, was to tease out the de facto shipping lanes in addition to the well-defined official ones.  With one year of data (six million locations) we saw, as you might expect, a lot of activity around ports, a few surprisingly regimented lanes in established paths, and a diaspora of dots that, if you squinted enough, hinted at some worn paths.  To make these paths more evident we used a few methods; I'll tell you about two of the cooler ones.

The shipping mix maps try to illustrate practical or unofficial seaways.

Extrapolation
The data had, in addition to other cool metrics, speed and direction information for each vessel position.  Using the vessel's heading, we extrapolated positions before and after with reasonable confidence that the vessel had been / would be there at some point.  This did a lot to enhance the linear structure, taking a sea of points and teasing out some directionality.

An illustration comparing raw locations, which don't give much indication of a travel pattern, to the extrapolated ghost locations based on predicted positions, which help to visually define paths.
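
For the curious, here is a minimal sketch of how ghost locations like these could be computed. It is not the code behind the maps: the function names, the time offsets, and the use of speed multiplied by elapsed time to get a distance are all assumptions made for illustration. It projects each position forward along its heading, and backward along the reverse heading, using the standard great-circle destination formula on a spherical Earth.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def destination(lat, lon, bearing_deg, distance_nm):
    """Great-circle destination from a start point, a bearing and a distance."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_nm / EARTH_RADIUS_NM  # angular distance in radians

    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    # Normalise longitude to the range [-180, 180).
    lon2 = (math.degrees(lam2) + 540.0) % 360.0 - 180.0
    return math.degrees(phi2), lon2


def ghost_positions(lat, lon, speed_knots, heading_deg,
                    minutes=(-30, -15, 15, 30)):
    """Extrapolated (lat, lon) pairs before and after one observed position.

    Assumption: the vessel holds its reported speed and heading over the
    whole interval. The offsets in `minutes` are hypothetical defaults.
    """
    ghosts = []
    for m in minutes:
        distance_nm = speed_knots * abs(m) / 60.0  # knots * hours
        # Negative offsets look backward along the reverse heading.
        bearing = heading_deg if m > 0 else (heading_deg + 180.0) % 360.0
        ghosts.append(destination(lat, lon, bearing, distance_nm))
    return ghosts


# Example: a vessel south of Marseille steaming due east at 12 knots.
for ghost in ghost_positions(42.5, 5.0, 12.0, 90.0):
    print(ghost)
```

A stationary vessel (speed zero) produces ghosts that sit on top of the original point, so this step only adds linear structure where something is actually moving.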


Speed Weighting

We noticed that well-worn routes were quite bright, while queuing around ports and data artifacts produced distracting blobs.  High-traffic bottlenecks also quickly hit the upper end of the heatmap's thematic range; increasing the power to bring out the finer network resulted in blowouts in the mega-traffic areas.
Using speed as a relative weighting input let us borrow intensity from the superdense areas (which generally are more slow-moving) and distribute that thematic energy to the thinner open-water paths (where vessels tend to really gun it).
The result was a set of shipping maps that illuminated high-density and open-water traffic paths alike, with a probability-enhanced indication of linear patterns.

A detail around Barcelona and Corsica.  High-volume, low-speed areas lend their brightness to lower-volume, higher-speed sea paths.

A detail around Genoa and Marseille.  An additional benefit of speed weighting is the suppression of null-speed port queues and some of the positioning artifacts around them.
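
To make the speed-weighting idea concrete, here is a rough sketch, not the actual heatmapping workflow used for these maps. It assumes the simplest possible scheme: each position contributes to its grid cell in proportion to its speed, capped at a ceiling. The function name, the cell size, and the 25-knot cap are hypothetical choices for the example.

```python
import math
from collections import defaultdict


def speed_weighted_grid(points, cell_deg=0.1, max_speed=25.0):
    """Accumulate speed-weighted intensity per grid cell.

    `points` is an iterable of (lat, lon, speed_knots) tuples.
    Assumption: weight is linear in speed and capped at `max_speed`,
    so a stationary vessel contributes nothing to the heatmap.
    """
    grid = defaultdict(float)
    for lat, lon, speed in points:
        weight = min(max(speed, 0.0), max_speed) / max_speed
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        grid[cell] += weight
    return dict(grid)


# Example: three vessels queuing at a port versus one crossing open water.
points = [
    (41.35, 2.15, 0.0),   # port queue, not moving
    (41.35, 2.15, 0.0),
    (41.35, 2.15, 0.0),
    (41.95, 6.55, 20.0),  # open water, moving fast
]
for cell, intensity in speed_weighted_grid(points).items():
    print(cell, intensity)
```

With unweighted counting the port cell would be three times brighter than the open-water cell. With this weighting the port queue drops out entirely and the single fast-moving vessel carries the intensity, which is the same trade described above.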



