Thursday, October 31, 2013

20 Unrequested Map Tips part 1

Golly I like maps.  I make them and I look at them and I imagine them in my head when I look at spreadsheets.  Some are better than others, and after a while I've noticed some things that tend to work and plenty of things that don't.  I'll tend to use examples of my own stuff because it's easy to ask my permission (not because I know everything and always get it right).  Here are some map-making tips you did not ask for...

#1: Aggregate reluctantly.
Rolling up data into bigger chunks removes information.  When you do that geographically, you inherently degrade the resolution of the data and maybe mislead (on the flip-side you get a tidy summary of a phenomenon...that may or may not be true).  And you almost certainly make it look less cool.  People are shockingly good at detecting patterns from noise (sometimes too good).  Maybe it's a good call to aggregate your map data into bigger units and that's the way to go, but don't just do it as a knee-jerk reaction to a big data set.  I mapped on mental cruise-control in that way for a long time and in hindsight missed out on lots of cool opportunities.

Aggregation inherently diminishes nuance.
Which map will give Bear a more realistic sense of the political landscape of his fellow citizens? From Election 2012.
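If you'd like to see the lost nuance as numbers, here is a minimal Python sketch.  The precinct counts are invented for illustration (they are not from Election 2012), and the function names are just my own labels.  Two made-up counties roll up to the identical county-wide share even though one is sharply divided and the other is uniformly split.

```python
# Hypothetical precincts: (votes for candidate A, total votes). Invented numbers.
county_one = [(90, 100), (10, 100), (50, 100)]  # sharply divided precincts
county_two = [(50, 100), (50, 100), (50, 100)]  # uniformly split precincts


def county_share(precincts):
    """Aggregate precinct counts up to a single county-wide share for A."""
    votes_a = sum(a for a, _ in precincts)
    total = sum(t for _, t in precincts)
    return votes_a / total


def precinct_spread(precincts):
    """Range of precinct-level shares: the nuance that aggregation discards."""
    shares = [a / t for a, t in precincts]
    return max(shares) - min(shares)


for name, precincts in (("one", county_one), ("two", county_two)):
    print(name, county_share(precincts), precinct_spread(precincts))
```

At the county level the two places are indistinguishable; only the precinct-level spread tells them apart.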

#2: Makers need to make. Give them the chance or they'll move on.
If you are a cartographer, or hire cartographers, the ones that are any good are the ones that have maps in them that are going to come out one way or another.  If a maker doesn't have a creative outlet for their own ideas, they will feel a diminished sense of purpose as the fine abrasion of solving other people's problems wears down the sharpness of their intrinsic satisfaction.
Seeking Truth is one of the three pillars of motivation in the human endeavor, according to the compact framework of Plato (by way of Jeffrey Wagner).
A solid portion of official time spent mapping what you feel like will make you, as a cartographer, more intrinsically motivated, more curious about and practiced in alternative methods, more tuned-in to the community, and more invested in the rest of the work you do.  You'll be better.  Or you'll find yourself mysteriously interested in other opportunities.

I don't have a graphic for this one.

#3: Defaults are evil! Stay away from the defaults!
You are not among the legion of mappers that are happy to crank out a GIS-ville map where not a single default was considered and changed.  But if you were...
Default whats?  Default projection (Equirectangular, see #14), default colors, default range classification (whatever the GIS threw at you -probably "natural breaks"), default range labels (truncated attribute names with underscores), and default auto-layouts full of huge north arrows, overzealous neatlines, and scalebars that label in meaningless units and precision.
Defaults are nice because they let us see our stuff right away without manually setting 19 preferences before any data is rendered.  But that's it.  Change each and every one, because you thought about them.  I've made plenty of ugly maps in my life, and they were usually the result of my lazy acceptance of the default assembly line.
Daniel Huffman did a great job of helpfully describing why certain maps fail, over at Cartastrophe.  I recognize lots of default-ridden examples in there.

I made this unfairly poor map, using the magic of defaults.


He hates these cans! Stay away from the cans! In this illustration...the cans are defaults.  And I'm Navin.


#4: Can your legend double as a supporting chart? That liability is now an asset.
Legends are like collateral damage.  An unfortunate reality of getting a job done.  They only exist because the map is too hard to decipher on its own and needs a de-coder tool, and often, that's just the way it is.  But if you can add a dimension of data into the legend then it becomes a chart and pulls its own weight.  You can think of this in reverse, too.  If your map has a companion chart (which is awesome), use the same colors, for example, as the map and in that way they support and explain each other (then just scrap that dead weight legend).

In this case, the chart steals the show while also serving to legend the maps.  From Tornado Travel Map.

Acts of piracy color-coded to time makes a helpful chart and removes the need for a dedicated legend. From the Piracy Top 5.

#5: Does your phenomenon care about political boundaries? Maybe not.
Political units are the most over-used geometry in mapping.  They are really convenient because you can download them anywhere and the lion's share of geographic data is already aggregated into those areas.  And that's fine.  But remember that you aren't mapping data, you are mapping a phenomenon.  And if your phenomenon couldn't care less about counties, states, and countries (pretty much anything that doesn't involve humans) then neither should you.
BUT even if your phenomenon doesn't care about political boundaries, adding them in as a faint reference will provide geographic context.

A tornado doesn't care what state it passes through.  But adding a minimal state reference helps readers orient themselves and identify familiar places ("What's up with the tornado-free zone around West Virginia?"). From Tornado Tracks.

#6: Only aggregate to political boundaries if you want to (should) pin the phenomenon to politics.
This is pretty similar to #5, but the overuse of political boundaries bears repeating.  What should you stuff into political polygons?  Anything that you want to associate with a politician or a cultural identity.  This includes the bulk of social scienc-y data, like population demographics, economic measures, and cultural characteristics and perceptions.  Even if mapping to political areas is the way to go, consider something other than a boring old choropleth.  Mapping data to political shapes does not automatically mean you make a choropleth.

Identifying specific counties with extremes in home vacancy rates is a natural fit for mapping with political areas. From Lights Out.

Sometimes the desire to attach a cultural place (building footprints) to a natural phenomenon (hurricane flooding) is such that you make the effort to cram a-political data into political zones (as a means of personalizing a phenomenon using relatable places). From Sandy and the Buildings of NYC.

#7: To animate or to small-multiple?  Just make both.
Seriously.  If you've made one then you've pretty much made the other.  Throw them both out there and let folks get insights via either one.  Do you get more out of cartoons or comics?  Here's a cartoon of the comic, and here's a comic of the cartoon.
But, if you do animate, be wary of interpolated (guesses) transitions between stops (knowns).  Long smooth transitions damage change detection -which is why you are animating in the first place.  Readers will notice more change in a flip-book style animation than when there is a longer transition (short-term memory leaks between data A and data B); the "pleasing" transition inherently disguises differences.




You have the materials for both if you have them for either.  Just release the both of them and hedge your carto bets.
From Somali Pirate Years.

#8: Were you asked to use a dozen colors to denote categories? Run, it’s a trap!
This is one of those instances where you should hear what you are asked to do but listen for the actual problem to solve.  This is a really big issue and a huge (and valuable) skill to develop.  You were hired for your expertise, so don't be too timid to respectfully propose preferred alternatives to a client.  But be ready to justify them.  I use this example because it seems to be the most common evidence in the map-making world of the gap between assuming the role of a technical resource (cog) vs. being a knowledge worker.  "Make them different colors" is easy to request but try to deliver an actual solution, rather than scratch off a task item (this is more easily said than done depending on the customer).
Since you know that our visual system is increasingly terrible at differentiating beyond a handful of hues, it's ok to suggest strategies that are effective (and likely more aesthetically pleasing).

Don't do this (sorry Guardian).

#9: Did you make it? Put your name on it! Were you influenced by others? Hat tips all around!
Always sign your work.  There is no better quality assurance measure than stamping your name to something.  If you are designing in autonomy it can be too easy to crank out sub-par work.  And on the flip side, you ought to be proud of the work you've done and it's right to attach ownership in this little way.  Earned esteem is another of the big three motivators.  On balance, don't be ashamed of it.
The cartographic community isn't huge, and like any ecosystem we're influenced and inspired by others.  That's terrific, and much of the fun of being among a group of familiar faces is the leapfrogging that might not happen in a professional vacuum.  Just be generous in the citations and hat tips you attach to your work.  If your map is an extrapolation of somebody else's map, say so -they'll probably be excited to see it.  If yours is an exercise in applying somebody else's aesthetics to some new data, say so -they'll likely be honored.  Plus, you'll increase your audience and maybe find yourself to be an exciting thread in your own professional network.

Develop a citation layout that you like and try to be consistent with it.  If you have a writing domain like a map blog or Visual.ly, you can elaborate on the cast of influential characters there. From Election 2012.

#10: Can you get away with encoding color right in your map's title and scrap the “legend”? Go for it.
Sometimes the phenomenon you are mapping only needs a little bit of help to let the user know what they are seeing.  Especially when you are using a few colors to represent different categories.  If you can, try coloring the names of the categories right in the map's title and do away with the overly literal math equation (this = that).

The largely-pictorial title of a commuting map intrinsically connects topic and color. From Biking & Walking to Work.

The encoding of color to meaning in this title carries over to four separate types of illustrations throughout the graphic. From Game Day Fatalities.

A map of gender dispersion in New York is legended directly in the title. From Gender Flow NYC.

Rather than a legend that noted the dot-to-people ratio (1=1) and indicated color with little equations (this colored circle = this type of commuter) the use of color in the title removes unnecessary elements and provides smoother continuity from topic to graphic. From People Dots: Seattle Area Commuting.


Give up, or are you thirsty for more?! Head over to Unrequested Map Tips 11 - 20...


  

Wednesday, October 23, 2013

Dasymetric Dot Density and the Uncanny Valley

All of cartography is a lie.  And there are pros and cons that go along with lying more truthfully.  Dasymetric dot density mapping is one way of lying more truthfully.  Here's the lowdown...

Dasymetric just means cookie-cutting the areas (like countries) for which you have data (like population) into more specific areas (NASA's zones of populated places) that do a better job of restricting those zones to where your data actually occurs.

Hmmmm...
One of the great things about dot density mapping is that it normalizes for area all by itself.  You don't have to do a ratio of population divided by square miles (a cognitive abstraction) and then key that up to a sequence of colors (another cognitive abstraction) like you would with a choropleth map.  But one of the sort-of drawbacks of dot density mapping is that the dots are randomly distributed within their areas -all you can control is how many dots.  And if the dots are representing any sort of human data (like bicyclists, demographics, votes, commuters, downloads, etc.) then you are assuredly sprinkling many of them around in places that are obviously unpopulated.  Sometimes that's ok, but sometimes it's pretty silly.  Dasymetric mapping means we pare back our areas to only those that are likely to contain the phenomenon of interest (which was collected at that lower resolution).
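For the curious, here is a minimal Python sketch of that random sprinkling, assuming simple planar coordinates and a made-up L-shaped "country."  The function names are hypothetical and this is not what any particular GIS does under the hood; it just uses rejection sampling (throw a dot into the bounding box, keep it only if it lands inside the shape).  Notice that the only real knob is the number of dots.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon's ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def scatter_dots(polygon, n_dots, rng):
    """Place n_dots uniformly at random inside polygon via rejection sampling."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < n_dots:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots


# Hypothetical L-shaped "country" in arbitrary planar units.
country = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
dots = scatter_dots(country, n_dots=200, rng=random.Random(42))
print(len(dots), "dots placed")
```

Swap the country outline for a pared-back populated area and the very same routine becomes the dasymetric version: nothing about the sprinkling changes, only the shape it is allowed to land in.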
Here's the Natural Earth cultural boundary file of countries.  It is a fantastic and generous resource, which I hope you check out.  It also has a field for estimated population, which I'll use as my example.

The countries of the world, provided by Natural Earth.

Inspired by Derek Watkins' fantastic squinty-eyed look at population densities (itself inspired by Wild Bill), I downloaded NASA's population density imagery and retained only the areas where they have estimated a population density of at least 5 people per square kilometer (seemed like a reasonable cut-off for "populated place"), converted it to a vector polygon set and then clipped the Natural Earth countries by it.
This is the result, in blue.  You can download this file for your own pursuits, here.

The countries of the world, pared back to populated areas.  Download this shapefile.
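The paring-back step can be sketched in a few lines of Python.  This is a toy stand-in for the raster-to-polygon-and-clip routine described above, not the actual GIS workflow: the density grid is invented, the function names are hypothetical, and the only number taken from the text is the cut-off of 5 people per square kilometer.  Cells below the cut-off are thrown out, and dots are only ever dropped into the cells that survive.

```python
import random

# Hypothetical density grid for one country, in people per square kilometer.
density = [
    [0.0, 0.2, 3.0, 12.0],
    [0.0, 1.5, 8.0, 40.0],
    [0.1, 0.4, 6.0, 95.0],
]
MIN_DENSITY = 5.0  # the "populated place" cut-off described above


def populated_cells(grid, threshold):
    """Return (row, col) for every cell at or above the density threshold."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value >= threshold
    ]


def scatter_in_cells(cells, n_dots, rng):
    """Drop dots at random, but only inside the retained cells."""
    dots = []
    for _ in range(n_dots):
        r, c = rng.choice(cells)
        # Random position within the chosen cell, as (x, y) in grid units.
        dots.append((c + rng.random(), r + rng.random()))
    return dots


kept = populated_cells(density, MIN_DENSITY)
dots = scatter_in_cells(kept, n_dots=50, rng=random.Random(0))
print(f"{len(kept)} of {sum(len(row) for row in density)} cells kept")
```

Note that every surviving cell is treated alike here, whether it holds 6 people per square kilometer or 95.  The threshold decides where dots may go, not how many go where.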

Before Dasymetric Clipping, and After
Here's a dot density map of the world showing population per country (each dot represents 50,000 people).  By the way, I use population because it's convenient for illustration.  So the map below is ok, and I do get a sense of which countries have more turf than people overall, but it doesn't capture how densely people live in the places where they actually live.
Take Australia for instance.  The average Australian lives in a city perched along the coast.  And the fact that the rest of the continent is pretty much devoid of humans doesn't mean that they look around their neighborhood and see fewer people.  It's an imprecise look at population for countries with vast stretches of low or unpopulated areas (i.e. pretty much everywhere except Europe).  Alaska sucks up many of the dots that really ought to go into the lower 48.  Other obvious examples of people dots happily dispersed in vast lonely tracts are Russia and Canada.

Before
A standard dot density map of population (each dot represents 50,000 people).  There is comfort in this view, because it is familiar, though that familiarity is a lie.  Or at least wildly imprecise.

Below, I did the dot density mapping only inside my new clipped country shapes, where there is actually a population.  Overall there is much less visual country-to-country variability in the density, compared to the map above, but this is a much more accurate* picture of real population density.  Check out Australia.  Now, instead of implying every neighbor lives miles away, I get a truer sense of their actual population density.  I also get previously invisible insights into local distribution, like Egypt along the Nile, Canada in scattered cities nearer the border, Russia in their southwest, Algeria along the Mediterranean, and so on.  Dots dutifully avoid deserts, water, tundra, jungle, and more deserts.

After
A sort-of-dasymetric dot density map of population (each dot still represents 50,000 people).  While each country's overall population density (not all that meaningful, when you think about it) is harder to distinguish, the local population density is way truer.  But maybe misleadingly precise?

*All of Cartography is a Lie
The act of constraining a dot density map to more realistic (dasymetric) zones has a couple of drawbacks.  The first is that I lose that raw at-a-glance illustration of an overall national population density (though I would counter with the fact that this is a truer representation of actual local density). The second, and much bigger, problem is emotional shenanigans.  A dasymetric dot density map can impart too great an expectation of precision.  Because I've taken the half-step of tighter population zones, it is entirely likely that readers assume more of its local distribution than is appropriate and take this as a direct and literal placement of people dots.  A problem of scale and false confidence.  The dot density map is still spraying dots randomly inside areas (it's just that the areas are more realistic).  India is a good example of this.  Because virtually all of India met the "do people live here" requirement (compared to scattered specks of population areas in the American west and pretty much all of Canada), all of India gets an evenly random dispersion of dots.  In reality there is a way higher density of folks living in the northeast band.  While this map doesn't have population data at that resolution, the precise speckles elsewhere in the world imply that it does.

The Cartographic Uncanny Valley
As a map-maker you will have to understand the balance of actual precision and perceived precision.  Doing your best to take steps toward an honest representation of a phenomenon could mayyyybe land you in the cartographic truth version of the uncanny valley.  Really generalized maps are clearly generalizations and are generously interpreted models of a phenomenon (at the cost of nuance).  Really precise maps are understood to be more literal pictures of where (at the cost of larger trends and scale-ability).  In-between precision can leave the reader wondering what level of scrutiny to apply which can inspire mistrust and revulsion.  So there's that.
The uncanny valley.  The hypothetical emotional response in terms of perceived realism.  This concept may come into play when a map reader is unsure of the level of generalization a map is showing them.


 

Tuesday, October 22, 2013

The United Nations of Bitcoin

This is a map of every Bitcoin download (that was able to be traced to a country), by operating system.  Each dot is an individual Bitcoin download.  The yellow dots are a general reference of all Bitcoin downloads ever while the red ones call out a specific OS used in the download.

Bitcoin downloads by country, by operating system.  At least, when the specific country was known.  Of course there is plenty of geographically anonymous downloading (here's the nitty gritty on that).

What is Bitcoin?
Bitcoin is a digital currency that lives as an open source peer-to-peer system.  All cash systems are symbolic representations of value.  Historically they were representative of actually-valuable stuff held in reserve, but increasingly it works without a net as fully fiat, whereby it has no intrinsic value but represents the say-so of the institution that regulates it.  Bitcoin takes that a step further, as a cryptocurrency, which is not representative of reserved valuable stuff, and not regulated by any central authority and relies on the virtual exchange of Bitcoins (and fractions of Bitcoins) whose value is tied to demand.  I know, I don't really get it either.  Trading goats.  I can grasp that.  Bitcoin is mysterious to me -which makes it interesting.

Closer Look
I look for three things in the geographic dispersion.  The first is the overall relative density of the yellow points across the world (which, admittedly, is pretty much a map of internet users). The second is the overall penetration of each operating system (how red is the world, overall), in terms of Bitcoin users.  The third, and most interesting for me, is the relative popularity of Bitcoin-downloading operating systems by country (how red are the various countries compared to others).






Reluctant Companion Map
And, since Bitcoin downloads are intrinsically correlated with the population size, I made this lame choropleth map to illustrate the proportional variability in Bitcoin "popularity" by country.  It paints countries by their Bitcoin downloads as a percentage of the national population...

I reluctantly made this dumb choropleth, ostensibly to tease out the proportional popularity of Bitcoin by population, but really to head off trolls who would undoubtedly point out their co-linearity if I hadn't (and of course include a link to the fun-at-first-but-way-over-used XKCD comic excoriating population-recursive heatmaps).  P.S. I recently fixed the legend, which had been high by one order of magnitude.
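The normalization behind that choropleth is just a ratio.  Here's a minimal Python sketch with invented figures (not the actual download or population counts) and a hypothetical function name.

```python
# Hypothetical figures, invented for illustration only.
countries = {
    "Country A": {"downloads": 12_000, "population": 60_000_000},
    "Country B": {"downloads": 3_000, "population": 5_000_000},
}


def downloads_pct(downloads, population):
    """Downloads as a percentage of the national population."""
    return 100.0 * downloads / population


for name, row in countries.items():
    pct = downloads_pct(row["downloads"], row["population"])
    print(f"{name}: {pct:.3f}% of population")
```

The made-up Country B has a quarter of the downloads but the higher rate, which is the kind of flip the raw dots can't show on their own.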

About Those Dots...
Yes, this is yet another foray into dot density mapping -but one that is a little more informed about the geography of population. The crux is, if I am scattering dots in countries based on the raw count of things (standard dot density territory), maybe I should more thoughtfully distribute them where people actually live (not-so-standard dot density territory).  More on how I made these maps here, but the cliff's notes version is I clipped Natural Earth's country boundaries by NASA's delineation of populated areas that have a population of at least 5 people per square kilometer.  I'm excited at the more meaningful dispersion areas that the dots get sprinkled into, but, like all of cartography, there are trade-offs.  I'll elaborate in a future post and give you my source data, so stay tuned.  In the meantime, help yourself to some twitter...

 

Friday, October 11, 2013

Adventures in Mapping: the slideshare of a recent presentation

I recently had the privilege of chatting with students in the UC Davis Health Informatics Program about the geographer's approach to data visualization.  I used a mishmash survey of the past couple of years' works as a ruse to really talk about what motivates us, the irony of passion vs. safety, leveraging our knowledge of cognition, and what happens when we feel we've lost our muse.  Whew.
I'd planned on sharing on a more personal level my recent slump and path out of it -but I was pretty sure I'd cry on video and embarrass myself so I wisely pulled those slides at the last minute.  I think it still holds up on a meaningfulness scale, though, and I'm happy with how it went (though you can't hear me over slideshare, or the thunderous chants for an encore *wink.* Oh, and the animated maps don't animate in slideshare)!
Anyway, have yourself a click through and feel free to reach out with questions or data resources...


Just imagine a nasally annotation in a heavy Midwestern accent.


They had the sweetest lecture hall I've ever been in, by the way.  Two jumbo-trons and a laser pointer (which I lost -sorry).
Thanks to Dr. Levenson and Dr. Yellowlees for the honor of the invitation, and thanks to the students for the hot-seat Q&A and discussion afterwards!


Monday, October 7, 2013

The Dispersion of Life and Gender in New York

This is a dot map of every person in New York City, colored blue-green for males and pink for females, segmented throughout their ages.  It is an animated GIF map of the movement of life and gender in the city.  Each dot represents one person.

An animated GIF dot density map of males and females throughout life, in New York City.
The sexes start out homogeneous, go super segregated in the teen years, segregate for business in the twenty-somethings, and re-couple for co-habitation years.  Then with the slow attrition of death, the lights fade into faint pockets of pink.

Click here for an unreasonably bonkers version, which will take an eternity to load.
Click here to pop out a more realistic file size for the web.
Click here for a pointlessly small size that's still too big to be an effective thumbnail.
Click here for a cute little thumbnail.
and here to insult me with your cowardice.

If you'd rather see them at-a-glance as an un-annotated set, you can go to their flickr page where they are tiled out sort of like a small multiple.

Data Sources
I'm using simple tract-level population/gender counts from the US Census Bureau.  Because their tract boundaries extend into the water and vacant area, I used NYC's Bytes of the Big Apple zoning shapes to clip the census tracts to residentially zoned areas -giving me a more realistic (and more recognizable) definition of populated areas.  The census breaks out their population counts by gender for five-year age spans ranging from teeny tiny infants through esteemed 85+ year-olds.
I had originally wanted to render the dots within the awesome actual building footprints of New York, but the census tract attribute in the PLUTO shapes is truncated and therefore largely useless for joining to other data. Believe me, I tried.  Anyway, Brandon Martin-Anderson succeeded at this already and made a beautiful bi-variate dot map of olders & youngers.
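In spirit, turning the tract-level counts into animation frames looks something like the Python sketch below.  Everything in it is an assumption for illustration: the counts are invented, the age-band labels are placeholders rather than the Census Bureau's actual field names, and dot placement inside the clipped tracts is left out entirely.  It only shows the bookkeeping of one dot per person, colored by sex, with one frame per age band.

```python
# Hypothetical tract counts: {age band: {"male": n, "female": n}}. Invented numbers.
tract_counts = {
    "Under 5": {"male": 3, "female": 2},
    "20 to 24": {"male": 1, "female": 4},
    "85 and over": {"male": 1, "female": 3},
}
COLORS = {"male": "blue-green", "female": "pink"}  # colors as described in the post


def dots_for_band(counts):
    """One dot per person: a list of color names, one entry per counted resident."""
    dots = []
    for sex, n in counts.items():
        dots.extend([COLORS[sex]] * n)
    return dots


# One animation frame per age band, in the order the bands are listed.
frames = {band: dots_for_band(counts) for band, counts in tract_counts.items()}
for band, dots in frames.items():
    print(band, len(dots), "dots,", dots.count("pink"), "pink")
```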


Hat Tips
I, along with most other cartographers these days, am really into dot density mapping.  It is way more truthful a means of presenting relative geographic dispersion and affiliation than, say, choropleth mapping, which will be the carto-whipping-post of 2013.
I was inspired by the perpetually excellent work and insights of dot density heroes, Brandon Martin-Anderson, with his bold 1-for-1 crazy-detailed dot mapping of populations, Andy Woodruff, who artfully pushes dots for his beloved Boston and articulates why with style and wit, Ken Fields, who shares thoughtful and detailed production insights, and especially Kirk Goldsberry, whose beautiful pointillist maps of Texan politi-culture were the first of this magnitude and aesthetic that I'd seen.  Also, I've dabbled in it, too, making the obligatory election map, commuting maps, and walkability maps.

A Closer Look
Excluding color, this sequence of maps is a blooming, migration, and inevitable dimming, of life itself.  The addition of the gender color dimension means we can track at a more familial level the lifeline of New York residents.

The population of babies.  I expected this to be thoroughly blended, as babies don't choose where to live and parents don't really choose neighborhoods based on the gender of their babies.


As with the map of infants, the genders are still understandably mixed.  What is interesting in this age span is the general dearth of local youngsters in Lower and Midtown Manhattan.


Certain neighborhoods begin to reveal themselves as comparatively popular places to raise teens.


Now it gets interesting.  We are in the age-span where teens/young adults can choose where to live.  And they choose paths that are not gender-neutral.  Immediately we see clusters of females and, to a lesser extent, clusters of males.  What's the deal?  College. And prisons.

Morningside Heights positively glows pink as the home of Barnard College, as do other institutions of learning sprinkled throughout Manhattan. The garment district is another draw.
We also start to see the filling of Rikers Island with green dots as young men begin to populate the jail complex.
Certainly a deviation in optimism between high-density young women neighborhoods and high-density young men neighborhoods.

Now we reach the age of the young professional.  Early twenty-somethings are nearing the end of their education and entering the workforce.  For females, this apparently happens in Midtown Manhattan.

For males, the SUNY Maritime College, and Yeshiva University have a much higher local population of males than females.  Also, of course, there is Rikers Island.

The Upper West Side and Upper East Side continue to be popular places for young women to live, but twenty-something men are showing up for work, too.  Some areas of Queens show a greenish cloud of a male-heavy population.  It appears as though women in their late twenties outnumber their male counterparts in Brooklyn.

Also notable in this age range is the migration to the suburbs.  By the way, ages 25 - 29 is the most populous group of New Yorkers.

The early 30's shows the re-blending of the sexes, having been self-segregated since their early days out on their own.  Not since infancy have the genders been so mingled.  This 30-year echo is not a coincidence, since couples tend to create infants.  Trust me.


Superblending.  Some neighborhoods are more gender stacked than others but for the most part male and female New Yorkers are blended.  Brooklyn is still pretty in pink, and the Bronx increasingly so.


The early forties kicks off the re-segregation of the sexes.  Dense pockets of women form in neighborhoods in Brooklyn -and one neighborhood in Queens just east of the river from Roosevelt Island coalesces with forty-something women.
Men are disproportionately (compared to similarly aged women) living around Chelsea.


Men in their late forties continue to outnumber women in western Midtown.  The proportion of women seems to increase at the edges of the city.


More of the same.


Overall, the diminishing population is apparent.  But the Parkchester neighborhood in the Bronx remains popular with fifty-somethings of both sexes.  Female enclaves throughout the city continue to concentrate.  Elmont, in Eastern Queens, seems an especially popular residence for women in their late fifties.

As retirement age approaches, men begin to leave Midtown.  Flushing and Elmhurst are relatively popular.


Men continue to fall at a faster rate than women, throughout the city.


The lights are going out.  Retirement communities in pockets of Manhattan and Brighton Beach emerge.



Two neighborhoods are holdouts for octogenarian men.  They make a final stand in Brighton Beach and Turtle Bay.


At 85 and older, New York is essentially pink.  Women outnumber the remaining men at a rate of better than two to one.  Various retirement communities popular with women become apparent, almost as strongly as their geographic preferences in their teens and twenties.  Those two eras mark their times without men, when whole neighborhoods are almost empty of their male peers.  The boys have moved on.