How to make a heat map with hexagons, D3.js, hexbins.js, Open Street Map, Inkscape, and Paint.NET

In this post I will be showing how I made my map for the post, Average age of Brooklyn’s buildings mapped.

[Image: bk_r5_version4]

I did it with:

  • Completely open source tools
  • D3.js and hexbins.js to render in Firefox
  • Inkscape and Paint.NET for graphical editing
  • Open Street Map for the basemap

Overall Technique

The overall technique is based on using D3.js offline in the browser on a one-off basis to create a static visual that you will use elsewhere. I outline the technique in my post, Use D3.js on your desktop to publish static visualisations.

 

Get Data

You can get the CSV I used here. hexbins.js doesn’t cope well with missing values, so I cleaned out all of the records with missing values and re-saved the CSV. I also removed most of the columns, which are unnecessary here and really bulk up the file.
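If you would rather script the clean-up than do it by hand, a minimal sketch could look like the following. The column names XCoord, YCoord and YearBuilt are the ones used later in this post; the helper names (isNumeric, cleanRecords) and the sample rows are made up for illustration, and treating empty or non-numeric fields as "missing" is an assumption about the raw file.

```javascript
// Sketch: keep only the columns the map needs and drop incomplete records.
// Assumption: a value counts as missing if it is empty or not a number.
function isNumeric(value) {
  return value !== undefined && value !== null &&
    String(value).trim() !== "" && isFinite(+value);
}

function cleanRecords(rows) {
  return rows
    .filter(function (row) {
      return isNumeric(row.XCoord) && isNumeric(row.YCoord) && isNumeric(row.YearBuilt);
    })
    .map(function (row) {
      // Keep just the three columns used by the map, converted to numbers.
      return {XCoord: +row.XCoord, YCoord: +row.YCoord, YearBuilt: +row.YearBuilt};
    });
}

// Made-up sample rows, shaped like the objects d3.csv hands to its callback.
var sample = [
  {XCoord: "980000", YCoord: "190000", YearBuilt: "1925", Address: "made-up row"},
  {XCoord: "", YCoord: "190500", YearBuilt: "1931", Address: "missing x"},
  {XCoord: "981000", YCoord: "191000", YearBuilt: "n/a", Address: "missing year"}
];

var cleaned = cleanRecords(sample);
console.log(cleaned.length); // 1
```

The same function could be applied to the data inside the d3.csv callback instead of re-saving the file, at the cost of loading the full CSV every time.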

 

Creating the Heat for the Map

First of all we’re going to be using D3.js along with hexbins.js. You can read a lot about hexbins.js here.

<script type="text/javascript" src="d3.v3.js"></script>
<script src="hexbin.js"></script>

Use D3.js as you normally would; I’m not going to give a complete tutorial on that here.

The full code listing is at the bottom of this post.

First of all, add an SVG to the document. This is probably not the best way to manage widths, margins, etc., but remember this only has to work once and well enough, so elegance of code is not important. Below we have a chart that will be 1100 x 1100 pixels: a 1000 x 1000 drawing area with 50 pixels of padding around the outside.

var chartHeight = 1000;
var chartWidth = 1000;

var svg = d3.select("#chart").append("svg")
  .attr("height", chartHeight+100)
  .attr("width", chartWidth+100)
  .append("g")
    .attr("transform", "translate(50,50)");

Now we create two scales to translate the XCoord and YCoord from the Brooklyn data into coordinates in our chart. For x, we are translating 973045 to 0 and 1024202 to chartWidth, which is 1000, using linear interpolation. The y scale works the same way, except that its range is flipped to [chartHeight, 0] because SVG y coordinates grow downwards.

Note that this is possibly a HUGE MISTAKE when it comes to dealing with maps. If you’re going to work with maps, then you may have to learn about projections. The fundamental thing here is that translating the 3D points on the surface of the earth to 2D points on the surface of a computer screen is not at all straightforward, and you will always have to use a projection of some sort. Here I am using a linear projection, which can be fine, but if I combine this later with a basemap that isn’t on a linear projection, big, visible errors could be introduced. In this case, the area covered by the basemap is small enough that those errors aren’t big and aren’t visible, so it’s not a big deal.

var x = d3.scale.linear()
  .range([0, chartWidth])
  .domain([973045,1024202]);

var y = d3.scale.linear()
  .domain([146940, 208432])
  .range([chartHeight, 0]);
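To make the mapping concrete, here is a small sketch in plain JavaScript that mimics what the two linear scales above compute. It does not use D3; linearScale, xSketch and ySketch are hypothetical names for illustration, and only the domain and range numbers come from the code above.

```javascript
// Sketch of linear interpolation between a domain and a range.
// linearScale is a stand-in for illustration, not D3's implementation.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]); // 0 at domain start, 1 at domain end
    return range[0] + t * (range[1] - range[0]);
  };
}

var xSketch = linearScale([973045, 1024202], [0, 1000]);
var ySketch = linearScale([146940, 208432], [1000, 0]);

console.log(xSketch(973045));  // 0    (left edge of the chart)
console.log(xSketch(1024202)); // 1000 (right edge of the chart)
console.log(ySketch(146940));  // 1000 (bottom edge, because the range is flipped)
console.log(ySketch(208432));  // 0    (top edge)

// Pixels per data unit on each axis. The two domains have different spans
// (51157 vs 61492), so the axes are not scaled by the same factor.
var pixelsPerUnitX = 1000 / (1024202 - 973045);
var pixelsPerUnitY = 1000 / (208432 - 146940);
console.log(pixelsPerUnitX > pixelsPerUnitY); // true
```

The last lines show that a data unit covers more pixels horizontally than vertically with these domains, which is worth keeping in mind when the heat layer is later resized to fit the basemap.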

Now we create the scale that we will be using to create the “heat” in the map. We will use this to control the opacity. Here we are translating 1870 to 0 opacity and 2013 to 1.

  var opacity = d3.scale.linear()
    .domain([1870, 2013])
    .range([0,1]);    
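One thing to be aware of: a D3 linear scale extrapolates for inputs outside its domain unless you call .clamp(true) on it, so an average year below 1870 would give a negative opacity. Browsers generally treat out-of-range opacity values as if they were clamped, so the picture is unlikely to change, but the sketch below shows the difference in plain JavaScript. The function name opacitySketch is hypothetical; only the 1870 and 2013 end points come from the scale above.

```javascript
// Sketch of the opacity mapping, with optional clamping to [0, 1].
// This mimics the arithmetic of a linear scale; it is not D3 code.
function opacitySketch(year, clamp) {
  var t = (year - 1870) / (2013 - 1870);
  if (clamp) {
    t = Math.max(0, Math.min(1, t));
  }
  return t;
}

console.log(opacitySketch(1870, false)); // 0
console.log(opacitySketch(2013, false)); // 1
console.log(opacitySketch(1800, false) < 0); // true: extrapolated below the domain
console.log(opacitySketch(1800, true));  // 0 once clamped
```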

Now we use D3 to load the CSV data as usual. The code in the steps that follow goes inside the callback, in place of the ellipsis below.

d3.csv("bk-xybuild-nomissing.csv",function(data) {
...
});

Now we use hexbins.js for the first time. We create a new hexbin that is 1000 x 1000 where the hexagons have a “radius” of 30. We also teach the hexbin how to retrieve the x and y to use when we pass it the data. In this case we will give it values converted by the scales x() and y() that we created above that come from the CSV data.

  var hexbin = d3.hexbin()
    .size([chartWidth, chartHeight])
    .radius(30)
    .x(function(d) {return x(d.XCoord);})
    .y(function(d) {return y(d.YCoord);});

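Now we run the hexbin algorithm on our data. This returns an array of bins, one for each hexagon that contains data. Each bin is itself an array of the data points assigned to that hexagon, and it also carries the hexagon’s centre as x and y properties.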

  var hexBinsData = hexbin(data);
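If you are curious what assigning a point to a hexagon involves, here is a simplified sketch in plain JavaScript. It is not the hexbin.js source: nearestHexCentre and binPoints are hypothetical names, and it assumes pointy-topped hexagons whose "radius" is the distance from centre to corner, so centres sit √3 × radius apart within a row and rows sit 1.5 × radius apart. Each point simply goes to the nearest hexagon centre.

```javascript
// Simplified sketch of hexagonal binning for pointy-topped hexagons.
// Assumption: "radius" is the centre-to-corner distance of each hexagon.
function nearestHexCentre(px, py, radius) {
  var dx = radius * Math.sqrt(3); // distance between centres within a row
  var dy = radius * 1.5;          // distance between neighbouring rows
  var rowBelow = Math.floor(py / dy);
  var best = null;
  // The nearest centre always lies in one of the two rows around the point.
  for (var row = rowBelow; row <= rowBelow + 1; row++) {
    var offset = (row & 1) ? dx / 2 : 0; // odd rows are shifted by half a hexagon
    var col = Math.round((px - offset) / dx);
    var cx = col * dx + offset;
    var cy = row * dy;
    var dist = (px - cx) * (px - cx) + (py - cy) * (py - cy);
    if (best === null || dist < best.dist) {
      best = {id: col + "," + row, x: cx, y: cy, dist: dist};
    }
  }
  return best;
}

// Group points into bins: each bin is an array of points with x and y
// properties holding the hexagon centre, like the structure described above.
function binPoints(points, radius) {
  var binsById = {};
  var bins = [];
  points.forEach(function (point) {
    var centre = nearestHexCentre(point.x, point.y, radius);
    var bin = binsById[centre.id];
    if (!bin) {
      bin = binsById[centre.id] = [];
      bin.x = centre.x;
      bin.y = centre.y;
      bins.push(bin);
    }
    bin.push(point);
  });
  return bins;
}

var bins = binPoints([{x: 0, y: 0}, {x: 1, y: 1}, {x: 52, y: 0}], 30);
console.log(bins.length);    // 2
console.log(bins[0].length); // 2 (the first two points share a hexagon)
```

The real plugin is more careful and much faster on large inputs, but the shape of the result is the same idea: an array of arrays, each tagged with its hexagon’s centre.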

Now we create a special function that we will use to calculate the average for each hexagon. It takes in d, which is the array of data points assigned to the hexagon, and calculates the average of YearBuilt for that hexagon.

  var averageFunction = function(d) {
    var sum = 0;
    for(var i = 0; i < d.length; i++)
    {
      sum += +d[i].YearBuilt;
    }
    return sum/d.length;
  };
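As a quick check, the function can be tried on a hand-made bin. The sketch below repeats averageFunction so that it runs on its own; mockBin and its values are invented for illustration and mimic the array-with-x-and-y shape that the hexbin step returns. Note that d3.csv delivers every field as a string, which is why the unary plus in +d[i].YearBuilt matters: without it the += would concatenate strings instead of adding numbers.

```javascript
// Same averaging function as above, repeated so this sketch is self-contained.
var averageFunction = function(d) {
  var sum = 0;
  for (var i = 0; i < d.length; i++) {
    sum += +d[i].YearBuilt; // unary plus converts the CSV string to a number
  }
  return sum / d.length;
};

// Made-up bin: three buildings, with years stored as strings as d3.csv would load them.
var mockBin = [{YearBuilt: "1900"}, {YearBuilt: "1950"}, {YearBuilt: "2000"}];
mockBin.x = 120; // hexagon centre (arbitrary values for illustration)
mockBin.y = 45;

console.log(averageFunction(mockBin)); // 1950
```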

Finally we do the very normal D3 thing of binding the hexBinsData to hexagon paths in the SVG. We give them a class of hexagon, a path as defined by hexbin.hexagon(), a transform to translate the hexagon by its x and y coordinates, and a fill colour. Finally we also change the fill-opacity. We use the averageFunction to calculate the average YearBuilt for the hexagon, and then we convert it using our opacity scale to a value between 0 and 1.

  svg.selectAll(".hexagon")
    .data(hexBinsData).enter()
    .append("path")
      .attr("class", "hexagon")
      .attr("d", hexbin.hexagon())
      .attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; })
      .style("fill", "#0000FF")
      .style("fill-opacity", function(d) { return opacity(averageFunction(d)); });

The result should look something like this:

[Image: radius_5_400]

 

Extract SVG from the Browser

Use Firefox to copy the generated SVG out of the browser, paste it into a text editor, and save it as an SVG file. I explain more here: Use D3.js on your desktop to publish static visualisations.

 

Get a Basemap from Open Street Map

Open Street Map is great. If you aren’t aware of it, then you should be. I tell people it’s like “Wikipedia, but for maps” which is accurate enough. I’ve used Open Street Map to navigate 20,000 km of driving through 7 countries in South America at the time of writing this post, so I can say it’s pretty good.

We want to layer the heat we created above on top of a map of Brooklyn.

Go to http://www.openstreetmap.org and click on the Share icon on the right-hand side.


You will see a menu that allows you to export PNGs from Open Street Map. There are some practical limits to how much detail you are allowed to get based on how large an area you select, but in any case it should be sufficient to make a large-ish map. If not, you may have to stitch several exports together manually.


 

 

Convert SVG to PNG

Since our OSM data is in PNG format, we’ll want our heat to be as well. We could operate in SVG, but that can be very computer intensive with map data. Open the SVG in Inkscape and convert to a PNG.

 

Combine with Paint.NET

Not much to say here. I took the PNG from OSM, converted it to black and white, then pasted the heat in as a layer on top and adjusted its position and size until it matched the map below. In this instance I was able to line up the water and green space with the data, as there was no YearBuilt data for these places. For other data, lining up with the map would be more complicated.

 

Done!

264,534 data points aggregated onto one heatmap of Brooklyn, all using free open source tools and nothing specialized for map-making.

 

Full JavaScript code listing:

var chartHeight = 1000;
var chartWidth = 1000;

var svg = d3.select("#chart").append("svg")
  .attr("height", chartHeight+100)
  .attr("width", chartWidth+100)
  .append("g")
    .attr("transform", "translate(50,50)");

var x = d3.scale.linear()
  .range([0, chartWidth])
  .domain([973045,1024202]);

var y = d3.scale.linear()
  .domain([146940, 208432])
  .range([chartHeight, 0]);

d3.csv("bk-xybuild-nomissing.csv",function(data) {
  // Configure the hexagonal binning and tell it where each point goes on the chart.
  var hexbin = d3.hexbin()
    .size([chartWidth, chartHeight])
    .radius(30)
    .x(function(d) {return x(d.XCoord);})
    .y(function(d) {return y(d.YCoord);});

  // Map the average year built to an opacity between 0 and 1.
  var opacity = d3.scale.linear()
    .domain([1870, 2013])
    .range([0,1]);

  // Assign every data point to a hexagon.
  var hexBinsData = hexbin(data);

  // Average YearBuilt over the data points in one hexagon.
  var averageFunction = function(d) {
    var sum = 0;
    for(var i = 0; i < d.length; i++)
    {
      sum += +d[i].YearBuilt;
    }
    return sum/d.length;
  };

  // Draw one path per hexagon, shaded by its average year built.
  svg.selectAll(".hexagon")
    .data(hexBinsData).enter()
    .append("path")
      .attr("class", "hexagon")
      .attr("d", hexbin.hexagon())
      .attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; })
      .style("fill", "#0000FF")
      .style("fill-opacity", function(d) { return opacity(averageFunction(d)); });
});

Use D3.js on your desktop to publish static visualisations

D3.js is a great tool for creating “Data-Driven Documents”. As a JavaScript library, its natural primary use is on the web, rendering visualizations in the browser. It can also, however, be used offline on your PC to generate data-driven content on a one-off basis. For someone familiar with D3, it can be a rapid and powerful way of scripting data into visual elements for use in a project that will not necessarily be published live and interactive on the web. In this post I will outline my simple method for using D3.js to script elements in static work.

Use D3.js to render content in the browser

Use D3 as you normally would to load CSVs and bind data to document elements. It doesn’t have to be polished, properly wired together, or compatible with various browsers. This only has to work once, well enough, and on your machine in order to move it on to the next step in your process. It can get whatever polish it needs later in your graphical editing programs. JavaScript in the browser is your chance to do anything in a scripted way that would otherwise be a huge pain. The truth is, you could create all kinds of data visualisations by hand with an infinite amount of time and patience to calculate values and draw rectangles of exactly the right length and shade. But you don’t have that.

Use Firefox. Like the rest of the world I’ve moved on to Chrome, but I still have Firefox installed on my computer for this purpose. I neither know nor need to know the full technical detail on this, but certain security settings to prevent cross-site scripting attacks make it difficult to use D3.js to load CSV data locally from your machine in Chrome and Internet Explorer. There are ways around this by running a tiny webserver on your desktop, but I find the simplest thing is to use Firefox, which seems to be less picky about its security on this issue.
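For anyone who does want to go the tiny-webserver route, here is a minimal sketch using Node.js. It assumes Node is installed, which is not part of the workflow described in this post; createStaticServer, contentTypeFor and the port number are hypothetical choices for illustration, and this is only meant for local, one-off use.

```javascript
// Minimal static file server sketch for local, one-off use with Node.js.
var http = require("http");
var fs = require("fs");
var path = require("path");

// Hypothetical lookup table; extend it for other file types you serve.
var CONTENT_TYPES = {
  ".html": "text/html",
  ".js": "application/javascript",
  ".csv": "text/csv",
  ".svg": "image/svg+xml"
};

function contentTypeFor(fileName) {
  return CONTENT_TYPES[path.extname(fileName).toLowerCase()] || "application/octet-stream";
}

function createStaticServer(rootDir) {
  var root = path.resolve(rootDir);
  return http.createServer(function (request, response) {
    // Drop any query string and map "/" to index.html.
    var urlPath = request.url.split("?")[0];
    if (urlPath === "/") {
      urlPath = "/index.html";
    }
    var filePath = path.join(root, urlPath);
    // Refuse paths that would escape the folder being served.
    if (filePath.indexOf(root + path.sep) !== 0) {
      response.writeHead(403);
      response.end("Forbidden");
      return;
    }
    fs.readFile(filePath, function (error, contents) {
      if (error) {
        response.writeHead(404);
        response.end("Not found");
        return;
      }
      response.writeHead(200, {"Content-Type": contentTypeFor(filePath)});
      response.end(contents);
    });
  });
}

// Usage (uncomment to run): serves the current folder at http://localhost:8000/
// createStaticServer(".").listen(8000);
```

With the page, the scripts and the CSV in one folder, opening the page through http://localhost:8000/ rather than as a local file should let any browser load the data.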

Export SVG from the browser

Now that you’ve used D3 to render SVG in the browser based on your data, you can reach into that content and extract it and save it off to a file!

Firefox comes with its own inspector, so you don’t even have to use Firebug. If you right-click on your image and pick Inspect Element, you should gain access to a tree view of the Document Object Model that will allow you to find the code for your SVG. It will look like <svg>…stuffhere…</svg>

[Image: innerhtml]

In this example here, the SVG content we want is <svg height="1100" width="1100"> .. </svg>. All this is just a bunch of text and we want to get it onto the clipboard. The way we do this is to right-click on the node that contains the SVG and choose Copy Inner HTML. In this case that is the <div id="chart">.

Now that the SVG is on the clipboard, now what? Open up a text editor (maybe Notepad, maybe Notepad++), paste in the text and save it off as a “.svg” file.
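One thing that can trip up a standalone .svg file is a missing namespace declaration: markup copied out of an HTML page does not necessarily carry the xmlns attribute, and some viewers will not render the file without it. Whether your copy needs this depends on the browser, so treat the following as an optional sketch; ensureSvgNamespace is a hypothetical helper, not part of D3 or Firefox.

```javascript
// Sketch: add the SVG namespace to copied markup if it is not already there.
function ensureSvgNamespace(svgText) {
  // Leave the text alone if the opening <svg> tag already declares xmlns.
  if (/<svg[^>]*\sxmlns=/.test(svgText)) {
    return svgText;
  }
  // Otherwise insert the declaration right after the first "<svg".
  return svgText.replace(/<svg(?=[\s>])/, '<svg xmlns="http://www.w3.org/2000/svg"');
}

var copied = '<svg height="1100" width="1100"><g></g></svg>';
var fixed = ensureSvgNamespace(copied);
console.log(fixed);
// <svg xmlns="http://www.w3.org/2000/svg" height="1100" width="1100"><g></g></svg>
```

You can just as easily type the attribute into the opening tag by hand in your text editor before saving.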

Use Inkscape to access SVG content

There’s not really much more to be said here. If you open up your new SVG file with Inkscape, you’ll find it malleable and in your hands like any other graphic. Combine, edit, and label. Publish.

New Project: K-means Clustering

When it comes to data visualisation design, it’s always important to consider your purpose and your audience. Are you trying to convince your audience of a particular point of view? Are you giving your audience a platform from which to explore and find their own insights? In my latest piece I take a step down a less discussed path.

I have created an interactive tool using D3.js that gives the user a chance to see and interact with the typical k-means clustering algorithm from data mining/machine learning. It is my hope that it will enable students to develop an intuition for how the algorithm works and a better appreciation of its shortcomings.

You can learn more about k-means clustering here.

K-means Clustering