## Acknowledgements

### Mike

First and foremost I would like to express my thanks to Mike Bostock, the driving force behind d3.js. His efforts are tireless and his altruism in making his work open and available to the masses is inspiring.

The decision for him to leave what must have been an incredible job with the New York Times to return to improving visualisation software (d3.js in particular) has marked him as a very special person indeed. If any reader of this book has the opportunity to support his continuing efforts, please do.

### Partners, Supporters and Contributors

Mike has worked with a crew of like-minded individuals in bringing D3 to the World. Vadim Ogievetsky and Jeffrey Heer share honours for the work on D3: Data-Driven Documents and while there has been a cast of over 40 people contributing to the D3 code base, Jason Davies stands out as the man who has provided a generous portion especially in the area of mapping.

Nick Zhu has created a fantastic resource in dc.js (which is built on top of d3.js and crossfilter) and has been kind enough to provide good advice and permission to include some of his work.

Advice given by Christophe Viau has been a great help in getting me settled into the on-line world and his energy in managing and directing the D3 community is amazing.

Mike Dewar (Getting Started with D3), Scott Murray (Interactive Data Visualization for the Web) and Sebastian Gutierrez (dashingd3js.com) lead the pack for providing high quality reference material for learning D3. Many thanks gentlemen.

I am particularly grateful for the assistance given by Filiep Spyckerelle and Robin Bennett who selflessly donated their time and expertise in proofreading the earlier edition of D3 Tips and Tricks (d3.js v3) (where this document contains any errors, they are most certainly mine).

In fact Robin has been very quick off the mark and is feeding back areas for improvement in the new book already!

### The d3.js Community

Big thanks go out to the D3 community, whether for providing advice on Google Groups or Stack Overflow, contributing examples on bl.ocks.org or just giving back in the form of time and effort to similar work. Well done all.

### Cover art

Out of the blue and in yet another example of the friendly and giving nature of people involved in this community I was contacted by Jose (‘Tactician Jenro’) who offered to use his skills to design a cover for the original book. He has subsequently designed the cover for this version and I think he did an awesome job and was super helpful. If you think that he could help you out with a project, you can get in touch with him at @tacticianjenro or via his web site at http://mindthetimes.xyz/.

### Leanpub

Lastly, I want to pay homage to Leanpub who have made the publishing of this document possible. They offer an outstanding service for self-publishing and have made the task of providing and distributing content achievable.

## What is d3.js?

d3.js (hereafter abridged as D3) is “a JavaScript library for manipulating documents based on data”.

But that description doesn’t do it justice.

D3 is all about helping you to take information and make it more accessible to others via a web browser.

It’s a JavaScript library. That means that it’s a software tool that can be used in conjunction with other software tools to achieve a task. Those other tools are based on web standards such as HTML, SVG and CSS but you don’t need to know too much about them to start using D3 (although it will help :-)).

It’s an open framework, which means that there are no hidden mysteries about how it does its magic and it allows others to contribute to a constant cycle of improvement.

Being built to leverage web standards means that modern browsers don’t have to do anything special to use D3, they just have to support the framework that the Internet has adopted for ease of use.

The beauty of D3 is that it allows you to associate data and what appears on the screen in a way that directly links the two. Change the data and you change the object on the screen. D3’s trick is to let you set what appears on the screen. A circle, a line, a point on a map, a graph, a bouncing ball, a gradient (and way, way more). Once the data and the object are linked the possibilities are endless.

D3 bridges the gap between the static display of data and the desire of people to represent it dynamically. That applies equally to the developer who wants to show something cool and to the end user who wants to be able to explore information interactively.

It was (and still is being) developed by Mike Bostock who has not just spent time writing the code, but writing the documentation for D3 as well. There is an extensive community of supporters who also contribute to the code, provide technical support online and generally have fun creating amazing visualizations. Their contributions are extraordinary (you only have to look at the work of Jason Davies to be amazed).

This book has been written to incorporate the changes in version 4 of d3.js to the original edition of D3 Tips and Tricks. If you’re looking for the equivalent for version 3 you can find it here.

## Introduction

I never set out to write a treatise on D3… But here I am three years after publishing the first version of this book and I’m in the process of updating it for version 4 of d3.js, while also looking back at a range of other books that have been written as a result of this first foray into publishing.

I am a simple user of this extraordinary framework and when I say simple, I really mean I had no idea how to get it to do anything when I started; I needed to do a lot of searching and learned by trial-and-error (emphasis on the errors which were entirely mine). The one thing that I did know was that the example graphics shown by Mike Bostock and others were the sort of graphical goodness that I wanted to play with.

So to get from the point of having no skills whatsoever to the point where I could begin to code up something to display data in a way I wanted, I had to capture the information as I went. The really cool thing about this sort of process is that it doesn’t need to occur all at once. You can start with no knowledge whatsoever (or pretty close) and by standing on the shoulders of others’ work, you can add building blocks to improve what you’re seeing and then change the blocks to adapt and improve.

For example (and this is pretty much how it started), I wanted to draw a line graph, so I imported an example and then got it running locally on my computer. Then I worked out how to change the example data for my data. Then I worked out how to move the Y axis from the right to the left. Then how to make the axis labels larger, change the tick size, make the lines fatter, change the colour, add a label, fill the area under the graph, put the graph in the centre of the page, add a glow to the text to help it stand out, put it in a framework (Bootstrap), add buttons to change data sets, animate the transitions between data sets, update the data automatically when it changed, add a pan and zoom feature, turn parts of the graph into hyperlinks to move to other graphs… And then I started on bar graphs :-).

The point to take away from all of this is that any one graph is just a collection of lots of blocks of code, each block designed to carry out a specific function. Pick the blocks you want and implement them.

I found it was much simpler to work on one thing (block) at a time, and this helped greatly to reduce the uncertainty factor when things didn’t work as anticipated. I’m not going to pretend that everything I’ve done while trying to build graphs employs the most elegant or efficient mechanism, but in the end, if it all works on the screen, I walk away happy :-). That’s not to say I have deliberately ignored any best practices – I just never knew what they were. Likewise, wherever possible, I have tried to make things as extensible as possible.

D3 has also steered down the road of providing standalone micro-libraries available as components and this flexibility continues to redefine the maxim of change being the only constant in the software world.

You will find that I have typically eschewed a simple “Do this” approach for more of a storytelling exercise. This means that some explanations are longer and more flowery than might be to everyone’s liking, but there you go, try to be brave :-)

I’m sure most authors try to be as accessible as possible. I’d like to do the same, but be warned… There’s a good chance that if you ask me a technical question I may not know the answer. So please be gentle with your emails :-).

Email: ‘d3noobmail+contact@gmail.com’

## What do we need to get started?

Let’s be honest with each other. D3 is not the simplest way to draw a graph.

However, that doesn’t mean that it’s beyond those with a little computer savvy and a willingness to experiment. Remember failure is your friend (I am fairly sure that I am also related by blood). Just learn from your mistakes and it’ll all work out.

So, here in no particular order is a list of good things to know. None of which are essential, but any one (or more) of which will make your life slightly easier.

• HyperText Markup Language (HTML)
• JavaScript
• Web Servers
• PHP

### HTML

This stands for HyperText Markup Language and is the stuff that web pages are made of. Check out the definition and other information on Wikipedia for a great overview. Just remember that all you’re going to use HTML for is to hold the code that you will use to present your information. This will be as a .html (or .htm) file and they can be pretty simple (we’ll look at some in a moment).

### JavaScript

JavaScript is what’s called a ‘scripting language’. It is the code that will be contained inside the HTML file that will make D3 do all its fanciness. In fact, D3 is a JavaScript library, so JavaScript is the native language for using D3.

Knowing a little bit about this would be really good, but to be perfectly honest, I didn’t know anything about it before I started. I read a book along the way (JavaScript: The Missing Manual from O’Reilly) and that helped with context, but the examples that are available for D3 graphics are understandable, and with a bit of trial and error, you can figure out what’s going on.

In fact, most of this collection of information is about providing examples and explanations for the JavaScript components of D3.

### Cascading Style Sheets (CSS)

Cascading Style Sheets (everyone tends to call them ‘Style Sheets’ or ‘CSS’) is a language used to describe the formatting (or “look and feel”) of a document written in a markup language. The job of CSS is to make the presentation of the components you will draw with D3 simpler by assigning specific styles to specific objects. One of the cool things about CSS is that it is an enormously flexible and efficient method for making everything on the screen look more consistent. When you want to change the format of something, you can just change the CSS component and the whole look and feel of your graphics will change.

### Web Servers

Web servers can go one of two ways. If you have access to a web server and know where to put the files so that you can access them with your browser, you’re in a good place. If you’re not quite sure, read on…

A web server will allow you to access your HTML files and will provide the structure that allows them to be displayed on a web browser. There are some simple instructions on the main D3 wiki page for setting up a local server. Or you might have access to a remote one and be able to upload your files. However, for a little more functionality and a whole lot of ease of use, I can thoroughly recommend WampServer (WAMP) as a free and simple way to set up a local web server that includes PHP and a MySQL database (more on those later). Go to the WampServer web page (http://www.wampserver.com/en/) and see if it suits you.

Throughout this document I will be describing the files and how they’re laid out in a way that has suited my efforts while using WAMP, but they will work equally well on a remote server. I will explain a little more about how I arrange the files later in the ‘Getting D3’ section.

There are other options of course. You could host code on GitHub and present the resulting graphics on bl.ocks.org. This is a great way to make sure that your code is available for peer review and sharing with the wider community.

One such alternative option that I have recently started playing with is Plunker (http://plnkr.co/). This is a lightweight collaborative online editing tool. It’s so cool I wrote a special section for it which you can find later in this document. This is definitely worth trying if you want to use something simple without a great deal of overhead. If you like what you see, perhaps consider an alternative that provides a greater degree of capability if you go on to greater d3.js things.

### PHP

PHP is a scripting language for the web. That is to say that it is a programming language which is executed when you load web pages and it helps web pages do dynamic things.

You might think that this sounds familiar and that JavaScript does the same thing. But not quite.

JavaScript is designed so that it travels with the web page when it is downloaded by a browser (the client). However, PHP is executed remotely on the server that supplies the web page. This might sound a bit redundant, but it’s a big deal. This means that the PHP which is executed doesn’t form part of the web page, but it can form the web page. The implication here is that the web page you are viewing can be altered by the PHP code that runs on a remote server. This is the dynamic aspect of it.

In practice, PHP could be analogous to the glue that binds web pages together, allowing different portions of the web page to respond to directions from the end user.

It is widely recognised not only as a relatively simple language to learn, but also as a fairly powerful one. At the same time it comes in for criticism for being somewhat fragmented and sometimes contradictory or confusing. But in spite of any perceived shortcomings, it is a very widely used and implemented language and one for which there is no obvious better option.

### Other Useful Stuff

#### Text Editor

A good text editor for writing up your code will be a real boost. Don’t make the fatal mistake of using an office word processor or similar. THEY WILL DOOM YOU TO A LIFE OF MISERY. They add in crazy stuff that you can’t even see and never save the files in a way that can be used properly.

Preferably, you should get an editor that will provide some assistance in the form of syntax highlighting which is where the editor knows what language you are writing in (JavaScript for example) and highlights the text in a way that helps you read it. For example, it will change text that might appear as this;

Into something like this;

Infinitely easier to use. Trust me.

There are plenty of editors that will do the trick. I have a preference for Geany, mainly because it’s what I started with and it grew on me :-).

#### Getting D3

Luckily this is pretty easy and could go one of two ways.

##### Host d3.js locally

Go to the D3 repository on github and download the entire repository by clicking on the ‘ZIP’ button.

What you do with it from here depends on how you’re hosting your graphs. If you’re working on them on your local PC, then you will want to have the d3.js file in the path that can be seen by the browser. Again, I would recommend WAMP (a local web server) to access your files locally. If you’re using WAMP, then you just have to make sure that it knows to use a directory that will contain the d3 directory and you will be away.

##### Use a remote CDN to always use the latest version of d3.js

The alternative to downloading d3.js and using it locally is to always retrieve it from an online source. For d3.js this could be done by including the following line in our HTML file; <script src="https://d3js.org/d3.v4.min.js"></script>. This method has the advantage of always using the latest release of version 4 of D3 and is especially useful if your visualisations are hosted somewhere like bl.ocks.org.

##### Potential directory structure

The following image is intended to provide a very crude overview of how we can set up the directories for our web server.

• webserver: Use this as our ‘base’ directory where we put the files that we create. That way when we open our browser we point to this directory and it allows us to access the files like a normal web site.
• d3: This would be our unzipped d3 directory. It contains all the examples and more importantly the d3.v4.js file that we need to get things going. To do this we would include a line like the following; <script src="d3/d3.v4.js"></script>

This tells our browser that from the file it is running (one of the html graph files) if it goes into the ‘d3’ folder it will find the d3.v4.js file that it can load.

• data: I use this directory to hold any data files that I would use for processing. For example, you will see the following line in the code examples that follow d3.csv("data/data.csv", function(error, data) {. Again, that’s telling the browser to go into the ‘data’ directory and to load the ‘data.csv’ file.
• js: Often we will find that we will want to include other JavaScript libraries to load. This is a good place to put them.
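
As a rough illustration of how the relative paths in this layout resolve, the sketch below uses the standard `URL` constructor (available in modern browsers and in Node.js). The `http://localhost/webserver/` address, and the idea that simple-graph.html sits directly in the ‘webserver’ directory, are assumptions made purely for the example; your own server may present the directory under a different address.

```javascript
// Assumed address of a graph page sitting in the 'webserver' base directory.
var page = "http://localhost/webserver/simple-graph.html";

// Relative paths are resolved against the page that references them,
// so 'd3/...' and 'data/...' end up inside the 'webserver' directory.
var d3Path = new URL("d3/d3.v4.js", page).href;
var dataPath = new URL("data/data.csv", page).href;

console.log(d3Path);
console.log(dataPath);
```

The practical upshot is that as long as the ‘d3’ and ‘data’ directories sit alongside the HTML file, the short relative paths are all the browser needs.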

#### Where to get information on d3.js

D3 has made huge advances in providing an extensible and practical framework for manipulating data as web objects. At the same time there has been a significant increase in information available for people to use it. The following is a far from exhaustive list of sources, but from my own experience it represents a useful subset of knowledge.

##### d3js.org

d3js.org would be the first port of call for people wanting to know something about d3.js.

From the overview on the main page you can access a dizzying array of examples that have been provided by the founder of d3 (Mike Bostock) and a host of additional developers, artists, coders and anyone who has something to add to the sum knowledge of cool things that can be done with D3.

There is a link to a documentation page that serves as a portal to the ever important API reference, contributed tutorials and other valuable links (some of which I will mention in paragraphs ahead).

It is difficult to overstate the volume of available information that can be accessed from d3js.org. It stands alone as the one location that anyone interested in D3 should visit.

##### Google Groups

There is a Google Group dedicated to discussions on d3.js.

In theory this forum is for discussions on topics including visualization design, API design, requesting new features, etc., with a specific direction made in the main header that “If you want help using D3, please use the d3.js tag on Stack Overflow!”.

In practice, however, it would appear that a sizeable proportion of the posts there are technical assistance requests of one type or another. This does mean that if you’re having a problem, there could already be a solution posted there. However, the intention is certainly that people use Stack Overflow wherever possible, so that should be the first port of call for those types of inquiry.

So, by all means add this group as a favourite and this will provide you with the opportunity to receive emailed summaries of postings or just an opportunity to easily browse recent goings-on.

##### Stack Overflow

Stack Overflow is a question and answer site whose stated desire is “to build a library of detailed answers to every question about programming”. Ambitious. So how are they doing? Actually really well. Stack Overflow is a fantastic place to get help and information. It’s also a great place to help people out if you have some knowledge on a topic.

They have a funny scheme for rewarding users that encourages providing good answers based on readers voting. It’s a great example of gamification working well. If you want to know a little more about how it works, check out this page; http://stackoverflow.com/about.

They have a d3.js tag (http://stackoverflow.com/questions/tagged/d3.js) and like Google Groups there is a running list of different topics that are an excellent source of information.

##### Github

Github is predominantly a code repository and version control site. It is highly regarded for its technical acumen and provides a fantastic service that is broadly used for many purposes. Not the least of which is hosting the code (and the wiki) for d3.js.

Whilst not strictly a site that specialises in providing a Q & A function, there is a significant number of repositories which mention d3.js. With the help from an astute search phrase, there is potentially a solution to be found there.

The other associated feature of Github is Gist. Gist is a pastebin service (a place where you can copy and paste code) that can provide a ‘wiki like’ feature for individual repositories and web pages that can be edited through a Git repository. Gist plays a role in providing the hub for the bl.ocks.org example hosting service set up by Mike Bostock.

For a new user, Github / Gist can be slightly daunting. It’s an area where you will get most value by understanding something about the services before you start using them. This is certainly true if you want to make use of its incredible features that are available for hosting code. However, if you want to browse other people’s code it’s an easier introduction. Have a look through what’s available and if you feel so inclined, I recommend that you learn enough to use their service. It’s time well spent.

##### bl.ocks.org

bl.ocks.org is a viewer for code examples which are hosted on Gist. You are able to load your code into Gist, and then from bl.ocks.org you can view them.

This is a really great way for people to provide examples of their work and there are many who do. However, it’s slightly tricky to know what is there. There is a project that will help with searching the bl.ocks for key words. This is an immensely valuable service that should be a highlight for someone wanting inspiration or assistance.

I would describe the process of getting your own code hosted and displaying as something that will be slightly challenging for people who are not familiar with Github / Gist, but again, in terms of visibility of the code and providing an external hosting solution, it is excellent and well worth the time to get to grips with.

##### Twitter

Twitter is certainly a great way to keep in touch on an hour by hour basis with people who are involved with d3.js and this can be accomplished in a couple of ways. First, find as many people from the various D3 sites around the web who you consider to be influential in areas you want to follow (different aspects such as development, practical output, education, etc.) and follow them. Even better, I found it useful to find a small subset who I considered to be influential people and I noted who they followed. It’s a bit ‘stalky’ if you’re unfamiliar with it, but the end result should be a useful collection of people with something useful to say.

##### Books

The following books are referenced on the D3 wiki;

Of course, there is also the original paper that launched D3: “D3: Data-Driven Documents” by Michael Bostock, Vadim Ogievetsky and Jeffrey Heer (IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis), 2011).

## Starting with a simple graph

We’ll start with the full code for a simple graph and then we can go through it piece by piece.

Here’s what the basic graph looks like;

And here’s the code that makes it happen;

The full code for this example can be found on github or in the code samples bundled with this book (simple-graph.html and data.csv). A live example can be found on bl.ocks.org. Please note that the <head></head> tags are omitted which is a common thing for d3 examples (It’s presumably an effort to reduce potentially distracting code for when modern browsers can cope with the omission).

Once we’ve finished working through the explanation of the functional blocks that make up the graph, we’ll start looking at what we need to add in and adjust so that we can incorporate other useful functions that are completely reusable in other diagrams as well.

Working on the premiss that we can break the file down into component parts we will explain the major blocks as HTML, CSS and JavaScript. I’m going to play kind of fast and loose here, but never fear, it’ll all make sense.

### HTML

Here’s the HTML portion of the code;

Compare it with the full code. It kind of looks like a wrapping for the CSS and JavaScript. You can see that it really doesn’t boil down to much at all (that doesn’t mean it’s not important).

There are plenty of good options for adding additional HTML stuff into this very basic part of the file, but for what we’re going to be doing, we really don’t need to bother too much.

One thing probably worth mentioning is the line;

That’s the line that identifies the file that needs to be loaded to get D3 up and running. In this case the file is sourced from the official d3.js repository on the Internet (that way we are using the most up to date release of version 4). The D3 file is actually called d3.v4.min.js which may come as a bit of a surprise. That tells us that this is version 4 of the d3.js file (the v4 part) which is an indication that it is separate from the v3 release, which was superseded in the middle of 2016. The other point to note is that this version of d3.js is the minified version (hence min). This means that any extraneous information has been removed from the file to make it quicker to load.

Later when doing things like implementing integration with bootstrap (a pretty layout framework) we will be doing a great deal more, but for now, that’s the basics done.

The two parts that we left out are the CSS and the D3 JavaScript.

### CSS

The CSS is as follows;

Cascading Style Sheets (CSS) give you control over the look / feel / presentation of web content. The idea is to define a set of properties to objects in the web page.

They are made up of ‘rules’. Each rule has a ‘selector’ and one or more ‘declarations’ and each declaration has a property and a value (or a group of properties and values).

For instance in the example code for this web page we have the following rule;

.line is the selector. The period (.) in front of line indicates that the selector is a ‘class’. This tells us that on the web page, any particular element (and we are going to apply this rule to the line of our graph) which we decorate with the ‘class’ line will have the various declarations applied to it.

There are three declarations as part of the rule. These are contained within the curly braces and separated by semi-colons.

One of the declarations is for the width of the graph line (stroke-width: 2px;). The property is stroke-width and the value is 2px (2 pixels). This tells the web page that any element in the web page that has the class line will have lines drawn that are (amongst other things) 2 pixels wide.
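
To make the anatomy of a rule a little more concrete, here is a small JavaScript sketch that assembles the text of a rule from a selector and its declarations. The `cssRule` helper is hypothetical (it is not part of D3 or the browser), and the values used for `fill` and `stroke` are assumptions for illustration; only `stroke-width: 2px` is taken from the description above.

```javascript
// Hypothetical helper: build the text of a CSS rule from a selector and
// an object holding the declarations as property/value pairs.
function cssRule(selector, declarations) {
  var body = Object.keys(declarations)
    .map(function (property) {
      // Each declaration is 'property: value;'
      return "  " + property + ": " + declarations[property] + ";";
    })
    .join("\n");
  return selector + " {\n" + body + "\n}";
}

var lineRule = cssRule(".line", {
  "fill": "none",         // assumed value
  "stroke": "steelblue",  // assumed value
  "stroke-width": "2px"   // value described in the text
});

console.log(lineRule);
```

Printing `lineRule` shows the selector first, then the three declarations inside the curly braces, each ending with a semi-colon.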

Sure enough if we look at the line of the graph…

That looks as if the line might actually be 2 pixels wide!

Let’s try a test. We can change that particular declaration to the following;

and the result is…

Ahh…. 20 pixels of goodness!

Because we’re getting the hang of things now, let’s change the colour declaration to…

and we get…

Awesome! I think we can safely say that this has had the desired effect.

So what else is there?

Since there’s only one declaration left, it seems like a shame not to try something different with it;

We’ll get…

So the ‘fill’ property looks like it will change the colour of the area that would be closed by the line. Nice.

The one thing to take away from this small exercise is that there is a good deal of flexibility in adjusting properties of elements on the web page via CSS.

### D3 JavaScript

The D3 JavaScript part of the code is as follows;

Again there’s quite a bit of detail in the code, but it’s not so long that we can’t work out what’s doing what.

The first thing to note is that throughout the code we have lines that are adding a description of what the code does. These have two forward slash characters (//) preceding them which the computer will recognise as a line that only contains comments. I recommend that you add them into your own code where you think that you might want reminding of a function or description.

Let’s examine the blocks bit by bit to get a feel for it.

#### Setting up the margins and the graph area

The part of the code responsible for defining the canvas (or the area where the graph and associated bits and pieces are placed) is this part.

This is really (really) well explained on Mike Bostock’s page on margin conventions here http://bl.ocks.org/3019563, but at the risk of confusing you here’s my crude take on it.

The first line defines the four margins which surround the block where the graph (as an object) is positioned.

So there will be a border of 20 pixels at the top, 20 at the right and 30 and 50 at the bottom and left respectively. Now the cool thing about how these are set up is that they use a JavaScript object to define everything. That means if you want to do calculations in the JavaScript later, you don’t need to put the numbers in, you just use the variable that has been set up. In this case margin.right = 20!

So when we go to the next line;

The width of the inner block of the area where the graph will be drawn is 960 pixels – margin.left – margin.right or 960-50-20 or 890 pixels wide. Of course now we have another variable ‘width’ that we can use later in the code.

Obviously the same treatment is given to height.
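
The arithmetic described above can be sketched in a few lines. The margin values and the overall width of 960 pixels come from the description; the overall height of 500 pixels is an assumption made for the sake of the example.

```javascript
// Margins around the graph area, in pixels, held in a single object.
var margin = {top: 20, right: 20, bottom: 30, left: 50};

// Inner drawing area = overall size minus the margins on either side.
// 960 is the overall width from the text; 500 is an assumed overall height.
var width = 960 - margin.left - margin.right;
var height = 500 - margin.top - margin.bottom;

console.log(width, height);
```

Because the margins live in one object, changing a single number (say `margin.left`) automatically flows through to `width` without having to touch any other line.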

Another cool thing about all of this is that just because we appear to have defined separate areas for the graph and the margins, the whole area in there is available for use. It just makes it really useful to have areas designated for the axis labels and graph labels without having to juggle them and the graph proper at the same time.

So, let’s have a play and change some values.

Here we’ve made the graph narrower (400 pixels) but retained the left / right margins and increased the top / bottom margins while changing the overall height of the canvas to 270 pixels. The really cool thing that you can tell from this is that while we shrank the dimensions of the area that we had to draw the graph in, it was still able to dynamically adapt the axes and line to fit properly (Although the x axis values got a bit squished. Don’t worry we’ll work through that shortly). That is the really cool part of this whole business. D3 is running in the background looking after the drawing of the objects, while you get to concentrate on how the data looks without too much maths!

#### Getting the Data

We’re going to jump forward a little bit here to the portion of the JavaScript code that loads the data for the graph.

I’m going to go out of the sequence of the code here, because if you know what the data is that you’re using, it will make explaining some of the other functions much easier.

The section that grabs the data is this bit.

There are lots of different ways that we can get data into our web page and turn it into graphics. The method that we’ll want to use will probably depend more on the format that the data is in than the mechanism we want to use for importing.

For instance, if it’s only a few points of data we could include the information directly in the JavaScript.

That would make it look something like;

The format of the data shown above is called JSON (JavaScript Object Notation) and it’s a great way to include data since it’s easy for humans to read what’s in there and it’s easy for computers to parse the data out. For a brief overview of JSON there is a separate section in the “Assorted Tips and Tricks Chapter” that may assist.
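
As a minimal sketch of data included directly in the script, the block below declares a small array in JSON style. The ‘date’ and ‘close’ names match the labels used in this example’s data file, but the individual values (and the date format) are illustrative assumptions only.

```javascript
// A few data points written directly into the script in JSON style.
// Each object is one data point; the values are illustrative only.
var data = [
  {"date": "1-May-12", "close": 58.13},
  {"date": "30-Apr-12", "close": 53.98},
  {"date": "27-Apr-12", "close": 67.00}
];

// The same data as a JSON string, and parsed back again.
var asText = JSON.stringify(data);
var roundTrip = JSON.parse(asText);

console.log(roundTrip.length);
```

The round trip through `JSON.stringify` and `JSON.parse` is only there to show that the structure survives being written out as text, which is what makes the format handy for passing data around.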

But if you’ve got a fair bit of data or if the data you want to include is dynamic and could be changing from one moment to the next, you’ll want to load it from an external source. That’s when we call on D3’s ‘Request’ functions.

The different types of data that can be requested by D3 are;

• text: A plain old piece of text that has options to be encoded in a particular way.
• json: This is the aforementioned JavaScript Object Notation.
• xml: Extensible Markup Language is a language that is widely used for encoding documents in a human readable form.
• html: HyperText Markup Language is the language used for displaying web pages.
• csv: Comma Separated Values is a widely used format for storing data where plain text information is separated by (wait for it) commas.
• tsv: Tab Separated Values is a widely used format for storing data where plain text information is separated by a tab-stop character.

Details on these ingestion methods and the formats for the requests are well explained on the D3 Wiki page. In this particular script we will look at the csv request method.

Back to our request…

The first line of that piece of code invokes the d3.csv request and points it at the data file that should be loaded (data.csv). This is referred to as the ‘URL’ (Uniform Resource Locator) of the file. In this case the file is stored locally (in the same directory as the simple-graph.html file), but the URL could just as easily point to a file somewhere on the Internet.

The format of the data in the data.csv file looks a bit like this (although the file is longer, with about 26 data points);

The ‘date’ and the ‘close’ heading labels are separated by a comma as are each subsequent date and number. Hence the ‘comma separated values’ :-).

The next part is part of the coolness of JavaScript. With the request for the file made, the script is told to carry out a function on the data (which will now be called ‘data’).

The function receives two arguments: any error that is generated by the request, and the data that is ingested as the array ‘data’. The following line ensures that any error is ‘thrown’ to an appropriate ‘catch’ block (if it exists). If it doesn’t exist, the script stops at that point and the browser reports the error in its console.

There are actually more things that get acted on as part of the function call (which we will examine soon), but the one we will consider here is contained in the following lines;

This block of code ensures that all the values that are pulled out of the csv file are set and formatted correctly. The first line declares that the data array called ‘data’ (confusingly) is being dealt with and tells the block of code that, for each group within the ‘data’ array it should carry out a function on it. Furthermore, when it carries out the formatting of each part of the array, it should designate the equivalent of each row as being ‘d’.

The information in the array can be considered as being stored in rows. Each row consists of two values: one value for ‘date’ and another value for ‘close’.

The function is pulling out values of ‘date’ and ‘close’ one row at a time.

Each time (Get it? forEach?) it gets a value of ‘date’ and ‘close’ it carries out the following operations;

For each value of date being operated on (d.date), d3.js changes it into a date format that is processed via a separate function ‘parseTime’. (The parseTime function is defined in a separate part of the script, and we will examine that later.) For the moment, be satisfied that it takes the raw date information from the CSV file in each row and converts it into a format that D3 can recognise as a date/time. That value is then re-saved in the same variable space.

The next line then sets the ‘close’ variable to a numeric value (if it isn’t already) using the ‘+’ operator.

At the end of this section of code, we have gone out and picked up a file with data in it of a particular type (comma separated values) and ensured that it is formatted in a way that the rest of the script can use correctly.
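Pulling the pieces of this section together, the following is a hedged sketch of the loading and formatting steps, assuming D3 v4 is loaded on the page as `d3`. The helper name `formatRows` is my own invention so that the row handling can be tried on its own; in the script itself those statements sit directly inside the d3.csv callback.

```javascript
// Hypothetical helper: convert each row in place.
// date string -> Date object, close string -> number.
function formatRows(data, parseTime) {
    data.forEach(function(d) {
        d.date = parseTime(d.date);
        d.close = +d.close;
    });
    return data;
}

// Browser-only wiring; skipped when D3 is not available.
if (typeof d3 !== "undefined") {
    var parseTime = d3.timeParse("%d-%b-%y");

    d3.csv("data.csv", function(error, data) {
        if (error) throw error;       // stop here if the request failed
        formatRows(data, parseTime);  // make the values usable
        // ... the rest of the drawing code goes here ...
    });
}
```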

Now, the astute amongst you will have noticed that in the first line of that block of code (d3.csv("data.csv", function(error, data) {) we opened a normal bracket ( ( ) and a curly bracket ( { ), but we never closed them. That’s because they stay open until the very end of the file. That means that all those blocks that occur after the d3.csv bit are referenced to the data array. Or put another way, it uses the data in the data array to draw stuff!

But anyway, let’s get back to figuring what the code is doing by jumping back to the end of the margins block.

#### Formatting the Date / Time.

One of the glorious things about the World is that we all do things a bit differently. One of those things is how we refer to dates and time.

In my neck of the woods, it’s customary to write the date as day - month - year. E.g. 23-12-2012. But in the United States the more common format would be 12-23-2012. Likewise, the data may be in formats that name the months or weekdays (E.g. January, Tuesday) or combine dates and time together (E.g. 2012-12-23 15:45:32). So, if we want to load in some data and get D3 to recognise it as date / time information, we really need to tell it what format the date / time is in.

Time for a little demonstration (see what I did there).

We will change our data.csv file so that it only includes two points. The first one and the last one with a separation of a month and a bit. It will therefore look a little like this;

The graph now looks like this;

Nothing too surprising here, a very simple graph (note the time scale on the x axis).

Now we will change the later date in the data.csv file so that it is a lot closer to the starting date;

So, just a three day difference. Let’s see what happens.

Ahh…. Not only did we not have to make any changes to our JavaScript code, but it was able to recognise the dates were closer and fill in the intervening gaps with appropriate time / day values. Now, one more time for giggles.

This time we’ll stretch the interval out by a few years.

and the result is…

Hopefully that’s enough encouragement to impress upon you that formatting the time is a REALLY good thing to get right. Trust me, it will never fail to impress :-).

Back to formatting.

The line in the JavaScript that parses the time is the following;

This line is used when the data.forEach(function(d) portion of the code (that we looked at a couple of pages back) used d.date = parseTime(d.date) as a way to take a date in a specific format and to get it recognised by D3. In effect it said “take this value that is supposedly a date and make it into a value I can work with”.

The function used is the d3.timeParse(specifier) function where the specifier in this case is the mysterious combination of characters %d-%b-%y. The good news is that these are just a combination of directives specific for the type of date we are presenting.

The % signs are used as prefixes to each separate format type and the ‘-’ (minus) signs are literals for the actual ‘-’ (minus) signs that appear in the date to be parsed.

The d refers to a zero-padded day of the month as a decimal number [01,31].

The b refers to an abbreviated month name.

And the y refers to the year (without the centuries) as a decimal number.
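To make the specifier a little less mysterious, here is a rough sketch of the work that %d-%b-%y asks for. The function `parseDayMonthYear` is a hypothetical plain JavaScript stand-in written for illustration (it is not how D3 does it internally, and it is fussier about the case of month names than D3 is); with D3 v4 on the page the real parser is the single d3.timeParse line.

```javascript
var MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical stand-in for a "%d-%b-%y" parser.
function parseDayMonthYear(text) {
    var parts = String(text).split("-");   // the '-' signs are literals
    if (parts.length !== 3) return null;
    var day = parseInt(parts[0], 10);      // %d : day of the month
    var month = MONTHS.indexOf(parts[1]);  // %b : abbreviated month name
    var yy = parseInt(parts[2], 10);       // %y : year without century
    if (isNaN(day) || month < 0 || isNaN(yy)) return null;
    // Assumption: two digit years are read as 20xx, which suits this data.
    return new Date(2000 + yy, month, day);
}

// Use the D3 parser when D3 is available, otherwise the stand-in.
var parseTime = (typeof d3 !== "undefined")
    ? d3.timeParse("%d-%b-%y")
    : parseDayMonthYear;
```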

If we look at a subset of the data from the data.csv file we see that indeed, the dates therein are formatted in this way.

That’s all well and good, but what if your data isn’t formatted exactly like that?

Good news. There are multiple different formatters for different ways of telling time and you get to pick and choose which one you want. Check out the Time Formatting page on the D3 Wiki for the authoritative list and some great detail, but the following is the list of currently available formatters (from the d3 wiki);

• %a - abbreviated weekday name.
• %A - full weekday name.
• %b - abbreviated month name.
• %B - full month name.
• %c - date and time, as “%a %b %e %H:%M:%S %Y”.
• %d - zero-padded day of the month as a decimal number [01,31].
• %e - space-padded day of the month as a decimal number [ 1,31].
• %H - hour (24-hour clock) as a decimal number [00,23].
• %I - hour (12-hour clock) as a decimal number [01,12].
• %j - day of the year as a decimal number [001,366].
• %m - month as a decimal number [01,12].
• %M - minute as a decimal number [00,59].
• %p - either AM or PM.
• %S - second as a decimal number [00,61].
• %U - week number of the year (Sunday as the first day of the week) as a decimal number [00,53].
• %w - weekday as a decimal number [0(Sunday),6].
• %W - week number of the year (Monday as the first day of the week) as a decimal number [00,53].
• %x - date, as “%m/%d/%y”.
• %X - time, as “%H:%M:%S”.
• %y - year without century as a decimal number [00,99].
• %Y - year with century as a decimal number.
• %Z - time zone offset, such as “-0700”.
• There is also a literal “%” character that can be presented by using double % signs.

As an example, if you wanted to input date / time formatted as a generic MySQL ‘YYYY-MM-DD HH:MM:SS’ TIMESTAMP format the D3 parse script would look like;
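A sketch of that specifier, built from the directives in the list above, follows. The `parseMysqlTimestamp` function is a hypothetical plain JavaScript fallback that I have added purely so the example can be tried without D3; only the d3.timeParse line is the D3 part.

```javascript
// Year with century, month, day, then 24-hour time.
var mysqlSpecifier = "%Y-%m-%d %H:%M:%S";

// Hypothetical stand-in, for illustration only.
function parseMysqlTimestamp(text) {
    var m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(text);
    if (!m) return null;
    // JavaScript months count from 0, hence the "- 1".
    return new Date(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
}

// With D3 v4 loaded, this is the line that matters.
var parseTimestamp = (typeof d3 !== "undefined")
    ? d3.timeParse(mysqlSpecifier)
    : parseMysqlTimestamp;
```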

#### Setting Scales Domains and Ranges

This is another example where, if you set it up right, D3 will look after you forever.

From our basic web page we have now moved to the section that includes the following lines;

The purpose of these portions of the script is to ensure that the data we ingest fits onto our graph correctly. Since we have two different types of data (date/time and numeric values) they need to be treated separately (but d3 manages them in almost the same way). To examine this whole concept of scales, domains and ranges properly, we will also move slightly out of sequence and (in conjunction with the earlier scale statements) take a look at the lines of script that occur later and set the domain. They are as follows;

The idea of scaling is to take the range of values of data that we have and to fit them into the space we have available.

If we have data that goes from 53.98 to 636.23 (as the data we have for ‘close’ in our csv file does), but we have a graph that is 450 pixels high (height = 500 - margin.top - margin.bottom;) we clearly need to make an adjustment.

Not only that. Even though our data goes from 53.98 to 636.23, that would look slightly misleading on the graph and it should really go from 0 to a bit over 636.23. It sounds really complicated, so let’s simple it up a bit.

First we make sure that any quantity we specify on the x axis fits onto our graph.

Here we set our variable (x) that will tell D3 where to draw something on the x axis. By using the d3.scaleTime() function we make sure that D3 knows to treat the values as date / time entities (with all their ingrained peculiarities). Then we specify the range that those values will cover (.range) and we specify the range as being from 0 to the width of our graphing area (See? Setting those variables for margins and widths is starting to pay off now!).

Then we do the same for the Y axis.

There’s a different function call (d3.scaleLinear()) but the .range setting is still there. In the interests of drawing a (semi) pretty picture to try and explain, hopefully this will assist;

I know, I know, it’s a little misleading because nowhere have we actually said to D3 this is our data from 53.98 to 636.23. All we’ve said is when we get the data, we’ll be scaling it into this space.

Now hang on, what’s going on with the [height, 0] part in the y axis scale statement? The astute amongst you will note that for the time scale we set the range as [0, width] but for this one ([height, 0]) the values look backwards.

Well spotted.

This is all to do with how the screen is laid out and referenced. Take a look at the following diagram showing how the coordinates for drawing on your screen work;

The top left hand corner of the screen is the origin or 0,0 point and as we go right or down the corresponding x and y values increase to the full values defined by width and height.

That’s good enough for the time values on the x axis that will start at lower values and increase, but for the values on the y axis we’re trying to go against the flow. We want the low values to be at the bottom and the high values to be at the top.

No problem. We just tell D3 via the statement y = d3.scaleLinear().range([height, 0]); that the low data values belong at the bottom of the graph (at the pixel position height, the furthest from the origin) and the high data values belong at the top (at pixel position 0). As you most probably will have guessed by this stage, the .range statement uses the format .range([where_the_lowest_value_goes, where_the_highest_value_goes]). So when we put the height variable first, the lowest value is associated with the bottom of the graph.
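To see the flipped range in action, here is a small sketch. `makeLinearScale` is a hypothetical hand-rolled helper that imitates the arithmetic of a linear scale (it is not D3’s implementation), and the width value is an assumption for illustration. The data limits are given to the helper up front only so there is something to calculate; in the script itself D3 is told about them later.

```javascript
// Hypothetical helper: map a value from [d0, d1] onto [r0, r1].
function makeLinearScale(domain, range) {
    var d0 = domain[0], d1 = domain[1];
    var r0 = range[0], r1 = range[1];
    return function(value) {
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0);
    };
}

var height = 450;  // 500 minus the top and bottom margins
var width = 890;   // assumed value for illustration

var yExample = makeLinearScale([0, 636.23], [height, 0]);
yExample(0);       // 450, the bottom of the graph
yExample(636.23);  // 0, the top of the graph

// The D3 v4 statements described in this section.
if (typeof d3 !== "undefined") {
    var x = d3.scaleTime().range([0, width]);
    var y = d3.scaleLinear().range([height, 0]);
}
```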

We’ve scaled our data to the graph size and ensured that the range of values is set appropriately. What’s with the domain part that was in this section’s title?

Come on, you remember this little piece of script don’t you?

While it exists in a separate part of the file from the scale / range part, it is certainly linked.

That’s because there’s something missing from what we have been describing so far with the set up of the data ranges for the graphs. We haven’t actually told D3 what the range of the data is. That’s also the reason this part of the script occurs where it does. It is within the section where the data.csv file has been loaded as ‘data’ and it’s therefore ready to use it.

So, the .domain function is designed to let D3 know what the scope of the data will be. This is what is then passed to the scale function.

Looking at the first part that is setting up the x axis values, it is saying that the domain for the x axis values will be determined by the d3.extent function which in turn is acting on a separate function which looks through all the ‘date’ values that occur in the ‘data’ array. In this case the .extent function returns the minimum and maximum value in the given array.

• function(d) { return d.date; } returns all the ‘date’ values in ‘data’. This is then passed to…
• The .extent function that finds the maximum and minimum values in the array and then…
• The .domain function which passes those minimum and maximum values to D3 as the domain for the x axis.

Pretty neat really. At first you might think it was overly complex, but breaking the function down into these components allows additional functionality with differing scales, values and quantities. In short, don’t sweat it. It’s a good thing.

The x axis values are dates; so the domain for them is basically from the 26th of March 2012 till 1st of May 2012. The y axis is done slightly differently;

Because the range of values desired on the y axis goes from 0 to the maximum in the data range, that’s exactly what we tell D3. The ‘0’ in the .domain function is the starting point and the finishing point is found by employing a separate function that sorts through all the ‘close’ values in the ‘data’ array and returns the largest one. Therefore the domain is from 0 to 636.23.
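The two domain calculations can be sketched as follows. `extentOf` and `maxOf` are hypothetical stand-ins that imitate what d3.extent and d3.max return for this kind of data, and the pairing of dates with ‘close’ values in `sample` is made up for illustration. The D3 v4 statements themselves are shown in the comments since they need the loaded data and the scales.

```javascript
// Hypothetical stand-in for d3.extent: returns [minimum, maximum].
function extentOf(rows, accessor) {
    var min, max;
    rows.forEach(function(d) {
        var v = accessor(d);
        if (min === undefined || v < min) min = v;
        if (max === undefined || v > max) max = v;
    });
    return [min, max];
}

// Hypothetical stand-in for d3.max: returns the largest value.
function maxOf(rows, accessor) {
    return extentOf(rows, accessor)[1];
}

// Assumption: illustrative rows, already parsed.
var sample = [
    { date: new Date(2012, 4, 1),  close: 53.98 },
    { date: new Date(2012, 2, 26), close: 636.23 }
];

var xDomain = extentOf(sample, function(d) { return d.date; });
var yDomain = [0, maxOf(sample, function(d) { return d.close; })];

// Inside the d3.csv callback the D3 v4 equivalents would be:
//   x.domain(d3.extent(data, function(d) { return d.date; }));
//   y.domain([0, d3.max(data, function(d) { return d.close; })]);
```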

Let’s try a small experiment. Let’s change the y axis domain to use the .extent function (the same way the x axis does) to see what it produces.

The JavaScript for the y domain will be;

You can see apart from a quick copy paste of the internals, all I had to change was the reference to ‘close’ rather than ‘date’.

And the result is…

Look at that! The starting point for the y axis looks like it’s pretty much on the 53.98 mark and the graph itself certainly touches the x axis where the data would indicate it should.

Now, I’m not really advocating making a graph like this since I think it looks a bit nasty (and a casual observer might be fooled into thinking that the x axis was at 0). However, this would be a useful thing to do if the data was concentrated in a narrow range of values that are quite distant from zero.

For instance, if I change the data.csv file so that the values are represented like the following;

Then it kind of loses the ability to distinguish between values around the median of the data.

But, if I put in our magic .extent function for the y axis and redraw the graph…

The same data as the previous graph, but with one simple piece of the script changed and D3 takes care of the details.

#### Adding data to the line function

We’re getting towards the end of our journey through the script now. The next step is to associate the array ‘data’ with a new array that consists of a set of coordinates that we are going to plot.

I’m aware that the statement above may be somewhat ambiguous. You would be justified in thinking that we already had the data stored and ready to go. But that’s not strictly correct.

What we have is data in a raw format. We have added pieces of code that will allow the data to be adjusted for scale and range to fit in the area that we want to draw, but we haven’t actually taken our raw data and adjusted it for our desired coordinates. That’s what this part of the script does.

The main function that gets used here is the d3.line() function. This function uses accessor functions to store the appropriate information in the right area and in this case they use the x and y accessors (that would be the bits that are .x and .y). The d3.line() function is called a ‘path generator’ and this is an indication that it can carry out some pretty clever things of its own accord. But in essence its job is to assign a set of coordinates in a form that can be used to draw a line.

Each time this line function is called on, it will go through the data and will assign coordinates to ‘date’ and ‘close’ pairs using the ‘x’ and ‘y’ functions that we set up earlier (which are responsible for scaling and setting the correct range / domain).

Of course, it doesn’t get the data all by itself, we still need to actually call the valueline function with ‘data’ as the source to act on. But never fear, that’s coming up soon.
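As a sketch of the idea, `makeLine` below is a hypothetical, much simplified imitation of a path generator (the real d3.line() does a good deal more), while `makeValueline` wraps the D3 v4 declaration described in this section. The wrapper and its arguments are my own device so the D3 part can be shown without the rest of the script.

```javascript
// Hypothetical stand-in: turn rows into an SVG path description.
function makeLine(xOf, yOf) {
    return function(rows) {
        return "M" + rows.map(function(d) {
            return xOf(d) + "," + yOf(d);
        }).join("L");
    };
}

var demoLine = makeLine(
    function(d) { return d.px; },
    function(d) { return d.py; }
);
demoLine([{ px: 0, py: 450 }, { px: 100, py: 200 }]);  // "M0,450L100,200"

// The D3 v4 version, given the library and the x and y scales.
function makeValueline(d3, x, y) {
    return d3.line()
        .x(function(d) { return x(d.date); })    // scaled date
        .y(function(d) { return y(d.close); });  // scaled close value
}
```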

#### Adding the SVG Element

As the title states, the next piece of script forms and adds the SVG element to the web page that D3 will then use to draw on.

So what exactly does that all mean?

Well D3 needs to be able to have a space defined for it to draw things. When you define the space it’s going to use, you can also give the space you’re going to use an identifying name and attributes.

In the example we’re using here, we are ‘appending’ an SVG element (an element designed for drawing graphics on) to the <body> of the HTML page.

We also add a group element ‘g’ that is referenced to the top left corner of the actual graph area on the canvas. ‘g’ is a grouping element in the sense that it is normally used for grouping together several related elements. So in this case those grouped elements will have a common reference.

(the image above is definitely not to scale, but I hope you get the general idea)

Interesting things to note about the code. The .attr("stuff in here") parts are attributes of the appended elements they are part of.

For instance;

tells us that the ‘svg’ element has a “width” of width + margin.left + margin.right and the “height” of height + margin.top + margin.bottom.

Likewise…

tells us that the group element ‘g’ has been transformed by moving (translating) to the point margin.left, margin.top. Or to the top left of the graph space proper. This way when we tell something to be drawn on our page, we can use this reference point ‘g’ to make sure everything is in the right place.
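The statements discussed here can be sketched as one block, assuming D3 v4 is loaded as `d3` and the page has a <body> to append to. The individual margin values and the overall size of 960 by 500 pixels are assumptions for illustration.

```javascript
// Assumed margins and sizes, for illustration.
var margin = { top: 20, right: 20, bottom: 30, left: 50 },
    width = 960 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;

// Moves the 'g' element to the top left of the graph area proper.
var graphOrigin = "translate(" + margin.left + "," + margin.top + ")";

// Browser-only: append the svg element, then the group element.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
    var svg = d3.select("body").append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
      .append("g")
        .attr("transform", graphOrigin);
}
```

Because the chain ends with the appended ‘g’, the variable svg actually refers to the group element, which is why everything drawn later lands inside the margins.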

#### Actually Drawing Something!

Up until now we have spent a lot of time defining, loading and setting up. Good news! We’re about to finally draw something!

##### Drawing the line

We jump lightly over some of the code that we have already explained and land on the part that draws the line.

This area occurs in the part of the code that has the data loaded (via the d3.csv block) and it’s ready for action.

The svg.append("path") portion adds a new path element. A path element represents a shape that can be manipulated in lots of different ways (see more here: http://www.w3.org/TR/SVG/paths.html).

We join our array of data (confusingly the array is called ‘data’) to the path element with the .data([data]) line. We could have used an alternative method here with a line that read .datum(data). Both are completely valid to use, but have different strengths.

The next line down applies the ‘line’ styles from the CSS section that we experimented with earlier.

In the final line (.attr("d", valueline);), we add the attribute ‘d’ to the path with the data from the valueline function that we had declared earlier.

##### Drawing the Axes

Then we get to draw in the axes;

Both axes start by appending a group element (‘g’). Each axis will be bound to its own element.

The y axis can be drawn from the default position at the origin of the svg element (which we recall is 0,0 at the top left of the graph). However the x axis needs to be moved to the bottom of our graph.

On the x axis, we have a transform statement (.attr("transform", "translate(0," + height + ")")). If we want our x axis to be on the bottom of the graph, we need to move (transform) it to the bottom by a set amount. The set amount in this case is the height of the graph proper (height). So, for the point of demonstration we will remove the transform line and see what happens;

Yep, pretty much as anticipated.

The last part of the two sections of script ( .call(d3.axisBottom(x)); and .call(d3.axisLeft(y)); ) call the D3 x and y axis functions respectively and initiate the drawing action.
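Here is a hedged sketch of the drawing statements covered in this section, assuming D3 v4. The wrapper function `drawLineAndAxes` and its argument list are hypothetical; in the script itself these statements sit directly inside the d3.csv block where svg, data, valueline, x, y and height are already available.

```javascript
// Hypothetical wrapper around the three drawing steps.
function drawLineAndAxes(svg, d3, data, valueline, x, y, height) {
    // Add the valueline path.
    svg.append("path")
        .data([data])
        .attr("class", "line")
        .attr("d", valueline);

    // Add the x axis, moved down by the height of the graph.
    svg.append("g")
        .attr("transform", "translate(0," + height + ")")
        .call(d3.axisBottom(x));

    // Add the y axis at its default position.
    svg.append("g")
        .call(d3.axisLeft(y));
}
```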

The method by which D3 orientates the axes is relatively self-evident and there are four options;

• .axisTop: An axis with values and ticks drawn above a horizontal axis.
• .axisRight: An axis with values and ticks drawn to the right of a vertical axis.
• .axisBottom: An axis with values and ticks drawn below a horizontal axis.
• .axisLeft: An axis with values and ticks drawn to the left of a vertical axis.

Just to illustrate the point, we can reverse the orientation of .axisBottom to .axisTop and .axisLeft to .axisRight to see what it looks like;

There we go.

It is worth stating that the axes as presented for this simple graph are very much a ‘straight out of the box’ configuration. Later in the book we will look at options for configuring and styling axes in more depth.

### Wrap Up

Well that’s it. In theory, you should now be a complete D3 ninja.

OK, perhaps a slight exaggeration. In fact there is a strong possibility that the information I have laid out here is at best borderline useful and at worst laden with evil practices and gross inaccuracies.

But look on the bright side. Irrespective of the nastiness of the way that any of it was accomplished or the inelegance of the code, if the picture drawn on the screen is pretty, you can walk away with a smile. :-)

This section concludes a very basic description of one type of a graphic that can be built with D3. We will look at adding value to it in subsequent chapters.

I’ve said it before and I’ll say it again. This is not a how-to for learning D3. This is how I have managed to muddle through and achieve what I wanted to do. If some small part of it helps you. All good. Those with a smattering of knowledge of any of the topics I have butchered above (or below) are fully justified in feeling a large degree of righteous indignation. To those I say, please feel free to amend where practical and possible, but please bear in mind this was written from the point of view of someone with no experience in the topic and therefore try to keep any instructions at a level where a new entrant can step in :-).

## Things we can do with the simple graph

The following headings in this section are intended to be a list of relatively simple ‘block’ type improvements that you can do to your graph to add functionality. The idea is to be able to use the simple graph that was used for the explanation of how D3 worked and just slot in code to add functionality (let’s hope it works for you :-)).

### Setting up and configuring the Axes

As referenced in the chapter where we initially developed our simple graph, the axes of that graph had no styling or configuration changes made to them at all. One of the results of this is that the font size, type, number of ticks and the way that the values are represented is very much at the default settings. This means that when we change our initial graph…

… and compress the margins or graph size we end up with axes that are not really suitable for the purpose;

Luckily, the D3 axis component has a wide range of configuration options and we can make changes simply via either the CSS styling or in the JavaScript code.

#### Change the text size

The first thing that we will change is the text size for the axes. The default size (built into D3) is 10px with the font type of sans-serif.

There are a couple of different ways that we could change the font size and either one is valid. The first way is to specify the font as a style when drawing an individual axis. To do this we simply add in a font style as follows;

This will increase the x axis font size to 14px and change the font type to ‘times’. Just like this;

There are a few things to notice here.

Firstly, we do indeed have a larger font and it appears to be of the type ‘times’. Yay!

Secondly, the y axis has remained as 10px sans-serif (which is to be expected since we only added the style to the x axis code block)

Lastly, the number of values represented on the x axis has meant that with the increase in font size there is some overlapping going on. We will deal with that shortly…

The addition of the styling for the x axis has been successful and in a situation where only one element on a page is being adjusted, this is a perfectly valid way to accomplish the task. However, in this case we should be interested in changing the font on both the x and y axes. We could do this by adding a duplicate style line to the y axis block, but we have a slightly better way of accomplishing the task by declaring the style in the HTML style block at the start of the code and then applying the same style to both blocks.

In the <style> ... </style> section at the start of the file add in the following line;

This will set the font to 14px sans-serif (I prefer this to ‘times’) for anything that has the axis class applied to it. All we have to do then is to tell our x and y axes blocks to use the axis class as an attribute. We can do this as follows;
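As a sketch, and assuming the style block contains a rule along the lines of `.axis { font: 14px sans-serif; }`, the two axis blocks might then look like the following. The wrapper function `drawStyledAxes` is hypothetical and is only there so the statements can be shown on their own.

```javascript
// Hypothetical wrapper: both axes pick up the shared 'axis' class.
function drawStyledAxes(svg, d3, x, y, height) {
    // Add the x axis.
    svg.append("g")
        .attr("class", "axis")
        .attr("transform", "translate(0," + height + ")")
        .call(d3.axisBottom(x));

    // Add the y axis.
    svg.append("g")
        .attr("class", "axis")
        .call(d3.axisLeft(y));

    // The one-off alternative for a single axis would be a line such as
    //   .style("font", "14px times")
    // added to that axis block instead of the class attribute.
}
```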

It could be argued that this doesn’t really conserve more code, but in my humble opinion it adds a more elegant way to alter styling in this case.

The end result now looks like the following;

#### Changing the number of ticks on an axis

Now we shall address the other problem that cropped up when we changed the size of the text. We have overlapping values on the x axis.

If I was to be brutally honest, I think that the number of values (ticks) on the graph is a bit too many. The format of the values (especially on the x axis) is too wide and this type of overlap was bound to happen eventually.

Good news. D3 has got us covered.

The axis component includes a function to specify the number of ticks on an axis. All we need to do is add in the function and the number of ticks like so;

With the end result looking like this;

We can see that D3 has picked tick values that seem nice and logical. There’s one that starts on the 1st of April that’s just labelled ‘April’ and they go at a nice interval of one week for the subsequent ticks. Nice.

Hopefully you just did a quick count across the bottom of the previous graph and went “Yep, five ticks. Spot on”. Well done if you did, but there’s a little bit of a sneaky trick up D3’s sleeve with the number of ticks on a graph axis.

For instance, here’s what the graph looks like when the .ticks(5) value is changed to .ticks(4).

Eh? Hang on. Isn’t that some kind of mistake? There are still five ticks. Yep, sure is! But wait… we can keep dropping the ticks value till we get to two and it will still be the same. At .ticks(2) though, we finally see a change.

How about that? At first glance that just doesn’t seem right, then you have a bit of a think about it and you go “Hmm… When there were 5 ticks, they were separated by a week each, and that stayed that way till we got to a point where it could show a separation of a month”.

D3 is making a command decision for you as to how your ticks should be best displayed. This is great for simple graphs and indeed for the vast majority of graphs. Like all things related to D3, if you really need to do something bespoke, it will let you if you understand enough code.

The following is the list of time intervals that D3 will consider when setting automatic ticks on a time based axis;

• 1, 5, 15 and 30-second.
• 1, 5, 15 and 30-minute.
• 1, 3, 6 and 12-hour.
• 1 and 2-day.
• 1-week.
• 1 and 3-month.
• 1-year.

And yes. If you increase the number of ticks, you need to wait till you get to 10 before they change to an axis with an interval of two days. And yes, the overlap is still there;

If we do a quick count we should also notice that we have 19 ticks!

The question should be asked. Can we specify our own intervals? Great question! Yes we can.

What we need to do is to use another D3 trick and specify an exact interval using the d3 time component. In our particular situation all we need to do is specify an interval inside the .ticks function. Specifically for an interval of 4 days for example we would use something like;

Here we use the timeDay unit of ‘days’ and specify an interval of 4 days.
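Both ways of using .ticks can be sketched together, assuming D3 v4. The helper `makeXAxis` and its `everyFourDays` flag are hypothetical; the calls inside it are the .ticks(5) count and the d3.timeDay interval discussed in this section.

```javascript
// Hypothetical helper returning a configured bottom axis.
function makeXAxis(d3, x, everyFourDays) {
    var axis = d3.axisBottom(x);
    return everyFourDays
        ? axis.ticks(d3.timeDay.every(4))  // an explicit 4 day interval
        : axis.ticks(5);                   // a suggested number of ticks
}
```

The result would then be handed to the axis group in the usual way, for example `.call(makeXAxis(d3, x, true))`.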

The graph will subsequently appear as follows;

Intervals have a number of standard units (including UTC time) such as;

• d3.timeMillisecond : Milliseconds
• d3.timeSecond : Seconds
• d3.timeMinute : Minutes
• d3.timeHour : Hours
• d3.timeDay : Days
• d3.timeWeek : This is an alias for d3.timeSunday for a week
• d3.timeSunday : A week starting on Sunday
• d3.timeMonday : A week starting on Monday
• d3.timeTuesday : A week starting on Tuesday
• d3.timeWednesday : A week starting on Wednesday
• d3.timeThursday : A week starting on Thursday
• d3.timeFriday : A week starting on Friday
• d3.timeSaturday : A week starting on Saturday
• d3.timeMonth : Months starting on the 1st of the month
• d3.timeYear : Years starting on the 1st day of the year

But what if we really wanted that two day separation of ticks without the overlap?

#### Rotating text labels for a graph axis

An answer to the problem of overlapping axis values might be to rotate the text to provide more space.

The answer I found most usable was provided by Aaron Ward on Google Groups.

The full code for this example can be found on github or in the code samples bundled with this book (simple-axis-rotated.html and data.csv). A working example can be found on bl.ocks.org.

The first substantive change would be a little housekeeping. Because we are going to be rotating the text at the bottom of the graph, we are going to need some extra space to fit in our labels. So we should change our bottom margin appropriately.

I found that 70 pixels was sufficient.

The remainder of our changes occur in the block that draws the x axis.

It’s pretty standard until the .call(d3.axisBottom(x).ticks(10)) portion of the code. Here we remove the semicolon that was there so that the block continues with its function.

Then we select all the text elements that comprise the x axis with the .selectAll("text"). From this point onwards, we are operating on the text elements associated with the x axis. In effect; the following four ‘actions’ are applied to the text labels.

The .style("text-anchor", "end") line ensures that the text label has the end of the label ‘attached’ to the axis tick. This has the effect of making sure that the text rotates about the end of the date. This makes sure that the text all ends up at a uniform distance from the axis ticks.

The dx and dy attribute lines move the end of the text just far enough away from the axis tick so that they don’t crowd it and not too far away so that it appears disassociated. This took a little bit of fiddling to ‘look’ right and you will notice that I’ve used the ‘em’ units to get an adjustment if the size of the font differs.

The final action is kind of the money shot.

The transform attribute applies itself to each text label and rotates each line by -65 degrees. I selected -65 degrees just because it looked OK. There was no deeper reason.
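
Pulling those steps together, the x axis block might look something like the sketch below. It is wrapped in a function (addRotatedXAxis, a name made up for this illustration) only so that it stands on its own; in the graph script the chained calls would sit inline. The dx and dy values are assumptions of the kind of ‘fiddling’ described above, and d3, svg, x and height are taken to be the same objects and variables used in the simple graph.

```javascript
// Hypothetical helper: draws the x axis, then rotates its tick labels.
// d3 is the d3.js v4 object, svg the group we draw into, x the time scale.
function addRotatedXAxis(d3, svg, x, height) {
  return svg.append("g")
      .attr("transform", "translate(0," + height + ")")
      .call(d3.axisBottom(x).ticks(10))    // no semicolon: the chain carries on
    .selectAll("text")                     // from here on we act on the labels
      .style("text-anchor", "end")         // rotate about the end of the label
      .attr("dx", "-.8em")                 // assumed offsets, adjust to taste
      .attr("dy", ".15em")
      .attr("transform", "rotate(-65)");   // the angle chosen in the text
}
```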

The end result then looks like the following;

This was a surprisingly difficult problem for which to find a solution that I could easily understand (well done Aaron). That makes me think that there are some far deeper mysteries to it that I don’t fully appreciate and that could trip this solution up. But in lieu of that, enjoy!

#### Formatting a date / time axis with specified values

OK then. We’ve been very clever in rotating our text, but you will notice that D3 has used its own good judgement as to what format the days / date will be represented as.

Not that there’s anything wrong with it, but what if we want to put a specific format of date / time nomenclature as axis labels?

No problem. D3 to the rescue again!

This is actually a pretty easy thing to do, but there are plenty of options for the formatting, so the only really tricky part is deciding what to put where.

But, before we start doing anything we are going to have to expand our bottom margin even more than we did when rotating the axis labels.

That should see us right.

Now the simple part :-). Changing the format of the label is as simple as inserting the tickFormat command into the xAxis declaration and including a D3 time formatting function a little like this;

The timeFormat formatters are the same as those we used when parsing our time values when reading our data with the simple graph;

• %a - abbreviated weekday name.
• %A - full weekday name.
• %b - abbreviated month name.
• %B - full month name.
• %c - the locale’s date and time, such as “%x, %X”.
• %d - zero-padded day of the month as a decimal number [01,31].
• %e - space-padded day of the month as a decimal number [ 1,31].
• %H - hour (24-hour clock) as a decimal number [00,23].
• %I - hour (12-hour clock) as a decimal number [01,12].
• %j - day of the year as a decimal number [001,366].
• %m - month as a decimal number [01,12].
• %M - minute as a decimal number [00,59].
• %p - either AM or PM.
• %S - second as a decimal number [00,61].
• %U - week number of the year (Sunday as the first day of the week) as a decimal number [00,53].
• %w - weekday as a decimal number [0(Sunday),6].
• %W - week number of the year (Monday as the first day of the week) as a decimal number [00,53].
• %x - the locale’s date, such as “%-m/%-d/%Y” (the ‘-’ suppresses padding).
• %X - the locale’s time, such as “%-I:%M:%S %p”.
• %y - year without century as a decimal number [00,99].
• %Y - year with century as a decimal number.
• %Z - time zone offset, such as “-0700”.
• There is also a literal “%” character, which is produced by using a double % sign (%%).

So the format we have specified (%Y-%m-%d) will show the year with the century as a decimal number (%Y) followed by a hyphen, followed by a zero padded month as a decimal number (%m) followed by another hyphen and lastly the day as a zero padded day of the month (%d).
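
As a rough sketch of the idea, the axis could be built as shown below. The function name makeFormattedXAxis is hypothetical and d3 and x are assumed to be the d3.js v4 object and the time scale from the simple graph. The second function is a plain JavaScript stand-in for the “%Y-%m-%d” specifier, included only to make the output format concrete; it is not how D3 does the job.

```javascript
// Hypothetical helper: an x axis whose tick labels read like 2012-04-30.
function makeFormattedXAxis(d3, x) {
  return d3.axisBottom(x)
      .tickFormat(d3.timeFormat("%Y-%m-%d"));
}

// Plain JavaScript stand-in for "%Y-%m-%d", for illustration only.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;          // zero-pad to two digits
}
function formatYearMonthDay(date) {
  return date.getFullYear() + "-" +        // %Y : year with century
         pad2(date.getMonth() + 1) + "-" + // %m : zero-padded month
         pad2(date.getDate());             // %d : zero-padded day
}
```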

The end result looking a bit like this;

An example using this code can be found on github or in the code samples bundled with this book (simple-axis-rotated-formatted.html and data.csv). A working example can be found on bl.ocks.org. The example code also includes the rotating of the x axis text as described in the previous section.

So how about we try something a little out of the ordinary (extreme)?

How about the full weekday name (%A), the day (%d), the full month name (%B) and the year (%Y) as a four digit number?

We will also need some extra space for the bottom margin, so how about 170?

And….

Oh yeah… When axis ticks go bad…

But seriously, that does work as a pretty good example of the flexibility available.

### Adding axis labels

What’s the first thing you get told at school when drawing a graph?

So, time to add a couple of labels!

We’ll start with our default code for our simple graph. The full code for this can be found on github or in the code samples bundled with this book (simple-graph.html and data.csv). A live example can be found on bl.ocks.org.

Preparation: Because we’re going to be adding labels to the bottom and left of the graph we need to increase the bottom and left margins. Changes like the following should suffice;
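
A minimal sketch of that kind of change is shown below. The specific numbers are assumptions (an overall size of 960 by 500 pixels with the bottom and left margins enlarged), so adjust them to suit your own labels.

```javascript
// Assumed values: only the bottom and left margins grow to make room for labels.
var margin = {top: 20, right: 20, bottom: 50, left: 70},
    width = 960 - margin.left - margin.right,    // width of the drawing area
    height = 500 - margin.top - margin.bottom;   // height of the drawing area
```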

#### The x axis label

First things first (because they’re done slightly differently), the x axis. If we begin by describing what we want to achieve, it may make the process of implementing a solution a little more logical.

What we want to do is to add a simple piece of text under the x axis and in the centre of the total span. Wow, that does sound easy.

And it is, but there are different ways of accomplishing it, and I think I should take an opportunity to demonstrate them. Especially since one of those ways is a BAD idea.

This is the code we’re going to add to the simple line graph script;

We will put it in between the blocks of script that add the x axis and the y axis.

Before we describe what’s happening, let’s take a look at the result;

Well, it certainly did what it was asked to do. There’s a ‘Date’ label as advertised! (Yes, I know it’s not pretty.) Let’s describe the code and then work out why there’s a better way to do it.

The first line appends a “text” element to our svg element. There is a lot more to learn about “text” elements at the home of the World Wide Web Consortium (W3C). The next two lines ( .attr("x", 480 ) and .attr("y", 475 ) ) set the attributes for the x and y coordinates to position the text on the svg.

The second last line (.style("text-anchor", "middle")) ensures that the text ‘style’ is such that the text is centre aligned and therefore remains nicely centred on the x, y coordinates that we send it to.

The final line (.text("Date");) adds the actual text that we are going to place.
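
Collected into one place, the five lines just described might look like the sketch below. The coordinates 480 and 475 are the ones quoted above; the wrapping function (addHardCodedXLabel) is a hypothetical convenience so that the snippet stands alone, and svg is assumed to be the same selection used in the simple graph.

```javascript
// Hypothetical helper showing the hard coded version of the label.
function addHardCodedXLabel(svg) {
  return svg.append("text")              // add a text element to the svg
      .attr("x", 480)                    // fixed x coordinate
      .attr("y", 475)                    // fixed y coordinate
      .style("text-anchor", "middle")    // centre the text on that point
      .text("Date");                     // the label itself
}
```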

That seems really simple and effective and it is. However, the bad part about it is that we have hard coded the location for the ‘Date’ label into the code. This means if we change any of the physical aspects of the graph, we will end up having to re-calculate and edit our code. And we don’t want to do that.

Here’s an example. If I decide that I would prefer to decrease the height of the graph by editing the line here;

and making the height 490 pixels;

The result is as follows;

EVERYTHING about the graph has adjusted itself, except our nasty, hard coded ‘Date’ label which has been cruelly cut off. This is far from ideal and can be easily fixed by using the variables that we set up ever so carefully earlier.

Let’s let our variables do the walking and use;

So with this code we tell the script that the ‘Date’ label will always be halfway across the width of the graph (no matter how wide it is) and below the bottom of the graph, at a y position equal to the height plus the top margin plus a fixed offset of 20 pixels (remember that it uses a coordinate system that increases from the top down).
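
A sketch of the variable-driven version, following the description above, might look like this. The function name addXLabel is made up for the example; width, height and margin are assumed to be the variables set up at the start of the simple graph script.

```javascript
// Hypothetical helper: positions the label from the graph's own dimensions.
function addXLabel(svg, width, height, margin) {
  return svg.append("text")
      .attr("x", width / 2)                   // halfway across the graph
      .attr("y", height + margin.top + 20)    // below the x axis, 20px offset
      .style("text-anchor", "middle")
      .text("Date");
}
```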

The end result of using variables is that if I go to an extreme of changing the height and width of my graph to;

We still finish up with an acceptable result;

Well, for the label position at least :-).

So the change to using variables is just a useful lesson that variables rock and mean that you don’t have to worry about your graph staying in relative shape while you change the dimensions. The astute readers amongst you will have learned this lesson very early on in your programming careers, but it’s never a bad idea to make sure that readers who are unfamiliar with the concept have an indicator of why it’s a good idea.

Now for the third method that I mentioned at the start of our x axis odyssey. I mention it not because it’s a better or worse way to implement your script (I’m honestly not sure which it is), but because it’s sufficiently different to look confusing if you haven’t come across it before.

So, we’ll take our marvellous coordinates code;

And replace it with a single (longer) line;

This uses the "transform" attribute to move (translate) the ‘Date’ label to exactly the same spot that we’ve been using for the other two examples (using variables of course).
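
For comparison, the transform version could be sketched as follows. As before, addXLabelWithTransform is a hypothetical name and the variables are assumed to be those from the simple graph; the point is only that a single translate replaces the separate x and y attributes.

```javascript
// Hypothetical helper: one transform instead of separate x and y attributes.
function addXLabelWithTransform(svg, width, height, margin) {
  return svg.append("text")
      .attr("transform",
            "translate(" + (width / 2) + "," +
                           (height + margin.top + 20) + ")")
      .style("text-anchor", "middle")
      .text("Date");
}
```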

#### The y axis label

So, that’s the x axis label. Time to do the y axis. The code we’re going to use looks like this;

For the sake of neatness we will put the piece of code in a nice logical spot and this would be following the block of code that added the y axis (but before the closing curly bracket).

And the result looks like this;

There we go, a label for the y axis that is nicely centred and (gasp!) rotated by 90 degrees! Woah, does the leetness never end! (No. No it does not.)

So, how do we get to this incredible result?

The first thing we do is the same as for the x axis and append a text element to our svg element (svg.append("text")).

Then things get interesting.

That’s because the next line (.attr("transform", "rotate(-90)")) rotates everything by -90 degrees. While it’s obvious that the text label ‘Value’ has been rotated by -90 degrees (from the picture), the lines of code that follow it show that we also rotated our reference point (which can be a little confusing).

Let’s get graphical to illustrate how this works;

Here’s our starting position, with x,y in the 0,0 coordinate of the graph drawing area surrounded by the margins.

When we apply a -90 degrees transform we get the equivalent of this;

Here the coordinate system has been rotated by -90 degrees about the 0,0 coordinate and the x,y designations are flipped so that we now need to tell the script that we’re moving a ‘y’ coordinate when we would have otherwise been moving ‘x’.

Hence, when the script runs .attr("y", 0 - margin.left), we can see that this is moving the x position (as we see it on the screen) to the left from the new 0 coordinate by the margin.left value.

Likewise when the script runs .attr("x", 0 - (height / 2)), this is actually moving the y position (as we see it on the screen) from the new 0 coordinate to halfway along the height of the graph area.

Right, we’re not quite done yet. The following line has the effect of shifting the text slightly to the right.

The reason we do this is that our previous translation of coordinates means that when we place our text label it sits exactly on the line of 0 – margin.left. But in this case that takes the text to the other side of the line, so it actually sits just outside the boundary of the svg element.

The "dy" attribute is another coordinate adjustment move, but this time a relative adjustment and the “1em” is a unit of measure that equals exactly one unit of the currently specified text point size. So what ends up happening is that the ‘Value’ label gets shifted to the right by exactly the height of the text, which neatly places it exactly on the edge of the canvas.

The two final lines of this part of the script are the same as for the x axis. They make sure the reference point is aligned to the centre of the text (.style("text-anchor", "middle")) and then it prints the text (.text("Value");). There, that wasn’t too painful.
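
Putting the whole walkthrough in one place, the y axis label block might look like the sketch below. The function name addYLabel is hypothetical, and svg, height and margin are assumed to be the variables from the simple graph. The comments describe where each attribute moves the label as seen on the screen, which is the confusing part once the coordinates have been rotated.

```javascript
// Hypothetical helper: a rotated, centred label for the y axis.
function addYLabel(svg, height, margin) {
  return svg.append("text")
      .attr("transform", "rotate(-90)")   // rotates the text and its x,y axes
      .attr("y", 0 - margin.left)         // on screen: left by margin.left
      .attr("x", 0 - (height / 2))        // on screen: to half the graph height
      .attr("dy", "1em")                  // on screen: right by one text height
      .style("text-anchor", "middle")     // centre the text on that point
      .text("Value");
}
```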

The full code for this example can be found on github or in the code samples bundled with this book (axis-labels.html and data.csv). A live example can be found on bl.ocks.org.

### Adding a title to a graph

If you’ve read through the section on adding axis labels, most of this will come as no surprise.

What we want to do to add a title to the graph is to add a text element (just a few words) that will appear above the graph and centred left to right.

We’ll start with our default code for our simple graph. The full code for this can be found on github or in the code samples bundled with this book (simple-graph.html and data.csv). A live example can be found on bl.ocks.org.

Preparation: Because we’re going to be adding a title to the top of the graph we need to increase the top margin. Changes like the following should suffice;

To add the title this is the code we’re going to add to the simple line graph script;

And the end result will look like this;

A nice logical place to put the block of code would be towards the end of the JavaScript. In fact I would put it as the last element we add. So here;

Now since the vast majority of the code for this block is a regurgitation of the axis labels code, I don’t want to revisit that and bloat up this document even more, so I will direct you back to that section if you need to refresh yourself on any particular line. But….. There are a couple of new ones in there which could benefit from a little explanation.

Both of them are style descriptors and as such their job is to apply a very specific style to this element.

What they do is pretty self explanatory. Make the text a specific size and underline it. But what is perhaps slightly more interesting is that we have this declaration in the JavaScript code and not in the CSS portion of the file.
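
For reference, a title block of the kind described in this section might look like the sketch below. The function name addTitle, the title wording, the 16px size and the vertical position (halfway up the top margin) are all assumptions made for the illustration; the two style lines are the ‘new ones’ discussed above.

```javascript
// Hypothetical helper: a centred, underlined title above the graph.
function addTitle(svg, width, margin, titleText) {
  return svg.append("text")
      .attr("x", width / 2)                     // centred left to right
      .attr("y", 0 - (margin.top / 2))          // assumed: middle of top margin
      .style("text-anchor", "middle")
      .style("font-size", "16px")               // assumed size
      .style("text-decoration", "underline")    // underline the title
      .text(titleText);
}
```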

### Change a line chart into a scatter plot

Confession time.

I didn’t actually intend to add in a section with a scatter plot in it because I thought it would be;

1. tricky
2. not useful
3. all of the above

I was wrong on all counts.

All you need to do is take the simple graph example file. The full code for this can be found on github or in the code samples bundled with this book (simple-graph.html and data.csv). A live example can be found on bl.ocks.org.

We then slot the following block in between the ‘Add the valueline path’ and the ‘add the x axis’ blocks.

And you will get…

The full code for this graph can also be found on github or in the code samples bundled with this book (scatterplot.html and data.csv). A live example can be found on bl.ocks.org.

I deliberately put the dots after the line in the drawing section, because I thought they would look better, but you could put the block of code before the line drawing block to get the following effect;

(just trying to reinforce the concept that ‘order’ matters when drawing objects :-)).

You could of course just remove the line block altogether…

But in my humble opinion it loses something.

So what do the individual lines in the scatter plot block of JavaScript do?

The first line (svg.selectAll("dot")) essentially provides a suitable grouping label for the svg circle elements that will be added. The next line associates the range of data that we have to the group of elements we are about to add in.

Then we add a circle for each data point (.enter().append("circle")) with a radius of 5 pixels (.attr("r", 5)) and appropriate x (.attr("cx", function(d) { return x(d.date); })) and y (.attr("cy", function(d) { return y(d.close); });) coordinates.
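
The lines described above fit together as in the following sketch. The only thing added here is the wrapping function (addDots, a hypothetical name) so that the snippet is self-contained; svg, data, x and y are assumed to be the selection, the loaded data and the two scales from the simple graph.

```javascript
// Hypothetical helper: adds one circle per data point.
function addDots(svg, data, x, y) {
  return svg.selectAll("dot")                          // grouping label
      .data(data)                                      // associate the data
    .enter().append("circle")                          // one circle per point
      .attr("r", 5)                                    // radius of 5 pixels
      .attr("cx", function(d) { return x(d.date); })   // x position from date
      .attr("cy", function(d) { return y(d.close); }); // y position from close
}
```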

There is lots more that we could be doing with this piece of code (check out the scatter plot example) including varying the colour or size or opacity of the circles depending on the data and all sorts of really neat things, but for the meantime, there we go. Scatter plot!

### Smoothing out graph lines

When you draw a line graph, what you’re doing is taking two (or more) sets of coordinates and connecting them with a line (or lines). I know that sounds simplistic, but bear with me. When you connect these points, you’re telling the viewer of the graph that in between the individual points, you expect the value to vary in keeping with the points that the line passes through. So in a way, you’re trying to interpret the change in values that are not shown.

Now this is not strictly true for all graph types, but it does hold for a lot of line graphs.

So… when connecting these known coordinates together, you want to make the best estimate of how the values would be represented. In this respect, sometimes a straight line between points is not the best representation.

For instance, earlier, when demonstrating the extent function for graphing, we showed a graph of the varying values with the y axis showing a narrow range.

The resulting variation of the graph shows a fair amount of extremes and you could be forgiven for thinking that if this represented a smoothly flowing analog system of some kind then some of those sharp peaks and troughs would not be a true representation of how the system or figures varied.

So how should it look? Ahh… The $64,000 question. I don’t know :-). You will have a better idea since you are the person who will know your data best. However, what I do know is that D3 has some tricks up its sleeve to help.

We can easily change what we see above into;

How about that? And the massive amount of code required to carry out what must be a ridiculously difficult set of calculations?

So where does this neat piece of code go? Here;

So is that it? Nooooo…….. There’s more! This is one form of interpolation effect that can be applied to your data, but there is a range and depending on your data you can select the one that is appropriate. Here’s the list of available options and for more about them head on over to the D3 wiki.

• linear (d3.curveLinear) - a normal line (jagged).
• linear-closed (d3.curveLinearClosed) - a normal line (jagged) with the start and the end closed in a loop.
• step (d3.curveStep) - a stepping graph alternating between vertical and horizontal segments. The y values change at the mid point of the adjacent x values.
• step-before (d3.curveStepBefore) - a stepping graph alternating between vertical and horizontal segments. The y values change before the x value.
• step-after (d3.curveStepAfter) - a stepping graph alternating between horizontal and vertical segments. The y values change after the x value.
• basis (d3.curveBasis) - a B-spline, with control point duplication on the ends (that’s the one above).
• basis-open (d3.curveBasisOpen) - an open B-spline; may not intersect the start or end.
• basis-closed (d3.curveBasisClosed) - a closed B-spline, with the start and the end closed in a loop.
• bundle (d3.curveBundle) - equivalent to basis, except a separate tension parameter is used to straighten the spline. This could be really cool with varying tension.
• cardinal (d3.curveCardinal) - a Cardinal spline, with control point duplication on the ends. It looks slightly more ‘jagged’ than basis.
• cardinal-open (d3.curveCardinalOpen) - an open Cardinal spline; may not intersect the start or end, but will intersect other control points. So kind of shorter than ‘cardinal’.
• cardinal-closed (d3.curveCardinalClosed) - a closed Cardinal spline, looped back on itself.
• monotone (d3.curveMonotoneX) - cubic interpolation that makes the graph only slightly smoother.
• catmull-Rom (d3.curveCatmullRom) - New for v4 - a cubic Catmull–Rom spline.
• catmull-Rom-closed (d3.curveCatmullRomClosed) - New for v4 - a closed cubic Catmull–Rom spline.
• catmull-Rom-open (d3.curveCatmullRomOpen) - New for v4 - an open cubic Catmull–Rom spline.

In the course of writing this I took the opportunity to play with each of them. I was pleasantly surprised to see some of the effects and it seems like a shame to deprive the reader of the same joy :-). So at the risk of deforesting the planet (so I hope you are reading this in electronic format) here is each of the above interpolation types applied to the same data.

This is also an opportunity to add some reader feedback awesomeness. Many thanks to ‘enjalot’ for the great suggestion to plot the points of the data as separate circles on the graphs. The process of interpolation ‘interprets’ the trends of the data to the extent that, in some cases, the lines don’t intersect the actual data much at all. Each of the following shows the smoothing curve and the data that is used to plot the graph (as a scatterplot).

Just in case you’re in the mood for another example, feel free to check out the bl.ock here which shows all of the basic forms of the curve types (I didn’t include the open and closed versions or ‘bundle’ since it is the equivalent of ‘basis’). The full code for this can also be found in the code samples bundled with this book (interpolate.html and data-3.csv). A live example can be found on bl.ocks.org.

So, over to you to decide which format of interpolation is going to suit your data best :-).
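
As a minimal sketch of how one of these curve types is applied, the line generator from the simple graph could be given a .curve() call as shown below. The function name makeSmoothLine is hypothetical, and d3, x and y are assumed to be the d3.js v4 object and the two scales used earlier. Swapping d3.curveBasis for any other entry in the list above changes the interpolation.

```javascript
// Hypothetical helper: a line generator that smooths with a B-spline.
function makeSmoothLine(d3, x, y) {
  return d3.line()
      .curve(d3.curveBasis)                         // the one extra line
      .x(function(d) { return x(d.date); })
      .y(function(d) { return y(d.close); });
}
```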
### Make a dashed line

Dashed lines totally rock! One of the best parts about them is that they’re so simple to do! Literally one line!!!!

So let’s imagine that we want to make the line on our simple graph dashed. All we have to do is insert the following line in our JavaScript code here;

And our graph ends up like this;

Hey! It’s dashtastic!

So how does it work?

Well, obviously "stroke-dasharray" is a style for the path element, but the magic is in the numbers. Essentially they describe the on length and off length of the line. So "3, 3" translates to 3 pixels (or whatever they are) on and 3 pixels off. Then it repeats. Simple eh?

So, experiment time :-)

What would the following represent?

"5, 5, 5, 5, 5, 5, 10, 5, 10, 5, 10, 5"

Try not to cheat…

Ahh yes, Mr. Morse would be proud.

And you can put them anywhere. Here are our axes perverted with dashes;

Well… I suppose you can have too much of a good thing. With great power comes great responsibility. Use your dash skills wisely and only for good.

### Filling an area under the graph

Lines are all very well and good, but that’s not the whole story for graphs. Sometimes you’ve just got to go with a fill.

Filling an area with a solid colour isn’t too hard. I mean we did it by mistake back a few pages when we were trying to draw a line. But to do it in a nice coherent way is fairly straight forward. It takes three sections of code in much the same way that we drew our grid lines earlier;

1. One in the CSS section to define what style the area will have.
2. One to define the functions that generate the area. And…
3. One to draw the area.

The end result will look a bit like this;

While we’ll start with our default code for our simple graph, the full code for this area graph can be found on github or in the code samples bundled with this book (area.html and data.csv). A live example can be found on bl.ocks.org.
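
As a rough preview of the second and third pieces in the list above, the area generator and the block that draws it might be sketched as follows. The function names makeArea and drawArea are hypothetical wrappers added so the snippet stands alone; d3, svg, x, y, height and data are assumed to be the objects and variables from the simple graph, and the "area" class is assumed to be styled in the CSS.

```javascript
// Hypothetical helper: describes the region between the data and the x axis.
function makeArea(d3, x, y, height) {
  return d3.area()
      .x(function(d) { return x(d.date); })
      .y0(height)                                // bottom edge: foot of graph
      .y1(function(d) { return y(d.close); });   // top edge: follows the data
}

// Hypothetical helper: draws the filled area as a single path.
function drawArea(svg, data, area) {
  return svg.append("path")
      .data([data])              // the data that describes the area
      .attr("class", "area")     // picks up the fill defined in the CSS
      .attr("d", area);          // the area generator produces the path data
}
```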
#### CSS for an area fill

This is pretty straight forward and only consists of one rule;

Put it at the bottom of your <style> section.

The style (fill: lightsteelblue;) sets the colour of our fill (and in this case we have chosen a lighter shade of the same colour as our line to match it).

#### Define the area function

We need a function that will tell the area what space to fill. This is accessed from the d3.area function.

The code that we will use is as follows;

I have placed it in between the range variable definitions and the line definitions here;

So, compared with the line definition, the only changes to the code are the addition of the y0 line and the renaming of the y line to y1.

Here’s a picture that might help explain;

As should be apparent, the top line (y1) follows the valueline line and the bottom line is at the constant ‘height’ value. Everything in between these lines is what gets filled. The function in this section describes the area.

#### Draw the area

Now to the money maker. The final section of code in the area filling odyssey is as follows;

We should place this block directly after the domain functions but before the drawing of the valueline path;

It is actually a pretty good idea to put it there since the various bits and pieces that are drawn in the graph are done so one after the other. This means that the filled area comes first, then the valueline is layered on top and then the axes come last. This is a pretty good sequence, since where two or more elements overlap, drawing them in a different order might cause the graph to look ‘wrong’.

For instance, here is the graph drawn with the area added last.

You should be able to notice that part of the valueline line has been obscured and the line for the y axis where it coincides with the area is obscured also.

Looking at the code we are adding here, the first line appends a path element (svg.append("path")) much like the script that draws the line.
The second line (.data([data])) declares the data we will be utilising for describing the area and the third line (.attr("class", "area")) makes sure that the style we apply to it is as defined in the CSS section (under ‘area’). The final line (.attr("d", area);) declares “d” as the attribute for path data and calls the ‘area’ function to do the drawing.

And that’s it!

#### Filling an area above the line

Pop Quiz: How would you go about filling the area ABOVE the graph?

In this instance, you could fill the lower area as has been demonstrated here, and with a small change you can fill another area with a solid colour above another line.

How is this incredible feat achieved?

Well, remember the code that defined the area?

All we have to do is tell it that instead of setting the y0 constant value to the height of the graph (remember, this is the bottom of the graph) we will set it to the constant value that is at the top of the graph. In other words zero (0).

That’s it.

Now, I’m not going to go over the process of drawing two lines and filling each in different directions to demonstrate the example I described, but this provides a germ of an idea that you might be able to flesh out :-)

### Adding a drop shadow to allow text to stand out on graphics.

I’ve deliberately positioned this particular tip to follow the ‘filling an area’ description because it provides an opportunity to demonstrate the principle to slightly better effect.

While we’ll start with our code for our area graph, the full code for the graph with shadowy text can be found on github or in the code samples bundled with this book (shadow.html and data.csv). A live example can be found on bl.ocks.org.

There have been several opportunities where I have wanted to place text overlaid on graphs for convenience sake only to have it look overly messy as the text interferes with the graph.
Anyway, what we’ll do is leave the area fill in place and place the title back on the graph, but position the title so that it lies on top of the fill like so;

The additional code for the title is the following and appears just after the drawing of the axes.

(The only changes from the previous title example are the ‘y’ attribute, which has been hard coded to 25 to place it inconveniently on the filled area, and the size of the font.)

So, what we want to end up with is something like the following…

In my humble opinion, it’s just enough to make the text acceptable :-).

The method that I’ll describe to carry this out is designed so that the drop shadow effect can be applied to any text elements in the graph, not just the isolated example that we will use here.

In order to implement this marvel of utility we will need to make changes in two areas. One in the CSS where we will define a style for white shadowy backgrounds and the second to draw it.

#### CSS for white shadowy background

The code to add to the CSS section is as follows;

The first line designates that the style applies to text with a ‘shadow’ label. The stroke is set to white. The width of the line is set to 4px and it is made to be slightly see-through.

So setting the line that surrounds the text to be thick, white and see-through gives it a slightly ‘cloudy’ effect. If we remove the black text from over the top we get a slightly better look at it;

Of course if you want to have a play with any of these settings, you should have a go and see what works best for your graph.

#### Drawing the white shadowy background.

Now that we’ve set the style for our background, we need to draw it in.

The code for this should be extremely familiar;

That’s because it’s identical to the piece of code that was used to draw the title except for the one line that is indicated above.
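
A sketch of what such a block might look like is shown below. The function name addTitleShadow, the font size and the opacity value mentioned in the comment are assumptions; the white stroke, the 4px width, the ‘shadow’ class and the hard coded y value of 25 come from the description above.

```javascript
// CSS assumed for the shadow class (the opacity value is a guess at
// "slightly see-through"):
//   text.shadow { stroke: white; stroke-width: 4px; opacity: 0.9; }

// Hypothetical helper: draws the white, cloudy copy of the title text.
function addTitleShadow(svg, width, titleText) {
  return svg.append("text")
      .attr("x", width / 2)
      .attr("y", 25)                      // hard coded, same as the title
      .attr("text-anchor", "middle")
      .attr("class", "shadow")            // the one added line
      .style("font-size", "16px")         // assumed size, same as the title
      .text(titleText);
}
```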
The reason that it’s identical is that what we are doing is placing a white shadow on the graph and then the text on top of it. If it deviated by a significant amount it would just look silly. Of course a slight amount could look effective, in which case adjust the ‘x’ or ‘y’ attributes.

One of the things I pointed out in the previous paragraph was extremely important. That’s the bit that tells you that we needed to place the shadow before we placed the black text. For the same reason that we placed the area fill on first in the area fill example, if black text goes on before the shadow, it will look pretty silly. So place this block of code just before the block that draws the title.

So the line that has been added in is the one that tells D3 that the text that is being drawn will have the white cloudy effect. And at the risk of repeating myself, if you have several text elements that could benefit from this effect, once you have the CSS code in place, all you need to do is duplicate the block that adds the text and add in that single line and voila!

### Adding grid lines to a graph

Grid lines are an important feature for some graphs as they allow the eye to associate three analogue scales (the x and y axis and the displayed line).

There is currently a tendency to use graphs without grid lines online as it gives the appearance of a ‘cleaner’ interface, but they are still widely used and a necessary component for graphing.

This is what we’re going to draw;

While we’ll start with our default code for our simple graph, the full code for the graph with grid lines can be found on github or in the code samples bundled with this book (grid.html and data.csv). A live example can be found on bl.ocks.org.

How to build grid lines?

We’re going to use the axis function to generate two more axis elements (one for x and one for y) but for these ones instead of drawing the main lines and the labels, we’re just going to draw the tick lines.
Really long tick lines (I’m considering calling them long cat lines).

To create them we have to add in 3 separate blocks of code.

1. One in the CSS section to define what style the grid lines will have.
2. One to define the functions that generate the grid lines. And…
3. One to draw the lines.

#### The grid line CSS

This is the total styling that we need to add for the tick lines;

Just add this block of code at the end of the current CSS that is in the simple graph template (just before the </style> tag).

The CSS here is done in two parts.

The first portion sets the line colour (stroke), the opacity (transparency) of the lines and makes sure that the lines are narrow (crispEdges).

The colour is pretty standard, but in using the opacity style we give ourselves the opportunity to use a good shade of colour (if grey actually is a colour) and to juggle the degree to which it stands out a little better.

The second part is the stroke width. Now it might seem a little weird to be setting the stroke width to zero, but if you don’t (and we remove the style) this is what happens;

If you look closely (compare with the previous picture if necessary) the grid lines for the top and right edges have turned black. The stroke width style is obviously adding in new axis lines and we’re not interested in them at the moment. Therefore, if we set the stroke width to zero, we get rid of the problem.

#### Define the grid line functions

We will need to define two functions to generate the grid lines and they look a little like this;

Each function will carry out its configuration when called from the later part of the script (the drawing part).

A good spot to place the code is just before we load the data with the d3.csv call.

Both functions are almost identical. They give the function a name (make_x_gridlines and make_y_gridlines) which will be used later when the piece of code that draws the lines calls out to them.
Both functions also show which parameters will be fed back to the drawing process when called. Both make sure they use one of the D3 axis functions and then they set individual attributes which make sense. They make sure they’ve got the right axes (with the x and y variables in the function). They set the orientation of the axes to match the incumbent axes (d3.axisBottom(x) and d3.axisLeft(y)). And they set the number of ticks to match the number of ticks in the main axis (.ticks(5) and .ticks(5)).

You have the opportunity here to do something slightly different if you want. For instance, think back to when we were setting up the axis for the basic graph and we messed about, seeing how many ticks we could get to appear. If we increase the number of ticks that appear in the grid (let’s say to .ticks(30) and .ticks(10)) we get the following;

So the grid lines can now show divisions of 20 on the y axis and per 2 days on the x axis :-)

#### Draw the lines

The final block of code we need is the bit that draws the lines.

The first two lines of both the x and y axis grid lines code above should be pretty familiar by now. The first one appends a group element (“g”) to hold what is drawn. The second line (.attr("class", "grid")) makes sure that the style information set out in the CSS is applied.

The x axis grid lines portion makes a slight deviation from conformity here to adjust its positioning to take into account the coordinates system (.attr("transform", "translate(0," + height + ")")).

Then both portions call their respective make axis functions (.call(make_x_gridlines() and .call(make_y_gridlines()).

Now comes the really interesting bit.

What you will see if you go to the D3 API wiki is that for the .tickSize function, the following is the format.

So in our example we are setting our ticks to a length that corresponds to the full height or width of the graph. Which of course means that they extend across the graph and have the appearance of grid lines! What a neat trick.
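
Putting the definition and drawing steps together, a sketch of the grid line code might look like the following. The function names make_x_gridlines and make_y_gridlines are the ones used above, but here they take d3 and the scale as parameters (an assumption made so the snippet is self-contained, where the graph script would more likely rely on shared variables). The drawGridlines wrapper is hypothetical, as is the use of negative tick sizes to stretch the ticks back across the graph.

```javascript
// Grid line generators: ordinary axes with a matching number of ticks.
function make_x_gridlines(d3, x) {
  return d3.axisBottom(x).ticks(5);
}
function make_y_gridlines(d3, y) {
  return d3.axisLeft(y).ticks(5);
}

// Hypothetical helper: draws both sets of grid lines.
function drawGridlines(d3, svg, x, y, width, height) {
  // x grid lines: start at the bottom and reach up the full height
  svg.append("g")
      .attr("class", "grid")
      .attr("transform", "translate(0," + height + ")")
      .call(make_x_gridlines(d3, x)
          .tickSize(-height)     // tick length equals the graph height
          .tickFormat(""));      // no labels on the grid

  // y grid lines: start at the left and reach across the full width
  svg.append("g")
      .attr("class", "grid")
      .call(make_y_gridlines(d3, y)
          .tickSize(-width)      // tick length equals the graph width
          .tickFormat(""));
}
```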
The last thing that is included in the code to draw the grid lines is the instruction to suppress printing any label for the ticks;

After all, it would become a bit confusing to have two sets of labels. Even if one was on top of the other. They do tend to become obvious if that occurs (they kind of bulk out a bit like bold text).

And that’s it. Grid lines!

### Adding more than one line to a graph

All right, we’re starting to get serious now. Two lines on a graph is a bit of a step into a different world in one respect. I mean that in the sense that there’s more than one way to carry out the task, and I tend to do it one way and not the other mainly because I don’t fully understand the other way :-(.

How are we going to do this? I think that the best way will be to make the executive decision that we have suddenly come across more data and that it is also in our data.csv file (which we’ll rename data2.csv just to highlight the difference between the two data sets). In fact it looks a little like this (apologies in advance for the big ugly block of data);

Three columns: date, close and open. The first two are exactly what we have been dealing with all along and the last (open) is our new made up data. Each column is separated by a comma (hence .csv (comma separated values)), which is the format we’re currently using to import data.

We should save this as a new file so we don’t mess up our previous data, so (as mentioned earlier) let’s call it data2.csv.

There is a copy of this file and the sample code at github and in the code samples bundled with this book (multiple-lines.html and data2.csv). A live example can be found on bl.ocks.org.

We will build our new code using our simple graph template to start with, so the immediate consequence of this is that we need to edit the line that was looking for ‘data.csv’ to reflect the new name.

So when we browse to our new graph’s html file, we don’t see any changes.
It still happily loads the new data, but because it hasn’t been told to do anything with it, nothing new happens.

What we need to do now is to essentially duplicate the code blocks that drew the first line for the second line.

The good news is that in the simplest way possible that’s just two code blocks. The first sets up the function that defines the new line;

You should notice that this block is identical to the block that sets up the function for the first line, except this one is called (imaginatively) valueline2 and instead of including the variable ‘close’ for our datapoint we are using our new variable (from the extra column in the csv file) ‘open’. We should put it directly after the block that sets up the function for valueline.

The second block draws our new line;

Again, this is identical to the block that draws the first line, except this one is called valueline2. We should put it directly after the block that draws valueline.

After those three small changes, check out your new graph;

Hey! Two lines! Hmm…. Both being the same colour is a bit confusing. Good news. We can change the colour of the second line by inserting a line that adjusts its stroke (colour) very simply.

So here’s what our new drawing block looks like;

And as if by magic, here’s our new graph;

Wow. Right about now, we’re thinking ourselves pretty clever. But there are two places where we’re not doing things right. We took a simple way, but we took some short cuts that might bite us in the posterior.

The first mistake we made was not ensuring that our variable "d.open" is being treated as a number rather than a string. We’re fortunate in this case that it is, but this can’t always be assumed. So, this is an easy fix and we just need to put the following (indicated line) in our code;

The second and potentially more fatal flaw is that nowhere in our code do we make allowance for our second set of data (the second line’s values) exceeding our first line’s values.
That might not sound like a problem straight away, but consider this. What if when we made up our data earlier, some of the new data exceeded our maximum value in our original data?

As a means of demonstration, here’s what happens when our second line of data has values higher than the first line’s;

Ahh…. We’re not too clever now.

Good news though, we can fix it!

The problem comes about because when we set the domain for the y axis this is what we put in the code;

So that only considers d.close when establishing the domain. With d.open exceeding our domain, it just keeps drawing off the graph!

The good news is that ‘Bill’ has provided a solution for just this problem here;

All you need to replace the y.domain line with is this;

It does much the same thing, but this time it returns the maximum of d.close and d.open (whichever is largest). Good work Bill.

If we put that code into the graph with the higher values for our second line we are now presented with this;

And it doesn’t matter which of the two sets of data is largest, the graph will always adjust :-)

You will also have noticed that our y axis has auto adjusted again to cope. Clever eh?

### Labelling multiple lines on a graph

Our previous example of a graph with multiple lines is a thing of rare beauty, but which line relates to which set of data?

We have data that defines values for open and close, but we don’t know which line is which. In this section we will add labels to our lines so that we know what is what.

This section was inspired by a question from a reader (Arun b.s) of the d3noob.org blog where the question was asked “How can we put text at the end of each line on the graph?”. The question was so good I realised that it had to be part of the book, so here you go :-).

It’s actually not too difficult. What we are trying to achieve is to find the position of the end of each line and to add a text label at that position so that the association of proximity denotes the linkage.
Of course we’re going to go a little further and colour the text so that it’s really clear which label belongs with which line, but you get the idea.

Each line requires a single block of script to add the text. The block that adds the open label is as follows;

So firstly it appends a textual element to the svg object;

Then it finds the position of the end of the line;

To do this we use the transform attribute with a translate and find the x position that equates to the end of the graph plus 3 pixels ((width+3)) (we add in the three pixels to create a small separation between the end of the line and the label).

The y position is far more interesting. We need to find the position of the last point in our line for the open data. Because the data is in the form of an indexed array and because the data has the latest date at the start of the array, we only need to find the point at the 0 position of the array. This is data[0].open. But of course, we also need to adjust our data for our scale and range, so we transform it using the y function (in the same way that we do it for the valueline and valueline2 points). So the script to find the point on the screen in the y direction is y(data[0].open). If our data was arranged with the last date at the end of our data we would have to find the final index point and we would use y(data[data.length-1].open).

Then it’s just a matter of aligning and justifying our text correctly;

Then colouring it the correct colour;

And adding our text;

We put this block of code after the blocks that add in the axes so that the labels are drawn on top of anything else we draw.

Of course we will want to add another (almost) duplicate of the block for the ‘close’ column.

The only other small change we want to make is to change the right margin for the graph that we set at the start of our script from 20 to 40 so that there is enough room to add our label without cutting it off.

After that you have a marvellously labelled multi-line graph!
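Putting those steps together, a label block might be sketched roughly as follows. The addEndLabel function is a hypothetical helper (it is not part of d3.js) that wraps the chain so the same code can serve both lines; svg, y, width and data are assumed to be the variables from the multi-line graph script, and the dy and text-anchor values are my assumptions about sensible alignment settings.

```javascript
// addEndLabel is a hypothetical helper, not part of d3.js.
// svg   : the selection the graph is drawn in
// y     : the y scale (any function mapping a data value to pixels)
// width : the width of the graph area
function addEndLabel(svg, y, width, value, colour, label) {
    return svg.append("text")
        // x is the end of the graph plus 3 pixels, y is the scaled value
        .attr("transform",
              "translate(" + (width + 3) + "," + y(value) + ")")
        .attr("dy", ".35em")            // assumed: centre the text vertically
        .attr("text-anchor", "start")   // assumed: text starts at the anchor
        .style("fill", colour)
        .text(label);
}

// Possible usage once the data has loaded and the axes have been drawn,
// with the latest date at the start of the array:
// addEndLabel(svg, y, width, data[0].open, "red", "Open");
// addEndLabel(svg, y, width, data[0].close, "steelblue", "Close");
```

If the data were sorted with the latest date last, data[data.length-1] would take the place of data[0] in the usage lines.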
The full code for this example can be found on github or in the code samples bundled with this book (dual-labels.html and data2.csv). A working example can be found on bl.ocks.org.

Now, I’d like to pretend that this is perfection, but it isn’t. If our lines end too close together, the labels will interfere with each other, so in the ideal world I would include a bit of fanciness to prevent that, but for the purposes of this exercise we can consider ourselves happy.

### Multiple axes for a graph

Alrighty… Let’s imagine that we want to show our wonderful graph with two lines, much like we already have, but imagine that the data that the lines are made from is significantly different in magnitude from the original data (in the example below, the data for the second line has been reduced by approximately a factor of 10 from our original data).

Now this isn’t a problem in itself. D3 will still make a reasonable graph of the data, but because of the difference in range, the detail of the second line will be lost.

What I’m proposing is that we have a second y axis on the right hand side of the graph that relates to the red line.

The mechanism used is based on the great examples put forward by Ben Christensen here.

The full code for this example can be found on github or in the code samples bundled with this book (dual-axes.html and data4.csv). A working example can be found on bl.ocks.org.

First things first, there won’t be space on the right hand side of our graph to show the extra axis, so we should make our right hand margin a little larger.

I went for 40 and it seems to fit pretty well.

Then (and here’s where the main point of difference for this graph comes in) you want to amend the code to separate out the two scales for the two lines in the graph. This is actually a lot easier than it sounds, since it consists mainly of finding anywhere that mentions y and replacing it with y0 and then adding in a reciprocal piece of code for y1.

Let’s get started.
In order to colour the text on the two different y axes we will need to declare their styles in the <style> section at the start of the code. In the declaration shown below we are calling the two different classes ‘axisSteelBlue’ and ‘axisRed’. The style that we set for each is ‘fill’ since this is the style that will colour the text.

Then into the JavaScript and we want to change the variable declaration for y to y0 and add in y1.

Now change our valueline declarations so that they refer to the y0 and y1 scales.

There are a few different ways for the scaling to work, but we’ll stick with the fancy max method we used in the dual line example (although technically it’s not required).

Again, here’s the y0 and y1 changed and added (and the maximums for d.close and d.open are separated out).

The final piece of the puzzle is to draw the new axis, but we also want to colour code the text in the axes to match the lines. We do this using the styling that we declared earlier.

In the above code you can see where we have added in a class for the axisLeft to make its text ‘steelblue’ and a complementary change in the new section for axisRight to make that text red.

The axisRight section obviously needs to be added in, but the only significant difference is the transform / translate attribute that moves the axis to the right hand side of the graph.

And after all that, here’s the result…

Now, let’s not kid ourselves that it’s a thing of beauty, but we should console our aesthetic concerns with the warm glow of understanding how the function works :-).

## Elements, Attributes and Styles

This chapter is intended to provide an overview of some of the simpler things that d3.js can do, but in a way that may help some understand a little more about how images can be added to a web page and how they can be manipulated.
Loosely speaking we will look at how objects (elements such as circles, rectangles, lines and even text) can be declared and added to a page, how their attributes in relation to the page (position, size, shape, actions) can be changed and how their style (colour, width, transparency) can be applied.

As we go through the explanation of different changes that can be applied to different elements there will be a small amount of repetition where there is cross-over with related drawing features. Please be patient :-). The aim is to have each section as complete in its own right as practical.

### The Framework

To be able to demonstrate how these three related aspects of drawing objects work we will have to use a small, simple script to draw them in your web browser. We will just take a moment to explain the script that draws a circle.

Here’s the contents of the file in its entirety. I have imaginatively called it circle.html and you can find it in the code samples that can be downloaded with the book.

Please feel free to jump ahead slightly if you understand how an HTML file with JavaScript goes together :-).

The HTML part of the file can be thought of as a wrapper for the JavaScript that will draw our circle. These are the HTML parts here…

This portion of the file is built using HTML ‘tags’. These will set up the environment for the JavaScript. The tags tell the web browser what sort of language is being used and the type of characters used to write the code…

Areas of the code are labelled. Like the body…

And the place where we put the JavaScript…

We even load an external file that contains JavaScript that will help run our code.

Yes, that’s the line that loads d3.js. Once it’s loaded we can use the instructions that it makes available to make other JavaScript code (in this case ours) work.

Then we have the JavaScript code that allows us to use the functions made possible by d3.js.

I’ve broken the code into two separate blocks to provide some clarity to their function.
We could make it one block, but that wouldn’t necessarily make it easier to understand.

Firstly we add a ‘holder’ for our graphics on the web page.

I’ve named it holder but we could just as easily have named it anything we wanted. The first thing we do when declaring our holder is to select the body element of our web page (Remember those <body> tags in the HTML part earlier?). Then we append a Scalable Vector Graphic (SVG) object to the body and we make it 449 pixels wide and 249 pixels high. The width and height are ‘attributes’ of the SVG object. That is to say they describe a property of the object.

The second block of our JavaScript finally draws our circle.

The first line appends a new element (a circle) to our SVG ‘holder’. The second and third lines declare the attributes of our circle that specify where the centre of the circle is. In this case it’s at the x/y position 200/100 (cx/cy). The last line adds the radius attribute r. Here it is set to 50 pixels.

The three attributes cx, cy and r are all required when drawing a circle. There are other attributes we can put in there (and when we look at some of the upcoming elements, you should get a feel for them), but these are the minimum.

The purpose of describing this block of code that draws a circle isn’t to show you how to draw a circle. This has only been a way of showing you how the code in the following sections is laid out and how it works. The elements we are going to generate can be drawn with exactly the same file but with just the section that adds the circle altered.

For example if you were to change this block of code;

For this block of code;

Instead of drawing a circle we would be drawing a rectangle.
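To make the two blocks concrete, here is a minimal sketch of the holder and circle code described above. The drawCircle wrapper and the check for whether d3 is present are my additions so that the block can be read and run in isolation; in circle.html the chain would be written directly against holder inside the script tags.

```javascript
// drawCircle is a hypothetical wrapper around the circle block.
function drawCircle(holder) {
    return holder.append("circle")   // attach a circle to the holder
        .attr("cx", 200)             // x position of the centre
        .attr("cy", 100)             // y position of the centre
        .attr("r", 50);              // radius in pixels
}

// This part only runs in a web page where d3.js has been loaded.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
    var holder = d3.select("body")   // select the body element
        .append("svg")               // append an SVG object to it
        .attr("width", 449)          // 449 pixels wide
        .attr("height", 249);        // 249 pixels high

    drawCircle(holder);
}
```

Swapping the element name and attributes inside the wrapper (for example "rect" with x, y, width and height) is all it takes to draw a different shape on the same holder.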
So this is what our circle will look like;

Because it will help a great deal to have a common frame of reference, I’m going to display the elements on a grid that looks a little like this;

With the grid in place it’s far easier to see that the centre of our circle is indeed at the coordinates x = 200, y = 100 and that the radius is 50.

The circle is still somewhat plain, but bear with me because as we start to explore what we can do with styles and attributes we can add some variation to our elements.

With that explanation behind us we should begin our odyssey into the world of d3 elements.

### Elements

We will begin by describing what we mean when we talk about an ‘element’.

There is considerable scope for confusion when talking about elements on a web page. Are we talking about HTML elements, SVG elements or something different?

In fact we are going to be describing a subset of SVG elements. Specifically a collection of common shapes and objects which include circles, ellipses, rectangles, lines, polylines, polygons, text and paths.

“Text?” I hear you say. “Doesn’t sound like a shape.” I suppose it depends on how you think of it. We can use text in different ways in d3, but for this particular exercise we can regard text as an SVG element.

#### Circle

A circle is a simple SVG shape that is described by three required attributes.

- cx: The position of the centre of the circle in the x direction (left / right) measured from the left side of the screen.
- cy: The position of the centre of the circle in the y direction (up / down) measured from the top of the screen.
- r: The radius of the circle from the cx, cy position to the perimeter of the circle.

The following is an example of the code section required to draw a circle in conjunction with the HTML file outlined at the start of this chapter;

This will produce a circle as follows;

The centre of the circle is at x = 200 and y = 100 and the radius is 50 pixels.
#### Ellipse

An ellipse is described by four required attributes;

- cx: The position of the centre of the ellipse in the x direction (left / right) measured from the left side of the screen.
- cy: The position of the centre of the ellipse in the y direction (up / down) measured from the top of the screen.
- rx: The radius of the ellipse in the x dimension from the cx, cy position to the perimeter of the ellipse.
- ry: The radius of the ellipse in the y dimension from the cx, cy position to the perimeter of the ellipse.

The following is an example of the code section required to draw an ellipse in conjunction with the HTML file outlined at the start of this chapter;

This will produce an ellipse as follows;

The centre of the ellipse is at x = 200 and y = 100 and the radius is 50 pixels vertically and 100 pixels horizontally.

#### Rectangle

A rectangle is described by four required attributes and two optional ones;

- x: The position on the x axis of the left hand side of the rectangle (required).
- y: The position on the y axis of the top of the rectangle (required).
- width: The width (in pixels) of the rectangle (required).
- height: The height (in pixels) of the rectangle (required).
- rx: The radius curve of the corner of the rectangle in the x dimension (optional).
- ry: The radius curve of the corner of the rectangle in the y dimension (optional).

The following is an example of the code section required to draw a rectangle (using only the required attributes) in conjunction with the HTML file outlined at the start of this chapter;

This will produce a rectangle as follows;

The top left corner of the rectangle is at 100, 50 and the rectangle is 200 pixels wide and 100 pixels high.

The following code section includes the optional attributes for the curved corners;

This will produce a rectangle (with curved corners) as follows;

The corners are curved with radii in the x and y direction of 10 pixels.
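As an illustration of the rectangle with curved corners, a sketch along the following lines should do the job. The drawRoundedRect wrapper is a hypothetical name; holder is assumed to be the SVG selection created in the framework file at the start of the chapter.

```javascript
// drawRoundedRect is a hypothetical wrapper; holder is assumed to be the
// SVG selection from the framework file.
function drawRoundedRect(holder) {
    return holder.append("rect")   // attach a rectangle
        .attr("x", 100)            // left hand side of the rectangle
        .attr("y", 50)             // top of the rectangle
        .attr("height", 100)       // height in pixels
        .attr("width", 200)        // width in pixels
        .attr("rx", 10)            // corner radius in the x dimension
        .attr("ry", 10);           // corner radius in the y dimension
}
```

Leaving out the rx and ry lines gives the plain rectangle with square corners.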
#### Line

A line is a simple line between two points and is described by four required attributes.

- x1: The x position of the first end of the line as measured from the left of the screen.
- y1: The y position of the first end of the line as measured from the top of the screen.
- x2: The x position of the second end of the line as measured from the left of the screen.
- y2: The y position of the second end of the line as measured from the top of the screen.

The following is an example of the code section required to draw a line in conjunction with the HTML file outlined at the start of this chapter. A notable addition to this code is the style declaration. By default the line has no colour, so one is added with the stroke style, which applies a colour to a line;

This will produce a line as follows;

The line extends from the point 100,50 to 300,150.

#### Polyline

A polyline is a sequence of connected lines described with a single attribute, points.

- points: The points attribute is a list of x,y coordinates that are the locations of the connecting points of the polyline.

The following is an example of the code section required to draw a polyline in conjunction with the HTML file outlined at the start of this chapter. Notable additions to this code are the style declarations. By default the line of the polyline has no colour, so one is added with the stroke style, which here applies the colour black to the line. Likewise the area that is bounded by the polyline will be automatically filled with black unless we explicitly tell the object not to. This is achieved in this example by setting the fill style to none.

This will produce a polyline as follows;

The polyline extends from the point 100,50 to 200,150 to 300,50.

#### Polygon

A polygon is a sequence of connected lines which form a closed shape described with a single attribute, points.
- points: The points attribute is a list of x,y coordinates that are the locations of the connecting points of the polygon. The last point is in turn connected to the first point.

The following is an example of the code section required to draw a polygon in conjunction with the HTML file outlined at the start of this chapter. Notable additions to this code are the style declarations. By default the line of the polygon has no colour, so one is added with the stroke style, which here applies the colour black to the line. Likewise the area that is bounded by the polygon will be automatically filled with black unless we explicitly tell the object not to. This is achieved in this example by setting the fill style to none.

This will produce a polygon as follows;

The polygon extends from the point 100,50 to 200,150 to 300,50 and then back to 100,50.

#### Path

A path is an outline of an SVG shape which is described with a ‘mini-language’ inside a single attribute.

- d: This attribute is a list of instructions that allow a shape to be drawn in a complex way using a ‘mini-language’ of commands. These commands are written in a shorthand of single letters such as M (moveto), Z (closepath), L (lineto) and C (curveto). These commands can be absolute (normally designated by capital letters) or relative (lower case).

The following is an example of the code section required to draw a triangle in conjunction with the HTML file outlined at the start of this chapter. Notable additions to this code are the style declarations. By default the line of the path has no colour, so one is added with the stroke style, which here applies the colour black to the line. Likewise the area that is bounded by the path will be automatically filled with black unless we explicitly tell the object not to. This is achieved in this example by setting the fill style to none.
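A small sketch of the triangle path might look like this. The closedPath and drawTrianglePath functions are hypothetical helpers introduced only for illustration (the d attribute can just as well be typed out by hand as a string), and holder is assumed to be the SVG selection from the framework file.

```javascript
// closedPath is a hypothetical helper that turns a list of [x, y] points
// into the path mini-language: move to the first point, draw lines to the
// rest, then close the path.
function closedPath(points) {
    return points.map(function (p, i) {
        return (i === 0 ? "M " : "L ") + p[0] + "," + p[1];
    }).join(" ") + " Z";
}

// drawTrianglePath is also hypothetical; holder is assumed to be the SVG
// selection from the framework file.
function drawTrianglePath(holder) {
    return holder.append("path")          // attach a path
        .style("stroke", "black")         // colour the line
        .style("fill", "none")            // remove any fill colour
        .attr("d", closedPath([[100, 50], [200, 150], [300, 50]]));
}

// closedPath([[100, 50], [200, 150], [300, 50]])
// returns "M 100,50 L 200,150 L 300,50 Z"
```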
This will produce a path as follows;

The path mini-language first moves (M) to 100,50 then draws a line (L) to 200,150 then draws another line (L) to 300,50 then closes the path (Z).

#### Clipped Path (AKA clipPath)

A clipPath is the path of an SVG shape that can be used in combination with another shape to remove any parts of the combined shape that don’t fall within the clipPath.

That sounds slightly confusing, so we will break it down a bit to hopefully clarify the explanation.

Let’s imagine that we want to display the intersection of two shapes. What we will do is define our clipPath which will act as a ‘cookie cutter’ which can cut out the shape we want (we will choose an ellipse). Then we will draw our base shape (which is analogous to the dough) that we will use our cookie cutter on (our dough will be shaped as a rectangle). The intersection of the cookie cutter and the dough is our clipped path.

Our clipPath (cookie cutter) element is an ellipse;

Our shape that we will be clipping (the dough) is a rectangle;

The intersection of the two is the clipped path (shaded grey);

The graphic examples above are misleading in the sense that the two basic shapes are not actually displayed. All that results from the use of the clipPath is the region that is the intersection of the two.

The following is an example of the code section required to draw the clipped path in conjunction with the HTML file outlined at the start of this chapter. The clipPath element is given the ID ‘ellipse-clip’ and a specified size and location. Then when the rectangle is appended, the clipPath is specified as an attribute (via a URL) using clip-path.

This will produce a path as follows;

An example of this in use can be seen in the difference chart explanation later in the book.

#### Text

A text element is an SVG object which is shaped as text. It is described by two required attributes and three optional ones.
- x: This attribute designates the anchor point location for the text in the x dimension (required).
- y: This attribute designates the anchor point location for the text in the y dimension (required).
- dx: This attribute designates the offset of the text from the anchor point in the x dimension (optional). There are several different sets of units that can be used to designate the offset of the text from an anchor point. These include em which is a scalable unit (used in these examples), px (pixels), pt (points (kind of like pixels)) and % (percent (scalable and kind of like em)).
- dy: This attribute designates the offset of the text from the anchor point in the y dimension (optional).
- text-anchor: This attribute controls the horizontal text alignment (optional). It has three values; start (left aligned), middle (centre aligned) and end (right aligned).

The following is an example of the code section required to draw the text “Hello World” in conjunction with the HTML file outlined at the start of this chapter. A notable addition to this code is the style declaration which applies a black fill to the text. Additionally there is the declaration .text which defines the text that will be displayed.

This will produce text as follows;

It can be seen from the image that the anchor point for the text is at 200,100 and that the text is positioned with this anchor point at the bottom, left of the text.

The following examples will demonstrate the various options for positioning and aligning text so that you can arrange it correctly.
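Before working through those variations, here is a rough sketch of the basic “Hello World” block with the optional attributes exposed as settings. The drawText function and its options argument are hypothetical conveniences, holder is assumed to be the SVG selection from the framework file, and the dy values mentioned in the comments (".35em" and ".71em") are commonly used approximations rather than values taken from this chapter.

```javascript
// drawText is a hypothetical helper; holder is assumed to be the SVG
// selection from the framework file.
// options.dy     : optional vertical offset, e.g. ".35em" (roughly centres
//                  the text on the anchor) or ".71em" (roughly hangs the
//                  text below the anchor); both values are assumptions
// options.anchor : optional text-anchor of "start", "middle" or "end"
function drawText(holder, options) {
    var opts = options || {};
    var text = holder.append("text")   // append text
        .style("fill", "black")        // fill the text with black
        .attr("x", 200)                // x position of the anchor point
        .attr("y", 100);               // y position of the anchor point

    if (opts.dy) {
        text.attr("dy", opts.dy);              // vertical offset
    }
    if (opts.anchor) {
        text.attr("text-anchor", opts.anchor); // horizontal alignment
    }
    return text.text("Hello World");   // define the text to display
}

// drawText(holder);                        anchor at the bottom, left
// drawText(holder, { anchor: "middle" });  anchor at the bottom, middle
// drawText(holder, { dy: ".35em", anchor: "end" });  middle, right
```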
##### Anchor at the bottom, middle of the text:

This will produce text as follows;

##### Anchor at the bottom, right of the text:

This will produce text as follows;

##### Anchor at the middle, left of the text:

This will produce text as follows;

##### Anchor in the middle, centre of the text:

This will produce text as follows;

##### Anchor in the middle, right of the text:

This will produce text as follows;

##### Anchor at the top, left of the text:

This will produce text as follows;

##### Anchor at the top, middle of the text:

This will produce text as follows;

##### Anchor at the top, right of the text:

This will produce text as follows;

### Attributes

At the start of writing this section I was faced with the question “What’s an attribute?”. But a reasonable answer has eluded me, so I will make the assumption that the answer will be something of a compromise :-).

I like to think that an attribute of an element is something that is a characteristic of the object without defining it, and/or it may affect the object’s position or orientation on the page.

There could be a strong argument to say that the following section on styles could be seen to cross-over into attributes and I agree. However, for the purposes of providing a description of the syntax and effects, I’m happy with the following list :-).

Because not all attributes are applicable to all elements, there will be a bit of variation in the type of shapes we deal with in the description below, but there won’t be any that are different to those that we’ve already looked at.

There will be some repetition with recurring information from the elements section. This is intentional to hopefully allow each section to exist in its own right.

#### x, y

The x and y attributes are used to designate a position on the web page that is set from the top, left hand corner of the web page. Using the x and y attributes places the anchor points for these elements at a specified location.
Of the elements that we have examined thus far, the rectangle element and the text element have anchor points to allow them to be positioned.

For example the following is a code section required to draw a rectangle (using only the required attributes) in conjunction with the HTML file outlined at the start of this chapter;

This will produce a rectangle as follows;

The top left corner of the rectangle is specified using x and y at 100 and 50 respectively.

#### x1, x2, y1, y2

The x1, x2, y1 and y2 attributes are used to designate the position of two points on a web page that are set from the top, left hand corner of the web page. These two points are connected with a line as part of the line element.

The attributes are described as follows;

- x1: The x position of the first end of the line as measured from the left of the screen.
- y1: The y position of the first end of the line as measured from the top of the screen.
- x2: The x position of the second end of the line as measured from the left of the screen.
- y2: The y position of the second end of the line as measured from the top of the screen.

The following is an example of the code section required to draw a line in conjunction with the HTML file outlined at the start of this chapter. The attributes connect the point 100,50 (x1, y1) with 300,150 (x2, y2);

This will produce a line as follows;

The line extends from the point 100,50 to 300,150.

#### points

The points attribute is used to set a series of points which are subsequently connected with a line and / or which may form the bounds of a shape. These are specifically associated with the polyline and polygon elements. Like the x, y and x1, x2, y1, y2 attributes, the coordinates are set from the top, left hand corner of the web page.

The data for the points is entered as a sequence of x,y points in the following format;

Where 100,50 is the first x,y point then 200,150 is the second.
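To illustrate that format, the following sketch builds a points string from a list of coordinate pairs and uses it to draw a polyline. Both toPointsAttr and drawPolyline are hypothetical helpers (the string can equally be written by hand), and holder is assumed to be the SVG selection from the framework file.

```javascript
// toPointsAttr is a hypothetical helper: each [x, y] pair becomes "x,y"
// and the pairs are separated by spaces.
function toPointsAttr(points) {
    return points.map(function (p) {
        return p[0] + "," + p[1];
    }).join(" ");
}

// drawPolyline is also hypothetical; holder is assumed to be the SVG
// selection from the framework file.
function drawPolyline(holder, points) {
    return holder.append("polyline")     // attach a polyline
        .style("stroke", "black")        // colour the line
        .style("fill", "none")           // remove any fill colour
        .attr("points", toPointsAttr(points));
}

// toPointsAttr([[100, 50], [200, 150], [300, 50]])
// returns "100,50 200,150 300,50"
```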
The following is an example of the code section required to draw a polyline in conjunction with the HTML file outlined at the start of this chapter. The additional style declarations are included to illustrate the shape better. The points values can be compared with the subsequent image.

This will produce a polyline as follows;

The polyline extends from the point 100,50 to 200,150 to 300,50.

#### cx, cy

The cx, cy attributes are associated with the circle and ellipse elements and designate the centre of each shape. The coordinates are set from the top, left hand corner of the web page.

- cx: The position of the centre of the element in the x axis measured from the left side of the screen.
- cy: The position of the centre of the element in the y axis measured from the top of the screen.

The following is an example of the code section required to draw an ellipse in conjunction with the HTML file outlined at the start of this chapter. In it the centre of the ellipse is set by cx, cy as 200, 100.

This will produce an ellipse as follows;

The centre of the ellipse is at x = 200 and y = 100 and the radius is 50 pixels vertically and 100 pixels horizontally.

#### r

The r attribute determines the radius of a circle element from the cx, cy position (the centre of the circle) to the perimeter of the circle.

The following is an example of the code section required to draw a circle in conjunction with the HTML file outlined at the start of this chapter;

This will produce a circle with a radius of 50 pixels as follows;

The centre of the circle is at x = 200 and y = 100 and the radius is 50 pixels.

#### rx, ry

The rx, ry attributes are associated with the ellipse element and designate the radius in the x direction (rx) and the radius in the y direction (ry).

- rx: The radius of the ellipse in the x direction from the cx, cy position to the perimeter of the ellipse.
- ry: The radius of the ellipse in the y direction from the cx, cy position to the perimeter of the ellipse.
The following is an example of the code section required to draw an ellipse in conjunction with the HTML file outlined at the start of this chapter. In it, the centre of the ellipse is set by cx, cy as 200, 100 and the radius in the x direction (rx) is 100 pixels and the radius in the y direction (ry) is 50 pixels.

This will produce an ellipse as follows;

The centre of the ellipse is at x = 200 and y = 100 and the radius is 50 pixels vertically and 100 pixels horizontally.

#### transform (translate(x,y), scale(k), rotate(a))

The transform attribute is a powerful one which allows us to change the properties of an element in several different ways.

• translate: Where the element is moved by a relative value in the x,y direction.
• scale: Where the element’s attributes are increased or reduced by a specified factor.
• rotate: Where the element is rotated about its reference point by an angular value.

Without a degree of prior understanding, these transforms can appear to behave in unusual ways, but hopefully we’ll explain it sufficiently here so that you can appreciate the logic in the way they work.

##### transform (translate(x,y))

The transform-translate attribute will take an element’s position and adjust it based on a specified value(s) in the x,y directions.

The best way to illustrate this is with an example; This is the code snippet from the HTML file outlined at the start of this chapter which draws a circle at the position 200,100 (cx,cy);

This will produce a circle as follows;

If we add in a transform (translate(*x*,*y*)) attribute for values of x,y of 50,50 this will shift our circle by an additional 50 pixels in the x direction and 50 pixels in the y direction.

Here’s the code snippet that will draw our new circle;

And here’s the resulting change;

The circle was positioned at the point 200,100 and then translated by 50 pixels in both axes to 250,150.
The original code snippet could in fact be written as follows;

Since by default our starting position is 0,0 if we apply a translation of 200,100 we will end up at 200,100.

##### transform (scale(k))

The transform-scale attribute will take an element’s attributes and scale them by a factor k.

Originally I thought that this attribute would affect the size of the element, but it affects more than that!

As with the transform-translate attribute, the best way to illustrate this is with an example; The following code snippet (in conjunction with the HTML file outlined at the start of this chapter) draws a circle at the position 150,50 with a radius of 25 pixels;

This will produce a circle as follows;

If we now introduce a transform-scale attribute with a scale of 2 we will see all three of the other attributes (cx, cy and r) scaled by a factor of two to 300, 100 and 50 respectively.

Here is the code;

Which will produce a circle as follows;

In this example we can see that the position (cx, cy) and the radius (r) have been scaled up by a factor of 2.

##### transform (rotate(a))

The transform-rotate attribute will rotate an element and its attributes by a declared angle in degrees.

The ability to rotate elements is obviously a valuable tool. The transform-rotate attribute does a great job of it, but the key to making sure that you know exactly what will happen to an object is to remember where the anchor point is for the object and to ensure that the associated attributes are set appropriately.
As with the transform translate & scale attributes, the best way to illustrate this is with an example; The following is the code snippet (in conjunction with the HTML file outlined at the start of this chapter) which draws the text “Hello World” at the position 200,100 with the anchor point being the middle of the text;

This will produce text as follows;

If we then apply a transform-rotate of 10 degrees as follows;

We will see the following on the screen;

Obviously the text has been rotated, but hopefully you’ll have noticed that it’s also been displaced. This is because the transform-rotate attribute has been applied to both the text element (which has been rotated by 10 degrees) and the x,y attributes. If you imagine the origin point for the element being at 0,0, the centre, middle of the text element has been rotated about the point 0,0 by 10 degrees (hopefully slightly better explained in the following picture).

This could be seen as an impediment to getting things to move / change as you want to, but instead it’s an indication of a different way of doing things. The solution to this particular feature is to combine the transform-rotate with the transform-translate that we used earlier so that the code looks like this;

And the image on the page looks like this;

Which leads us to the final example which is a combination of all three aspects of the transform attribute.

Here we have a text element translated to its position on the page, rotated by 10 degrees about the centre of the text and scaled by a factor of two.

#### width, height

width and height are required attributes of the rectangle element. width designates the width of the rectangle and height designates the height (If you’re wondering, I often struggle defining the obvious).
The following is an example of the code section required to draw a rectangle (using only the required attributes) in conjunction with the HTML file outlined at the start of this chapter;

This will produce a rectangle as follows;

The width of the rectangle is 200 pixels and the height is 100 pixels.

#### text-anchor

The text-anchor attribute determines the justification of a text element.

Text can have one of three text-anchor types;

• start where the text is left justified.
• middle where the text is centre justified.
• end where the text is right justified.

The following is an example of code that will draw three separate lines of text with the three different text-anchor types in conjunction with the HTML file outlined at the start of this chapter;

This will produce an output as follows;

#### dx, dy

dx and dy are optional attributes that designate an offset of text elements from the anchor point in the x and y dimension. There are several different sets of units that can be used to designate the offset of the text from an anchor point. These include em which is a scalable unit, px (pixels), pt (points (kind of like pixels)) and % (percent (scalable and kind of like em)).

We can demonstrate the offset effect by noting the difference in two examples.

The first is a simple projection of SVG text that aligns the text “Hello World” above and to the right of the anchor point at 200,100 (it does this in conjunction with the HTML file outlined at the start of this chapter).

Which produces the following on the page;

The second example introduces the dx attribute setting the offset to 50 pixels. This adds another 50 pixels to the x dimension. We also introduce the dy attribute with an offset of .35em. This scalable unit allows the text to be set as a factor of the size of the text. In this case .35em will add half the height of the text to the y dimension placing the text so that it is exactly in the middle (vertically) of the 100 pixel line on the y dimension.
Which produces the following on the page;

The text has been moved 50 pixels to the right and half the height of the text down the page.

#### textLength

The textLength attribute adjusts the length of the text to fit a specified value.

The following is a code snippet that prints the text “Hello World” above and to the right of the anchor point at 200,100 (it does this in conjunction with the HTML file outlined at the start of this chapter). The addition of the textLength attribute declaration in the code stretches the “Hello World” out so that it fills 150 pixels.

Which produces the following on the page;

It is worth noting that while the text has been spread out, the individual letters remain un-stretched. Only the letter and word spacing has been adjusted. However, using the lengthAdjust attribute can change this.

#### lengthAdjust

The lengthAdjust attribute controls how the textLength attribute fits a text element to its specified length. It can be set to either spacing or spacingAndGlyphs;

• spacing: In this option the letters remain the same size, but the spacing between the letters and words is adjusted.
• spacingAndGlyphs: In this option the text is stretched or squeezed to fit.

The attribute can be best illustrated via an example. The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) shows three versions of the text element. The top line is the standard text. The middle line is the textLength set to 150 and the lengthAdjust set to spacing (which is the default). The bottom line is the textLength set to 150 and the lengthAdjust set to spacingAndGlyphs.

The image on the screen will look like the following;

The image shows that the top line looks normal, the middle line has had the spaces increased to increase the length of the text and the bottom line has been stretched.

### Styles

What’s a style? Believe it or not, that’s as difficult a question to answer as “What’s an attribute?”.
I like to think that an element can be selected and arranged on a web page with select and attr, but once it’s there, changes to how it looks are a matter for style. We will cover a range of qualities that neatly fit into this definition in the following section (such as fill, opacity and stroke-width) but there are also a range of unusual style declarations that many may not have come across (I certainly hadn’t before writing this).

The other important thing to mention about setting styles for elements is that there are different ways to accomplish the task. We’ll go through the process of describing different styles as they can be applied to individual elements in isolation, but there is a more powerful way to manage styles across a range of elements via Cascading Style Sheets (CSS) in the <style> section of a web page or even via an external style sheet. We will examine these possibilities at the end of the section.

Full disclosure: I have not figured out how to work some of the styles for d3.js. I’m afraid that clip-path and mask have exceeded my skill-set and I will have to leave them for another day :-(. I found that there are several good examples that make use of these styles, but I have struggled (unsuccessfully) to present them in a simple example.

#### fill

The fill style will fill the element being presented with a specified colour.

By default, most elements will be filled with black (the majority of the examples used in this chapter make no fill declaration).

The following example (which works in conjunction with the HTML file outlined at the start of this chapter) shows the syntax for filling a simple circle with the colour red;

Which results in the following image;

As we saw with the polyline and polygon examples earlier in the chapter some shapes may need to have their fill colour turned off in some circumstances and this can be accomplished by declaring the colour to be none (.style("fill", "none");).
There are several different ways to define exactly what colour we want as a fill. The example above uses a ‘named colour code’ to declare the colour as “red” but we could also have defined it as rgb (.style("fill", "rgb(255,0,0)");) or in hexadecimal (.style("fill", "#f00");).

#### stroke

The stroke style applies a colour to lines.

By default many elements do not have a stroke colour set, so it’s a matter of declaring the colour with either a named colour code (“red”), an rgb value (“rgb(255,0,0)”) or the appropriate hex (“#f00”).

The following example (which works in conjunction with the HTML file outlined at the start of this chapter) shows the syntax for applying the colour red to a simple circle. The fill has been set to none to help the colour stand out.

Which results in the following image;

#### opacity

The opacity style has the effect of varying an element’s transparency.

The valid range for opacity is from 0 (completely transparent) to 1 (solid colour). We should make the distinction at this point that opacity affects the entire element, whereas the following fill-opacity and stroke-opacity affect only the fill and stroke respectively.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates a green circle with a red border. The opacity value of .2 creates a degree of transparency which will show the grid lines underneath the element.

Which results in the following image;

#### fill-opacity

The fill-opacity style changes the transparency of the fill of an element.

The valid range for fill-opacity is from 0 (completely transparent) to 1 (solid colour). We should make the distinction at this point that fill-opacity affects only the fill of an element, whereas opacity will affect the entire element.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates a green circle with a red border.
The fill-opacity value of .2 creates a degree of transparency for the fill which will show the grid lines underneath.

Which results in the following image;

The distinction between this image and the one for the opacity style clearly shows the line around the outside of the object as still a solid (opaque) colour.

#### stroke-opacity

The stroke-opacity style changes the transparency of the stroke (line) of an element.

The valid range for stroke-opacity is from 0 (completely transparent) to 1 (solid colour). We should make the distinction at this point that stroke-opacity affects only the line or border of an element, whereas opacity will affect the entire element.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates an empty circle with a red border. The stroke-opacity value of .2 creates a degree of transparency for the stroke which will show the grid lines underneath (or at least make it appear more ‘muted’).

Which results in the following image;

Although it is not necessarily easy to see in this example because the line is quite thin, the lines of the grid behind the circle will be showing through the line of the circle.

#### stroke-width

The stroke-width style adjusts the width of the line of an element.

The value specified when setting stroke-width is in pixels.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates an empty circle with a red border. The stroke-width is set to 5 which equates to 5 pixels (it can also be specified as “5px”).

Which results in the following image;

The width of the line that forms the border of the circle is now 5 pixels wide :-).

#### stroke-dasharray

The stroke-dasharray style allows us to form element lines with dashes instead of solid lines.
We have covered dashed lines in a practical way in a previous section of the book (‘Make a Dashed Line’) but for the sake of completeness I will include dashed lines here as well.

We create a dashed line by specifying the length of a dash and then the length of a space. We can include a long list of dashes and spaces and once complete our line will simply repeat the pattern we have specified.

For example the following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates a line with a dash of 10 pixels followed by a space of 3 pixels;

Which results in the following image;

More complex combinations of dashes and spaces are possible as are complex animation sequences that leverage the ability to move objects along a path (these are certainly more advanced examples).

#### stroke-linecap

The stroke-linecap style allows control of the shape of the ends of lines in d3.js.

There are three shape options;

• butt where the line simply butts up to the starting or ending position and is cut off squarely.
• round where the line is rounded in proportion to its width.
• square where the line is squared off but extended in proportion to its width.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) generates three lines showing each stroke-linecap style option. The top line uses butt. The middle line uses round and the bottom line uses square.

Which results in the following image;

The shapes are quite distinct for each type and it is useful to note the degree to which the lines extend beyond their start and end points.

#### stroke-linejoin

The stroke-linejoin style specifies the shape of the join of two lines. This would be used on path, polyline and polygon elements (and possibly more).

There are three line join options;

• miter where the join is squared off as would be expected at the join of two lines.
• round where the outside portion of the join is rounded in proportion to its width.
• bevel where the join has a straight edged outer portion clipped off to provide a slightly more contoured effect while still being angular.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) generates a polyline where the join has the connection shaped using the stroke-linejoin round style.

Which results in the following image;

Note the curve on the outer edge of the join.

Changing the shape of the line join to bevel produces the following;

Here we can see the clipping of the outer portion of the join.

And using miter produces a standard connection;

This is the default setting for line joins and does not need to be added unless the line join type has already been set to a different default.

#### writing-mode

The writing-mode style changes the orientation of the text so that it prints out top to bottom.

It has a single option “tb” that accomplishes this. It is relatively limited in scope compared to the equivalent for CSS, but for the purposes of generating some text it has a definite use.

The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates a line of text that is now printed from top to bottom instead of left to right.

Which results in the following image;

It is significant to note that while it looks like the text has been rotated about its anchor point, this actually isn’t the case since the anchor point should be at 200,100.

Also, the glyph-orientation-vertical style (which follows) will allow the text to be orientated vertically which will be useful.

#### glyph-orientation-vertical

The glyph-orientation-vertical style changes the rotation of the individual glyphs (characters) in text and if used in conjunction with the writing-mode style (and set to 0) will allow the text to be displayed vertically with the letters orientated vertically as well.
The following code snippet (which works in conjunction with the HTML file outlined at the start of this chapter) creates a line of text that is now printed from top to bottom with letters orientated vertically.

Which results in the following image;

It is worth noting that the text spacing increases dramatically as the spacing for each letter relies on the normal distance between the bottom and top of a line of text.

#### Using styles in Cascading Style Sheets

Declaring styles on an element by element basis is an OK way to apply styles, but when our visualizations become more complex, this can be an inefficient use of code. A smarter way to provide a common set of styles to elements is to declare them in the <style> section of our HTML document using Cascading Style Sheets (CSS). These will then be automatically applied to our elements.

We start with an example script that draws our three lines that have different styles of linecaps. Our previous example looked like the following (in conjunction with the HTML file outlined at the start of this chapter);

Which resulted in the following image;

The block of code for each of the three lines contains three separate style declarations, two of which are identical for all three blocks of code;

To make these styles available from a common point, we declare them in the <style> section of our HTML file as follows;

The <style> tags simply tell our browser which part of the HTML file we are using to define our styles. The line.linecap portion identifies the following styles as belonging to the line elements that are also identified as belonging to the ‘class’ linecap (We have used the linecap name as a convenience only and it could just as easily have been foobar.). The two styles are enclosed within curly braces and are declared in the form <style-name>: <style-value>;. So for our example here, the stroke is black and its width is 20 pixels.
Then our example script can have the two styles removed from each of the blocks that draws the lines and in their place we add a new attribute class that assigns a class to the element (in this case the class linecap). Our new code will look like this;

While this has only replaced two lines with one in our code, the potential for use in far more complex examples should be obvious.

There is significantly more detail that can be gone into with regard to CSS, but that would be beyond my meagre abilities.

## Manipulating data

### How to use data imported from a csv file with spaces in the header.

When importing data from a csv file (dataSpace.csv) that has headers with spaces in the middle of some of the fields there is a need to address the data slightly differently in order for it to be used easily in your JavaScript.

For example the following csv data has a column named ‘Date Purchased’;

This is not an uncommon occurrence since RFC 4180 which specifies csv content allows for it and d3.js supports the RFC;

“Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored.”

When we go to import the data using the d3.csv function, we need to reference the ‘Date Purchased’ column in a way that makes allowances for the space. The following piece of script (with grateful thanks to Stephen Thomas for answering my Stack Overflow question) appears to be the most basic solution.

In the example above the ‘Date Purchased’ column is re-declared as ‘date’ making working in the following script much easier.

### Extracting data from a portion of a string.

Suppose we have a set of values we want to extract from a string because they cannot be used in their original form. For example, the following csv file contains the column ‘value’ and the values of the data in that column are prefixed with a dollar sign ($).

We can use the JavaScript substring() method to easily remove the leading character from the data.

The following example processes our csv file after loading it and for each ‘value’ entry on each row takes a substring of the entry that removes the first character and retains the rest.

The substring() function includes a ‘start’ index (as used above) and optionally a ‘stop’ index. More on how these can be configured can be found on the w3schools site.
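As a minimal sketch of the idea (the rows below are made-up sample data standing in for the loaded csv file, not values from the original file), the leading dollar sign can be stripped and the remainder converted to a number along these lines;

```javascript
// Hypothetical rows, shaped the way a csv loader would deliver them
// (every field arrives as a string).
var rows = [
    { value: "$53.98" },
    { value: "$67.00" },
    { value: "$89.70" }
];

// substring(1) keeps everything from index 1 onwards, dropping the "$".
// The unary plus then converts the remaining text into a number.
rows.forEach(function(d) {
    d.value = +d.value.substring(1);
});

console.log(rows.map(function(d) { return d.value; }));
```

Converting to a number at the same time is a design choice rather than a requirement; it simply saves a second pass over the data before the values are used for scaling.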

### Grouping and summing data (d3.nest)

Often we will wish to group elements in an array into a hierarchical structure similar to the GROUP BY operator in SQL (but with the scope for multiple levels). This can be achieved using the d3.nest operator. Additionally we will sometimes wish to collapse the elements that we are grouping in a specific way (for instance to sum values). This can be achieved using the rollup function.

The example we will use is having the following csv file consisting of a column of dates and corresponding values;

We will nest the data according to the date and sum the data for each date so that our data is in the equivalent form of;

We will do this with the following script;

We are assuming the data is in a csv file and is named source-data.csv.

The first thing we do is load that file and assign the loaded array the variable name csv_data.

Then we declare our new array’s name will be data and we initiate the nest function;

We assign the key for our new array as date. A ‘key’ is like a way of saying “This is the thing we will be grouping on”. In other words our resultant array will have a single entry for each unique date value.

Then we include the rollup function that takes all the individual value variables that are in each unique date field and sums them;

Lastly we tell the entire nest function which data array we will be using for our source of data.
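To make the grouping and summing concrete, here is a plain JavaScript sketch of the same idea that does not use D3 at all. The sample rows and the helper name nestAndSum are assumptions made for illustration; in D3 v4 the equivalent work is done by the d3.nest chain described above (key, then rollup, then the source array).

```javascript
// Hypothetical rows standing in for the contents of source-data.csv.
var csv_data = [
    { date: "2011-11-14", value: "10" },
    { date: "2011-11-14", value: "5" },
    { date: "2011-11-15", value: "7" }
];

// Group rows by their date and sum the values within each group.
// Returns one { key, value } entry per unique date, in order of first appearance.
function nestAndSum(rows) {
    var totals = new Map();   // date -> running total
    rows.forEach(function(d) {
        var soFar = totals.get(d.date) || 0;
        totals.set(d.date, soFar + (+d.value));   // +d.value converts the string to a number
    });
    return Array.from(totals, function(entry) {
        return { key: entry[0], value: entry[1] };
    });
}

var data = nestAndSum(csv_data);
console.log(data);
```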

What if your data turns out to be unsorted? Never fear, we can easily sort on the key value by tacking on the sortKeys function like so;

You should note that the fields in our data will have changed name from date and value. This is a result of the nest and rollup process. But never fear, it’s a simple task to re-name them if necessary using the following function (which could include a call to parse the date, but I have omitted it for clarity);
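A renaming pass of this kind might look like the following sketch. It assumes the nested entries carry the properties key and value (the sample array is made up for illustration) and simply copies the key back to a date property;

```javascript
// Hypothetical result of the nest and rollup step.
var data = [
    { key: "2011-11-14", value: 15 },
    { key: "2011-11-15", value: 7 }
];

// Copy the grouping key back to a 'date' property and drop the old name.
data.forEach(function(d) {
    d.date = d.key;
    delete d.key;
});

console.log(data);
```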

### Selecting a random string from an array.

What if we had a situation where we wanted to be able to select a random colour for the fill of a set of objects from a restricted set of colour options?

The colours we want are green, orange, red and blue and the solution uses an adaptation of the one presented by Jacob Relkin on Stack Overflow.

First we start by declaring the colours in an array;

From there we set up the function that will return one of the elements of the array at random by calculating an index number from the array of possible options based on the length of the array;

colorRange.length returns the number of elements in the array (in this case 4). This is multiplied by a random number between 0 and 1 (Math.random()). Then we get the largest integer that is less than or equal to our generated number using Math.floor. This ‘flattens out’ the result to be one of 0,1,2 or 3.

Then when we want to find one of our random colours we simply call our randomColour function a little like the following for a fill.
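Putting the pieces described above together gives something like the following sketch. The names colorRange and randomColour come from the text; the exact form of the function declaration is an assumption.

```javascript
// The restricted set of colours we are willing to use.
var colorRange = ["green", "orange", "red", "blue"];

// Pick one element of colorRange at random.
// Math.random() is in [0, 1), so the floored product is always a valid index (0 to 3).
function randomColour() {
    return colorRange[Math.floor(Math.random() * colorRange.length)];
}

// With D3 this could be used as, for example: .style("fill", randomColour())
console.log(randomColour());
```

Because the index is derived from colorRange.length, adding or removing colours from the array needs no change to the function.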

## Bar Charts and Histograms

Yes! There is a difference! I know they look similar, but for a bar chart each column represents a group defined by a category, while for a histogram each column represents a group defined by a range.

### Bar Chart

• Each column is positioned over a label that represents a categorical variable.
• The height of the column indicates the size of the group defined by the category.

### Histogram

• Each column is positioned over a label that represents a quantitative variable.
• The column label can be a single value or a range of values.

### Bar Charts

A bar chart is a visual representation using either horizontal or vertical bars to show comparisons between discrete categories. There are a number of variations of bar charts including stacked, grouped, horizontal and vertical.

We will work through a simple vertical bar chart that uses a value on the y axis and category in the form of a name on the x axis.

The end result will look like this;

#### The data

The data for this example will be sourced from an external (purely fictional) csv file named sales.csv. It consists of a column of names and ‘sales’ and its contents are as follows;

#### The code

The full code listing for the example we are going to work through is as follows;

#### The bar chart explained

In the course of describing the operation of the file I will gloss over the aspects of the structure of an HTML file which have already been described at the start of the book. Likewise, aspects of the JavaScript functions that have already been covered will only be briefly explained.

The start of the file deals with setting up the document’s head and body, loading the d3.js script and setting up the CSS in the <style> section.

The CSS section sets styling for the colour of the bars. In all reality we could have placed it as a style later in the code, but it’s nice to have something in the CSS area because you never know, we might want it later.

Then our JavaScript section starts and the first thing that happens is that we set the size of the area that we’re going to use for the chart and the margins;
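As a rough sketch of that set-up (the numbers are illustrative assumptions rather than values taken from the listing), the margins are declared first and the usable width and height are whatever is left over once they are subtracted from the overall size;

```javascript
// Illustrative overall size of 960 x 500 pixels with room left for the axes.
var margin = { top: 20, right: 20, bottom: 30, left: 40 },
    width = 960 - margin.left - margin.right,    // drawing area width
    height = 500 - margin.top - margin.bottom;   // drawing area height

console.log(width, height);
```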

The next section of our code includes some of the functions that will be called from the main body of the code. This includes the functions to determine positioning in the x and y domains.

The band scale set up for the x domain is a neat function that allows the creation of a series of uniform bands that can be computed from the assigned range. For the purposes of our bar chart, these will be the equivalent of the bars. These bands and their properties (the spacing between them and other details) can be assigned for display purposes.

For example, in this case the padding is the space made available between bars. This is set to 0.1, or 1/10th of the width of the space available for each band. If we were to alter the padding to 0.5 (or half the width of the band) we would have the following;
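The arithmetic behind the padding can be sketched in plain JavaScript. This is a simplified model, assuming the same padding fraction is applied between the bands and at both ends of the range and that no rounding takes place; the function name bandLayout and the sample sizes are made up for illustration and are not part of D3.

```javascript
// Lay out 'count' bands across [rangeStart, rangeStop] with a padding fraction.
// Each band owns one 'step' of the range; the bar fills (1 - padding) of that step.
function bandLayout(count, rangeStart, rangeStop, padding) {
    // count bands, minus one inner gap's worth, plus an outer gap at each end.
    var step = (rangeStop - rangeStart) / Math.max(1, count - padding + padding * 2);
    var bandwidth = step * (1 - padding);
    // The outer padding leaves a gap of (step * padding) before the first band.
    var first = rangeStart + step * padding;
    var positions = [];
    for (var i = 0; i < count; i++) {
        positions.push(first + i * step);
    }
    return { step: step, bandwidth: bandwidth, positions: positions };
}

var narrowGaps = bandLayout(5, 0, 510, 0.1);   // bars take nine tenths of each step
var wideGaps = bandLayout(5, 0, 550, 0.5);     // bars take half of each step

console.log(narrowGaps.bandwidth, wideGaps.bandwidth);
```

The point to take away is that the padding is a proportion of the space allotted to each band, not a pixel value, so the gaps grow and shrink with the chart.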

For the full description of band scales, check out the D3 wiki.

The function to set the scaling in the y domain is the same as most of our other graph examples;

The next block of code selects the body on the web page and appends an svg object to it of the size that we have set up with our width, height and margins.

It also adds a g element that provides a reference point for adding our axes.

Then we begin the main body of our JavaScript. We load our csv file and then loop through it making sure that the numerical values are recognised correctly;

We then work through our x and y data and ensure that it is scaled to the domains we are working in;
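The two steps above can be sketched without D3 as follows. The rows are made-up sample data and the column name salesperson is an assumption for illustration (the text only says the file has a column of names and a column of ‘sales’).

```javascript
// Hypothetical rows as a csv loader would deliver them (all strings).
var data = [
    { salesperson: "Bob", sales: "33" },
    { salesperson: "Robin", sales: "12" },
    { salesperson: "Anne", sales: "41" }
];

// Make sure the numerical values are recognised as numbers.
data.forEach(function(d) {
    d.sales = +d.sales;
});

// The x domain is the list of categories, one per bar.
var xDomain = data.map(function(d) { return d.salesperson; });

// The y domain runs from zero up to the largest sales figure.
var yDomain = [0, Math.max.apply(null, data.map(function(d) { return d.sales; }))];

console.log(xDomain, yDomain);
```

Starting the y domain at zero matters for a bar chart, since the height of each bar is read as its value.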

Then we add the bars to our chart;

This block of code creates the bars (selectAll("bar")) and associates each of them with a data set (.data(data)).

We then append a rectangle (.append("rect")) with the colour assigned by our class (set in the <style> section) along with values for x/y position. The width of the bars is determined from our band scale function we assigned earlier and is found by retrieving the value via the .bandwidth() call. The height is as configured in our earlier code.
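Because SVG measures y from the top of the drawing area, the y position and height of each bar are two halves of the same calculation. The following is a sketch under stated assumptions: the drawing height, the maximum value and the simple linear y function are all illustrative stand-ins for the scale set up earlier.

```javascript
var height = 450;     // illustrative drawing area height in pixels
var maxSales = 50;    // illustrative top of the y domain

// A stand-in for a linear scale mapping [0, maxSales] onto [height, 0].
function y(value) {
    return height - (value / maxSales) * height;
}

// The rectangle starts at the scaled value and extends down to the x axis.
function barGeometry(sales) {
    var top = y(sales);
    return { y: top, height: height - top };
}

console.log(barGeometry(25));
```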

Finally we append our axes;

The end result is our pretty looking bar chart;

### Histograms

A histogram is a graphical representation of the distribution of numerical data. It is typically formed by creating ‘bins’ of a larger dataset that group the data into a range of values and count the number of pieces of data that fall into each bin. Each bin is then represented as a bar showing the relationship between each range.

The example we will work through shows the frequency of earthquakes above magnitude 3 between July 2010 and January 2012 in Christchurch, New Zealand (A time of some significant seismic activity). Data was sourced from New Zealand’s Geonet site.

We can see that the data has been ‘binned’ by month and that between the 1st of September and the 1st of October there were over 1800 earthquakes registering over magnitude 3.

#### The data

The data for this example will be sourced from an external csv file containing the earthquake records described above. It consists of a column of dates in day-month-year format (and magnitudes, which won’t be used in this graph) and its contents look similar to the following;

#### The code

The full code listing for the example we are going to work through is as follows;

#### The histogram explained

In the course of describing the operation of the file I will gloss over the aspects of the structure of an HTML file which have already been described at the start of the book. Likewise, aspects of the JavaScript functions that have already been covered will only be briefly explained.

The start of the file deals with setting up the document’s head and body, loading the d3.js script and setting up the CSS in the <style> section.

The CSS section sets styling for the colour of the rectangles that make up the bars. Similar to the Bar graph, we could have placed it as a style later in the code, but it’s nice to have something in the CSS area.

Then our JavaScript section starts and the first thing that happens is that we set the size of the area that we’re going to use for the chart and the margins;

Then we declare the code that parses the time;

Here we have it set to look for time that is formatted as day-month-year.
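In D3 v4 a parser like this is normally created with d3.timeParse and a format string along the lines of "%d-%m-%Y". To get a feel for what such a parser does, here is a rough, library-free sketch. The function name parseDayMonthYear and the sample date are made up for illustration and are not part of D3;

```javascript
// Hypothetical stand-in for a day-month-year parser (illustration only).
// Returns a Date for text like "04-09-2010", or null if the text doesn't fit.
function parseDayMonthYear(text) {
  var match = /^(\d{1,2})-(\d{1,2})-(\d{4})$/.exec(text);
  if (!match) { return null; }
  var day = Number(match[1]);
  var month = Number(match[2]);
  var year = Number(match[3]);
  var date = new Date(year, month - 1, day); // months are zero-based in JavaScript
  // Reject impossible dates such as 31-02-2011, which Date would quietly roll over
  if (date.getMonth() !== month - 1 || date.getDate() !== day) { return null; }
  return date;
}

var parsed = parseDayMonthYear("04-09-2010"); // a made-up example date
```

Returning null for text that doesn't match is handy because it makes badly formatted rows easy to spot when looping through the data.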

The next section of our code scales the ranges for x and y.

y is pretty standard, but for x we specify a time scale that has a domain that goes from one date to another. The date specified here is mildly artificial in the sense that I selected it to look good with the graph when I adjust the bins (you’ll see later), but a bit of experimentation will see you right. Lastly for the x range we get it to round itself to logical values using rangeRound.

Now we start to set up the function that will apply the D3 magic required to form our histogram with the code;

The d3.histogram function allows us to form our data into ‘bins’ that form “discrete samples into continuous, non-overlapping intervals”. In other words in this case we are going to take a data set of close to 10,000 points and we are going to form them into bins corresponding to the months that they occurred in. The value that we’re going to bin will be the variable date and they will fit into the domain that we have already specified in the range section (via .domain(x.domain())). Lastly, we apply the thresholds that we are going to use for the bins which in this case is monthly via .thresholds(x.ticks(d3.timeMonth));.

We can very crudely change our histogram by simply changing that to .thresholds(x.ticks(d3.timeWeek)); to produce the following;

And we can go slightly more extreme by specifying a daily bin and produce the following;

(Although technically I cheated slightly with this version and I removed the padding between the bars to allow the data to be presented a bit more faithfully.)

The next block of code selects the body on the web page and appends an svg object to it of the size that we have set up with our width, height and margins.

It also adds a g element that provides a reference point for adding our axes.

Then we begin the main body of our JavaScript. We load our csv file and then loop through it making sure that the dates are converted into a time format correctly;

Now that we have our data we can put it into the appropriate bins using the histogram function that we declared earlier;

At this point we have two data sets. The first is our array of information from our earthquakes.csv file which is called ‘data’. The second is an array of grouped data called ‘bins’. We use the ‘bins’ data to draw our histogram.

We then use our ‘bins’ data to ensure that the y domain is scaled to the tallest bar in the ‘bins’ data set;

Then we add the bars to our chart;

This block of code selects all the rectangles (selectAll("rect")) and associates each of them with our binned data set (.data(bins)).

We then append the rectangles (.append("rect")) with the colour assigned by our class (set in the <style> section) and we offset all the bars by 1 to make sure that we have a nice symmetrical set of bars with a thin separation (.attr("x", 1)).

The transform function sets the starting point for where we begin drawing the rectangles and the height and width attributes set the height and width of the rectangles. If you really want to stop and look at the transform and height attributes you will notice that they seem a bit ‘odd’. That is because the graph is drawn from the top of the screen down. The origin is at the top left of the screen, remember, and we are trying to represent bars that appear to extend upwards from a ‘0’ point (on the y axis) that exists at a distance ‘height’ from the top of the screen. Sound weird? It is a little, but at the very least it’s logical. You can do it in several different ways, some more confusing than this, and I think that this represents a good balance between code complexity and understandability.
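The top-down arithmetic is easier to see with some numbers. The following is a simplified sketch rather than the chart’s actual code. makeYScale imitates a linear y scale, and the chart height of 200 pixels and maximum count of 100 are assumptions chosen to keep the sums easy;

```javascript
// A simplified linear y scale: maps a count in [0, maxCount] to a pixel
// position in [height, 0], so bigger counts sit closer to the top of the screen.
function makeYScale(maxCount, height) {
  return function(count) {
    return height - (count / maxCount) * height;
  };
}

// Work out where a bar starts (its top edge) and how tall it is.
function barGeometry(count, y, height) {
  var top = y(count);                          // distance down from the top of the chart
  return { top: top, height: height - top };   // the bar reaches down to the '0' line
}

var chartHeight = 200;                  // assumed chart height in pixels
var y = makeYScale(100, chartHeight);   // assumes the tallest bin holds 100 points
var bar = barGeometry(25, y, chartHeight);
```

A bin holding a quarter of the maximum starts 150 pixels down from the top and is 50 pixels tall, so its bottom edge lands exactly on the ‘0’ line.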

It’s also useful to note that when our histogram function was creating our bins, it also associated some variables with each bin. x0 and x1 to denote the start and stop point for each bin in the x domain and length for the number of data points in each bin. You can see more details in the D3 wiki here.
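To make the shape of the binned data a little more concrete, here is a library-free sketch that groups dates by month and attaches x0 and x1 to each bin in the same spirit. It is an imitation for illustration only. The function name binByMonth and the three sample dates are made up, and D3’s own histogram function handles details (such as the edges of the domain) that this sketch ignores;

```javascript
// Group an array of Date objects into month-wide bins (illustration only).
// Each bin is an array of the dates it holds, with x0 and x1 marking the
// start (inclusive) and end (exclusive) of the bin.
function binByMonth(dates, start, end) {
  var bins = [];
  var cursor = new Date(start.getFullYear(), start.getMonth(), 1);
  while (cursor < end) {
    var next = new Date(cursor.getFullYear(), cursor.getMonth() + 1, 1);
    var bin = [];
    bin.x0 = cursor;
    bin.x1 = next;
    bins.push(bin);
    cursor = next;
  }
  dates.forEach(function(date) {
    bins.forEach(function(bin) {
      if (date >= bin.x0 && date < bin.x1) { bin.push(date); }
    });
  });
  return bins;
}

// Made-up sample dates, not the real earthquake data
var sampleDates = [
  new Date(2010, 8, 4), new Date(2010, 8, 20), new Date(2010, 9, 2)
];
var bins = binByMonth(sampleDates, new Date(2010, 8, 1), new Date(2010, 11, 1));
var counts = bins.map(function(bin) { return bin.length; });
```

Because each bin is itself an array, its length is simply the number of data points that fell into it, which is the value we use for the height of each bar.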

Finally we append our axes;

The end result is our sharp looking histogram;

## Tree Diagrams

### What is a Tree Diagram?

The ‘Tree layout’ is not a distinct type of diagram per se. Instead, it’s representative of D3’s family of hierarchical layouts.

It’s designed to produce a ‘node-link’ diagram that lays out the connection between nodes in a method that displays the relationship of one node to another in a parent-child fashion.

For example, the following diagram shows a root node (the starting position) labelled ‘Top Node’ which has two children (Bob: Child of Top Node and Sally: Child of Top Node). Subsequently, Bob: Child of Top Node has two dependent nodes (children) ‘Son of Bob’ and ‘Daughter of Bob’.

The clear advantage to this style of diagram is that describing it in text is difficult, but representing it graphically makes the relationships easy to determine.

The data required to produce this type of layout needs to describe the relationships, but this is not necessarily an onerous task. For example, the following is the data (in JSON form) for the diagram above and it shows the minimum information required to form the correct layout hierarchy.

It shows each node as having a name that identifies it on the tree and, where appropriate, the children it has (as an array).
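As a sketch of how that kind of data might be laid out, the following uses the node names from the diagram description. The exact formatting is an assumption, and countNodes is a small hypothetical helper (not part of D3) that is only there to show how the nesting can be walked;

```javascript
// The hierarchy described above: one root, two children and two
// grandchildren under Bob.
var treeData = {
  "name": "Top Node",
  "children": [
    {
      "name": "Bob: Child of Top Node",
      "children": [
        { "name": "Son of Bob" },
        { "name": "Daughter of Bob" }
      ]
    },
    { "name": "Sally: Child of Top Node" }
  ]
};

// Hypothetical helper that counts every node in the tree
function countNodes(node) {
  var children = node.children || [];
  return 1 + children.reduce(function(total, child) {
    return total + countNodes(child);
  }, 0);
}

var totalNodes = countNodes(treeData);
```

Notice that nodes without children (Sally and Bob’s two children) simply leave the ‘children’ array out.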

There is a wealth of examples of tree diagrams on the web, but I would recommend a visit to blockbuilder.org as a starting point to get some ideas.

In this chapter we’re going to look at a very simple piece of code to generate a tree diagram before looking at different ways to adapt it, including rotating it to be horizontal, adding some dynamic styling to the nodes, and importing data from a flat file and from an external source. Finally we’ll look at a more complex example that is more commonly used on the web that allows a user to expand and collapse nodes interactively.

### A simple Tree Diagram explained

We are going to work through a simple example of the code that draws a tree diagram. This is more for the understanding of the process rather than because it is a good example of code for drawing a tree diagram. It is a very limited example that lacks any real interactivity, which is one of the strengths of d3.js graphics. However, we will outline the operation of an interactive version towards the end of the chapter once we have explored some possible configuration options that we might want to make.

The graphic that we are going to generate will look like this…

The full code for it looks like this;

In the course of describing the operation of the file I will gloss over the aspects of the structure of an HTML file which have already been described at the start of the book. Likewise, aspects of the JavaScript functions that have already been covered will only be briefly explained.

The start of the file deals with setting up the document’s head and body, loading the d3.js script and setting up the CSS in the <style> section.

The CSS section sets styling for the circle that represents the nodes, the text alongside them and the links between them.

Then our JavaScript section starts and the first thing that happens is that we declare our array of data in the following code;

As outlined at the start of the chapter, this data is encoded hierarchically in JavaScript Object Notation (JSON). Each node must have a name and if it is going to have subordinate nodes it must include a ‘children’ element. There are many examples of hierarchical data that can be encoded in this way, from the traditional parent - offspring example to directories on a hard drive or a breakdown of materials for a complex object. The same goes for any system of encoding where there is a single outcome from multiple sources, like an election or an alert encoding system dependent on multiple trigger points.

The next section of our code declares some of the standard features for our diagram such as the size and shape of the svg container with margins included.

Now we start to get into the specifics for the diagram. The next block of code invokes the D3 .tree component, configures the data and assigns it to the tree structure (no actual drawing mind you, just getting the data ready).

The first part of this is the declaration of treemap as using the d3.tree function and assigning the size of the diagram from our earlier variables;

Then we assign our data (with the variable treeData) to nodes using the d3.hierarchy function.

This assigns a range of properties to each node including;

• node.data - the data associated with the node (in our case it will include the name accessible as node.data.name)
• node.depth - a representation of the depth or number of hops from the initial ‘root’ node.
• node.height - the greatest distance from any descendant leaf nodes
• node.parent - the parent node, or null if it’s the root node
• node.children - child nodes or undefined for any leaf nodes

While we’re telling the function to use the ‘children’ elements from ‘treeData’, to generate the properties for the nodes, by default it will use the name ‘children’ if a name is not specified.
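To see where those properties come from, the following is a library-free imitation of the idea. It is not how d3.hierarchy is implemented. The function name buildHierarchy is made up and the sample data is a cut-down version of our tree;

```javascript
// Imitation of the properties listed above (illustration only).
// Wraps nested data in node objects carrying data, depth, height,
// parent and (for internal nodes) children.
function buildHierarchy(data, parent) {
  var node = {
    data: data,
    depth: parent ? parent.depth + 1 : 0,
    height: 0,
    parent: parent || null
  };
  if (data.children && data.children.length) {
    node.children = data.children.map(function(child) {
      return buildHierarchy(child, node);
    });
    // height is the longest path down to a leaf
    node.height = 1 + Math.max.apply(null, node.children.map(function(child) {
      return child.height;
    }));
  }
  return node;
}

var root = buildHierarchy({
  name: "Top Node",
  children: [
    { name: "Bob: Child of Top Node", children: [{ name: "Son of Bob" }] },
    { name: "Sally: Child of Top Node" }
  ]
});
var sonOfBob = root.children[0].children[0];
```

Here the root has a depth of 0 and a height of 2, while ‘Son of Bob’ is the other way around with a depth of 2 and a height of 0. As a leaf it has no children property at all.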

Lastly for this block we map the ‘nodes’ data to the tree layout;

The next block of code appends our SVG working area to the body of our web page and creates a group element (<g>) that will contain our svg objects (our nodes, text and links).

Now we’re going to start drawing something! First up is the trickiest part. The links between the nodes.

I say tricky because we’re going to be using the svg mini language again, and while it will work just fine if we simply paste it in and move on, if we want to understand a bit more about how it works, we will need to take a bit of a detour.

But first we need to get the details for this block of code explained. We declare and select all the ‘links’ and then assign the nodes as ‘data’ (.data(nodes.descendants().slice(1))). When we do this we are using .descendants() to return the array of ‘descendant’ nodes. We also specify .slice(1) to not include the main ‘root’ node since the links will be drawn by drawing a line from a child node to its parent (which we wouldn’t be able to do with the root node).

We append a path (.enter().append("path")) and apply some styling (.attr("class", "link")) and then embark on drawing the link lines with the SVG mini language via the ‘d’ attribute.

As we have mentioned previously the ‘d’ attribute allows for the creation of a string of instructions that describe a path. These instructions include;

• Moveto : moves the drawing point using M for absolute coordinates and m for relative movements
• Lineto : draws a straight line from the current position to the next specified location using L for absolute coordinates and l for relative movement.
• Curveto : draws a Bezier curve using control points from the current position to an end point. C designates a cubic Bezier curve (two control points) with absolute coordinates and c is used for relative movement. Q and q do the same for a quadratic Bezier curve, which uses a single control point.
• Arcto : describes a curved path as an elliptical curve rather than a Bezier with additional complexity
• ClosePath : draws a straight line from the current position to the first point in the path

If we look at a single node instance and break down the ‘d’ attribute path we can see the following;

• "M" + d.x + "," + d.y : Moves to the starting point of our node
• "C" + d.x + "," + (d.y + d.parent.y) / 2 : Establishes that we are going to draw a Cubic (C) Bezier curve and the first control point for it is at d.x in the x dimension and halfway between the starting node and its parent.
• " " + d.parent.x + "," + (d.y + d.parent.y) / 2 : Sets the second control point for the curve in line with the parent node in the x dimension and still halfway between the starting node and its parent in the y dimension.
• " " + d.parent.x + "," + d.parent.y : sets the end point for our curve at the parent node location.
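Putting those four pieces together, the string for a single link can be built as follows. This is a sketch that wraps the expressions from the list above in a function. The name curvedLink and the coordinates are made up for illustration;

```javascript
// Builds the 'd' attribute string for a curved link from a node to its parent.
// 'd' is assumed to be a laid-out node with x, y and parent properties.
function curvedLink(d) {
  return "M" + d.x + "," + d.y
    + "C" + d.x + "," + (d.y + d.parent.y) / 2
    + " " + d.parent.x + "," + (d.y + d.parent.y) / 2
    + " " + d.parent.x + "," + d.parent.y;
}

// Made-up coordinates for a child node and its parent
var child = { x: 100, y: 200, parent: { x: 300, y: 0 } };
var path = curvedLink(child); // "M100,200C100,100 300,100 300,0"
```

Both control points sit at y = 100 (halfway between the node and its parent), which is what gives the link its gentle ‘S’ shape.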

If we wanted to make the code easier to follow we could change the curve between nodes to a straight line with the following code;
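One possible version, assuming the same laid-out node properties (x, y and parent) as before, swaps the Curveto for a Lineto. The function name straightLink and the coordinates are just for illustration;

```javascript
// Builds a 'd' attribute string that joins a node to its parent
// with a single straight line.
function straightLink(d) {
  return "M" + d.x + "," + d.y
    + "L" + d.parent.x + "," + d.parent.y;
}

// Made-up coordinates for a child node and its parent
var line = straightLink({ x: 100, y: 200, parent: { x: 300, y: 0 } });
// "M100,200L300,0"
```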

Which would result in lines being drawn with coordinates from the ‘d’ attribute as follows;

Then we create a variable node that creates a group element (g) for each node;

This time we can see that;

• We don’t slice off the root node (.data(nodes.descendants())) from our data set.
• We apply a different ‘class’ to the node depending on whether it’s an internal node (it has children) or it’s a leaf node (it has no children) d.children ? " node--internal" : " node--leaf".
• We place each node ‘group’ at the appropriate location.
• We don’t actually draw anything. All we’re doing is getting the properties for each node set.

Once we have everything set up we can start to add objects to our node groups.

And then we add the text;

When we’re adding the text, we make sure that we add it on the side appropriate for either a leaf node or an internal node.

And there’s our tree diagram.

### A horizontal tree diagram explained

As we discussed at the start of the previous section, we wanted to start describing tree diagrams with a vertical version because there was an added degree of complexity with the horizontal version that might cause some confusion. If you have worked through and understood the vertical version, the horizontal won’t present any problems other than when you go “Oh, I see what’s going on.”. If you find yourself part way through this description of the changes to the code and can’t see what’s going on, revisit the vertical code and come back.

The graphic that we are going to generate will look like this…

The full code for this is almost identical to the code for the vertical tree diagram with a couple of simple, yet major changes.

The first change is that the way that we draw the diagram relies on changing our reference by rotating everything by 90 degrees.

The (very simplistic) diagram shows that what used to be our ‘x’ dimension is now our ‘y’ dimension and vice versa.

The second change is that where we rotated our axes above we are now left with y and x dimensions that have an origin in the bottom left of the graph. Of course when we draw our diagram, the origin is in the top left corner. This vertical flip occurs automatically and is the reason the diagram doesn’t appear to have ‘just’ rotated.

Now we have our origin in the top left again and the layout of our tree looks pretty much as the example we will be producing.

Believe it or not, this is WAY more difficult to explain than it is to actually do. Again, D3 takes care of the heavy lifting, the explanation of the changes above is just to help us understand why we make some of the changes.

The first change we make is just to give ourselves some extra margin space since some of the labels extend slightly more left and right with a horizontal diagram;

The second change sets the size of the graphic. Here the width and height are swapped as part of the rotation.

When we add the links we need to incorporate the rotation and the easiest way to appreciate the change is to compare the two pieces of code at the same time. In fact it is only when we draw the ‘d’ attribute for the path that we see the changes.

Firstly the vertical tree code;

Then the horizontal tree code;

While there is a bit of math involved, the easiest way to tell that there is a difference is that for the horizontal tree, each coordinate is being described in y,x fashion rather than the conventional x,y.

A similar coordinate change is made when we translate the positions for the nodes;
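The swap is easiest to see with the expressions wrapped in small functions. This is a sketch under the assumption that each node has been laid out with x, y and parent properties. The function names and coordinates are made up for illustration;

```javascript
// Horizontal version of the curved link: every coordinate pair is
// written as y,x instead of x,y.
function horizontalLink(d) {
  return "M" + d.y + "," + d.x
    + "C" + (d.y + d.parent.y) / 2 + "," + d.x
    + " " + (d.y + d.parent.y) / 2 + "," + d.parent.x
    + " " + d.parent.y + "," + d.parent.x;
}

// The same swap applied to the transform that positions each node group.
function horizontalTranslate(d) {
  return "translate(" + d.y + "," + d.x + ")";
}

// Made-up coordinates for a child node and its parent
var child = { x: 100, y: 200, parent: { x: 300, y: 0 } };
var path = horizontalLink(child);          // "M200,100C100,100 100,300 0,300"
var transform = horizontalTranslate(child); // "translate(200,100)"
```

The layout itself hasn’t changed. We are simply feeding its x values to the screen’s vertical direction and its y values to the horizontal.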

And lastly when we place the text next to the circles we need to swap the default ‘y’ distance value for an ‘x’ distance value so that the labels are spaced at an appropriate distance, and we need to align the text to either the left or right depending on whether the node has children or not. We adjust the text anchor appropriately with an ‘end’ or ‘start’.

And there we are….

The full code for this example can be found on github, or in the code samples bundled with this book (simple-horizontal-tree-diagram.html). A working example can be found on bl.ocks.org.

### Styling nodes in a tree diagram

#### Changing node and link colours

The nodes in a tree diagram are objects that exist to provide a representation of the structure of data, but on a tree diagram they should also be viewed as an opportunity to encode additional information about the underlying data.

From the horizontal example shown we have encoded a certain amount of information already. The position of the text relative to each node is determined by whether or not the node is the parent of another node (if it’s a parent it’s on the left) or a child that is on the edge of the tree (in which case it is on the right of the node).

Now, that’s nice, but are we going to be satisfied with that??? (The answer is “No” by the way.)

This example is fairly simple, but it is an example of applying different styles to the nodes to convey additional information. I should be clear at this stage that I am not advocating turning your tree diagram into something that looks like it came out of a circus, because that would be a crime against style, so don’t repeat my upcoming example, but let some of the features be a trigger for developing your own subtle, yet compelling visualizations.

Brace yourself. Here’s a picture of the tree diagram that we’re going to generate. Those with weaker constitutions should look away and flip forward a few pages;

The changes that have been made are as a result of additional data fields that have been added to the JSON array and these fields have been applied to various style options throughout the code.

The types of style changes we have made are;

• Variation of the diameter of nodes
• Changing the fill and stroke colour of nodes
• Changing the colour of links depending on the associated node they are connected to.

The code changes we describe from here are assuming that we start with our simple horizontal tree diagram from the previous section. We’ll start by looking at the new JSON data set;

Each node now has a value which might represent a degree of importance (we will use this to affect the radius of the nodes), a type which might indicate a difference in the type of node (they might be in active, inactive or undetermined states) and a level which might indicate an alert level for determining problems (red = bad, orange = caution and green = normal).

Irrespective of the contrived nature of our styling options, they are applied to our tree in fairly similar ways with some subtle differences.

The full code for this example can be found on github or in the code samples bundled with this book (tree-styling.html). A working example can be found on bl.ocks.org.

The first change is to the node radius, stroke colour and fill colour.

We simply change the portion of the code that appends the circle from this…

… to this …

The changes return the radius attribute as a function using data.value, the stroke colour is returned using data.type and the fill colour is returned with data.level. This is nice and simple, but we do need to make a slight adjustment to the code that sets the distance that the text is from the nodes so that when the radius expands or contracts, the text distance from the edge of the node adjusts as well.
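In plain JavaScript terms, the three accessor functions boil down to something like the following sketch. The sample node and its values are invented for illustration, and circleStyle is a hypothetical helper that simply gathers the three lookups in one place;

```javascript
// A hypothetical node carrying the extra fields (values invented for illustration)
var sampleNode = {
  data: { name: "Example node", value: 10, type: "black", level: "red" }
};

// Gathers the three style lookups described above
function circleStyle(d) {
  return {
    r: d.data.value,       // radius from the node's 'value'
    stroke: d.data.type,   // outline colour from the node's 'type'
    fill: d.data.level     // fill colour from the node's 'level'
  };
}

var style = circleStyle(sampleNode);
```

Note that the fields are read from d.data because the layout keeps our original object under the data property of each node.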

To do this we take the clever piece of code that adjusts the distance that the text is in the x dimension from the node that looks like this …

… and we add in a dynamic aspect using the data.value field.
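As a sketch of the before and after, the following compares a fixed offset with one that grows with the node radius. The spacing of 13 pixels and the padding of 4 pixels are assumptions picked for illustration, as are the function names;

```javascript
// Fixed spacing: labels sit a set distance from the centre of the node.
function fixedLabelOffset(d) {
  return d.children ? -13 : 13;   // 13 is an assumed spacing in pixels
}

// Dynamic spacing: the gap grows with the node radius (data.value).
function dynamicLabelOffset(d) {
  var gap = d.data.value + 4;     // 4 is an assumed padding beyond the circle's edge
  return d.children ? -gap : gap;
}

// Parents get right-aligned text on their left, leaves the opposite.
function labelAnchor(d) {
  return d.children ? "end" : "start";
}

var parentNode = { data: { value: 10 }, children: [{ data: { value: 5 } }] };
var leafNode = parentNode.children[0];
```

With these numbers the parent’s label moves 14 pixels to the left of the node centre, while the smaller leaf node’s label sits only 9 pixels to the right.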

The last thing we wanted to do is to change the colour of the link based on the colour of the node. We accomplish this by taking the code that inserts the links…

… and adding in a line that styles the link colour (the stroke) based on the data.level colour of node.

Use the concepts here wisely. I don’t want to see any heinously styled tree diagrams floating around the internet with “Thanks to the help from D3 Tips and Tricks” next to them. Be subtle, be thoughtful :-).

#### Changing the nodes to different shapes

Many thanks to Josiah who asked a question on the d3noob.org blog on how the shapes of the nodes could be varied based on an associated value in the data.

There is more than one way to do this, but perhaps the simplest is to replace the section of the JavaScript that appends the circle with one that appends a symbol from d3’s symbol generator.

There are seven pre-defined symbol types as follows;

• circle (d3.symbolCircle) - a circle.
• cross (d3.symbolCross) - a Greek cross or plus sign.
• diamond (d3.symbolDiamond) - a rhombus.
• square (d3.symbolSquare) - an axis-aligned square.
• triangle (d3.symbolTriangle) - an upward-pointing equilateral triangle.
• star (d3.symbolStar) - a five pointed star.
• ‘Y’ (d3.symbolWye) - a ‘Y’ shape.

If we start with our ‘tree-styling’ script from above we can replace the code block that added the circles with the following script, which will look at the value in the data and assign either a cross or a diamond depending on the value;

It will also adjust the size of the symbol along with the stroke and fill.
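The decision itself can be sketched without D3 as follows. In the real script the chosen type would be handed to the symbol generator (d3.symbol().type(...) with d3.symbolCross or d3.symbolDiamond) along with a size. The threshold of 9 and the size multiplier of 30 here are assumptions for illustration, as is the function name symbolFor;

```javascript
// Picks a symbol and its styling from the node's data (illustration only).
function symbolFor(d) {
  return {
    type: d.data.value >= 9 ? "cross" : "diamond",  // threshold is an assumption
    size: d.data.value * 30,                         // size multiplier is an assumption
    stroke: d.data.type,
    fill: d.data.level
  };
}

// Made-up nodes for illustration
var big = symbolFor({ data: { value: 10, type: "black", level: "red" } });
var small = symbolFor({ data: { value: 5, type: "grey", level: "green" } });
```

It is worth remembering that the size of a D3 symbol is an area in square pixels rather than a radius, which is why a multiplier is used here.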

The full code for this example can be found on github or in the code samples bundled with this book (tree-symbol.html). A working online example can be found on bl.ocks.org.

#### Using images as nodes

Many thanks to nbhatta who asked a question on the d3noob.org blog on how to use images as nodes.

This was a slightly simpler change and just involved replacing the code snippet that added the circles with one that added an image;

The images I chose were all 48 x 48 pixel for the sake of consistency and in the code above I formatted them to be half that size and moved them in the x and y direction so that they were centred correctly.

The cool thing that you will notice is that the specific icon that is placed at each node position is set by the name of the icon which is gathered from the JSON file with the tree details;

It’s possible to just have a single image and to hard-code it into the script, but where’s the fun in that?
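The sums behind the sizing and centring can be sketched like this. The field name icon is an assumption about how the image file name is stored in the data, and imageAttributes is a hypothetical helper that collects the values we would pass to the image element’s attributes;

```javascript
// Works out the attributes for a node's image (illustration only).
function imageAttributes(d) {
  var displaySize = 24;      // half of the 48 pixel source images
  return {
    href: d.data.icon,       // 'icon' is an assumed field name
    x: -displaySize / 2,     // shift left by half the width...
    y: -displaySize / 2,     // ...and up by half the height to centre the image
    width: displaySize,
    height: displaySize
  };
}

var attrs = imageAttributes({ data: { name: "Top Level", icon: "earth.png" } });
```

Without the negative x and y offsets the top left corner of the image would sit on the node position instead of its centre.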

The full code for this example can be found on github or in the code samples bundled with this book (tree-images.html, cart.png, earth.png, lettern.png, random.png and vlc.png). A working online example can be found on bl.ocks.org.

### Generating a tree diagram from external data

In all the examples we have looked at so far we have used data that we have declared from within the file itself. Being able to import data from an external file is an important feature that we need to know how to implement.

Starting from the simple tree diagram example that we began with at the start of the chapter, the first change that we need to make is to remove the section of code that declares our data. But don’t throw it away since we will use it to create a separate file called treeData.json. Its contents will be;

(don’t include the treeData = part, or the semicolon at the end (you can delete those))
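The reason for trimming those parts is that a .json file has to contain pure JSON, not a JavaScript statement. The following sketch demonstrates the difference using JSON.parse on two strings. The cut-down data and the helper name isValidJson are made up for illustration;

```javascript
// The declaration as it appeared inside our script (a JavaScript statement)
var declaration =
  'treeData = {"name": "Top Node", "children": [{"name": "Sally: Child of Top Node"}]};';

// What the treeData.json file should contain (pure JSON)
var fileContents =
  '{"name": "Top Node", "children": [{"name": "Sally: Child of Top Node"}]}';

// Hypothetical helper that reports whether some text is valid JSON
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (error) {
    return false;
  }
}

var treeData = JSON.parse(fileContents);
```

Checking isValidJson(declaration) gives false while isValidJson(fileContents) gives true. Keep in mind that JSON also insists on double quotes around names and text values.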

Then all we need to do is include a section that uses the d3.json accessor to load the file treeData.json (Remember to correctly address the file. This one assumes that the treeData.json file is in the same directory as the html file we are opening).

We can put it somewhere near the start of the JavaScript, but make sure it comes before the ‘nodes’ declaration (when in doubt, check out the sample code).

We also need to make sure that we include the wrapping, closing curly braces and bracket / semicolon (});) at the end of the script.

The full code for this example can be found on github or in the code samples bundled with this book (tree-from-external.html and treeData.json). A working example can be found on bl.ocks.org.

### Generating a tree diagram from ‘flat’ data

Tree diagrams are a fantastic way of displaying information, but one of the drawbacks (to the examples we’ve been using so far) is the need to have your data encoded hierarchically. Most data in a raw form will be flat. That is to say, it won’t be formatted as an array with the parent - child relationships. Instead it will be a list of objects (which we will want to turn into nodes) that might describe the relationship to each other, but they won’t be encoded that way. For example, the following is the flat representation of the example data we have been using thus far.

It is actually fairly simple and consists of only the name of the node and the name of its parent node. It’s easy to see how this data could be developed into a hierarchical form, but it would take a little time and for a larger data set, that would be tiresome.

Luckily computers are built for shuffling data about and with the advent of v4 of d3.js we now have the d3.stratify operator that will convert flat data into a hierarchy suitable for use in our tree diagram.

We will be using the simple example that we started with at the start of the chapter and the first change we need to make is to replace our original data…

… with our flat data array…

It’s worth noting here that we have also changed the name of the array (to flatData) since we are going to convert, then declare our newly massaged data with our original variable name treeData so that the remainder of our code thinks there have been no changes.

Then we use the d3.stratify operator on our flat data;

The stratify function requires a unique identifier to be used for each node and it will be declared as .id. In this example each of our nodes has a unique ‘name’, so we are using that as our id (.id(function(d) { return d.name; })). We also need to understand the hierarchy by having each node identify who its parent is. This will be stored as parentId (.parentId(function(d) { return d.parent; }))
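To take some of the mystery out of what is going on, here is a library-free sketch of the same idea. It is not D3’s implementation (which also checks for things like cycles and multiple roots). The function name stratifySketch is made up, and the flat data is a guess based on the node names we have been using;

```javascript
// Flat data: each row names itself and its parent (reconstructed for illustration)
var flatData = [
  { name: "Top Node", parent: null },
  { name: "Bob: Child of Top Node", parent: "Top Node" },
  { name: "Sally: Child of Top Node", parent: "Top Node" },
  { name: "Son of Bob", parent: "Bob: Child of Top Node" },
  { name: "Daughter of Bob", parent: "Bob: Child of Top Node" }
];

// Builds a nested structure from flat rows using an id and a parent id.
function stratifySketch(rows, getId, getParentId) {
  var byId = {};
  var root = null;
  rows.forEach(function(row) {
    byId[getId(row)] = { id: getId(row), data: row };
  });
  rows.forEach(function(row) {
    var node = byId[getId(row)];
    var parentId = getParentId(row);
    if (parentId === null || parentId === undefined || parentId === "") {
      root = node;   // the one row without a parent is the root
    } else {
      var parent = byId[parentId];
      if (!parent) { throw new Error("missing parent: " + parentId); }
      (parent.children = parent.children || []).push(node);
    }
  });
  return root;
}

var treeData = stratifySketch(
  flatData,
  function(d) { return d.name; },
  function(d) { return d.parent; }
);
```

The two accessor functions play the same role as the .id and .parentId accessors described above. They tell the routine which field identifies a node and which field points to its parent.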

That’s it!

Because we want to be able to use our code as intact as possible from our horizontal tree example we will want to run through our dataset and assign the ‘name’ to each node that has been stored as id;

That’s it!

The brevity of the code to do this is fantastic and well done to Mike Bostock for including the new function in v4. Of course, the end result looks exactly the same;

… but it adds a significant capability for use of additional data.

The full code for this example can be found on github or in the code samples bundled with this book (tree-from-flat.html). A working example can be found on bl.ocks.org.

### Generating a tree diagram from a CSV file.

Creating a tree diagram from a csv file is an extension of the sections where we create a diagram from flat data and where we create a diagram from an external file.

By mashing these together and using a csv file something like the following…

… we can ingest the name of the nodes and their relationships and then format the data correctly.

The main piece of code that we would add that is different from the standard horizontal tree diagram is as follows;

The only part of that code which is new is the portion where we look for the node whose parent is the text “null” and change it to null. This is necessary since the script reads the value from the csv file as the text ‘null’, so we have to force the code to realise that we want it to be a true null value.
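That conversion can be sketched in isolation as follows. The rows are made-up stand-ins for what a csv loader would hand us (everything arrives as text), and fixNullParents is a hypothetical helper name;

```javascript
// Replaces the text "null" in the parent column with a real null value.
function fixNullParents(rows) {
  rows.forEach(function(row) {
    if (row.parent === "null") { row.parent = null; }
  });
  return rows;
}

// Made-up rows in the shape a csv loader would produce
var rows = fixNullParents([
  { name: "Top Level", parent: "null" },
  { name: "Level 2: A", parent: "Top Level" }
]);
```

After the conversion only the root row has a parent of null, which is what allows it to be recognised as the top of the hierarchy.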

The end result looks very familiar.

The full code for this example can be found on github or in the code samples bundled with this book (tree-from-csv.html and treeCsv.csv). A working example can be found on bl.ocks.org.

### An interactive tree diagram

The examples presented thus far have all been static in the sense that they present information on a web page, but that’s where they stop. One of the strengths of web content is the ability to involve the reader to a greater extent. Therefore the following tree diagram example includes an interactive element where the user can click on any parent node and it will collapse on itself to make more room for others or to simplify a view. Additionally, any collapsed parent node can be clicked on and it will re-grow to its previous condition.

The example included here has its roots in Mike Bostock’s v3 tree diagram example. Kudos and thanks also go out to Soumya Ranjan for steering me in the right direction for the diagonal solution. This was necessary to work around the removal of svg.diagonal in v4.

The full code for this example can be found on github, in the appendices of this book or in the code samples bundled with this book (interactive-tree.html). A working online example can be found on bl.ocks.org.

For a brief visual description of the action. The diagram will initially display a partially collapsed tree…

Then when clicking on the ‘Level 2: A’ node, the tree expands to…

We could also click on the root node (‘Top Level’) to fully collapse the tree…

Then clicking on the nodes opens the diagram back up again.

One of the important changes is to allow the diagram to follow the d3.js model of enter - update - exit for the nodes with a suitable transition in between.

Nodes are coloured (“lightsteelblue”) if they have been collapsed and at the end of the script we have a function that makes use of a d._children reference alongside the d.children reference we have been using in most of our examples.

This allows the action of clicking on the nodes to update the data associated with the node and as a consequence change its properties in the script based on if statements (such as "fill", function(d) { return d._children ? "lightsteelblue" : "#fff"; } which will fill the node with “lightsteelblue” if d._children exists, otherwise make it white).
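The general pattern can be sketched on its own like this. It is a simplified illustration of the technique rather than the full click handler (which would also redraw the tree), and the sample node is made up;

```javascript
// Collapse a node by stashing its children in _children, or expand it
// by bringing them back.
function toggleChildren(d) {
  if (d.children) {
    d._children = d.children;   // hide the children
    d.children = null;
  } else {
    d.children = d._children;   // restore the children
    d._children = null;
  }
  return d;
}

// Fill rule quoted above: tinted when children are hidden, white otherwise.
function nodeFill(d) {
  return d._children ? "lightsteelblue" : "#fff";
}

// Made-up node for illustration
var node = { name: "Level 2: A", children: [{ name: "Son of A" }] };
var before = nodeFill(node);      // nothing hidden yet
toggleChildren(node);
var collapsed = nodeFill(node);   // children are now stashed
toggleChildren(node);
var after = nodeFill(node);       // children restored
```

Because the layout only looks at d.children, moving the array to d._children is enough to make a whole branch disappear from the next redraw without losing it.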

The examples we have looked at in the previous sections in this chapter are all applicable to this interactive version, so this should provide you with the capability to generate some interesting visualizations.

## Sankey Diagrams

### What is a Sankey Diagram?

A Sankey diagram is a type of flow diagram where the ‘flow’ is represented by arrows of varying thickness depending on the quantity of flow.

They are often used to visualize energy, material or cost transfers and are especially useful in demonstrating proportionality to a flow where different parts of the diagram represent different quantities in a system.

Probably the most famous example of a Sankey diagram is Charles Minard’s Map of Napoleon’s Russian Campaign of 1812.

From Wikipedia;

Étienne-Jules Marey first called notice to this dramatic depiction of the fate of Napoleon’s army in the Russian campaign, saying it defies the pen of the historian in its brutal eloquence. Edward Tufte says it “may well be the best statistical graphic ever drawn” and uses it as a prime example in The Visual Display of Quantitative Information.

Wikipedia has a great explanation of the diagram type and there is a wealth of information dedicated to it on the inter-web. I heartily recommend http://www.sankey-diagrams.com/ for all things Sankey!

So it would come as little surprise that Mike Bostock has developed a plugin for Sankey diagrams (http://bost.ocks.org/mike/sankey/) so that we can all enjoy Sankey goodness with lashings of D3.


### Which Sankey plugin should we use?

Hmmmm…… Good question.

As at the time of writing there were 4 different sankey plugins listed on the d3 wiki (including the vertical one). The examples we will walk through have used the plugins from Stefaan Lippens and Jason Davies without problem. I have used the Jason Davies version for no other reason than it appears to be the originator from which others have been derived.

### How d3.js Sankey Diagrams want their data formatted

If we think of Sankey diagrams consisting of ‘nodes’ and ‘links’…

… the data that generates them must be formatted as nodes and links as well.

For instance a JSON file with appropriate data to build the diagram above could look like the following;

In the file above we have 6 nodes (0-5) sequentially numbered and with names appropriate to their position in the list.

The sequential numbering is only for the purpose of highlighting the structure of the data, since when we get D3 running, it will automatically index each of the nodes according to its position. In other words, we could have omitted the “node”:n parts since D3 will know where each node is anyway. The big deal is that WE need to know what each node is as well. Especially if we’re going to be building the data by hand (doing it dynamically would be cool, but let’s not get ahead of ourselves just yet).

The ‘links’ part of the data can be broken down into individual source to target ‘links’ that have an associated value (could be a quantity or strength, but at least a numeric value).

The ‘source’ and ‘target’ numbers are references to the list of nodes. So, “source”:1, “target”:2 means that this link is whatever node appears at position 1 going to whatever node appears at position 2. The important point to make here is that D3 will not be interested in the numerical value of the node, just its position in the list (starting at zero).
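As a minimal sketch of the structure described above, the same kind of data can be written as a JavaScript object. The node names, the particular links and their values are made up purely for illustration (they are assumptions, not the contents of sankey.json), and checkLinks is a hypothetical helper of my own, not part of d3.js or sankey.js.

```javascript
// Illustrative data only: six nodes (positions 0-5) and a handful of links.
// The names, links and values are assumptions made up for this sketch.
var graph = {
  nodes: [
    { node: 0, name: "node0" },
    { node: 1, name: "node1" },
    { node: 2, name: "node2" },
    { node: 3, name: "node3" },
    { node: 4, name: "node4" },
    { node: 5, name: "node5" }
  ],
  links: [
    { source: 0, target: 2, value: 2 },
    { source: 1, target: 2, value: 2 },
    { source: 1, target: 3, value: 2 },
    { source: 2, target: 4, value: 3 },
    { source: 3, target: 4, value: 2 },
    { source: 4, target: 5, value: 5 }
  ]
};

// Hypothetical helper: is this number a valid position in the nodes array?
function isValidPosition(graph, i) {
  return Number.isInteger(i) && i >= 0 && i < graph.nodes.length;
}

// Hypothetical helper: returns the links whose source or target does not
// point at a valid position in the nodes array.
function checkLinks(graph) {
  return graph.links.filter(function (link) {
    return !isValidPosition(graph, link.source) ||
           !isValidPosition(graph, link.target);
  });
}

console.log(checkLinks(graph).length); // number of broken links
```

A check like this can be handy when building the data by hand, since a link that points past the end of the node list is an easy mistake to make.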

### Description of the code

The code for the Sankey diagram is significantly different to that for a line graph although it shares the same core language and programming methodology.

The code we’ll go through is an adaptation of the version first demonstrated by Mike Bostock so it’s got a pretty good pedigree. We will begin with a version that uses data that is formatted so that it can be used directly with no manipulation, then in subsequent sections we will work on different techniques for getting data from different formats (and with different structures) to work.

I found that getting data into the correct format was the biggest hurdle in getting a Sankey diagram to work. We will start off assuming that the data is perfectly formatted, then move on to the case where only the link data is available, then where there are just names to work with (no numeric node values) and lastly, a version that suits people with changeable data from a MySQL database.

We won’t try to go over every inch of the code as we did with the simple graph example (I’ll skip things like the HTML header) and will focus on the style sheet (CSS) portion and the JavaScript.

The full code for this example can be found on github or in the code samples bundled with this book (sankey-formatted-json.html, sankey.js and sankey.json). A live example can be found on bl.ocks.org.

On to the code…

So, going straight to the style sheet bounded by the <style> tags;

The CSS in this example is mainly concerned with formatting of the mouse cursor as it moves around the diagram.

The first part…

… provides the properties for the node rectangles. It changes the icon for the cursor when it moves over the rectangle to one that looks like it will move the rectangle (there is a range of different icons that can be defined here http://www.echoecho.com/csscursors.htm), sets the fill colour to mostly opaque and keeps the edges sharp.

The next block…

… sets the properties for the text at each node. The mouse is told to essentially ignore the text in favour of anything that’s under it (in the case of moving or highlighting something else) and a slight shadow is applied for readability.

The following block…

… makes sure that the link has no fill (it actually appears to be a bendy rectangle with very thick edges that make the element appear to be a solid block), colours the edges black (#000) and makes the edges almost transparent.

The last block…

… simply changes the opacity of the link when the mouse goes over it so that it’s more visible. If so desired, we could change the colour of the highlighted link by adding a line to this block that sets the stroke colour, such as stroke: red;

Just before we get into the JavaScript, we do something a little different for d3.js. We tell it to use a plug-in with the following line;

The concept of a plug-in is that it is a separate piece of code that will allow additional functionality to a core block (which in this case is d3.js). There are a range of plug-ins available and we will need to source the sankey.js file from the repository and place that somewhere where our HTML code can access it. In this case I have put it in the same directory as the main sankey web page.

The start of our JavaScript begins by defining a range of variables that we’ll be using.

Our units are set as ‘Widgets’ (var units = "Widgets";), which is just a convenient generic (nonsense) term to provide the impression that the flow of items in this case is widgets being passed from one person to another.

We then set our canvas size and margins…

… before setting some formatting.

The formatNumber function acts on a number to set it to zero decimal places in this case. In the original Mike Bostock example it was to three places, but for ‘widgets’ I’m presuming we don’t divide :-).

format is a function that returns a given number formatted with formatNumber as well as a space and our units of choice (‘Widgets’). This is used to display the values for the links and nodes later in the script.
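To see what the two functions produce, here is a small sketch that runs without D3. It assumes the original format is a comma-separated number with no decimal places, and it swaps in the browser's built-in toLocaleString in place of the d3.format call, so treat it as an approximation rather than the code from the example.

```javascript
var units = "Widgets";

// Stand-in for the D3 number formatter: no decimal places and a comma as
// the thousands separator (an assumption about the format used).
function formatNumber(d) {
  return d.toLocaleString("en-US", { maximumFractionDigits: 0 });
}

// The number, then a space, then our units of choice.
function format(d) {
  return formatNumber(d) + " " + units;
}

console.log(format(1234)); // 1,234 Widgets
```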

The color = d3.scaleOrdinal(d3.schemeCategory20); line is really interesting and provides access to a colour scale that is pre-defined for your convenience! Later in the code we will see it in action.

Our next block of code positions our svg element onto our page in relation to the size and margins we have already defined;

Then we set the variables for our sankey diagram;

Without trying to state the obvious, this sets the width of the nodes (.nodeWidth(36)), the padding between the nodes (.nodePadding(40)) and the size of the diagram (.size([width, height]);).

The following line defines the path variable as a pointer to the sankey function that makes the links between the nodes do their clever thing of bending into the right places;

I make the presumption that this is a defined function within sankey.js.

Then we load the data for our sankey diagram with the following line;

As we have seen in previous usage of the d3.json, d3.csv and d3.tsv functions, this is a wrapper that acts on all the code within it bringing the data in the form of graph to the remaining code.

I think it’s a good time to take a slightly closer look at the data that we’ll be using;

I want to look at the data now, because it highlights how it is accessed throughout this portion of the code. It is split into two different blocks, ‘nodes’ and ‘links’. The subset of variables available under ‘nodes’ is ‘node’ and ‘name’. Likewise under ‘links’ we have ‘source’, ‘target’ and ‘value’. This means that when we want to act on a subset of our data we define which piece by defining the hierarchy that leads to it. For instance, if we want to define an action for all the links, we would use graph.links (they’re kind of chained together).

Now that we have our data loaded, we can assign the data to the sankey function so that it knows how to deal with it behind the scenes;

In keeping with our previous description of what’s going on with the data, we have told the sankey function that the nodes it will be dealing with are in graph.nodes of our data structure.

I’m not entirely sure what the .layout(32); portion of the code does, but I’d be interested to hear from any more knowledgeable readers. I’ve tried changing the value to no apparent effect and googling has drawn a blank. Inside the sankey.js file the number appears to be treated as ‘iterations’: the layout runs computeNodeLinks, computeNodeValues, computeNodeBreadths, computeNodeDepths (which is handed the iterations) and computeLinkDepths, so it looks like the number of passes made while settling the positions of the nodes.

Then we add our links to the diagram with the following block of code;

This is an analogue of the block of code we examined way back when explaining the code of our first simple graph.

We append svg elements for our links based on the data in graph.links, then add in the paths (using the appropriate CSS). We set the stroke width to the width associated with each link or ‘1’, whichever is the larger (by virtue of the Math.max function). As an interesting sideline, if we force this value to ‘10’ thusly…

… the graph looks quite interesting.

The sort function (.sort(function(a, b) { return b.dy - a.dy; });) makes sure the link for which the target has the highest y coordinate departs first out of the rectangle. This means that if you have flows of 30, 40 and 50 out of node 1, heading towards nodes 2, 3 and 4, with node 3 located above node 2 and that above node 4, the outflow order from node 1 will be 40, 50, 30. This keeps the number of flow crossings to a minimum. It’s slightly confusing and for a long time it was a mystery (big thanks and kudos to ‘napicool’ who was able to explain it on d3noob.org).
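Taken on its own, the comparator passed to the sort function simply puts the links with the larger dy first, and Math.max guarantees a stroke width of at least 1. The sketch below shows both on a plain array with made-up dy values. It uses the ordinary array sort rather than a D3 selection, so it only illustrates the ordering, not the drawing.

```javascript
// Illustrative links; dy is the thickness that the sankey layout would
// normally calculate for each link (the values here are made up).
var links = [
  { name: "a", dy: 30 },
  { name: "b", dy: 0.4 },
  { name: "c", dy: 50 }
];

// The same comparator as in the text: larger dy sorts first.
var sorted = links.slice().sort(function (a, b) { return b.dy - a.dy; });

// Stroke width is the link thickness, but never less than 1.
var widths = sorted.map(function (d) { return Math.max(1, d.dy); });

console.log(sorted.map(function (d) { return d.name; })); // [ 'c', 'a', 'b' ]
console.log(widths); // [ 50, 30, 1 ]
```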

This code appends a text element to each link when moused over that contains the source and target name (with a neat little arrow in between and the value) which, when applied with the format function, adds the units.

The next block appends the node objects (but not the rectangles or text) and contains the instructions to allow them to be arranged with the mouse.

While it starts off in familiar territory with appending the node objects using the graph.nodes data and putting them in the appropriate place with the transform attribute, I can only assume that there is some trickery going on behind the scenes to make sure the mouse can do what it needs to do with the d3.drag function (d3.behavior.drag in d3.js v3). There is some excellent documentation on the wiki, but I can only presume that it knows what it’s doing :-). The dragmove function is laid out at the end of the code, and we will explain how that operates later. Kudos for this code portion should go to @syntagmatic.

I really enjoyed the next block;

It starts off with a fairly standard appending of a rectangle with a height generated by its value { return d.dy; } and a width dictated by the sankey.js file to fit the area (.attr("width", sankey.nodeWidth())).

Then it gets interesting.

The colours are assigned in accordance with our earlier colour declaration and the individual colours are added to the nodes by finding the first part of the name for each node and assigning it a colour from the palette (the script looks for the first space in the name using a regular expression). For instance: ‘Widget X’, ‘Widget Y’ and ‘Widget’ will all be coloured the same even if the ‘Widget X’ and ‘Widget Y’ are inputs on the left and ‘Widget’ is a node in the middle.
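The idea of colouring by the first part of the name can be tried out without D3. In this sketch makeOrdinal is a tiny hand-rolled stand-in for an ordinal colour scale and the three-colour palette is arbitrary, so both are assumptions for illustration. The regular expression is one way of dropping everything from the first space onwards.

```javascript
// A tiny stand-in for an ordinal colour scale: each new key is given the
// next colour in the palette, and a repeated key gets the same colour back.
function makeOrdinal(palette) {
  var seen = Object.create(null);
  var count = 0;
  return function (key) {
    if (!(key in seen)) {
      seen[key] = palette[count % palette.length];
      count += 1;
    }
    return seen[key];
  };
}

// An arbitrary palette, just for this sketch.
var color = makeOrdinal(["#1f77b4", "#ff7f0e", "#2ca02c"]);

// Everything from the first space onwards is dropped from the name
// before the colour is looked up.
function nodeColour(d) {
  return color(d.name.replace(/ .*/, ""));
}

console.log(nodeColour({ name: "Widget X" }) === nodeColour({ name: "Widget" })); // true
```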

The stroke around the outside of the rectangle is then drawn in the same shade, but darker. Then we return to the basics where we add the title of the node in a tool tip type effect along with the value for the node.

From here we add the titles for the nodes;

Again, this looks pretty familiar. We position the text titles carefully to the left of the nodes, all except for those affected by the filter function (return d.x < width / 2;). For those, where the position of the node on the x axis is less than half the width, the title is placed on the right of the node and anchored at the start of the text. Very neat.
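The placement rule can be sketched as a plain function. labelPosition is a hypothetical helper, and the 6 pixel gap between node and label is an assumption for illustration. In the real script the same decisions are made with attributes on the text element and the filter function.

```javascript
// Hypothetical helper: works out where a node's label goes.
// d.x and d.dy would come from the sankey layout; width is the chart width
// and nodeWidth is the width of a node rectangle.
function labelPosition(d, width, nodeWidth) {
  var onLeftHalf = d.x < width / 2;
  return {
    // Left-half nodes get the label just past their right edge,
    // everything else gets it just before the left edge.
    x: onLeftHalf ? 6 + nodeWidth : -6,
    // Vertically centred on the node.
    y: d.dy / 2,
    anchor: onLeftHalf ? "start" : "end"
  };
}

console.log(labelPosition({ x: 0, dy: 40 }, 700, 36));
console.log(labelPosition({ x: 664, dy: 40 }, 700, 36));
```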

The last block is also pretty neat, and contains a little surprise for those who are so inclined.

This declares the function that controls the movement of the nodes with the mouse. It selects the item that it’s operating over (d3.select(this)) and then allows translation in the y axis while maintaining the link connection (sankey.relayout(); link.attr("d", path);).

But that’s not the cool part. A quick look at the code should reveal that if you can move a node in the y axis, there should be no reason why you can’t move it in the x axis as well!

Sure enough, if you replace the code above with this…

… you can move your nodes anywhere on the canvas.
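The difference between the two versions of the drag behaviour can be sketched as a plain function. dragTransform and clamp are hypothetical helpers, and keeping the node inside the drawing area by clamping is an assumption about how the limits are applied. In the real script the result would be set as the transform attribute of the node, followed by sankey.relayout() and a redraw of the links.

```javascript
// Keep a value between a lower and an upper limit.
function clamp(value, min, max) {
  return Math.max(min, Math.min(max, value));
}

// Hypothetical helper: updates the node position from a drag event and
// returns the transform string for the node.
// d.dx and d.dy are the node's width and height; moveX switches on
// movement along the x axis as well as the y axis.
function dragTransform(d, event, width, height, moveX) {
  if (moveX) {
    d.x = clamp(event.x, 0, width - d.dx);
  }
  d.y = clamp(event.y, 0, height - d.dy);
  return "translate(" + d.x + "," + d.y + ")";
}

var node = { x: 100, y: 50, dx: 36, dy: 40 };
console.log(dragTransform(node, { x: 900, y: -20 }, 700, 300, false)); // translate(100,0)
console.log(dragTransform(node, { x: 900, y: -20 }, 700, 300, true));  // translate(664,0)
```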

I know it doesn’t seem to add anything to the diagram (in fact, it could be argued that it detracts from it), but that doesn’t mean the idea won’t come in handy one day :-). You can see a live version on bl.ocks.org.

### Formatting data for Sankey diagrams

As explained in the previous section, data to form a Sankey diagram needs to be a combination of nodes and links.

As we also noted earlier, the "node" entries in the "nodes" section of the JSON file are superfluous and are really only there for our benefit since D3 will automatically index the nodes starting at zero. As a test to check this out we can change our data to the following;

This will produce the following graph;

As you can see, essentially the same, but with easier to understand names.

As you can imagine, while the end result is great, the creation of the JSON file manually would be painful at best. Doing something similar but with a greater number of nodes / links would be a nightmare.

Let’s see if we can make the process a bit easier and more flexible.

It would make things much easier, if you are building the data by hand, to have nodes with names, and the ‘source’ and ‘target’ links to have those same name values as identifiers.

In other words a list of unique names for the nodes (and perhaps some details) and a list of the links between those nodes using the names for the nodes.

So, something like this;

Once again, D3 to the rescue!

The little piece of code that can do this for us is here;

This elegant solution comes from Stack Overflow and was provided by Chris Pettitt (nice job).

So if we sneak this piece of code into here…

… and this time we use our JSON file with just names (sankey-names.json) and our new html file (sankey-formatted-names.html) we find our Sankey diagram working perfectly!

The full code for this example can be found on github or in the code samples bundled with this book (sankey-formatted-names.html, sankey.js and sankey-names.json). A live example can be found on bl.ocks.org.

Looking at our new piece of code…

… the first thing it does is create an object called nodeMap (The difference between an array and an object in JavaScript is one that is still a little blurry to me and judging from online comments, I am not alone).

Then for each of the graph.nodes entries (where x is a range of numbers from 0 to the last node), we assign each node name to a number.

Then in the next piece of code…

… we go through all the links we have and for each link, we map the appropriate number to the correct name.
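The two steps described above can be sketched on a small, made-up data set. This is my own minimal version of the idea (names mapped to positions, then links rewritten to use those positions) and is not necessarily identical to the snippet from Stack Overflow. The names and values are invented for the example.

```javascript
// Made-up data where the links refer to nodes by name.
var graph = {
  nodes: [{ name: "Barry" }, { name: "Frodo" }, { name: "Elvis" }],
  links: [
    { source: "Barry", target: "Elvis", value: 2 },
    { source: "Frodo", target: "Elvis", value: 2 }
  ]
};

// Step 1: remember the position of each node name.
var nodeMap = {};
graph.nodes.forEach(function (node, index) {
  nodeMap[node.name] = index;
});

// Step 2: swap the names in each link for those positions.
graph.links = graph.links.map(function (link) {
  return {
    source: nodeMap[link.source],
    target: nodeMap[link.target],
    value: link.value
  };
});

console.log(graph.links);
```

After this runs, the links have the same numeric source and target form as the fully formatted JSON file we started with.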

Very clever.

#### From a CSV with ‘source’, ‘target’ and ‘value’ info only.

In the first iteration of this section of the book I had no solution to creating a Sankey diagram using a csv file as the source of the data.

But cometh the hour, cometh the man. Enter @timelyportfolio who, while claiming no expertise in D3 or JavaScript was able to demonstrate a solution to exactly the problem I was facing! Well done Sir! I salute you and name the technique the timelyportfolio csv method!

The full code for this example can be found on github or in the code samples bundled with this book (sankey-formatted-csv.html, sankey.js and sankey.csv). A live example can be found on bl.ocks.org.

So here’s the cleverness that @timelyportfolio demonstrated;

Using a csv file (in this case called sankey.csv) that looks like this;

We take this single line from our original Sankey diagram code;

And replace it with the following block;

The comments in the code (and they are fuller in @timelyportfolio’s original gist solution) explain the operation;

… Loads the csv file from the data directory.

… Declares graph to consist of two empty arrays called nodes and links.

… Takes the data loaded with the csv file and for each row loads variables for the source and target into the nodes array. Then for each row it loads variables for the source, target and value into the links array.

… Is a routine that Mike Bostock described on Google Groups that (as I understand it) nests each node name as a key so that it returns with only unique nodes.

… Goes through each link entry and, for each source and target, it finds the unique index number of that name in the nodes array and assigns the link source and target an appropriate number.

And finally…

… Goes through each node and (in the words of @timelyportfolio) “make nodes an array of objects rather than an array of strings” (I don’t really know what that means :-(. I just know it works :-).)
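The steps above can be sketched without D3 on a few made-up rows. Note that this is not @timelyportfolio’s code: the rows are invented, and a built-in Set is swapped in for the nesting routine to find the unique names. It does show what turning “an array of strings” into “an array of objects” looks like in the last step.

```javascript
// Rows as they might look after loading a csv file with the columns
// source, target and value (values from a csv file arrive as strings).
var rows = [
  { source: "Barry", target: "Elvis", value: "2" },
  { source: "Frodo", target: "Elvis", value: "2" },
  { source: "Elvis", target: "Sarah", value: "4" }
];

var graph = { nodes: [], links: [] };

// Collect every source and target name, and copy each row as a link.
rows.forEach(function (d) {
  graph.nodes.push(d.source);
  graph.nodes.push(d.target);
  graph.links.push({ source: d.source, target: d.target, value: +d.value });
});

// Keep only the unique names (a Set is used here in place of nesting).
graph.nodes = Array.from(new Set(graph.nodes));

// Replace the names in each link with the position of that name.
graph.links.forEach(function (link) {
  link.source = graph.nodes.indexOf(link.source);
  link.target = graph.nodes.indexOf(link.target);
});

// Turn the array of strings into an array of objects with a name property.
graph.nodes = graph.nodes.map(function (name) {
  return { name: name };
});

console.log(JSON.stringify(graph));
```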

There you have it. A Sankey diagram from a csv file. Well played @timelyportfolio!

## Assorted Tips and Tricks

### Change a line chart into a scatter plot

Confession time.

In the original book I didn’t actually intend to add in a section with a scatter plot in it for its own sake because I thought it would be;

1. tricky
2. not useful
3. all of the above

I was wrong on all counts.

All we need to do is take the simple graph example file and slot the following block in between the ‘Add the valueline path’ and the ‘add the x axis’ blocks.

And we will get…

The full code for this graph can also be found on github or in the code samples bundled with this book (simple-scatterplot.html and data.csv). A live example can be found on bl.ocks.org.

I deliberately put the dots after the line in the drawing section, because I thought they would look better, but we could put the block of code before the line drawing block to get the following effect;

(just trying to reinforce the concept that ‘order’ matters when drawing objects :-)).

We could of course just remove the line block all together…

But in my humble opinion it loses something.

So what do the individual lines in the scatter plot block of JavaScript do?

The first line (svg.selectAll("dot")) essentially provides a suitable group label for the svg circle elements that will be added. The next line associates the range of data that we have to the group of elements we are about to add in.

Then we add a circle for each data point (.enter().append("circle")) with a radius of 5 pixels (.attr("r", 5)) and appropriate x (.attr("cx", function(d) { return x(d.date); })) and y (.attr("cy", function(d) { return y(d.close); });) coordinates.
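To make the attributes a little more concrete, the sketch below works out the r, cx and cy values for a few made-up data points. linearScale is a hand-rolled stand-in for the x and y scales from the simple graph, and the data and chart size are assumptions for illustration.

```javascript
// A hand-rolled linear scale, standing in for the D3 scales.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

// Made-up data points (dates in UTC) and chart size.
var data = [
  { date: new Date(Date.UTC(2012, 3, 1)), close: 100 },
  { date: new Date(Date.UTC(2012, 3, 11)), close: 300 },
  { date: new Date(Date.UTC(2012, 3, 21)), close: 200 }
];
var width = 400, height = 200;

var x = linearScale([data[0].date, data[2].date], [0, width]);
var y = linearScale([0, 400], [height, 0]); // larger values sit higher up

// One set of circle attributes per data point.
var circles = data.map(function (d) {
  return { r: 5, cx: x(d.date), cy: y(d.close) };
});

console.log(circles);
```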

There is lots more that we could be doing with this piece of code including varying the colour or size or opacity of the circles depending on the data and all sorts of really neat things, but for the meantime, there we go. Scatter plot!

Tooltips have a marvellous duality. They are on one hand a pretty darned useful thing that aids in giving context and information where required and on the other hand, if done with a bit of care, they can look very stylish :-).

Technically, they represent a slight deviation from what we have been playing with so far into a mildly more complex arena of ‘transitions’ and ‘events’. You will probably regard this one of two ways. Either accepting that it just works and using it as shown, or you will actually know what’s going on in which case feel free to deride my efforts as those of a rank amateur :-).

Just in case there is some confusion, a tooltip (one word or two?) is a discrete piece of information that will pop into view when the mouse hovers over somewhere specific. Most of us have seen and used them, but I suppose we all tend to call them different things such as ‘infotip’, ‘hint’ or ‘hover box’. I don’t know if there’s a right name for them, but here’s an example of what we’re trying to achieve;

We can see the mouse has hovered over one of the scatter plot circles and a tip has appeared that provides us with the exact date and value for that point.

We can also notice that there’s a certain degree of ‘fancy’ here as the information is bound by a rectangular shape with rounded corners and a slight opacity. The other piece of ‘fancy’ which you don’t see in a PDF (or whatever format this distinguished tome will be published in on its 33rd reprint in the year 2034), is that when these tool tips appear and disappear, they do so in an elegant fade-in, fade-out way. Pretty!

Before we get started describing how the code goes together, let’s take a quick look at the two technique specifics that I mentioned earlier, ‘transitions’ and ‘events’.

#### Transitions

From the main d3.js web page (d3js.org) transitions are described as gradually interpolating styles and attributes over time. So what I take that to mean is that if you want to change an object, you can do so by simply specifying the attribute / style end point that you want it to end up with and the time you want it to take and go!

Of course, it’s not quite that simple, but luckily, smarter people than I have done some fantastic work describing different aspects of transitions so please see the following for a more complete description of the topic;

Hopefully observing the mouseover and mouseout transitions in the tooltips example will whet your appetite for more!

#### Events

The other technique is related to mouse ‘events’. This describes the browser watching for when ‘something’ happens with the mouse on the screen and when it does, it takes a specified action. A (probably non-comprehensive) list of the types of events is the following;

• mousedown: Triggered by an element when a mouse button is pressed down over it
• mouseup: Triggered by an element when a mouse button is released over it
• mouseover: Triggered by an element when the mouse comes over it
• mouseout: Triggered by an element when the mouse goes out of it
• mousemove: Triggered by an element on every mouse move over it.
• click: Triggered by a mouse click: mousedown and then mouseup over an element
• contextmenu: Triggered by a right-button mouse click over an element.
• dblclick: Triggered by two clicks within a short time over an element

How many of these are valid to use within d3 I’m not sure, but I’m willing to bet that there are probably more than those here as well. Please go to http://javascript.info/tutorial/mouse-events for a far better description of the topic if required.

#### Get tipping

So, bolstered with a couple of new concepts to consider, let’s see how they are enacted in practice.

The full code for this graph can also be found on github or in the code samples bundled with the book (simple-tooltips.html and data.csv). A live example can be found on bl.ocks.org.

If we start with our simple-scatter plot graph there are 4 areas in it that we will want to modify (it may be easier to check the simple-tooltips.html file in the code samples bundled with this book).

The first area is the CSS. The following code should be added just before the </style> tag;

These styles define how our tooltip will appear. Most of them are fairly straightforward. The position of the tooltip is done in absolute measurements, not relative. The text is centre aligned, and the height, width and colour of the rectangle are 28px, 60px and lightsteelblue respectively. The ‘padding’ is an interesting feature that provides a neat way to grow a shape by a fixed amount from a specified size.

We set the border to 0px so that it doesn’t show up and a neat style (attribute?) called border-radius provides the nice rounded corners on the rectangle.

Lastly, but by no means least, the pointer-events: none line is in place to instruct the mouse event to go ‘through’ the element and target whatever is ‘underneath’ that element instead (Read more here). That means that even if the tooltip partly obscures the circle, the code will still act as if the mouse is only over the circle.

The second addition is a simple one-liner that should (for form’s sake) be placed under the parseTime variable declaration;

This line formats the date when it appears in our tooltip. Without it, the time would default to a disturbingly long combination of temporal details. In the case here we have declared that we want to see the day of the month (%e) and the full month name (%B).
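For a feel of what that combination produces, here is a small stand-in written without D3. It assumes the usual meaning of the two directives (%e as the day of the month padded with a space to two characters, %B as the full month name), and the formatTime below is my own approximation, not the D3 formatter itself.

```javascript
var monthNames = ["January", "February", "March", "April", "May", "June",
                  "July", "August", "September", "October", "November",
                  "December"];

// Stand-in for a "%e %B" time format: the day of the month (padded with a
// space when it is a single digit) followed by the full month name.
function formatTime(date) {
  var day = String(date.getDate());
  if (day.length < 2) {
    day = " " + day;
  }
  return day + " " + monthNames[date.getMonth()];
}

console.log(formatTime(new Date(2012, 11, 25))); // 25 December
```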

The third block of code is the variable declaration for ‘div’.

We can place that just after the valueline definition in the JavaScript. Again there’s not too much here that’s surprising. We tell it to attach a div element to the body element, we set the class to the tooltip class (from the CSS) and we set the opacity to zero. It might sound strange to have the opacity set to zero, but remember, that’s the natural state of a tooltip. It will live unseen until its moment of revelation arrives and it pops up!

The final block of code is slightly more complex and could be described as a mutant version of the neat little bit of code that we used to do the drawing of the dots for the scatter plot. That’s because the tooltips are all about the scatter plot circles. Without a circle to ‘mouseover’, the tooltip never appears :-).

So here’s the code that includes the scatter plot drawing (it’s included since it’s pretty much integral);

The first six lines of the code are a repeat of the scatter plot drawing script. The only change is that we’ve removed the semicolon from the cy attribute line since the code now has to carry on.

So the additions are broken into two areas that correspond to the two events, mouseover and mouseout. When the mouse moves over any of the circles in the scatter plot, the mouseover code is executed on the div element. When the mouse is moved off the circle a different set of instructions is executed.

#### on.mouseover

The .on("mouseover" line initiates the introduction of the tooltip. Then we declare the element we will be introducing (‘div’) and that we will be applying a transition to its introduction (.transition()). The next two lines describe the transition. It will take 200 milliseconds (.duration(200)) and will result in changing the element’s opacity to .9 (.style("opacity", .9);). Given that the natural state of our tooltip is an opacity of 0, this makes sense for something appearing, but it doesn’t go all the way to a solid object and it retains a slight transparency just to make it look less permanent.

The following three lines format our tooltip. The first one adds an html element that contains our x and y information (the date and the close value). Now this is done in a slightly strange way. Other tooltips that I have seen have used a ‘.text’ element instead of a ‘.html’ one, but I have used ‘.html’ in this case because I wanted to include the line break tag <br/> to separate the date and value. I’m sure there are other ways to do it, but this worked for me. The other interesting part of this line is that this is where we call our time formatting function that we described earlier. The next two lines position the tooltip on the screen and to do this they grab the x and y coordinates of the mouse when the event takes place (with the d3.event.pageX and d3.event.pageY snippets) and apply a correction in the case of the y coordinate to raise the tooltip up by the same amount as its height (28 pixels).
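The content and position described above can be sketched as a plain function. tooltipSettings is a hypothetical helper, and pageX and pageY stand in for the mouse coordinates taken from the event. In the real script these three values are applied to the div with .html and two .style calls.

```javascript
// Hypothetical helper: what the tooltip should contain and where it should
// sit, given a data point and the mouse position on the page.
function tooltipSettings(d, pageX, pageY, formatTime) {
  return {
    // Date and value separated by a line break tag.
    html: formatTime(d.date) + "<br/>" + d.close,
    left: pageX + "px",
    // Raise the tooltip by its own height (28 pixels).
    top: (pageY - 28) + "px"
  };
}

// A throwaway formatter just for this sketch.
function simpleDate(date) {
  return date.getDate() + "/" + (date.getMonth() + 1);
}

var settings = tooltipSettings(
  { date: new Date(2012, 3, 26), close: 89.7 }, 310, 140, simpleDate);
console.log(settings);
```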

#### on.mouseout

The .on("mouseout" section is slightly simpler in that it doesn’t have to do any fancy text / html / coordinate stuff. All it has to do is to fade out the ‘div’ element. And that is done by simply reversing the opacity back to 0 and setting the duration for the transition to 500 milliseconds (being slightly longer than the fade-in makes it look slightly cooler IMHO).

Right, there you go. As a description it’s ended up being a bit of a wall of text I’m afraid. But hopefully between the explanation and the example code you will get the idea. Please take the time to fiddle with the settings described here to find the ones that work for you and in the process you will reinforce some of the principles that help D3 do its thing.

There was an interesting question on d3noob.org about adding an HTML link to a tooltip. While the person asking the question had the problem pretty much solved already, I thought it might be useful for others.

The premise is that you want to add a tool tip to your visualization using the method described here, but you also want to include an HTML link in the tooltip that will link somewhere else. This might look a little like the following;

In the image above the date has been turned into a link. In this case the link goes to google.com, but that can obviously be configurable.

The full code for this example can be found on github or in the code samples bundled with this book (simple-tooltips-link.html and data.csv). A working example can be found on bl.ocks.org.

There are a few changes that we would want to make to our original tooltip code to implement this feature.

First of all, we’ll add the link to the date element. Adding an HTML link can be as simple as wrapping the ‘thing’ to be used as a link in <a> tags with an appropriate URL to go to.

The following adaptation of the code that prints the information into our tooltip code does just that;

<a href= "http://google.com"> places our first <a> tag and declares the URL and the second tag follows after the date.

The second change we will want to make is to ensure that the tooltip stays in place long enough for us to actually click on the link. The problem being solved here is that our original code relies on the mouse being over the dot on the graph to display the tooltip. If the tooltip is displayed and the cursor moves to press the link, it will move off the dot on the graph and the tooltip vanishes (Darn!).

To solve the problem we can leave the tooltip in place adjacent to a dot while the mouse roams freely over the graph until the next time it reaches a dot and then the previous tooltip vanishes and a new one appears. The best way to appreciate this difference is to check out the live example on bl.ocks.org.

The code is as follows (you may notice that this also includes the link as described above);

We have removed the .on("mouseout" portion and moved the function that it used to carry out to the start of the .on("mouseover" portion. That way the first thing that occurs when the mouse cursor moves over a dot is that it removes the previous tooltip and then it places the new one.

The last change we need to make is to remove from the <style> section the line that told the mouse to ignore the tooltip (the pointer-events: none line described earlier);

One link is interesting, but let’s face it, we didn’t go to all the trouble of putting a link into a tool tip to just go to one location. Now we shift it up a gear and start linking to different places depending on our data. At the same time (and because someone asked) we will make the link open in a new tab!

The changes to the script are fairly minor, but one fairly large change is the need to have links to go to. For this example I have added a range of links to visit to our csv file so it now looks like this;

The code change is to the piece of JavaScript where we add the HTML. This is what we end up with;

We’ve replaced the URL http://google.com with the variable for our link column d.link and we’ve also added in the target="_blank" statement so that our link opens in a new tab.
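The string that ends up inside the tooltip can be sketched as a plain function. linkedTooltipHtml is a hypothetical helper, and the data point and its link are made up. The function just joins the pieces described above: the opening tag with the link from the data and target="_blank", the date, the closing tag, a line break and the value.

```javascript
// Hypothetical helper: builds the HTML for a tooltip whose date is a link
// taken from the data and which opens in a new tab.
function linkedTooltipHtml(d, formatTime) {
  return '<a href="' + d.link + '" target="_blank">' +
         formatTime(d.date) + "</a>" +
         "<br/>" + d.close;
}

// A throwaway formatter and a made-up data point just for this sketch.
function simpleDate(date) {
  return date.getDate() + "/" + (date.getMonth() + 1);
}
var point = { date: new Date(2012, 4, 1), close: 58.13, link: "http://example.com" };

console.log(linkedTooltipHtml(point, simpleDate));
```

Since the link is dropped straight into the HTML, this approach assumes the csv file comes from a source we trust.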

The full code for this multi link example can be found on github or in the code samples bundled with this book (tooltips-link-multi.html and data-link.csv). A working example can be found on bl.ocks.org.

Hopefully that helps people with a similar desire to include links in their tooltips. Many thanks to the reader who suggested it :-).

### What are the predefined, named colours?

Throughout this document I generally use colours defined by name. This is mainly because I can, and not for any other reason. In fact there are several different ways to define colours used in D3 / JavaScript / CSS and HTML. I have no idea what the limitations for use are and / or how their use in different browsers impacts on correct representation. But I do know that they’re used widely.

There seem to be several different standards for what constitutes an authoritative list of named colours. After a cursory search I was able to find a great list on about.com and there are some nice representations on Wikipedia.

The overriding point of all this is that there’s more than one way to define colours in your graphs.

It means that considering…

.style("fill", "steelblue")

and…

.style("fill", "#4682b4")

and…

.style("fill", "rgb(70,130,180)")

All three alternatives result in the same colour being applied.
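If you would like to convince yourself that the hex and rgb forms really are the same colour, a few lines of JavaScript will do the conversion. This is only a sketch; the helper name `hexToRgb` is my own and it assumes a six digit hex string.

```javascript
// Convert a six-digit hex colour such as "#4682b4" to "rgb(r,g,b)" notation.
function hexToRgb(hex) {
    var value = parseInt(hex.slice(1), 16);   // "#4682b4" -> 0x4682b4
    var r = (value >> 16) & 255;              // top byte
    var g = (value >> 8) & 255;               // middle byte
    var b = value & 255;                      // bottom byte
    return "rgb(" + r + "," + g + "," + b + ")";
}

console.log(hexToRgb("#4682b4"));   // rgb(70,130,180)

// In a page that has loaded d3.js, d3.color can resolve the name as well.
if (typeof d3 !== "undefined") {
    console.log(d3.color("steelblue").toString());
}
```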

For a long time I didn’t actually have the images of the colours represented here in D3 Tips and Tricks, but like all things, one day I thought ‘Hey, I could just write a simple script that placed them on the screen’. So here they are :-).

I have tried to group them as ‘like’ colours per the entry in Wikipedia.

You can also see a live page with the script that produces the rectangles at bl.ocks.org.

### Selecting / filtering a subset of objects

Imagine a scenario where you want to select (or should we say filter) a particular range of objects from a larger set.

For example, what if we wanted to use our scatter plot example to show the line as normal, but we were particularly interested in the points where the value falls below 400? When the value falls below 400 we want those points highlighted with a circle, as we have done with the scatter plot points previously.

So that we end up with something that looks a little like this…

Err… Yes, for those among you who are of the observant persuasion, I have deliberately coloured them red as well (red for DANGER!).

This is a fairly simple example, but serves to illustrate the principle adequately. From our simple scatter plot example we only need to add in two lines to the block of code that draws the circles as follows;

The full code for this example can be found on github or in the code samples bundled with this book (filter-selection.html and data.csv). A working example can be found on bl.ocks.org.

The first added line uses the .filter function to act on the data points and according to the arguments passed to it in this case, only return those where the value of d.close is less than 400 (return d.close < 400).

The second added line simply colours the circles red (.style("fill", "red")).
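A minimal sketch of how those two lines might sit in the block that draws the circles follows. It assumes that `svg`, `data`, `x` and `y` are the selection, data and scales already set up by the scatter plot example; the function names and the radius of 3.5 are my own choices for illustration.

```javascript
// The test used by the filter: keep only points that closed below 400.
function isBelow400(d) {
    return d.close < 400;
}

// Draw red circles for the filtered points only.
function drawLowPoints(svg, data, x, y) {
    return svg.selectAll("dot")
        .data(data)
      .enter().append("circle")
        .filter(isBelow400)                 // first added line
        .style("fill", "red")               // second added line
        .attr("r", 3.5)                     // assumed radius
        .attr("cx", function(d) { return x(d.date); })
        .attr("cy", function(d) { return y(d.close); });
}

// The same test works on a plain array (values are made up)
var sample = [{ close: 350 }, { close: 400 }, { close: 636 }];
console.log(sample.filter(isBelow400).length);   // 1
```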

That’s all there is to it. Pretty simple, but the filter function can be very powerful when used wisely.

### Select items with an IF statement.

The filtering – selection section above is a good way to adapt what you see on a graph, but so is a more familiar friend… The ‘if’ statement.

An ‘if’ statement will act to carry out a task in a particular way dependent on a condition that you specify.

Starting with the simple scatter plot example all we have to do is include the if statement in the block of code that draws the circles. Here’s the entire block with the additions highlighted;

Our first added line introduces the style modifier and the rest of the code acts to provide a return for the ‘fill’ attribute.

The second line introduces our if statement. There’s very little difference using if statements between languages. Just look out for maintaining the correct syntax and you should be fine. In this case we’re asking if the value of ‘d.close’ is less than or equal to 400 and if it is it will return the "red" statement for our fill.

The third line covers our rear and makes sure that if the colour isn’t going to be red, it’s going to be black. The last line just closes the style and function statements.
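Pulled out on its own, the logic described above might look something like the following sketch. The function name `fillFor` is an assumption of mine; in the graph it would be handed to the circles with `.style("fill", fillFor)`.

```javascript
// Return the fill colour for one data point.
function fillFor(d) {
    if (d.close <= 400) { return "red"; }   // at or below 400: red
    else { return "black"; }                // everything else: black
}

console.log(fillFor({ close: 400 }));   // red
console.log(fillFor({ close: 401 }));   // black
```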

The result?

Aww….. nice.

The full code for this example can be found on github or in the code samples bundled with this book (if-selection.html and data.csv). A working example can be found on bl.ocks.org.

What if we wanted to have all the points where close was less than 400 red and all those where close was greater than 620 green? Oh yeah! Now we’re talking.

So with one small change to the if statement;
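One way that change could be written is sketched here. The function name is hypothetical, and whether the boundary values of exactly 400 and 620 are included is an assumption on my part (I have kept them inclusive to match the earlier test).

```javascript
// Return red for low values, green for high values and black in between.
function threeWayFill(d) {
    if (d.close <= 400) { return "red"; }
    else if (d.close >= 620) { return "green"; }
    else { return "black"; }
}

console.log(threeWayFill({ close: 350 }));   // red
console.log(threeWayFill({ close: 500 }));   // black
console.log(threeWayFill({ close: 630 }));   // green
```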

Check it out…

Nice.

### Applying a colour gradient to a line based on value.

I know that we were impressed with the changing dots in a scatter plot based on the value. But could we go one better?

How about we try to reproduce the same effect but by varying the colour of the plotted line. This is a neat feature and a useful example of the flexibility of d3.js and SVG in general. I used the appropriate bits of code from Mike Bostock’s Threshold Encoding example. And I should take the opportunity to heartily recommend browsing through his collection of examples on bl.ocks.org.

The full code for this example can be found on github or in the code samples bundled with this book (line-graph-gradient.html and data.csv). A working example can be found on bl.ocks.org.

Here then is a plotted line that is red below 400, green above 620 and black in between.

How cool is that?

Enough beating around the bush, how is the magic line produced?

Starting with our simple line graph, there are only two blocks of code to go in. One is CSS in the <style> area and the second is a tricky little piece of code that deals with gradients.

First the CSS.

This block will go in the <style> area.

There’s the fairly standard fill of none and a stroke width of 2 pixels, but the stroke: url(#line-gradient); is something different.

In this case the stroke (the colour of the line) is being determined at a link within the page which is set by the anchor #line-gradient. We will see shortly that this is in our second block of code, so the colour is being defined in a separate portion of the script.

And now the JavaScript gradient code;

There’s our anchor on the third line!

But let’s not get ahead of ourselves. This block should be placed after the x and y domains are set, but before the line is drawn.

Our second line adds our linear gradient. Gradients consist of continuously smooth colour transitions along a vector from one colour to another. We can have a linear or radial gradient and depending on which you select, there are a few options to define. There is some great information on gradients at http://www.w3.org/TR/SVG/pservers.html (more than I ever thought existed).

The third line (.attr("id", "line-gradient")) sets our anchor for the CSS that we saw earlier.

The fourth, fifth and sixth lines define the bounds of the area over which the gradient will act, since the coordinates x1, y1, x2, y2 describe that area. The values for y1 (0) and y2 (1000) are used more for convenience to align with our data (which has a maximum value around 630 or so). For more information on the ‘gradientUnits’ attribute I found this page useful: https://developer.mozilla.org/en-US/docs/SVG/Attribute/gradientUnits. We’ll come back to the coordinates in a moment.

The next block selects all the ‘stop’ elements for the gradients. These stop elements define where on the range covered by our coordinates the colours start and stop. These have to be defined as either percentages or numbers (where the numbers are really just percentages in disguise, i.e. 45% = 0.45).

The best way to consider the stop elements is in conjunction with the gradientUnits. The image following may help.

In this case our coordinates describe a vertical line from 0 to 1000. Our colours transition from red (0) to red (400) at which point they change to black (400) and this will continue until it gets to black (620). Then this changes to green (620) and from there, any value above that will be green.

After defining the stop elements, we enter and append the elements to the gradient (.enter().append("stop")) with attributes for offset and colour that we defined in the stop elements area.
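To tie the pieces together, here is a sketch of how the stops and the gradient might be built. The helper `bandsToStops` is my own invention (it simply turns the 400 / 620 thresholds into percentage offsets along the 0 to 1000 vector), and `svg` and `y` are assumed to be the selection and y scale from the simple line graph.

```javascript
// Turn value bands into hard-edged gradient stops.
// Each band is { upTo: value, color: name }; `max` is the value at the
// far end of the gradient vector (1000 in the text above).
function bandsToStops(bands, max) {
    var stops = [];
    var start = 0;
    bands.forEach(function(band) {
        // Two stops per band, so the colour is constant inside it
        stops.push({ offset: (start * 100 / max) + "%", color: band.color });
        stops.push({ offset: (band.upTo * 100 / max) + "%", color: band.color });
        start = band.upTo;
    });
    return stops;
}

var stops = bandsToStops([
    { upTo: 400,  color: "red" },
    { upTo: 620,  color: "black" },
    { upTo: 1000, color: "green" }
], 1000);

// Append the gradient definition to the SVG.
function addLineGradient(svg, y) {
    svg.append("linearGradient")
        .attr("id", "line-gradient")                 // the anchor for the CSS
        .attr("gradientUnits", "userSpaceOnUse")
        .attr("x1", 0).attr("y1", y(0))
        .attr("x2", 0).attr("y2", y(1000))
      .selectAll("stop")
        .data(stops)
      .enter().append("stop")
        .attr("offset", function(d) { return d.offset; })
        .attr("stop-color", function(d) { return d.color; });
}

console.log(stops.map(function(s) { return s.offset + " " + s.color; }));
```

Each colour appears twice (once where its band starts and once where it ends) which is what gives the sharp change at 40% and 62% rather than a gradual blend.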

Now, that IS cool, but by now, I hope that you have picked up that a gradient function really does mean a gradient, and not just a straight change from one colour to another.

So, let’s try changing the stop element offsets to the following (and making the stroke-width slightly larger to see more clearly what’s going on);

And here we go…

I have tended to find that I need to have a good think about how I set the offsets and bounds when doing this sort of thing since it can get quite complicated quite quickly :-)

### Applying a colour gradient to an area fill.

The previous example of a varying gradient on a line is neat, but hopefully you’re already thinking “Hang on, can’t that same thing be applied to an area fill?”.

Damn! You’re catching on.

To do this there are only a few things we need to change;

First of all the CSS for the line needs to be amended to refer to the area. So this…

…gets changed to this…

We’ve defined the styles for the area this time, but instead of the stroke being defined by the separate script, now it’s the area. While we’ve changed the url name, it’s actually the same piece of code, with a different id (because it seemed wrong to be talking about an area when the label said line). We’ve also set the stroke width to zero, because we don’t want any lines around our filled area.

Now we want to take the block of code that defined our line…

… and we need to replace it with the standard block that defined an area fill.

So we’re not going to be drawing a line at all. Just the area fill.

Next, as I mentioned earlier, we change the id for the linearGradient block from "line-gradient" to "area-gradient"

And lastly, we remove the block of code that drew the line and replace it with a block that draws an area. So change this….

… to this;
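A sketch of the area version follows, wrapped in a function so that its inputs are explicit. It assumes that `x`, `y` and `height` come from the simple line graph, that the data has `date` and `close` columns, and that the CSS class for the fill is called `area`; the function name is hypothetical.

```javascript
// Define an area generator and draw the filled area with it.
function drawGradientArea(d3, svg, data, x, y, height) {
    var area = d3.area()
        .x(function(d) { return x(d.date); })
        .y0(height)                                // bottom edge of the fill
        .y1(function(d) { return y(d.close); });   // top edge follows the data

    svg.append("path")
        .data([data])
        .attr("class", "area")                     // styled by the CSS rule
        .attr("d", area);

    return area;
}
```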

And then sit back and marvel at your creation;

The full code for this example can be found on github or in the code samples bundled with this book (area-graph-gradient.html and data.csv). A working example can be found on bl.ocks.org.

For a slightly ‘nicer’ looking example, you could check out a variation of one of Mike Bostock’s (v3) originals here; http://bl.ocks.org/4433087.

### Transitions

A transition in d3 is an application of an animation to an element on the page. For the purpose of demonstration we can think of an element being one of the common shapes and objects which include circles, ellipses, rectangles, lines, polylines, polygons, text and paths. This is a gross oversimplification as transitions can be applied in far more complex ways, but this will help us get started. An animation could be described as a change in an attribute or style of an element over time.

If we use a circle as an example we know that a circle is described by three required attributes;

• cx: The position of the centre of the circle in the x direction (left / right) measured from the left side of the screen.
• cy: The position of the centre of the circle in the y direction (up / down) measured from the top of the screen.
• r: The radius of the circle from the cx, cy position to the perimeter of the circle.

To animate a circle we would therefore be changing (or transitioning) one of those attributes over time.

The following JavaScript will draw a simple blue circle with radius 20 pixels at the position 40,250

To transition that circle from left to right we would change the cx attribute by simply including the transition instruction, the new value of the attribute to be changed and the time that it should be completed in;

The total amount of code required is;

And seen on the web page our circle moves from left to right.

A transition can be of more than one attribute at the same time. If we add lines to change the radius and fill colour as well we will have some JavaScript that looks a bit like this;

And when we load the page we see our circle move from left to right while at the same time increasing in radius and changing colour from blue to red.
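For reference, a sketch of a transition that changes all three properties at once is shown below. The starting values come from the description above; the end position of 920, the end radius of 40 and the duration of 4000 milliseconds are assumptions for illustration, as is the function name.

```javascript
// Draw a circle and transition three properties at the same time.
// `svg` is assumed to be an SVG selection.
function growAndMove(svg) {
    return svg.append("circle")
        .attr("fill", "blue")     // starting colour
        .attr("r", 20)            // starting radius
        .attr("cx", 40)           // starting position
        .attr("cy", 250)
      .transition()               // everything below is animated
        .duration(4000)           // assumed duration in milliseconds
        .attr("cx", 920)          // assumed end position
        .attr("r", 40)            // assumed end radius
        .attr("fill", "red");     // end colour
}

// Only runs in a browser page that has loaded d3.js
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
    growAndMove(d3.select("body").append("svg")
        .attr("width", 960)
        .attr("height", 500));
}
```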

#### Transition Chaining

Instead of having multiple attributes / styles changing at once we can stagger them using transition chaining. This is where we can employ several different transitions to an element one after the other.

For example, the following JavaScript will move our circle from left to right then it will increase the radius from 20 to 40 and then it will change its colour from blue to red.
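A chain of that kind could be sketched as follows. Each call to `.transition()` on an existing transition schedules a new one that begins when the previous one ends. The durations, the end position of 920 and the function name are assumptions.

```javascript
// Move, then grow, then recolour, one step after another.
// `svg` is assumed to be an SVG selection.
function chainTransitions(svg) {
    return svg.append("circle")
        .attr("fill", "blue")
        .attr("r", 20)
        .attr("cx", 40)
        .attr("cy", 250)
      .transition()               // first: move left to right
        .duration(4000)
        .attr("cx", 920)
      .transition()               // second: increase the radius
        .duration(4000)
        .attr("r", 40)
      .transition()               // third: change the colour
        .duration(4000)
        .attr("fill", "red");
}
```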

#### Transition Easing

The reader who runs the code or checks out the simple example here will notice that our circle does not move from one side to the other at a constant speed. It starts off slowly, builds up speed towards the middle of its travels and then slows down before stopping. This gives the movement a pleasing appearance, but it is an example of the ‘easing’ of the transition from one point to another.

To apply a linear motion to the movement we can introduce .ease(d3.easeLinear) to the code as follows;

The easing of an element describes a distortion in the apparent flow of time. There are a range of different types of easing described in the d3 wiki. Most are representative of a function although there are some which are intended to represent specific real-world motions.

• linear
• cubic
• poly
• sin
• exp
• circle
• bounce
• back
• elastic

Each easing (except linear) can be further modified using ‘In’, ‘Out’ or ‘InOut’, which allows for variations in the animation that can give the appearance of starting or stopping at different points in the curves or, in the case of ‘InOut’, of going through a complete transition. For example, .ease(d3.easeExpIn) or .ease(d3.easePolyInOut).

If an easing function is not specified, the default used is cubic and if ‘In’, ‘Out’ or ‘InOut’ aren’t specified, the default is ‘InOut’ except for bounce and elastic which use ‘Out’. Therefore we can use .ease(d3.easePoly) which is an alias for .ease(d3.easePolyInOut). With its default exponent of 3, ‘easePoly’ is also equivalent to ‘easeCubic’ (and vice versa).
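To make the idea of ‘distorting time’ a little more concrete, the sketch below writes out two easing functions by hand. Each takes a normalised time between 0 and 1 and returns the eased time. These are my own versions, intended to behave like the linear and default cubic in-out easings rather than being copied from the library.

```javascript
// Linear easing: eased time is the same as real time.
function linear(t) {
    return t;
}

// Cubic in-out: slow start, fast middle, slow finish.
function cubicInOut(t) {
    t = t * 2;
    if (t <= 1) { return (t * t * t) / 2; }   // first half accelerates
    t = t - 2;
    return (t * t * t + 2) / 2;               // second half decelerates
}

[0, 0.25, 0.5, 0.75, 1].forEach(function(t) {
    console.log(t, linear(t), cubicInOut(t));
});
// A quarter of the way through the duration, cubicInOut returns 0.0625,
// so the circle has covered only about 6% of the distance.
```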

To get a good impression of the way that each easing method is represented, Mike Bostock has an ‘Easing Explorer’ block set up here. There is also a block to do a side by side comparison here (this is also in the code samples bundled with this book (transition-easing-multiple.html)).

#### Looping a Transition

We can use a transition to create a convincing impression of an element that is in a constant rate of change in a looping condition. We achieve this by creating a transition chain that starts and ends in the same state and then we instruct the transition code to repeat itself.

In the example we will have a circle that moves from left to right and then back again constantly.

The JavaScript starts by creating an SVG element;

We then declare the function circleTransition which starts by creating the component that defines the circle and gives it a colour and radius;

We then call a sub function repeat that performs the transition chaining loop. Then the looping transition is declared;

Here we can see the familiar parts of a transition where the attributes and transition details are declared in a chain so that when it completes, the circle has moved from the left to the right and then back to the starting position. At that point the code listens for the end of the transformations, and when this occurs it calls the transition function repeat again. This will continue to cycle indefinitely.

The last part of the code is the instruction to call our initial function;

Because the code is executing asynchronously, it can continue to oscillate while other code could be carrying out other tasks.
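Putting the pieces of the description together gives something like the sketch below. To keep it self-contained the SVG selection is passed in as an argument rather than being a global variable, and the colour, positions and durations are assumed values.

```javascript
// Create a circle and bounce it from left to right and back, forever.
function circleTransition(svg) {
    var timeCircle = svg.append("circle")
        .attr("fill", "steelblue")
        .attr("r", 20);
    repeat();

    function repeat() {
        timeCircle
            .attr("cx", 40)          // start at the left
            .attr("cy", 250)
          .transition()              // first leg
            .duration(2000)
            .attr("cx", 920)         // move to the right
          .transition()              // second leg
            .duration(2000)
            .attr("cx", 40)          // return to the left
            .on("end", repeat);      // when finished, start again
    }
}

// Only runs in a browser page that has loaded d3.js
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
    circleTransition(d3.select("body").append("svg")
        .attr("width", 960)
        .attr("height", 500));
}
```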

### Show / hide an element by clicking on another element

This is a trick that I found I wanted to implement in order to present a graph with a range of lines and to then provide the reader with the facility to click on the associated legend to toggle the visibility of the lines off and on as required.

The example we’ll follow is our friend from earlier, a slightly modified example of the graph with two lines.

In this example we will be able to click on either of the two titles at the bottom of the graph (‘Blue Line’ or ‘Red Line’) and have it toggle the respective line and Y axis.

##### The code

The code for the example is available online at bl.ocks.org or GitHub. It is also available as the file ‘show-hide.html’ that can be downloaded when you download the book from Leanpub.

There are more changes in the example code than we will explain below such as the addition of CSS to the <style> area and the code that allows both of the lines to be shown / hidden (we’ll only go through the blue line code).

There are two main parts to implementing this technique. Firstly we have to label the element (or elements) that we wish to show / hide and then we have to give the object that will get clicked on the attribute that allows it to recognise a mouse click and the code that it subsequently uses to show / hide our labelled element.

Labelling the element that is to be switched on and off is dreadfully easy. It simply involves including an id attribute to an element that identifies it uniquely.

In the example above we have applied the id blueLine to the path that draws the blue line on our graph.

The second part is a little trickier. The following is the portion of JavaScript that places our text label under the graph. The only part of it that is unusual is the .on("click", function() section of the code.

When we click on our ‘Blue Line’ text element the .on("click", function() section executes.

We’re using a short-hand version of the if statement a couple of times here. Firstly we check to see if the variable blueLine.active is true or false and if it’s true it gets set to false and if it’s false it gets set to true (not at all confusing).

Then after toggling this variable we set the value of newOpacity to either 0 or 1 depending on whether active is false or true (the second short-hand JavaScript if statement).

We can then select our identifiers that we have declared using the id attributes in the earlier pieces of code and modify their opacity to either 0 (off) or 1 (on)

Lastly we update our blueLine.active variable to whatever the active state is so that it can toggle correctly the next time it is clicked on.
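The toggling logic can be sketched on its own as shown below. It assumes that the flag starts out unset while the line is visible, so in this sketch ‘active’ effectively means ‘currently hidden’. The id `blueAxis` for the Y axis, the `state` object and the function names are hypothetical; only the id `blueLine` comes from the text.

```javascript
// Work out the next state from the current one.
function nextState(isActive) {
    var active = isActive ? false : true;   // flip the flag
    var newOpacity = active ? 0 : 1;        // 0 hides, 1 shows
    return { active: active, opacity: newOpacity };
}

// Build the function that would be passed to .on("click", ...).
function makeClickHandler(d3, state) {
    return function() {
        var next = nextState(state.active);
        d3.select("#blueLine").style("opacity", next.opacity);
        d3.select("#blueAxis").style("opacity", next.opacity);
        state.active = next.active;         // remember for the next click
    };
}

console.log(nextState(undefined));   // { active: true, opacity: 0 }
console.log(nextState(true));        // { active: false, opacity: 1 }
```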

Quite a neat piece of code. Kudos to Max Leiserson for providing the example on which it is largely based in an answer to a question on Stack Overflow.

### Using HTML inputs with d3.js

Part of the attraction of using technologies like d3.js is that it expands the scope of what is possible in a web page. At the same time, there are many different options for displaying content on a page and plenty of ways of interacting with it.

One of the most basic capabilities has been the use of HTML elements that allow the entry of data on a page. This can take a range of different forms (pun intended) and the <input> tag is one of the most basic.

#### What is an HTML input?

An HTML input is an element in HTML that allows a user to enter data on a web page. There are a range of different input types (with varying degrees of compatibility with browsers) and they are typically utilised inside a <form> element.

For example the following code places two fields on a web page so that a user can enter their first and last names in separate boxes;

The page would then display the following;

The range of input types is large and includes;

• text: A simple text field that a user can enter information into.
• radio: Buttons that let a user select only one of a limited number of choices.
• button: A clickable button that can activate JavaScript.
• range: A slider control for setting a number whose exact value is not important.
• number: A field for entering a number or toggling a number up and down.

… and many more. To check out others and get further background, it would be worthwhile visiting the Mozilla developer pages or w3schools.com.

While d3.js has the power to control and manipulate a web page to an extreme extent, sometimes it’s desirable to use a simple process to get a result. The following explanations will demonstrate a simple use case linking an HTML input with a d3.js element and will go on to provide examples of using multiple inputs, affecting multiple elements and using different input types. The examples are deliberately kept simple. They are intended to demonstrate functionality and to provide a starting position for you to go forward :-).

#### Using a range input with d3.js

The first example we will follow will use a range input to adjust the radius of a circle.

##### The code

The following is the full code for the example. A live version is available online at bl.ocks.org or GitHub. It is also available as the file ‘input-radius.html’ as a separate download with the book D3 Tips and Tricks v4.x. A copy of the files that appear in the book can be downloaded (in a zip file) when you download the book from Leanpub.

##### The explanation

As with the other examples in the book I will not go over some of the simpler lines of code that are covered in greater detail in earlier sections of the book and will concentrate on those sections that contain new concepts, code or look like they might need expanding :-).

The first section is the portion that sets out the html range input;

The entire block is enclosed in a paragraph (<p>) tag so that it appears on a single line. It can be broken down into the label that occurs before the input slider which is given the id nRadius-value and the input proper.

The for attribute of the label tag matches the id attribute of the input element to bind them together. This allows us to update the text later as the slider is moved.

The input tag can include four attributes that specify restrictions on the operation of the slider;

• max: specifies the maximum value allowed
• min: specifies the minimum value allowed
• step: specifies the size of the interval between values as you move the slider
• value: specifies the default value

The ids supplied for both the label and the input are important since they provide the reference for our d3.js script.

The first portion of our JavaScript is fairly routine if you’ve been following along with the rest of the book.

We append an SVG element to the body of our page and then we append a circle with some particular styling to the SVG element.

Then things start to get more interesting…

We select our input using the id that we had declared earlier in the html (nRadius). Then we use the .on operator which adds what is called an ‘event listener’ to the element so that when there is a change in the element (in this case an adjustment of the slider of the input) a function is called (function()) that in turn calls the update function with the value from the input (+this.value). We haven’t seen the update function yet, but never fear, it’s coming.

We also call the update function with a specific value in the next line;

This might seem slightly redundant, but unless the function gets a value, the text associated with the range input doesn’t get a reading and remains on ‘…’ until the slider is moved.

Lastly we have our update function;

The first part of the function selects the label associated with our input (with the id, nRadius-value) and applies the value that has been passed into the function (nRadius). The next line selects the input itself and applies the value to it (this would be the equivalent of having value="<number here>" as a property in the html).

Lastly, we select the circle element and apply the new radius value based on our input value nRadius (.attr("r", nRadius)).

And there we have it, a fully adjustable radius for our circle controlled with an HTML input.
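Collected into one place, the wiring described above might look like this sketch. The ids `nRadius` and `nRadius-value` come from the text; the wrapper function, the use of the `input` event and the starting radius of 120 are assumptions.

```javascript
// Connect the range input to the circle's radius.
// `d3` is the library object and `svg` the selection holding the circle.
function wireRadiusInput(d3, svg) {
    function update(nRadius) {
        d3.select("#nRadius-value").text(nRadius);          // label text
        d3.select("#nRadius").property("value", nRadius);   // slider position
        svg.selectAll("circle").attr("r", nRadius);         // circle radius
    }

    d3.select("#nRadius").on("input", function() {
        update(+this.value);    // unary plus turns the string into a number
    });

    update(120);                // assumed starting radius
    return update;
}
```

The `+` in front of `this.value` matters because an input always reports its value as a string.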

#### Using more than one input

In this example we will use two separate inputs (range type) to adjust the height and width of a rectangle.

This is not too much of a stretch from the previous single input example with the radius of a circle, but it may be useful to reinforce the concept and illustrate something slightly different.

##### The code

The following is the full code for the example. A live version is available online at bl.ocks.org or GitHub. It is also available as the file ‘input-double.html’ as a separate download with D3 Tips and Tricks v4.x. A copy of the files that appear in the book can be downloaded (in a zip file) when you download the book from Leanpub.

##### The explanation

For the sake of brevity, this explanation will simply concentrate on the differences between the previous single input example and this one.

The declarations for the inputs in the HTML at the start of the code are simply duplicates of each other in terms of function;

The only significant difference is the declaration of the ids for each input and their respective labels.

The JavaScript selection of the inputs is more duplication;

Again the only substantive difference is the use of the appropriate id values.

The updating of the width and height is done via two different functions;

The rectangle is selected using a common rect designator, so multiple rectangles could be controlled. But each function controls only a specific attribute (height or width).
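As a sketch of that arrangement, the two update functions could look like the following. The ids (`nHeight`, `nWidth` and their `-value` labels), the starting sizes and the wrapper function are all hypothetical names chosen to mirror the radius example.

```javascript
// Connect two range inputs to the height and width of the rectangle(s).
function wireSizeInputs(d3, svg) {
    function updateHeight(nHeight) {
        d3.select("#nHeight-value").text(nHeight);
        d3.select("#nHeight").property("value", nHeight);
        svg.selectAll("rect").attr("height", nHeight);   // height only
    }

    function updateWidth(nWidth) {
        d3.select("#nWidth-value").text(nWidth);
        d3.select("#nWidth").property("value", nWidth);
        svg.selectAll("rect").attr("width", nWidth);     // width only
    }

    d3.select("#nHeight").on("input", function() { updateHeight(+this.value); });
    d3.select("#nWidth").on("input", function() { updateWidth(+this.value); });

    updateHeight(150);   // assumed starting values
    updateWidth(100);

    return { updateHeight: updateHeight, updateWidth: updateWidth };
}
```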

#### Rotate text with an input

This example is really just a derivative of the adjustment of a single attribute of an element.

I happen to think it’s just a little bit ‘neater’ because it includes text, but in reality, it’s just another attribute that can be adjusted.

Here we let our range input adjust the rotation of a piece of text.

##### The explanation

We’ll dispense with the full code listing since it’s just a regurgitation of the adjusting of the radius of the circle example, but the code for the example is available online at bl.ocks.org or GitHub. It is also available as the file ‘input-text-rotate.html’ as a separate download with D3 Tips and Tricks v4.x. A copy of the files that appear in the book can be downloaded (in a zip file) when you download the book from Leanpub.

The only thing that is even slightly different (other than some naming conventions) is the initial drawing of the text…

… and the update function;
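In outline, an update function of this kind only has to rebuild the transform attribute with the new angle. The sketch below assumes the text is positioned at 300,150 and that the input and its label use the hypothetical ids `nAngle` and `nAngle-value`.

```javascript
// Build the transform string for a given rotation in degrees.
function rotateTransform(nAngle) {
    return "translate(300,150) rotate(" + nAngle + ")";
}

// Apply the angle to the label, the input and the text element.
function updateAngle(d3, svg, nAngle) {
    d3.select("#nAngle-value").text(nAngle);
    d3.select("#nAngle").property("value", nAngle);
    svg.select("text").attr("transform", rotateTransform(nAngle));
}

console.log(rotateTransform(45));   // translate(300,150) rotate(45)
```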

#### Use a number input with d3.js

There are obviously different input types that can be implemented. The following example still rotates our text, but uses a number type of input to do it;

We have set the step value to speed things up a bit when rotating, but it’s completely optional.

The input itself can be adjusted up or down using a mouse click or have a number typed into the input box.

This type of input is slightly different from the range type since it isn’t fully supported under Firefox and as a result when I was testing it the arrow keys for going up and down weren’t present.

The full code for the example is available online at bl.ocks.org or GitHub. It is also available as the file ‘input-number-text.html’ as a separate download with D3 Tips and Tricks v4.x. A copy of the files that appear in the book can be downloaded (in a zip file) when you download the book from Leanpub.

#### Change more than one element with an input

The final example looking at using HTML inputs with d3.js incorporates a single input acting on two different elements. This might seem self evident, but if you’re as unfamiliar with HTML as I am (it’s embarrassing I know, but what can you do?) it may be of assistance.

The end result is to produce a single slider as a range input that rotates two separate text objects in different directions simultaneously.

##### The code

The following is the full code for the example. A live version is available online at bl.ocks.org or GitHub. It is also available as the file ‘input-text-rotate-2.html’ as a separate download with D3 Tips and Tricks v4.x. A copy of the files that appear in the book can be downloaded (in a zip file) when you download the book from Leanpub.

##### The explanation

The explanation for this example differs from the others in the way that the d3.js elements (the two pieces of text) are initially appended and then updated.

When they are initially drawn…

… both elements are declared with a class attribute that serves as a reference for the future updating. Here, the text ‘d3.js’ is given a class name of d3js and the text ‘d3noob.org’ is given a class name of d3noob.

Then when we call the update function each of the two text elements is adjusted separately by selecting each based on the class name that was applied in the initial setup;

So the ‘d3.js’ text is selected using text.d3js and ‘d3noob.org’ is selected using text.d3noob. That’s a pretty neat trick and a good lesson for applying specific transformations to specific objects.
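A sketch of an update function along those lines is shown below. The class selectors `text.d3js` and `text.d3noob` come from the text; the position of 300,150, the trick of using 360 minus the value to get the opposite direction, and the function names are assumptions.

```javascript
// Work out the two transforms for one slider value.
function rotationsFor(nValue) {
    return {
        d3noob: "translate(300,150) rotate(" + nValue + ")",
        d3js:   "translate(300,150) rotate(" + (360 - nValue) + ")"
    };
}

// Apply each transform to the element with the matching class.
function updateBoth(svg, nValue) {
    var transforms = rotationsFor(nValue);
    svg.select("text.d3noob").attr("transform", transforms.d3noob);
    svg.select("text.d3js").attr("transform", transforms.d3js);
}

console.log(rotationsFor(90).d3js);   // translate(300,150) rotate(270)
```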

### Add an HTML table to your graph

So graphs and graphics are D3’s bread and butter you’d think. Hmm…

Well yes and no.

Yes D3 has extraordinary powers for presenting and manipulating images in a web page. But if you’ve read through the entirety of the d3.js main site (haven’t we all) you will recall that D3 actually stands for Data Driven Documents. It’s not necessarily about the pretty pictures and the swirling cascade of colour. It’s about generating something in a web browser based on data.

This transitions nicely into consideration of adding a table of information that can accompany your graph (it could just as easily (or easier) be stand alone, but for the sake of continuity, we’ll use the graph).

What we’ll do is add the data that we’ve used to make our simple graph under the graph itself. To make sure that it’s all nicely aligned, we’ll place it in a table.

It should end up looking a little like this (and this has been cropped slightly at the bottom to avoid expanding the page with rows of numbers / dates).

The code was drawn from an example provided by Shawn Allen on Google Groups. In fact, the post itself is an excellent one if you are considering creating a table straight from a csv file.

#### HTML Tables

Tables are made up of rows, columns and data (that goes in each cell). All you need to do to successfully place a table on a web page is to lay out the rows and columns in a logical sequence using the appropriate HTML tags and you’re away.

For example here’s the total HTML code for a web page to display a simple table;

This will result in a table that looks a little like this in a web browser;

| Header 1 | Header 2 |
| --- | --- |
| row 1, cell 1 | row 1, cell 2 |
| row 2, cell 1 | row 2, cell 2 |

The entire table itself is enclosed in <table> tags. Each row is enclosed in <tr> tags. Each row has two items which equate to the two columns. Each piece of data for each cell is enclosed in a <td> tag except for the first row, which is a header and therefore has a special tag <th> that denotes it as a header making it bold and centred. For the sake of ease of viewing we have told the table to place a border around each cell and we do this in the first <table> tag with the border="1" statement (although in this book view it may be absent).
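To see how those tags nest, here is a small JavaScript sketch that assembles the same kind of markup as a string. It is not the table building code used later with the graph; the function name `tableHtml` is my own and it simply illustrates the row and cell structure.

```javascript
// Build the HTML for a simple bordered table.
function tableHtml(headers, rows) {
    var html = '<table border="1">\n';

    // Header row: one <th> per column
    html += "  <tr>" + headers.map(function(h) {
        return "<th>" + h + "</th>";
    }).join("") + "</tr>\n";

    // Data rows: one <td> per cell
    rows.forEach(function(row) {
        html += "  <tr>" + row.map(function(cell) {
            return "<td>" + cell + "</td>";
        }).join("") + "</tr>\n";
    });

    return html + "</table>";
}

var simpleTable = tableHtml(
    ["Header 1", "Header 2"],
    [["row 1, cell 1", "row 1, cell 2"],
     ["row 2, cell 1", "row 2, cell 2"]]
);
console.log(simpleTable);
```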

There are three main things you need to do to the basic line graph to get your table to display.

1. Add some CSS styling for the table
2. Add some table building d3.js code
3. Make a small but cunning change…

There is a copy of the code and the data file for this example at github and in the code samples bundled with this book (simple-graph-plus-table.html and data.csv). A live example can be found on bl.ocks.org.

#### First the CSS

This just helps the table with formatting and making sure the individual cells are spaced appropriately;

This sets a padding of 1 px around each cell and 4 px between each column.

I’ve placed this portion of CSS at the end of our <style> section.

#### Now the d3.js code

Oki doki… Hopefully you have a loose understanding of the html layout of a table as explained above, but if not you can always go with the ‘it just works’ approach.

Here’s what we should add into our simple graph example;

And we should take care to add it into the code at the end of the portion where we’ve finished drawing the graph, but before the enclosing curly and regular brackets that complete the portion of the graph that has loaded our data.csv file. This is because we want our new piece of code to have access to that data and if we place it after those brackets it won’t know what data to display.