I share with you the five mistakes I made in data visualization from the conception of a custom graph to its implementation in D3.js.
Data visualization libraries allow developers to represent complex datasets in a simple, straightforward, and instructive way. Histograms, pie charts, trend lines, you name it… All these graphs help your users get the most important information from a complex dataset at a glance.
However, datasets may be much more structured and complex than a simple curve. In my case, I recently had to build a conversion funnel showing the transformation of marketing contacts into leads, then into actual sales.
This article will guide you from the design to the implementation of this graph. The five key lessons I took away from this experience in data visualization were:
Challenge your need
Choose an appropriate data structure
Incrementally build your graph from very easy forms
Uncouple coordinates calculation and drawing
Plug to real data from the start
I use TypeScript along with Angular (version 4+) and D3.js (version 3+). Still, you can easily transpose these lessons to any other language or data visualization framework.
Do you really need that over-engineered fancy graph?
Maybe a higher-level library or a similar template exists for the purpose. Building from an existing library or template would considerably decrease the time you spend developing and then maintaining the graph.
Maybe you could use a more standard representation instead of trying to reinvent the wheel. Standard data visualization techniques have proven effective for displaying information, and D3.js implements most of them.
Asking yourself those questions should be the very first step before diving into the code of your graph.
In my case, user tests with sketch prototypes did confirm the need for a funnel. As for using libraries, we first thought of using d3-sankey. We spent a few hours investigating before finding out that it did not fully cover our needs. In particular, the library doesn’t handle links with decreasing height, which was one of the expected features.
The data structure is the direct representation of the graph as it should be sent by the backend. For a curve, it’s a set of x and y-coordinates. For a pie chart, it’s a list of labels and percentages that add up to 100%. For your graph, it will probably be a bit more complex.
During that phase, you will also come up with the vocabulary you will use throughout your code and your tests. In my case, I identified three main components of the funnel:
bars that represent either contacts, leads or sales,
nodes with different values inside each bar,
links that are streams between nodes.
The data structure that naturally emerged was a JSON object representing the three levels of abstraction (bars, nodes, and links).
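To make the three levels concrete, here is a minimal sketch of how such a payload could be typed in TypeScript. All field names (`id`, `label`, `value`, `source`, `target`) and the sample values are assumptions for illustration, not the structure actually used in the project.

```typescript
// Hypothetical shape of the funnel payload; every field name is an assumption.
interface FunnelNode {
  id: string;
  label: string;
  value: number; // number of contacts, leads, or sales held by this node
}

interface FunnelBar {
  id: string;
  label: string; // e.g. "Contacts", "Leads", "Sales"
  nodes: FunnelNode[];
}

interface FunnelLink {
  source: string; // id of the node the stream leaves
  target: string; // id of the node the stream enters
  value: number;
}

interface FunnelData {
  bars: FunnelBar[];
  links: FunnelLink[];
}

// Made-up sample, only here to show the nesting.
const sampleFunnel: FunnelData = {
  bars: [
    { id: 'contacts', label: 'Contacts', nodes: [{ id: 'c1', label: 'Email', value: 80 }] },
    { id: 'leads', label: 'Leads', nodes: [{ id: 'l1', label: 'Qualified', value: 20 }] },
  ],
  links: [{ source: 'c1', target: 'l1', value: 20 }],
};

// Returns the links whose source or target id does not match any node.
function findDanglingLinks(data: FunnelData): FunnelLink[] {
  const nodeIds: { [id: string]: boolean } = {};
  data.bars.forEach(bar => bar.nodes.forEach(node => { nodeIds[node.id] = true; }));
  return data.links.filter(link => !nodeIds[link.source] || !nodeIds[link.target]);
}
```

A small consistency check such as `findDanglingLinks` is a cheap way to agree with the backend on what a valid payload is before any drawing starts.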
Whether or not you are proficient in D3.js, you should proceed step by step. Begin with the primitive shapes that make up your graph: rectangles, circles, text, colored areas... You will familiarize yourself with D3.js’s API along the way. The small functions you build during this investigation phase will also be of great help later on, because you will be able to combine them to build the full graph.
For example, here is one of the first functions I wrote in D3.js for this project:
It is minimal and can be easily tested. Once you are confident enough, continue the development by progressively adding new features to your graph according to your needs.
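As an illustration of how small such a building block can be, here is a sketch of a rectangle-drawing helper. It is not the project's actual function: `drawRect`, `RectSpec`, and `SelectionLike` are hypothetical names. `SelectionLike` declares only the two selection methods the helper needs (`append` and `attr`, which D3 selections provide), so a test can pass a lightweight fake instead of a real DOM.

```typescript
// Only the selection methods used below. A D3 selection offers append() and
// attr(); depending on the typings version a cast may be needed when passing one.
interface SelectionLike {
  append(name: string): SelectionLike;
  attr(name: string, value: string | number): SelectionLike;
}

// Hypothetical description of one rectangle, in pixels.
interface RectSpec {
  x: number;
  y: number;
  width: number;
  height: number;
  fill: string;
}

// Appends one <rect> to the container and returns the new element's selection.
function drawRect(container: SelectionLike, spec: RectSpec): SelectionLike {
  return container
    .append('rect')
    .attr('x', spec.x)
    .attr('y', spec.y)
    .attr('width', spec.width)
    .attr('height', spec.height)
    .attr('fill', spec.fill);
}
```

With D3 loaded, a call could look like `drawRect(d3.select('svg'), { x: 0, y: 0, width: 40, height: 120, fill: 'steelblue' })`.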
Basically, the code has two parts:
compute all coordinates and ratios by iterating on your data structure;
draw the graph by calling small functions that draw simple shapes.
The first part takes as input the data structure and outputs a set of coordinates. It can be handled in a service. The second part takes as input this set of coordinates and outputs the graph in the frontend by updating the DOM with D3.js’s methods. It can be handled in a component.
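The first part can be sketched as a pure function: data structure in, coordinates out, no DOM access. The layout rules below (bars spread evenly across the width, nodes stacked vertically, heights proportional to values) and the names `computeNodeBoxes`, `BarInput`, and `NodeBox` are assumptions chosen to keep the example short.

```typescript
interface BarInput {
  label: string;
  nodes: { label: string; value: number }[];
}

interface NodeBox {
  barLabel: string;
  nodeLabel: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Pure function: no D3, no DOM, so it can be unit-tested in isolation.
function computeNodeBoxes(
  bars: BarInput[],
  chartWidth: number,
  chartHeight: number,
  barWidth: number
): NodeBox[] {
  // Spread the bars evenly across the chart width.
  const step = bars.length > 1 ? (chartWidth - barWidth) / (bars.length - 1) : 0;
  // One shared vertical scale so heights are comparable across bars.
  const totals = bars.map(bar => bar.nodes.reduce((sum, node) => sum + node.value, 0));
  const maxTotal = Math.max(1, ...totals);

  const boxes: NodeBox[] = [];
  bars.forEach((bar, barIndex) => {
    let y = 0; // nodes are stacked from the top of the bar
    bar.nodes.forEach(node => {
      const height = (node.value * chartHeight) / maxTotal;
      boxes.push({
        barLabel: bar.label,
        nodeLabel: node.label,
        x: barIndex * step,
        y,
        width: barWidth,
        height,
      });
      y += height;
    });
  });
  return boxes;
}
```

Because the output is plain data, the component that draws the graph never has to look at the raw payload again.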
Don’t draw the graph on the fly while you are still calculating the coordinates of each shape. Some elements of the graph may need the entire picture before being drawn. For instance, onMouseover functions need to know the exact positions of other elements in the graph, which may not be drawn yet.
Even in data visualization, using pure functions and keeping your code uncoupled is the best way to produce sustainable code. Here is an example of the main function in charge of drawing the whole graph. Comments highlight how uncoupled the code is:
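As a rough sketch of what such a main function could look like (not the project's actual code), the version below receives a layout that is already fully computed and only delegates to small drawing helpers. The names `drawFunnel`, `FunnelLayout`, `NodeBox`, `LinkLine`, and `SelectionLike` are hypothetical, and links are simplified to straight lines instead of streams.

```typescript
// Only the selection methods used below; D3 selections provide all three.
interface SelectionLike {
  append(name: string): SelectionLike;
  attr(name: string, value: string | number): SelectionLike;
  text(value: string): SelectionLike;
}

interface NodeBox { label: string; x: number; y: number; width: number; height: number; }
interface LinkLine { x1: number; y1: number; x2: number; y2: number; }
interface FunnelLayout { nodes: NodeBox[]; links: LinkLine[]; }

// Each helper draws one kind of shape and knows nothing about the others.
function drawNode(svg: SelectionLike, box: NodeBox): void {
  svg.append('rect')
    .attr('x', box.x).attr('y', box.y)
    .attr('width', box.width).attr('height', box.height);
}

function drawLabel(svg: SelectionLike, box: NodeBox): void {
  svg.append('text')
    .attr('x', box.x + box.width / 2)
    .attr('y', box.y + box.height / 2)
    .text(box.label);
}

function drawLink(svg: SelectionLike, link: LinkLine): void {
  svg.append('line')
    .attr('x1', link.x1).attr('y1', link.y1)
    .attr('x2', link.x2).attr('y2', link.y2)
    .attr('stroke', 'grey');
}

// Main entry point: no arithmetic on raw data, only drawing from coordinates.
function drawFunnel(svg: SelectionLike, layout: FunnelLayout): void {
  // Links first, so that nodes and labels are painted on top of them.
  layout.links.forEach(link => drawLink(svg, link));
  layout.nodes.forEach(box => {
    drawNode(svg, box);
    drawLabel(svg, box);
  });
}
```

Since the whole layout exists before the first shape is appended, event handlers can be given any position they need without depending on drawing order.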
Plugging into real data from the start is maybe the most important point of all. When you do, you will discover edge cases that you did not think of and that still need to be addressed. Don’t waste your time over-engineering a graph without working with real data.
In my case, when connecting to production data, I discovered that some nodes were considerably bigger than others due to class imbalance. I had to refine the rules by using logarithmic scales for the heights of nodes.
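To show the idea behind a logarithmic height, here is a minimal sketch in plain TypeScript. The function name `logHeight`, its parameters, and the clamping behavior are assumptions for illustration. D3 ships a comparable mapping as `d3.scaleLog()` in version 4+ (`d3.scale.log()` in version 3).

```typescript
// Maps a node value to a pixel height on a logarithmic scale.
// Assumes minValue > 0 and maxValue > minValue; out-of-range values are clamped.
function logHeight(
  value: number,
  minValue: number,
  maxValue: number,
  minHeight: number,
  maxHeight: number
): number {
  const clamped = Math.min(Math.max(value, minValue), maxValue);
  // Position of the value between the bounds, measured in log space (0 to 1).
  const ratio =
    (Math.log(clamped) - Math.log(minValue)) /
    (Math.log(maxValue) - Math.log(minValue));
  return minHeight + ratio * (maxHeight - minHeight);
}
```

With bounds of 1 and 10,000, a node holding 100 items lands about halfway between the minimum and maximum heights, so small nodes stay visible next to very large ones. The trade-off is that heights are no longer proportional to values, which is worth making explicit in the graph's labels.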
You will deliver maximum value only by using fresh data from production. So don’t waste your time with dummy data.
Congratulations, after going through all these steps, it seems you just built the graph that will best suit your needs! If you have some more tips on using D3.js and doing data visualization, or if you liked this article, feel free to contact me!