For some reason, I’ve been obsessing over the presentation of data. (Either I’ve just read all of Edward Tufte’s books, or I’m just being a nerd. But I guess those two things aren’t mutually exclusive.) Considering my obsession, you can imagine how I felt when one of my students stood up and made a presentation that included the following slides, along with the typical, “As you can see here, the production of rice has been decreasing. And as you can see in this chart, the production of wheat has been decreasing,” for slide after slide after slide.
If for some reason you’re not able to see the embedded slides, you can also view the slides in a new window.
For me, there are several problems with this. First, I can’t really compare the first slide with, say, the eighth, because I’m not given enough time to do so. Second, if the main point is just to talk about “increasing” and “decreasing”, are this many slides necessary? Third, the axes on the charts aren’t the same, which makes comparisons more difficult. Oh, and printing out all of your slides as a handout doesn’t help either.
For something like this, sparklines–one of the many interesting ideas that Tufte suggests–might be a solution, and they fit in well with my advice to my students that they should prepare presentation “fact sheets” or something similar rather than prepare slide after slide in PowerPoint. So, I thought I should figure out what my options are for creating them (short of downloading an illegal copy of Microsoft Office 2010, which is supposed to have sparklines built into the charting options).
It turns out that there are several options for making sparklines, whether you are using OpenOffice.org or Microsoft Office, or, for that matter, preparing data for presentation online. And, since it’s pretty easy to figure out the offline options, I thought I would try out the Google Chart application programming interface (API) to see what I could do with it.
The construct is pretty basic. You declare each “column” of data with a line like data.addColumn("number", "Revenue"); and then set each value with a call like data.setValue(0,0,435); where the first number is the row (the position on the x-axis for the item you’re charting), the second number is the column (the variable you’re charting, since you might have several), and the third number is the value of the variable at that position.
Here’s the problem, though. The format that my data is in looks like this:
To present my data using the Google Chart API would require a lot of annoying cutting and pasting.
Or would it?
Of course we could just make our lives easier by using R to prepare our data, and here’s how.
First, load the data (using “read.csv”, which creates a data frame) and create a new object in R with the column names (we’re lazy, right, and we don’t want to type any more than we have to). Then convert the data frame into a matrix, and convert its values to a vector of numeric values. If this sounds complicated, it’s not. It’s just the following few lines of code:
crop.prod = read.csv("http://news.mrdwab.com/cropproduction",
                     header = TRUE, row.names = 1)
crop.prod.names = names(crop.prod)  # object with the column names
crop.prod.num = c(as.numeric(as.matrix(crop.prod)))  # data values
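If you want to see what those lines do without downloading anything, here is a minimal sketch on a tiny stand-in data frame. The object names (crop.demo, demo.names, demo.num) and the row labels are made up for illustration; the values are just the first three rice and wheat figures used later in this post. The point to notice is that flattening a matrix goes column by column, so all of the rice values come before any of the wheat values.

```r
# Hypothetical stand-in for the CSV: 3 rows (time points) by 2 crops.
# The row labels "t1".."t3" are placeholders, not the real labels.
crop.demo <- data.frame(Rice  = c(4500, 5000, 6500),
                        Wheat = c(7200, 8300, 8600),
                        row.names = c("t1", "t2", "t3"))

demo.names <- names(crop.demo)                     # the column names
demo.num <- c(as.numeric(as.matrix(crop.demo)))    # values, column by column

demo.names  # "Rice"  "Wheat"
demo.num    # 4500 5000 6500 7200 8300 8600
```

This column-by-column order is what lets the row and column indices be generated with simple repeating patterns later on.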
Next, we want to set things up so that we can “paste” our data together in a form that the Google Chart API can process.
prefix.column = "data.addColumn(\"number\", \""  # ugly, I know
prefix.value = "data.setValue("  # but we'll clean it up later
end.column = "\");"  # to be pasted at the end of each line of column names
end.value = ");"  # to be pasted at the end of each line of data
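In case the backslashes look alarming: they are only there so that R knows the quotation mark is part of the string rather than the end of it. A small sketch (nothing here is specific to the Google Chart API) shows the difference between how R displays such a string and what the string actually contains.

```r
prefix.column <- "data.addColumn(\"number\", \""

print(prefix.column)       # displays the escaped form, backslashes included
cat(prefix.column, "\n")   # displays the actual characters, no backslashes

nchar("\"")                # 1: backslash plus quote is a single character
```

So the strings themselves are already clean; it is the way they are displayed that adds the slashes.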
R has a great function, paste, for pasting things together. We’ll use that function to get our data into a nicer format. We’ll still have to clean it up a little bit (mostly removing extra commas and quotation marks), but that’s a simple search-and-replace procedure.
# pasting together the column names
column.names = paste(prefix.column,crop.prod.names,end.column)
# pasting together each measurement of data
crop.prod.data = paste(c(0:10),sort(rep(c(0:7),11)),crop.prod.num, sep=",")
crop.prod.data = paste(prefix.value,crop.prod.data,end.value)
The most complicated line above is the crop.prod.data = paste(c(0:10),sort(rep(c(0:7),11)),crop.prod.num, sep=",") line, but even that is not too difficult to follow. “crop.prod.data” is the name of our object. Each element of that object is made of three values, separated by commas. The first value comes from the digits 0 to 10 (eleven values overall), which R recycles for as long as required. The second value comes from eleven 0s, eleven 1s, eleven 2s and so on, up to eleven 7s (one block for each of the eight crops). The third value comes from the vector in the “crop.prod.num” object we created earlier.
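The same pattern is easier to see at a smaller scale. Here is a sketch with three rows and two series instead of eleven and eight; the object names and the values 10 to 60 are made up purely for illustration.

```r
x.pos  <- c(0:2)                      # row index, recycled as needed
series <- sort(rep(c(0:1), 3))        # 0 0 0 1 1 1
values <- c(10, 20, 30, 40, 50, 60)   # made-up values, series 0 first

triples <- paste(x.pos, series, values, sep = ",")
triples
# "0,0,10" "1,0,20" "2,0,30" "0,1,40" "1,1,50" "2,1,60"

# rep() with "each" builds the same blocks without the sort
identical(sort(rep(c(0:1), 3)), rep(c(0:1), each = 3))   # TRUE
```

Because the shortest vector (x.pos) is recycled to match the longest, its length has to divide the number of values evenly, which it does here (3 into 6) and in the real data (11 into 88).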
At this point, we’re pretty much done. We just need to clean things up, and replace the contents of Google’s example page with our own data. Using fix(column.names) and fix(crop.prod.data) gives us the following output:
# output from "fix(column.names)"
c("data.addColumn(\"number\", \" Rice \");",
"data.addColumn(\"number\", \" Wheat \");",
"data.addColumn(\"number\", \" Pulses \");",
"data.addColumn(\"number\", \" Cereals \");",
"data.addColumn(\"number\", \" Food.Grains \");",
"data.addColumn(\"number\", \" Oil.Seeds \");",
"data.addColumn(\"number\", \" Cotton \");",
"data.addColumn(\"number\", \" Sugarcane \");")

# extracted output from "fix(crop.prod.data)"
c("data.setValue( 0,0,4500 );", "data.setValue( 1,0,5000 );",
"data.setValue( 2,0,6500 );", "data.setValue( 3,0,1000 );",
"data.setValue( 4,0,1750 );", "data.setValue( 5,0,1000 );",
"data.setValue( 6,0,1750 );", ... ... ...
"data.setValue( 2,1,8600 );", ... ... ...
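If you would rather skip most of the clean-up, R can do it for you. This is not what I did above, just a sketch of an alternative, and crop.demo, labels, column.lines and value.lines are names invented for the example. The stray spaces come from the default separator in paste, so paste0 avoids them; the dots in the column names can be swapped for spaces with gsub; and writeLines prints the strings as they really are.

```r
# Hypothetical stand-in for the real data: 3 rows by 2 crops.
crop.demo <- data.frame(Rice        = c(4500, 5000, 6500),
                        Food.Grains = c(17500, 19000, 22000))

# Turn the dots that R puts in column names back into spaces.
labels <- gsub(".", " ", names(crop.demo), fixed = TRUE)

# paste0() joins its arguments with no separator at all.
column.lines <- paste0("data.addColumn(\"number\", \"", labels, "\");")

# row() and col() give each cell's position; subtract 1 because the
# chart's indices start at 0.
m <- as.matrix(crop.demo)
value.lines <- paste0("data.setValue(", c(row(m)) - 1, ",",
                      c(col(m)) - 1, ",", c(m), ");")

# Print the lines exactly as they should appear in the HTML.
writeLines(c(column.lines, value.lines))
```

Giving writeLines a file name as its second argument writes the same lines to a file instead of the console, which saves one round of copying.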
Using any decent text editor (like Notepad++ or Komodo Edit [which I use]) makes getting rid of your unnecessary slashes and quotation marks a two-second job, and then you are ready to do one last bit of copying and pasting, using the HTML on this page as a guide. Here’s what my final HTML looked like:
<html>
<head>
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load("visualization", "1", {packages:["imagesparkline"]});
google.setOnLoadCallback(drawChart);
function drawChart() {
var data = new google.visualization.DataTable();
data.addColumn("number", " Rice ");
data.addColumn("number", " Wheat ");
data.addColumn("number", " Pulses ");
data.addColumn("number", " Cereals ");
data.addColumn("number", " Food Grains ");
data.addColumn("number", " Oil Seeds ");
data.addColumn("number", " Cotton ");
data.addColumn("number", " Sugarcane ");
data.addRows(11);
data.setValue( 0,0,4500 ); data.setValue( 1,0,5000 );
data.setValue( 2,0,6500 ); data.setValue( 3,0,1000 );
data.setValue( 4,0,1750 ); data.setValue( 5,0,1000 );
data.setValue( 6,0,1750 ); data.setValue( 7,0,1100 );
data.setValue( 8,0,1600 ); data.setValue( 9,0,1400 );
data.setValue( 10,0,1500 ); data.setValue( 0,1,7200 );
data.setValue( 1,1,8300 ); data.setValue( 2,1,8600 );
data.setValue( 3,1,4900 ); data.setValue( 4,1,5000 );
data.setValue( 5,1,4000 ); data.setValue( 6,1,7400 );
data.setValue( 7,1,7200 ); data.setValue( 8,1,6000 );
data.setValue( 9,1,7300 ); data.setValue( 10,1,6000 );
data.setValue( 0,2,3250 ); data.setValue( 1,2,3600 );
data.setValue( 2,2,3750 ); data.setValue( 3,2,2250 );
data.setValue( 4,2,3250 ); data.setValue( 5,2,2400 );
data.setValue( 6,2,3500 ); data.setValue( 7,2,3400 );
data.setValue( 8,2,3250 ); data.setValue( 9,2,3250 );
data.setValue( 10,2,2500 ); data.setValue( 0,3,2300 );
data.setValue( 1,3,2500 ); data.setValue( 2,3,2400 );
data.setValue( 3,3,2100 ); data.setValue( 4,3,2700 );
data.setValue( 5,3,2400 ); data.setValue( 6,3,3400 );
data.setValue( 7,3,2300 ); data.setValue( 8,3,2300 );
data.setValue( 9,3,1800 ); data.setValue( 10,3,1200 );
data.setValue( 0,4,17500 ); data.setValue( 1,4,19000 );
data.setValue( 2,4,22000 ); data.setValue( 3,4,10000 );
data.setValue( 4,4,14000 ); data.setValue( 5,4,11000 );
data.setValue( 6,4,16500 ); data.setValue( 7,4,14000 );
data.setValue( 8,4,13000 ); data.setValue( 9,4,14000 );
data.setValue( 10,4,12500 ); data.setValue( 0,5,5700 );
data.setValue( 1,5,5700 ); data.setValue( 2,5,5900 );
data.setValue( 3,5,4100 ); data.setValue( 4,5,4500 );
data.setValue( 5,5,3100 ); data.setValue( 6,5,5500 );
data.setValue( 7,5,4800 ); data.setValue( 8,5,5800 );
data.setValue( 9,5,5900 ); data.setValue( 10,5,6400 );
data.setValue( 0,6,510 ); data.setValue( 1,6,430 );
data.setValue( 2,6,420 ); data.setValue( 3,6,250 );
data.setValue( 4,6,400 ); data.setValue( 5,6,390 );
data.setValue( 6,6,650 ); data.setValue( 7,6,640 );
data.setValue( 8,6,750 ); data.setValue( 9,6,840 );
data.setValue( 10,6,870 ); data.setValue( 0,7,1650 );
data.setValue( 1,7,1600 ); data.setValue( 2,7,2000 );
data.setValue( 3,7,1650 ); data.setValue( 4,7,1600 );
data.setValue( 5,7,1550 ); data.setValue( 6,7,1750 );
data.setValue( 7,7,2000 ); data.setValue( 8,7,2500 );
data.setValue( 9,7,2750 ); data.setValue( 10,7,3250 );
var chart = new google.visualization.ImageSparkLine(document.getElementById('chart_div'));
chart.draw(data, {width: 170, height: 40, color: '#545454',
showAxisLines: false, showValueLabels: false, labelPosition: 'left'});
}
</script>
</head>
<body>
<div id="chart_div"></div>
</body>
</html>
The output (seen below) can easily be copied into an MS Word document, and more information can be added to it as necessary.
By the way, here is a PDF of a document created in OpenOffice.org demonstrating what this might look like in a table (made using the EuroOffice sparkline plugin) as well as what a stacked line graph would look like. Both of these would make much better handouts during a presentation than the printout of slides shown earlier.

One Comment
By the way, the Protovis syntax for sparklines is much easier to use than Google Chart API (you pretty much just provide a comma separated list of the values). However, I didn't feature that option here because you cannot copy and paste your output into an offline document.
Still, if you're interested in sparklines for websites, it's a great option, and has different types of sparks you can make (like bars and ticks) that are also useful for data presentation.